User Guide · v0.1-macos · macOS Edition
AceForge is a local AI music workstation powered by ACE-Step and a custom UI designed to make generating, tweaking, and curating your music a smooth and cohesive experience. This guide explains how to install AceForge on macOS, generate tracks, manage your library, and train LoRAs.
Local-first · macOS · ACE-Step text → music · LoRA training · Stem separation · Dataset tools · Apple Metal (MPS) GPU acceleration
- Overview
- System requirements
- Installation & first launch
- UI tour
- Generating music
- Vocal / instrumental stem control
- Training LoRAs
- Dataset mass-tagging tools
- MuFun-ACEStep analyzer (experimental)
- Troubleshooting & FAQ
AceForge is a local AI music workstation for people who actually like owning their tools. It runs on your macOS system, uses Apple Metal (MPS) GPU acceleration, and keeps all audio and prompts on your hardware.
- ACE-Step – the diffusion engine that turns prompts + lyrics into audio.
- PyTorch – the deep learning runtime used by ACE-Step and related models, with native Apple Metal (MPS) support.
- Qwen-like LLM backend (via your configured model) – used for the "Generate prompt / lyrics…" helper.
- audio-separator – used for post-process stem separation (vocals vs. instruments).
- MuFun-ACEStep (optional) – an analyzer that can auto-create prompt/lyrics files for datasets.
- Sleek generation UI with Core and Advanced sections, presets, and clear tooltips so you don't have to memorize every ACE-Step knob.
- Built-in music player + library view – browse, sort, favorite, and categorize every generated track.
- Preset system – save, load, and share your favorite generation settings.
- LoRA training UI – configure and kick off ACE-Step LoRA training runs without hand-editing Python scripts.
- Dataset helpers – bulk create prompt/lyrics files or auto-generate them with MuFun-ACEStep.
- Launch AceForge → wait for first-time setup (venv + packages + ACE-Step models).
- Use Generate Track to create songs from prompts (optionally with lyrics).
- Browse, favorite, and categorize tracks in the Music Player.
- (Optional) Use stem controls to tweak vocal vs. instrumental levels.
- (Optional) Build datasets and use the Training tab to train custom LoRAs.
- macOS 12.0 (Monterey) or later
- Apple Silicon (M1/M2/M3) or Intel Mac with AMD GPU
- 16 GB unified memory (Apple Silicon) or 16 GB RAM (Intel)
- ~10–12 GB VRAM/unified memory (more gives more headroom)
- SSD with tens of GB free (models + audio + datasets)
- Python 3.10 or later
- Apple Silicon M1 Pro/Max/Ultra, M2 Pro/Max/Ultra, or M3 Pro/Max
- 32 GB+ unified memory
- Fast SSD for models and datasets
- Comfortable with terminal and reading console logs
Note: The very first launch does a lot: creates a virtual environment, installs Python packages, and downloads ACE-Step and related models. This can take a while. All of that work is reused on later launches.
Apple Metal (MPS) GPU acceleration is automatically enabled on compatible systems, providing excellent performance on Apple Silicon.
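If you want to confirm that the Metal backend is actually visible to PyTorch, you can run a quick check from the AceForge folder once the first launch has created its `venv_ace` environment (the venv location is described in the first-launch section below):

```bash
# Quick sanity check that the bundled PyTorch sees the MPS (Metal) backend.
# Run from the AceForge folder after the first launch has created venv_ace.
source venv_ace/bin/activate
python3 -c "import torch; print('MPS available:', torch.backends.mps.is_available())"
```

If this prints `False`, PyTorch will generally run on the CPU instead, which is much slower.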
- Download `AceForge-macOS.dmg` from the Releases page.
- Open the DMG file.
- Drag `AceForge.app` to your Applications folder.
- Right-click the app and select "Open" (first time only, to bypass Gatekeeper).
- The application will start and open in your browser.
- Ensure you have Python 3.10 or later installed:

  ```bash
  # Check Python version
  python3 --version
  # If not installed, install via Homebrew
  brew install python@3.10
  ```

- Clone the repository:

  ```bash
  git clone https://github.com/audiohacking/CDMF-Fork.git
  cd CDMF-Fork
  ```

- Make the launcher script executable:

  ```bash
  chmod +x CDMF.sh
  ```

- Run the launcher:

  ```bash
  ./CDMF.sh
  ```
- Launch AceForge from Applications or by running `./CDMF.sh`.
- A terminal window titled "AceForge – Server Console" will appear. This window must stay open while AceForge runs.
- AceForge immediately opens a loading page in your default browser while the backend is starting.
- On first run, the console will:
  - Create `venv_ace` in the app folder.
  - Install packages from `requirements_ace_macos.txt`.
  - Install ACE-Step with PyTorch (including MPS support).
  - Set up other helpers like `audio-separator`.
- When the server is ready, your browser will show the full AceForge UI.
Important: Don't close the terminal while it's working. If you see Python / pip errors, read the last messages carefully. Many issues (disk space, network connectivity) will show up here.
On later launches, AceForge will:
- Reuse the existing `venv_ace`.
- Skip package installs if everything is already in place.
- Skip large model downloads unless a feature needs a new one (e.g. MuFun).
At the top you'll see the AceForge titlebar:
- Logo on the left.
- App title and version (e.g. `v0.1-macos`) on the right.
- A short tagline: "Generate unlimited custom music with a simple prompt and style presets via ACE-Step."
The first main card is Music Player. It's your library view for generated tracks.
- Folder: shows the current output directory.
- Category filter chips: a row of colored chips lets you filter by category. These are driven by category labels on your tracks.
- Track header row: sortable columns:
  - ★ (favorite)
  - Name
  - Length
  - Category
  - Created
  - Actions
Each track row shows:
- A favorite button (★) – click to toggle favorite state.
- The track name (based on the WAV filename).
- Metadata: length and category.
- A small trash icon to delete the file from disk.
Tip: You can use the header buttons to sort (e.g. by name or creation time), and use the category filter chips above the list to quickly narrow down to "lofi", "battle", "town", etc.
Below the track list you'll find:
- Time labels – current time and total duration.
- Seek bar – click / drag to move around in the track.
- Buttons: Rewind, Play, Stop, Loop, Mute.
- Volume slider – global playback volume.
Beneath the player is a small tab strip:
- Generate – the main text-to-music UI.
- Training – LoRA training and dataset tools.
Only one mode is visible at a time.
At the top of the Generate Track card, you'll see:
- A Generate button.
- A loading bar that animates while a generation is in progress.
- A model status notice if ACE-Step isn't downloaded yet. It will prompt you to click "Download Models" and warn that this is a large download.
The generation controls are split into:
- Core – most of what you need most of the time.
- Advanced – scheduler, CFG modes, repaint/extend, audio2audio, LoRA internals.
A good mental model: use the Core tab to get high-quality songs without touching anything you don't understand. The Advanced tab is for experiments and fine-tuning once you're comfortable.
Base filename (basename) is the prefix for your output WAV
files. AceForge will append numbers / timestamps as needed so they don't collide, but the base
name is what you'll see in the player.
The button "Generate prompt / lyrics…" opens a small modal where you can:
- Describe a song concept ("melancholic SNES overworld at night…").
- Choose to generate:
- Prompt only
- Lyrics only
- Prompt + lyrics
AceForge uses an LLM backend to fill in the Genre / Style Prompt box and/or the Lyrics box based on your selection.
When Instrumental is checked, the dialog will default to Prompt only. When it's unchecked, it leans toward Prompt + lyrics.
This is your main ACE-Step prompt. Use it to describe:
- Genre and instrument palette (e.g. "16-bit SNES snowfield, chiptune pads…").
- Tempo/mood ("slow, melancholic, wistful but hopeful").
- Context ("looping BGM for JRPG overworld").
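Putting those three ingredients together, a complete prompt might read (purely illustrative, assembled from the examples above):

```
16-bit SNES snowfield, chiptune pads, slow, melancholic, wistful but hopeful, looping BGM for JRPG overworld
```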
Below the prompt field are two preset groups:
- Instrumental preset buttons shown when Instrumental is checked.
- Vocal preset buttons shown when Instrumental is unchecked.
Each preset sets a bundle of internal knobs (target seconds, steps, guidance, etc.) and may tweak internal "seed vibes" for different sound families. The Random buttons pick from a curated list to keep exploring without you having to think too hard.
- When Instrumental is checked:
  - Lyrics are not used for generation.
  - ACE-Step receives a special `[inst]` token so it focuses on backing tracks.
  - The Lyrics box is hidden to keep the UI clean.
- When Instrumental is unchecked:
  - The Lyrics panel appears.
  - You can paste or write lyrics with markers like `[verse]`, `[chorus]`, `[solo]`, etc.
There's also a Clear button inside the Lyrics row to quickly wipe the lyrics field.
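For reference, a short lyrics sketch using those markers might look like this (the lines themselves are placeholder text, shown only to illustrate how the section markers are laid out):

```
[verse]
Snow on the rooftops, lanterns burning low
Footsteps fading where the cold winds blow

[chorus]
Carry me home across the frozen field

[solo]
```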
- Target length (seconds) – slider + numeric box that tells ACE-Step roughly how long the track should be.
- Fade in / Fade out (seconds) – small fades applied at the start/end of the final audio.

  Tip: 0.5–2.0 seconds is a good fade range for most BGM tracks.

- Inference steps – 50–125 is a good range. Higher is slower and may not always increase quality.
- Guidance scale – how strongly ACE-Step follows your text. Extreme values can introduce noise.
- BPM (optional) – if set, AceForge adds a hint like `tempo 120 bpm` to the tags.
- Seed + Random checkbox:
  - When Random is checked, AceForge picks a random seed each time.
  - When unchecked, you can lock a specific seed to re-roll close variations.
At the end of the Core section you'll see:
- Vocals level (dB)
- Instrumental level (dB)
These are post-process gain adjustments created by running the track through `audio-separator` and rebalancing stems.
Important: Using stem controls requires downloading a large stem separation model on first use and adds a heavy post-process step. For fastest iteration:
- Generate a track at neutral levels (0 dB / 0 dB).
- Find a track you like.
- Turn off Random Seed and keep other settings the same.
- Re-generate with adjusted vocal / instrumental gains.
The Advanced tab exposes more ACE-Step internals:
- Scheduler type (Euler, Heun, ping-pong).
- CFG mode (APG, CFG, CFG★) and related parameters.
- ERG switches (tag, lyric, diffusion).
- Repaint / extend:
- Task: text2music / retake / repaint / extend.
- Repaint start / end in seconds.
- Retake variance for variations.
- Audio2Audio:
- Ref strength
- Ref audio file upload
- Optional explicit source path
- LoRA adapter fields:
- Pick installed LoRAs from a dropdown.
- Browse for a LoRA folder under `custom_lora`.
- Set the LoRA weight (0–10).
In Cover and Audio→Audio modes you transform an existing track. The following parameters directly control how much the output follows the source audio vs your Style and Lyrics:
| Parameter | What it controls | Effect |
|---|---|---|
| Style of Music (caption) | Target style for the output | Describes the target genre, mood, instruments. Strongly influences the result when Cover Strength is lower. |
| Lyrics | Target lyrics for the output | The target lyric content and structure. Uncheck Instrumental to use them; otherwise the model gets an instrumental token. |
| Cover Strength (Source influence) | Balance: source vs your text | 1.0 = output follows the source closely (structure + character). Lower (e.g. 0.5–0.7) = more influence from your Style and Lyrics. 0.2 = loose style transfer. |
| Instrumental | Whether lyrics are used | When checked, lyrics are ignored and the model receives an instrumental token. Uncheck to apply your Lyrics. |
| Guidance scale | How strongly the model follows text | Higher = stronger adherence to your Style/Lyrics (and to the source when combined with high Cover Strength). |
Summary: For covers that reflect your own style and lyrics, set Style and Lyrics as desired, uncheck Instrumental if you use lyrics, and lower Cover Strength (e.g. 0.5–0.7) so your text has more influence. The (i) tooltips in the Create panel repeat this for quick reference.
If you're new to ACE-Step, you can ignore the Advanced tab entirely. The defaults were chosen to be safe and high quality out of the box.
At the bottom of the Generate card is a Saved presets block:
- My presets dropdown – shows your saved presets after you create some.
- Load – apply the selected preset to the current form.
- Save – capture the current knobs as a new preset.
- Delete – remove a preset.
Presets record both text fields (prompt, lyrics, etc.) and numerical fields (steps, seeds, gains, etc.), so you can quickly return to a particular "vibe kit" without screenshots or manual notes.
The Output directory field controls where WAVs are written. It defaults to the path shown in the Music Player header. If you change this, remember that:
- The player will look at the directory you specify.
- If you point it somewhere else, you may want to restart AceForge or refresh so the player sees it.
AceForge integrates audio-separator so you can rebalance vocals and instrumentals after generation:
- Vocals level (dB) – boosts or reduces the vocal stem relative to the original mix.
- Instrumental level (dB) – boosts or reduces the backing track stem.
Both use decibel adjustments:
- `0 dB` – leave as-is.
- Negative values – make that stem quieter.
- Positive values – make that stem louder.
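As a point of reference, decibels map to linear amplitude as gain = 10^(dB/20), so -6 dB roughly halves a stem and +6 dB roughly doubles it. A quick way to see the factor for any value (plain shell arithmetic, nothing AceForge-specific):

```bash
# Convert a dB adjustment into a linear amplitude factor: gain = 10^(dB/20)
db=-6
awk -v db="$db" 'BEGIN { printf "%+.1f dB = x%.2f amplitude\n", db, 10^(db/20) }'
# prints: -6.0 dB = x0.50 amplitude
```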
On first use, AceForge will need to download the stem-separation model. This is large and adds a significant processing step. For quick sketching, leave both gains at 0 dB and only use stems once you're close to a final track.
Switch to the Training mode tab to see the LoRA controls:
- Start Training – submits the training form to the backend and starts ACE-Step's trainer.
- Pause / Resume / Cancel – control an in-progress run. These are wired up to backend endpoints that can pause, resume, or stop training.
- Status indicator – small banner and loading bar that reflect the current state.
Pausing saves a checkpoint and allows resuming later. If you restart the server, the paused state is preserved and you'll be prompted to Resume or Cancel before starting a new run.
The Dataset Setup / Formatting section describes how training datasets should be structured:
- Your dataset folder must live under: `<AceForge root>/training_datasets`
- For each `foo.mp3` (or `foo.wav`) you should have:
  - `foo_lyrics.txt` – lyrics or `[inst]` for instrumentals.
  - `foo_prompt.txt` – ACE-Step tags for that track.
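As an example (folder and file names here are only illustrative), a ready-to-train dataset would look like:

```
training_datasets/
└── my_dataset/
    ├── foo.mp3
    ├── foo_lyrics.txt
    ├── foo_prompt.txt
    ├── bar.wav
    ├── bar_lyrics.txt
    └── bar_prompt.txt
```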
The UI provides:
- A Dataset folder text field.
- A Browse… button that uses a folder picker.
You can hand-create these files, use the Dataset Mass Tagging tool to generate them from a base prompt, or use MuFun-ACEStep to auto-tag.
- Experiment / adapter name – a short name like `lofi_chiptunes_v1`. Used for the output folder under `ace_training` and the final adapter under your `custom_lora` hierarchy.
- LoRA config (JSON) – choose from JSON presets in the `training_config` folder.
- Max training steps – upper bound on optimization steps. Usually left high; you control real run length with epochs.
- Max epochs – number of full passes over the dataset (e.g. 20).
- Learning rate – default `1e-4`, with `1e-4`–`1e-5` being common LoRA values.
- Max clip seconds – max length per audio example. Lowering this can reduce memory usage and speed up training.
- SSL loss weight – weight for MERT/mHuBERT self-supervised losses. Set to 0 for pure instrumental / chiptune datasets.
- Instrumental dataset – checkbox telling the trainer to freeze lyric/speaker-specific blocks and focus on music/texture layers.
- Save LoRA every N steps – periodic checkpoint saving, with 0 disabling mid-run saves (but still writing a final adapter).
These map to PyTorch Lightning / ACE-Step trainer internals:
- Precision – 32-bit, 16-mixed, or bf16-mixed (note: MPS uses float32 by default).
- Grad accumulation – virtual batch size multiplier.
- Gradient clip value + algorithm – stability tuning.
- DataLoader reload frequency – how often to rebuild loaders.
- Validation check interval – how often validation runs.
- Devices – number of devices to use (typically 1 for MPS).
If you're not already used to debugging Lightning configs, leave these at their defaults. You'll get more mileage from good datasets and reasonable learning rates.
The LoRA config presets come in several families: light / medium / heavy, base_layers, extended_attn, heavy transformer, full_stack, etc. As a rule of thumb:
- Light / base_layers – safest, smaller adapters, subtle style shaping.
- Heavy / full_stack – much stronger imprinting and higher overfit risk.
Under Training mode you'll also see a card for Dataset Mass Tagging (Prompt / Lyrics Templates). This is for quickly building simple prompt/lyrics files without ML tagging.
- Use the Dataset folder field and Browse… button to point to a folder under: `<AceForge root>/training_datasets`
- Only `.mp3` and `.wav` files in that folder will be affected.

The Base tags field is a short ACE-Step prompt snippet written into each `_prompt.txt`. Example:

`16-bit, 8-bit, SNES, retro RPG BGM, looping instrumental`

- Create prompt files – creates or updates `_prompt.txt` for each track using the base tags.
- Create [inst] lyrics files – creates or updates `_lyrics.txt` files with just `[inst]`.
Overwrite existing files – when checked, will overwrite existing prompt/lyrics files instead of skipping them.
A small status text and loading bar show when the tool is busy. Once complete, each track in the dataset should be ready to plug into the LoRA trainer.
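If you prefer the terminal, the files this tool produces can also be sketched with a small shell loop. This is only an illustration of the resulting layout, not part of AceForge; the dataset folder name is an example, and existing files are left alone to mirror the tool's default no-overwrite behavior:

```bash
# Illustrative only: write base tags into <track>_prompt.txt and "[inst]" into <track>_lyrics.txt
cd training_datasets/my_dataset              # example dataset folder
tags="16-bit, 8-bit, SNES, retro RPG BGM, looping instrumental"
for f in *.mp3 *.wav; do
  [ -e "$f" ] || continue                    # skip unmatched globs
  base="${f%.*}"
  [ -e "${base}_prompt.txt" ] || printf '%s\n' "$tags"  > "${base}_prompt.txt"
  [ -e "${base}_lyrics.txt" ] || printf '%s\n' "[inst]" > "${base}_lyrics.txt"
done
```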
The "Experimental – Analyze Dataset with MuFun-ACEStep" card lets you run a large MuFun model over a folder of audio to auto-generate prompts and lyrics.
- Use the Install / Check button. AceForge will:
  - Check whether the model is already present.
  - Download it if needed into the AceForge models folder.
- The model is large (tens of GB). Make sure you have enough disk space.

- Select a dataset folder under `training_datasets`.
- Optionally provide a Base tags string.
- Optionally check Instrumental to force all lyrics to `[inst]`.
- Click Analyze Folder.
MuFun will:
- Create `_prompt.txt` and `_lyrics.txt` files next to each track.
- Include your base tags plus its own tags when writing prompts.
- Show progress and results in the Results text area.
MuFun is powerful but not perfect. For high-stakes datasets, skim a few outputs and edit any bad tags or strange lyric outputs before training a LoRA.
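A quick way to do that skim from the terminal (the dataset path is an example):

```bash
# Print the first lines of every generated prompt/lyrics file, with filenames as headers
head -n 3 training_datasets/my_dataset/*_prompt.txt training_datasets/my_dataset/*_lyrics.txt
```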
- Check the terminal window for pip errors (network, disk, permissions).
- Ensure you have plenty of free disk space on your Mac.
- Slow networks will heavily impact model downloads.
- Generate a track first from the Generate tab.
- Confirm that the Output directory field points to the correct folder.
- Ensure you're running macOS 12.0+ for MPS support.
- Some operations may fall back to CPU if not yet supported on MPS.
- Try setting the `ACE_PIPELINE_DTYPE=float32` environment variable if you encounter precision issues:

  ```bash
  export ACE_PIPELINE_DTYPE=float32
  ./CDMF.sh
  ```
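PyTorch also has a general-purpose environment variable that routes individual operations not yet implemented on MPS to the CPU instead of erroring out. This is a stock PyTorch option, not an AceForge setting, so treat it as a workaround:

```bash
# Fall back to CPU for ops that MPS doesn't implement yet (stock PyTorch option)
export PYTORCH_ENABLE_MPS_FALLBACK=1
./CDMF.sh
```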
- Reduce Target length for generation.
- Reduce Max clip seconds for training.
- Lower batch / grad accumulation values if you've changed them.
- On Apple Silicon, unified memory is shared between CPU and GPU, so closing other applications can help.
```bash
# Ensure you have Python 3.10 or later
python3 --version
# Install via Homebrew if needed
brew install python@3.10
```

```bash
chmod +x CDMF.sh
```

- Manually navigate to `http://127.0.0.1:5056/` in your browser.
- Check if the terminal shows any error messages.

```bash
# Remove existing venv and recreate
rm -rf venv_ace
./CDMF.sh
```

From pre-built app:
- Drag `AceForge.app` from Applications to Trash.
- Remove the data folder: `~/Library/Application Support/AceForge` (if it exists).
From source:
- Delete the cloned repository folder.
- Remove the `venv_ace` directory if it exists.
If you keep a lot of generated music, consider backing up your .wav files before uninstalling.
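For a source install, the cleanup might look like this (paths are examples; substitute your actual clone location and output directory):

```bash
# Example removal of a source install; adjust paths to match your setup
OUTPUT_DIR="CDMF-Fork/your-output-folder"        # wherever your generated WAVs live (example path)
cp -R "$OUTPUT_DIR" ~/aceforge_wav_backup        # back up generated audio first
rm -rf CDMF-Fork                                 # cloned repository, including venv_ace
rm -rf ~/Library/Application\ Support/AceForge   # app data folder, if it exists
```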
- Unified memory management: Apple Silicon Macs with unified memory can efficiently share memory between CPU and GPU.
- Batch sizes: Start with smaller batch sizes and gradually increase to find optimal performance.
- Model precision: The pipeline automatically selects appropriate precision for MPS (float32 instead of bfloat16).
- Generation length: Longer generation times may require more memory; start with shorter durations and scale up.
For more information and support, visit the GitHub repository.