ServeurpersoCom/acestep.cpp

acestep.cpp

Local AI music generation server with browser UI, powered by GGML. Describe a song, get stereo 48kHz audio. Runs on CPU, CUDA, Metal, Vulkan.

Download models

Grab one GGUF of each type from Hugging Face and drop them in the models/ folder:

https://huggingface.co/Serveurperso/ACE-Step-1.5-GGUF/tree/main

Type          Pick one                           Size
LM            acestep-5Hz-lm-4B-Q8_0.gguf        4.2 GB
Text encoder  Qwen3-Embedding-0.6B-Q8_0.gguf     748 MB
DiT           acestep-v15-turbo-Q8_0.gguf        2.4 GB
VAE           vae-BF16.gguf (always this one)    322 MB

Three LM sizes are available: 0.6B (fast), 1.7B, and 4B (best quality). The DiT comes in several variants: turbo (8 steps), sft (50 steps, higher quality), base, shift1, shift3, and continuous.

Alternative: ./models.sh downloads the default set automatically (needs pip install hf).
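
Before starting the server, it can help to confirm that the models/ folder holds one file of each type. The sketch below is not part of the project: the function name `missing_model_types` is hypothetical, and the filename patterns are assumptions inferred from the example filenames in the table above, so adjust them if your downloads are named differently.

```python
from pathlib import Path

# Glob patterns per model type. These are assumptions inferred from the
# example filenames in the download table, not names defined by the project.
REQUIRED = {
    "LM": "acestep-5Hz-lm-*.gguf",
    "Text encoder": "Qwen3-Embedding-*.gguf",
    "DiT": "acestep-v15-*.gguf",
    "VAE": "vae-*.gguf",
}


def missing_model_types(models_dir):
    """Return the model types that have no matching .gguf file in models_dir."""
    root = Path(models_dir)
    return [kind for kind, pattern in REQUIRED.items()
            if not any(root.glob(pattern))]


if __name__ == "__main__":
    missing = missing_model_types("models")
    if missing:
        print("Missing model types:", ", ".join(missing))
    else:
        print("All four model types found.")
```

An empty list means at least one file matched each of the four patterns; it does not verify that the files are complete or valid GGUF.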

Build

git clone --recurse-submodules https://github.com/Serveurperso/acestep.cpp
cd acestep.cpp

Windows

Pre-built binaries (until CI is set up): https://www.serveurperso.com/temp/acestep.cpp-win64/

To build from source, install Visual C++ Build Tools (select "Desktop development with C++" workload) and optionally the CUDA Toolkit and/or the Vulkan SDK.

buildcuda.cmd     # NVIDIA GPU
buildvulkan.cmd   # AMD/Intel GPU (Vulkan)
buildall.cmd      # all backends (CUDA + Vulkan + CPU, runtime loading)

Linux / macOS

./buildcuda.sh    # NVIDIA GPU
./buildvulkan.sh  # AMD/Intel GPU (Vulkan)
./buildcpu.sh     # CPU only (with BLAS)
./buildall.sh     # all backends (CUDA + Vulkan + CPU, runtime loading)

macOS auto-enables Metal and Accelerate BLAS with any of the above.

Run

./server.sh       # Linux / macOS
server.cmd        # Windows

Open http://localhost:8085 in your browser. The WebUI handles everything: write a caption, set lyrics and metadata, generate, play, and download tracks.

Models are loaded on the first request, so no GPU memory is used at startup, and they are swapped automatically when you pick a different one in the UI.

LoRA

Drop LoRA adapters in the loras/ folder and restart the server. Both PEFT directories and ComfyUI single-file .safetensors adapters are supported. Select the active LoRA from the WebUI.

Server options

--models <dir>       Model directory (required)
--loras <dir>        LoRA adapters directory
--host <addr>        Listen address (default: 127.0.0.1)
--port <N>           Listen port (default: 8080)
--max-batch <N>      LM batch limit 1-9 (default: 1)
--vae-chunk <N>      VAE tile size (default: 256, lower = less VRAM)
--mp3-bitrate <N>    MP3 kbps (default: 128)

API endpoints

The server exposes three POST endpoints and two GET endpoints:

POST /lm - Generate lyrics and audio codes from a caption. Returns JSON.

POST /synth - Render audio codes into MP3 or WAV (?wav=1). Accepts JSON or multipart (with source audio for cover/repaint modes).

POST /understand - Reverse pipeline: audio in, metadata + lyrics + codes out. Accepts multipart (audio file) or JSON (codes-only).

GET /health - Returns {"status":"ok"}.

GET /props - Available models, server config, default parameters.

See docs/ARCHITECTURE.md for the full API reference and AceRequest JSON specification.
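
As a rough illustration of the two GET endpoints, here is a minimal Python client using only the standard library. The helper names (`endpoint_url`, `get_json`, `is_healthy`) are hypothetical, and the default port of 8080 is taken from the server options; pass the port your server actually listens on (the Run section uses 8085). The POST bodies are deliberately left out because the AceRequest fields are specified in docs/ARCHITECTURE.md, not here.

```python
import json
import urllib.request


def endpoint_url(path, host="127.0.0.1", port=8080, wav=False):
    """Build an endpoint URL; wav=True appends the ?wav=1 query used by /synth."""
    url = f"http://{host}:{port}{path}"
    return url + "?wav=1" if wav else url


def get_json(path, host="127.0.0.1", port=8080, timeout=5.0):
    """Issue a GET request and decode the JSON response body."""
    with urllib.request.urlopen(endpoint_url(path, host, port),
                                timeout=timeout) as resp:
        return json.load(resp)


def is_healthy(host="127.0.0.1", port=8080):
    """Return True if GET /health answers with {"status": "ok"}."""
    try:
        body = get_json("/health", host, port)
    except (OSError, ValueError):
        # Connection errors and timeouts are OSError; malformed JSON is ValueError.
        return False
    return isinstance(body, dict) and body.get("status") == "ok"


if __name__ == "__main__":
    if is_healthy(port=8085):
        # /props lists models, server config, and default parameters.
        print(json.dumps(get_json("/props", port=8085), indent=2))
    else:
        print("Server not reachable on port 8085.")
```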

CLI tools (advanced)

For scripting without the server, ace-lm and ace-synth run as a two-stage pipeline:

# LM generates lyrics + codes
./build/ace-lm \
    --request /tmp/request.json \
    --lm models/acestep-5Hz-lm-4B-Q8_0.gguf

# DiT + VAE render to audio
./build/ace-synth \
    --request /tmp/request0.json \
    --embedding models/Qwen3-Embedding-0.6B-Q8_0.gguf \
    --dit models/acestep-v15-turbo-Q8_0.gguf \
    --vae models/vae-BF16.gguf

See docs/ARCHITECTURE.md for the full JSON reference, task types, batching, and understand pipeline.
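
The two commands above can also be driven from a script. The following is a sketch under stated assumptions: the helper names are hypothetical, only the flags shown in the example are used, and the path of the request file that ace-synth reads (request0.json in the example) is passed in explicitly rather than derived, since its naming rule is documented in docs/ARCHITECTURE.md and not here.

```python
import subprocess


def lm_command(request, lm, binary="./build/ace-lm"):
    """Argument list for the LM stage (lyrics + codes)."""
    return [binary, "--request", request, "--lm", lm]


def synth_command(request, embedding, dit, vae, binary="./build/ace-synth"):
    """Argument list for the DiT + VAE stage (codes to audio)."""
    return [binary, "--request", request,
            "--embedding", embedding, "--dit", dit, "--vae", vae]


def run_pipeline(lm_request, synth_request, lm, embedding, dit, vae):
    """Run both stages in order; check=True raises if either exits non-zero."""
    subprocess.run(lm_command(lm_request, lm), check=True)
    subprocess.run(synth_command(synth_request, embedding, dit, vae), check=True)


# Usage, mirroring the shell example (requires the built binaries and models):
# run_pipeline(
#     "/tmp/request.json",
#     "/tmp/request0.json",  # taken from the example; assumed to be the LM stage output
#     lm="models/acestep-5Hz-lm-4B-Q8_0.gguf",
#     embedding="models/Qwen3-Embedding-0.6B-Q8_0.gguf",
#     dit="models/acestep-v15-turbo-Q8_0.gguf",
#     vae="models/vae-BF16.gguf",
# )
```

Passing arguments as a list avoids shell quoting problems when model paths contain spaces.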

Technical documentation

docs/ARCHITECTURE.md covers the complete AceRequest JSON reference, all task types (text2music, cover, repaint, lego, extract, complete), FSM constrained decoding, custom GGML operators, quantization, and architecture internals.

Community

ACE-Step official documentation

  • A Musician's Guide - non-technical guide for music makers
  • Tutorial - design philosophy, model architecture, input control, inference hyperparameters

Third-party UIs for acestep.cpp

Samples

GGML.mp4
DiT-Only-SFT.mp4
ProcessJellyfin.mp4
Instrumental.mp4
House-IA.mp4

Acknowledgements

Independent C++ implementation based on ACE-Step 1.5 by ACE Studio and StepFun. All model weights are theirs; this is just a native backend.

@misc{gong2026acestep,
	title={ACE-Step 1.5: Pushing the Boundaries of Open-Source Music Generation},
	author={Junmin Gong and Yulin Song and Wenxiao Zhao and Sen Wang and Shengyuan Xu and Jing Guo},
	howpublished={\url{https://github.com/ace-step/ACE-Step-1.5}},
	year={2026},
	note={GitHub repository}
}

About

Portable C++17 implementation of ACE-Step 1.5 AI Music Generator using GGML. Text + lyrics in, stereo 48kHz MP3 or WAV out. Runs on CPU, CUDA, ROCm, Metal, Vulkan.
