petmycat/ComfyUI-gen2

ComfyUI-Gen2 Custom Nodes

Custom ComfyUI nodes for QwenImage ControlNet, plus a few quality-of-life utility nodes. The ControlNet nodes are designed to reproduce the output of VideoX-Fun's diffusers pipeline while leveraging ComfyUI's efficient model loading system.

Why This Implementation?

We integrate with ComfyUI's model loading nodes (Load Diffusion Model, Load CLIP, Load VAE) but use our own sampler and conditioning nodes. This approach was chosen because:

  1. ComfyUI's model loading is highly optimized - fast loading, memory efficient, supports quantized models (fp8, GGUF)
  2. VideoX's sampling pipeline has specific requirements - custom RoPE calculation, True CFG with norm rescaling, and packed 3D latent format that differ from ComfyUI's standard sampler
  3. Output matching - by replicating VideoX's forward logic while using ComfyUI's loaded weights, we get near-identical outputs for the same seed

Our nodes act as a bridge: ComfyUI handles the heavy lifting of model management, while we ensure the inference process matches VideoX exactly.
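To make "True CFG with norm rescaling" concrete, here is a minimal sketch in plain Python. It assumes the usual formulation: combine the conditional and unconditional predictions with a guidance scale, then rescale the result so its L2 norm matches that of the conditional prediction. The function name, the single-vector layout, and the eps guard are illustrative assumptions and are not taken from this repository or from VideoX-Fun.

```python
import math


def true_cfg_with_norm_rescale(cond, uncond, scale, eps=1e-12):
    """Guide one prediction vector, then restore the conditional norm.

    Hypothetical helper for illustration only.
    """
    # Classifier-free guidance: extrapolate from uncond toward cond.
    combined = [u + scale * (c - u) for c, u in zip(cond, uncond)]
    # L2 norms of the conditional prediction and of the guided result.
    cond_norm = math.sqrt(sum(c * c for c in cond))
    comb_norm = math.sqrt(sum(x * x for x in combined))
    # Rescale so guidance changes direction but not magnitude.
    factor = cond_norm / max(comb_norm, eps)
    return [x * factor for x in combined]


cond = [3.0, 4.0]      # conditional prediction (norm 5)
uncond = [1.0, 0.0]    # unconditional prediction
guided = true_cfg_with_norm_rescale(cond, uncond, scale=4.0)
print(guided)
```

In a real pipeline the same arithmetic would run on tensors, with the norm taken per token over the channel axis; the list version above only shows the order of operations.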

Credits

  • VideoX-Fun - The original QwenImage ControlNet implementation. Our pipeline logic is derived from their excellent work.
  • ComfyUI - The powerful and modular diffusion model GUI that makes this integration possible.

Installation

  1. Prerequisites - Install these custom node packs first:

    • VideoX-Fun - Required for model components and utilities
    • ComfyUI-GGUF - Required if using GGUF quantized models
  2. Install ComfyUI-Gen2:

    cd ComfyUI/custom_nodes
    git clone https://github.com/petmycat/ComfyUI-gen2.git
  3. Tokenizer - Download from Qwen-Image-2512 on HuggingFace:

    • Navigate to the model's files and download all files from the tokenizer/ folder
    • Place them in:
    ComfyUI/models/gen2/qwen_2512_tokenizer/
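After copying the files, a quick check like the following can confirm that the tokenizer folder exists and is not empty. This is a sketch using only the standard library; the helper name `tokenizer_files` is hypothetical, and the script deliberately does not assume which file names the tokenizer folder contains.

```python
from pathlib import Path


def tokenizer_files(comfyui_root):
    """List file names found in the expected tokenizer folder.

    Returns an empty list if the folder is missing or has no files.
    """
    tok_dir = Path(comfyui_root) / "models" / "gen2" / "qwen_2512_tokenizer"
    if not tok_dir.is_dir():
        return []
    return sorted(p.name for p in tok_dir.iterdir() if p.is_file())


# "ComfyUI" is assumed to be the install directory relative to the cwd.
found = tokenizer_files("ComfyUI")
if found:
    print("Tokenizer files:", ", ".join(found))
else:
    print("No tokenizer files found; check the folder path.")
```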
    

Example Workflow

Example workflow and reference images are located in:

  • workflows/qwen_control_example_workflow.json - Example ComfyUI workflow
  • assets/ - Reference images for testing (example (1).png, example (2).png)

Nodes

QwenImage ControlNet

| Node | Description |
| --- | --- |
| Gen2 Load QwenImage ControlNet | Load ControlNet weights |
| Gen2 Load QwenImage VAE | Load VAE with VideoX-compatible config |
| Gen2 Apply QwenImage ControlNet | Prepare control context and wrap model |
| Gen2 QwenImage Text Encode | VideoX-style text encoding (use instead of CLIPTextEncode) |
| Gen2 Load QwenImage LoRA | Load LoRA for VideoX-style merging |
| Gen2 QwenImage Control Sampler | VideoX-compatible sampling with True CFG |

Utilities

| Node | Description |
| --- | --- |
| Gen2 DWpose with Threshold | DWpose detector with configurable confidence thresholds for body/hand/face keypoints |
| Gen2 StringReplace | Replace all occurrences of a search string with a replacement string (case-sensitive) |
| Gen2 Checkerboard | Generate a checkerboard pattern image (1px black & white squares) at specified width × height |
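As an illustration of what a 1px checkerboard looks like in data terms, the sketch below builds one as a nested list of grayscale values. It is not the node's actual code: the function name is hypothetical, and starting with black in the top-left corner is an assumption.

```python
def checkerboard(width, height):
    """Return `height` rows of `width` pixels with alternating 1px squares.

    0 is black and 255 is white; the top-left pixel is black (assumed).
    """
    return [
        [255 if (x + y) % 2 else 0 for x in range(width)]
        for y in range(height)
    ]


board = checkerboard(4, 3)
for row in board:
    print(row)
```

Each pixel's color depends only on the parity of `x + y`, so neighbors in both directions always differ.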

Dtype Support

Supports multiple precision modes:

  • bf16/fp16 - Unquantized 16-bit models
  • fp8 - Quantized models (automatic compute dtype detection)
  • GGUF - Quantized models via ComfyUI-GGUF
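One way to picture "automatic compute dtype detection" is a rule that keeps 16-bit weights as they are and picks a 16-bit compute dtype when the stored weights are fp8. The sketch below is an assumption about the general idea, not this project's logic: the function name, the bf16 fallback, and the use of dtype names as plain strings are all hypothetical.

```python
# Storage dtypes treated as fp8 here (names follow PyTorch's fp8 dtypes).
FP8_STORAGE_DTYPES = {"float8_e4m3fn", "float8_e5m2"}


def pick_compute_dtype(weight_dtype, fallback="bfloat16"):
    """Choose the dtype used for arithmetic given the stored weight dtype.

    Hypothetical rule: fp8 weights are upcast to `fallback` for compute,
    anything else is computed in its own dtype.
    """
    if weight_dtype in FP8_STORAGE_DTYPES:
        return fallback
    return weight_dtype


for stored in ("bfloat16", "float16", "float8_e4m3fn"):
    print(stored, "->", pick_compute_dtype(stored))
```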

TODO

  • Add node parameter explanations for better user support (document what each parameter does in every node)
  • Integrate custom Load VAE node into ComfyUI system and add latent image input to sampler node
  • Decouple ControlNet node and sampler node
  • Add start and end step parameters to sampler node
  • Reorganize code for better maintenance — split into qwenimage/ (core + nodes) and misc_nodes/ (pose, string utils)

License

This project is licensed under the Apache License 2.0. It also follows the licensing requirements of its dependencies (VideoX-Fun, ComfyUI).

About

Some custom nodes made for ComfyUI to accommodate things like QwenImage 2512's ControlNet Union Fun model released by AlibabaPAI.
