qwen_image layercontrol v2 #1306

Open
mi804 wants to merge 3 commits into modelscope:main from mi804:layercontrol_v2
@mi804 (Collaborator) commented Feb 24, 2026

No description provided.

@gemini-code-assist (Contributor)

Summary of Changes

Hello @mi804, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed.

This pull request integrates the new Qwen-Image-Layered-Control-V2 model into the DiffSynth-Studio project. This model significantly expands image generation capabilities by introducing brush-controlled layer separation, building upon the existing text-guided functionality. The changes encompass updates to the core image processing pipeline, the addition of comprehensive example scripts for inference and training, and thorough documentation updates to reflect the new model's features and usage.

Highlights

  • New Model Release: The DiffSynth-Studio/Qwen-Image-Layered-Control-V2 model has been released, enhancing the Qwen-Image pipeline with brush-controlled layer separation capabilities.
  • Core Pipeline Enhancement: The QwenImageUnit_ContextImageEmbedder in the Qwen-Image pipeline was updated to support new input parameters, specifically layer_input_image, and handle context_image conversion for the V2 model.
  • New Examples and Training Scripts: Dedicated inference scripts (standard and low VRAM) and a LoRA training script have been added for the Qwen-Image-Layered-Control-V2 model.
  • Documentation Updates: Both English and Chinese README.md files, along with the Qwen-Image.md documentation, have been updated to announce the new V2 model and include links to its examples.
  • Training Script Refinement: The general training script (train.py) was modified to correctly parse context_image for dataset processing and to ensure models are initialized on the accelerator device.
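The RGBA conversion described for QwenImageUnit_ContextImageEmbedder can be sketched as follows. This is a minimal illustration and not the code from the PR: the helper name `prepare_context_image` is hypothetical, and it assumes `context_image` behaves like a `PIL.Image.Image` (it exposes `.mode` and `.convert(mode)`).

```python
from typing import Any, Optional


def prepare_context_image(context_image: Any, layer_input_image: Optional[Any] = None) -> Any:
    """Hypothetical helper, not the implementation in diffsynth/pipelines/qwen_image.py.

    When a layer input image is supplied, return the context image in RGBA mode;
    otherwise return it unchanged. `context_image` is assumed to be a PIL image
    (any object with `.mode` and `.convert(mode)` works).
    """
    # Nothing to convert, or layered control is not in use.
    if context_image is None or layer_input_image is None:
        return context_image
    # Already carries an alpha channel, so avoid a redundant copy.
    if context_image.mode == "RGBA":
        return context_image
    return context_image.convert("RGBA")
```

With Pillow installed, `prepare_context_image(Image.open(path), layer_input_image=layers)` would return an RGBA image, while omitting `layer_input_image` leaves the input untouched.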

Changelog
  • README.md
    • Announced the release of the Qwen-Image-Layered-Control-V2 model, highlighting its new brush-controlled layer separation feature.
    • Added a new entry to the Qwen-Image example code table for DiffSynth-Studio/Qwen-Image-Layered-Control-V2, including links to its inference and training scripts.
  • README_zh.md
    • Announced the release of the Qwen-Image-Layered-Control-V2 model in Chinese, detailing its new brush-controlled layer separation capability.
    • Added a new entry to the Chinese Qwen-Image example code table for DiffSynth-Studio/Qwen-Image-Layered-Control-V2, with links to its inference and training scripts.
  • diffsynth/pipelines/qwen_image.py
    • Modified QwenImageUnit_ContextImageEmbedder to include layer_input_image in its input parameters.
    • Added logic to convert context_image to RGBA format if layer_input_image is provided, ensuring compatibility with layered control.
  • docs/en/Model_Details/Qwen-Image.md
    • Added a new row to the model table for DiffSynth-Studio/Qwen-Image-Layered-Control-V2, providing links to its inference and LoRA training examples.
  • docs/zh/Model_Details/Qwen-Image.md
    • Added a new row to the Chinese model table for DiffSynth-Studio/Qwen-Image-Layered-Control-V2, providing links to its inference and LoRA training examples.
  • examples/qwen_image/model_inference/Qwen-Image-Layered-Control-V2.py
    • Added a new Python script demonstrating standard inference with the Qwen-Image-Layered-Control-V2 model, including examples for text-guided and mask-guided generation.
  • examples/qwen_image/model_inference_low_vram/Qwen-Image-Layered-Control-V2.py
    • Added a new Python script demonstrating low VRAM inference with the Qwen-Image-Layered-Control-V2 model, including VRAM optimization configurations.
  • examples/qwen_image/model_training/lora/Qwen-Image-Layered-Control-V2.sh
    • Added a new shell script for initiating LoRA training of the Qwen-Image-Layered-Control-V2 model, specifying dataset paths, model configurations, and training parameters.
  • examples/qwen_image/model_training/train.py
    • Updated the qwen_image_parser to correctly handle context_image as an input, allowing for both string paths and None values.
    • Modified the model initialization logic to always set the device to accelerator.device, removing conditional CPU initialization.
  • examples/qwen_image/model_training/validate_lora/Qwen-Image-Layered-Control-V2.py
    • Added a new Python script for validating the LoRA trained Qwen-Image-Layered-Control-V2 model, demonstrating inference with a trained checkpoint.
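One way a parser could accept `context_image` as either a string path or None is sketched below. The summary does not show the actual change to `qwen_image_parser`, so everything here is an assumption: `optional_path` and `build_parser` are hypothetical names, and treating the literal strings "None" and "" as a missing value is an illustrative choice.

```python
import argparse
from typing import Optional


def optional_path(value: Optional[str]) -> Optional[str]:
    """Hypothetical normalizer: map None, "" or "none" to None; keep other strings as paths."""
    if value is None:
        return None
    value = value.strip()
    if value == "" or value.lower() == "none":
        return None
    return value


def build_parser() -> argparse.ArgumentParser:
    # Illustrative parser only; this is not the PR's qwen_image_parser.
    parser = argparse.ArgumentParser(description="context_image parsing sketch")
    parser.add_argument(
        "--context_image",
        type=optional_path,  # argparse applies this to the raw command-line string
        default=None,        # a non-string default is passed through unchanged
        help="Path to a context image, or 'None' to disable it.",
    )
    return parser
```

For example, `build_parser().parse_args(["--context_image", "None"]).context_image` evaluates to `None`, while a real path string is returned as given.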

@gemini-code-assist gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request introduces support for the Qwen-Image-Layered-Control-V2 model, including new example scripts for inference, low-VRAM inference, LoRA training, and validation. It also updates documentation and makes necessary adjustments to the pipeline and training scripts to accommodate the new model's features, such as brush-controlled layer separation.

My review found one potential issue in the training script where a change might break the initialize_model_on_cpu functionality, which is important for large model training setups. Otherwise, the changes look good and effectively add support for the new model.
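The concern can be illustrated with a small sketch of conditional device selection. The review text does not include the diff, so this is an assumption about what preserving the option could look like: `select_init_device` is a hypothetical helper, and devices are represented as plain strings rather than `torch.device` objects.

```python
def select_init_device(accelerator_device: str, initialize_model_on_cpu: bool) -> str:
    """Hypothetical helper: choose where the model is first materialized.

    Always returning `accelerator_device` would silently ignore the
    `initialize_model_on_cpu` option; keeping the branch preserves it.
    """
    if initialize_model_on_cpu:
        return "cpu"
    return accelerator_device
```

Under this sketch, `select_init_device("cuda:0", initialize_model_on_cpu=True)` returns `"cpu"`, so a large model can be loaded in host memory before any later move to the accelerator.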

mi804 and others added 2 commits February 24, 2026 15:26
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Labels

None yet

Projects

None yet

1 participant