add: inference-time-scaling #1270

Open
yjy415 wants to merge 3 commits into modelscope:main from yjy415:ses
Conversation

@yjy415
Collaborator

@yjy415 yjy415 commented Feb 6, 2026

No description provided.

@gemini-code-assist
Contributor

Summary of Changes

Hello @yjy415, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces the Spectral Evolution Search (SES) algorithm, a novel inference-time scaling technique designed to enhance the quality of generated images without requiring model retraining. It integrates SES into several existing image generation pipelines, allowing users to optimize initial noise through an evolutionary search guided by reward models. This feature provides a trade-off between computational cost and image quality, offering greater control over the generation process and improving the fidelity of generated outputs.

Highlights

  • New Feature: Inference-Time Scaling (SES): Introduced the Spectral Evolution Search (SES) algorithm, allowing users to enhance image generation quality during inference by optimizing initial noise without retraining models. This feature trades computational cost for improved output quality.
  • Core SES Implementation: Added a new utility module diffsynth/utils/inference_time_scaling/ses.py which includes functions for wavelet transforms (split_dwt, reconstruct_dwt), a SESRewardScorer class to integrate various reward models (PickScore, CLIP, HPSv2), and the run_ses_cem function for Cross-Entropy Method-based latent optimization.
  • Pipeline Integration: Integrated SES functionality into several key image generation pipelines, including Flux2ImagePipeline, FluxImagePipeline, QwenImagePipeline, and ZImagePipeline. This involves adding new parameters (enable_ses, ses_reward_model, ses_eval_budget, ses_inference_steps) to their __call__ methods and incorporating the SES optimization logic.
  • Documentation and Examples: Provided comprehensive documentation in both English and Chinese (docs/en/Research_Tutorial/inference_time_scaling.md, docs/zh/Research_Tutorial/inference_time_scaling.md) explaining SES principles, usage, parameters, and supported models. Additionally, new example scripts demonstrate how to utilize SES with various models.
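
To illustrate the wavelet step mentioned in the highlights, here is a minimal sketch of a single-level 2D Haar transform and its inverse in NumPy. The names `haar_split` and `haar_reconstruct` are hypothetical and are not the PR's `split_dwt` / `reconstruct_dwt`, whose signatures, wavelet choice, and decomposition depth are not shown in this summary.

```python
import numpy as np

def haar_split(x):
    """Single-level orthonormal 2D Haar transform over the last two axes.

    Assumes the last two dimensions are even. Returns the low-frequency
    band and a tuple of the three high-frequency bands.
    """
    a = x[..., 0::2, 0::2]  # top-left sample of each 2x2 block
    b = x[..., 0::2, 1::2]  # top-right
    c = x[..., 1::2, 0::2]  # bottom-left
    d = x[..., 1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0
    lh = (a - b + c - d) / 2.0
    hl = (a + b - c - d) / 2.0
    hh = (a - b - c + d) / 2.0
    return ll, (lh, hl, hh)

def haar_reconstruct(ll, highs):
    """Inverse of haar_split: rebuild the full-resolution array."""
    lh, hl, hh = highs
    shape = ll.shape[:-2] + (ll.shape[-2] * 2, ll.shape[-1] * 2)
    out = np.empty(shape, dtype=ll.dtype)
    out[..., 0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    out[..., 0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    out[..., 1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    out[..., 1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return out

# Round trip on a noise tensor shaped like a small latent (channels, H, W).
noise = np.random.default_rng(0).standard_normal((4, 8, 8))
low, high = haar_split(noise)
restored = haar_reconstruct(low, high)
```

Because this transform is orthonormal, the round trip is exact up to floating-point error. A search could, for example, modify only some bands and reconstruct the full noise afterwards; which bands the PR actually searches over is not stated here.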

Changelog
  • diffsynth/pipelines/flux2_image.py
    • Imported run_ses_cem and SESRewardScorer for SES functionality.
    • Added enable_ses, ses_reward_model, ses_eval_budget, and ses_inference_steps parameters to the __call__ method.
    • Implemented conditional SES logic within __call__ to optimize initial latents using a defined callback and reward scorer.
  • diffsynth/pipelines/flux_image.py
    • Imported run_ses_cem and SESRewardScorer for SES functionality.
    • Added enable_ses, ses_reward_model, ses_eval_budget, and ses_inference_steps parameters to the __call__ method.
    • Implemented conditional SES logic within __call__ to optimize initial latents using a defined callback and reward scorer.
  • diffsynth/pipelines/qwen_image.py
    • Imported run_ses_cem and SESRewardScorer for SES functionality.
    • Added enable_ses, ses_reward_model, ses_eval_budget, and ses_inference_steps parameters to the __call__ method.
    • Implemented conditional SES logic within __call__ to optimize initial latents using a defined callback and reward scorer.
  • diffsynth/pipelines/z_image.py
    • Imported run_ses_cem and SESRewardScorer for SES functionality.
    • Added enable_ses, ses_reward_model, ses_eval_budget, and ses_inference_steps parameters to the __call__ method.
    • Implemented conditional SES logic within __call__ to optimize initial latents using a defined callback and reward scorer.
  • diffsynth/utils/inference_time_scaling/ses.py
    • New file created to house the core Spectral Evolution Search (SES) logic.
    • Includes split_dwt and reconstruct_dwt functions for discrete wavelet transforms.
    • Defines SESRewardScorer class to load and utilize various reward models (PickScore, CLIP, HPSv2) for image evaluation.
    • Contains the run_ses_cem function, which implements the Cross-Entropy Method for optimizing latent representations based on reward scores.
  • docs/en/Research_Tutorial/inference_time_scaling.md
    • New English documentation detailing the Inference-Time Scaling (SES) feature.
    • Explains the basic principles of SES, its search space, evolutionary strategy, and reward feedback mechanism.
    • Provides quick start instructions, core parameter details, a list of supported models, and effect demonstrations.
  • docs/zh/Research_Tutorial/inference_time_scaling.md
    • New Chinese documentation, a direct translation of the English SES tutorial.
    • Covers the same content as the English version, including principles, usage, parameters, supported models, and demonstrations.
  • examples/flux/model_inference/FLUX.1-dev-SES.py
    • New example script demonstrating how to use SES with the FLUX.1-dev pipeline.
  • examples/flux2/model_inference/FLUX.2-dev-SES.py
    • New example script demonstrating how to use SES with the FLUX.2-dev pipeline.
  • examples/qwen_image/model_inference/Qwen-Image-SES.py
    • New example script demonstrating how to use SES with the Qwen-Image pipeline.
  • examples/z_image/model_inference/Z-Image-SES.py
    • New example script demonstrating how to use SES with the Z-Image pipeline.
  • examples/z_image/model_inference/Z-Image-Turbo-SES.py
    • New example script demonstrating how to use SES with the Z-Image-Turbo pipeline.
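
The changelog describes `run_ses_cem` as a Cross-Entropy Method optimizer driven by reward scores. The sketch below shows the generic technique on a toy problem, not the PR's implementation: `cem_search`, its parameters, and the quadratic `toy_score` are all assumptions made for illustration, and a real reward model would replace the scoring function.

```python
import numpy as np

def cem_search(score_fn, dim, eval_budget=400, population=20,
               elite_frac=0.25, seed=0):
    """Generic Cross-Entropy Method maximizing score_fn over R^dim.

    Starts from a standard normal, and at each generation refits the
    sampling mean and standard deviation to the top-scoring candidates.
    """
    rng = np.random.default_rng(seed)
    mean = np.zeros(dim)
    std = np.ones(dim)
    n_elite = max(1, int(population * elite_frac))
    best_x, best_score = None, -np.inf
    evals = 0
    while evals + population <= eval_budget:
        candidates = mean + std * rng.standard_normal((population, dim))
        scores = np.array([score_fn(c) for c in candidates])
        evals += population
        order = np.argsort(scores)[::-1]          # highest score first
        if scores[order[0]] > best_score:
            best_score = float(scores[order[0]])
            best_x = candidates[order[0]].copy()
        elites = candidates[order[:n_elite]]
        mean = elites.mean(axis=0)
        std = elites.std(axis=0) + 1e-3           # floor keeps sampling alive
    return best_x, best_score

# Toy reward (hypothetical): higher is better, peak at 0.5 in every dimension.
target = np.full(4, 0.5)

def toy_score(x):
    return -float(np.sum((x - target) ** 2))

best_x, best_score = cem_search(toy_score, dim=4)
```

The evaluation budget caps the total number of scoring calls, which is where the cost of this kind of search comes from when each call involves generating and scoring an image.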
Activity
  • The user yjy415 initiated this pull request to introduce inference-time scaling capabilities.
  • A new utility module diffsynth/utils/inference_time_scaling/ses.py was added, containing the core logic for Spectral Evolution Search (SES), including wavelet transforms, reward scoring, and Cross-Entropy Method optimization.
  • The SES feature has been integrated into multiple existing image generation pipelines (Flux, Qwen, Z-Image), allowing for dynamic quality improvement during inference.
  • Comprehensive documentation in both English and Chinese, along with several example scripts, was added to guide users on how to leverage this new functionality.
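
As a rough picture of how an `enable_ses` switch might gate a search over the initial noise, consider the toy sketch below. Everything in it is hypothetical: `toy_generate` and `toy_reward` stand in for a real pipeline and reward model, the default values are invented, plain random search replaces the Cross-Entropy Method for brevity, and `ses_reward_model` is omitted. It also assumes that `ses_inference_steps` is a reduced step count used while scoring candidates, which this summary does not state.

```python
import numpy as np

def toy_generate(latent, steps):
    """Stand-in for a denoising loop (hypothetical, not a real pipeline)."""
    return np.tanh(latent) * (1.0 - 1.0 / (steps + 1))

def toy_reward(image):
    """Stand-in for a reward model (hypothetical): prefers brighter output."""
    return float(image.mean())

def generate(seed, num_inference_steps=30, enable_ses=False,
             ses_eval_budget=16, ses_inference_steps=5):
    rng = np.random.default_rng(seed)
    latent = rng.standard_normal(16)
    if enable_ses:
        # Score the default noise first, then try alternatives within budget.
        best_score = toy_reward(toy_generate(latent, ses_inference_steps))
        for _ in range(ses_eval_budget - 1):
            candidate = rng.standard_normal(16)
            score = toy_reward(toy_generate(candidate, ses_inference_steps))
            if score > best_score:
                latent, best_score = candidate, score
    # The final render always uses the full step count.
    return toy_generate(latent, num_inference_steps)

baseline = generate(seed=0)
searched = generate(seed=0, enable_ses=True)
```

With the flag off, the function behaves as before; with it on, the extra cost is roughly `ses_eval_budget` cheap evaluations before the final render.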
Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist[bot]

This comment was marked as resolved.
