Speed up FlowMatchEulerDiscreteScheduler index_for_timestep (#9417) #2
Open
srlynch1 wants to merge 1 commit into
Summary
- `FlowMatchEulerDiscreteScheduler.index_for_timestep`: remove per-element `nonzero()` calls in the `scale_noise` training hot path
- `scale_noise`: batched index lookup

Resolves huggingface#9417 (eval e2e run 2026-06-21-r2).
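For reviewers, a rough sketch of what a batched `scale_noise` could look like once the step indices are available as a single tensor. This is not the code in this PR: the function name `scale_noise_batched`, its signature, and the interpolation formula `sigma * noise + (1 - sigma) * sample` are assumptions made for illustration.

```python
import torch


def scale_noise_batched(sample, noise, sigmas, step_indices):
    """Hypothetical helper: blend sample and noise using one gather over sigmas.

    sample, noise: tensors of shape (batch, ...)
    sigmas:        1-D tensor holding the schedule's sigma values
    step_indices:  1-D integer tensor of shape (batch,), one index per element
    """
    # One indexing op replaces a Python loop over the batch.
    sigma = sigmas[step_indices].to(sample.dtype)
    # Append trailing singleton dims so sigma broadcasts against sample.
    while sigma.ndim < sample.ndim:
        sigma = sigma.unsqueeze(-1)
    # Assumed flow-matching interpolation between the clean sample and noise.
    return sigma * noise + (1.0 - sigma) * sample
```

The point of the sketch is that the per-element work reduces to a single tensor index followed by a broadcast, so the cost no longer grows with a Python-level loop over the batch.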
Test plan
- `pytest tests/schedulers/test_scheduler_flow_match_euler_discrete.py` (4/4 pass)
- `ruff check` / `ruff format --check` on changed files
- `python utils/check_copies.py` (0 drift)

Note
Low Risk
Changes are localized to one scheduler’s index lookup and noise scaling; behavior is guarded by parity tests against the prior implementation.
Overview
Speeds up training-style batch calls to `FlowMatchEulerDiscreteScheduler.scale_noise` by replacing per-timestep index lookups with a single batched path.

`index_for_timestep` now resolves schedule indices with vectorized equality/`argmax` over a 1-D timestep tensor (still returning a scalar `int` for scalar inputs), including the existing rule that picks the second matching index when a timestep appears more than once. `scale_noise` calls that helper once for the full batch instead of building indices in a Python loop.

New scheduler tests compare against a legacy `nonzero()` reference for several shift settings, verify batched `scale_noise` output matches the old behavior, and assert the optimized path is faster on a large training-like batch.

Reviewed by Cursor Bugbot for commit 3feb4d4. Bugbot is set up for automated code reviews on this repo.
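To make the equality/`argmax` idea concrete, here is a minimal sketch of how a batched lookup with the "second match wins" rule might be written, next to a per-timestep `nonzero()` reference. Both function names and bodies are illustrative assumptions, not the PR's code; in particular, the batched version returns index 0 for a timestep that is absent from the schedule, whereas the reference raises.

```python
import torch


def legacy_index_for_timestep(timestep, schedule_timesteps):
    # Per-timestep reference: one nonzero() call per lookup.
    indices = (schedule_timesteps == timestep).nonzero()
    # Pick the second match when the timestep appears more than once.
    pos = 1 if len(indices) > 1 else 0
    return indices[pos].item()


def batched_index_for_timestep(timesteps, schedule_timesteps):
    # Boolean match matrix of shape (batch, schedule_len).
    matches = timesteps.unsqueeze(1) == schedule_timesteps.unsqueeze(0)
    # argmax returns the first maximal position, i.e. the first match per row.
    first = matches.int().argmax(dim=1)
    # Mask out the first match, then look for a second one.
    rest = matches.clone()
    rows = torch.arange(timesteps.shape[0], device=timesteps.device)
    rest[rows, first] = False
    has_second = rest.any(dim=1)
    second = rest.int().argmax(dim=1)
    # Use the second match where one exists, otherwise the first.
    return torch.where(has_second, second, first)


schedule = torch.tensor([1000.0, 750.0, 750.0, 500.0, 250.0])
batch = torch.tensor([750.0, 250.0, 1000.0])
batched = batched_index_for_timestep(batch, schedule)
reference = [legacy_index_for_timestep(t, schedule) for t in batch]
print(batched.tolist(), reference)  # [2, 4, 0] [2, 4, 0]
```

A parity check of this shape, run over several schedules, is one way to guard the duplicate-timestep rule when the lookup is vectorized.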