[Feat] Extend EMA schedules #26
Merged
87003697 pushed a commit to 87003697/Flow-Factory that referenced this pull request Apr 17, 2026
…ctured rewards

Replace the ad-hoc scalar UnifiedReward family with the structured pointwise family that mirrors upstream UnifiedReward-2.0's official ACS (image) and APS (video) API prompts, so Flow-Factory can drive GRPO/NFT/AWM training against a vLLM-served UnifiedReward-2.0-qwen3vl scorer.

Rewards layer (src/flow_factory/rewards/unified_reward.py):
- Keep UnifiedRewardAPIBase (OpenAI-compatible transport, semaphore, retries, FIFO text cache) and document the _replace_nan_with_mean batch-mean fallback as a Constraint X-GenGroup#26 fail-fast exemption matching Pref-GRPO's policy.
- Delete the scalar family (UnifiedRewardScalarPointwiseBase, UnifiedRewardImageGenRewardModel, UnifiedRewardVideoGenRewardModel) -- no matching upstream 2.0 prompt and no users.
- UnifiedRewardStructuredPointwiseBase now owns _pack_results: packs (aggregated, per_axis) tuples into RewardModelOutput, filling aggregated NaNs with the batch mean and exposing per-axis scores as {axis}_scores in extra_info.
- UnifiedRewardImageGenACSRewardModel (unified_reward_image_acs): aligned to PointwiseRewardModel.__call__ signature, factored into _build_cache_key / _build_messages / _score_single. Prompt template uses a __PROMPT__ placeholder with str.replace so user captions with braces no longer KeyError.
- UnifiedRewardVideoGenAPSRewardModel (unified_reward_video_aps): adds max_frames (default 16, matches the upstream APS script) and np.linspace-based uniform frame sampling; supports I2V by accepting condition_images and prepending the first reference image to the frame sequence (its hash is folded into the cache key so different references do not alias).

Registry + examples:
- registry.py drops the two deprecated keys; only unified_reward_image_acs / unified_reward_video_aps are exposed.
- Add examples/grpo/lora/flux1_unified_reward_t2i.yaml (FLUX.1 T2I ACS) and examples/grpo/lora/wan21_i2v_unified_reward.yaml (Wan2.1 I2V APS with condition_images + max_frames: 16), both with async_reward: true and a documented prerequisite comment block for spinning up the vLLM server.

Docs:
- guidance/rewards.md: trim scalar-family rows from the reward table, simplify the class hierarchy diagram, add the APS max_frames note and the I2V condition_images behaviour.
- .agents/knowledge/architecture.md: drop deprecated rows from the reward registry table.

Verification:
- Python smoke test (CUDA_VISIBLE_DEVICES="" against a live CodeGoat24/UnifiedReward-2.0-qwen3vl-8b vLLM server on :8080) returns rewards in [0,1] and non-zero alignment / coherence / style scores for both samples, no NaN warnings.
- ff-train on examples/grpo/lora/flux1_unified_reward_t2i.yaml runs through the first epoch cleanly:
  * Step 0000 eval/reward_unified_reward_image_acs mean=0.6613 std=0.0421
  * Step 0000 train/reward_unified_reward_image_acs mean=0.6399 std=0.0481 zero_std_ratio=0
  * Async rewards path engaged; no UnifiedReward API failure warnings observed.

Made-with: Cursor
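Three behaviours named in the commit message above (the `__PROMPT__` placeholder filled with `str.replace`, the batch-mean fallback for NaN scores, and uniform frame sampling capped at `max_frames`) can be sketched in a few lines. This is a minimal illustration and not the code from the commit: the template text and all function names are hypothetical, the all-NaN fallback to 0.0 is an assumption, and the index computation is a standard-library stand-in for the `np.linspace`-based sampling the message mentions.

```python
import math

# Hypothetical template text; only the __PROMPT__ placeholder comes from the commit message.
PROMPT_TEMPLATE = "Rate the image for the caption: __PROMPT__"


def build_prompt(caption: str) -> str:
    """Fill the placeholder with str.replace.

    Unlike str.format, str.replace does not interpret braces, so a caption
    such as "a {red} cat" cannot trigger a KeyError.
    """
    return PROMPT_TEMPLATE.replace("__PROMPT__", caption)


def replace_nan_with_mean(scores: list[float]) -> list[float]:
    """Replace NaN entries with the mean of the valid entries in the batch."""
    valid = [s for s in scores if not math.isnan(s)]
    # Assumption: if every score is NaN, fall back to 0.0.
    mean = sum(valid) / len(valid) if valid else 0.0
    return [mean if math.isnan(s) else s for s in scores]


def uniform_frame_indices(num_frames: int, max_frames: int = 16) -> list[int]:
    """Pick at most max_frames evenly spaced frame indices, endpoints included."""
    if num_frames <= max_frames:
        return list(range(num_frames))
    if max_frames == 1:
        return [0]
    # Evenly spaced positions over [0, num_frames - 1], rounded to integers.
    step = (num_frames - 1) / (max_frames - 1)
    return [round(i * step) for i in range(max_frames)]
```

For I2V, the commit message says the first reference image is prepended to the sampled frames; in a sketch like this one, that would be a list concatenation applied after `uniform_frame_indices` selects the frames.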
Jayce-Ping added a commit that referenced this pull request Jun 27, 2026
… plugin layer (lossless)

Introduce a registry-based acceleration plugin layer that respects the algorithm/model decoupling, plus the first lossless accelerators.

- acceleration/: BaseAccelerator (safety/stage contract), registry with direct-path fallback, paradigm-gated validator, CompileAccelerator (torch.compile, regional/full), AttentionBackendAccelerator (exact backends).
- hparams: AccelerationArguments with shared/rollout slots; wired into Arguments (field + nested_map) and exported. Off by default (backward compatible).
- trainers: BaseTrainer builds and validates accelerators after prepare, applies the shared accelerator via setup() and wraps the Stage-3 rollout loop with the rollout accelerator context; per-trainer paradigm tags (coupled/decoupled/distillation) drive the lossy-safety gate (constraints.md #7, #20a, #26).
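A registry with a pass-through fallback, as the commit message above describes at a high level, might look roughly like the sketch below. Everything here is an assumption made for illustration: the class and function names other than `BaseAccelerator` and `setup()` are hypothetical, "direct-path fallback" is read as "no accelerator configured means the model is used unchanged", and the real contract in the commit is not shown on this page.

```python
from contextlib import nullcontext
from typing import Callable, Dict, Optional, Type

# Hypothetical module-level registry mapping a config name to an accelerator class.
_ACCELERATORS: Dict[str, Type["BaseAccelerator"]] = {}


class BaseAccelerator:
    """Pass-through accelerator: leaves the model and the rollout loop untouched."""

    def setup(self, model):
        # Direct path: return the model unchanged.
        return model

    def rollout_context(self):
        # Hypothetical hook: a context manager wrapped around the rollout loop.
        return nullcontext()


def register(name: str) -> Callable[[Type[BaseAccelerator]], Type[BaseAccelerator]]:
    """Class decorator that records an accelerator under a config name."""
    def decorator(cls: Type[BaseAccelerator]) -> Type[BaseAccelerator]:
        _ACCELERATORS[name] = cls
        return cls
    return decorator


def build_accelerator(name: Optional[str]) -> BaseAccelerator:
    """Return the configured accelerator, or the pass-through when none is set."""
    if name is None:
        return BaseAccelerator()
    if name not in _ACCELERATORS:
        # Assumption: an unknown name is a configuration error, not a silent fallback.
        raise ValueError(f"unknown accelerator: {name!r}")
    return _ACCELERATORS[name]()


@register("tagging_example")
class TaggingAccelerator(BaseAccelerator):
    """Toy accelerator used only to show that setup() is applied."""

    def setup(self, model):
        return ("accelerated", model)
```

With this shape, a trainer could call `build_accelerator(cfg_name).setup(model)` once and wrap its rollout loop in `with accelerator.rollout_context():`, and an unset configuration would behave exactly as before.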
4 tasks
Jayce-Ping added a commit to Jayce-Ping/Flow-Factory-Private that referenced this pull request Jul 2, 2026
* Update ema
No description provided.