[Feat] Extend EMA schedules by Jayce-Ping · Pull Request #26 · X-GenGroup/Flow-Factory

Jayce-Ping · 2026-02-05T12:02:20Z

No description provided.

…ctured rewards

Replace the ad-hoc scalar UnifiedReward family with the structured pointwise family that mirrors upstream UnifiedReward-2.0's official ACS (image) and APS (video) API prompts, so Flow-Factory can drive GRPO/NFT/AWM training against a vLLM-served UnifiedReward-2.0-qwen3vl scorer.

Rewards layer (src/flow_factory/rewards/unified_reward.py):
- Keep UnifiedRewardAPIBase (OpenAI-compatible transport, semaphore, retries, FIFO text cache) and document the _replace_nan_with_mean batch-mean fallback as a Constraint X-GenGroup#26 fail-fast exemption matching Pref-GRPO's policy.
- Delete the scalar family (UnifiedRewardScalarPointwiseBase, UnifiedRewardImageGenRewardModel, UnifiedRewardVideoGenRewardModel) -- no matching upstream 2.0 prompt and no users.
- UnifiedRewardStructuredPointwiseBase now owns _pack_results: packs (aggregated, per_axis) tuples into RewardModelOutput, filling aggregated NaNs with the batch mean and exposing per-axis scores as {axis}_scores in extra_info.
- UnifiedRewardImageGenACSRewardModel (unified_reward_image_acs): aligned to PointwiseRewardModel.__call__ signature, factored into _build_cache_key / _build_messages / _score_single. Prompt template uses a __PROMPT__ placeholder with str.replace so user captions with braces no longer KeyError.
- UnifiedRewardVideoGenAPSRewardModel (unified_reward_video_aps): adds max_frames (default 16, matches the upstream APS script) and np.linspace-based uniform frame sampling; supports I2V by accepting condition_images and prepending the first reference image to the frame sequence (its hash is folded into the cache key so different references do not alias).

Registry + examples:
- registry.py drops the two deprecated keys; only unified_reward_image_acs / unified_reward_video_aps are exposed.
- Add examples/grpo/lora/flux1_unified_reward_t2i.yaml (FLUX.1 T2I ACS) and examples/grpo/lora/wan21_i2v_unified_reward.yaml (Wan2.1 I2V APS with condition_images + max_frames: 16), both with async_reward: true and a documented prerequisite comment block for spinning up the vLLM server.

Docs:
- guidance/rewards.md: trim scalar-family rows from the reward table, simplify the class hierarchy diagram, add the APS max_frames note and the I2V condition_images behaviour.
- .agents/knowledge/architecture.md: drop deprecated rows from the reward registry table.

Verification:
- Python smoke test (CUDA_VISIBLE_DEVICES="" against a live CodeGoat24/UnifiedReward-2.0-qwen3vl-8b vLLM server on :8080) returns rewards in [0,1] and non-zero alignment / coherence / style scores for both samples, no NaN warnings.
- ff-train on examples/grpo/lora/flux1_unified_reward_t2i.yaml runs through the first epoch cleanly:
  * Step 0000 eval/reward_unified_reward_image_acs mean=0.6613 std=0.0421
  * Step 0000 train/reward_unified_reward_image_acs mean=0.6399 std=0.0481 zero_std_ratio=0
  * Async rewards path engaged; no UnifiedReward API failure warnings observed.

Made-with: Cursor
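Three behaviours named in the commit message above (brace-safe prompt substitution, np.linspace-based frame sampling, and the batch-mean NaN fallback) can be sketched in a few lines. This is only an illustration, not the code in unified_reward.py: the template text, the function names, the rounding of sampled indices, and the 0.0 fallback for an all-NaN batch are assumptions made for the example.

```python
import math

import numpy as np

# Hypothetical template; the real ACS prompt text is not shown in this thread.
PROMPT_TEMPLATE = "Evaluate the image against this caption: __PROMPT__"


def fill_prompt(template: str, caption: str) -> str:
    """Insert the caption with str.replace, so braces in it are left untouched.

    str.format would try to interpret "{red}" as a field name and raise KeyError.
    """
    return template.replace("__PROMPT__", caption)


def sample_frame_indices(num_frames: int, max_frames: int = 16) -> list:
    """Pick at most max_frames indices spread uniformly over the clip."""
    if num_frames <= max_frames:
        return list(range(num_frames))
    # Assumption: round the evenly spaced positions to the nearest frame index.
    return np.linspace(0, num_frames - 1, max_frames).round().astype(int).tolist()


def replace_nan_with_mean(scores: list) -> list:
    """Fill NaN scores with the mean of the valid scores in the same batch."""
    valid = [s for s in scores if not math.isnan(s)]
    # Assumption: use 0.0 when every score in the batch is NaN.
    fill = sum(valid) / len(valid) if valid else 0.0
    return [fill if math.isnan(s) else s for s in scores]
```

With these helpers, a caption such as "a {red} cube" passes through unchanged, a 32-frame clip is reduced to 16 indices that start at 0 and end at 31, and a batch like [0.2, NaN, 0.4] has its missing entry filled with roughly 0.3.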

… plugin layer (lossless)

Introduce a registry-based acceleration plugin layer that respects the algorithm/model decoupling, plus the first lossless accelerators.

- acceleration/: BaseAccelerator (safety/stage contract), registry with direct-path fallback, paradigm-gated validator, CompileAccelerator (torch.compile, regional/full), AttentionBackendAccelerator (exact backends).
- hparams: AccelerationArguments with shared/rollout slots; wired into Arguments (field + nested_map) and exported. Off by default (backward compatible).
- trainers: BaseTrainer builds and validates accelerators after prepare, applies the shared accelerator via setup() and wraps the Stage-3 rollout loop with the rollout accelerator context; per-trainer paradigm tags (coupled/decoupled/distillation) drive the lossy-safety gate (constraints.md #7, #20a, #26).
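The "registry with direct-path fallback" idea can be pictured roughly as follows. This is a hypothetical sketch, not the acceleration/ package: the names register, build_accelerator, and RecordingAccelerator are invented here, and it assumes the fallback simply means running the rollout loop unchanged when no accelerator is configured.

```python
from contextlib import nullcontext
from typing import Callable, ContextManager, Dict, Optional

# Hypothetical registry: maps a config name to a factory that returns a context manager.
_ACCELERATORS: Dict[str, Callable[[], ContextManager]] = {}


def register(name: str):
    """Decorator that records an accelerator factory under the given name."""
    def decorator(factory):
        _ACCELERATORS[name] = factory
        return factory
    return decorator


def build_accelerator(name: Optional[str]) -> ContextManager:
    """Return the named accelerator, or a no-op context when none is configured."""
    if name is None:
        # Assumed direct path: the wrapped loop runs exactly as it did before.
        return nullcontext()
    if name not in _ACCELERATORS:
        raise KeyError(f"unknown accelerator: {name!r}")
    return _ACCELERATORS[name]()


@register("recording")
class RecordingAccelerator:
    """Toy accelerator that only tracks whether its context is active."""

    def __init__(self) -> None:
        self.active = False

    def __enter__(self) -> "RecordingAccelerator":
        self.active = True
        return self

    def __exit__(self, *exc) -> bool:
        self.active = False
        return False  # never swallow exceptions from the wrapped loop
```

A trainer following this pattern could write `with build_accelerator(cfg_name):` around its rollout loop and get identical behaviour to the unaccelerated code whenever cfg_name is None, which is one way to keep such a feature off by default.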

* Update ema

Jayce-Ping force-pushed the main branch from a0a208d to a29e177 on February 5, 2026 12:25

Jayce-Ping added 2 commits February 5, 2026 20:35

Update ema

b278ad3

Add file path

2c6e819

Jayce-Ping force-pushed the ema branch from b15cf6a to 2c6e819 on February 5, 2026 12:38

Jayce-Ping added 4 commits February 5, 2026 21:19

remove comment

2b6424e

Update config

36eff11

Update config

cf59833

Fix filepath

664eb13

Jayce-Ping merged commit 857553b into main Feb 6, 2026

Jayce-Ping deleted the ema branch February 6, 2026 23:26

Jayce-Ping mentioned this pull request Jun 29, 2026

[acceleration] Model-agnostic acceleration plugin layer (torch.compile / attention backend / feature caching) #196

Open

4 tasks

Jayce-Ping added a commit to Jayce-Ping/Flow-Factory-Private that referenced this pull request Jul 2, 2026

[Feat] Extend EMA schedules (X-GenGroup#26)

71bca25

* Update ema

Jayce-Ping merged 6 commits into main from ema