[docs,hparams] docs: sync agent harness (.agents/.cursor/guidance) with implementation#185

Merged: Jayce-Ping merged 1 commit into main from docs/sync-harness-with-impl on Jun 14, 2026

@Jayce-Ping

Collaborator

Summary

Resyncs the AI-agent harness with the current src/flow_factory/ implementation after plugin growth (now 9 trainers, 14 model adapters, 13 reward models). A full audit found ~40 discrepancies; this PR fixes registry drift, wrong config/API facts, and broken cross-references. Almost entirely docs/harness; one zero-risk type-hint correction in hparams/model_args.py.

Registry drift

  • architecture.md / AGENTS.md: add diffusion-opd trainer, clap/imagebind/geneval rewards, Bagel/LTX2 models; fix RationalRewardsT2IRewardModel / RationalRewardsEditRewardModel class names.
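
A drift check of this kind can be sketched as a set comparison between the names a doc table lists and the keys a registry exposes. This is a minimal illustration, not the audit actually used for this PR: `registry_drift` is a hypothetical helper, and how the documented and registered names are collected is left to the caller.

```python
from __future__ import annotations

from typing import Iterable


def registry_drift(documented: Iterable[str], registered: Iterable[str]) -> dict[str, list[str]]:
    """Compare names listed in a doc table against keys in a registry.

    Both inputs are plain iterables of strings; extracting them from the
    markdown table and from the Python registry is out of scope here.
    """
    documented_set, registered_set = set(documented), set(registered)
    return {
        # In the registry but missing from the docs (e.g. a newly added plugin).
        "undocumented": sorted(registered_set - documented_set),
        # In the docs but no longer in the registry (renamed or removed).
        "stale": sorted(documented_set - registered_set),
    }


if __name__ == "__main__":
    # Illustrative inputs only; the real registries hold more entries.
    print(registry_drift(documented=["grpo"], registered=["grpo", "diffusion-opd"]))
```

Reporting the two directions separately keeps "add a row" fixes apart from "rename or delete a row" fixes.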

Wrong facts that could mislead agents or break YAML

  • guidance/algorithms.md: train.dynamics_type/train_steps/num_train_steps -> scheduler.dynamics_type/sde_steps/num_sde_steps (verified against examples/).
  • guidance/workflow.md: snippets now show the real sample() (which delegates to generate_samples) and compute_advantages(); removed the nonexistent compute_advantage_weighted_sum; fixed the prompt_embedds typo.
  • guidance/rewards.md: fixed trainer/grpo path, GenEval metadata-JSON convention, added audio param, added ocr/clap/imagebind.
  • skills: model_path -> model_name_or_path; target_module_map -> config-driven + default_target_modules; data.dataset -> data.datasets; rewards single-dict -> list; removed evaluate() skeleton.
  • constraints.md #11: evaluate() is concrete (not abstract).
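
The renames above can be expressed as a small checker that flags stale keys in an already-loaded config dict. This is a sketch under assumptions: `find_stale_keys` is a hypothetical helper, the key names come from the bullets above, and their nesting (directly under `train` / `data`, with `rewards` at the top level) is assumed for illustration rather than confirmed by the repository.

```python
from __future__ import annotations

from typing import Any, Iterator

# Old dotted key -> current dotted key, as listed in this PR description.
# The exact nesting is an assumption made for this sketch.
STALE_TO_CURRENT = {
    "train.dynamics_type": "scheduler.dynamics_type",
    "train.train_steps": "scheduler.sde_steps",
    "train.num_train_steps": "scheduler.num_sde_steps",
    "data.dataset": "data.datasets",
}


def iter_dotted_keys(cfg: dict[str, Any], prefix: str = "") -> Iterator[str]:
    """Yield every key in a nested dict as a dotted path."""
    for key, value in cfg.items():
        path = f"{prefix}{key}"
        yield path
        if isinstance(value, dict):
            yield from iter_dotted_keys(value, prefix=f"{path}.")


def find_stale_keys(cfg: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (stale_path, suggested_replacement) pairs found in cfg."""
    found = []
    for path in iter_dotted_keys(cfg):
        if path in STALE_TO_CURRENT:
            found.append((path, STALE_TO_CURRENT[path]))
        elif path.split(".")[-1] == "model_path":
            # model_path -> model_name_or_path, wherever it is nested.
            found.append((path, path[: -len("model_path")] + "model_name_or_path"))
    # Rewards are now a list of dicts rather than a single dict.
    if isinstance(cfg.get("rewards"), dict):
        found.append(("rewards", "rewards (as a list of dicts)"))
    return found
```

Run against a dict produced by any YAML loader, an empty result means none of the listed stale keys are present; it does not validate the rest of the config.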

Cross-reference rot

  • constraints.md index extended to #28-29; philosophy.md #27 ref fixed and "FSDP" -> Accelerate (DDP/DeepSpeed ZeRO-1-2/FSDP).
  • topics/samplers.md: corrected _resolve_sampler_type + AdvantageProcessor group_distributed path; parity_testing.md set_timesteps -> set_scheduler_timesteps; brittle line numbers -> symbol refs.
  • .cursor/rules/skills-reusable-only.mdc: register skills in AGENTS.md; CLAUDE.md now imports @AGENTS.md so it can't drift to a stale subset.

Code (1 line of intent)

  • hparams/model_args.py: model_type Literal now matches the 14 registry keys. Zero-risk — ArgABC is a plain dataclass with no Literal enforcement (examples already use sd3-5/bagel/ltx2_*).
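
The "zero-risk" claim rests on standard Python behavior: a plain dataclass does not validate `Literal` annotations at runtime. The sketch below illustrates this with `DemoModelArgs`, a hypothetical stand-in that is not the real class in hparams/model_args.py; the two literal values are taken from the example names mentioned above.

```python
from dataclasses import dataclass
from typing import Literal, get_args, get_type_hints


@dataclass
class DemoModelArgs:
    # Hypothetical stand-in for illustration; not the project's actual dataclass.
    model_type: Literal["sd3-5", "bagel"] = "sd3-5"


# A plain dataclass accepts any value: the Literal is only a hint for
# type checkers and readers, so widening it cannot change runtime behavior.
args = DemoModelArgs(model_type="not-in-literal")

# The allowed values can still be read back, e.g. to compare against registry keys.
allowed = get_args(get_type_hints(DemoModelArgs)["model_type"])


def missing_from_literal(registry_keys):
    """Registry keys that the type hint does not list (hint drift)."""
    return sorted(set(registry_keys) - set(allowed))
```

A check like `missing_from_literal` could catch the hint falling behind the registry again, assuming the registry keys are available as an iterable of strings.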

Test plan

  • Registry tables in docs re-grepped against trainers/, models/, rewards/ registries.
  • model_args.py parses (AST) and has no linter errors.
  • Re-grep confirms no stale tokens remain (compute_advantage_weighted_sum, trainer/grpo, prompt_embedds, bare train_steps, set_timesteps().
  • Docs-only render check on GitHub.
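
The re-grep and AST steps of this plan could be scripted roughly as follows. This is a sketch, not the commands actually run for this PR: the helper names and file suffixes are assumptions, and "bare train_steps" is interpreted here as `train_steps` not preceded by another identifier character (so `num_train_steps` does not match).

```python
import ast
import re
from pathlib import Path

# Stale tokens named in the test plan, in the order they are listed.
STALE_PATTERNS = {
    "compute_advantage_weighted_sum": re.compile(r"compute_advantage_weighted_sum"),
    "trainer/grpo": re.compile(r"trainer/grpo"),
    "prompt_embedds": re.compile(r"prompt_embedds"),
    # Interpretation: only a standalone train_steps counts as stale.
    "bare train_steps": re.compile(r"(?<![A-Za-z0-9_])train_steps\b"),
    # Does not match set_scheduler_timesteps(.
    "set_timesteps(": re.compile(r"(?<![A-Za-z0-9_])set_timesteps\("),
}


def find_stale_tokens(text):
    """Return the names of all stale patterns that occur in text."""
    return [name for name, pattern in STALE_PATTERNS.items() if pattern.search(text)]


def scan_tree(root, suffixes=(".md", ".mdc")):
    """Map each matching file under root to the stale tokens it contains."""
    hits = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            found = find_stale_tokens(path.read_text(encoding="utf-8", errors="replace"))
            if found:
                hits[str(path)] = found
    return hits


def parses_as_python(source):
    """True if source is syntactically valid Python (an AST-only check)."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True
```

An empty dict from `scan_tree` corresponds to "no stale tokens remain"; `parses_as_python` only confirms syntax and says nothing about formatting or imports.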

Notes

  • black --check/isort --check fail on src/flow_factory/hparams/model_args.py, but this is pre-existing (the whole file predates formatting); avoided a full-file reformat to keep this PR scoped. Can format separately if desired.
  • Out of scope (code bug, not harness): my_reward_remote.py docstrings / reward_server/example_server.py reference flow_factory.rewards.remote instead of my_reward_remote.

Made with Cursor

Resync .agents/, .cursor/, guidance/, AGENTS.md and CLAUDE.md with the
current code after plugin growth (9 trainers, 14 model adapters, 13 reward
models). Fixes registry drift, wrong config/API facts and broken
cross-references found in a full audit.

- architecture.md/AGENTS.md: add diffusion-opd trainer, clap/imagebind/
  geneval rewards, Bagel/LTX2 models; fix RationalRewards* class names
- constraints.md: evaluate() is concrete (not abstract); index #28-29;
  paradigm (#7) and training-args (#16) lists; de-numbered line refs
- philosophy.md: Accelerate (DDP/DeepSpeed ZeRO-1-2/FSDP) backend; fix #27 ref
- guidance: scheduler.* config keys, real sample()/compute_advantages
  snippets, GenEval metadata convention, audio reward param, Bagel link
- skills: model_name_or_path, default_target_modules, data.datasets,
  rewards-as-list, 9 trainers; CLAUDE.md imports AGENTS.md to avoid drift
- topics/samplers.md: correct _resolve_sampler_type + AdvantageProcessor
  group_distributed paths; parity_testing set_scheduler_timesteps
- hparams/model_args.py: model_type Literal now matches registry keys

Co-authored-by: Cursor <cursoragent@cursor.com>
@Jayce-Ping merged commit 6e52dcc into main on Jun 14, 2026