[docs,hparams] docs: sync agent harness (.agents/.cursor/guidance) with implementation #185
Merged
Resync .agents/, .cursor/, guidance/, AGENTS.md and CLAUDE.md with the current code after plugin growth (9 trainers, 14 model adapters, 13 reward models). Fixes registry drift, wrong config/API facts and broken cross-references found in a full audit.

- architecture.md/AGENTS.md: add diffusion-opd trainer, clap/imagebind/geneval rewards, Bagel/LTX2 models; fix RationalRewards* class names
- constraints.md: evaluate() is concrete (not abstract); index #28-29; paradigm (#7) and training-args (#16) lists; de-numbered line refs
- philosophy.md: Accelerate (DDP/DeepSpeed ZeRO-1-2/FSDP) backend; fix #27 ref
- guidance: scheduler.* config keys, real sample()/compute_advantages snippets, GenEval metadata convention, audio reward param, Bagel link
- skills: model_name_or_path, default_target_modules, data.datasets, rewards-as-list, 9 trainers; CLAUDE.md imports AGENTS.md to avoid drift
- topics/samplers.md: correct _resolve_sampler_type + AdvantageProcessor group_distributed paths; parity_testing set_scheduler_timesteps
- hparams/model_args.py: model_type Literal now matches registry keys

Co-authored-by: Cursor <cursoragent@cursor.com>
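The `model_type` item in the commit message is about keeping a `Literal` type hint in step with a registry. A minimal sketch of that relationship follows, assuming a hypothetical `MODEL_REGISTRY` dict, a hypothetical trimmed-down `ModelArgs` dataclass, and a hypothetical helper `literal_drift`; none of these are the project's actual code, and only the keys `sd3-5` and `bagel` are taken from this PR. It also illustrates that a plain dataclass does not validate `Literal` annotations at runtime, so changing the hint alone cannot alter behavior.

```python
from dataclasses import dataclass
from typing import Literal, get_args

# Hypothetical registry; per the PR, the real one holds 14 keys.
MODEL_REGISTRY = {"sd3-5": object, "bagel": object}

# Hypothetical type hint that is meant to mirror the registry keys.
ModelType = Literal["sd3-5", "bagel"]


@dataclass
class ModelArgs:
    """Trimmed-down stand-in for the project's model arguments."""

    model_type: ModelType = "sd3-5"


def literal_drift(literal_type, registry):
    """Return (registry keys missing from the Literal, Literal values not registered)."""
    declared = set(get_args(literal_type))
    registered = set(registry)
    return sorted(registered - declared), sorted(declared - registered)


# Dataclasses do not check Literal annotations at runtime, so this succeeds.
unchecked = ModelArgs(model_type="not-a-registered-model")

# Both lists are empty when the hint and the registry agree.
missing, extra = literal_drift(ModelType, MODEL_REGISTRY)
```

A check like `literal_drift` could run in a unit test so that adding a registry entry without updating the hint is caught early; whether the project wants such a test is an open design choice.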
Summary
Resyncs the AI-agent harness with the current `src/flow_factory/` implementation after plugin growth (now 9 trainers, 14 model adapters, 13 reward models). A full audit found ~40 discrepancies; this PR fixes registry drift, wrong config/API facts, and broken cross-references. Almost entirely docs/harness; one zero-risk type-hint correction in `hparams/model_args.py`.

Registry drift

- `architecture.md` / `AGENTS.md`: add `diffusion-opd` trainer, `clap` / `imagebind` / `geneval` rewards, Bagel/LTX2 models; fix `RationalRewardsT2IRewardModel` / `RationalRewardsEditRewardModel` class names.

Wrong facts that could mislead agents or break YAML

- `guidance/algorithms.md`: `train.dynamics_type` / `train_steps` / `num_train_steps` -> `scheduler.dynamics_type` / `sde_steps` / `num_sde_steps` (verified against `examples/`).
- `guidance/workflow.md`: real `sample()` (delegates to `generate_samples`) and `compute_advantages()` snippets (removed nonexistent `compute_advantage_weighted_sum`); fixed `prompt_embedds` typo.
- `guidance/rewards.md`: fixed `trainer/grpo` path, GenEval `metadata`-JSON convention, added `audio` param, added `ocr` / `clap` / `imagebind`.
- skills: `model_path` -> `model_name_or_path`; `target_module_map` -> config-driven + `default_target_modules`; `data.dataset` -> `data.datasets`; rewards single-dict -> list; removed `evaluate()` skeleton.
- `constraints.md` #11: `evaluate()` is concrete (not abstract).

Cross-reference rot

- `constraints.md` index extended to #28-29; `philosophy.md` #27 ref fixed and "FSDP" -> Accelerate (DDP/DeepSpeed ZeRO-1-2/FSDP).
- `topics/samplers.md`: corrected `_resolve_sampler_type` + AdvantageProcessor `group_distributed` path; `parity_testing.md` `set_timesteps` -> `set_scheduler_timesteps`; brittle line numbers -> symbol refs.
- `.cursor/rules/skills-reusable-only.mdc`: register skills in `AGENTS.md`; `CLAUDE.md` now imports `@AGENTS.md` so it can't drift to a stale subset.

Code (1 line of intent)

- `hparams/model_args.py`: `model_type` `Literal` now matches the 14 registry keys. Zero-risk: `ArgABC` is a plain dataclass with no `Literal` enforcement (examples already use `sd3-5` / `bagel` / `ltx2_*`).

Test plan

- `trainers/`, `models/`, `rewards/` registries.
- `model_args.py` parses (AST) and has no linter errors.
- `compute_advantage_weighted_sum`, `trainer/grpo`, `prompt_embedds`, bare `train_steps`, `set_timesteps()`.

Notes

- `black --check` / `isort --check` fail on `src/flow_factory/hparams/model_args.py`, but this is pre-existing (the whole file predates formatting); avoided a full-file reformat to keep this PR scoped. Can format separately if desired.
- `my_reward_remote.py` docstrings / `reward_server/example_server.py` reference `flow_factory.rewards.remote` instead of `my_reward_remote`.

Made with Cursor
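The stale identifiers named in the test plan (`compute_advantage_weighted_sum`, `trainer/grpo`, `prompt_embedds`, bare `train_steps`, `set_timesteps()`) lend themselves to a scripted scan. The following is only a sketch of how such a scan might look: the function name `find_stale_references` and the regular expressions are illustrative rather than part of the repository, and "bare" is interpreted here as a whole-word match.

```python
import re

# Identifiers this PR removed or renamed; the patterns are illustrative.
STALE_PATTERNS = {
    "compute_advantage_weighted_sum": r"\bcompute_advantage_weighted_sum\b",
    "trainer/grpo": r"\btrainer/grpo\b",
    "prompt_embedds": r"\bprompt_embedds\b",
    # "Bare" train_steps: there is no word boundary between "_" and "t",
    # so longer names such as num_train_steps are not flagged.
    "train_steps": r"\btrain_steps\b",
    # set_scheduler_timesteps( does not contain this substring, so it passes.
    "set_timesteps()": r"\bset_timesteps\(",
}


def find_stale_references(text):
    """Return (line_number, label) pairs for every stale identifier found."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in STALE_PATTERNS.items():
            if re.search(pattern, line):
                hits.append((lineno, label))
    return hits


sample = "\n".join([
    "scheduler.num_sde_steps: 10",
    "num_train_steps: 10",
    "train_steps: 10",
    "pipe.set_scheduler_timesteps(n)",
    "pipe.set_timesteps(n)",
])
print(find_stale_references(sample))
# prints [(3, 'train_steps'), (5, 'set_timesteps()')]
```

In practice the function would be applied to the contents of each harness file in turn, and an empty result would indicate that none of the listed identifiers remain.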