Interop with the diffusers AnyFlow pipeline (load · convert · docs) by Enderfga · Pull Request #2 · NVlabs/AnyFlow

Enderfga · 2026-05-14T10:06:50Z

With huggingface/diffusers#13745 now merged, the AnyFlowPipeline / AnyFlowFARPipeline / AnyFlowTransformer3DModel / AnyFlowFARTransformer3DModel / FlowMapEulerDiscreteScheduler classes ship in diffusers ≥ 0.36. This PR aligns this repository with that release on three axes — loading, conversion, and docs — so the same .pt weights and the same checkpoint URLs work through either entry point.

What changes

1. Loading: `from_pretrained` overrides + namespace bindings

The nvidia/AnyFlow-*-Diffusers checkpoints currently use the metadata layout written by the pipelines in this repository — model_index.json references far.* modules and transformer/config.json carries init_far_model, init_flowmap_model, chunk_partition, qk_norm, added_kv_proj_dim. After the upcoming metadata flip on the Hub repos, those fields will reference the diffusers class names instead.

- `FAR_Wan_Transformer3DModel.from_pretrained`: when the config omits `init_*_model`, derive the flags from `_class_name` and fall back to `chunk_partition=[1, 3, 3, 3, 3, 3, 3, 2]` for the FAR variant (matches the 81-frame inference schedule).
- `WanAnyFlowPipeline.from_pretrained` / `FARWanAnyFlowPipeline.from_pretrained`: pre-instantiate the transformer and scheduler with the classes defined here and pass them as kwargs, so `DiffusionPipeline.from_pretrained` skips its class-name lookup for those entries.
- Module-level binding: `diffusers.{AnyFlowTransformer3DModel, AnyFlowFARTransformer3DModel, FlowMapEulerDiscreteScheduler}` are bound to the local classes only when diffusers doesn't already provide those names; idempotent on modern diffusers.
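
The precedence described above (derived defaults, then stored config fields, then user kwargs) can be sketched in plain Python. This is a minimal illustration, not the code in this PR: `resolve_init_kwargs` and `CLASS_NAME_DEFAULTS` are hypothetical names, and the flag values assigned to each class name are assumptions. Only the `chunk_partition` fallback is taken from the description.

```python
import copy
from typing import Any

# Assumed mapping from diffusers class name to the legacy init flags.
# The flag values are illustrative guesses, not taken from the repository.
CLASS_NAME_DEFAULTS: dict[str, dict[str, Any]] = {
    "AnyFlowTransformer3DModel": {
        "init_far_model": False,
        "init_flowmap_model": True,
    },
    "AnyFlowFARTransformer3DModel": {
        "init_far_model": True,
        "init_flowmap_model": True,
        # Fallback named in the PR description for the FAR variant.
        "chunk_partition": [1, 3, 3, 3, 3, 3, 3, 2],
    },
}


def resolve_init_kwargs(config: dict[str, Any], **user_kwargs: Any) -> dict[str, Any]:
    """Merge derived defaults, stored config fields, and user kwargs (lowest to highest priority)."""
    # Deep-copy so callers cannot mutate the shared default lists.
    resolved = copy.deepcopy(CLASS_NAME_DEFAULTS.get(config.get("_class_name"), {}))
    # Fields already present in the config pass through unchanged.
    resolved.update({k: v for k, v in config.items() if k != "_class_name"})
    # Explicit user kwargs always win.
    resolved.update(user_kwargs)
    return resolved


far = resolve_init_kwargs({"_class_name": "AnyFlowFARTransformer3DModel"})
print(far["chunk_partition"], sum(far["chunk_partition"]))
```

The fallback partition sums to 21. If the VAE compresses time by 4x while keeping the first frame separate (an assumption about the Wan VAE), 81 frames map to 1 + 80/4 = 21 latent frames, which would be consistent with the 81-frame schedule mentioned above.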

2. Conversion: `scripts/convert_model/convert_anyflow_to_diffusers.py`

The script no longer wraps the EMA .pt into WanAnyFlowPipeline / FARWanAnyFlowPipeline from this repository — it now imports AnyFlowPipeline / AnyFlowFARPipeline / AnyFlowTransformer3DModel / AnyFlowFARTransformer3DModel / FlowMapEulerDiscreteScheduler directly from diffusers and runs pipeline.save_pretrained(...). The output is the canonical diffusers directory layout — model_index.json references AnyFlowPipeline etc., and the directory loads via AnyFlowPipeline.from_pretrained(...) on a stock pip install diffusers with no compat shim.

Tensor keys are identical across FAR_Wan_Transformer3DModel and the diffusers AnyFlow classes (verified bit-exact against the released NVlabs checkpoints on H200, L2 = 0), so load_state_dict(strict=False) tolerates the EMA bookkeeping fields without dropping any real weights.
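
Because `strict=False` accepts both missing and unexpected keys silently, it helps to confirm that only bookkeeping entries fall outside the model. The sketch below does that on plain key lists. `diff_state_dict_keys` is a hypothetical helper and the key names are made up for illustration. In PyTorch, `load_state_dict` returns `missing_keys` and `unexpected_keys`, which carry the same information.

```python
from typing import Iterable


def diff_state_dict_keys(checkpoint_keys: Iterable[str], model_keys: Iterable[str]) -> dict[str, list[str]]:
    """Report which keys a non-strict load would leave unset or ignore."""
    ckpt, model = set(checkpoint_keys), set(model_keys)
    return {
        # Parameters the model expects but the checkpoint lacks (real weight loss).
        "missing": sorted(model - ckpt),
        # Entries in the checkpoint the model has no slot for (e.g. bookkeeping).
        "unexpected": sorted(ckpt - model),
    }


# Illustrative key names only; the real checkpoints use their own naming.
model_keys = ["blocks.0.attn.weight", "blocks.0.attn.bias"]
checkpoint_keys = model_keys + ["num_updates"]

report = diff_state_dict_keys(checkpoint_keys, model_keys)
print(report)
```

An empty `missing` list is the condition that matters here: it means every model parameter received a value, and anything under `unexpected` was simply ignored.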

CLI surface (OmegaConf model_type / model_path / model_save_dir) is preserved.

3. Docs: README "Using with HF diffusers" section

End-to-end examples for the bidirectional and FAR-causal pipelines, including I2V via the video kwarg, which takes a (B, T, C, H, W) tensor in [0, 1], aligned with VideoProcessor.preprocess_video's 5D contract.

Compatibility

Additive only:

Checkpoints with the existing config fields keep loading unchanged.
User-supplied kwargs always win over the derived defaults.
Namespace bindings are idempotent — they no-op once diffusers ships the AnyFlow classes.
The convert script raises a clear ImportError (with an upgrade hint) when run against diffusers < 0.36.
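
The last two points can be sketched as follows. This is an assumption-laden illustration rather than the code in this PR: `bind_if_missing` and `require_min_version` are hypothetical helpers, a `SimpleNamespace` stands in for the `diffusers` module, and the version parsing is deliberately simplistic (real code would more likely use `packaging.version`).

```python
import types


def bind_if_missing(namespace, name: str, obj) -> bool:
    """Attach obj under name only when the namespace lacks it; return True if bound."""
    if hasattr(namespace, name):
        return False  # upstream already ships the class: no-op
    setattr(namespace, name, obj)
    return True


def require_min_version(installed: str, minimum: tuple[int, int] = (0, 36)) -> None:
    """Raise ImportError with an upgrade hint when installed is older than minimum."""
    major_minor = tuple(int(part) for part in installed.split(".")[:2])
    if major_minor < minimum:
        wanted = ".".join(str(part) for part in minimum)
        raise ImportError(
            f"diffusers>={wanted} is required (found {installed}); "
            "run `pip install --upgrade diffusers`."
        )


class LocalScheduler:  # stand-in for a class defined in this repository
    pass


fake_diffusers = types.SimpleNamespace()  # stand-in for the real module
print(bind_if_missing(fake_diffusers, "FlowMapEulerDiscreteScheduler", LocalScheduler))
print(bind_if_missing(fake_diffusers, "FlowMapEulerDiscreteScheduler", object))
```

The second call returns `False` and leaves the first binding in place, which is the idempotence property claimed above.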

No signature or behavior change for any existing code path in the repository; demo.py and the training entry points are untouched.

…layout

The nvidia/AnyFlow-*-Diffusers checkpoints reference the diffusers AnyFlow class names from `model_index.json` and omit `init_far_model`, `init_flowmap_model`, `chunk_partition`, `qk_norm`, and `added_kv_proj_dim` from `transformer/config.json`.

- `FAR_Wan_Transformer3DModel`: when `init_*_model` is absent, derive the flags from `_class_name`. For the FAR variant, additionally fall back to `chunk_partition=[1, 3, 3, 3, 3, 3, 3, 2]` to match the 81-frame inference schedule. Configs that already set these fields are passed through unchanged; user kwargs always win.
- `WanAnyFlowPipeline` / `FARWanAnyFlowPipeline`: pre-instantiate the transformer and scheduler with the classes defined here and pass them as kwargs, so `DiffusionPipeline.from_pretrained` skips the module lookup for those two entries. text_encoder / tokenizer / vae still load normally.
- Bind `diffusers.{AnyFlowTransformer3DModel, AnyFlowFARTransformer3DModel, FlowMapEulerDiscreteScheduler}` at import time when not already defined; idempotent.
- Export `FlowMapEulerDiscreteScheduler` as an alias for `FlowMapDiscreteScheduler`.

Additive only: no class signature or behavior change for existing checkpoints.

Document the standard diffusers usage for the nvidia/AnyFlow-*-Diffusers checkpoints — AnyFlowPipeline for bidirectional T2V and AnyFlowFARPipeline for the FAR causal variant (T2V / I2V / V2V via the `context_sequence` argument).

@dg845

…s section

The diffusers AnyFlow pipelines renamed the conditioning kwarg from `context_sequence={"raw"/"latent"}` to `video` / `video_latents` in huggingface/diffusers#13745 (review feedback from @dg845: match `WanVideoToVideoPipeline`'s API surface). Update the README to reflect the new kwarg and add a short I2V example showing how to pass the single-frame conditioning tensor.

Only docs change; the in-repo `WanAnyFlowPipeline` / `FARWanAnyFlowPipeline` keep their original `context_sequence` kwarg.

VideoProcessor.preprocess_video's 5D contract is (B, T, C, H, W) — the diffusers AnyFlow PR aligned its docstring + EXAMPLE_DOC_STRING with this in the third review pass (huggingface/diffusers#13745, commits ffdc969 and downstream). This README's I2V example still showed (B, C, T, H, W) and the matching unsqueeze(2); update both so users following the README verbatim get a tensor the diffusers pipeline accepts.
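
A shape check makes this kind of layout mix-up easy to catch before the tensor reaches the pipeline. The helper below is a hypothetical sketch that works on a plain shape tuple (for a tensor, pass `tuple(video.shape)`). The swap hint is only a heuristic that assumes RGB input with C = 3, and the 480 x 832 resolution is an arbitrary example.

```python
from typing import Sequence


def check_video_layout(shape: Sequence[int], channels: int = 3) -> tuple[int, ...]:
    """Validate a (B, T, C, H, W) shape and flag a likely (B, C, T, H, W) mix-up."""
    if len(shape) != 5:
        raise ValueError(f"expected 5 dims (B, T, C, H, W), got {len(shape)}: {tuple(shape)}")
    batch, frames, chans, height, width = shape
    if chans != channels:
        # Heuristic: a channel-sized value at index 1 suggests swapped axes.
        hint = " (this looks like (B, C, T, H, W))" if frames == channels else ""
        raise ValueError(f"expected C={channels} at index 2, got {chans}{hint}")
    return batch, frames, chans, height, width


# A single conditioning frame for I2V in the (B, T, C, H, W) layout.
print(check_video_layout((1, 1, 3, 480, 832)))

# The old README layout, (B, C, T, H, W), is rejected with a hint.
try:
    check_video_layout((1, 3, 1, 480, 832))
except ValueError as exc:
    print(exc)
```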

…Flow classes

The training pipeline (far/main.py:save_checkpoint) emits .pt files keyed by 'ema' / 'model_state_dict_g'; the diffusers pipelines load from a structured directory written by pipeline.save_pretrained(). Until now this conversion script wrapped the .pt into a pipeline built from this repository's WanAnyFlowPipeline / FARWanAnyFlowPipeline / FAR_Wan_Transformer3DModel / FlowMapDiscreteScheduler, so the resulting model_index.json referenced far.* paths that diffusers.from_pretrained couldn't resolve.

Switch the conversion to the diffusers AnyFlow classes (introduced in huggingface/diffusers#13745):

- AnyFlowTransformer3DModel (bidirectional T2V variants)
- AnyFlowFARTransformer3DModel (FAR causal variants)
- AnyFlowPipeline / AnyFlowFARPipeline
- FlowMapEulerDiscreteScheduler

Output directories now load via AnyFlowPipeline.from_pretrained(...) with no compat shim. The CLI surface (OmegaConf model_type / model_path / model_save_dir keys + auto-append of model_type to the save dir) is preserved.

Tensor keys are unchanged across FAR_Wan_Transformer3DModel and the diffusers AnyFlow classes (bit-exact L2=0 against the released NVlabs checkpoints), so load_state_dict(strict=False) handles the EMA bookkeeping fields without dropping any real weights.

The scheduler hyperparameters were hardcoded (shift=5.0, num_train_timesteps=1000), matching the released AnyFlow distillation recipe but silently wrong for any future checkpoint trained with a different schedule (e.g. higher-resolution runs that re-tune shift). Make them OmegaConf CLI keys so the conversion stays correct without code edits. Also add a 'source' key so users can convert from the non-EMA generator weights (model_state_dict_g) instead of EMA — useful for ablation runs that want to diff EMA vs raw checkpoints without re-saving.
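
As a rough sketch of how these keys could behave, the snippet below merges `key=value` overrides onto the previous hardcoded defaults and picks the state dict by `source`. It uses plain dict parsing instead of OmegaConf to stay dependency-free, and the accepted `source` values (`"ema"`, `"generator"`) are assumptions; the commit only states that the key selects `model_state_dict_g` instead of EMA.

```python
from typing import Any

# Defaults mirror the previously hardcoded values; "source" default is assumed.
DEFAULTS: dict[str, Any] = {"shift": 5.0, "num_train_timesteps": 1000, "source": "ema"}

# Assumed mapping from the 'source' option to the checkpoint's top-level key.
SOURCE_KEYS = {"ema": "ema", "generator": "model_state_dict_g"}


def parse_overrides(args: list[str]) -> dict[str, Any]:
    """Apply key=value overrides on top of DEFAULTS, casting to the default's type."""
    cfg = dict(DEFAULTS)
    for arg in args:
        key, sep, raw = arg.partition("=")
        if not sep or key not in cfg:
            raise ValueError(f"unsupported override: {arg!r}")
        cfg[key] = type(DEFAULTS[key])(raw)
    return cfg


def select_state_dict(checkpoint: dict[str, Any], source: str) -> Any:
    """Return the weights stored under the checkpoint key that source refers to."""
    if source not in SOURCE_KEYS:
        raise ValueError(f"source must be one of {sorted(SOURCE_KEYS)}, got {source!r}")
    key = SOURCE_KEYS[source]
    if key not in checkpoint:
        raise KeyError(f"checkpoint has no {key!r} entry")
    return checkpoint[key]


cfg = parse_overrides(["shift=3.0", "source=generator"])
# Tiny stand-in for a loaded .pt checkpoint.
checkpoint = {"ema": {"w": 1}, "model_state_dict_g": {"w": 2}}
print(cfg, select_state_dict(checkpoint, cfg["source"]))
```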

Enderfga · 2026-05-23T13:25:49Z

Superseded by #5, which takes the inverse approach: instead of shimming this repo's classes to load checkpoints written for the upstream diffusers AnyFlow layout, it realigns the in-repo scheduler / transformer / pipeline API with the upstream contract from huggingface/diffusers#13745. Closing this one.

Enderfga added 2 commits May 14, 2026 17:51

docs: add 'Using with HF diffusers' section to README

465aecc


Enderfga marked this pull request as ready for review May 14, 2026 10:09

Enderfga closed this May 14, 2026

Enderfga reopened this May 14, 2026

Enderfga marked this pull request as draft May 14, 2026 10:11

Enderfga mentioned this pull request May 20, 2026

Add AnyFlow Any-Step Video Diffusion Pipelines (Bidirectional + FAR Causal) huggingface/diffusers#13745

Merged

6 tasks

Enderfga marked this pull request as ready for review May 22, 2026 10:23

Enderfga added 2 commits May 22, 2026 18:33

Enderfga changed the title ~~Load nvidia/AnyFlow-* checkpoints from the diffusers AnyFlow metadata layout~~ Interop with the diffusers AnyFlow pipeline (load · convert · docs) May 22, 2026

Enderfga closed this May 23, 2026

Enderfga wants to merge 6 commits into NVlabs:main from Enderfga:diffusers-compat-wip
