Add KV Cache for Autoregressive Inference

Autoregressive diffusion techniques such as [Self Forcing](https://github.com/guandeh17/Self-Forcing/tree/main) rely on a rolling KV Cache across video frame chunks to carry information from past context frames to the frames currently being denoised.

This rolling KV Cache design (or variants of it) is likely to show up in other kinds of long video generation and world models, so it would be good to see if we can support it natively in Diffusers.

Tasks
- Implement the rolling KV Cache seen in [Self Forcing](https://github.com/guandeh17/Self-Forcing/tree/main) using Diffusers' [cache hooks design](https://github.com/huggingface/diffusers/blob/b3e9dfced7c9e8d00f646c710766b532383f04c6/src/diffusers/hooks/first_block_cache.py#L34).
- Add a Modular Block to Wan Modular Pipelines that uses this rolling KV Cache to perform autoregressive inference.
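
To make the idea concrete, here is a minimal, framework-agnostic sketch of a rolling KV Cache driving chunk-by-chunk generation. It is not the Self Forcing code and not an existing Diffusers API: `RollingKVCache`, `generate`, `max_chunks` and `toy_denoise_chunk` are hypothetical names chosen for illustration, and plain Python lists stand in for the per-layer attention key/value tensors a real implementation would hold.

```python
from collections import deque


class RollingKVCache:
    """Hypothetical rolling cache: keeps keys/values of the most recent chunks only."""

    def __init__(self, max_chunks):
        if max_chunks < 1:
            raise ValueError("max_chunks must be at least 1")
        # deque(maxlen=...) silently drops the oldest chunk once the window is full.
        self._keys = deque(maxlen=max_chunks)
        self._values = deque(maxlen=max_chunks)

    def append(self, chunk_keys, chunk_values):
        """Store the keys/values produced for one finished chunk."""
        self._keys.append(list(chunk_keys))
        self._values.append(list(chunk_values))

    def context(self):
        """Return cached keys and values, flattened in temporal order."""
        keys = [k for chunk in self._keys for k in chunk]
        values = [v for chunk in self._values for v in chunk]
        return keys, values

    def __len__(self):
        return len(self._keys)


def generate(num_chunks, cache, denoise_chunk):
    """Autoregressive loop: each chunk attends to the cached context, then joins it."""
    outputs = []
    for chunk_idx in range(num_chunks):
        past_keys, past_values = cache.context()
        output, new_keys, new_values = denoise_chunk(chunk_idx, past_keys, past_values)
        cache.append(new_keys, new_values)
        outputs.append(output)
    return outputs


def toy_denoise_chunk(chunk_idx, past_keys, past_values):
    # Stand-in for the denoiser (assumption): reports how much context it saw
    # and emits two labelled key/value entries for the new chunk.
    keys = [f"k{chunk_idx}_{t}" for t in range(2)]
    values = [f"v{chunk_idx}_{t}" for t in range(2)]
    return len(past_keys), keys, values


cache = RollingKVCache(max_chunks=2)
context_sizes = generate(4, cache, toy_denoise_chunk)
print(context_sizes)       # [0, 2, 4, 4]
print(cache.context()[0])  # ['k2_0', 'k2_1', 'k3_0', 'k3_1']
```

The context seen by each chunk grows until the window is full and then stays constant, which is what keeps memory bounded for long videos. A version built on cache hooks would presumably keep this state per attention layer rather than in a single object, but that mapping is left open here.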
        

Add KV Cache for Autoregressive Inference #12600
