Fix #8673: SD3 attention masks for text padding tokens by srlynch1 · Pull Request #4 · srlynch1/diffusers

srlynch1 · 2026-06-21T11:46:25Z

Summary

- Add Attention.prepare_joint_attention_mask() for SD3's [hidden_states, encoder_hidden_states] concat order
- Wire the mask into JointAttnProcessor2_0 and FusedJointAttnProcessor2_0
- Add padding-invariance tests (full transformer + processor-level)
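As a rough illustration of the first two bullets, the sketch below extends a text padding mask to the joint key sequence in SD3's image-first order and hands it to SDPA. The helper name `build_joint_mask`, the boolean mask convention, and all tensor sizes are assumptions for illustration. The actual body of `prepare_joint_attention_mask()` is not shown in this PR page and may differ, for example by producing an additive float mask.

```python
import torch
import torch.nn.functional as F


def build_joint_mask(text_mask: torch.Tensor, num_image_tokens: int) -> torch.Tensor:
    """Hypothetical helper: extend a (batch, text_len) padding mask so it
    covers the joint [image, text] key sequence, shaped for SDPA broadcasting."""
    batch = text_mask.shape[0]
    # Image tokens are never padding, so they are always attendable.
    image_part = torch.ones(batch, num_image_tokens, dtype=torch.bool, device=text_mask.device)
    # Image tokens come first, then text, matching the concat order above.
    joint = torch.cat([image_part, text_mask.bool()], dim=1)
    # (batch, 1, 1, key_len) broadcasts over heads and query positions.
    return joint[:, None, None, :]


batch, heads, head_dim = 2, 4, 8
num_image_tokens, text_len = 6, 5
# 1 = real token, 0 = padding; the first prompt has two padded positions.
text_mask = torch.tensor([[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]])

seq = num_image_tokens + text_len
query = torch.randn(batch, heads, seq, head_dim)
key = torch.randn(batch, heads, seq, head_dim)
value = torch.randn(batch, heads, seq, head_dim)

joint_mask = build_joint_mask(text_mask, num_image_tokens)
out = F.scaled_dot_product_attention(
    query, key, value, attn_mask=joint_mask, dropout_p=0.0, is_causal=False
)
```

With a boolean `attn_mask`, `True` marks key positions that may be attended to, so the padded text slots in the first sample receive zero attention weight.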

Test plan

pytest tests/models/test_sd3_joint_attention_mask.py -q
python utils/check_copies.py

Made with Cursor

Standalone test avoids full transformer import chain; verifies padding invariance at processor level. Co-authored-by: Cursor <cursoragent@cursor.com>
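The padding-invariance property mentioned here can be sketched on raw SDPA calls. This is not the content of tests/models/test_sd3_joint_attention_mask.py, which is not shown on this page; the sizes, seed, and the `attend` helper are assumptions chosen for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
heads, head_dim = 2, 8
num_image_tokens, text_len, num_pad = 4, 6, 2
seq = num_image_tokens + text_len

# True = attend. The last `num_pad` text positions are padding.
keep = torch.ones(1, 1, 1, seq, dtype=torch.bool)
keep[..., -num_pad:] = False

query = torch.randn(1, heads, seq, head_dim)
key = torch.randn(1, heads, seq, head_dim)
value = torch.randn(1, heads, seq, head_dim)

# Perturb only the padded key/value slots.
key_alt, value_alt = key.clone(), value.clone()
key_alt[:, :, -num_pad:] = torch.randn(1, heads, num_pad, head_dim)
value_alt[:, :, -num_pad:] = torch.randn(1, heads, num_pad, head_dim)


def attend(k, v, mask):
    # Illustrative wrapper around SDPA with a fixed query.
    return F.scaled_dot_product_attention(
        query, k, v, attn_mask=mask, dropout_p=0.0, is_causal=False
    )


# With the mask, the content of padded slots must not influence the output.
masked_same = torch.allclose(
    attend(key, value, keep), attend(key_alt, value_alt, keep), atol=1e-5
)
# Without the mask, the padded slots leak into every query's output.
unmasked_same = torch.allclose(
    attend(key, value, None), attend(key_alt, value_alt, None), atol=1e-5
)
```

A processor-level test would presumably make the same comparison through JointAttnProcessor2_0 rather than through SDPA directly.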

cursor

Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.

Bugbot Autofix prepared a fix for the issue found in the latest run.

✅ Fixed: Joint mask on image self-attention
- JointAttnProcessor2_0 now only prepares and applies the joint attention mask when encoder_hidden_states is present, so the attn2 image-only self-attention path ignores the text mask passed via joint_attention_kwargs.

Or push these changes by commenting:

@cursor push 01eaf44c8f

Preview (01eaf44c8f)

diff --git a/src/diffusers/models/attention_processor.py b/src/diffusers/models/attention_processor.py
--- a/src/diffusers/models/attention_processor.py
+++ b/src/diffusers/models/attention_processor.py
@@ -1513,7 +1513,10 @@
             value = torch.cat([value, encoder_hidden_states_value_proj], dim=2)
 
         if attention_mask is not None:
-            attention_mask = attn.prepare_joint_attention_mask(attention_mask, key.shape[2], key.dtype)
+            if encoder_hidden_states is not None:
+                attention_mask = attn.prepare_joint_attention_mask(attention_mask, key.shape[2], key.dtype)
+            else:
+                attention_mask = None
 
         hidden_states = F.scaled_dot_product_attention(
             query, key, value, attn_mask=attention_mask, dropout_p=0.0, is_causal=False


Reviewed by Cursor Bugbot for commit f41465a.

cursor · 2026-06-21T11:48:04Z


-        hidden_states = F.scaled_dot_product_attention(query, key, value, dropout_p=0.0, is_causal=False)
+        if attention_mask is not None:
+            attention_mask = attn.prepare_joint_attention_mask(attention_mask, key.shape[2], key.dtype)


Joint mask on image self-attention

High Severity

In JointAttnProcessor2_0, prepare_joint_attention_mask runs whenever attention_mask is set, even when encoder_hidden_states is None. SD3.5 dual-attention blocks pass the same joint_attention_kwargs (including the text mask) into the second JointAttnProcessor2_0 self-attention pass, so image-only keys get a wrongly padded joint mask and incorrect SDPA masking.
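The guard this finding calls for can be sketched as follows: build the joint mask only when text tokens are actually part of the key sequence, and drop it for the image-only pass. `select_attention_mask` and its `has_encoder_states` flag are hypothetical names used for illustration. In the actual fix the check is `encoder_hidden_states is not None` inside JointAttnProcessor2_0.

```python
from typing import Optional

import torch
import torch.nn.functional as F


def select_attention_mask(
    text_mask: Optional[torch.Tensor], num_image_tokens: int, has_encoder_states: bool
) -> Optional[torch.Tensor]:
    """Hypothetical guard: only build a joint mask when text tokens are
    concatenated onto the keys; otherwise return no mask at all."""
    if text_mask is None or not has_encoder_states:
        return None  # image-only self-attention has no padding to hide
    image_part = torch.ones(text_mask.shape[0], num_image_tokens, dtype=torch.bool)
    return torch.cat([image_part, text_mask.bool()], dim=1)[:, None, None, :]


text_mask = torch.tensor([[1, 1, 0, 0]])
num_image_tokens = 6
image_only = torch.randn(1, 2, num_image_tokens, 8)  # (batch, heads, tokens, head_dim)

joint_mask = select_attention_mask(text_mask, num_image_tokens, has_encoder_states=True)
self_attn_mask = select_attention_mask(text_mask, num_image_tokens, has_encoder_states=False)

# Image-only pass: no mask, so every image token attends to every image token.
out = F.scaled_dot_product_attention(
    image_only, image_only, image_only, attn_mask=self_attn_mask, dropout_p=0.0, is_causal=False
)
```

Returning `None` for the image-only pass keeps the second self-attention call independent of whatever text mask arrives via joint_attention_kwargs.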


srlynch1 and others added 2 commits June 21, 2026 21:25

Fix huggingface#8673: SD3 attention masks for text padding tokens

c7aede3

Add processor-level SD3 joint attention mask test

f41465a


cursor Bot reviewed Jun 21, 2026

srlynch1 wants to merge 2 commits into main from e2e/diffusers-8673
