Skip to content

Fix atomic load access pointer lowering#2238

Closed
cklxx wants to merge 1 commit into
tile-ai:mainfrom
cklxx:fix-issue-2123-lower-access-ptr
Closed

Fix atomic load access pointer lowering#2238
cklxx wants to merge 1 commit into
tile-ai:mainfrom
cklxx:fix-issue-2123-lower-access-ptr

Conversation

@cklxx

@cklxx cklxx commented May 21, 2026

Copy link
Copy Markdown
Contributor

Fixes #2123.

Summary:

  • Preserve tl.access_ptr arg0 as address metadata during safe-memory rewriting.
  • Lower tl.access_ptr without recursively mutating the full call before reading its BufferLoad base.
  • Add a regression for T.atomic_load(status[look]) with a block-derived mutable loop index.

Validation:

  • cmake --build build -j14
  • python -m pytest testing/python/issue/test_tilelang_issue_2123.py testing/python/language/test_tilelang_language_access_ptr.py -q
  • pre-commit run --all-files

Summary by CodeRabbit

  • Bug Fixes

    • Enhanced atomic load operation handling for dynamically-derived block indices during code generation.
  • Tests

    • Added test coverage for atomic load operations with complex dynamic indexing patterns.

Review Change Stack

@github-actions

Copy link
Copy Markdown

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run pre-commit run --all-files in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

@coderabbitai

coderabbitai Bot commented May 21, 2026

Copy link
Copy Markdown
Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a4635c97-7362-4047-8bfb-e105ffd37ca4

📥 Commits

Reviewing files that changed from the base of the PR and between 8f9cd13 and 429d376.

📒 Files selected for processing (3)
  • src/transform/legalize_safe_memory_access.cc
  • src/transform/lower_access_ptr.cc
  • testing/python/issue/test_tilelang_issue_2123.py

📝 Walkthrough

Walkthrough

This PR fixes a compilation failure when T.atomic_load uses a mutable block-derived loop variable. AccessPtrLowerer is hardened to check call operator identity before assuming argument shapes, SafeMemoryRewriter preserves BufferLoad inside tl::access_ptr arguments, and a regression test validates the pattern.

Changes

Access Pointer Lowering for Block-Derived Indices

Layer / File(s) Summary
AccessPtrLowerer refactoring for argument shape robustness
src/transform/lower_access_ptr.cc
AccessPtrLowerer::VisitExpr_ now checks op->op directly against tl::access_ptr before recursively visiting arguments, avoiding assumptions about first-argument shape after mutation. Non-matching calls delegate to the base mutator.
SafeMemoryRewriter preservation of access_ptr BufferLoad
src/transform/legalize_safe_memory_access.cc
SafeMemoryRewriter now overrides VisitExpr_ to intercept tl::access_ptr calls, keeping the first argument (BufferLoad) unchanged and only visiting extent and rw_mask arguments.
Regression test for atomic_load with block-derived index
testing/python/issue/test_tilelang_issue_2123.py
New test validates that T.atomic_load(status[look]) with look initialized from tile - 1 and mutated in a loop lowers correctly, producing no tl::access_ptr calls and proper tl::atomic_load_elem_op calls with tvm_access_ptr addresses.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • tile-ai/tilelang#2072: Introduces many tl::access_ptr call sites for refactored ptx_ldmatrix API, directly benefiting from this PR's hardened access_ptr lowering and rewriting.
  • tile-ai/tilelang#1292: Prior refactor of SafeMemorysRewriter using IRMutatorWithAnalyzer, which this PR extends with special tl::access_ptr handling.
  • tile-ai/tilelang#1050: Earlier modifications to SafeMemorysRewriter in the same file, providing context for visitor-pattern changes.

Suggested reviewers

  • LeiWang1999

Poem

🐰 A pointer that dances through block-derived loops,
Now keeps its base load safe, no recursive droops.
SafeMemory guards it, while AccessPtr checks twice,
Atomic loads settle—the lowering plays nice! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 28.57% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed Title concisely summarizes the primary fix: improving atomic load access pointer lowering by addressing the BufferLoad downcast issue.
Linked Issues check ✅ Passed All code changes directly address issue #2123: SafeMemoryRewriter preserves address metadata, AccessPtrLowerer avoids recursive mutation before reading BufferLoad, and regression test validates the fix.
Out of Scope Changes check ✅ Passed All changes are scoped to the reported issue: two refactored lowering transformations and one regression test file with no extraneous modifications.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@LeiWang1999

Copy link
Copy Markdown
Member

doesn't make sense to me, checkout #2157 , closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] T.atomic_load fails in LowerAccessPtr when the address uses a block-derived loop variable

2 participants