[TIR][IR] Update to use tirx #2216
- Added a new pass to hoist loop-invariant if statements out of loops, improving optimization opportunities.
- Introduced classes for collecting and checking conditions, as well as for rewriting statements.
- Integrated the new pass into the optimization pipeline and provided a corresponding API for usage.

…ache disabling in benchmark script
- Introduced `CallNodeChecker` class to identify `CallNode` expressions in loop conditions, enhancing loop-invariant checks.
- Updated `IsLoopInvariant` function to reject conditions containing `CallNode`s, preventing potential side effects.
- Added `tilelang.disable_cache()` in `benchmark_mha_sink_fwd.py` to optimize performance during benchmarking.

- Added support for Let-bound variables in the `WrittenBufferReadChecker` to improve buffer read checks.
- Introduced `UsesLoopVarThroughLetBindings` function to check if conditions depend on loop variables through Let bindings.
- Updated `IsLoopInvariant` function to account for Let bindings when determining loop invariance.
- Enhanced `HoistableIfFinder` to track Let bindings for variables bound to `BufferLoad` expressions.
- Added debug print statements in the `OptimizeForTarget` function to visualize the module state before and after loop unswitching.

- Added `CallCheckerExcludingIf` class to ensure function calls outside of hoisted if statements are identified, preventing potential synchronization issues during loop unswitching.
- Updated loop unswitching logic to incorporate the new call checker, enhancing safety and correctness.
- Integrated debug print statements in `OptimizeForTarget` to visualize module state before and after loop unswitching.
- Disabled tilelang cache in the benchmark script for improved performance.
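For readers unfamiliar with the transformation these commits describe, the following is a minimal plain-Python illustration of loop unswitching. It is not the TileLang pass itself; `scale_before` and `scale_after` are hypothetical names chosen for this sketch, and the real pass operates on TIR statements rather than Python loops.

```python
def scale_before(values, use_double):
    """Loop with a loop-invariant condition inside the body."""
    out = []
    for v in values:
        # `use_double` does not depend on `v` or on anything written in the
        # loop, so the test is loop-invariant but still runs every iteration.
        if use_double:
            out.append(v * 2)
        else:
            out.append(v + 1)
    return out


def scale_after(values, use_double):
    """Same computation after unswitching: the test is hoisted out of the loop."""
    out = []
    if use_double:
        # Specialised copy of the loop for the true branch.
        for v in values:
            out.append(v * 2)
    else:
        # Specialised copy of the loop for the false branch.
        for v in values:
            out.append(v + 1)
    return out


data = [1, 2, 3]
print(scale_before(data, True) == scale_after(data, True))
print(scale_before(data, False) == scale_after(data, False))
```

The rewrite is only safe when the condition really is invariant. That is why the commits above reject conditions that contain calls, read buffers written in the loop, or reach the loop variable through Let bindings.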
…nts in OptimizeForTarget function
- Introduced a new configuration option `tl.disable_loop_unswitching` to allow users to disable the loop unswitching optimization.
- Updated the Loop Unswitching pass to check this configuration and return the original function if the option is enabled.
- Added relevant documentation in the PassConfigKey enumeration for clarity.
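The gating behaviour described in this commit can be sketched in a few lines of Python. Only the option name `tl.disable_loop_unswitching` comes from the commit message; `run_loop_unswitching`, its `config` dictionary, and the `rewrite` callback are hypothetical stand-ins, not the actual C++ pass or its API.

```python
# Option name taken from the commit message above.
DISABLE_KEY = "tl.disable_loop_unswitching"


def run_loop_unswitching(func, config, rewrite):
    """Hypothetical pass driver.

    `func` stands in for a PrimFunc, `config` for the pass configuration,
    and `rewrite` for the actual unswitching transformation.
    """
    if config.get(DISABLE_KEY, False):
        # Option enabled: hand back the original function untouched.
        return func
    return rewrite(func)


# Toy usage with a string in place of a function and str.upper as the rewrite.
print(run_loop_unswitching("kernel", {DISABLE_KEY: True}, str.upper))
print(run_loop_unswitching("kernel", {}, str.upper))
```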
# Conflicts:
#   src/op/builtin.h
#   src/transform/loop_unswitching.cc
#   testing/python/transform/test_tilelang_transform_loop_unswitching.py
#   tilelang/transform/__init__.py
#   tilelang/transform/pass_config.py
- Removed references to `tvm.tir` and replaced them with `tvm.tirx` across various files, including examples and backend operations.
- Updated target configuration in documentation to reflect the new usage of target config dictionaries instead of CLI-style strings.
- Cleaned up `pyproject.toml` and `load_tvm.cmake` by removing obsolete paths.
- Enhanced examples for dequantization and GEMM operations to utilize the new `tirx` constructs.
- Adjusted various backend operations to ensure compatibility with the new `tirx` namespace.
Review skipped: too many files. This PR contains 300 files, which is 150 over the limit of 150.
…tor/tirx-tvm-update
# Conflicts:
#   testing/python/transform/test_tilelang_transform_producer_consumer_ws.py
#   tilelang/cuda/intrinsics/macro/tcgen05_macro_generator.py
#   tilelang/engine/lower.py
#   tilelang/utils/target.py
…dify target_host initialization in lower.py for compatibility with TVM's target handling.
…d library dependencies for TVM integration. Adjusted output targets and library names for better compatibility across platforms.
…h wildcard for better compatibility with dependency packages.
…tor/tirx-tvm-update
@regression-perf
Performance Regression Test Report
Triggered by: @LeiWang1999
…tor/tirx-tvm-update
# Conflicts:
#   src/backend/cuda/op/gemm_sp.cc
#   src/op/gemm.cc
#   src/op/gemm_sp.cc
#   src/op/gemm_sp.h
#   src/op/gemm_sp_py.cc
#   src/op/gemm_sp_py.h
#   tilelang/cuda/op/gemm_sp/gemm_sp_mma.py
#   tilelang/ir.py
#   tilelang/language/experimental/gemm_sp.py
#   tilelang/tileop/gemm_sp/__init__.py
…ctor/tirx-tvm-update
…torization. Introduce new functions for preferred vectorized size and update existing reduction logic to handle packed operations for bfloat16 and float16 types. Add nan-aware min and max operations in CUDA and ROCm backends, and update related tests to validate functionality.
…tor/tirx-tvm-update
…r packed operations in reduce.h
…tor/tirx-tvm-update
# Conflicts:
#   src/transform/legalize_negative_index.cc
#   tilelang/jit/adapter/cutedsl/wrapper.py
@regression-perf
Local tests pass; looking forward to the regression test.
…tor/tirx-tvm-update
…pdated `BufferLoadNode` to `tirx::BufferLoadNode` in `GetBarrier` and `LowerCluster` methods to ensure compatibility with recent changes in the TIR API.
## Summary

This PR adds a Z3 SMT solver backend to `tvm::arith::Analyzer` for stronger integer arithmetic proving. The integration is guarded by `USE_Z3`, which defaults to `AUTO`. In the default mode, TVM enables Z3 when the static Z3 development artifacts are available and otherwise builds the conservative stub implementation.

When Z3 is enabled, `Analyzer::CanProve` runs the existing TVM arithmetic analysis path first, then falls back to Z3 only when the existing analyzers cannot prove the predicate and the requested strength is `kSymbolicBound`. Z3 is linked statically from the PyPI `z3-static` package, so `libtvm` does not need a runtime `libz3` dependency.

## Features

- Z3 build support through `USE_Z3`, defaulting to `AUTO`.
- A new `arith::Z3Prover` sub-analyzer owned by `arith::Analyzer`.
- SMT-LIB2 export for debugging and external solver reproduction.
- Python debug/config APIs: `Analyzer.get_smtlib2`, `Analyzer.set_z3_timeout_ms`, `Analyzer.set_z3_rlimit`, and `Analyzer.get_z3_stats`.
- C++ APIs for proving, binding, constraints, stats, model inspection, and satisfying-value counting.
- Scalar integer, unsigned integer, and boolean expression translation to Z3.
- Support for arithmetic, comparisons, boolean operators, `min`, `max`, `select`, `if_then_else`, `let`, casts, truncated division/modulo, floor division/modulo, and selected bitwise/shift operations.
- Deterministic solver control using Z3 `rlimit`, with `random_seed` fixed to `42`.
- Thread-local Z3 context sharing to reduce initialization overhead while keeping thread safety.
- A disabled-mode stub implementation that returns conservative results when Z3 is not built.

## Implementation Notes

- The real and stub implementations live in `src/arith/z3_prover.cc`, selected by the `TVM_USE_Z3` macro from `cmake/modules/contrib/Z3.cmake`.
- `cmake/modules/contrib/Z3.cmake` first resolves the PIC static `libz3` layout provided by `z3-static` using its `z3_static.get_cmake_dir()` helper, then falls back to a custom `Z3_DIR` or `CMAKE_PREFIX_PATH` installation.
- `USE_Z3=ON` requires Z3 to be found, while `USE_Z3=AUTO` allows source builds and CI jobs without Z3 artifacts to continue with the stub.
- The Z3 fallback is exception-safe and gated behind `kSymbolicBound`, so the common `kDefault` path does not pay solver cost.
- TVM `Div` and `Mod` are translated with truncating helpers rather than Z3's Euclidean operators to stay sound for negative dividends.
- Shift handling relies on Z3's native bit-vector semantics and does not add hard assertions to the shared solver.

## References

The implementation is based on the Z3 analyzer integration used in TileLang's TVM fork, with the upstream port kept scoped to TVM's arithmetic analyzer.

- [tile-ai/tilelang#1367](tile-ai/tilelang#1367)
- [tile-ai/tilelang#1458](tile-ai/tilelang#1458)
- [tile-ai/tilelang#2216](tile-ai/tilelang#2216)
- [tile-ai#22](tile-ai#22)
- [tile-ai#24](tile-ai#24)
- [Original TileLang TVM commit](tile-ai@e633295)

---------

Signed-off-by: Ubospica <ubospica@gmail.com>
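The note about truncating versus Euclidean division is easy to check by hand. The sketch below uses plain Python integers rather than Z3 or TVM; the helper names (`trunc_div`, `trunc_mod`, `euclid_div`, `euclid_mod`) are hypothetical and only illustrate why the two conventions disagree once the dividend is negative.

```python
def trunc_div(a, b):
    """Quotient rounded toward zero, as in C-style integer division."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def trunc_mod(a, b):
    """Remainder paired with trunc_div; takes the sign of the dividend."""
    return a - b * trunc_div(a, b)


def euclid_mod(a, b):
    """Euclidean remainder: always satisfies 0 <= r < |b|."""
    return a % abs(b)


def euclid_div(a, b):
    """Quotient paired with euclid_mod, so that a == b * q + r."""
    return (a - euclid_mod(a, b)) // b


# Both conventions agree for non-negative dividends ...
print(trunc_div(7, 2), trunc_mod(7, 2), euclid_div(7, 2), euclid_mod(7, 2))
# ... but differ for a negative dividend.
print(trunc_div(-7, 2), trunc_mod(-7, 2), euclid_div(-7, 2), euclid_mod(-7, 2))
```

For `-7` divided by `2`, truncation gives quotient `-3` with remainder `-1`, while the Euclidean convention gives quotient `-4` with remainder `1`. Encoding a truncating operation with Euclidean operators would therefore let the solver prove facts that do not hold for the truncating semantics.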
Closes apache/tvm-ffi#464
Summary
This PR migrates TileLang to the updated TVM `tirx` API and the newer TVM-FFI baseline. It replaces the old `tir` parser/builder/type surface with `tirx`, updates the C++ compiler and backend code to compile against the new namespaces, and refreshes the Python language, JIT, examples, and tests accordingly.

Major changes
- … `tir` to `tirx` across C++ and Python:
  - `tvm::tir` / `tvm.tir` -> `tvm::tirx` / `tvm.tirx`
  - `tvm.script.parser.tir` -> `tvm.tirx.script.parser`
  - `tvm.script.ir_builder.tir` -> `tvm.tirx.script.builder`
  - `tvm.tir.*` type imports such as `PrimExpr`, `Buffer`, `BufferRegion`, `IndexMap`, and `Var` -> `tvm.tirx.*`
- … `tirx` IR:
  - … `LetStmt` builder/frame usage with `tirx.bind` / `tirx::Bind`.
  - … `LetStmt` wrappers as `SeqStmt({tirx::Bind(...), body})` where `tirx` represents the binding as a flat statement.
  - … (`register_let_value`, `clear_let_values`) for cases that still need to recover aliases from bound values, such as buffer-region pointer handling.
- … `tirx` structured block APIs, including `SBlockFrame`, `SBlockAllocBuffer`, `alloc_buffer`, and `sblock_attr`.
- … `tirx.transform.*`.
- … `tir.*` to `tirx.*`, including vectorization, storage rewrite, async copy, noalias, lower-pass injection, and debug options.
- … `s_tir` transforms for passes that have not moved to `tirx` yet.
- … `tirx` nodes, visitors, intrinsic calls, and analysis helpers.
- … `src/support/ffi_aliases.h` compatibility shim and add `src/support/check.h` wrappers around the new TVM-FFI check macros.
- … `apache-tvm-ffi` dependency requirement.
- … `tirx` surface, including parser overrides, eager builder behavior, JIT adapters, TileOp templates, and transform tests.

Validation
- `./format.sh`

Review notes
This is a broad mechanical migration with some semantic updates around binding construction. Reviewers should pay particular attention to:
- … `LetStmt` was converted to `Bind`, especially code that depends on binding scope or alias recovery.
- … `tirx` namespace and visitor/type updates.
- … `tir.*` keys.
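Downstream code that imports from the old Python paths needs the same renames. The following is a rough sketch of a text-based rewrite built only from the three module-path mappings listed under "Major changes". `migrate_imports` and `RENAMES` are hypothetical names, not part of TileLang or TVM, and the sketch ignores forms such as `from tvm import tir`, which would need separate handling.

```python
import re

# Python-side renames from the PR description, most specific first so the
# generic `tvm.tir` rule never fires on a path that was already rewritten.
RENAMES = [
    (r"\btvm\.script\.parser\.tir\b", "tvm.tirx.script.parser"),
    (r"\btvm\.script\.ir_builder\.tir\b", "tvm.tirx.script.builder"),
    (r"\btvm\.tir\b", "tvm.tirx"),
]


def migrate_imports(source: str) -> str:
    """Apply the module-path renames to a block of Python source text."""
    for pattern, replacement in RENAMES:
        source = re.sub(pattern, replacement, source)
    return source


old = "\n".join([
    "from tvm.tir import PrimExpr, Buffer",
    "import tvm.script.parser.tir as T",
    "from tvm.script.ir_builder.tir import frame",
])
print(migrate_imports(old))
```

The trailing `\b` keeps the generic rule from matching `tvm.tirx`, so running the rewrite twice leaves already-migrated code unchanged. A purely textual pass like this cannot catch the semantic changes around `LetStmt` and `Bind` called out in the review notes.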