[ADRENO] Revive and consolidate Adreno features #18867

Merged: tqchen merged 2 commits into apache:main from srkreddy1238:rebase_adreno on Mar 4, 2026

@srkreddy1238

Contributor

Enable the OpenCL target for GPU tests.
Consolidate all Adreno tests under tests/python/relax/backend/adreno.
Update CLML to match recent changes to the JSON codegen and runtime.
Add a Docker specification for Adreno (ci_gpu + Android SDK, Gradle).

@gemini-code-assist

Contributor

Summary of Changes

This pull request significantly enhances Adreno GPU support by updating the CLML SDK version, introducing a dedicated Docker-based CI environment, and extending the CLML backend with new capabilities for Large Language Model (LLM) inference. The changes streamline Adreno development and testing, improve CLML runtime efficiency, and expand the range of supported operations, particularly for quantized matrix multiplications.

Highlights

  • CLML SDK Version Update: The CLML SDK major version has been updated from 2 to 4 in the CMake configuration, reflecting advancements in the underlying library.
  • Adreno CI Environment Setup: A new Dockerfile and an associated installation script have been introduced to establish a dedicated Adreno CI environment, including the Android SDK and Gradle, streamlining continuous integration for Adreno targets.
  • Large Language Model (LLM) Support: The CLML backend now includes support for dequantize-matmul patterns, specifically tailored for optimizing Large Language Model (LLM) inference on Adreno GPUs.
  • CLML Runtime Enhancements: The CLML runtime has been significantly refactored to handle dynamic tensor dimensions, improve memory management, and introduce a new CLML function descriptor for more robust operator management.
  • Test Infrastructure Consolidation: All Adreno-related tests have been consolidated under a new tests/python/relax/backend/adreno directory, involving renaming existing test files and updating the test infrastructure for better organization and maintainability.
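
To make the dequantize-matmul highlight concrete, here is a minimal pure-Python sketch of what such a pattern computes: integer weights are expanded to floats using a scale and zero point, and the result is multiplied with the activations. The affine scheme, the per-column granularity, and all function names are assumptions for illustration only; the summary above does not state which quantization format the CLML backend expects.

```python
def dequantize(q_weight, scales, zero_points):
    """Expand integer weights: w[k][n] = scales[n] * (q[k][n] - zero_points[n])."""
    return [
        [scales[n] * (q - zero_points[n]) for n, q in enumerate(row)]
        for row in q_weight
    ]


def matmul(x, w):
    """Plain matrix multiply of x (M x K) with w (K x N)."""
    k_dim, n_dim = len(w), len(w[0])
    return [
        [sum(row[k] * w[k][n] for k in range(k_dim)) for n in range(n_dim)]
        for row in x
    ]


def dequantize_matmul(x, q_weight, scales, zero_points):
    """Unfused reference: dequantize the weights first, then multiply."""
    return matmul(x, dequantize(q_weight, scales, zero_points))


# Hypothetical example values (not taken from the PR)
x = [[1.0, 2.0]]                # activations, shape (1, 2)
q_weight = [[3, 8], [5, 10]]    # quantized weights, shape (2, 2)
scales = [0.5, 0.25]            # one scale per output column
zero_points = [1, 8]            # one zero point per output column

result = dequantize_matmul(x, q_weight, scales, zero_points)
print(result)
```

An offload pass that recognizes this two-step sequence can hand it to the backend as a single operation, so the float weights never need to be materialized in the compiled graph.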

Changelog
  • cmake/modules/contrib/CLML.cmake
    • Updated the CLML major version from 2 to 4.
  • docker/Dockerfile.ci_adreno
    • Added a new Dockerfile for Adreno CI, including necessary tools like clang-format, adb, and uv.
  • docker/install/ubuntu_install_androidsdk.sh
    • Added a script to install the Android SDK, NDK, and CMake within the Docker environment.
  • python/tvm/relax/backend/adreno/clml.py
    • Introduced GlobalVarPattern and TuplePattern imports.
    • Added _check_dequantize_matmul and dequantize_matmul_patterns functions.
    • Implemented a new OpenCLMLOffLoadForLLM pass to support dequantize-matmul operations.
  • python/tvm/relax/backend/adreno/pipeline.py
    • Modified library_dispatch_passes to include the new OpenCLMLOffLoadForLLM pass when CLML is enabled.
  • src/relax/backend/contrib/clml/codegen.cc
    • Removed the SaveGlobalAttributes(node) call from the OpenCLMLJSONSerializer.
  • src/runtime/contrib/clml/clml_memory_planner.cc
    • Changed ssize_t to size_t for free_start, free_size, and last_found_size variables.
  • src/runtime/contrib/clml/clml_runtime.cc
    • Implemented dynamic tensor dimension handling.
    • Updated memory release logic.
    • Modified clEnqueueMLOpQCOM calls to use function[i].op.
    • Added SetTensorMemDesc for dynamic tensor updates.
    • Introduced CreateDequantMatmulLayer for dequantize-matmul operations.
    • Refactored layer_.function to store CLMLFunctionDesc objects and removed layer_.layer_names.
  • src/runtime/contrib/clml/clml_runtime.h
    • Added V5_API macro and new CLML_CALL macros for clSetMLTensorDimensionsQCOM, clUpdateMLOpQCOM, clGetMLOpDeviceMemoryRequirementsQCOM, clUpdateMLOpDeviceMemoryQCOM.
    • Updated CLML_CALL_clCreateMLTensorQCOM to support CLML v5.
    • Introduced CLMLFunctionDesc struct to store op, layer name, and properties.
    • Updated CachedLayer to use the new CLMLFunctionDesc struct.
  • src/runtime/contrib/clml/clml_utils.cc
    • Modified DeviceMakeCLMLTensor and MakeCLMLTensor functions to accept tensorProps for dynamic tensor properties.
    • Adjusted GetTensorDims to handle dynamic shapes by using fmax and checking shape.size().
  • src/runtime/contrib/clml/clml_utils.h
    • Updated function signatures for DeviceMakeCLMLTensor and MakeCLMLTensor to include tensorProps parameter.
  • tests/python/relax/backend/adreno/mod_utils.py
    • Renamed from tests/python/relax/backend/clml/mod_utils.py.
    • Updated codegen attribute formats from list-of-lists to single values or lists.
    • Added get_dequant_matmul_module and get_dequant_vec_matmul_module for LLM tests.
  • tests/python/relax/backend/adreno/test_clml_ops.py
    • Renamed from tests/python/relax/backend/clml/test_clml_codegen.py.
    • Updated test infrastructure to use utils.py.
    • Added CLML_VERSION and TARGET_CLML_VERSION checks.
    • Introduced verify_clml_codegen and a more flexible verify function.
    • Added new tests for dequant_matmul and dequant_vec_matmul.
    • Commented out some existing tests due to padding realization changes.
  • tests/python/relax/backend/adreno/test_texture_network.py
    • Renamed from tests/python/relax/texture/test_network.py.
    • Updated test verification to use verify_results from the new utils.py.
  • tests/python/relax/backend/adreno/test_texture_ops.py
    • Renamed from tests/python/relax/texture/test_ops.py.
    • Updated test verification to use verify_results from the new utils.py.
    • Adjusted some test modules by removing redundant add and relu operations.
  • tests/python/relax/backend/adreno/test_transform_annotate_custom_scope.py
    • Renamed from tests/python/relax/adreno/test_transform_annotate_custom_scope.py.
  • tests/python/relax/backend/adreno/test_transform_fold_vdevice_scope_change.py
    • Renamed from tests/python/relax/adreno/test_transform_fold_vdevice_scope_change.py.
  • tests/python/relax/backend/adreno/utils.py
    • Added a new utility file containing run_time_check, SessionManager, run_local, build_and_run, and verify_results functions to standardize Adreno test execution and verification.
  • tests/python/relax/backend/clml/conftest.py
    • Removed.
  • tests/python/relax/backend/clml/test_op_exec_clml_codegen.py
    • Removed.
  • tests/python/relax/backend/clml/utils.py
    • Removed.
  • tests/scripts/ci.py
    • Modified the generate_command function signature to make the 'tests' parameter optional.
  • tests/scripts/task_python_integration_gpuonly.sh
    • Removed the export of TVM_RELAY_OPENCL_TEXTURE_TARGETS.
  • tests/scripts/task_python_unittest_gpuonly.sh
    • Updated TVM_TEST_TARGETS to include OpenCL for Mali and Adreno.
    • Removed old CLML texture test execution.
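
The clml_runtime entries above describe replacing separate per-layer lists (operators in layer_.function, names in layer_.layer_names) with a single list of CLMLFunctionDesc records. The following Python sketch only illustrates the shape of that refactor; the real structure is C++ in clml_runtime.h, and the class names, field names, and the `dynamic` property used here are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FunctionDesc:
    """Hypothetical analogue of a per-operator descriptor."""
    op: object                                   # opaque operator handle
    layer_name: str                              # name kept next to its op
    properties: dict = field(default_factory=dict)


class CachedLayer:
    """Holds one descriptor per operator instead of parallel lists."""

    def __init__(self):
        self.function = []

    def add(self, op, layer_name, **properties):
        self.function.append(FunctionDesc(op, layer_name, properties))

    def run(self, enqueue):
        # Mirrors the idea of enqueueing function[i].op for each entry
        for desc in self.function:
            enqueue(desc.op)


layer = CachedLayer()
layer.add("op0", "conv2d", dynamic=False)
layer.add("op1", "dequant_matmul", dynamic=True)

executed = []
layer.run(executed.append)
print(executed)
```

Keeping the operator, its name, and its properties in one record means the lists cannot drift out of sync by index, which is the usual motivation for this kind of change.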

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request updates the CLML SDK version in CLML.cmake, adds a new Dockerfile.ci_adreno for Adreno GPU CI, and introduces ubuntu_install_androidsdk.sh to install the Android SDK. It also adds a compiler pass, OpenCLMLOffLoadForLLM, that partitions the graph so that dequantize-matmul patterns are offloaded to the CLML backend. The changes in clml_runtime.cc and clml_utils.cc set dynamic tensor dimensions and update tensor memory descriptors.

The review comments suggest hardening ubuntu_install_androidsdk.sh: quoting the http_proxy variable and sanitizing the command-line arguments appended to /etc/profile to prevent command injection, and using mkdir -p for directory creation. The reviewer also recommends consolidating the apt-get update calls in Dockerfile.ci_adreno, removing a redundant COPY command, correcting a duplicate package listing, and updating the attributes format in the expected codegen for test_dequant_matmul and test_dequant_vec_matmul.
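
The shell hardening points in the review (quote untrusted values, create directories idempotently) can be illustrated with a small Python analogue. This is a sketch of the general technique, not the contents of ubuntu_install_androidsdk.sh; the helper name, the directory layout, and the hostile proxy value are made up for the example.

```python
import os
import shlex
import tempfile


def profile_export_line(name, value):
    """Build an `export NAME=value` line with the value shell-quoted."""
    return f"export {name}={shlex.quote(value)}"


# Equivalent of `mkdir -p`: creates parents and succeeds if the path exists
sdk_dir = os.path.join(tempfile.mkdtemp(), "android", "sdk")
os.makedirs(sdk_dir, exist_ok=True)
os.makedirs(sdk_dir, exist_ok=True)  # second call is a no-op, not an error

# A hostile value that would run a second command if written unquoted
hostile = "http://proxy:3128; rm -rf /tmp/x"
line = profile_export_line("http_proxy", hostile)
print(line)

# Parsing the line the way a POSIX shell tokenizes it shows that the
# whole value stays inside a single word
tokens = shlex.split(line)
print(tokens)
```

With the value quoted, the `;` is treated as data rather than as a command separator, which is the property the review asks for when text is appended to /etc/profile.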

Comment thread docker/install/ubuntu_install_androidsdk.sh Outdated
Comment thread docker/install/ubuntu_install_androidsdk.sh Outdated
Comment thread docker/Dockerfile.ci_adreno Outdated
Comment thread docker/Dockerfile.ci_adreno Outdated
Comment thread docker/Dockerfile.ci_adreno Outdated
Comment thread docker/install/ubuntu_install_androidsdk.sh Outdated
Comment thread tests/python/relax/backend/adreno/test_clml_ops.py Outdated
Comment thread tests/python/relax/backend/adreno/test_clml_ops.py Outdated
@tqchen

tqchen commented Mar 3, 2026

Member

The consolidation looks good. Maybe we can land those changes first, then handle CI separately?

Enable the OpenCL target for GPU tests.
Consolidate all Adreno tests under tests/python/relax/backend/adreno.
Update CLML to match recent changes to the JSON codegen and runtime.
Add a Docker specification for Adreno (ci_gpu + Android SDK).
@tqchen tqchen merged commit 21e5225 into apache:main Mar 4, 2026
10 checks passed