[Relax][TFLite] Introduce TensorFlow Lite frontend #18868

Merged
mshr-h merged 10 commits into apache:main from srkreddy1238:tflite_frontend_rebase on Mar 23, 2026

Conversation

@srkreddy1238

Contributor

Verified for the entire range of classification nets.
Quantization is disabled at the moment.
A few ops in the conversion maps are still unsupported; they need to be mapped in the future as the Relax op inventory grows.

@gemini-code-assist

Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the TFLite frontend for TVM's Relax framework by rebasing its implementation and expanding its capabilities. It introduces the fundamental functionality to convert TFLite models into Relax graphs, supporting a broader array of operations and model architectures. A crucial addition is the Flexbuffer decoder, which facilitates the interpretation of custom options embedded within TFLite models. These changes are rigorously validated through a comprehensive test suite, including end-to-end verification for various Keras classification networks, ensuring the accuracy and robustness of the model conversion process.

Highlights

  • TFLite Frontend Rebase: The TFLite frontend for TVM's Relax framework has been rebased, introducing core functionality to convert TFLite graphs into Relax graphs.
  • Flexbuffer Deserialization: A new module for partial Flexbuffer deserialization was added, enabling the parsing of custom options within TFLite models.
  • Expanded Operation Support: Extensive test coverage was added for a wide range of TFLite operations, including element-wise, binary, logical, reduction, convolution, and pooling operations.
  • Keras Network Verification: The TFLite frontend now includes end-to-end verification for several Keras classification networks, integrated into nightly CI builds.

Changelog
  • python/tvm/relax/frontend/tflite/__init__.py
    • Exported the from_tflite function to expose TFLite frontend functionality.
  • python/tvm/relax/frontend/tflite/tflite_flexbuffer.py
    • Added a new module for partial Flexbuffer deserialization.
    • Defined BitWidth and FlexBufferType enums for Flexbuffer schema.
    • Implemented methods for decoding keys, vectors, and maps from Flexbuffer.
  • tests/python/relax/test_frontend_tflite.py
    • Added a new test file dedicated to the TFLite frontend.
    • Included utility functions _get_mod_from_cfunc and verify for TFLite model conversion and end-to-end testing.
    • Implemented tests for various TFLite operations: Add, AddN, Split, Pack, Cast, ExpandDims, Transpose, Reshape, ConcatV2, MultiOutput, ELU, GELU, Swish, Fill, binary operations (add, subtract, multiply, divide, floormod, floordiv), Pow, Square, element-wise operations (relu, relu6, floor, ceil, tanh, sigmoid, abs, cos, sin, exp, negative, round, rsqrt, softmax, sqrt), comparison operations (less, less_equal, equal, not_equal), logical operations (logical_not, logical_or, logical_and), reduction operations (argmax, argmin), Conv2D, and Pool2D.
    • Added tests for several Keras classification networks (e.g., Xception, ResNet50, MobileNetV2, EfficientNetB0).
  • tests/scripts/task_python_nightly.sh
    • Updated the nightly CI script to include TFLite frontend tests.
    • Exported the CI_ENV_NIGHTLY environment variable to enable network tests within the CI.
Activity
  • No human activity (comments, reviews, etc.) was observed in the provided context.

@srkreddy1238 srkreddy1238 requested review from mshr-h and tlopex March 4, 2026 03:18

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor


Code Review

This pull request introduces a TFLite frontend for Relax, including a flexbuffer parser and extensive tests. The changes look good overall, but there are some critical issues in the flexbuffer parser that could lead to incorrect behavior or crashes when handling different data types and byte widths. I've identified a couple of bugs in tflite_flexbuffer.py related to handling byte widths and data types during deserialization, and a minor issue with exception handling.

Note: Security Review is unavailable for this PR.

Comment on lines +81 to +86
        unpack_str = ""
        if byte_width == 1:
            unpack_str = "<B"
        elif byte_width == 4:
            unpack_str = "<i"
        assert unpack_str != ""

Contributor


critical

The indirect_jump function does not correctly handle all possible byte widths for offsets and uses a signed integer format for 4-byte offsets, which is incorrect for flexbuffer offsets. Flexbuffer offsets are unsigned and can have byte widths of 1, 2, 4, or 8. This implementation only supports 1 and 4-byte widths and incorrectly uses a signed format (<i) for 4-byte offsets. This can lead to incorrect parsing of flexbuffers.

        unpack_map = {1: "<B", 2: "<H", 4: "<I", 8: "<Q"}
        if byte_width not in unpack_map:
            raise NotImplementedError(f"Unsupported byte width for indirect jump: {byte_width}")
        unpack_str = unpack_map[byte_width]
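
As a minimal sketch of the suggestion above, the helper below reads an unsigned little-endian integer of any FlexBuffer byte width and uses it for an indirect jump. The free functions `read_uint` and `indirect_jump` are illustrative names, not the PR's code, and the sketch assumes the usual FlexBuffers convention that an offset points backwards from the location where it is stored.

```python
import struct

# Unsigned little-endian struct formats keyed by FlexBuffer byte width.
_UNPACK_FORMATS = {1: "<B", 2: "<H", 4: "<I", 8: "<Q"}


def read_uint(buffer: bytes, loc: int, byte_width: int) -> int:
    """Read an unsigned little-endian integer of `byte_width` bytes at `loc`."""
    if byte_width not in _UNPACK_FORMATS:
        raise NotImplementedError(f"Unsupported byte width: {byte_width}")
    return struct.unpack_from(_UNPACK_FORMATS[byte_width], buffer, loc)[0]


def indirect_jump(buffer: bytes, loc: int, byte_width: int) -> int:
    """Follow the offset stored at `loc` (assumed to point backwards from `loc`)."""
    return loc - read_uint(buffer, loc, byte_width)


# The same two bytes read as signed versus unsigned: a signed format
# turns a large offset into a negative number.
raw = bytes([0xFE, 0xFF])
assert struct.unpack("<h", raw)[0] == -2
assert read_uint(raw, 0, 2) == 0xFFFE
```

Keeping the width-to-format table in one place means the offset read and the map-size read can share the same code path instead of each hardcoding a format string.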

    def decode_map(self, end, byte_width, parent_byte_width):
        """Decodes the flexbuffer map and returns a dict"""
        mid_loc = self.indirect_jump(end, parent_byte_width)
        map_size = struct.unpack("<i", self.buffer[mid_loc - byte_width : mid_loc])[0]

Contributor


critical

The size of the map is hardcoded to be parsed as a 4-byte signed integer (<i), but the byte_width can vary (1, 2, 4, or 8). Additionally, map sizes should be unsigned integers. This will cause incorrect parsing for flexbuffers that use different byte widths for map sizes.

You can fix this by determining the unpack format based on byte_width:

        unpack_map = {1: "<B", 2: "<H", 4: "<I", 8: "<Q"}
        if byte_width not in unpack_map:
            raise NotImplementedError(f"Unsupported byte width for map size: {byte_width}")
        unpack_str = unpack_map[byte_width]
        map_size = struct.unpack(unpack_str, self.buffer[mid_loc - byte_width : mid_loc])[0]

            elif value_type == FlexBufferType.FBT_FLOAT:
                value = struct.unpack("<f", value_bytes)[0]
            else:
                raise Exception

Contributor


medium

Using a generic Exception is not recommended as it can obscure the actual error. It's better to use a more specific exception type, like NotImplementedError, to provide more context about what went wrong. This helps with debugging and error handling.

                raise NotImplementedError(f"Flexbuffer type {value_type} is not supported for decoding.")
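
To illustrate why the specific exception type helps, here is a small sketch of a scalar decoder that fails with `NotImplementedError` on unhandled types. `ScalarType` and `decode_scalar` are hypothetical stand-ins for the PR's `FlexBufferType` enum and decoding method; the enum covers only a subset of types, and the sketch assumes 4-byte values.

```python
import struct
from enum import IntEnum


class ScalarType(IntEnum):
    """Illustrative stand-in for the PR's FlexBufferType enum (subset only)."""

    FBT_INT = 1
    FBT_UINT = 2
    FBT_FLOAT = 3
    FBT_STRING = 5


def decode_scalar(value_type: ScalarType, value_bytes: bytes):
    """Decode a 4-byte scalar; fail loudly on types this sketch does not handle."""
    if value_type == ScalarType.FBT_INT:
        return struct.unpack("<i", value_bytes)[0]
    if value_type == ScalarType.FBT_UINT:
        return struct.unpack("<I", value_bytes)[0]
    if value_type == ScalarType.FBT_FLOAT:
        return struct.unpack("<f", value_bytes)[0]
    raise NotImplementedError(
        f"Flexbuffer type {value_type!r} is not supported for decoding."
    )
```

A caller can then catch `NotImplementedError` specifically and report which type was missing, without also swallowing unrelated failures such as `struct.error`.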

@mshr-h mshr-h self-assigned this Mar 4, 2026

@tlopex tlopex left a comment

Member


Overall LGTM. Since this is an initial PR, I think we can fix potential issues afterwards.

@mshr-h

mshr-h commented Mar 23, 2026

Contributor

Sorry for the late review. Thanks. LGTM.

@mshr-h mshr-h changed the title from "Tflite frontend rebase" to "[Relax][TFLite] Introduce TensorFlow Lite frontend" Mar 23, 2026
@mshr-h mshr-h merged commit 5e05f70 into apache:main Mar 23, 2026
14 checks passed
tlopex pushed a commit that referenced this pull request Jun 24, 2026
## Summary

This PR adds Relax TFLite frontend support for dynamic (runtime) scalar bounds in the `RANGE` operator, addressing the `RANGE` "fix partial implementations" item from #19412 section C.

`convert_range` previously lowered only **constant** `start`, `limit`, and `delta` to `relax.op.arange` and raised `OpNotImplemented` for runtime scalar bounds (the guard added in #19401). Models that compute RANGE bounds at runtime could therefore not be imported. This PR makes the dynamic path work for both integer and float bounds, ascending or descending, without adding a new Relax op. The change is limited to the `RANGE` converter and its test.

#19813 added a batch of missing TFLite operator mappings but did not touch this partial-implementation item; this PR closes it.

## Design

### Dynamic scalar bounds via count-lift

`relax.op.arange` only accepts compile-time `PrimExpr` bounds. The frontend already has a runtime-scalar -> symbolic-dimension bridge (`relax.op.tensor_to_shape` + `match_cast`, as used by `_get_shape_expr_from_tensor`), so no new op is needed.

Rather than feed symbolic bounds straight into `arange`, the converter computes the element **count** in-graph and lifts that single value to one symbolic output dimension `L`, then rebuilds the values as `arange(0, L) * delta + start`. Lifting the count (instead of the bounds) keeps the declared and runtime output lengths equal by construction: `arange`'s struct-info length formula (`InferTypeArange`) has no negative-step branch, so feeding symbolic bounds directly would mis-declare descending ranges relative to the TOPI runtime length.

The count is `max(0, ceil((limit - start) / delta))`, computed per dtype:

- **integer**: `-floor_divide(start - limit, delta)`, which is exact, sign-agnostic, and free of float-precision loss; it equals `ceil((limit - start) / delta)`.
- **float**: `ceil((limit - start) / delta)`.

Constant (all-bounds-constant) RANGE keeps the existing direct-`arange` path unchanged.
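
The count formulas above can be checked in plain Python. This is only a sketch of the arithmetic, not the converter's Relax code: the function names are illustrative, Python's `//` stands in for `floor_divide`, and the built-in `range` serves as the reference for integer bounds.

```python
import math


def range_count_int(start: int, limit: int, delta: int) -> int:
    """Element count for integer bounds: max(0, -floor_divide(start - limit, delta))."""
    return max(0, -((start - limit) // delta))


def range_count_float(start: float, limit: float, delta: float) -> int:
    """Element count for float bounds: max(0, ceil((limit - start) / delta))."""
    return max(0, math.ceil((limit - start) / delta))


def rebuild_range(start, delta, count):
    """Rebuild the values as arange(0, count) * delta + start."""
    return [i * delta + start for i in range(count)]


# Ascending, descending, exactly divisible, and empty integer ranges.
for start, limit, delta in [(0, 10, 3), (10, 0, -3), (0, 9, 3), (5, 0, 1)]:
    count = range_count_int(start, limit, delta)
    assert rebuild_range(start, delta, count) == list(range(start, limit, delta))
```

Because only the count is needed to size the output, the same `arange(0, L)` shape works for ascending and descending ranges; the sign of `delta` only affects the rebuilt values.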

## Operator Support

| Operator | TFLite inputs | Relax lowering | Supported subset |
|---|---|---|---|
| `RANGE` | scalar `start`, `limit`, `delta` | `relax.op.arange` (constant bounds); count-lift + `arange(0, L) * delta + start` (dynamic bounds) | int and float, constant or runtime scalar bounds, ascending or descending |

## Tests

The dynamic test compiles the imported module and runs it on the Relax VM, comparing the output against `numpy.arange`. The constant-bound structural test is unchanged.

| Test | Coverage |
|---|---|
| `test_range` | constant scalar bounds (existing, unchanged) |
| `test_range_dynamic_scalar_inputs` | runtime scalar bounds: int and float, ascending and descending |

Local validation:

```bash
python -m ruff format --check \
  python/tvm/relax/frontend/tflite/tflite_frontend.py \
  tests/python/relax/test_frontend_tflite.py

python -m ruff check \
  python/tvm/relax/frontend/tflite/tflite_frontend.py \
  tests/python/relax/test_frontend_tflite.py

python -m pytest \
  tests/python/relax/test_frontend_tflite.py -k range -q

python -m pytest \
  tests/python/relax/test_frontend_tflite.py -q
```

Result:

```text
ruff format --check: 2 files already formatted
ruff check: All checks passed
range tests: 12 passed, 536 deselected
full TFLite pytest: 548 passed
```

## References

- Issue #19412 section C: fix partial TFLite operator implementations (`RANGE`)
- PR #19401: added the `RANGE` dynamic-scalar guard and its test
- PR #18868: introduced the Relax TFLite frontend and `convert_range`
3 participants