[Relax][Frontend][TFLite] Add initial StableHLO builtin operator support #19536
Merged
…upport

Add frontend mapping for 8 basic StableHLO TFLite builtin operators as pure unary/binary elementwise ops:
- STABLEHLO_ABS, STABLEHLO_NEGATE (unary)
- STABLEHLO_ADD, STABLEHLO_SUBTRACT, STABLEHLO_MULTIPLY, STABLEHLO_DIVIDE, STABLEHLO_MAXIMUM, STABLEHLO_MINIMUM (binary)

Implementation uses dedicated _convert_stablehlo_unary / _convert_stablehlo_binary helpers that intentionally bypass TFLite fused-activation and QNN code paths, since StableHLO ops carry no TFLite-specific quantization or fused-activation metadata in their flatbuffer representation.

Test coverage: 8 structural-equal tests with tvm.ir.assert_structural_equal.
…h ternary SELECT

Extend the StableHLO TFLite frontend with all remaining pure elementwise operators that require no attribute parsing:
- Unary: STABLEHLO_COSINE (cos), STABLEHLO_EXPONENTIAL (exp), STABLEHLO_FLOOR (floor), STABLEHLO_LOG (log), STABLEHLO_LOGISTIC (sigmoid), STABLEHLO_RSQRT (rsqrt), STABLEHLO_TANH (tanh)
- Binary: STABLEHLO_AND (logical_and), STABLEHLO_OR (logical_or), STABLEHLO_POWER (power), STABLEHLO_SHIFT_LEFT (left_shift)
- Ternary: STABLEHLO_SELECT (where) with dedicated _convert_stablehlo_ternary helper

The existing _convert_stablehlo_unary and _convert_stablehlo_binary helpers are reused; only STABLEHLO_SELECT needs the new ternary converter since R.where requires a 3-input signature with bool condition dtype.

Test coverage: 20 structural-equal tests (12 new, 8 from previous commit). The SELECT test uses inline flatbuffer construction to set the condition input dtype to BOOL, matching the R.where requirement.
…nOptions2 support

Introduce the first batch of StableHLO TFLite builtin operators that require BuiltinOptions2 attribute parsing:
- STABLEHLO_CONVERT → R.astype (reads output dtype from tensor metadata)
- STABLEHLO_CLAMP → R.minimum(R.maximum(x, min), max) (arg reordering)
- STABLEHLO_CONCATENATE → R.concat with StablehloConcatenateOptions
- STABLEHLO_BROADCAST_IN_DIM → R.broadcast_to with broadcast dimensions
- STABLEHLO_IOTA → R.arange + R.reshape + R.broadcast_to
- STABLEHLO_COMPARE → R.equal/greater/less/... with 6 comparison directions

Add _get_stablehlo_options helper for parsing BuiltinOptions2 flatbuffers. R.clip was considered for CLAMP but rejected because it only accepts scalar PrimValue min/max, not tensor inputs.

Test coverage: 32 structural-equal tests (20 previous + 12 new) passed.
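The CLAMP lowering in this commit composes a maximum with a minimum after reordering the arguments. The following is a minimal pure-Python sketch of that composition, assuming the operator's inputs arrive in StableHLO order (min, operand, max). Plain lists stand in for tensors, and `clamp_stablehlo`, `broadcast_bound`, `elementwise_maximum`, and `elementwise_minimum` are hypothetical names for illustration, not functions from the PR.

```python
def broadcast_bound(bound, length):
    """Expand a scalar bound to a list; pass list bounds through unchanged."""
    if isinstance(bound, (int, float)):
        return [bound] * length
    return list(bound)


def elementwise_maximum(a, b):
    """Stand-in for an elementwise maximum over two equal-length lists."""
    return [max(x, y) for x, y in zip(a, b)]


def elementwise_minimum(a, b):
    """Stand-in for an elementwise minimum over two equal-length lists."""
    return [min(x, y) for x, y in zip(a, b)]


def clamp_stablehlo(min_t, operand, max_t):
    """Clamp with inputs in (min, operand, max) order.

    The bounds may be scalars or per-element lists, which is why a
    scalar-only clip does not cover this case.
    """
    lo = broadcast_bound(min_t, len(operand))
    hi = broadcast_bound(max_t, len(operand))
    # minimum(maximum(x, min), max): the operand moves to the first position.
    return elementwise_minimum(elementwise_maximum(operand, lo), hi)


# Scalar lower bound, per-element upper bound.
clamped = clamp_stablehlo(0.0, [-1.5, 0.25, 3.0], [1.0, 1.0, 2.0])
```

The per-element upper bound in the usage line is the case that motivates the two-op composition over a scalar-only clip.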
Add frontend mapping for two StableHLO TFLite builtin operators that manipulate tensor shapes:
- STABLEHLO_PAD → R.nn.pad with constant mode. Parses EdgePaddingLow, EdgePaddingHigh, and InteriorPadding from StablehloPadOptions. Raises OpNotImplemented when interior (dilation) padding is non-zero.
- STABLEHLO_DYNAMIC_SLICE → R.dynamic_strided_slice. Reads SliceSizes from StablehloDynamicSliceOptions and start indices from scalar tensor inputs. Begin/end/strides are constructed as int64 1D tensors.

Both ops extend the BuiltinOptions2 parsing infrastructure introduced in the previous commit, adding vector-attribute (PAD) and dynamic-input (DYNAMIC_SLICE) patterns.

Test coverage: 33 structural-equal tests passed (31 previous + 2 new).
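The begin/end/strides construction described for DYNAMIC_SLICE can be sketched in plain Python. This is an illustration under stated assumptions, not the converter's code: `slice_bounds` and `clamped_start` are hypothetical names, lists stand in for the int64 1D tensors, and rejecting out-of-bounds starts follows the in-bound-only restriction stated in the PR description. `clamped_start` shows the StableHLO rule for adjusting start indices, which the initial path defers.

```python
def slice_bounds(operand_shape, start_indices, slice_sizes):
    """Build begin/end/strides lists for a unit-stride slice.

    Raises NotImplementedError for starts that would need clamping,
    mirroring an in-bound-only conversion path.
    """
    if not (len(operand_shape) == len(start_indices) == len(slice_sizes)):
        raise ValueError("operand rank, start indices, and slice sizes must agree")
    begin, end = [], []
    for dim, start, size in zip(operand_shape, start_indices, slice_sizes):
        if start < 0 or start + size > dim:
            raise NotImplementedError("out-of-bounds start requires clamping")
        begin.append(start)
        end.append(start + size)
    strides = [1] * len(begin)
    return begin, end, strides


def clamped_start(operand_shape, start_indices, slice_sizes):
    """StableHLO-style adjustment: clamp each start into [0, dim - size]."""
    return [
        min(max(start, 0), dim - size)
        for dim, start, size in zip(operand_shape, start_indices, slice_sizes)
    ]


bounds = slice_bounds([4, 6], [1, 2], [2, 3])
adjusted = clamped_start([4, 6], [3, -1], [2, 3])
```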
…nt subset)

Add frontend mapping for STABLEHLO_GATHER with a conservative take-equivalent implementation:
- Parses 6 attributes from StablehloGatherOptions (OffsetDims, CollapsedSliceDims, StartIndexMap, IndexVectorDim, SliceSizes, IndicesAreSorted)
- Only supports single-axis gather with index vector dim == rank(indices)-1 and slice_sizes matching R.take semantics
- Validates offset_dims layout, output shape, and collapsed dims against expected R.take behavior; raises OpNotImplemented otherwise
- Reshapes indices from [N, 1] to [N] before calling R.take

Tests: 3 new (2 take-equivalent parametrized for axis 0/1, 1 error path for multi-dimensional start_index_map). Total: 38 stablehlo tests passed.
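The take-equivalence conditions listed in this commit can be written as a small predicate. The sketch below is a hedged reading of those bullets, not the PR's validation code: `take_axis_or_none` is a hypothetical name, shapes are plain lists, and the expected `offset_dims` layout is derived from the assumption that a take along `axis` produces `operand.shape[:axis] + batch_shape + operand.shape[axis+1:]`. The actual checks in the converter may be stricter or ordered differently.

```python
def take_axis_or_none(operand_shape, indices_shape, offset_dims,
                      collapsed_slice_dims, start_index_map,
                      index_vector_dim, slice_sizes):
    """Return the take axis if the gather is take-equivalent, else None."""
    rank = len(operand_shape)
    # The index vector must be the trailing dimension and hold one coordinate.
    if index_vector_dim != len(indices_shape) - 1 or indices_shape[-1] != 1:
        return None
    # Single-axis gather only.
    if len(start_index_map) != 1:
        return None
    axis = start_index_map[0]
    if not 0 <= axis < rank:
        return None
    # The gathered axis is collapsed away; every other axis is taken whole.
    if list(collapsed_slice_dims) != [axis]:
        return None
    expected_sizes = [1 if d == axis else operand_shape[d] for d in range(rank)]
    if list(slice_sizes) != expected_sizes:
        return None
    # Offset dims must sit where a take along `axis` leaves the operand dims.
    batch_rank = len(indices_shape) - 1
    expected_offset = list(range(axis)) + list(
        range(axis + batch_rank, batch_rank + rank - 1)
    )
    if list(offset_dims) != expected_offset:
        return None
    return axis


axis0 = take_axis_or_none([5, 3], [4, 1], [1], [0], [0], 1, [1, 3])
axis1 = take_axis_or_none([5, 3], [4, 1], [0], [1], [1], 1, [5, 1])
multi = take_axis_or_none([5, 3], [4, 2], [], [0, 1], [0, 1], 1, [1, 1])
```

A `None` result corresponds to the patterns the commit rejects with OpNotImplemented.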
Contributor
Code Review
This pull request implements support for a wide range of StableHLO operators in the TFLite frontend for TVM Relax, covering unary, binary, ternary, and more complex operations like gather and dynamic slice. The changes include the core conversion logic and comprehensive unit tests. Feedback points out a bug in a test helper function regarding FlatBuffers vector generation and suggests removing a redundant reshape call in the dynamic slice implementation.
tlopex approved these changes on May 11, 2026
tlopex pushed a commit that referenced this pull request on May 21, 2026
…i-subgraph models (#19587)

## Summary

This PR adds Relax TFLite frontend support for 10 additional StableHLO builtin operators from #19519 item I, building on the 29 ops merged in PR #19536.

The first 5 ops are direct single-subgraph converters: `CBRT`, `REMAINDER`, `DYNAMIC_UPDATE_SLICE`, `DOT_GENERAL`, and `CONVOLUTION`. The remaining 5 ops are region/subgraph-based: `REDUCE`, `REDUCE_WINDOW`, `SORT`, `SCATTER`, and `COMPOSITE`. To support these, the TFLite frontend is extended to accept multi-subgraph models while still converting only `Subgraphs(0)` into the Relax main function. Region subgraphs are consumed by their parent op converters as needed.

Relates to #19519.

## Changes

1. **Single-subgraph ops**
   - `CBRT` - sign-preserving composite expression: `where(x < 0, -power(-x, 1/3), power(x, 1/3))`. Float dtype only.
   - `REMAINDER` - truncating remainder via `x - y * trunc(x / y)`, matching StableHLO semantics (sign follows dividend). Float dtype only.
   - `DYNAMIC_UPDATE_SLICE` - static start indices + static shapes only, lowered to `R.scatter_nd` with a coordinate grid generated via `np.indices`. Runtime starts and out-of-bounds ranges raise `OpNotImplemented`.
   - `DOT_GENERAL` - canonical 2D matmul subset: no batching dims, `lhs_contracting=[1]`, `rhs_contracting=[0]`, lowered to `R.matmul`.
   - `CONVOLUTION` - canonical 2D NHWC/HWIO subset with `BatchGroupCount=1`, `FeatureGroupCount=1`, lowered to `R.nn.conv2d`. Non-canonical dimension numbers and grouped/depthwise conv raise `OpNotImplemented`.
2. **Multi-subgraph infrastructure**
   - Lift `from_tflite()` assertion from `model.SubgraphsLength() == 1` to `model.SubgraphsLength() >= 1`. Only `Subgraphs(0)` is converted into the Relax main function.
   - Limit `_input_type()` to `Subgraphs(0)` inputs, preventing region parameters from leaking as Relax main function parameters.
   - Add `_get_stablehlo_simple_body_op` helper for validating and extracting the single operator from a region body subgraph.
   - Extend test helper `_finish_tflite_model` with `extra_subgraphs` parameter for constructing multi-subgraph TFLite flatbuffers.
3. **Region/subgraph ops**
   - `REDUCE` - single-op reducer body subgraph. Supports `ADD` → `R.sum`, `MAXIMUM` → `R.max`, `MINIMUM` → `R.min`, `MULTIPLY` → `R.prod`. Init value must match the reducer identity element.
   - `SORT` - single-op comparator body subgraph. `LT` → ascending sort, `GT` → descending sort via `R.sort`. `IsStable` is not mapped.
   - `REDUCE_WINDOW` - NHWC 4D 2D-pooling subset with `MAXIMUM` reducer and identity init, lowered to `R.nn.max_pool2d`. BaseDilations must be all 1.
   - `SCATTER` - single-op update computation body subgraph. Supports `ADD`/`MAXIMUM`/`MINIMUM`/`MULTIPLY` → `R.scatter_nd` with the corresponding reduction mode. Only canonical point-update semantics (no window dims).
   - `COMPOSITE` - inlines a decomposition subgraph through a recursive `OperatorConverter` with an isolated `ExprTable`, so decomposition tensor bindings cannot overwrite main graph bindings. Only supports composites without `CompositeAttributes`.
4. **Not included**
   - `STABLEHLO_RESHAPE`, `STABLEHLO_TRANSPOSE`, and `STABLEHLO_SLICE` are left to another contributor.
   - `WHILE`, `CUSTOM_CALL`, and `RNG_BIT_GENERATOR` are deferred to follow-up PRs.
5. **Bug fix**
   - Fixed `DYNAMIC_UPDATE_SLICE` scatter_nd indices layout: `np.indices` returns `(rank, *update_shape)` but `scatter_nd` expects `(*update_shape, rank)`. Added `np.moveaxis` to transpose the coordinate axis from first to last position.

## Testing

All tests use manually-built minimal TFLite flatbuffers with `tvm.ir.assert_structural_equal`. Region/subgraph tests construct the smallest valid body/comparator/update subgraphs. BuiltinOptions2 ops construct their options via the FlatBuffers schema API.
```bash
python -m pytest tests/python/relax/test_frontend_tflite.py -k stablehlo -q
```

## Result

- 39 StableHLO operators registered in the Relax TFLite frontend (29 from PR #19536 + 10 from this PR).
- 77 StableHLO test cases covering all registered ops, including structural-equal tests and unsupported/error-path checks:
  - `REMAINDER` truncating semantics
  - `DYNAMIC_UPDATE_SLICE` with dynamic starts and out-of-bounds starts
  - `DOT_GENERAL` with non-canonical contracting dimensions
  - `CONVOLUTION` with non-canonical dimension numbers and `FeatureGroupCount > 1`
  - `REDUCE` with unsupported reducer and non-identity init value
  - `SORT` with unsupported comparator and stable sort
  - `REDUCE_WINDOW` with unsupported reducer and base dilation
  - `SCATTER` with unsupported reducer and update window dims
  - `COMPOSITE` with composite attributes and scope isolation
  - Multi-subgraph model with unused subgraphs
- All 77 StableHLO tests pass.

## References

- Issue #19519 item I: StableHLO operators in TFLite
- PR #19536: First batch of 29 StableHLO ops
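The `CBRT` and `REMAINDER` expressions quoted in this commit message can be checked against scalar references. The sketch below uses only the Python standard library; `cbrt_reference` and `remainder_reference` are hypothetical names for illustration and operate on floats rather than Relax tensors. It shows why the sign handling is needed (a fractional power of a negative base is not real-valued) and how the truncating remainder differs from Python's floor-based `%`.

```python
import math


def cbrt_reference(x):
    """where(x < 0, -power(-x, 1/3), power(x, 1/3)) for a single float."""
    if x < 0:
        # Take the root of the magnitude, then restore the sign.
        return -math.pow(-x, 1.0 / 3.0)
    return math.pow(x, 1.0 / 3.0)


def remainder_reference(x, y):
    """x - y * trunc(x / y): the result takes the sign of the dividend."""
    return x - y * math.trunc(x / y)


negative_root = cbrt_reference(-8.0)
truncating = remainder_reference(-7.0, 3.0)
floor_based = -7.0 % 3.0  # Python's % follows the divisor's sign instead
```

For `-7.0` and `3.0` the truncating form gives `-1.0`, while Python's `%` gives `2.0`, which is the distinction the `REMAINDER` error-path test targets.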
This was referenced May 25, 2026
tlopex pushed a commit that referenced this pull request on May 27, 2026
…19601)

## Summary

This PR adds Relax TFLite frontend support for `UNIDIRECTIONAL_SEQUENCE_RNN` (BuiltinOperator 35), claimed in [#19519](#19519) Group A.

The op executes a simple RNN cell over a time sequence. The converter unrolls the time steps at graph-construction time using Relax primitives.

Cell equation:

```
h_t = fused_activation(x_t @ W.T + h_{t-1} @ Wr.T + b)
```

## Changes

- **Handler**: `convert_unidirectional_sequence_rnn` registered in `convert_map` (alphabetical, U-region after `UNPACK`)
- **Inputs** (5): `input [batch, time, input_size]`, `input_weights [num_units, input_size]`, `recurrent_weights [num_units, num_units]`, `bias [num_units]`, `hidden_state [batch, num_units]` (variable, zero-initialised)
- **Output**: `[batch, time, num_units]` (always batch-major)
- **time_major=True**: input is transposed to batch-major before unrolling
- **Activations**: NONE, RELU, RELU6, TANH, SIGMOID (via `convert_fused_activation_function`)
- **Quantized**: raises `OpNotImplemented` (not yet supported)

## Testing

Modern TF/Keras (2.x, Keras 3) no longer emits `UNIDIRECTIONAL_SEQUENCE_RNN`; `SimpleRNN` with `unroll=False` lowers to `WHILE`+TensorList ops, and `unroll=True` expands to elementwise ops. Tests therefore follow the same flatbuffer-construction pattern used by the StableHLO op PRs (#19536, #19587).

Three tests added to `tests/python/relax/test_frontend_tflite.py`:
- `test_unidirectional_sequence_rnn_none_activation` - `tvm.ir.assert_structural_equal` with identity weights / zero bias, NONE activation, time=1
- `test_unidirectional_sequence_rnn_relu_activation` - shape check, random weights, RELU activation, time=3
- `test_unidirectional_sequence_rnn_time_major` - shape check, `time_major=True` input layout

```bash
python -m pytest tests/python/relax/test_frontend_tflite.py -k unidirectional_sequence_rnn -v
```

All 3 tests pass. pre-commit (ASF header, ruff check, ruff format) all pass.
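The cell equation and the unrolling over time steps can be mirrored by a small pure-Python reference. This is a sketch for reasoning about expected values, not the converter: the real handler emits Relax ops at graph-construction time, whereas `unrolled_rnn`, `matvec_transposed`, and the `ACTIVATIONS` table here are hypothetical names working on nested lists, and only three of the listed activations are included.

```python
import math


def matvec_transposed(x, weights):
    """Compute x @ weights.T for a vector x and a [units, features] matrix."""
    return [sum(xi * wi for xi, wi in zip(x, row)) for row in weights]


ACTIVATIONS = {
    "NONE": lambda v: v,
    "RELU": lambda v: max(v, 0.0),
    "TANH": math.tanh,
}


def unrolled_rnn(inputs, w, wr, bias, h0, activation="NONE"):
    """Batch-major RNN: inputs [batch][time][input_size] -> [batch][time][units]."""
    act = ACTIVATIONS[activation]
    outputs = []
    for sequence, h in zip(inputs, h0):
        steps = []
        for x_t in sequence:
            # h_t = act(x_t @ W.T + h_{t-1} @ Wr.T + b)
            xw = matvec_transposed(x_t, w)
            hr = matvec_transposed(h, wr)
            h = [act(a + b + c) for a, b, c in zip(xw, hr, bias)]
            steps.append(h)
        outputs.append(steps)
    return outputs


identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]

# Identity input weights, zero recurrent weights and bias, one time step.
single_step = unrolled_rnn([[[1.0, -2.0]]], identity, zeros, [0.0, 0.0], [[0.0, 0.0]])

# Identity recurrent weights make the hidden state accumulate the inputs.
two_steps = unrolled_rnn(
    [[[1.0, 0.0], [0.0, 1.0]]], identity, identity, [0.0, 0.0], [[0.0, 0.0]]
)
```

With identity input weights, zero bias, and NONE activation at time=1, the output equals the input, which is the configuration the first test in the list above uses.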
## References

- Issue [#19519](#19519) Group A: Sequence / recurrent model operators

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
This PR adds initial Relax TFLite frontend support for 29 StableHLO builtin
operators from #19519 item I.
The covered subset includes pure elementwise ops, BuiltinOptions2 /
metadata-based ops, simple shape-manipulation ops, and a take-equivalent subset
of STABLEHLO_GATHER.
StableHLO builtins carry no TFLite-specific quantization or fused-activation
metadata, so the implementation uses dedicated converter helpers that bypass the
existing TFLite elemwise/QNN code paths.
Relates to #19519.
Changes
Zero-attribute elementwise helpers
- _convert_stablehlo_unary, _convert_stablehlo_binary, and _convert_stablehlo_ternary for pure elementwise mapping.
- Unary (ABS, NEGATE, COSINE, EXPONENTIAL, FLOOR, LOG, LOGISTIC, RSQRT, TANH), binary (ADD, SUBTRACT, MULTIPLY, DIVIDE, MAXIMUM, MINIMUM, POWER), ternary (SELECT → R.where), and dtype-dispatched bitwise/logical ops (AND/OR → logical ops for bool or bitwise ops for integer, SHIFT_LEFT → R.left_shift for integer).
BuiltinOptions2 infrastructure
- _get_stablehlo_options helper for parsing BuiltinOptions2 flatbuffers with enum validation via getattr(BuiltinOptions2, options_cls.__name__).
- CONVERT → R.astype, CLAMP → R.minimum(R.maximum(...)), CONCATENATE → R.concat, BROADCAST_IN_DIM → R.reshape + R.broadcast_to, IOTA → R.arange + R.broadcast_to, and COMPARE → 6 comparison directions (TOTALORDER raises OpNotImplemented).
Shape-manipulation ops
- PAD → R.nn.pad in constant mode. The initial PAD path supports non-negative edge padding with zero interior padding and a constant scalar padding value. Interior padding, negative padding, and dynamic padding values raise OpNotImplemented.
- DYNAMIC_SLICE → R.dynamic_strided_slice. The initial path supports constant, in-bound start indices only. Runtime start indices and out-of-bounds StableHLO clamping semantics are deferred.
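The PAD restrictions above amount to a validation step followed by a per-axis pairing of low and high edge padding. A minimal sketch, assuming the three attribute vectors have already been read from the options table: `pad_width_from_options` and `padded_shape` are hypothetical names, `NotImplementedError` stands in for OpNotImplemented, and how the resulting pairs are passed to R.nn.pad is not shown here.

```python
def pad_width_from_options(edge_low, edge_high, interior):
    """Validate PAD attributes and return per-axis (before, after) pairs."""
    if not (len(edge_low) == len(edge_high) == len(interior)):
        raise ValueError("padding vectors must have one entry per axis")
    if any(amount != 0 for amount in interior):
        # Interior padding inserts elements between existing ones.
        raise NotImplementedError("interior padding is not supported")
    if any(amount < 0 for amount in list(edge_low) + list(edge_high)):
        # Negative edge padding would crop the operand instead of growing it.
        raise NotImplementedError("negative edge padding is not supported")
    return [(low, high) for low, high in zip(edge_low, edge_high)]


def padded_shape(shape, pad_width):
    """Output shape of a constant-mode pad with the given per-axis pairs."""
    return [dim + low + high for dim, (low, high) in zip(shape, pad_width)]


pairs = pad_width_from_options([1, 0], [2, 0], [0, 0])
result_shape = padded_shape([3, 4], pairs)
```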
Indexing op
- GATHER → R.take for the take-equivalent subset only.
- Parses the StablehloGatherOptions attributes needed to validate this subset: offset_dims, collapsed_slice_dims, start_index_map, index_vector_dim, and slice_sizes.
- Validates the output shape against the expected R.take layout. Multi-dimensional and non-take-equivalent gather patterns raise OpNotImplemented.
Not included
- STABLEHLO_RESHAPE, STABLEHLO_TRANSPOSE, and STABLEHLO_SLICE are left to another contributor who expressed interest in those ops.
- Deferred: CBRT, REMAINDER, SCATTER, CONVOLUTION, DOT_GENERAL, REDUCE, REDUCE_WINDOW, DYNAMIC_UPDATE_SLICE, COMPOSITE, CUSTOM_CALL, RNG_BIT_GENERATOR, SORT, and WHILE. Non-take-equivalent STABLEHLO_GATHER patterns are also deferred to follow-up work.
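The BROADCAST_IN_DIM mapping listed under BuiltinOptions2 infrastructure pairs a reshape with a broadcast. One way to read that pairing is that the reshape places each operand dimension at its target output axis and fills the rest with 1, so an ordinary broadcast can then expand to the output shape. The sketch below computes that intermediate shape; `broadcast_in_dim_reshape` is a hypothetical name, and it assumes strictly increasing broadcast dimensions (other orders would need a transpose, which is not handled here).

```python
def broadcast_in_dim_reshape(operand_shape, broadcast_dimensions, output_shape):
    """Return the intermediate shape to reshape to before broadcasting."""
    if len(operand_shape) != len(broadcast_dimensions):
        raise ValueError("need one broadcast dimension per operand axis")
    if list(broadcast_dimensions) != sorted(set(broadcast_dimensions)):
        # Assumption of this sketch: only strictly increasing dimensions.
        raise NotImplementedError("non-increasing broadcast dimensions")
    reshaped = [1] * len(output_shape)
    for size, out_axis in zip(operand_shape, broadcast_dimensions):
        if size != 1 and size != output_shape[out_axis]:
            raise ValueError("operand dimension does not match output dimension")
        reshaped[out_axis] = size
    return reshaped


# A length-3 vector mapped onto axis 1 of a [2, 3, 4] output.
vector_case = broadcast_in_dim_reshape([3], [1], [2, 3, 4])
# A [2, 1] operand mapped onto axes 0 and 2.
matrix_case = broadcast_in_dim_reshape([2, 1], [0, 2], [2, 3, 4])
```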
Testing
All tests use manually-built minimal TFLite flatbuffers with
tvm.ir.assert_structural_equal. BuiltinOptions2 ops construct their options
via the FlatBuffers schema API, modeled after the existing DILATE test pattern.
Result
- 29 StableHLO operators registered in the Relax TFLite frontend.
- 44 StableHLO test cases covering all registered ops, including structural-equal tests and unsupported/error-path checks:
  - COMPARE with TOTALORDER
  - PAD with interior padding, negative padding, and dynamic padding values
  - DYNAMIC_SLICE with runtime starts and out-of-bounds starts
  - GATHER
- All StableHLO TFLite frontend tests pass locally.
References
in the TFLite frontend tests