
[Relax][Frontend][TFLite] Add initial StableHLO builtin operator support #19536

Merged
tlopex merged 6 commits into apache:main from Aharrypotter:stablehlo_tflite_ops
May 11, 2026

Conversation

@Aharrypotter


Summary

This PR adds initial Relax TFLite frontend support for 29 StableHLO builtin
operators from #19519 item I.

The covered subset includes pure elementwise ops, BuiltinOptions2 /
metadata-based ops, simple shape-manipulation ops, and a take-equivalent subset
of STABLEHLO_GATHER.

StableHLO builtins carry no TFLite-specific quantization or fused-activation
metadata, so the implementation uses dedicated converter helpers that bypass the
existing TFLite elemwise/QNN code paths.

Relates to #19519.

Changes

  1. Zero-attribute elementwise helpers

    • Add _convert_stablehlo_unary, _convert_stablehlo_binary, and
      _convert_stablehlo_ternary for pure elementwise mapping.
    • Register 20 ops: unary (ABS, NEGATE, COSINE, EXPONENTIAL, FLOOR,
      LOG, LOGISTIC, RSQRT, TANH), binary (ADD, SUBTRACT, MULTIPLY,
      DIVIDE, MAXIMUM, MINIMUM, POWER), ternary (SELECT → R.where),
      and dtype-dispatched bitwise/logical ops (AND / OR → logical ops for
      bool or bitwise ops for integer, SHIFT_LEFT → R.left_shift for integer).
  2. BuiltinOptions2 infrastructure

    • Add _get_stablehlo_options helper for parsing BuiltinOptions2 flatbuffers
      with enum validation via getattr(BuiltinOptions2, options_cls.__name__).
    • Register 6 ops: CONVERT → R.astype, CLAMP →
      R.minimum(R.maximum(...)), CONCATENATE → R.concat,
      BROADCAST_IN_DIM → R.reshape + R.broadcast_to, IOTA →
      R.arange + R.broadcast_to, and COMPARE → 6 comparison directions
      (TOTALORDER raises OpNotImplemented).
  3. Shape-manipulation ops

    • PAD → R.nn.pad in constant mode. The initial PAD path supports
      non-negative edge padding with zero interior padding and a constant scalar
      padding value. Interior padding, negative padding, and dynamic padding
      values raise OpNotImplemented.
    • DYNAMIC_SLICE → R.dynamic_strided_slice. The initial path supports
      constant, in-bound start indices only. Runtime start indices and
      out-of-bounds StableHLO clamping semantics are deferred.
  4. Indexing op

    • GATHER → R.take for the take-equivalent subset only.
    • Parses the relevant StablehloGatherOptions attributes needed to validate
      this subset: offset_dims, collapsed_slice_dims, start_index_map,
      index_vector_dim, and slice_sizes.
    • Validates the gather axis, collapsed dims, offset dims, slice sizes, and
      output shape against the expected R.take layout. Multi-dimensional and
      non-take-equivalent gather patterns raise OpNotImplemented.
  5. Not included

    • STABLEHLO_RESHAPE, STABLEHLO_TRANSPOSE, and STABLEHLO_SLICE are left
      to another contributor who expressed interest in those ops.
    • The remaining #19519 StableHLO items are deferred to follow-up PRs:
      CBRT, REMAINDER, SCATTER, CONVOLUTION, DOT_GENERAL, REDUCE,
      REDUCE_WINDOW, DYNAMIC_UPDATE_SLICE, COMPOSITE, CUSTOM_CALL,
      RNG_BIT_GENERATOR, SORT, and WHILE.
    • More general or multi-dimensional STABLEHLO_GATHER patterns are also
      deferred to follow-up work.
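
The dtype dispatch described for AND / OR and SHIFT_LEFT can be illustrated with a small NumPy sketch. This is not the converter code from the PR: `stablehlo_bitwise_or_logical` and the two lookup tables are hypothetical names, and NumPy functions stand in for the corresponding Relax ops.

```python
import numpy as np

# NumPy stand-ins for the Relax ops. The tables are an illustrative assumption,
# not the registration mechanism used by the frontend.
_LOGICAL = {"AND": np.logical_and, "OR": np.logical_or}
_BITWISE = {"AND": np.bitwise_and, "OR": np.bitwise_or, "SHIFT_LEFT": np.left_shift}


def stablehlo_bitwise_or_logical(op_name, lhs, rhs):
    """Pick a logical op for bool inputs and a bitwise op for integer inputs."""
    lhs, rhs = np.asarray(lhs), np.asarray(rhs)
    if lhs.dtype != rhs.dtype:
        raise TypeError("both operands are expected to share one dtype")
    if lhs.dtype == np.bool_ and op_name in _LOGICAL:
        return _LOGICAL[op_name](lhs, rhs)
    if np.issubdtype(lhs.dtype, np.integer) and op_name in _BITWISE:
        return _BITWISE[op_name](lhs, rhs)
    # Anything else (for example float inputs) is outside this sketch.
    raise NotImplementedError(f"{op_name} is not handled for dtype {lhs.dtype}")
```

Keeping the dispatch on the input dtype means a bool AND never reaches the bitwise path, and a float input fails loudly instead of being silently cast.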

Testing

All tests use manually-built minimal TFLite flatbuffers with
tvm.ir.assert_structural_equal. BuiltinOptions2 ops construct their options
via the FlatBuffers schema API, modeled after the existing DILATE test pattern.

python -m pytest tests/python/relax/test_frontend_tflite.py -k stablehlo -q

Result

  • 29 StableHLO operators registered in the Relax TFLite frontend.

  • 44 StableHLO test cases covering all registered ops, including
    structural-equal tests and unsupported/error-path checks:

    • COMPARE with TOTALORDER
    • PAD with interior padding, negative padding, and dynamic padding values
    • DYNAMIC_SLICE with runtime starts and out-of-bounds starts
    • non-take-equivalent or multi-dimensional GATHER
  • All StableHLO TFLite frontend tests pass locally.

References

…upport

Add frontend mapping for 8 basic StableHLO TFLite builtin
operators as pure unary/binary elementwise ops:

- STABLEHLO_ABS, STABLEHLO_NEGATE (unary)
- STABLEHLO_ADD, STABLEHLO_SUBTRACT, STABLEHLO_MULTIPLY, STABLEHLO_DIVIDE,
  STABLEHLO_MAXIMUM, STABLEHLO_MINIMUM (binary)

Implementation uses dedicated _convert_stablehlo_unary / _convert_stablehlo_binary
helpers that intentionally bypass TFLite fused-activation and QNN code paths,
since StableHLO ops carry no TFLite-specific quantization or fused-activation
metadata in their flatbuffer representation.

Test coverage: 8 structural-equal tests with tvm.ir.assert_structural_equal.
…h ternary SELECT

Extend the StableHLO TFLite frontend with all remaining pure elementwise
operators that require no attribute parsing:

- Unary: STABLEHLO_COSINE (cos), STABLEHLO_EXPONENTIAL (exp),
  STABLEHLO_FLOOR (floor), STABLEHLO_LOG (log), STABLEHLO_LOGISTIC (sigmoid),
  STABLEHLO_RSQRT (rsqrt), STABLEHLO_TANH (tanh)
- Binary: STABLEHLO_AND (logical_and), STABLEHLO_OR (logical_or),
  STABLEHLO_POWER (power), STABLEHLO_SHIFT_LEFT (left_shift)
- Ternary: STABLEHLO_SELECT (where) with dedicated
  _convert_stablehlo_ternary helper

The existing _convert_stablehlo_unary and _convert_stablehlo_binary helpers
are reused; only STABLEHLO_SELECT needs the new ternary converter since
R.where requires a 3-input signature with bool condition dtype.

Test coverage: 20 structural-equal tests (12 new, 8 from previous commit).
The SELECT test uses inline flatbuffer construction to set the condition
input dtype to BOOL, matching the R.where requirement.
…nOptions2 support

Introduce the first batch of StableHLO TFLite builtin operators that
require BuiltinOptions2 attribute parsing:

- STABLEHLO_CONVERT → R.astype (reads output dtype from tensor metadata)
- STABLEHLO_CLAMP → R.minimum(R.maximum(x, min), max) (arg reordering)
- STABLEHLO_CONCATENATE → R.concat with StablehloConcatenateOptions
- STABLEHLO_BROADCAST_IN_DIM → R.broadcast_to with broadcast dimensions
- STABLEHLO_IOTA → R.arange + R.reshape + R.broadcast_to
- STABLEHLO_COMPARE → R.equal/greater/less/... with 6 comparison directions

Add _get_stablehlo_options helper for parsing BuiltinOptions2 flatbuffers.
R.clip was considered for CLAMP but rejected because it only accepts
scalar PrimValue min/max, not tensor inputs.

Test coverage: 32 structural-equal tests (20 previous + 12 new) passed.
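
As a rough illustration of two lowerings from this commit, the NumPy sketch below mirrors the CLAMP decomposition with tensor bounds and a reshape-then-broadcast form of BROADCAST_IN_DIM. The function names are hypothetical, NumPy stands in for the Relax ops, and the restriction to strictly increasing broadcast dimensions is an assumption of the sketch, not something stated by the commit.

```python
import numpy as np


def clamp(min_val, operand, max_val):
    """StableHLO orders the inputs (min, operand, max); the bounds may be tensors."""
    return np.minimum(np.maximum(operand, min_val), max_val)


def broadcast_in_dim(operand, out_shape, broadcast_dimensions):
    """Map operand axis i to output axis broadcast_dimensions[i], then broadcast."""
    dims = list(broadcast_dimensions)
    if dims != sorted(set(dims)):
        # Assumption: a plain reshape only works when no axis permutation is needed.
        raise NotImplementedError("broadcast dimensions must be strictly increasing")
    expanded = [1] * len(out_shape)
    for operand_axis, out_axis in enumerate(dims):
        expanded[out_axis] = operand.shape[operand_axis]
    return np.broadcast_to(operand.reshape(expanded), out_shape)
```

For example, `clamp(np.array([0, 1]), np.array([5, 0]), np.array([3, 2]))` applies a different bound to each element, which is the case a scalar-only clip cannot express.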
Add frontend mapping for two StableHLO TFLite builtin operators
that manipulate tensor shapes:

- STABLEHLO_PAD → R.nn.pad with constant mode. Parses EdgePaddingLow,
  EdgePaddingHigh, and InteriorPadding from StablehloPadOptions.
  Raises OpNotImplemented when interior (dilation) padding is non-zero.
- STABLEHLO_DYNAMIC_SLICE → R.dynamic_strided_slice. Reads SliceSizes
  from StablehloDynamicSliceOptions and start indices from scalar
  tensor inputs. Begin/end/strides are constructed as int64 1D tensors.

Both ops extend the BuiltinOptions2 parsing infrastructure introduced
in the previous commit, adding vector-attribute (PAD) and dynamic-input
(DYNAMIC_SLICE) patterns.

Test coverage: 33 structural-equal tests passed (31 previous + 2 new).
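
The supported PAD and DYNAMIC_SLICE subsets can be sketched in NumPy as follows. This is a hedged illustration only: `pad_constant` and `static_dynamic_slice` are hypothetical helpers, `NotImplementedError` stands in for `OpNotImplemented`, and the bounds checks follow the restrictions listed in the PR description (non-negative edge padding, zero interior padding, in-bound static starts).

```python
import numpy as np


def pad_constant(operand, padding_value, edge_low, edge_high, interior):
    """Constant-mode padding for the subset without interior or negative padding."""
    if any(i != 0 for i in interior):
        raise NotImplementedError("interior padding is not handled")
    if any(p < 0 for p in list(edge_low) + list(edge_high)):
        raise NotImplementedError("negative padding is not handled")
    pad_width = list(zip(edge_low, edge_high))  # one (low, high) pair per axis
    return np.pad(operand, pad_width, mode="constant", constant_values=padding_value)


def static_dynamic_slice(operand, starts, slice_sizes):
    """Slice with known starts; begin/end/strides are built as int64 1D arrays."""
    begin = np.asarray(starts, dtype=np.int64)
    end = begin + np.asarray(slice_sizes, dtype=np.int64)
    strides = np.ones_like(begin)
    if np.any(begin < 0) or np.any(end > np.asarray(operand.shape)):
        # StableHLO would clamp the start indices here; this sketch does not.
        raise NotImplementedError("out-of-bounds starts are not handled")
    index = tuple(slice(int(b), int(e), int(s)) for b, e, s in zip(begin, end, strides))
    return operand[index]
```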
…nt subset)

Add frontend mapping for STABLEHLO_GATHER with a conservative
take-equivalent implementation:

- Parses 6 attributes from StablehloGatherOptions (OffsetDims,
  CollapsedSliceDims, StartIndexMap, IndexVectorDim, SliceSizes,
  IndicesAreSorted)
- Only supports single-axis gather with index vector dim == rank(indices)-1
  and slice_sizes matching R.take semantics
- Validates offset_dims layout, output shape, and collapsed dims against
  expected R.take behavior; raises OpNotImplemented otherwise
- Reshapes indices from [N, 1] to [N] before calling R.take

Tests: 3 new (2 take-equivalent parametrized for axis 0/1,
1 error path for multi-dimensional start_index_map).
Total: 38 stablehlo tests passed.
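
One way to picture the take-equivalent subset is the NumPy sketch below. It is an assumption-laden illustration, not the converter from this commit: `gather_as_take` is a hypothetical name, `np.take` stands in for `R.take`, `NotImplementedError` stands in for `OpNotImplemented`, and the checks are a plausible reading of the validation steps listed above.

```python
import numpy as np


def gather_as_take(operand, start_indices, *, offset_dims, collapsed_slice_dims,
                   start_index_map, index_vector_dim, slice_sizes):
    """Lower a single-axis gather to take, rejecting everything else."""
    if len(start_index_map) != 1:
        raise NotImplementedError("multi-dimensional gather is not handled")
    axis = start_index_map[0]
    if list(collapsed_slice_dims) != [axis]:
        raise NotImplementedError("only the gathered axis may be collapsed")
    if index_vector_dim != start_indices.ndim - 1 or start_indices.shape[-1] != 1:
        raise NotImplementedError("index vector must be a trailing axis of size 1")
    # A take reads full slices on every axis except the gathered one.
    expected_sizes = [1 if d == axis else n for d, n in enumerate(operand.shape)]
    if list(slice_sizes) != expected_sizes:
        raise NotImplementedError("slice_sizes do not match take semantics")
    # take puts the index dims at `axis`; all other output dims are offset dims.
    batch_rank = start_indices.ndim - 1
    out_rank = operand.ndim - 1 + batch_rank
    expected_offset = [d for d in range(out_rank) if not axis <= d < axis + batch_rank]
    if list(offset_dims) != expected_offset:
        raise NotImplementedError("offset_dims do not match the take layout")
    indices = start_indices.reshape(start_indices.shape[:-1])  # [N, 1] -> [N]
    return np.take(operand, indices, axis=axis)
```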

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request implements support for a wide range of StableHLO operators in the TFLite frontend for TVM Relax, covering unary, binary, ternary, and more complex operations like gather and dynamic slice. The changes include the core conversion logic and comprehensive unit tests. Feedback points out a bug in a test helper function regarding FlatBuffers vector generation and suggests removing a redundant reshape call in the dynamic slice implementation.

Comment thread tests/python/relax/test_frontend_tflite.py Outdated
Comment thread python/tvm/relax/frontend/tflite/tflite_frontend.py Outdated
@Aharrypotter Aharrypotter marked this pull request as ready for review May 11, 2026 12:17
@tlopex tlopex merged commit c0406a5 into apache:main May 11, 2026
9 of 10 checks passed
tlopex pushed a commit that referenced this pull request May 21, 2026
…i-subgraph models (#19587)

## Summary

This PR adds Relax TFLite frontend support for 10 additional StableHLO builtin
operators from #19519 item I, building on the 29 ops merged in PR #19536.

The first 5 ops are direct single-subgraph converters: `CBRT`, `REMAINDER`,
`DYNAMIC_UPDATE_SLICE`, `DOT_GENERAL`, and `CONVOLUTION`. The remaining 5 ops
are region/subgraph-based: `REDUCE`, `REDUCE_WINDOW`, `SORT`, `SCATTER`, and
`COMPOSITE`. To support these, the TFLite frontend is extended to accept
multi-subgraph models while still converting only `Subgraphs(0)` into the
Relax main function. Region subgraphs are consumed by their parent op
converters as needed.

Relates to #19519.

## Changes

1. **Single-subgraph ops**
   - `CBRT` — sign-preserving composite expression:
     `where(x < 0, -power(-x, 1/3), power(x, 1/3))`. Float dtype only.
   - `REMAINDER` — truncating remainder via `x - y * trunc(x / y)`, matching
     StableHLO semantics (sign follows dividend). Float dtype only.
   - `DYNAMIC_UPDATE_SLICE` — static start indices + static shapes only,
     lowered to `R.scatter_nd` with a coordinate grid generated via
     `np.indices`. Runtime starts and out-of-bounds ranges raise
     `OpNotImplemented`.
   - `DOT_GENERAL` — canonical 2D matmul subset: no batching dims,
     `lhs_contracting=[1]`, `rhs_contracting=[0]`, lowered to `R.matmul`.
   - `CONVOLUTION` — canonical 2D NHWC/HWIO subset with `BatchGroupCount=1`,
     `FeatureGroupCount=1`, lowered to `R.nn.conv2d`. Non-canonical dimension
     numbers and grouped/depthwise conv raise `OpNotImplemented`.

2. **Multi-subgraph infrastructure**
   - Lift `from_tflite()` assertion from `model.SubgraphsLength() == 1` to
     `model.SubgraphsLength() >= 1`. Only `Subgraphs(0)` is converted into
     the Relax main function.
   - Limit `_input_type()` to `Subgraphs(0)` inputs, preventing region
     parameters from leaking as Relax main function parameters.
   - Add `_get_stablehlo_simple_body_op` helper for validating and
     extracting the single operator from a region body subgraph.
   - Extend test helper `_finish_tflite_model` with `extra_subgraphs`
     parameter for constructing multi-subgraph TFLite flatbuffers.

3. **Region/subgraph ops**
   - `REDUCE` — single-op reducer body subgraph. Supports `ADD` → `R.sum`,
     `MAXIMUM` → `R.max`, `MINIMUM` → `R.min`, `MULTIPLY` → `R.prod`.
     Init value must match the reducer identity element.
   - `SORT` — single-op comparator body subgraph. `LT` → ascending sort,
     `GT` → descending sort via `R.sort`. `IsStable` is not mapped.
   - `REDUCE_WINDOW` — NHWC 4D 2D-pooling subset with `MAXIMUM` reducer and
     identity init, lowered to `R.nn.max_pool2d`. `BaseDilations` must be
     all 1.
   - `SCATTER` — single-op update computation body subgraph. Supports
     `ADD`/`MAXIMUM`/`MINIMUM`/`MULTIPLY` → `R.scatter_nd` with the
     corresponding reduction mode. Only canonical point-update semantics
     (no window dims).
   - `COMPOSITE` — inlines a decomposition subgraph through a recursive
     `OperatorConverter` with an isolated `ExprTable`, so decomposition
     tensor bindings cannot overwrite main graph bindings. Only supports
     composites
     without `CompositeAttributes`.

4. **Not included**
   - `STABLEHLO_RESHAPE`, `STABLEHLO_TRANSPOSE`, and `STABLEHLO_SLICE` are
     left to another contributor.
   - `WHILE`, `CUSTOM_CALL`, and `RNG_BIT_GENERATOR` are deferred to
     follow-up PRs.

5. **Bug fix**
   - Fixed `DYNAMIC_UPDATE_SLICE` scatter_nd indices layout: `np.indices`
     returns `(rank, *update_shape)` but `scatter_nd` expects
     `(*update_shape, rank)`. Added `np.moveaxis` to transpose the coordinate
     axis from first to last position.
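
The index-layout fix for `DYNAMIC_UPDATE_SLICE` can be reproduced with plain NumPy. The sketch below is illustrative rather than the converter code: both function names are hypothetical, and the final assignment emulates a point-wise `scatter_nd` with NumPy advanced indexing.

```python
import numpy as np


def update_slice_indices(start, update_shape):
    """Build scatter indices with the coordinate axis last."""
    grid = np.indices(update_shape)   # shape: (rank, *update_shape)
    grid = np.moveaxis(grid, 0, -1)   # shape: (*update_shape, rank)
    return grid + np.asarray(start, dtype=grid.dtype)


def dynamic_update_slice(operand, update, start):
    """Write `update` into a copy of `operand` at a static, in-bounds start."""
    start = np.asarray(start, dtype=np.int64)
    upper = start + np.asarray(update.shape, dtype=np.int64)
    if np.any(start < 0) or np.any(upper > np.asarray(operand.shape)):
        raise NotImplementedError("out-of-bounds start indices are not handled")
    indices = update_slice_indices(start, update.shape)
    result = operand.copy()
    # One index array per operand axis; each has the shape of `update`.
    result[tuple(indices[..., axis] for axis in range(indices.shape[-1]))] = update
    return result
```

For a 2x2 update written at start `[1, 2]`, `update_slice_indices` returns an array of shape `(2, 2, 2)` whose last axis holds the `(row, column)` coordinate of each update element.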

## Testing

All tests use manually-built minimal TFLite flatbuffers with
`tvm.ir.assert_structural_equal`. Region/subgraph tests construct the smallest
valid body/comparator/update subgraphs. BuiltinOptions2 ops construct their
options via the FlatBuffers schema API.

```bash
python -m pytest tests/python/relax/test_frontend_tflite.py -k stablehlo -q
```

## Result

- 39 StableHLO operators registered in the Relax TFLite frontend (29 from
  PR #19536 + 10 from this PR).
- 77 StableHLO test cases covering all registered ops, including
  structural-equal tests and unsupported/error-path checks:

  - `REMAINDER` truncating semantics
  - `DYNAMIC_UPDATE_SLICE` with dynamic starts and out-of-bounds starts
  - `DOT_GENERAL` with non-canonical contracting dimensions
  - `CONVOLUTION` with non-canonical dimension numbers and
    `FeatureGroupCount > 1`
  - `REDUCE` with unsupported reducer and non-identity init value
  - `SORT` with unsupported comparator and stable sort
  - `REDUCE_WINDOW` with unsupported reducer and base dilation
  - `SCATTER` with unsupported reducer and update window dims
  - `COMPOSITE` with composite attributes and scope isolation
  - Multi-subgraph model with unused subgraphs
- All 77 StableHLO tests pass.
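
The `CBRT` and `REMAINDER` expressions listed under Changes can be checked numerically with NumPy. This is only a sketch of the two formulas: the function names are hypothetical and NumPy stands in for the Relax ops.

```python
import numpy as np


def cbrt(x):
    """where(x < 0, -power(-x, 1/3), power(x, 1/3)), as in the PR description."""
    x = np.asarray(x, dtype=np.float64)
    third = 1.0 / 3.0
    # Both branches are evaluated, so the unused one may produce NaN; the
    # warning is silenced because np.where discards those entries.
    with np.errstate(invalid="ignore"):
        return np.where(x < 0, -np.power(-x, third), np.power(x, third))


def remainder(x, y):
    """Truncating remainder x - y * trunc(x / y); the sign follows the dividend."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    return x - y * np.trunc(x / y)
```

Note that this remainder differs from Python's `%` operator for mixed signs: `remainder(-7.0, 3.0)` follows the dividend and is negative, whereas `-7.0 % 3.0` follows the divisor.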

## References

- Issue #19519 item I: StableHLO operators in TFLite
- PR #19536: First batch of 29 StableHLO ops
tlopex pushed a commit that referenced this pull request May 27, 2026
…19601)

## Summary

This PR adds Relax TFLite frontend support for
`UNIDIRECTIONAL_SEQUENCE_RNN` (BuiltinOperator 35), claimed in
[#19519](#19519) Group A.

The op executes a simple RNN cell over a time sequence. The converter
unrolls the time steps at graph-construction time using Relax
primitives.

Cell equation:
```
h_t = fused_activation(x_t @ W.T + h_{t-1} @ Wr.T + b)
```

## Changes

- **Handler**: `convert_unidirectional_sequence_rnn` registered in
`convert_map` (alphabetical, U-region after `UNPACK`)
- **Inputs** (5): `input [batch, time, input_size]`, `input_weights
[num_units, input_size]`, `recurrent_weights [num_units, num_units]`,
`bias [num_units]`, `hidden_state [batch, num_units]` (variable,
zero-initialised)
- **Output**: `[batch, time, num_units]` (always batch-major)
- **time_major=True**: input is transposed to batch-major before
unrolling
- **Activations**: NONE, RELU, RELU6, TANH, SIGMOID (via
`convert_fused_activation_function`)
- **Quantized**: raises `OpNotImplemented` (not yet supported)
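
The unrolling described above can be sketched in NumPy under the stated input shapes. This is an illustrative reference, not the Relax converter: `unidirectional_sequence_rnn` here is a hypothetical stand-alone function, and the `activation` callable stands in for the fused activation.

```python
import numpy as np


def unidirectional_sequence_rnn(x, w, wr, b, h0, activation=np.tanh, time_major=False):
    """Unroll h_t = activation(x_t @ W.T + h_{t-1} @ Wr.T + b) over time.

    x:  [batch, time, input_size], or [time, batch, input_size] if time_major
    w:  [num_units, input_size]; wr: [num_units, num_units]; b: [num_units]
    h0: [batch, num_units]
    """
    if time_major:
        x = np.transpose(x, (1, 0, 2))  # convert to batch-major first
    h = h0
    outputs = []
    for t in range(x.shape[1]):  # one cell evaluation per time step
        h = activation(x[:, t, :] @ w.T + h @ wr.T + b)
        outputs.append(h)
    return np.stack(outputs, axis=1)  # [batch, time, num_units]
```

Because the loop runs over a Python range, the number of cell evaluations is fixed by the static time dimension, which is the same property the graph-construction-time unrolling relies on.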

## Testing

Modern TF/Keras (2.x, Keras 3) no longer emits
`UNIDIRECTIONAL_SEQUENCE_RNN`; `SimpleRNN` with `unroll=False` lowers to
`WHILE`+TensorList ops, and `unroll=True` expands to elementwise ops.
Tests therefore follow the same flatbuffer-construction pattern used by
the StableHLO op PRs (#19536, #19587).

Three tests added to `tests/python/relax/test_frontend_tflite.py`:

- `test_unidirectional_sequence_rnn_none_activation` —
`tvm.ir.assert_structural_equal` with identity weights / zero bias, NONE
activation, time=1
- `test_unidirectional_sequence_rnn_relu_activation` — shape check,
random weights, RELU activation, time=3
- `test_unidirectional_sequence_rnn_time_major` — shape check,
`time_major=True` input layout

```bash
python -m pytest tests/python/relax/test_frontend_tflite.py -k unidirectional_sequence_rnn -v
```

All 3 tests pass. The pre-commit checks (ASF header, ruff check, ruff format)
also pass.

## References

- Issue [#19519](#19519) Group A:
Sequence / recurrent model operators

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>