
[Relax][ONNX] Preserve NaN in Sign to align with ONNX Runtime#19674

Merged
tlopex merged 4 commits into
apache:mainfrom
cchung100m:issue-19543
Jun 6, 2026

Conversation

@cchung100m

@cchung100m cchung100m commented Jun 4, 2026


Hi Committers,

This PR fixes issues #19543 and #19572. Suggestions are welcome.

Root cause:

The ONNX frontend's Sign converter returned relax.op.sign(x) directly. After legalization, this maps to topi.sign, which is implemented via comparisons (x < 0 ? -1 : x > 0 ? 1 : 0). For NaN, both comparisons are false, so TVM produced 0, whereas ONNX Runtime preserves NaN. As a result, imported ONNX models could behave differently in TVM than in ONNX Runtime.
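To make the root cause concrete, here is a minimal pure-Python sketch of a comparison-based sign. It is not the TVM or topi implementation; the function name `sign_by_comparison` is hypothetical and only mirrors the `x < 0 ? -1 : x > 0 ? 1 : 0` pattern described above.

```python
def sign_by_comparison(x: float) -> float:
    """Hypothetical stand-in for a sign built only from comparisons."""
    if x < 0:
        return -1.0
    if x > 0:
        return 1.0
    # Reached for 0.0 and also for NaN, because every ordered
    # comparison involving NaN evaluates to False.
    return 0.0


nan = float("nan")
print(sign_by_comparison(-2.5))  # -1.0
print(sign_by_comparison(nan))   # 0.0, the NaN is silently dropped
```

The NaN case falls through both branches, which is the behavior the PR describes for the legalized op.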

Solution:

Apply a minimal ONNX-frontend-only fix in onnx_frontend.py:

  • For floating-point inputs, lower Sign as where(isnan(x), x, sign(x)).
  • Keep non-floating inputs unchanged (sign(x)).
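The two bullets above can be sketched in plain Python over lists. This is an illustration of the `where(isnan(x), x, sign(x))` lowering only, not the code in onnx_frontend.py; the helpers `plain_sign`, `where`, and `sign_preserving_nan`, and the `is_float` flag standing in for the dtype check, are all assumptions made for the example.

```python
import math


def plain_sign(x):
    """Comparison-based sign; yields 0 for NaN (hypothetical helper)."""
    return -1 if x < 0 else (1 if x > 0 else 0)


def where(cond, a, b):
    """Elementwise select: take a[i] where cond[i] is true, else b[i]."""
    return [ai if c else bi for c, ai, bi in zip(cond, a, b)]


def sign_preserving_nan(values, is_float):
    """Sketch of the lowering: guard only floating-point inputs."""
    signs = [plain_sign(v) for v in values]
    if not is_float:
        # Non-floating inputs cannot hold NaN, so sign(x) is used unchanged.
        return signs
    nan_mask = [math.isnan(v) for v in values]
    return where(nan_mask, values, [float(s) for s in signs])


out = sign_preserving_nan([-3.0, 0.0, float("nan"), 7.5], is_float=True)
print(out)  # the third element stays NaN instead of becoming 0.0
print(sign_preserving_nan([-4, 0, 9], is_float=False))
```

The guard costs one extra `isnan` and one `where` pass over the data, which is the overhead the later commits in this thread refer to.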

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request updates the ONNX frontend's Sign operator implementation to correctly handle NaN values for floating-point inputs by returning NaN instead of the default sign output. The review feedback suggests simplifying the extraction of the input's data type by directly checking if its structural information is an instance of TensorStructInfo, which avoids overly complex and redundant getattr calls.


Comment thread python/tvm/relax/frontend/onnx/onnx_frontend.py
@cchung100m cchung100m force-pushed the issue-19543 branch 2 times, most recently from f69ea8a to 753c4c8 Compare June 6, 2026 06:41
@cchung100m cchung100m changed the title [Relax][ONNX] Refactor: Sign operator returns 0 for NaN instead of preserving NaN [Relax][ONNX] Preserve NaN in Sign to align with ONNX Runtime Jun 6, 2026

@tlopex tlopex left a comment


LGTM. Thanks for the fix

@tlopex tlopex merged commit aa59644 into apache:main Jun 6, 2026
11 checks passed
@cchung100m


Thanks to @tlopex 😄

@cchung100m cchung100m deleted the issue-19543 branch June 6, 2026 11:49
tlopex added a commit to tlopex/tvm that referenced this pull request Jun 19, 2026
…ndefined)

Remove the explicit NaN-preservation guards added in the ONNX frontend for
Relu (apache#19750), Sign (apache#19674), Clip's input (apache#19535), and ReduceMax/ReduceMin
(the _reduce_min_max_preserve_nan helper, apache#19750). Each paid an extra
isnan + where -- the reduce helper a full sum(isnan) pass -- over the data
solely to force a NaN input through, a corner case that mostly shows up in
fuzzing. NaN handling for these ops is now left unspecified, matching the
less-strict but efficient direction: relu/sign reduce to the plain relax op
and ReduceMax/ReduceMin to relax.op.max/min.

Clip keeps the cheap scalar NaN-bound sanitization: a NaN min/max bound is
still treated as unbounded (ORT parity), which is a distinct concern from
per-element input-NaN passthrough.

Drop the corresponding tests: test_relu_nan_preserve, test_sign_nan_preserve,
test_reduce_min_max_nan_preserve, and the NaN-input case of test_clip_v13.
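The Clip bound sanitization mentioned in the commit message above is a separate, cheaper concern than per-element NaN passthrough. A minimal sketch, assuming the bounds are scalars: the names `sanitize_bound` and `clip` are hypothetical and this is not the frontend's code.

```python
import math


def sanitize_bound(bound):
    """Treat a missing or NaN bound as unbounded (returns None)."""
    if bound is None or math.isnan(bound):
        return None
    return bound


def clip(values, min_bound=None, max_bound=None):
    """Clamp each value to [min_bound, max_bound], skipping unset bounds."""
    lo = sanitize_bound(min_bound)
    hi = sanitize_bound(max_bound)
    out = []
    for v in values:
        # How NaN elements in `values` behave is deliberately not
        # specified by this sketch; only the bounds are sanitized.
        if lo is not None and v < lo:
            v = lo
        if hi is not None and v > hi:
            v = hi
        out.append(v)
    return out


# A NaN lower bound is ignored, so only the upper bound applies.
print(clip([-5.0, 0.5, 5.0], float("nan"), 1.0))  # [-5.0, 0.5, 1.0]
```

The check runs once per bound rather than once per element, which is why it is described as cheap.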
tqchen pushed a commit that referenced this pull request Jun 20, 2026
This PR removes the explicit NaN-preservation guards added in the ONNX
frontend for Relu (#19750), Sign (#19674), Clip's input (#19535), and
ReduceMax/ReduceMin (the _reduce_min_max_preserve_nan helper, #19750).
Each guard paid an extra isnan + where over the data (the reduce helper a
full sum(isnan) pass) solely to force a NaN input through.

This PR also drops the corresponding tests: test_relu_nan_preserve,
test_sign_nan_preserve, test_reduce_min_max_nan_preserve, and the
NaN-input case of test_clip_v13.

Follow-up PRs will address NaN preservation in the backend.