[CUTLASS] Conv2d activation fusion, part 2: Sigmoid fp16, SiLU and HardSwish #9795
Merged
masahi
commented
Dec 22, 2021
    if mode == "constant":
        if not non_zero_found:
            return data
Member
Author
This is a minor optimization, but it gave a non-trivial performance improvement on the DETR model. @comaniac
Contributor
Hmm interesting. I didn't notice that we may have pad ops that actually pad nothing.
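For readers who want to see the shape of this early return, here is a minimal sketch in plain NumPy. It is not the TVM PyTorch frontend code: the function name `pad_constant` and the `(before, after)` layout of `pad_widths` are assumptions made for illustration, and only the `non_zero_found` check mirrors the diff above.

```python
import numpy as np


def pad_constant(data, pad_widths, value=0.0):
    """Constant-pad `data`, skipping the op when every pad width is zero.

    `pad_widths` is a list of (before, after) pairs, one per axis.
    This layout is an assumption for the sketch, not TVM's.
    """
    non_zero_found = any(before != 0 or after != 0 for before, after in pad_widths)
    if not non_zero_found:
        # Nothing to pad: hand back the input instead of emitting a no-op pad.
        return data
    return np.pad(data, pad_widths, mode="constant", constant_values=value)


x = np.ones((1, 3, 4, 4), dtype="float32")
same = pad_constant(x, [(0, 0)] * 4)                          # all-zero widths
padded = pad_constant(x, [(0, 0), (0, 0), (1, 1), (1, 1)])    # pads H and W by 1
```

In a compiler frontend the benefit is that no pad op is emitted at all, so later passes have one less op sitting between a convolution and its neighbors.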
masahi
commented
Dec 22, 2021
    const auto* begin = types[1].as<TensorTypeNode>();
    if (begin == nullptr) {
      return false;
    }
Member
Author
This and the change below in src/relay/op/tensor/transform.cc are the fix for the type inference issue mentioned in the "Known issues" section of #9746.
No test is added because the issue is hard to reproduce in a simple test case and the change is trivial.
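The guard above can be modeled with a toy type relation in Python. This is only an illustrative sketch, assuming that a relation returning false means "not resolvable yet" so that the solver can retry once more types are known. `TensorType`, `IncompleteType`, `as_tensor_type`, and `toy_rel` are hypothetical stand-ins, not TVM APIs, and the output type assigned here is arbitrary.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class TensorType:
    shape: Tuple[int, ...]
    dtype: str


class IncompleteType:
    """Stand-in for a type that has not been inferred yet."""


def as_tensor_type(ty) -> Optional[TensorType]:
    # Mirrors the `as<TensorTypeNode>()` downcast: None when the cast fails.
    return ty if isinstance(ty, TensorType) else None


def toy_rel(types) -> bool:
    """Toy relation over [data, begin, out]; fills in `out` when it can."""
    data = as_tensor_type(types[0])
    begin = as_tensor_type(types[1])
    if data is None or begin is None:
        # An input is still unresolved: report "not yet" instead of
        # dereferencing a missing type.
        return False
    types[2] = TensorType(shape=data.shape, dtype=data.dtype)  # arbitrary toy output
    return True


types = [TensorType((2, 3), "float32"), IncompleteType(), None]
first_try = toy_rel(types)        # begin is unresolved, so nothing is assigned
types[1] = TensorType((2,), "int64")
second_try = toy_rel(types)       # both inputs known, output gets filled in
```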
56e0e95 to 18e0736
comaniac
approved these changes
Dec 22, 2021
ylc pushed a commit to ylc/tvm that referenced this pull request Jan 7, 2022
…rdSwish (apache#9795)
* [Torch] do not pad if pad widths are all zero
* silu fusion supported
* adding hardswish support
* support fast_math sigmoid op
* fixed type inference for yolov5 + silu fusion
* use include_non_call_ops=False in AnnotateTarget
* update cutlass
* revert change in build.py
* simplify codegen
* lint
ylc pushed a commit to ylc/tvm that referenced this pull request Jan 13, 2022
…rdSwish (apache#9795)
* [Torch] do not pad if pad widths are all zero
* silu fusion supported
* adding hardswish support
* support fast_math sigmoid op
* fixed type inference for yolov5 + silu fusion
* use include_non_call_ops=False in AnnotateTarget
* update cutlass
* revert change in build.py
* simplify codegen
* lint
qsqqsqqsq-intellif pushed a commit to qsqqsqqsq-intellif/tvm that referenced this pull request Apr 29, 2022
…rdSwish (apache#9795)
* [Torch] do not pad if pad widths are all zero
* silu fusion supported
* adding hardswish support
* support fast_math sigmoid op
* fixed type inference for yolov5 + silu fusion
* use include_non_call_ops=False in AnnotateTarget
* update cutlass
* revert change in build.py
* simplify codegen
* lint
Now that the dependent PRs in the cutlass repo have been merged, we can enable more fusions. These fusions were used in the benchmark in #9746.
@comaniac @Laurawly
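For reference, the activations named in the title have the following standard definitions, written here as a NumPy sketch. This is not the CUTLASS epilogue code or the TVM codegen. It only shows the math that gets applied to the conv2d output, and the function names are chosen for illustration.

```python
import numpy as np


def sigmoid(x):
    # sigmoid(x) = 1 / (1 + exp(-x))
    return 1.0 / (1.0 + np.exp(-x))


def silu(x):
    # SiLU (also called swish): x * sigmoid(x)
    return x * sigmoid(x)


def hardswish(x):
    # HardSwish: x * relu6(x + 3) / 6, with relu6 written as a clip to [0, 6]
    return x * np.clip(x + 3.0, 0.0, 6.0) / 6.0


# Stand-in for a conv2d output in fp16; the values are arbitrary.
conv_out = np.array([-4.0, -3.0, 0.0, 3.0, 4.0], dtype="float16")
silu_out = silu(conv_out)
hardswish_out = hardswish(conv_out)
```

Fusing these into the convolution means the elementwise step runs on the result before it is written out, rather than as a separate pass over the output tensor.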