[CUTLASS] Conv2d activation fusion, part 2: Sigmoid fp16, SiLU and HardSwish #9795

Merged: masahi merged 10 commits into apache:main from masahi:cutlass-conv2d-fuse2 on Dec 23, 2021
@masahi masahi commented Dec 22, 2021

Now that the dependent PRs in the CUTLASS repo have been merged, we can enable more fusions. These fusions were used in the benchmark in #9746.

@comaniac @Laurawly
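
For reference, the three activations named in the PR title have standard scalar definitions, sketched below in plain Python. This is only the reference math and not the fused CUTLASS epilogue code from this PR; the function names are chosen here for illustration.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, written to avoid overflow for large negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def silu(x):
    """SiLU (also called swish): x * sigmoid(x)."""
    return x * sigmoid(x)


def hardswish(x):
    """HardSwish: x * relu6(x + 3) / 6, a piecewise-linear stand-in for SiLU."""
    relu6 = min(max(x + 3.0, 0.0), 6.0)
    return x * relu6 / 6.0
```

In a fused kernel, a function like these would be applied element-wise to the conv2d output before it is written back, which saves a separate pass over the tensor.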


if mode == "constant":
    if not non_zero_found:
        return data

masahi (Member, Author):

This is a minor optimization, but it noticeably improved performance on the DETR model. @comaniac

comaniac (Contributor):

Hmm interesting. I didn't notice that we may have pad ops that actually pad nothing.
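
The idea discussed in this thread (skip the pad op when every pad width is zero) can be sketched in plain Python. This is a simplified illustration, not the frontend code: `pad_constant` and `pad_widths` are hypothetical names, and only `non_zero_found` is taken from the snippet above.

```python
def pad_constant(data, pad_widths, pad_value=0):
    """Constant-pad a 1-D list; hypothetical stand-in for a frontend pad converter."""
    # True as soon as any (before, after) width is nonzero.
    non_zero_found = any(p != 0 for pair in pad_widths for p in pair)
    if not non_zero_found:
        # All widths are zero, so padding would be a no-op: return the input
        # unchanged instead of emitting a pad operation.
        return data
    (before, after), = pad_widths  # this toy version handles one axis only
    return [pad_value] * before + list(data) + [pad_value] * after
```

Returning the input as-is means no extra op ends up in the graph for a pad that pads nothing, which matches the "minor optimization" described above.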

const auto* begin = types[1].as<TensorTypeNode>();
if (begin == nullptr) {
  return false;
}

masahi (Member, Author):

This and the change below in src/relay/op/tensor/transform.cc are the fix for the type inference issue mentioned in the "Known issues" section of #9746.

No test is added because the issue is hard to reproduce with a simple test case and the change is trivial.
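
The null check in the snippet above follows a common pattern: a type relation gives up early when one of its input types is not a concrete tensor type yet, so inference can revisit it later. Below is a toy Python model of that pattern. Every name in it (`TensorType`, `IncompleteType`, `toy_rel`) is hypothetical and only mimics the shape of the C++ code; it is not the TVM API and not the relation changed in this PR.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class TensorType:
    shape: Tuple[int, ...]
    dtype: str


class IncompleteType:
    """Placeholder for a type that inference has not resolved yet."""


def as_tensor_type(ty) -> Optional[TensorType]:
    # Plays the role of `.as<TensorTypeNode>()`: None when the cast fails.
    return ty if isinstance(ty, TensorType) else None


def toy_rel(types: list) -> bool:
    """types = [data, begin, out]. Returns False when an input is unresolved."""
    data = as_tensor_type(types[0])
    begin = as_tensor_type(types[1])
    if data is None or begin is None:
        return False  # not enough information yet; the caller may retry later
    # Toy output rule (an assumption for illustration): mirror the input type.
    types[2] = TensorType(data.shape, data.dtype)
    return True
```

Without the guard, the relation would dereference a missing type as soon as `begin` had not been inferred, which is the kind of failure a check like this avoids.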

@masahi masahi force-pushed the cutlass-conv2d-fuse2 branch from 56e0e95 to 18e0736 Compare December 22, 2021 20:26

@comaniac comaniac left a comment


LGTM


@masahi masahi merged commit 1afcf36 into apache:main Dec 23, 2021
ylc pushed a commit to ylc/tvm that referenced this pull request Jan 7, 2022
…rdSwish (apache#9795)

* [Torch] do not pad if pad widths are all zero

* silu fusion supported

* adding hardswish support

* support fast_math sigmoid op

* fixed type inference for yolov5 + silu fusion

* use include_non_call_ops=False in AnnotateTarget

* update cutlass

* revert change in build.py

* simplify codegen

* lint
ylc pushed a commit to ylc/tvm that referenced this pull request Jan 13, 2022
qsqqsqqsq-intellif pushed a commit to qsqqsqqsq-intellif/tvm that referenced this pull request Apr 29, 2022