Skip to content

Lower cache_read and cache_write to Hexagon DMA via tensorize#10365

Merged
masahi merged 4 commits into
apache:mainfrom
adstraw:hexagon-tir-to-dma-lower
Feb 25, 2022
Merged

Lower cache_read and cache_write to Hexagon DMA via tensorize#10365
masahi merged 4 commits into
apache:mainfrom
adstraw:hexagon-tir-to-dma-lower

Conversation

@adstraw

@adstraw adstraw commented Feb 23, 2022

Copy link
Copy Markdown
Contributor

Adds support to lower cache_read and cache_write schedule primitives to Hexagon User DMA using tensorize and new mem_copy intrinsic.

@adstraw

adstraw commented Feb 23, 2022

Copy link
Copy Markdown
Contributor Author

HUGE thanks to @masahi who also contributed to this PR.

@masahi masahi self-assigned this Feb 24, 2022

@tmoreau89 tmoreau89 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks @adstraw and @masahi for getting the pattern matching from cache_read/write to DMA working!

Comment thread src/runtime/cpu_device_api.cc Outdated
Comment thread include/tvm/tir/builtin.h
* Same semantics as memcpy(destination, source, size)
* Allows for device specific implementations e.g. direct memory access (DMA)
*/
TVM_DLL const Op& mem_copy();

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @tqchen @junrushao1994 if they want a better name here.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to make it more specific? We have a few examples above like texture2d_load/store. What about dma_mem_copy() or any hexagon-specific name?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about something with "DMA" in the name but I realized that DMA is more of an implementation detail. I ended up referencing DMA in the comments instead. Happy to take suggestions including dma_mem_copy.

@masahi masahi Feb 24, 2022

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about:

  • mem_copy_local (to emphasize the local nature of the intended usage, unlike TVMArrayCopyFromTo @junrushao1994 mentioned which might involve RPC)
  • mem_copy_1d (if we intend to add 2d rectangular memcpy later)

But both of them don't particularly sound better.

Comment thread tests/python/contrib/test_hexagon/test_cache_read_write.py Outdated
@masahi

masahi commented Feb 24, 2022

Copy link
Copy Markdown
Member

cc @kparzysz-quic

Comment thread src/tir/transforms/lower_tvm_builtin.cc Outdated
@masahi masahi merged commit dcbdedd into apache:main Feb 25, 2022
@adstraw adstraw deleted the hexagon-tir-to-dma-lower branch February 25, 2022 19:16
pfk-beta pushed a commit to pfk-beta/tvm that referenced this pull request Apr 11, 2022
…#10365)

* Lower cache_read and cache_write to Hexagon DMA via tensorize

* rework test to be compatible with launcher

* remove cpu device api mem_copy implementation and test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants