[Hexagon] 3-stage pipeline; multi queue async DMA for cache read / write #12954
CC @masahi
sched = tvm.testing.parameter("cache_read", "cache_read_write")
@tvm.testing.fixture
With this PR, we could technically test any n-stage pipeline, correct (doesn't have to be limited to 3-stage)?
Correct. This allows any number of cache_read and cache_write stages to be lowered using Async DMA on Hexagon. Note that there is a known issue with cache_read for an op with multiple inputs in the same stage, which will be addressed in a future PR. That PR will change the compute in this test from a + 1 to a + b and add support for lowering the cache_read of both a and b in the same stage to Async DMA.
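To make the n-stage point concrete, here is a minimal sketch of how the stages of a software pipeline overlap over time. It is a toy model in plain Python, not TVM's lowering: the function name `pipeline_timeline` and the rule "stage s works on chunk t - s at step t" are assumptions chosen for illustration.

```python
def pipeline_timeline(num_chunks, stages=("cache_read", "compute", "cache_write")):
    """Return, for each time step, which chunk each stage is working on.

    Toy model (hypothetical, not TVM code): stage s processes chunk (t - s)
    at time step t, so each stage runs one chunk behind the stage before it.
    """
    timeline = []
    # The pipeline needs extra steps at the end to drain the later stages.
    for t in range(num_chunks + len(stages) - 1):
        active = {}
        for s, name in enumerate(stages):
            chunk = t - s
            if 0 <= chunk < num_chunks:
                active[name] = chunk
        timeline.append(active)
    return timeline


if __name__ == "__main__":
    for step, active in enumerate(pipeline_timeline(4)):
        print(step, active)
```

Passing a different `stages` tuple models a pipeline with more or fewer stages; nothing in the model is specific to three.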
tmoreau89 left a comment
Thanks for the PR Adam, LGTM (left a few nits)
@tvm-bot rerun
Thanks @adstraw, the PR has been merged
…ite (apache#12954)
* [Hexagon] 3-stage pipeline; multi queue async DMA for cache rd / wr
* add cache_write (no cache_read) schedule to python test
Add HexagonUserDMA support for multiple virtual queues, which enables Async DMA for both cache_read and cache_write while keeping a single descriptor chain to preserve overall FIFO ordering between virtual queues. Tested with runtime unit tests and at the Python level for a simple operator.
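The idea of several virtual queues sharing one descriptor chain can be sketched with a small model. This is plain Python for illustration only: the class `VirtualQueueDMA` and its `copy` / `wait` methods are hypothetical names, not the HexagonUserDMA interface, and the model assumes descriptors complete strictly in the order they were appended to the chain.

```python
from collections import deque


class VirtualQueueDMA:
    """Toy model of virtual queues backed by one FIFO descriptor chain.

    All names are hypothetical; this is not the HexagonUserDMA interface.
    """

    def __init__(self, num_queues):
        self.chain = deque()               # one chain: (queue_id, label) in issue order
        self.in_flight = [0] * num_queues  # unfinished copies per virtual queue
        self.completed = []                # labels in completion order

    def copy(self, queue_id, label):
        """Append a descriptor to the shared chain on behalf of one queue."""
        self.chain.append((queue_id, label))
        self.in_flight[queue_id] += 1

    def wait(self, queue_id, max_in_flight):
        """Retire descriptors from the head of the chain until the given
        queue has at most max_in_flight copies outstanding."""
        while self.in_flight[queue_id] > max_in_flight:
            done_queue, label = self.chain.popleft()
            self.in_flight[done_queue] -= 1
            self.completed.append(label)


if __name__ == "__main__":
    dma = VirtualQueueDMA(num_queues=2)
    dma.copy(0, "read0")   # queue 0 stands in for cache_read traffic
    dma.copy(1, "write0")  # queue 1 stands in for cache_write traffic
    dma.copy(0, "read1")
    dma.wait(0, 0)         # draining queue 0 also retires write0, in FIFO order
    print(dma.completed)
```

In this model, waiting on one queue can retire earlier descriptors that belong to another queue, because completion follows the order of the single chain rather than the order within each queue.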