[WIP][Runtime]Pipeline Executor For Compute graph pipeline#7892
huajsj wants to merge 28 commits into
Thanks @tqchen for the follow-up. I posted an RFC (https://discuss.tvm.apache.org/t/rfc-compute-graph-pipeline-with-new-subgraph-executor/9839) that explains the motivation and solution architecture, for reference.
comaniac
left a comment
Reviewed the test for APIs and user interfaces, but it needs lots of changes so I'll stop here for now.
comaniac
left a comment
Also cc @areusch @tmoreau89
params : dict of str to NDArray
    Additional arguments
"""
if key is not None:
Why are key and value allowed to be None?
I don't see the fix. Why do we need if key is not None?
6061ae2 to 1c67a87
I've been busy with other tasks recently. I'll try to take another look when I get a chance.
2947327 to 62e4d5f
@comaniac, if you have time, could you help with a review? Thanks.
template <typename SLOT_TYPE = SLOT>
void deleteQueue(squeue<SLOT_TYPE>* q) {
  free(q);
#include <assert.h>
#include <sched.h>
#include <string.h>
#include <sys/syscall.h>
Thanks @huajsj. Please respect our coding style and standard good programming practices:
[Finding]
The final output always equals a constant value, as if the input data
were 0. This is because the input-index lookup applies a '+1' offset,
but input indices already start from 0. That means setinput('x', ..)
should be converted to setinput(0, ..), but the faulty logic converts
it to setinput(1, ..).
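To make the off-by-one concrete, here is a minimal Python sketch of the behavior described in the finding. It is not the PR's code: `make_input_lookup`, `index_buggy`, `index_fixed`, and `set_input` are hypothetical names invented for illustration, and the input buffers are modeled as a plain list.

```python
def make_input_lookup(input_names):
    """Map each input name to its 0-based position (hypothetical helper)."""
    return {name: idx for idx, name in enumerate(input_names)}


def index_buggy(lookup, name):
    # Faulty logic from the finding: adds 1 to an index that is already 0-based.
    return lookup[name] + 1


def index_fixed(lookup, name):
    # Corrected logic: use the 0-based index as-is.
    return lookup[name]


def set_input(buffers, index, value):
    """Write value into the slot at index; out-of-range writes are dropped."""
    if 0 <= index < len(buffers):
        buffers[index] = value


lookup = make_input_lookup(["x", "y"])

# With the bug, set_input for 'x' lands in slot 1, so slot 0 keeps its
# default of 0 and any result computed from 'x' looks like a constant.
buggy_buffers = [0, 0]
set_input(buggy_buffers, index_buggy(lookup, "x"), 5)

# With the fix, 'x' lands in slot 0 as intended.
fixed_buffers = [0, 0]
set_input(fixed_buffers, index_fixed(lookup, "x"), 5)
```

In this sketch the buggy path leaves the slot for 'x' at its default of 0, which matches the symptom in the finding: the output does not depend on the data passed in.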
IIUC, this PR is out of date? Should we close it?
@comaniac, yes, we should close this PR. Closed it now.
Issue:
SoC hardware platforms have multiple types of compute chipsets, such as
GPU, FPGA, APU, and RPU. There is a requirement to use these compute
units in parallel to reach the best performance.
Solution:
In this pipeline solution, we first split the compute graph into
a group of subgraphs, then run these subgraphs in a pipeline module
so that the GPU/FPGA/APU/RPU can run in parallel.
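The pipelining idea in the description can be sketched in plain Python. This is only an illustration under stated assumptions, not the executor proposed in this PR: each subgraph is stood in for by an ordinary function, each stage runs on its own thread, and stages are connected by `queue.Queue`. The names `run_pipeline` and `_stage_worker` are hypothetical.

```python
import queue
import threading

_STOP = object()  # sentinel that tells each stage to shut down


def _stage_worker(fn, inbox, outbox):
    """Apply fn to every item from inbox and forward the result to outbox."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)  # propagate shutdown to the next stage
            return
        outbox.put(fn(item))


def run_pipeline(stages, inputs):
    """Run each stage (a stand-in for one subgraph) on its own thread.

    While stage k processes item i, stage k-1 can already work on item i+1,
    which is the overlap the pipeline is meant to provide.
    """
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    workers = [
        threading.Thread(target=_stage_worker, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(stages)
    ]
    for w in workers:
        w.start()

    for x in inputs:
        queues[0].put(x)
    queues[0].put(_STOP)

    results = []
    while True:
        item = queues[-1].get()
        if item is _STOP:
            break
        results.append(item)

    for w in workers:
        w.join()
    return results


# Two toy "subgraphs": the first adds 1, the second doubles.
outputs = run_pipeline([lambda v: v + 1, lambda v: v * 2], [1, 2, 3])
```

Because every stage is a single FIFO consumer, results come out in input order. A real executor would bind each stage to a device-specific runtime module rather than a Python function.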
Thanks for contributing to TVM! Please refer to guideline https://tvm.apache.org/docs/contribute/ for useful information and tips. After the pull request is submitted, please request code reviews from Reviewers by @ them in the pull request thread.