[Docs] Refactor BYOC example NPU tutorial #19439

Merged: MasterJH5574 merged 1 commit into apache:main from tlopex:doc500 on Apr 25, 2026

@tlopex tlopex commented Apr 25, 2026

This PR refactors the BYOC tutorial for the example NPU backend so that the full pipeline (register → partition → codegen → VM execute) actually runs and visibly demonstrates fusion.
It also fixes several latent bugs in the example backend that the original tutorial was implicitly papering over.

@tlopex tlopex changed the title [Docs]Refactor BYOC example NPU tutorial [Docs] Refactor BYOC example NPU tutorial Apr 25, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request enhances the 'Bring Your Own Codegen' (BYOC) tutorial and the example NPU backend. Key changes include adding a fused MatMul+ReLU pattern, updating the runtime dispatch logic to handle fused operations, and reordering dispatch checks to prevent incorrect substring matches (e.g., ensuring 'depthwise_conv2d' is checked before 'conv2d'). The tutorial is also expanded with execution examples and clearer explanations of the partitioning process. Feedback suggests improving consistency in the runtime by adding the 'is_fused' parameter to all convolution dispatch functions, even if fused patterns for them are not yet registered.
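The substring-ordering issue mentioned above can be illustrated with a small standalone sketch. It is not the backend's actual dispatch code: the `Kernel` enum, the helper names, and the `example_npu.*` op names are hypothetical and chosen only to show why the more specific check has to come first.

```cpp
#include <string>

// Hypothetical kernel tags, used only for this illustration.
enum class Kernel { kDepthwiseConv2D, kConv2D, kConv1D, kMatMul, kUnknown };

// Returns true when `needle` occurs anywhere in `op_name`.
static bool Contains(const std::string& op_name, const std::string& needle) {
  return op_name.find(needle) != std::string::npos;
}

// Problematic order: "conv2d" is a substring of "depthwise_conv2d",
// so the generic branch captures depthwise ops before the specific one runs.
Kernel DispatchGenericFirst(const std::string& op_name) {
  if (Contains(op_name, "conv2d")) return Kernel::kConv2D;
  if (Contains(op_name, "depthwise")) return Kernel::kDepthwiseConv2D;
  if (Contains(op_name, "conv1d")) return Kernel::kConv1D;
  if (Contains(op_name, "matmul")) return Kernel::kMatMul;
  return Kernel::kUnknown;
}

// Corrected order: the more specific name is tested before the generic one.
Kernel DispatchSpecificFirst(const std::string& op_name) {
  if (Contains(op_name, "depthwise")) return Kernel::kDepthwiseConv2D;
  if (Contains(op_name, "conv2d")) return Kernel::kConv2D;
  if (Contains(op_name, "conv1d")) return Kernel::kConv1D;
  if (Contains(op_name, "matmul")) return Kernel::kMatMul;
  return Kernel::kUnknown;
}
```

With the generic check first, an op named `example_npu.depthwise_conv2d` would be routed to the plain 2D convolution kernel; with the specific check first, it reaches the depthwise kernel as intended.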

Comment on lines +329 to +330
} else if (op_name.find("depthwise") != std::string::npos) {
ExecuteDepthwiseConv2D(node, engine);

medium

While ExecuteMatMul and ExecuteConv2D have been updated to accept the is_fused flag, ExecuteDepthwiseConv2D (and ExecuteConv1D below) still lack this parameter. Although no fused patterns for depthwise or 1D convolution are currently registered in patterns.py, adding the parameter here would improve consistency across the runtime's dispatch logic and make it more robust for future extensions.
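A rough sketch of what the suggested uniform signature might look like follows. Everything beyond the function names quoted in the review is an assumption: the `Node` and `Engine` structs are hypothetical stand-ins, the fused check based on a `relu` substring is a guess at how fusion could be detected, and each function body only records which kernel would run.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for the runtime's node and engine types.
struct Node { std::string op_name; };
struct Engine { std::vector<std::string> log; };

// Every Execute* function takes the same is_fused flag, even where
// no fused pattern is registered yet (depthwise and 1D convolution).
void ExecuteMatMul(const Node&, Engine& engine, bool is_fused) {
  engine.log.push_back(is_fused ? "matmul+relu" : "matmul");
}
void ExecuteConv2D(const Node&, Engine& engine, bool is_fused) {
  engine.log.push_back(is_fused ? "conv2d+relu" : "conv2d");
}
void ExecuteDepthwiseConv2D(const Node&, Engine& engine, bool is_fused) {
  engine.log.push_back(is_fused ? "depthwise_conv2d+relu" : "depthwise_conv2d");
}
void ExecuteConv1D(const Node&, Engine& engine, bool is_fused) {
  engine.log.push_back(is_fused ? "conv1d+relu" : "conv1d");
}

// Assumed dispatch: fusion is inferred from the op name, and the
// specific "depthwise" check precedes the generic "conv2d" check.
void Dispatch(const Node& node, Engine& engine) {
  const std::string& op_name = node.op_name;
  const bool is_fused = op_name.find("relu") != std::string::npos;
  if (op_name.find("depthwise") != std::string::npos) {
    ExecuteDepthwiseConv2D(node, engine, is_fused);
  } else if (op_name.find("conv2d") != std::string::npos) {
    ExecuteConv2D(node, engine, is_fused);
  } else if (op_name.find("conv1d") != std::string::npos) {
    ExecuteConv1D(node, engine, is_fused);
  } else if (op_name.find("matmul") != std::string::npos) {
    ExecuteMatMul(node, engine, is_fused);
  }
}
```

Threading the flag through every branch means that registering a fused depthwise or 1D pattern later would only require a new pattern entry, not a signature change in the dispatch code.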

@MasterJH5574 MasterJH5574 merged commit 9dc87f1 into apache:main Apr 25, 2026
10 of 11 checks passed