issue/843: QY machine supports scale_mm #68

Merged
qinyiqun merged 2 commits into qinyiqun:issue/843 from xgqdut2016:issue/843
Jan 15, 2026
@xgqdut2016

Owner

scaled_mm will hold multiple quantization algorithms in the future, so kernel.cuh should be given a more specific name, and it needs comments.
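For context on the naming request: the name adopted later in this PR, `per_channel_dequant_int8.cuh`, suggests per-channel int8 dequantization. The block below is a minimal CPU reference sketch of that general technique, not the kernel from this PR. The function name `scaled_mm_per_channel_ref`, the row-major layout, and the choice of one scale per row of `a` and one per column of `b` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU reference, NOT the kernel from this PR.
// Computes out[i][j] = a_scale[i] * b_scale[j] * sum_p a[i][p] * b[p][j],
// where a is (m x k) and b is (k x n), both row-major int8.
std::vector<float> scaled_mm_per_channel_ref(
    const std::vector<std::int8_t>& a,
    const std::vector<std::int8_t>& b,
    const std::vector<float>& a_scale,  // assumed: one scale per row of a
    const std::vector<float>& b_scale,  // assumed: one scale per column of b
    std::size_t m, std::size_t k, std::size_t n) {
    assert(a.size() == m * k && b.size() == k * n);
    assert(a_scale.size() == m && b_scale.size() == n);

    std::vector<float> out(m * n);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            // Accumulate int8 products in int32 so they cannot overflow int8.
            std::int32_t acc = 0;
            for (std::size_t p = 0; p < k; ++p) {
                acc += static_cast<std::int32_t>(a[i * k + p]) *
                       static_cast<std::int32_t>(b[p * n + j]);
            }
            // Dequantize: apply the row scale and the column scale once per output.
            out[i * n + j] = a_scale[i] * b_scale[j] * static_cast<float>(acc);
        }
    }
    return out;
}
```

A reference like this is mainly useful for checking a GPU kernel's output on small inputs. A file hosting a different quantization scheme would differ in how the scales are indexed, which is one reason a scheme-specific file name helps.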

Owner

This file is for qy-gpu, so it should not be named after nvidia.

@qinyiqun qinyiqun merged commit 188885a into qinyiqun:issue/843 Jan 15, 2026
2 of 5 checks passed
qinyiqun pushed a commit that referenced this pull request Jan 20, 2026
* issue/843: success qy scaled_mm

* issue/843: modified kernel.cuh as per_channel_dequant_int8.cuh
qinyiqun pushed a commit that referenced this pull request Jan 21, 2026
* issue/843: success qy scaled_mm

* issue/843: modified kernel.cuh as per_channel_dequant_int8.cuh
qinyiqun pushed a commit that referenced this pull request Jan 27, 2026
* issue/843: success qy scaled_mm

* issue/843: modified kernel.cuh as per_channel_dequant_int8.cuh
qinyiqun pushed a commit that referenced this pull request Feb 5, 2026
* issue/843: success qy scaled_mm

* issue/843: modified kernel.cuh as per_channel_dequant_int8.cuh
qinyiqun added a commit that referenced this pull request Mar 4, 2026
demo131 - multiple issues regarding quantization, qy, and so forth

* issue/843: success per_channel_quant_int8

* issue/843: success qy quant

* issue/843: modified quant

* Add w8a8int8 performance tests

* add infinicore op linear_w8a8i8

* w8a8 linear module functional nn

* issue/843: QY-GPU Support Int8 scale_mm (#68)

* issue/843: success qy scaled_mm

* issue/843: modified kernel.cuh as per_channel_dequant_int8.cuh

* fix parallel slic in w8

* w8: support multiple batch size

* temp: modify quantconfig handling

* fix format and delete redundancy code

* fix format

* fix format

* fix format

* Refactor: add new API alongside legacy interfaces with deprecation warnings

* Add w4 InfiniCore-related content and move Quantization config into InfiniCore

* Quantization operators support graphs

* solve cub version problem and fix code structure

* fix format

* demo131 - remove commented lines

---------

Co-authored-by: xgqdut2016 <kenan_gewei@163.com>
Co-authored-by: xgqdut2016 <140036308+xgqdut2016@users.noreply.github.com>
Co-authored-by: wooway777 <wooway777@gmail.com>
xgqdut2016 added a commit to xgqdut2016/InfiniCore that referenced this pull request Mar 4, 2026
* issue/843: success qy scaled_mm

* issue/843: modified kernel.cuh as per_channel_dequant_int8.cuh
2 participants