Collect more matrix shapes #1004

Description

@functionstackx

@claude do a profile run of DeepSeek-R1 (dsr1) FP4 on MI355X at concurrency 16, then list the kernels and provide a repro script to microbenchmark each kernel. In parallel, do a profile run of DeepSeek-R1 FP8 on H200 at concurrency 16, and likewise list the kernels and provide a repro script to microbenchmark each kernel.

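Summary

Run two profiles in parallel: (1) DeepSeek-R1 FP4 on MI355X at concurrency 16 and (2) DeepSeek-R1 FP8 on H200 at concurrency 16. For each run, list the kernels and provide a repro script to microbenchmark each kernel.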
