[Runtime] DeepSeek v4 pro multi node runtimes by YouNeedCryDear · Pull Request #631 · ome-projects/ome

YouNeedCryDear · 2026-06-17T16:36:01Z

What this PR does

Adds DeepSeek v4 Pro multi-node runtime configuration:

Adds the vllm-deepseek-v4-pro-multi ClusterServingRuntime with an SMG router and vLLM leader/worker engine configuration.
Registers the runtime in config/runtimes/kustomization.yaml.
Adds a matching InferenceService sample for deepseek-v4-pro-multi.

Why we need it

Enables OME users to deploy DeepSeek v4 Pro on a two-node H100 topology with vLLM.

Fixes #

How to test

Not run locally; configuration-only PR submission.

Checklist

Tests added/updated (if applicable)
Docs updated (if applicable)
make test passes locally

shenoyvvarun · 2026-06-18T20:12:38Z

+          - --master-addr=$(LWS_LEADER_ADDRESS)
+          - --gpu-memory-utilization=0.95
+          - --max-num-seqs=256
+          - --max-num-batched-tokens=512


Yeah, sadly.

shenoyvvarun · 2026-06-18T20:14:58Z

+          - -cc.pass_config.fuse_allreduce_rms=False
+          - --master-addr=$(LWS_LEADER_ADDRESS)
+          - --gpu-memory-utilization=0.95
+          - --max-num-seqs=256


Does decreasing this improve the batched_tokens?

Decreasing this helps with the memory pressure.
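
To make the trade-off in this thread concrete, here is a minimal Python sketch that reads the flags from the hunk above and works out the per-GPU memory budget and the per-step token budget. It is only an illustration, not part of the PR: `parse_flags` and `GPU_MEMORY_GIB` are hypothetical names, the 80 GiB figure assumes an 80 GB H100 variant, and the "one token per decoding sequence" reading assumes plain decoding without speculative tokens.

```python
def parse_flags(args):
    """Turn ['--key=value', ...] into {'key': 'value'}; items without '--' or '=' are skipped."""
    flags = {}
    for arg in args:
        if arg.startswith("--") and "=" in arg:
            key, value = arg[2:].split("=", 1)
            flags[key] = value
    return flags


# Flags copied from the diff hunk under review.
ARGS = [
    "--master-addr=$(LWS_LEADER_ADDRESS)",
    "--gpu-memory-utilization=0.95",
    "--max-num-seqs=256",
    "--max-num-batched-tokens=512",
]

flags = parse_flags(ARGS)

# Assumption: each GPU has 80 GiB of memory (not stated in the PR).
GPU_MEMORY_GIB = 80

# Share of each GPU the engine may claim, and what is left outside that share.
budget_gib = GPU_MEMORY_GIB * float(flags["gpu-memory-utilization"])
headroom_gib = GPU_MEMORY_GIB - budget_gib

max_seqs = int(flags["max-num-seqs"])
max_tokens = int(flags["max-num-batched-tokens"])

# If every sequence slot is busy decoding one token per step, this is the
# part of the per-step token budget still available for prefill work.
prefill_room = max_tokens - max_seqs

print(f"engine budget per GPU: {budget_gib:.1f} GiB, headroom: {headroom_gib:.1f} GiB")
print(f"token budget per step: {max_tokens}, left for prefill at full decode load: {prefill_room}")
```

Under these assumptions, lowering `--max-num-seqs` or `--max-num-batched-tokens` shrinks the largest batch the engine has to hold in a single step, which is consistent with the reply that a smaller value helps with memory pressure.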

YouNeedCryDear requested review from CatherineSue, XinyueZhang369 and slin1237 as code owners June 17, 2026 16:36

github-actions Bot added runtime Runtime configuration changes config Configuration changes labels Jun 17, 2026

DeepSeek v4 pro multi node runtimes

530087c

YouNeedCryDear force-pushed the feat/deepseek-v4-pro-multi-node branch from 5862827 to 530087c Compare June 17, 2026 16:38

YouNeedCryDear changed the title ~~DeepSeek v4 pro multi node runtimes~~ [Runtime] DeepSeek v4 pro multi node runtimes Jun 17, 2026

shenoyvvarun reviewed Jun 18, 2026

shenoyvvarun approved these changes Jun 18, 2026

YouNeedCryDear wants to merge 1 commit into main from feat/deepseek-v4-pro-multi-node
