Bug
The CUDA 12.8.1 + PyTorch 2.8.0 template (runpod/pytorch:1.0.2-cu1281-torch280-ubuntu2404) installs PyTorch 2.4.1 instead of 2.8.0.
Root Cause
In official-templates/pytorch/docker-bake.hcl line 27:
{ cuda_version = "12.8.1", torch = "2.8.0", whl_src = "128" },
The Dockerfile runs:
pip install torch==2.8.0 --index-url https://download.pytorch.org/whl/cu128
PyTorch 2.8.0 stable wheels are not published for cu128. Pip silently falls back to PyTorch 2.4.1+cu124.
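One way to confirm what the cu128 index actually serves (a quick check; pip index is still marked experimental and needs pip >= 21.2):
$ pip index versions torch --index-url https://download.pytorch.org/whl/cu128
If 2.8.0 does not appear in the reported versions, the ==2.8.0 pin cannot be satisfied from that index.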
Evidence
Deployed a 2x B200 pod using the "Runpod Pytorch 2.8.0" template:
$ nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
NVIDIA B200, 580.126.09
NVIDIA B200, 580.126.09
$ python3 -c "import torch; print(torch.__version__, torch.version.cuda)"
2.4.1+cu124 12.4
$ python3 -c "import torch; print(torch.cuda.get_device_properties(0))"
UserWarning: NVIDIA B200 with CUDA capability sm_100 is not compatible
with the current PyTorch installation. The current PyTorch install
supports CUDA capabilities sm_50 sm_60 sm_70 sm_75 sm_80 sm_86 sm_90.
Pod hostname: 7ab25c9e0ebb
Pod IP: 38.80.152.146:31039
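The missing architecture can also be confirmed directly: torch.cuda.get_arch_list() lists the compute capabilities the installed build was compiled for, and on this image sm_100 is absent (consistent with the warning above):
$ python3 -c "import torch; print(torch.cuda.get_arch_list())"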
Impact
- B200 GPUs (compute capability sm_100) are completely unusable with PyTorch 2.4.1
- Users pay for B200 GPU time they cannot use
- RunPod's own deployment page warns "B200s only support Pytorch 2.8 and above" but the template labeled 2.8.0 doesn't deliver it
Comparison with Other Templates
The older templates embed the actual torch version in the image tag and work correctly:
- PyTorch 2.1: runpod/pytorch:2.1.0-py3.10-cuda11.8.0-devel-ubuntu22.04 ✅
- PyTorch 2.2: runpod/pytorch:2.2.0-py3.10-cuda12.1.1-devel-ubuntu22.04 ✅
- PyTorch 2.4: runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04 ✅
- PyTorch 2.8: runpod/pytorch:1.0.2-cu1281-torch280-ubuntu2404 ❌ installs 2.4.1
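A quick spot-check of what each tag actually ships (importing torch just to print its version needs no GPU):
for img in \
    runpod/pytorch:2.1.0-py3.10-cuda11.8.0-devel-ubuntu22.04 \
    runpod/pytorch:2.2.0-py3.10-cuda12.1.1-devel-ubuntu22.04 \
    runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04 \
    runpod/pytorch:1.0.2-cu1281-torch280-ubuntu2404; do
  printf '%s: ' "${img}"
  docker run --rm "${img}" python3 -c "import torch; print(torch.__version__)"
done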
Suggested Fix
Change the wheel source for torch 2.8.0 in docker-bake.hcl:
# Current (broken):
{ cuda_version = "12.8.1", torch = "2.8.0", whl_src = "128" },
# Fix option 1 (use cu124 wheels):
{ cuda_version = "12.8.1", torch = "2.8.0", whl_src = "124" },
# Fix option 2 (use nightly wheels, as the original commit abfb7ab did):
# TORCH = "torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128"
This likely also affects the CUDA 12.9.0 and 13.0.0 rows, if the same wheel-source issue applies there.
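Whichever wheel source is chosen, a build-time guard would turn this silent downgrade into a hard build failure. A minimal sketch, assuming the Dockerfile exposes the pinned version as a build arg (TORCH_VERSION is a hypothetical name):
# hypothetical guard; assumes an ARG TORCH_VERSION is declared earlier in the Dockerfile
RUN python3 -c "import torch; v = torch.__version__.split('+')[0]; assert v == '${TORCH_VERSION}', 'expected ${TORCH_VERSION}, got ' + v"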
Related
- RunPod support ticket #35526 (same user, same template)
- The same docker-bake.hcl rows for torch 2.6.0 and 2.7.1 may also be affected; worth validating wheel availability for every version pinned against cu128 (a quick sweep is sketched below)
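A minimal sweep over those versions, again leaning on the experimental pip index subcommand (pip >= 21.2 assumed):
for v in 2.6.0 2.7.1 2.8.0; do
  if pip index versions torch --index-url https://download.pytorch.org/whl/cu128 2>/dev/null | grep -qF "${v}"; then
    echo "cu128 index has torch ${v}"
  else
    echo "cu128 index is MISSING torch ${v}"
  fi
done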