PyTorch 2.8.0 cu128 template installs 2.4.1 — missing cu128 wheels #114

@dentity007

Description

Bug

The CUDA 12.8.1 + PyTorch 2.8.0 template (runpod/pytorch:1.0.2-cu1281-torch280-ubuntu2404) installs PyTorch 2.4.1 instead of 2.8.0.

Root Cause

In official-templates/pytorch/docker-bake.hcl line 27:

{ cuda_version = "12.8.1", torch = "2.8.0", whl_src = "128" },

The Dockerfile runs:

pip install torch==2.8.0 --index-url https://download.pytorch.org/whl/cu128

PyTorch 2.8.0 stable wheels are not published for cu128. Pip silently falls back to PyTorch 2.4.1+cu124.
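
Whether a given version is actually served by that index can be checked directly. A minimal sketch (pip's `index` subcommand is experimental but has shipped since pip 21.2):

# List the torch versions the cu128 index actually serves:
$ pip index versions torch --index-url https://download.pytorch.org/whl/cu128

# Or count torch-2.8.0 wheel entries in the PEP 503 simple index itself:
$ curl -s https://download.pytorch.org/whl/cu128/torch/ | grep -c 'torch-2\.8\.0'

If 2.8.0 is absent from the listing, the exact pin in the Dockerfile cannot be satisfied from that index.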

Evidence

Deployed a 2x B200 pod using the "Runpod Pytorch 2.8.0" template:

$ nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
NVIDIA B200, 580.126.09
NVIDIA B200, 580.126.09

$ python3 -c "import torch; print(torch.__version__, torch.version.cuda)"
2.4.1+cu124 12.4

$ python3 -c "import torch; print(torch.cuda.get_device_properties(0))"
UserWarning: NVIDIA B200 with CUDA capability sm_100 is not compatible
with the current PyTorch installation. The current PyTorch install
supports CUDA capabilities sm_50 sm_60 sm_70 sm_75 sm_80 sm_86 sm_90.

Pod hostname: 7ab25c9e0ebb
Pod IP: 38.80.152.146:31039

Impact

  • B200 GPUs (compute capability sm_100) are completely unusable with PyTorch 2.4.1
  • Users pay for B200 GPU time they cannot use
  • RunPod's own deployment page warns "B200s only support Pytorch 2.8 and above", yet the template labeled 2.8.0 doesn't deliver it

Comparison with Other Templates

The older templates embed the actual torch version in the image tag and work correctly:

Pytorch 2.1: runpod/pytorch:2.1.0-py3.10-cuda11.8.0-devel-ubuntu22.04  ✅
Pytorch 2.2: runpod/pytorch:2.2.0-py3.10-cuda12.1.1-devel-ubuntu22.04  ✅
Pytorch 2.4: runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04  ✅
Pytorch 2.8: runpod/pytorch:1.0.2-cu1281-torch280-ubuntu2404            ❌ installs 2.4.1
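
Any of these images can be spot-checked locally without renting a GPU, since importing torch and printing its version string requires no device. A sketch:

$ docker run --rm runpod/pytorch:1.0.2-cu1281-torch280-ubuntu2404 \
    python3 -c "import torch; print(torch.__version__, torch.version.cuda)"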

Suggested Fix

Change the wheel source for torch 2.8.0 in docker-bake.hcl:

# Current (broken):
{ cuda_version = "12.8.1", torch = "2.8.0", whl_src = "128" },

# Fix option 1 — use cu124 wheels:
{ cuda_version = "12.8.1", torch = "2.8.0", whl_src = "124" },

# Fix option 2 — use nightly wheels (as the original commit abfb7ab did):
# TORCH = "torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128"

This likely affects the CUDA 12.9.0 and 13.0.0 rows as well, if the same wheel-source issue applies.
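
Whichever option is chosen, a post-install assertion in the Dockerfile would turn this failure mode into a broken build instead of a mislabeled image. A minimal sketch (the TORCH_VERSION build arg is hypothetical, not part of the current Dockerfile):

# Hypothetical guard placed right after the pip install step;
# fails the build if the resolved torch version is not the one requested:
ARG TORCH_VERSION=2.8.0
RUN python3 -c "import torch; assert torch.__version__.split('+')[0] == '${TORCH_VERSION}', torch.__version__"

With a guard like this, the cu128 fallback described above would have broken the image build instead of shipping 2.4.1 under a 2.8.0 label.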

Related

  • RunPod support ticket #35526 (same user, same template)
  • The same docker-bake.hcl rows for torch 2.6.0 and 2.7.1 may also be affected; it's worth validating cu128 wheel availability for all of them (see the sketch below)
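
A quick availability sweep over those rows might look like this (a sketch; assumes pip >= 21.2 for the experimental `pip index` subcommand):

# Query the cu128 index once, then check each pinned version against it:
avail=$(pip index versions torch --index-url https://download.pytorch.org/whl/cu128 2>/dev/null)
for v in 2.6.0 2.7.1 2.8.0; do
  echo "$avail" | grep -qF "$v" && echo "torch $v on cu128: found" || echo "torch $v on cu128: MISSING"
done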
