OpenShell GPU sandbox setup currently relies on a static filesystem baseline plus runtime enumeration of /dev/nvidia[0-9] device nodes. That covers the common native Linux CUDA path, but it does not derive the inner sandbox requirements from the selected CDI device spec.
That leaves non-standard CDI layouts fragile. The container runtime may inject the right device nodes, mounts, libraries, and group metadata into the outer sandbox container, while the OpenShell supervisor still denies or misconfigures the inner workload because those CDI-derived requirements are absent from the Landlock policy or process credentials.
Tegra/Jetson systems can require additional platform device nodes, host-file mounts, and supplemental group handling. Closed PR fix(gpu): add Tegra/Jetson GPU support #625 called out CDI additionalGIDs, GID 44 / video, and /dev/nvmap.
Other NVIDIA platforms may expose CDI device nodes or library mount destinations that do not match the current static native Linux baseline.
This issue is related to #398, but narrower. #398 tracks using CDI for GPU injection. This issue tracks translating selected CDI metadata into the OpenShell supervisor's inner Landlock policy and process setup.
Proposed Design
File this as a design/spike issue first. Do not mark it state:agent-ready until the runtime contract and implementation split are accepted.
Settled direction:
Docker is the first build target.
Docker-selected CDI device IDs come from the existing GPU request path.
Docker CDI spec directories come from daemon-reported Info.CDISpecDirs.
Docker delivers cdi-context.json through the container archive upload API after create_container and before start_container.
The supervisor resolves selected opaque CDI IDs from supervisor-only CDI spec mounts and applies derived Landlock/process requirements before exec.
CDI specs are the source of truth for device nodes, mount destinations, writable single-file mount candidates, and additionalGIDs.
Non-GPU sandboxes receive no CDI context, spec mounts, CDI-derived policy, or CDI warnings.
Kubernetes, Podman, WSL2 hardware validation, and Tegra/Jetson hardware validation are follow-ups unless their dependencies are available during the first implementation.
The shared CDI context schema and resolver should live in openshell-core::cdi unless implementation uncovers a dependency reason to split it into a dedicated crate. Docker-specific tar upload and bind construction should remain in openshell-driver-docker; Landlock and process credential application should remain in openshell-sandbox.
The runtime-to-supervisor contract should be a small versioned supervisor-only CDI context. Initial shape:
source is diagnostic only. The supervisor must use container-side path values for CDI spec discovery and must not treat host-side source paths as policy paths.
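A minimal sketch of how the supervisor might model and check this context before discovering specs. The field names follow the context shape given in this issue; the `spec_search_paths` helper, the `SUPPORTED_VERSION` constant, and the specific validation rules are assumptions for illustration. JSON decoding is left out so the example stays dependency-free.

```rust
/// One supervisor-only CDI spec mount.
#[derive(Debug, Clone)]
pub struct SpecDir {
    /// Container-side path; the only value used for spec discovery.
    pub path: String,
    /// Host-side path; diagnostic only, never a policy path.
    pub source: String,
}

/// Versioned runtime-to-supervisor CDI context.
#[derive(Debug, Clone)]
pub struct CdiContext {
    pub version: u32,
    pub runtime: String,
    pub selected_devices: Vec<String>,
    pub spec_dirs: Vec<SpecDir>,
}

/// Hypothetical constant: the only context version this sketch accepts.
pub const SUPPORTED_VERSION: u32 = 1;

/// Returns the container-side directories to scan for CDI specs.
/// Fails closed on an unknown version, an empty selection, or a
/// non-absolute spec directory path. `source` is never consulted.
pub fn spec_search_paths(ctx: &CdiContext) -> Result<Vec<&str>, String> {
    if ctx.version != SUPPORTED_VERSION {
        return Err(format!("unsupported CDI context version: {}", ctx.version));
    }
    if ctx.selected_devices.is_empty() {
        return Err("CDI context selects no devices".to_string());
    }
    let mut paths = Vec::new();
    for dir in &ctx.spec_dirs {
        if !dir.path.starts_with('/') {
            return Err(format!("spec dir path is not absolute: {}", dir.path));
        }
        paths.push(dir.path.as_str());
    }
    if paths.is_empty() {
        return Err("CDI context lists no spec dirs".to_string());
    }
    Ok(paths)
}
```

Returning only `path` values keeps host-side `source` strings out of anything that later feeds the Landlock policy.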
The supervisor-side resolver should:
resolve opaque fully-qualified CDI device IDs to spec kind plus device name;
apply spec-level containerEdits once for any selected device from the spec;
apply device-level containerEdits for each selected device;
extract deviceNodes[].path as container-side device paths;
extract mounts[].containerPath as container-side mount destinations;
extract additionalGIDs and apply supplemental groups before exec;
ignore host paths for Landlock policy;
ignore CDI hooks, environment variables, Intel RDT, and network devices for the first filesystem/process-credential implementation except for diagnostics.
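The resolver steps above can be sketched over already-parsed specs. All type and function names here (`Spec`, `Device`, `ContainerEdits`, `Resolved`, `parse_cdi_id`, `resolve`) are hypothetical; YAML/JSON parsing, hooks, and environment variables are deliberately omitted. The sketch assumes a fully-qualified CDI ID has the form `vendor/class=name`, as in `nvidia.com/gpu=all`.

```rust
use std::collections::BTreeSet;

/// Subset of containerEdits used by the first implementation.
#[derive(Debug, Default, Clone)]
pub struct ContainerEdits {
    pub device_nodes: Vec<String>, // deviceNodes[].path
    pub mount_dests: Vec<String>,  // mounts[].containerPath
    pub additional_gids: Vec<u32>, // additionalGIDs
}

#[derive(Debug, Clone)]
pub struct Device {
    pub name: String,
    pub edits: ContainerEdits,
}

#[derive(Debug, Clone)]
pub struct Spec {
    pub kind: String,
    pub edits: ContainerEdits, // spec-level containerEdits
    pub devices: Vec<Device>,
}

/// Container-side requirements derived from the selected devices.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct Resolved {
    pub device_nodes: BTreeSet<String>,
    pub mount_dests: BTreeSet<String>,
    pub additional_gids: BTreeSet<u32>,
}

/// Splits "vendor/class=name" into (kind, name).
pub fn parse_cdi_id(id: &str) -> Option<(&str, &str)> {
    let (kind, name) = id.split_once('=')?;
    let (vendor, class) = kind.split_once('/')?;
    if vendor.is_empty() || class.is_empty() || name.is_empty() {
        return None;
    }
    Some((kind, name))
}

fn merge(out: &mut Resolved, edits: &ContainerEdits) {
    out.device_nodes.extend(edits.device_nodes.iter().cloned());
    out.mount_dests.extend(edits.mount_dests.iter().cloned());
    out.additional_gids.extend(edits.additional_gids.iter().copied());
}

/// Resolves selected IDs; fails closed on a malformed or missing ID.
pub fn resolve(specs: &[Spec], selected: &[&str]) -> Result<Resolved, String> {
    let mut out = Resolved::default();
    let mut spec_applied = BTreeSet::new();
    for id in selected {
        let (kind, name) =
            parse_cdi_id(id).ok_or_else(|| format!("invalid CDI device ID: {id}"))?;
        let found = specs.iter().enumerate().find_map(|(idx, spec)| {
            if spec.kind != kind {
                return None;
            }
            spec.devices.iter().find(|d| d.name == name).map(|d| (idx, spec, d))
        });
        let (idx, spec, device) =
            found.ok_or_else(|| format!("selected CDI device not found: {id}"))?;
        // Spec-level edits are applied once per spec, device-level edits per device.
        if spec_applied.insert(idx) {
            merge(&mut out, &spec.edits);
        }
        merge(&mut out, &device.edits);
    }
    Ok(out)
}
```

Host paths never appear in `Resolved`, so nothing downstream can mistake them for Landlock policy paths.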
CDI mount destinations should default to read-only. Read-write access may be allowed for single files only when an explicit policy opt-in is present. Read-write CDI directory mounts should be rejected in the first implementation.
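One way to express this mount rule is a small decision function. The names (`MountRequest`, `Access`, `MountError`, `mount_access`) are hypothetical, and how `wants_write` and `is_directory` are derived from a CDI mount entry is not specified here. The sketch also assumes that a writable single-file request without opt-in is rejected rather than downgraded to read-only; the issue leaves that choice open.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Access {
    ReadOnly,
    ReadWrite,
}

#[derive(Debug, PartialEq, Eq)]
pub enum MountError {
    /// Read-write directory mounts are rejected in the first implementation.
    WritableDirectory,
    /// Assumption: writable files without opt-in fail closed.
    WritableFileNotOptedIn,
}

/// What a CDI mount destination asks for (hypothetical summary type).
#[derive(Debug, Clone, Copy)]
pub struct MountRequest {
    pub wants_write: bool,
    pub is_directory: bool,
}

/// Decides the Landlock access mode for one CDI mount destination.
pub fn mount_access(req: MountRequest, rw_file_opt_in: bool) -> Result<Access, MountError> {
    match (req.wants_write, req.is_directory) {
        // Default: every mount destination is read-only.
        (false, _) => Ok(Access::ReadOnly),
        // Writable directories are never accepted.
        (true, true) => Err(MountError::WritableDirectory),
        // Writable single files need the explicit policy opt-in.
        (true, false) if rw_file_opt_in => Ok(Access::ReadWrite),
        (true, false) => Err(MountError::WritableFileNotOptedIn),
    }
}
```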
Kubernetes should use the same supervisor-side resolution path only after a node-local provider can supply selected CDI device IDs and supervisor-only CDI spec mounts for the actual sandbox pod/container. The Kubernetes driver must not infer selected CDI IDs from nvidia.com/gpu, because that is a resource request, not an opaque CDI device ID.
Alternatives Considered
Static path expansion: rejected as the long-term design. It repeats the current failure mode for WSL2, Tegra/Jetson, and future CDI layouts.
Inspect injected container state: useful for diagnostics, but rejected as the authorization source because it cannot reliably distinguish selected CDI paths from unrelated devices or mounts.
Driver-side pre-resolution: rejected for Docker because it creates different local and remote daemon behavior. The supervisor should resolve selected CDI IDs from daemon-host CDI specs mounted into the sandbox.
Hard-coded Docker CDI spec dir defaults: rejected for the initial implementation. Use Info.CDISpecDirs; if Docker reports no CDI spec dirs, keep rejecting GPU/CDI sandbox creation as unsupported.
Test Strategy
Most behavior can be tested without physical GPUs by constructing test-only CDI specs and presenting them to the supervisor through the same runtime-to-supervisor contract used by Docker.
Minimum hardware-free fixture suite:
native-single: one selected device with /dev/nvidiactl, /dev/nvidia0, and a read-only library mount.
native-all: aggregate nvidia.com/gpu=all entry with multiple device nodes.
wsl-shape: fake WSL-style entry with /dev/dxg and WSL library mount destinations.
tegra-shape: fake Tegra/Jetson-style entry with /dev/nvmap, another platform device node, and additionalGIDs.
rw-file: writable single-file mount, accepted only with explicit policy opt-in.
rw-directory: writable directory mount, rejected.
missing-device: selected CDI ID absent from all mounted specs.
malformed-spec: invalid YAML or unsupported CDI shape.
unsafe-paths: relative paths, paths containing .., /, /dev, /proc, /sys, /run, and /usr as broad derived policy paths.
duplicate-conflict: duplicate paths that request incompatible access modes.
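The unsafe-paths fixture suggests a check along these lines. This is a sketch under assumptions: `check_policy_path`, `PathError`, and the exact set of broad paths are illustrative, taken from the list in the fixture description, and a real implementation may need further normalization.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum PathError {
    Relative,
    ParentTraversal,
    TooBroad,
}

/// Paths that must never become CDI-derived policy paths on their own.
const BROAD_PATHS: [&str; 6] = ["/", "/dev", "/proc", "/sys", "/run", "/usr"];

/// Validates one container-side path before it enters the derived policy.
pub fn check_policy_path(path: &str) -> Result<(), PathError> {
    if !path.starts_with('/') {
        return Err(PathError::Relative);
    }
    if path.split('/').any(|component| component == "..") {
        return Err(PathError::ParentTraversal);
    }
    // Treat "/dev/" the same as "/dev"; an all-slash path collapses to "/".
    let trimmed = path.trim_end_matches('/');
    let normalized = if trimmed.is_empty() { "/" } else { trimmed };
    if BROAD_PATHS.iter().any(|broad| *broad == normalized) {
        return Err(PathError::TooBroad);
    }
    Ok(())
}
```

Specific entries beneath a broad root, such as `/dev/nvidia0` or `/usr/lib/wsl`, still pass; only the roots themselves are refused.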
Definition Of Done
Document the Docker CDI metadata source and runtime-to-supervisor contract.
Define how Docker mounts daemon-host CDI spec directories for local and remote daemon cases.
Define how Docker uploads cdi-context.json without bind-mounting a gateway-local metadata file.
Ensure Docker removes a created container if CDI context upload fails before start_container.
Ensure CDI context, spec mounts, policy changes, and warnings apply only to GPU/CDI sandboxes.
Define fail-closed diagnostics for missing selected IDs, missing specs, malformed specs, unsafe paths, writable directory requests, and custom-policy conflicts.
Confirm the initial implementation does not add static GPU device-node or library fallbacks.
Define explicit policy opt-in behavior for CDI writable single-file mount destinations.
Reject CDI-derived read-write directory mounts.
Define Kubernetes selected-device handoff requirements or split that into a follow-up issue.
Add resolver tests for representative CDI specs, including additionalGIDs.
Add supervisor tests using test-only CDI specs and fake/non-standard CDI devices.
Add sandbox policy enrichment tests proving CDI-derived paths are included only for GPU-requested sandboxes.
Add tests for missing selected IDs, malformed specs, duplicate paths, writable mounts, and unsafe broad paths.
Preserve or improve diagnostics for custom policies that conflict with CDI-required GPU paths.
Create follow-up issues for Podman, Kubernetes selected-device discovery, WSL2 validation, and Tegra/Jetson validation as needed.
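The create, upload, start ordering and the cleanup requirement from the list above can be exercised without a Docker daemon. The `Runtime` trait, `launch_with_cdi_context`, and `FakeRuntime` below are hypothetical stand-ins, not the actual openshell-driver-docker API; they only illustrate removing the created container when the CDI context upload fails before start.

```rust
/// Hypothetical abstraction over the container runtime calls involved.
pub trait Runtime {
    fn create_container(&mut self) -> Result<String, String>;
    fn upload_archive(&mut self, id: &str, tar: &[u8]) -> Result<(), String>;
    fn start_container(&mut self, id: &str) -> Result<(), String>;
    fn remove_container(&mut self, id: &str) -> Result<(), String>;
}

/// Creates a container, uploads the CDI context archive, then starts it.
/// If the upload fails, the created container is removed and never started.
pub fn launch_with_cdi_context<R: Runtime>(
    rt: &mut R,
    context_tar: &[u8],
) -> Result<String, String> {
    let id = rt.create_container()?;
    if let Err(err) = rt.upload_archive(&id, context_tar) {
        // Best-effort cleanup; the upload error is the one reported.
        let _ = rt.remove_container(&id);
        return Err(format!("CDI context upload failed: {err}"));
    }
    rt.start_container(&id)?;
    Ok(id)
}

/// In-memory fake that records which containers were started or removed.
#[derive(Default)]
pub struct FakeRuntime {
    pub fail_upload: bool,
    pub started: Vec<String>,
    pub removed: Vec<String>,
}

impl Runtime for FakeRuntime {
    fn create_container(&mut self) -> Result<String, String> {
        Ok("c1".to_string())
    }
    fn upload_archive(&mut self, _id: &str, _tar: &[u8]) -> Result<(), String> {
        if self.fail_upload {
            Err("upload refused".to_string())
        } else {
            Ok(())
        }
    }
    fn start_container(&mut self, id: &str) -> Result<(), String> {
        self.started.push(id.to_string());
        Ok(())
    }
    fn remove_container(&mut self, id: &str) -> Result<(), String> {
        self.removed.push(id.to_string());
        Ok(())
    }
}
```

Handling of a failed start is outside what this sketch covers.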
Agent Investigation
Local design iteration reviewed the Docker GPU/CDI request path, the sandbox GPU baseline, Kubernetes GPU resource handling, related WSL/Tegra history, and the CDI runtime contract constraints.
Current relevant code paths:
crates/openshell-driver-docker/src/lib.rs: Docker GPU support detection from Info.CDISpecDirs, CDI DeviceRequest, Docker create/start flow.
crates/openshell-sandbox/src/lib.rs: current static GPU Landlock baseline and custom-policy conflict behavior.
crates/openshell-core/src/gpu.rs: shared GPU-to-CDI device ID helper.
Problem Statement
Parent: #1444
OpenShell GPU sandbox setup currently relies on a static filesystem baseline plus runtime enumeration of /dev/nvidia[0-9] device nodes. That covers the common native Linux CUDA path, but it does not derive the inner sandbox requirements from the selected CDI device spec.
That leaves non-standard CDI layouts fragile. The container runtime may inject the right device nodes, mounts, libraries, and group metadata into the outer sandbox container, while the OpenShell supervisor still denies or misconfigures the inner workload because those CDI-derived requirements are absent from the Landlock policy or process credentials.
Motivating examples:
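WSL2 can require /dev/dxg and WSL-specific libraries. Prior work in bug: GPU passthrough fails on WSL2 — NVML init fails without CDI mode and libdxcore.so #404 and closed PR feat(sandbox): add GPU sandbox support for WSL2 #608 identified /dev/dxg, /usr/lib/wsl, and CDI mode as important for WSL GPU support.
Tegra/Jetson systems can require additional platform device nodes, host-file mounts, and supplemental group handling. Closed PR fix(gpu): add Tegra/Jetson GPU support #625 called out CDI additionalGIDs, GID 44 / video, and /dev/nvmap.
Other NVIDIA platforms may expose CDI device nodes or library mount destinations that do not match the current static native Linux baseline.
This issue is related to #398, but narrower. #398 tracks using CDI for GPU injection. This issue tracks translating selected CDI metadata into the OpenShell supervisor's inner Landlock policy and process setup.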
Proposed Design
File this as a design/spike issue first. Do not mark it state:agent-ready until the runtime contract and implementation split are accepted.
Settled direction:
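Docker is the first build target.
Docker-selected CDI device IDs come from the existing GPU request path.
Docker CDI spec directories come from daemon-reported Info.CDISpecDirs.
Docker delivers cdi-context.json through the container archive upload API after create_container and before start_container.
The supervisor resolves selected opaque CDI IDs from supervisor-only CDI spec mounts and applies derived Landlock/process requirements before exec.
CDI specs are the source of truth for device nodes, mount destinations, writable single-file mount candidates, and additionalGIDs.
/proc write access is out of scope for the CDI resolver and remains a separate CUDA sandbox runtime compatibility requirement linked to fix(sandbox): restore GPU filesystem baseline #1522.
Non-GPU sandboxes receive no CDI context, spec mounts, CDI-derived policy, or CDI warnings.
Kubernetes, Podman, WSL2 hardware validation, and Tegra/Jetson hardware validation are follow-ups unless their dependencies are available during the first implementation.
The shared CDI context schema and resolver should live in openshell-core::cdi unless implementation uncovers a dependency reason to split it into a dedicated crate. Docker-specific tar upload and bind construction should remain in openshell-driver-docker; Landlock and process credential application should remain in openshell-sandbox.
The runtime-to-supervisor contract should be a small versioned supervisor-only CDI context. Initial shape: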
{
  "version": 1,
  "runtime": "docker",
  "selected_devices": ["nvidia.com/gpu=all"],
  "spec_dirs": [
    { "path": "/run/openshell/supervisor/cdi-specs/0", "source": "/etc/cdi" }
  ]
}
source is diagnostic only. The supervisor must use container-side path values for CDI spec discovery and must not treat host-side source paths as policy paths.
The supervisor-side resolver should:
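resolve opaque fully-qualified CDI device IDs to spec kind plus device name;
apply spec-level containerEdits once for any selected device from the spec;
apply device-level containerEdits for each selected device;
extract deviceNodes[].path as container-side device paths;
extract mounts[].containerPath as container-side mount destinations;
extract additionalGIDs and apply supplemental groups before exec;
ignore host paths for Landlock policy;
ignore CDI hooks, environment variables, Intel RDT, and network devices for the first filesystem/process-credential implementation except for diagnostics.
CDI mount destinations should default to read-only. Read-write access may be allowed for single files only when an explicit policy opt-in is present. Read-write CDI directory mounts should be rejected in the first implementation.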
Kubernetes should use the same supervisor-side resolution path only after a node-local provider can supply selected CDI device IDs and supervisor-only CDI spec mounts for the actual sandbox pod/container. The Kubernetes driver must not infer selected CDI IDs from nvidia.com/gpu, because that is a resource request, not an opaque CDI device ID.
Alternatives Considered
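Static path expansion: rejected as the long-term design. It repeats the current failure mode for WSL2, Tegra/Jetson, and future CDI layouts.
Inspect injected container state: useful for diagnostics, but rejected as the authorization source because it cannot reliably distinguish selected CDI paths from unrelated devices or mounts.
Driver-side pre-resolution: rejected for Docker because it creates different local and remote daemon behavior. The supervisor should resolve selected CDI IDs from daemon-host CDI specs mounted into the sandbox.
Hard-coded Docker CDI spec dir defaults: rejected for the initial implementation. Use Info.CDISpecDirs; if Docker reports no CDI spec dirs, keep rejecting GPU/CDI sandbox creation as unsupported.
Test Strategy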
Most behavior can be tested without physical GPUs by constructing test-only CDI specs and presenting them to the supervisor through the same runtime-to-supervisor contract used by Docker.
Minimum hardware-free fixture suite:
native-single: one selected device with /dev/nvidiactl, /dev/nvidia0, and a read-only library mount.
native-all: aggregate nvidia.com/gpu=all entry with multiple device nodes.
wsl-shape: fake WSL-style entry with /dev/dxg and WSL library mount destinations.
tegra-shape: fake Tegra/Jetson-style entry with /dev/nvmap, another platform device node, and additionalGIDs.
rw-file: writable single-file mount, accepted only with explicit policy opt-in.
rw-directory: writable directory mount, rejected.
missing-device: selected CDI ID absent from all mounted specs.
malformed-spec: invalid YAML or unsupported CDI shape.
unsafe-paths: relative paths, paths containing .., /, /dev, /proc, /sys, /run, and /usr as broad derived policy paths.
duplicate-conflict: duplicate paths that request incompatible access modes.
Definition Of Done
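Document the Docker CDI metadata source and runtime-to-supervisor contract.
Define how Docker mounts daemon-host CDI spec directories for local and remote daemon cases.
Define how Docker uploads cdi-context.json without bind-mounting a gateway-local metadata file.
Ensure Docker removes a created container if CDI context upload fails before start_container.
Ensure CDI context, spec mounts, policy changes, and warnings apply only to GPU/CDI sandboxes.
Define fail-closed diagnostics for missing selected IDs, missing specs, malformed specs, unsafe paths, writable directory requests, and custom-policy conflicts.
Confirm the initial implementation does not add static GPU device-node or library fallbacks.
/proc write access is out of scope for the CDI resolver and remains linked to fix(sandbox): restore GPU filesystem baseline #1522.
Define explicit policy opt-in behavior for CDI writable single-file mount destinations.
Reject CDI-derived read-write directory mounts.
Define Kubernetes selected-device handoff requirements or split that into a follow-up issue.
Add resolver tests for representative CDI specs, including additionalGIDs.
Add supervisor tests using test-only CDI specs and fake/non-standard CDI devices.
Add sandbox policy enrichment tests proving CDI-derived paths are included only for GPU-requested sandboxes.
Add tests for missing selected IDs, malformed specs, duplicate paths, writable mounts, and unsafe broad paths.
Preserve or improve diagnostics for custom policies that conflict with CDI-required GPU paths.
Create follow-up issues for Podman, Kubernetes selected-device discovery, WSL2 validation, and Tegra/Jetson validation as needed.
Agent Investigation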
Local design iteration reviewed the Docker GPU/CDI request path, the sandbox GPU baseline, Kubernetes GPU resource handling, related WSL/Tegra history, and the CDI runtime contract constraints.
Current relevant code paths:
crates/openshell-driver-docker/src/lib.rs: Docker GPU support detection from Info.CDISpecDirs, CDI DeviceRequest, Docker create/start flow.
crates/openshell-sandbox/src/lib.rs: current static GPU Landlock baseline and custom-policy conflict behavior.
crates/openshell-core/src/gpu.rs: shared GPU-to-CDI device ID helper.
crates/openshell-driver-kubernetes/src/driver.rs: Kubernetes nvidia.com/gpu resource request path.
Related work: