fix: Kill() races on netns — missing runtime.LockOSThread() leaks sandbox TAP and tc filters #587

### Description

`Unikontainer.Kill()` in `pkg/unikontainers/unikontainers.go` (lines 562–595) calls `joinSandboxNetNs()` without first locking the goroutine to its OS thread via `runtime.LockOSThread()`. This can leak the sandbox's TAP device, ingress qdisc, and tc redirect filters whenever the goroutine is rescheduled onto a different thread before cleanup runs, even though the kill itself succeeds.

### Root cause

`joinSandboxNetNs` performs `unix.Setns(fd, CLONE_NEWNET)`, which mutates the network namespace of the **calling OS thread only**. In Go, a goroutine can be migrated between Ms (OS threads) at any scheduling point — every syscall entry/exit, channel op, or async preemption.

Between the `setns` and the netlink-backed `network.CleanupAllUruncTaps()` call later in `Kill()`, there are several scheduling points:

- `hypervisors.NewVMM(...)` (allocations)
- `vmm.Stop(pid)` → `kill(2)` syscall
- `network.CleanupAllUruncTaps()` → `netlink.NewHandle()` → `socket(AF_NETLINK)` + `bind(2)`

If the goroutine has migrated by the time `netlink.NewHandle()` runs, the netlink socket is bound to whatever netns *that* thread is in (the host netns, in practice), not the sandbox netns. The `^tap\d+_urunc$` scan finds zero matches (the sandbox TAP lives in the sandbox netns), the loop is a no-op, and `Kill()` returns `nil` while the sandbox TAP, qdisc, and tc redirect filters all remain.
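The thread affinity that `setns` relies on can be observed without touching any namespace. The following is a minimal, Linux-only sketch (the helper name `tidsWhileLocked` is hypothetical and not part of urunc): it pins the goroutine with `runtime.LockOSThread()` and compares the kernel thread ID before and after a run of scheduling points.

```go
package main

import (
	"fmt"
	"runtime"
	"syscall"
)

// tidsWhileLocked (hypothetical helper) pins the calling goroutine to its OS
// thread and returns the kernel thread ID seen before and after a series of
// scheduling points.
func tidsWhileLocked(yields int) (before, after int) {
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()

	before = syscall.Gettid()
	for i := 0; i < yields; i++ {
		runtime.Gosched()    // explicit scheduling point
		_ = syscall.Getpid() // a syscall, like kill(2) or socket(2) in Kill()
	}
	after = syscall.Gettid()
	return before, after
}

func main() {
	before, after := tidsWhileLocked(1000)
	fmt.Printf("locked: tid before=%d after=%d same=%v\n", before, after, before == after)
}
```

With the lock held, the two IDs must match. Without it, the runtime is allowed to resume the goroutine on a different thread, although a given run may not show a change.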

The contract was already documented on the function itself:

```go
// joinSandboxNetns joins the network namespace of the sandbox
// This function should be called only from a locked thread
// (i.e. runtime. LockOSThread())
func (u Unikontainer) joinSandboxNetNs() error { ... }
```

`Exec()` at line 537 obeys this with `runtime.LockOSThread()`. `Kill()` simply forgot to.

### Realistic trigger

Every `nerdctl rm -f`, `kubectl delete pod`, k8s eviction, Knative scale-to-zero, and CRI `StopContainer` flows through `Unikontainer.Kill()`. The race is reachable whenever `GOMAXPROCS > 1` (the default on multi-core machines): the Go scheduler may resume the goroutine on a different thread after the `kill(2)` and `socket(2)` syscalls, so whether a given kill leaks depends on scheduling.

### Impact

- **TAP / tc-filter accumulation in long-lived sandbox netns**: each `Kill()` that hits the race leaks one TAP device plus four tc rules (ingress qdisc on tap, ingress qdisc on eth0, redirect filter tap→eth0, redirect filter eth0→tap). On nodes that restart unikernel containers (Knative revision rolls, livenessProbe restarts, CronJob replays), this exhausts the netns and leaves stale tc rules redirecting traffic to deleted ifindexes, which manifests as "the unikernel's network stops working after a restart".
- **Silent failure**: `CleanupAllUruncTaps()` returns `nil` when no devices match, so no error surfaces to `urunc kill` or containerd. The leak stays invisible until the node degrades.
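The silent-failure mode can be illustrated with a small stand-in for the scan. This is a sketch only: `cleanupMatching` and the interface lists are hypothetical, and it is not the body of `CleanupAllUruncTaps()`. It reuses the `^tap\d+_urunc$` pattern quoted in the root-cause section and shows that a list with no matching names yields zero deletions and a `nil` error.

```go
package main

import (
	"fmt"
	"regexp"
)

// uruncTapPattern is the pattern quoted in the root-cause section.
var uruncTapPattern = regexp.MustCompile(`^tap\d+_urunc$`)

// cleanupMatching (hypothetical) calls del for every link name that matches
// the pattern and returns how many links were removed. When nothing matches,
// the loop body never runs and the result is (0, nil).
func cleanupMatching(links []string, del func(string) error) (int, error) {
	removed := 0
	for _, name := range links {
		if !uruncTapPattern.MatchString(name) {
			continue
		}
		if err := del(name); err != nil {
			return removed, err
		}
		removed++
	}
	return removed, nil
}

func main() {
	noop := func(string) error { return nil }

	// Assumed interface names, for illustration only.
	hostLinks := []string{"lo", "eth0", "cni0"}
	sandboxLinks := []string{"lo", "eth0", "tap0_urunc"}

	n, err := cleanupMatching(hostLinks, noop)
	fmt.Println("host netns:   ", n, err)

	n, err = cleanupMatching(sandboxLinks, noop)
	fmt.Println("sandbox netns:", n, err)
}
```

Both calls return a `nil` error, so a caller cannot tell "nothing to clean" apart from "looked in the wrong namespace" without also checking the count.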

### Reproduction

1. Start a unikernel container:
   ```
   nerdctl run -d --name uk1 --runtime io.containerd.urunc.v2 <unikernel-image>
   PID=$(nerdctl inspect uk1 -f '{{.State.Pid}}')
   nsenter -t $PID -n ip link show | grep urunc   # tap0_urunc present
   ```
2. Kill the container:
   ```
   nerdctl kill uk1
   ```
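3. The sandbox netns persists (held by the pause container in k8s, or by other references). The process behind `$PID` may be gone after the kill, so re-enter through a process that still holds the netns (`<sandbox-pid>`, as in step 4) and verify:
   ```
   nsenter -t <sandbox-pid> -n ip link show
   nsenter -t <sandbox-pid> -n tc qdisc show
   ```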
   The TAP device, ingress qdiscs, and tc redirect filters are still present.
4. Loop to make accumulation visible:
   ```
   for i in $(seq 50); do
     nerdctl run -d --name uk$i --runtime io.containerd.urunc.v2 <image>
     nerdctl kill uk$i
   done
   nsenter -t <sandbox-pid> -n ip link | grep urunc | wc -l   # 50, expected 0
   ```

### Proposed fix

Add `runtime.LockOSThread()` + `defer runtime.UnlockOSThread()` at the top of `Kill()`, and snapshot/restore the original netns so the locked OS thread is not handed back to the runtime pool while still pointing at the sandbox netns.

PR: #588
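For illustration, the lock plus snapshot/restore pattern described above might look roughly like the sketch below. It is not the code from PR #588. `withNetNS` and `setnsFunc` are hypothetical names, the `/proc/thread-self/ns/net` path assumes Linux procfs, and the `setns` step is injected so the example runs unprivileged. Real code would presumably pass a wrapper around `unix.Setns(fd, unix.CLONE_NEWNET)` from `golang.org/x/sys/unix`.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// setnsFunc moves the calling OS thread into the netns behind fd.
// Assumption: production code would wrap unix.Setns(fd, unix.CLONE_NEWNET).
type setnsFunc func(fd int) error

// withNetNS (hypothetical helper) runs fn inside the netns at targetPath with
// the goroutine pinned to one OS thread, then switches that thread back.
func withNetNS(targetPath string, setns setnsFunc, fn func() error) (err error) {
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()

	// Snapshot the netns of this exact thread (Linux procfs).
	orig, err := os.Open("/proc/thread-self/ns/net")
	if err != nil {
		return fmt.Errorf("snapshot current netns: %w", err)
	}
	defer orig.Close()

	target, err := os.Open(targetPath)
	if err != nil {
		return fmt.Errorf("open target netns: %w", err)
	}
	defer target.Close()

	if err := setns(int(target.Fd())); err != nil {
		return fmt.Errorf("enter target netns: %w", err)
	}
	// Restore before the deferred unlock runs (defers run last-in, first-out).
	defer func() {
		if rerr := setns(int(orig.Fd())); rerr != nil && err == nil {
			err = fmt.Errorf("restore original netns: %w", rerr)
		}
	}()

	return fn() // calls made directly in fn stay on the pinned thread
}

func main() {
	// Logging fake so the sketch runs without CAP_SYS_ADMIN.
	logSetns := func(fd int) error {
		fmt.Println("setns on fd", fd)
		return nil
	}
	err := withNetNS("/proc/self/ns/net", logSetns, func() error {
		fmt.Println("stop VMM and clean up TAPs here")
		return nil
	})
	fmt.Println("err:", err)
}
```

One design point worth weighing: if the restore step fails, the thread is still in the sandbox netns. Because a locked thread is terminated when its goroutine exits without unlocking, a stricter variant could skip the unlock in that case rather than return the thread to the pool.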

### Related context

- PR #476 (merged) refactored `CleanupAllUruncTaps` to scan via regex but did not address thread-locking.
- PR #162 ("Fix force deleting containers") is an earlier closed iteration of the kill path; the missing `LockOSThread` predates and survived both.

### Checklist

- [x] I have read the [contribution guide](https://urunc.io/developer-guide/contribute/).

