fix(kill): lock OS thread around sandbox netns join to prevent TAP leak#588
Closed
mn-ram wants to merge 1 commit into
Closed
fix(kill): lock OS thread around sandbox netns join to prevent TAP leak#588mn-ram wants to merge 1 commit into
mn-ram wants to merge 1 commit into
Conversation
✅ Deploy Preview for urunc canceled.
|
Closed
1 task
Unikontainer.Kill() called joinSandboxNetNs() (which performs
unix.Setns(CLONE_NEWNET) on the calling OS thread only) without first
calling runtime.LockOSThread(). Between the setns and the subsequent
netlink work in network.CleanupAllUruncTaps(), the Go scheduler is
free to migrate the goroutine to another M still bound to the host
network namespace. When that happens, netlink.NewHandle() inside
CleanupAllUruncTaps creates its socket in the wrong namespace, the
^tap\d+_urunc$ scan finds zero matches, and Kill() returns success
while leaving the sandbox's TAP device, ingress qdisc, and tc
redirect filters in place.
The contract was already documented on joinSandboxNetNs ("This
function should be called only from a locked thread (i.e.
runtime.LockOSThread())") and respected in Exec(); Kill() simply
missed it.
Lock the OS thread for the duration of Kill(), and snapshot/restore
the original netns so the locked thread is not returned to the
runtime pool while still pointing at the sandbox netns.
Reproducer:
for i in $(seq 50); do
nerdctl run -d --name uk$i --runtime io.containerd.urunc.v2 <img>
nerdctl kill uk$i
done
nsenter -t <sandbox-pid> -n ip link | grep urunc | wc -l
Before: 50. After: 0.
Signed-off-by: mn-ram <235066282+mn-ram@users.noreply.github.com>
90a6a89 to
80f887e
Compare
Author
|
Closing — the premise of this PR was incorrect. As @cmainas pointed out on #587, both callers of |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Unikontainer.Kill()calledjoinSandboxNetNs()(which performsunix.Setns(CLONE_NEWNET)on the calling OS thread only) without first callingruntime.LockOSThread(). Between thesetnsand the netlink-backednetwork.CleanupAllUruncTaps()call later in the function, the Go scheduler is free to migrate the goroutine to another M still bound to the host network namespace. When that happens,netlink.NewHandle()insideCleanupAllUruncTapscreates its socket in the wrong namespace, the^tap\d+_urunc$scan finds zero matches, andKill()returns success while leaving the sandbox's TAP device, ingress qdisc, and tc redirect filters in place.The contract was already documented on
joinSandboxNetNsitself ("This function should be called only from a locked thread (i.e.runtime.LockOSThread())") and respected inExec()at line 537;Kill()simply missed it.This change locks the OS thread for the duration of
Kill(), and snapshots/restores the original netns so the locked thread is not handed back to the runtime pool while still pointing at the sandbox netns.Related issues
How was this tested?
make lint/go vet ./pkg/...clean.go build ./...clean.Checklist
make lint).