[vsock] cleanup half-closed connections #1101

Merged
papertigers merged 6 commits into master from spr/papertigers/vsock-cleanup-half-closed-connections
Apr 4, 2026

Contributor

@papertigers papertigers commented Apr 2, 2026

There are two scenarios under which connections are considered "closing", and we need to make sure that we handle both.

Scenario 1: The host sends a SHUTDOWN packet with VsockPacketFlags::VIRTIO_VSOCK_SHUTDOWN_F_SEND | VsockPacketFlags::VIRTIO_VSOCK_SHUTDOWN_F_RECEIVE. Under normal circumstances the guest is supposed to ack this with an eventual RST packet. If we don't see the RST in a timely manner, we send our own RST and clean up the connection.

Scenario 2: The guest sends us a SHUTDOWN with VsockPacketFlags::VIRTIO_VSOCK_SHUTDOWN_F_SEND | VsockPacketFlags::VIRTIO_VSOCK_SHUTDOWN_F_RECEIVE, but the internal ring buffer that is flushing to the underlying socket has not yet finished draining all data. After a reasonable amount of time we need to send a RST to ack the SHUTDOWN and clean up the tracked connection state.
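
As a rough illustration of the two scenarios, here is a minimal sketch of how a closing connection might resolve. The names `Waiting`, `Event`, `Action`, `step`, and the 2-second `QUIESCE_TIMEOUT` are all hypothetical and not taken from this PR; the assumption that a drained buffer is answered with an RST is also mine.

```rust
use std::time::{Duration, Instant};

/// Assumed timeout value; the real constant is not shown in this PR text.
const QUIESCE_TIMEOUT: Duration = Duration::from_secs(2);

/// What a closing connection is waiting for (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Waiting {
    /// Scenario 1: we sent SHUTDOWN and expect the guest's RST.
    GuestRst,
    /// Scenario 2: the guest sent SHUTDOWN and we are still draining
    /// the ring buffer to the underlying socket.
    BufferDrain,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Event {
    GuestRst,
    BufferDrained,
    Tick(Instant),
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Action {
    KeepWaiting,
    /// The peer already reset the connection; just drop our state.
    Cleanup,
    /// Send our own RST, then drop our state.
    SendRstAndCleanup,
}

fn step(waiting: Waiting, started: Instant, event: Event) -> Action {
    match (waiting, event) {
        // Scenario 1, happy path: the guest acked our SHUTDOWN.
        (Waiting::GuestRst, Event::GuestRst) => Action::Cleanup,
        // Scenario 2, happy path: data is flushed, ack the guest's SHUTDOWN.
        (Waiting::BufferDrain, Event::BufferDrained) => Action::SendRstAndCleanup,
        // Either scenario: give up once the timeout has elapsed.
        (_, Event::Tick(now))
            if now.saturating_duration_since(started) > QUIESCE_TIMEOUT =>
        {
            Action::SendRstAndCleanup
        }
        _ => Action::KeepWaiting,
    }
}
```

Passing `now` in explicitly, rather than calling `Instant::now()` inside `step`, keeps the timeout path easy to exercise in a test.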

Created using jj-spr 0.1.0

[skip ci]
Created using jj-spr 0.1.0
@papertigers papertigers requested a review from iximeow April 2, 2026 17:44
Member

@iximeow iximeow left a comment

nice, seems good. in particular, if I'm misunderstanding the host-side-closed bits then 👍 shipit, but if I'm right then 👍 shipit-if-you-want-but-maybe-twiddling-that-would-be-useful.

in particular I'm thinking about quiescing connections eventually holding up quiescing devices before a migration or shutdown or whatever, so if we're leaving ourselves holding for a few seconds unnecessarily it'll end up uncomfortable.

Comment on lines +618 to +621
self.quiescing.push_back(ClosingConn {
    key,
    started: Instant::now(),
});
Member

if the guest has already closed their end of the connection (sent a RST, say), could we be here reading from the other end (say VM RoT) which has only just closed its end of the socket?

if so we'd push this connection - which probably should be ConnState::Closing { read: true, write: true }? - into the quiescing queue and close it in a few seconds? in which case maybe this is a point where we'd want to conn.shutdown_guest_write() and check conn.should_close() at some point here?

Contributor Author

So ConnState::Closing is, importantly, tracking the guest's willingness to send or read data; let me adjust the comment there.

The spot this comment is on is where we see EOF from the underlying host socket (the attestation server or something else), so we are sending the SHUTDOWN packet and saying we will not be accepting any more data and will not be sending any more data.

This is what the spec has to say about a graceful shutdown:

Clean disconnect is achieved by one or more VIRTIO_VSOCK_OP_SHUTDOWN packets that indicate no
more data will be sent and received, followed by a VIRTIO_VSOCK_OP_RST response from the peer. If
no VIRTIO_VSOCK_OP_RST response is received within an implementation-specific amount of time, a
VIRTIO_VSOCK_OP_RST packet is sent to forcibly disconnect the socket.

Does this still trigger alarm bells for you? It's possible I am overlooking something.
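
To make the distinction concrete, here is a minimal sketch of guest-side shutdown tracking. It borrows the `ConnState::Closing { read, write }` shape and the `should_close` name from this thread, but the bodies, the `Established` variant, the `on_guest_shutdown` helper, and the flag bit values (as I understand them from the virtio vsock spec) are assumptions rather than the code in this PR.

```rust
// Flag bits, assumed from the virtio vsock spec: bit 0 is RECEIVE, bit 1 is SEND.
const VIRTIO_VSOCK_SHUTDOWN_F_RECEIVE: u32 = 1 << 0;
const VIRTIO_VSOCK_SHUTDOWN_F_SEND: u32 = 1 << 1;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ConnState {
    /// Hypothetical variant for a fully open connection.
    Established,
    /// `read`: the guest will not receive any more data.
    /// `write`: the guest will not send any more data.
    Closing { read: bool, write: bool },
}

impl ConnState {
    /// Fold the flags of a guest SHUTDOWN packet into the state (hypothetical helper).
    fn on_guest_shutdown(self, flags: u32) -> ConnState {
        let (mut read, mut write) = match self {
            ConnState::Established => (false, false),
            ConnState::Closing { read, write } => (read, write),
        };
        read |= (flags & VIRTIO_VSOCK_SHUTDOWN_F_RECEIVE) != 0;
        write |= (flags & VIRTIO_VSOCK_SHUTDOWN_F_SEND) != 0;
        if read || write {
            ConnState::Closing { read, write }
        } else {
            // No shutdown bits seen so far: nothing changes.
            self
        }
    }

    /// The guest is done in both directions.
    fn should_close(self) -> bool {
        matches!(self, ConnState::Closing { read: true, write: true })
    }
}
```

In this model the state only records what the guest has announced. A host-side EOF is handled separately, by sending our own SHUTDOWN and waiting for the guest's RST.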

Contributor Author

Discussed this further over chat and added a comment in bf80e3b

Created using jj-spr 0.1.0
Created using jj-spr 0.1.0
Member

@hawkw hawkw left a comment

This looks reasonable to me! I had some smallish suggestions, but no major issues.

}

fn quiesce_connections(&mut self) {
    #[allow(unstable_name_collisions)]
Member

um, what the heck is "unstable name collisions"? is this bad?

Member

oh, this is because we are reimplementing the VecDeque method that isn't stable yet, isn't it? could we maybe say something about that?

Member

yeah that was because of my suggestion: #1101 (comment)

fwiw i don't sweat the allow() because we should bump propolis to 1.93 after R19 meaning this goes away.. soon

Contributor Author

I left an additional comment at the call site.
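
For readers unfamiliar with the lint, here is a minimal sketch of the pattern being discussed: a local extension trait that gives `VecDeque` a `pop_front_if` method on toolchains where the standard one is not yet stable. The trait name `PopFrontIf` and the helper `drain_small` are hypothetical, and the signature is my assumption of what the standard method looks like; this is not the code from the PR.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for the not-yet-stable `VecDeque::pop_front_if`.
trait PopFrontIf<T> {
    fn pop_front_if(&mut self, predicate: impl FnOnce(&mut T) -> bool) -> Option<T>;
}

impl<T> PopFrontIf<T> for VecDeque<T> {
    fn pop_front_if(&mut self, predicate: impl FnOnce(&mut T) -> bool) -> Option<T> {
        // Only pop when there is a front element and the predicate accepts it.
        let front = self.front_mut()?;
        if predicate(front) {
            self.pop_front()
        } else {
            None
        }
    }
}

// The method-call syntax below shares its name with a standard library method.
// While that method is unstable, rustc warns that the call may resolve
// differently in the future; the `allow` acknowledges this.
#[allow(unstable_name_collisions)]
fn drain_small(queue: &mut VecDeque<u32>, limit: u32) -> Vec<u32> {
    let mut out = Vec::new();
    while let Some(v) = queue.pop_front_if(|v| *v < limit) {
        out.push(v);
    }
    out
}
```

Keeping the trait method's signature identical to the standard one means the call site behaves the same whichever method the compiler resolves to, so the trait and the `allow` can be deleted after a toolchain bump without touching callers.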

Comment on lines +692 to +694
while let Some(conn) = self.quiescing.pop_front_if(|conn| {
    conn.started.elapsed() > DEFAULT_QUIESCE_TIMEOUT
}) {
Member

so, IIUC, there shouldn't be any situation in which an entry that is not at the head of the queue has an elapsed quiesce timeout while the head of the queue doesn't? But, it might be worth noting that explicitly, since it is representable here.

Contributor Author

left a comment, let me know if it's clear enough
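
The ordering argument can be sketched as follows, assuming entries are only ever pushed to the back with a `started` time taken at push, so the queue stays sorted by age. `ClosingConn` mirrors the struct in the diff, but the `u32` key type, the 2-second timeout value, and the `expire` helper are assumptions for illustration.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Assumed value; the real constant is not shown in this PR text.
const DEFAULT_QUIESCE_TIMEOUT: Duration = Duration::from_secs(2);

struct ClosingConn {
    key: u32, // hypothetical key type
    started: Instant,
}

/// Pop every expired connection from the front and return the keys.
///
/// Entries are appended in `started` order, so the head is always the
/// oldest. If the head has not expired, nothing behind it can have either,
/// and the sweep can stop at the first live entry.
fn expire(quiescing: &mut VecDeque<ClosingConn>, now: Instant) -> Vec<u32> {
    let mut expired = Vec::new();
    while let Some(front) = quiescing.front() {
        if now.saturating_duration_since(front.started) <= DEFAULT_QUIESCE_TIMEOUT {
            break;
        }
        if let Some(conn) = quiescing.pop_front() {
            expired.push(conn.key);
        }
    }
    expired
}
```

Stopping at the first live entry makes each sweep proportional to the number of expired connections rather than to the queue length.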

Created using jj-spr 0.1.0
@papertigers papertigers changed the base branch from spr/papertigers/master.vsock-cleanup-half-closed-connections to master April 3, 2026 23:09
@papertigers papertigers requested review from hawkw and iximeow April 3, 2026 23:11
Member

@iximeow iximeow left a comment

🫡 🙏

@papertigers papertigers merged commit 7e5ed24 into master Apr 4, 2026
7 checks passed
@papertigers papertigers deleted the spr/papertigers/vsock-cleanup-half-closed-connections branch April 4, 2026 01:18