Data Transfer Stall #7972

Description

@Stebalien

Update (May 28th): We have deployed #8082 to monitor Bitswap with the latest fix, ipfs/go-bitswap#477. We are still seeing the 'this request was not processed in time' error, but it is likely a false positive caused by an incorrect timeout setting in the monitoring tool, which still needs to be calibrated for a gateway node in production. The timeout should become a regular IPFS config option that can be changed through the standard CLI, so that it can be calibrated without restarting the node. (There's a tribute issue about it that might be picked up next week: #8157.)
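To illustrate why the timeout value matters, here is a minimal sketch of a probe timeout that can be changed while the process is running. It is not the code from #8082: `watchdog`, `slowProbe`, `tooShort`, and `calibrated` are hypothetical names, and the durations are arbitrary assumptions.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync/atomic"
	"time"
)

// Assumed values for illustration only; real values need calibration.
const (
	tooShort   = 5 * time.Millisecond
	calibrated = 2 * time.Second
)

var errNotProcessedInTime = errors.New("request was not processed in time")

// watchdog is a hypothetical monitor whose timeout can be updated at runtime.
// The zero value has a timeout of 0, so every check fails until SetTimeout is called.
type watchdog struct {
	timeoutNanos int64 // accessed atomically
}

// SetTimeout changes the timeout without restarting anything.
func (w *watchdog) SetTimeout(d time.Duration) {
	atomic.StoreInt64(&w.timeoutNanos, int64(d))
}

// Timeout returns the currently configured timeout.
func (w *watchdog) Timeout() time.Duration {
	return time.Duration(atomic.LoadInt64(&w.timeoutNanos))
}

// Check runs probe under the current timeout. It assumes probe honours
// context cancellation and returns an error once the context is done.
func (w *watchdog) Check(probe func(context.Context) error) error {
	ctx, cancel := context.WithTimeout(context.Background(), w.Timeout())
	defer cancel()
	err := probe(ctx)
	if err != nil && errors.Is(ctx.Err(), context.DeadlineExceeded) {
		return errNotProcessedInTime
	}
	return err
}

// slowProbe simulates a healthy but slow request that needs 200ms.
func slowProbe(ctx context.Context) error {
	select {
	case <-time.After(200 * time.Millisecond):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

func main() {
	w := &watchdog{}

	w.SetTimeout(tooShort)
	fmt.Println(w.Check(slowProbe)) // timeout shorter than the request: false positive

	w.SetTimeout(calibrated)
	fmt.Println(w.Check(slowProbe)) // same request passes once the timeout is raised
}
```

Storing the timeout atomically lets a config handler update it while checks are in flight; each check reads the value once when it starts.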


Version information:

go-ipfs v0.8.0

Description:

We saw a bitswap stall on some of the preload nodes where they were unable to download any content, even from connected nodes (ipfs/js-ipfs#3577).

This looks related to #5183 but that issue was mostly about specific content. That is, a specific request would fail, and we'd get into a stuck state where we couldn't download that specific content.

The preload nodes did not appear to be spinning, stuck, etc. but restarting them fixed the issue.

Possible causes:

  • Deadlock somewhere.
  • Full wantlist? Maybe we have some cases where wants are getting stuck in the wantlist and never clearing, building up over time?
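One way to test the wantlist hypothesis would be to record when each want is added and periodically report entries that have outlived some threshold. The sketch below is only an illustration: `wantTracker` is a hypothetical stand-in, not the go-bitswap wantlist type, and the five-minute `staleAfter` threshold and `at` helper are assumptions.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// staleAfter is an assumed threshold; a real value would need calibration.
const staleAfter = 5 * time.Minute

// at returns a fixed fake timestamp so the example is deterministic.
func at(minutes int) time.Time {
	return time.Unix(0, 0).Add(time.Duration(minutes) * time.Minute)
}

// wantTracker is a hypothetical stand-in for a wantlist that remembers
// when each want (keyed by a CID string here) was first added.
type wantTracker struct {
	added map[string]time.Time
}

func newWantTracker() *wantTracker {
	return &wantTracker{added: make(map[string]time.Time)}
}

// Add records a want; re-adding an existing want keeps its original timestamp.
func (w *wantTracker) Add(key string, now time.Time) {
	if _, ok := w.added[key]; !ok {
		w.added[key] = now
	}
}

// Remove clears a want, e.g. when the block arrives or the request is cancelled.
func (w *wantTracker) Remove(key string) {
	delete(w.added, key)
}

// Stale returns, in sorted order, the wants that are older than maxAge at time now.
func (w *wantTracker) Stale(now time.Time, maxAge time.Duration) []string {
	var out []string
	for key, t := range w.added {
		if now.Sub(t) > maxAge {
			out = append(out, key)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	w := newWantTracker()
	w.Add("cid-a", at(0))
	w.Add("cid-b", at(0))
	w.Add("cid-c", at(9))
	w.Remove("cid-b") // cid-b was satisfied; cid-a never clears

	// Ten minutes in, only cid-a has exceeded the threshold.
	fmt.Println(w.Stale(at(10), staleAfter))
}
```

A stale count that keeps growing on a long-running node would support the "wants never clearing" explanation; a flat count would point back toward a deadlock.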

Metadata

Labels

  • P1: High: Likely tackled by core team if no one steps up
  • kind/bug: A bug in existing code (including security flaws)
