GET initiated during peer-connection churn stalls silently for 60+s instead of fail-fast / re-route

## Symptom

A user reported that a GET against a low-popularity contract on a local node took 30+ seconds and the HTTP client timed out. A retry one second later succeeded in a reported 18s (the telemetry table below records 13.5s network-side for that retry). Telemetry shows the first GET actually completed network-side after **62 seconds**: it didn't fail, it just stalled.

## Evidence from telemetry (2026-05-19, contract `6FzSeAUKcqJrveKyU8RJgGKc5jRB1Z2juvxXtwTA4Em9`)

Originating peer `jMphpuy7z7PPG4Xt@73.98.109.226` (user's local node):

| Attempt | TX ID | Started (UTC) | Completed (UTC) | `elapsed_ms` |
|---|---|---|---|---|
| 1st (client timed out @30s) | `01KRZ6ADBAT9D0WZYAXC6SQSR2` | ~04:01:20.7 | 04:02:22.462 | **61716** |
| 2nd (retry, succeeded) | `01KRZ6BDQE7DG7W8V2BCDMX0G2` | 04:01:53.823 | 04:02:07.367 | 13544 |

Originating peer's events around the first GET:
- `04:01:19.557` — `disconnect` (peer connection lost)
- `04:01:20.7`  — first GET issued (derived from ULID + `elapsed_ms`)
- `04:01:32.919` — `connect_request_received` + `connect_rejected`
- `04:01:32.999` — `connect_request_sent`
- `04:01:33.321` — `connect_connected` (recovery)
- `04:01:53.823` — retry GET — routes cleanly via 3 hops, succeeds in 13.5s
- `04:02:22.462` — original first GET finally `get_success` (62s after issue)
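
The derived start time of the first GET can be cross-checked from the transaction ID alone. This is a minimal sketch, assuming the TX IDs are standard ULIDs (first 10 Crockford base32 characters carry a 48-bit Unix timestamp in milliseconds); the helper names are illustrative and not part of any Freenet tooling.

```python
from datetime import datetime, timedelta, timezone

# Crockford base32 alphabet used by ULIDs (no I, L, O, U).
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def ulid_timestamp_ms(ulid: str) -> int:
    """Return the millisecond Unix timestamp encoded in a ULID's first 10 chars."""
    if len(ulid) != 26:
        raise ValueError("a ULID is exactly 26 characters long")
    ms = 0
    for ch in ulid[:10].upper():
        ms = ms * 32 + CROCKFORD.index(ch)  # raises ValueError on an invalid char
    return ms


def ulid_datetime(ulid: str) -> datetime:
    """Return the ULID's creation time as a timezone-aware UTC datetime."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(milliseconds=ulid_timestamp_ms(ulid))


if __name__ == "__main__":
    first_get = "01KRZ6ADBAT9D0WZYAXC6SQSR2"
    print(ulid_datetime(first_get).isoformat(timespec="milliseconds"))
```

For the first GET this decodes to 04:01:20.746 UTC on 2026-05-19, which agrees with the completion time minus `elapsed_ms` (04:02:22.462 minus 61716 ms).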

**Notable:** the first transaction emitted **zero `get_request` events** in telemetry, only the eventual `get_success`. Compare with the retry, which emitted 9 `get_request` events (one per hop) within the first 400ms. This suggests the op was held in a pre-routing state while the originating node was reconnecting, with no peer-forwarding telemetry emitted until the eventual response.
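
The same signature (a `get_success` with no `get_request` for the same transaction) could be used to find other affected GETs in the telemetry. The sketch below assumes each record has been parsed into a dict with `tx` and `event` keys; those field names are hypothetical, and the sample records are illustrative stand-ins rather than actual lines from `logs.jsonl`.

```python
from collections import Counter


def silent_gets(events):
    """Return TX IDs that logged get_success without any get_request."""
    requests = Counter()
    succeeded = set()
    for ev in events:
        if ev["event"] == "get_request":
            requests[ev["tx"]] += 1
        elif ev["event"] == "get_success":
            succeeded.add(ev["tx"])
    # Counter returns 0 for transactions that never emitted a get_request.
    return sorted(tx for tx in succeeded if requests[tx] == 0)


# Illustrative records mirroring the two attempts described in this issue.
FIRST = "01KRZ6ADBAT9D0WZYAXC6SQSR2"
RETRY = "01KRZ6BDQE7DG7W8V2BCDMX0G2"
sample = [{"tx": RETRY, "event": "get_request"} for _ in range(9)]
sample.append({"tx": RETRY, "event": "get_success"})
sample.append({"tx": FIRST, "event": "get_success"})

if __name__ == "__main__":
    print(silent_gets(sample))
```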

## Hypothesis

When a GET is initiated just after the originating node has lost peer connections (or otherwise lacks a viable forwarding target), the op appears to be parked in an internal queue until routing recovers. It is not actively re-evaluated, not re-routed once new peers connect (~13s later in this case), and not failed fast back to the caller. It just waits: in this case, 62 seconds.

The client-visible 30s timeout is just the symptom; the underlying issue is that the operation has no fail-fast or active-reroute semantics during peer-state churn.
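
To make the timing concrete, the arithmetic can be worked from the telemetry timestamps in the Evidence section. This is only a worked calculation on those numbers, with the issue time taken as completion minus `elapsed_ms`; the variable names are illustrative.

```python
from datetime import datetime, timedelta


def at(hms: str) -> datetime:
    """Parse an HH:MM:SS.mmm telemetry timestamp on 2026-05-19 (UTC)."""
    return datetime.strptime("2026-05-19 " + hms, "%Y-%m-%d %H:%M:%S.%f")


completed = at("04:02:22.462")                      # get_success of the first GET
issued = completed - timedelta(milliseconds=61716)  # completion minus elapsed_ms
reconnected = at("04:01:33.321")                    # connect_connected

parked_before_recovery = reconnected - issued
stalled_after_recovery = completed - reconnected

print(f"issued at:              {issued.time()}")
print(f"parked before recovery: {parked_before_recovery.total_seconds():.3f}s")
print(f"stalled after recovery: {stalled_after_recovery.total_seconds():.3f}s")
```

On these numbers the GET waited about 12.6s for the connection to recover and then a further 49.1s after `connect_connected`, so most of the stall happened while a peer was already available again.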

## What I'd expect

Some combination of:
- **Fail-fast**: if a GET has zero viable forwarding targets at issue-time, return an error to the client immediately rather than parking it.
- **Active reroute**: when new peers connect while a GET is parked, re-evaluate routing and dispatch.
- **Bounded park time**: explicit timeout on "waiting for routable peer" with telemetry on what we were waiting for.

Today there's apparently no telemetry between op-creation and op-completion in this path. Adding instrumentation for "GET parked waiting for routing" would already help diagnose this class of failures.
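
As a rough illustration of how fail-fast, active reroute, bounded park time, and parked-state telemetry could fit together, here is a minimal asyncio sketch. Everything in it is hypothetical: `Router`, `NoRoutablePeer`, the event names (`get_parked`, `get_unparked`, `get_park_timeout`, `get_no_route`) and the peer-selection placeholder are assumptions for illustration, not Freenet's actual types or telemetry schema.

```python
import asyncio


class NoRoutablePeer(Exception):
    """Raised when a GET cannot be routed within its park budget."""


class Router:
    """Hypothetical stand-in for the node's routing state (not Freenet's API).

    Construct it inside a running event loop.
    """

    def __init__(self, fail_fast=False):
        self.fail_fast = fail_fast
        self.peers = set()
        self.telemetry = []  # (event, key) tuples, illustrative only
        self._peer_available = asyncio.Event()

    def on_connect(self, peer):
        self.peers.add(peer)
        self._peer_available.set()  # active reroute: wake every parked GET

    def on_disconnect(self, peer):
        self.peers.discard(peer)
        if not self.peers:
            self._peer_available.clear()

    def _pick(self, key):
        return min(self.peers)  # placeholder for real peer selection

    async def route_get(self, key, max_park_s):
        """Return a forwarding target, parking for at most max_park_s seconds."""
        if self.peers:
            return self._pick(key)
        if self.fail_fast:
            self.telemetry.append(("get_no_route", key))
            raise NoRoutablePeer(f"no forwarding target for {key}")
        self.telemetry.append(("get_parked", key))
        loop = asyncio.get_running_loop()
        deadline = loop.time() + max_park_s
        while not self.peers:
            try:
                await asyncio.wait_for(
                    self._peer_available.wait(), timeout=deadline - loop.time()
                )
            except asyncio.TimeoutError:
                self.telemetry.append(("get_park_timeout", key))
                raise NoRoutablePeer(
                    f"no routable peer for {key} within {max_park_s}s"
                ) from None
        self.telemetry.append(("get_unparked", key))
        return self._pick(key)


async def demo():
    router = Router()
    parked = asyncio.ensure_future(router.route_get("contract-key", max_park_s=5.0))
    await asyncio.sleep(0.01)      # the GET is parked: no peers yet
    router.on_connect("peer-a")    # recovery wakes the parked GET immediately
    print(await parked, router.telemetry)


if __name__ == "__main__":
    asyncio.run(demo())
```

The point of the sketch is that the parked op is woken by the connect event rather than by a later poll, and that every exit from the parked state leaves a telemetry record, so a stall like the one above would be visible as a `get_parked` with no matching follow-up.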

## Repro

Hard to reliably repro without inducing a disconnect at the right moment, but the conditions are:
1. A node loses one or more peer connections.
2. Within ~1s, a contract GET is issued via HTTP.
3. Result: GET hangs for tens of seconds until either peers recover and routing finally fires, or client times out.

## Cross-reference

- Original investigation: freenet/river debug session 2026-05-19
- Telemetry data: nova OTLP collector logs.jsonl, contract `6FzSeAUKcqJrveKyU8RJgGKc5jRB1Z2juvxXtwTA4Em9`, TX `01KRZ6ADBAT9D0WZYAXC6SQSR2`

[AI-assisted - Claude]

Issue #4154
