fix(retry): context_overflow falls through to the next candidate instead of aborting #28
retry_policies.balanced.context_overflow was the only non-stream kind that aborted the whole request. That predates provider-neutral families (#26): a family like gpt-5.4 now spans candidates with heterogeneous context windows (openai, openrouter, antseed), so an overflow on the first route says nothing about the rest — yet abort killed the request without trying them. retry_same would be futile (same model, same window); next_candidate is not. Now context_overflow falls through like bad_request/timeout. If every candidate overflows the request still ends cleanly in `exhausted: context_overflow`. stream_interrupted stays abort (content already delivered — can't retry). Behavioural test (fails against HEAD): family:gpt-5.4 whose every route returns context_overflow now calls >1 candidate before exhausting, where HEAD aborted on the first — the exact shape of the prod failure (family:gpt-5.4 tried only openrouter, then aborted).
Problem (diagnosed against prod `v-3a2e0ea`)

`retry_policies.balanced.context_overflow = { action = "abort" }` was the only non-stream error kind that aborts the whole request. That predates provider-neutral families (#26): a family like `gpt-5.4` now spans candidates with heterogeneous context windows (`openai`, `openrouter`, `antseed`), so an overflow on the first route says nothing about the rest, yet `abort` killed the request without trying them.

Prod trace of the exact failure: `family:gpt-5.4` tried only `openrouter/gpt-5.4`, got a 400, and aborted.

Change

`context_overflow` now falls through like `bad_request`/`timeout`. `retry_same` would be futile (same model, same window); `next_candidate` is not, because another route may have a larger window. If every candidate overflows, the request still ends cleanly in `exhausted: context_overflow`. `stream_interrupted` stays `abort` (content already delivered, so it can't be retried).

Tests (fail against HEAD)

Behavioural, against the real `config.live.lua`: a `family:gpt-5.4` whose every route returns `context_overflow` now calls more than one candidate before exhausting, where HEAD aborted on the first (reproducing the prod shape: only `openrouter` was tried). 45 passed across live-wiring + flow + policy-ir files.

Sibling PR: `fix/legible-provider-400s` (precise 400 classification + real provider message). With both, a misclassified 400 is no longer fatal and genuine overflow gets a fallback.
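
For reviewers who want the fall-through behaviour in one place, here is a minimal Lua sketch of it. It is not the project's code: only the action names (`abort`, `next_candidate`), the error kinds, and the `exhausted: context_overflow` outcome come from this PR. The table shape for `bad_request`/`timeout`, the `run` helper, the `call` stub, the `aborted: ...` status string, and the candidate order are all assumptions made for illustration.

```lua
-- Hypothetical sketch; helper names and table shapes are assumptions,
-- not the real retry engine or the real config.live.lua.

-- Per-error-kind actions, mirroring the policy described in this PR.
local balanced = {
  context_overflow   = { action = "next_candidate" }, -- was "abort" before this PR
  bad_request        = { action = "next_candidate" }, -- assumed shape
  timeout            = { action = "next_candidate" }, -- assumed shape
  stream_interrupted = { action = "abort" },          -- content already delivered
}

-- Walk a family's candidates in order. `call` stands in for the provider
-- call: it returns nil on success, or an error-kind string on failure.
local function run(policy, candidates, call)
  local tried, last_kind = {}, nil
  for _, candidate in ipairs(candidates) do
    tried[#tried + 1] = candidate
    local kind = call(candidate)
    if kind == nil then
      return { status = "ok", candidate = candidate, tried = tried }
    end
    last_kind = kind
    local rule = policy[kind]
    if rule == nil or rule.action == "abort" then
      -- Unknown kinds are treated as fatal in this sketch.
      return { status = "aborted: " .. kind, tried = tried }
    end
    -- "next_candidate": fall through to the next route.
  end
  return { status = "exhausted: " .. tostring(last_kind), tried = tried }
end

-- Every route overflows: all candidates are tried, then the request exhausts.
local routes = { "openrouter", "openai", "antseed" }
local result = run(balanced, routes, function() return "context_overflow" end)
print(result.status, #result.tried)
```

Swapping `context_overflow` back to `{ action = "abort" }` in this sketch makes `run` stop after the first route, which is the shape of the prod failure described above.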