fix(passthrough): swallow flush replay errors; map Anthropic overloaded_error to 529 (#29187) #29205
…oaded_error to 529 (BerriAI#29187) Signed-off-by: Tai An <antai12232931@outlook.com>
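Greptile Summary
This PR makes two targeted fixes to the Bedrock passthrough + Anthropic overload error path: it wraps the post-stream flush (used only for spend-tracking) in a `try/except`, and it maps Anthropic's in-stream `overloaded_error` to HTTP 529 instead of the former hard-coded 500.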
Confidence Score: 4/5
Both changes are narrow and well-scoped; the flush wrapping adds no new risk to the client response path, and the 529 mapping is straightforwardly additive.
The flush `try/except` silently skips failure-handler invocation when an error is caught, so overloaded passthrough calls will not appear in cost or alerting callbacks. This is a present observability gap, but it does not affect correctness of the client-facing response, which is already closed before the flush runs.
litellm/litellm_core_utils/litellm_logging.py: the exception branch in both flush methods returns without firing any failure handler.
| Filename | Overview |
|---|---|
| litellm/litellm_core_utils/litellm_logging.py | Wraps _flush_passthrough_collected_chunks_helper in try/except for both sync and async flush paths; on error, logs a warning and returns early instead of propagating the exception as an unhandled asyncio task exception. |
| litellm/llms/anthropic/chat/handler.py | Maps overloaded_error in-stream error type to HTTP 529 status instead of the former hard-coded 500, allowing retry middleware to distinguish transient overload from generic server errors. |
Comments Outside Diff (1)
- litellm/litellm_core_utils/litellm_logging.py, lines 2019-2044: No failure handler called on flush error; failed passthrough calls go unobserved
  When `_flush_passthrough_collected_chunks_helper` raises (e.g. on `overloaded_error`), the new code logs a warning and returns without calling `self.failure_handler(...)`. The sync path has the same gap. This means overloaded / errored passthrough calls will never appear in cost-tracking or alerting dashboards as failures; they are silently dropped from observability. If a caller relies on failure callbacks (e.g. for budget enforcement or alert thresholds), those callbacks are never fired for this code path.
  Consider whether `self.failure_handler(e)` / `await self.async_failure_handler(e)` should be invoked inside the `except` block after the warning log, mirroring how non-passthrough streaming errors are handled.
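The reviewer's suggestion can be illustrated with a minimal, self-contained sketch. Everything here is hypothetical: `LoggingSketch`, its handler signatures, and the `replay` callables are stand-ins invented for illustration and do not reproduce litellm's real logging class or its handler arguments. The sketch only shows the control flow: log a warning, notify a failure callback, and let the background task finish cleanly.

```python
import asyncio
import logging

logger = logging.getLogger("passthrough_flush_sketch")


class LoggingSketch:
    """Hypothetical stand-in for the logging object; NOT litellm's real class."""

    def __init__(self, replay):
        self._replay = replay  # stand-in for the chunk-replay helper
        self.successes = []
        self.failures = []

    async def async_success_handler(self, result):
        self.successes.append(result)

    async def async_failure_handler(self, exc):
        self.failures.append(exc)

    async def async_flush(self):
        try:
            result = self._replay()
        except Exception as exc:
            # Warning log, as in the PR.
            logger.warning("passthrough flush replay failed: %s", exc)
            # Reviewer's suggestion (assumed shape): also notify failure callbacks.
            await self.async_failure_handler(exc)
            return
        await self.async_success_handler(result)


def overloaded_replay():
    # Simulates the provider config raising on a buffered error event.
    raise RuntimeError("Overloaded")


def ok_replay():
    return {"spend": 1}


async def demo():
    failing = LoggingSketch(overloaded_replay)
    healthy = LoggingSketch(ok_replay)
    # Both flushes run as background tasks, as the real flush does.
    await asyncio.gather(
        asyncio.create_task(failing.async_flush()),
        asyncio.create_task(healthy.async_flush()),
    )
    return failing, healthy
```

Running `asyncio.run(demo())` leaves one recorded failure on the failing instance and one recorded success on the healthy one, with no exception escaping either task. Whether the real failure handler should fire here is the open design question the review raises.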
Reviews (1): Last reviewed commit: "fix(passthrough): swallow flush replay e..."
The bare <a> in the "Not merged yet" copy on the unmerged-PR page
("BerriAI/litellm#29205 hasn't been merged…") had no explicit color and
fell back to the browser default #0000EE on the #111111 surface — 2:1
contrast, well below WCAG AA's 4.5:1 floor. axe-core caught it on the
unmerged-PR page once we added coverage there.
Adds .answer-date a styling matching the established pattern
(.answer-meta a / .pr-banner a / .sec-label a): --text at rest with a
muted underline, --accent on hover.
Also:
- a11y-contrast.test.ts: new case covering the open-PR page (regression
gate for the dark-blue-link bug shape).
- a11y-full-audit.test.ts: gated re-runnable sweep that runs the full
WCAG 2.1 AA ruleset across every distinct page state. Opt-in via
A11Y_AUDIT=1 — review tool, not a per-PR gate.
- validate.sh: now runs the chromium contrast suite pre-push when
playwright's chromium is installed; gracefully skips with a one-liner
install hint otherwise. CI's a11y job remains the authoritative gate.
Matches the shellcheck/actionlint/osv tier.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Closes #29187.
Two small, scoped changes that together turn the Bedrock-passthrough
"client sees `Internal server error`, no signal to retry" failure into a
proper 529 surfacing path, and stop the `Task exception was never retrieved`
log spam during Anthropic overloads.
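For readers unfamiliar with that log line, here is a minimal standalone repro using plain `asyncio` only. It contains no litellm code; `replay` is a hypothetical stand-in for a fire-and-forget flush task that raises. The sketch installs a custom loop exception handler so the message asyncio would normally log can be captured and inspected.

```python
import asyncio
import gc


async def replay():
    # Hypothetical stand-in for a post-stream flush hitting an error event.
    raise RuntimeError("Overloaded")


async def main():
    seen = []
    loop = asyncio.get_running_loop()
    # Capture what asyncio would otherwise report through its default handler.
    loop.set_exception_handler(lambda _loop, ctx: seen.append(ctx["message"]))

    task = asyncio.create_task(replay())  # scheduled, but never awaited
    await asyncio.sleep(0)                # yield so the task runs and fails
    del task                              # drop the last reference to the task
    gc.collect()                          # prompt finalization if cycles exist
    return seen
```

Calling `asyncio.run(main())` returns the captured messages. Nobody ever retrieves the task's exception, so asyncio reports it when the task object is finalized. Catching the exception inside the task body avoids the report entirely.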
1. `litellm_core_utils/litellm_logging.py`: wrap flush replay in `try/except`

`flush_passthrough_collected_chunks` and its async sibling are scheduled
via `asyncio.create_task` in the `finally` block from PR #26719 (v1.84.0
onwards). They replay the buffered stream through the provider config
purely for spend-tracking / success-logging; the client HTTP response is
already fully closed by the time we get here.

If the buffered stream contains a mid-stream error event (Anthropic's
`overloaded_error` is the visible one for Claude Code via Bedrock,
covered in the issue), the provider config raises during chunk replay,
the unhandled exception escapes the async task, and the user sees a
`Task exception was never retrieved` traceback in the logs, while the
real surfacing path (the actual HTTP response) is already gone. Wrapping
the replay in `try/except` logs it as a warning, skips the success
handler (no spend to track for a failed call anyway), and lets the
asyncio task complete cleanly. This is symmetric across the
sync (`flush_passthrough_collected_chunks`) and async
(`async_flush_passthrough_collected_chunks`) paths.

2. `llms/anthropic/chat/handler.py`: pick status from `error.type`

`ModelResponseIterator.chunk_parser`'s `type == "error"` branch
hard-coded `status_code=500` for every in-stream error event, with a
comment noting Anthropic doesn't return an HTTP status in the chunk.
Anthropic's public convention is 529 for `overloaded_error` (their
own SDK / docs use this code for the "API temporarily overloaded" case).
Mapping `error.type == "overloaded_error"` → 529 lets retry middleware
and observers distinguish a transient/retryable overload from a generic
500 server error. Any other `error.type` keeps the existing 500 default.

Test plan
- `AnthropicError('Overloaded')` raised at `anthropic/chat/handler.py:851` is now wrapped at the `litellm_logging.py:1996/2010` callsites; no more `Task exception was never retrieved`.
- … streaming path, now propagates as a 529 rather than a 500; clients
  can choose to retry.
- `python -m py_compile` both files.
- … `complete_streaming_response is None` (existing path) or replay succeeds, control flow is unchanged:
  `success_handler` / `async_success_handler` is still invoked exactly as before.
- … neighboring style in the file).
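The status selection described in change 2 can be sketched as a small pure function with a few checks. This is an illustrative sketch, not the code in `handler.py`: the function name `status_for_stream_error` is hypothetical, and the chunk shape assumed here (`{"type": "error", "error": {"type": ..., "message": ...}}`) follows Anthropic's documented streaming error event.

```python
from typing import Any, Mapping

OVERLOADED_STATUS = 529  # Anthropic's convention for overloaded_error
DEFAULT_STATUS = 500     # existing default for every other in-stream error


def status_for_stream_error(chunk: Mapping[str, Any]) -> int:
    """Pick an HTTP-style status for an in-stream `type == "error"` event.

    Hypothetical helper: the stream itself carries no HTTP status, so the
    status is inferred from `error.type`.
    """
    error = chunk.get("error") or {}
    if error.get("type") == "overloaded_error":
        return OVERLOADED_STATUS
    return DEFAULT_STATUS


overloaded = {
    "type": "error",
    "error": {"type": "overloaded_error", "message": "Overloaded"},
}
generic = {"type": "error", "error": {"type": "api_error", "message": "boom"}}

assert status_for_stream_error(overloaded) == 529
assert status_for_stream_error(generic) == 500
assert status_for_stream_error({"type": "error"}) == 500  # missing payload
```

Keeping 500 as the fallback means only the one error type with a well-known dedicated status changes behavior; everything else is reported as before.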
Out of scope
`anthropic_messages` route mid-stream fallback (#24004, "[Bug]: Mid-stream fallback not supported for anthropic_messages route type")
and `/v1/messages` `async_sse_wrapper` error handling (#24609, "[Bug]: No Error Handling in /v1/messages Path") are
separate surfacing paths; this PR only addresses the Bedrock
passthrough flush path called out in #29187 ("Bedrock passthrough requests appear to hang then fail with "Internal server error" when Anthropic is overloaded") (root + secondary
finding).
🤖 Generated with Claude Code