Bug Summary
litellm.responses() / litellm.aresponses() silently drops the timeout parameter when routing through the completion transformation path (used by Anthropic, Vertex AI Claude, Bedrock, and any provider without a native Responses API config). The timeout works correctly on the native Responses API path (OpenAI, Azure).
This means Router(timeout=40) is a no-op for Anthropic models using the Responses API. The actual timeout falls back to the Anthropic SDK default (~600s).
Production Impact
- Incident: A single aresponses() call to claude-sonnet-4-6 hung for 602.7 seconds despite Router(timeout=40).
- Root cause: The 602.7s matches the Anthropic SDK default timeout (~600s), confirming the Router's timeout=40 was never enforced.
Bug Location
File: litellm/responses/main.py — responses() function
The problem (line ~1108 on current main)
# COMPLETION TRANSFORMATION PATH (Anthropic, Bedrock, etc.)
# timeout is a NAMED PARAMETER of responses(), so it is NOT in **kwargs
if responses_api_provider_config is None or use_chat_completions_api is True:
    return litellm_completion_transformation_handler.response_api_handler(
        model=model,
        input=input,
        responses_api_request=response_api_optional_params,
        custom_llm_provider=custom_llm_provider,
        _is_async=_is_async,
        stream=stream,
        extra_headers=extra_headers,
        extra_body=extra_body,
        **kwargs,  # <-- timeout is NOT here, it was consumed as a named param
    )
The working path for comparison (line ~1169 on current main)
# NATIVE RESPONSES API PATH (OpenAI, Azure)
response = base_llm_http_handler.response_api_handler(
    model=model,
    input=input,
    ...
    timeout=timeout or request_timeout,  # <-- timeout IS explicitly forwarded here
    ...
)
Root Cause
timeout is declared as a named parameter in the responses() function signature:
def responses(
    input: ...,
    model: ...,
    ...
    timeout: Optional[Union[float, httpx.Timeout]] = None,  # <-- consumed here
    ...
    **kwargs,  # <-- timeout is NOT in kwargs
):
Because timeout is a named parameter, Python binds the caller's value to that parameter, so it never appears in **kwargs. When the completion transformation path passes **kwargs to response_api_handler(), the timeout value is silently lost.
The native path explicitly forwards timeout=timeout or request_timeout, but the completion transformation path does not.
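The mechanism can be reproduced without litellm. The following is a minimal sketch, assuming hypothetical stand-in functions (`downstream_handler`, `entrypoint_buggy`, `entrypoint_fixed`) that only mimic the shape of the real call chain:

```python
from typing import Any, Optional


def downstream_handler(**kwargs: Any) -> Optional[float]:
    """Hypothetical stand-in for the handler that eventually applies the timeout."""
    return kwargs.get("timeout")


def entrypoint_buggy(
    model: str, timeout: Optional[float] = None, **kwargs: Any
) -> Optional[float]:
    # `timeout` is bound to the named parameter, so it never appears in kwargs.
    return downstream_handler(model=model, **kwargs)


def entrypoint_fixed(
    model: str, timeout: Optional[float] = None, **kwargs: Any
) -> Optional[float]:
    # Forwarding the named parameter explicitly makes it visible to the callee.
    return downstream_handler(model=model, timeout=timeout, **kwargs)


print(entrypoint_buggy("demo-model", timeout=40))  # None
print(entrypoint_fixed("demo-model", timeout=40))  # 40
```

The buggy variant raises no error and emits no warning, which is why the dropped value goes unnoticed until a request hangs.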
Fix
Add timeout=timeout to the response_api_handler() call on the completion transformation path:
if responses_api_provider_config is None or use_chat_completions_api is True:
    return litellm_completion_transformation_handler.response_api_handler(
        model=model,
        input=input,
        responses_api_request=response_api_optional_params,
        custom_llm_provider=custom_llm_provider,
        _is_async=_is_async,
        stream=stream,
        extra_headers=extra_headers,
        extra_body=extra_body,
        timeout=timeout,  # <-- ADD THIS
        **kwargs,
    )
The downstream handler (LiteLLMCompletionTransformationHandler.response_api_handler) already accepts **kwargs, so timeout will flow through to acompletion() / completion() calls automatically.
Also check: extra_body was recently added to the forwarded params. Other named parameters that may also be silently dropped on this path (same class of bug):
| Parameter | Forwarded on completion path? | Forwarded on native path? |
| --- | --- | --- |
| timeout | NO | YES (timeout or request_timeout) |
| extra_query | NO | YES |
| max_output_tokens | NO (in response_api_optional_params) | YES |
| temperature | NO (in response_api_optional_params) | YES |
| top_p | NO (in response_api_optional_params) | YES |
timeout and extra_query are the most critical missing ones since they are NOT included in response_api_optional_params either.
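An audit like the table above could be partly automated by comparing a function's named parameters against the keywords its call site forwards. This is a rough sketch only: `responses_stub` is a hypothetical, trimmed-down signature (the real responses() has many more parameters), and the forwarded set is copied by hand from the call shown under Bug Location.

```python
import inspect
from typing import Any, Callable, Optional, Set


def unforwarded_params(func: Callable[..., Any], forwarded: Set[str]) -> Set[str]:
    """Return the named parameters of `func` that are missing from `forwarded`."""
    named_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    named = {
        name
        for name, param in inspect.signature(func).parameters.items()
        if param.kind in named_kinds  # skips *args and **kwargs
    }
    return named - forwarded


# Hypothetical stub: only the parameters discussed in this report.
def responses_stub(
    input: str,
    model: str,
    extra_headers: Optional[dict] = None,
    extra_query: Optional[dict] = None,
    extra_body: Optional[dict] = None,
    timeout: Optional[float] = None,
    **kwargs: Any,
) -> None:
    ...


# Keywords passed explicitly on the completion transformation path.
forwarded_on_completion_path = {"model", "input", "extra_headers", "extra_body"}

print(sorted(unforwarded_params(responses_stub, forwarded_on_completion_path)))
# ['extra_query', 'timeout']
```

Parameters that travel inside response_api_optional_params would need to be added to the forwarded set by hand, since a signature comparison alone cannot see them.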
Precedent
This is the exact same class of bug that was fixed in #22544 for the metadata parameter. That PR added metadata=metadata to the completion transformation path call. The same fix pattern applies to timeout.
Reproduction
import asyncio
import litellm
from litellm import Router

router = Router(
    model_list=[{
        "model_name": "claude-sonnet",
        "litellm_params": {
            "model": "anthropic/claude-sonnet-4-6",
            "api_key": "sk-ant-...",
        },
    }],
    timeout=5,  # 5 second timeout
)

async def test():
    # This will NOT timeout after 5s for Anthropic models.
    # It will use the Anthropic SDK default timeout (~600s).
    response = await router.aresponses(
        model="claude-sonnet",
        input="Write a very long essay about the history of computing",
        stream=True,
    )

asyncio.run(test())
Test results
- Simple model name + slow server: timeout fires correctly at ~5s
- anthropic/claude-sonnet-4-6 non-streaming: 201.2s instead of ~5s (FAIL)
- anthropic/claude-sonnet-4-6 streaming (the incident path): 60.3s, hung until the slow server responded (FAIL, timeout completely ignored)
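Timings like these can be gathered with a small harness that records how long an awaited call takes, whether it returns or raises. The sketch below is illustrative rather than the harness actually used: `slow_call` is a hypothetical stand-in for `router.aresponses(...)` against a slow server, so the block runs offline.

```python
import asyncio
import time
from typing import Any, Awaitable, Tuple


async def timed(awaitable: Awaitable[Any]) -> Tuple[Any, float]:
    """Await `awaitable` and return (result_or_exception, elapsed_seconds)."""
    start = time.perf_counter()
    try:
        result: Any = await awaitable
    except Exception as exc:  # a timeout error is an expected outcome here
        result = exc
    return result, time.perf_counter() - start


async def slow_call(delay: float) -> str:
    # Hypothetical stand-in for a request to a slow upstream server.
    await asyncio.sleep(delay)
    return "done"


async def main() -> float:
    result, elapsed = await timed(slow_call(0.2))
    print(f"{type(result).__name__}: {elapsed:.2f}s")
    return elapsed


elapsed = asyncio.run(main())
```

With a configured timeout of 5s, an elapsed time well above 5s (or a normal result where a timeout exception was expected) indicates the timeout was not applied.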
Introspection test result
Monkey-patching litellm.aresponses confirmed that Router does pass timeout=5 to litellm.aresponses(). The bug is inside litellm.aresponses() / responses() itself — it receives the timeout but drops it before calling acompletion().
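The introspection approach can be sketched as a spy that wraps the target function and records the keyword arguments of each call. To stay runnable offline, this sketch patches `fake_litellm`, a hypothetical stand-in namespace, instead of the real litellm module; `install_spy` and `captured` are likewise illustrative names.

```python
import asyncio
import types
from typing import Any, Dict, List

# Hypothetical stand-in for the `litellm` module.
fake_litellm = types.SimpleNamespace()


async def _original_aresponses(**kwargs: Any) -> str:
    return "ok"


fake_litellm.aresponses = _original_aresponses

captured: List[Dict[str, Any]] = []


def install_spy(module: Any, attr: str) -> None:
    """Wrap module.<attr> so each call's keyword arguments are recorded."""
    original = getattr(module, attr)

    async def spy(*args: Any, **kwargs: Any) -> Any:
        captured.append(dict(kwargs))
        return await original(*args, **kwargs)

    setattr(module, attr, spy)


install_spy(fake_litellm, "aresponses")
asyncio.run(fake_litellm.aresponses(model="claude-sonnet", input="hi", timeout=5))
print(captured[0].get("timeout"))  # 5
```

A spy installed this way only observes callers that look up the attribute on the module at call time; code that imported the function by name before the patch keeps its reference to the original.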
Affected Providers
Any provider that goes through the completion transformation path (i.e., does NOT have a native BaseResponsesAPIConfig):
- Anthropic (anthropic/claude-*)
- Vertex AI (vertex_ai/claude-*)
- Bedrock (bedrock/anthropic.*)
- Any other provider without native Responses API support
Related Issues/PRs
- metadata not forwarded (same bug class)
- litellm_settings.request_timeout #25591: Fixed timeout not fetched from litellm_settings (different but related)