fix(openai-responses): strip Anthropic cache_control from Responses API requests by cwang-otto · Pull Request #28431 · BerriAI/litellm

cwang-otto · 2026-05-21T02:44:49Z

Title

fix(openai-responses): strip Anthropic cache_control from Responses API requests

Relevant issues

OpenAI's Responses API rejects unknown fields on input content blocks with HTTP 400:

BadRequestError: Unknown parameter: 'input[0].content[0].cache_control'.

This bites callers who layer the Anthropic cache-control hook above a fallback chain whose primary route is the native OpenAI Responses API. The chat completions path already strips cache_control via remove_cache_control_flag_from_messages_and_tools (litellm/llms/openai/chat/gpt_transformation.py); the Responses path didn't.

Pre-Submission checklist

Made sure tests pass
Added new tests (3 new cases — input strip, tool strip, no-cache passthrough)
No regressions in openai/responses (103), chatgpt/responses (14), azure/response (22) — 139 passed
Mypy clean on touched file
Live-verified end-to-end against real OpenAI Responses API

Type

🐛 Bug Fix

Changes

litellm/llms/openai/responses/transformation.py

transform_responses_api_request now calls remove_cache_control_flag_from_input_and_tools after _validate_input_param.
New helper is the sibling of the chat path's remove_cache_control_flag_from_messages_and_tools: same signature shape (model, input, tools), same recursive primitive (filter_value_from_dict), strips from both input content blocks and tools for symmetry.
Returns Tuple[Union[str, ResponseInputParam], Optional[List[ALL_RESPONSES_API_TOOL_PARAMS]]] — precise types so callers stay mypy-strict.
model: str parameter is unused today but mirrors the chat helper, leaving room for subclasses to selectively skip stripping (same hook the chat path comment calls out).

tests/test_litellm/llms/openai/responses/test_openai_responses_transformation.py

test_transform_strips_cache_control_from_input_content_blocks — the failing case (matches the live OpenAI 400 wording).
test_transform_strips_cache_control_from_tools — symmetry with chat path.
test_transform_preserves_input_without_cache_control — no-op regression guard.

Proof

Unit tests + mypy

Live verification (real OpenAI Responses API)

Ran a script that calls litellm.aresponses(model="gpt-4o-mini", input=…, tools=…) with cache_control: {"type": "ephemeral"} injected on both an input content block and a tool definition, then performs the same call directly through the raw openai SDK as a negative control:

[1] Calling litellm.aresponses() with cache_control on input + tools…
    OK — response text: 'pong'

[2] Calling raw OpenAI SDK with cache_control on input (no strip)…
    OK — OpenAI rejected as expected: BadRequestError
    error: Error code: 400 - {'error': {'message': "Unknown parameter: 'input[0].content[0].cache_control'.", 'type': 'invalid_request_error', 'param': 'input[0].content[0].cache_control', 'code': 'unknown_param…

=== Summary ===
  litellm strip works (200 OK):           True
  raw openai rejects cache_control (400): True

Confirms (a) the bug is reproducible against the live API and (b) this PR's strip lets the same payload through unchanged from the caller's perspective.

OpenAI's Responses API rejects unknown fields on input content blocks with HTTP 400 ("Unknown parameter: 'input[0].content[0].cache_control'"). Chat Completions already strips Anthropic-only `cache_control` markers via `remove_cache_control_flag_from_messages_and_tools`. Mirror that behavior in the native OpenAI Responses path so cross-provider callers (e.g. AnthropicCacheControlHook upstream of a fallback chain) don't trip a 400 when the primary route is OpenAI Responses. Strips from both input content blocks and tools for symmetry with the chat path.

greptile-apps · 2026-05-21T02:47:10Z

Greptile Summary

This PR fixes a BadRequestError: Unknown parameter: 'input[0].content[0].cache_control' that OpenAI's Responses API throws when Anthropic cache_control markers are present in request payloads. The fix mirrors the existing Chat Completions path by adding a _strip_cache_control_flag helper that recursively removes cache_control keys from input content blocks and tools using the shared filter_value_from_dict utility.

Adds _strip_cache_control_flag to OpenAIResponsesAPIConfig.transform_responses_api_request, called immediately after _validate_input_param so stripping applies to already-validated (dict-form) input.
Three new unit tests cover the stripping of input content blocks, stripping of tools, and a no-op passthrough for clean inputs; no existing tests were modified.

Confidence Score: 4/5

Safe to merge — the change is a targeted, in-place strip of Anthropic-only cache_control keys using the same recursive utility the chat path already relies on, with no modifications to existing tests.

The core logic is correct: filter_value_from_dict is called on each top-level message dict and each tool dict, and its recursive descent into nested lists/dicts correctly reaches content blocks where cache_control lives. The mutation-in-place pattern matches the existing chat path. The only gap is a loose tuple return annotation on _strip_cache_control_flag.

No files require special attention; both changed files are self-contained and straightforward.

Important Files Changed

Filename	Overview
litellm/llms/openai/responses/transformation.py	Adds `_strip_cache_control_flag` static method and wires it into `transform_responses_api_request`; logic is correct and symmetric with the chat path, minor return-type annotation could be tightened.
tests/test_litellm/llms/openai/responses/test_openai_responses_transformation.py	Three new mock-only tests covering the exact error scenario, tool symmetry, and a no-op passthrough; no existing tests modified.

_{Reviews (1): Last reviewed commit: "fix(openai): strip cache_control from Re..." | Re-trigger Greptile}

greptile-apps · 2026-05-21T02:47:17Z

+    def _strip_cache_control_flag(
+        input: Union[str, ResponseInputParam],
+        response_api_optional_request_params: Dict,
+    ) -> tuple:


The return annotation tuple is untyped — mypy will infer tuple[Any, ...] for the return, losing the precise types. Using Tuple[Union[str, ResponseInputParam], Dict] makes the contract explicit and keeps mypy checks tight on callers.

Suggested change

) -> tuple:

) -> Tuple[Union[str, ResponseInputParam], Dict]:

Fixed in 4f6d962 — return annotation is now Tuple[Union[str, ResponseInputParam], Optional[List[ALL_RESPONSES_API_TOOL_PARAMS]]]. Also took the opportunity to mirror the chat-path API shape: renamed to public remove_cache_control_flag_from_input_and_tools (sibling of remove_cache_control_flag_from_messages_and_tools), instance method, separate tools arg, and added the model: str hook param for subclass selective-skip.

Per review feedback: - Rename to public `remove_cache_control_flag_from_input_and_tools` (sibling of chat path's `remove_cache_control_flag_from_messages_and_tools`) - Instance method, takes `tools` as explicit arg with `ALL_RESPONSES_API_TOOL_PARAMS` typing, and `model: str` hook for subclass selective-skip — parallel signature to chat helper - Tighten return annotation to `Tuple[Union[str, ResponseInputParam], Optional[List[ALL_RESPONSES_API_TOOL_PARAMS]]]` (addresses Greptile P2 on bare `tuple`)

oss-pr-review-agent-shin · 2026-05-21T02:55:31Z

🤖 litellm-agent: Squash-merged into staging branch shin_agent_oss_staging_05_21_2026. Staging PR: #28432

Triage Summary
Adds a _strip_cache_control_flag helper to OpenAIResponsesAPIConfig.transform_responses_api_request that recursively removes cache_control keys from input content blocks and tools before sending requests to OpenAI's Responses API. OpenAI rejects these Anthropic-only fields with HTTP 400; the Chat Completions path already strips them via remove_cache_control_flag_from_messages_and_tools, and this PR mirrors that behavior for the Responses path. Three new unit tests cover the input strip, tool strip, and no-op passthrough cases.

Merge Confidence: 5/5 ✅ READY
Ready to ship.

All checks green. Greptile 4/5, no blocking pattern findings, no CircleCI runs (OSS-typical).

cwang-otto · 2026-05-21T02:57:29Z

Update on this PR:

Addressed Greptile P2: tightened return annotation to Tuple[Union[str, ResponseInputParam], Optional[List[ALL_RESPONSES_API_TOOL_PARAMS]]] in 4f6d962.
Refactor for symmetry: renamed helper to public remove_cache_control_flag_from_input_and_tools (sibling of chat path's remove_cache_control_flag_from_messages_and_tools), instance method, separate tools arg, added unused model: str param mirroring the chat helper's subclass-skip hook.
Live-verified end-to-end against real OpenAI Responses API (proof in updated PR description):
- litellm.aresponses() with cache_control on input + tools → 200 OK
- Raw openai SDK with the same payload → 400 Unknown parameter: 'input[0].content[0].cache_control' (negative control)

139 tests pass, mypy clean on touched file.

…PI requests (BerriAI#28431) Squash-merged by litellm-agent from cwang-otto's PR.

…PI requests (#28431) Squash-merged by litellm-agent from cwang-otto's PR.

* fix(anthropic): handle empty streaming tool calls (#28549) Co-authored-by: shin-berri <shin-laptop@berri.ai> Co-authored-by: yuneng-jiang <yuneng@berri.ai> * [Feature][Bug Fix] Decouple Azure OpenAI Deployment ID from model name via base_model to fix gpt5 model routing (#28490) * feat(azure): decouple deployment ID from model name via base_model Azure OpenAI deployments have arbitrary names (deployment IDs) that may not match the underlying model. Previously, model-type detection (o-series, gpt-5, etc.) relied on substring matching against the deployment name, causing misrouted configs and rejected params when deployment names were non-standard (e.g. 'my-deployment-id' for gpt-5.2). This change extends the existing base_model field to drive model-type detection, config selection, supported param resolution, and param mapping throughout the Azure call path: - _get_azure_config() uses base_model for is_o_series/is_gpt_5 checks - get_provider_chat_config() threads base_model for Azure - get_supported_openai_params() accepts and uses base_model - get_optional_params() accepts base_model and passes it to all Azure config method calls (get_supported_openai_params, map_openai_params) - azure.py completion handler uses base_model for GPT-5 detection - Config internal methods (e.g. is_model_gpt_5_2_model) now receive base_model so features like logprobs are correctly enabled Fully backward compatible - when base_model is unset, behavior is identical. Existing o_series/ and gpt5_series/ prefix workarounds continue to work. Usage in proxy config: model_list: - model_name: my-gpt5 litellm_params: model: azure/my-deployment-id model_info: base_model: azure/gpt-5.2 Fixes: non-standard deployment names like 'prefix-gpt-5.2' rejecting logprobs/top_logprobs despite the underlying model supporting them. * Addressing Greptile comments. * gemini-3.1-flash-lite pricing (#27933) * feat(model_prices): add gemini-3.1-flash-lite pricing with standard/batch/flex/priority tiers * fix pricing * add service tier --------- Co-authored-by: shin-berri <shin-laptop@berri.ai> * fix(openai-responses): strip Anthropic cache_control from Responses API requests (#28431) Squash-merged by litellm-agent from cwang-otto's PR. * Treat None litellm_provider as wildcard in _check_provider_match (#28523) Squash-merged by litellm-agent from adityasingh2400's PR. * fix greptile * fix: use _azure_detection_model in default Azure branch of get_supported_openai_params Co-authored-by: Yassin Kortam <yassin@berri.ai> * fix(openai-responses): strip cache_control on compact endpoint as well Co-authored-by: Yassin Kortam <yassin@berri.ai> --------- Co-authored-by: Felipe Garé <90070734+FelipeRodriguesGare@users.noreply.github.com> Co-authored-by: shin-berri <shin-laptop@berri.ai> Co-authored-by: yuneng-jiang <yuneng@berri.ai> Co-authored-by: withomasmicrosoft <withomas@microsoft.com> Co-authored-by: mubashir1osmani <mubashir.osmani777@gmail.com> Co-authored-by: cwang-otto <chengxuan.wang@ottotheagent.com> Co-authored-by: Aditya Singh <60082699+adityasingh2400@users.noreply.github.com> Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Yassin Kortam <yassin@berri.ai>

greptile-apps Bot reviewed May 21, 2026

View reviewed changes

oss-pr-review-agent-shin Bot changed the base branch from shin_agent_oss_staging_05_20_2026 to shin_agent_oss_staging_05_21_2026 May 21, 2026 02:55

oss-pr-review-agent-shin Bot merged commit f92e1b0 into BerriAI:shin_agent_oss_staging_05_21_2026 May 21, 2026
2 checks passed

cwang-otto added a commit to cwang-otto/litellm that referenced this pull request May 21, 2026

fix(openai-responses): strip Anthropic cache_control from Responses A…

16a875b

…PI requests (BerriAI#28431) Squash-merged by litellm-agent from cwang-otto's PR.

shenshouer mentioned this pull request May 21, 2026

fix(ui/log-drawer): paginate session trace list to surface logs beyond the first 50 #28225

Closed

4 tasks

DaoyuanLi2816 mentioned this pull request May 21, 2026

fix(bedrock): normalize invoke tool search tools for messages API #28209

Closed

4 tasks

Sameerlite pushed a commit that referenced this pull request May 22, 2026

fix(openai-responses): strip Anthropic cache_control from Responses A…

6a13fed

…PI requests (#28431) Squash-merged by litellm-agent from cwang-otto's PR.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(openai-responses): strip Anthropic cache_control from Responses API requests#28431

fix(openai-responses): strip Anthropic cache_control from Responses API requests#28431
oss-pr-review-agent-shin[bot] merged 2 commits into
BerriAI:shin_agent_oss_staging_05_21_2026from
cwang-otto:fix/openai-responses-strip-cache-control-staging

cwang-otto commented May 21, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot commented May 21, 2026

Important Files Changed

Uh oh!

greptile-apps Bot May 21, 2026

Uh oh!

cwang-otto May 21, 2026

Uh oh!

Uh oh!

oss-pr-review-agent-shin Bot commented May 21, 2026

Uh oh!

cwang-otto commented May 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

	) -> tuple:
	) -> Tuple[Union[str, ResponseInputParam], Dict]:

Uh oh!

Conversation

cwang-otto commented May 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Title

Relevant issues

Pre-Submission checklist

Type

Changes

Proof

Unit tests + mypy

Live verification (real OpenAI Responses API)

Uh oh!

greptile-apps Bot commented May 21, 2026

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Uh oh!

greptile-apps Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

cwang-otto May 21, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

oss-pr-review-agent-shin Bot commented May 21, 2026

Uh oh!

cwang-otto commented May 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

cwang-otto commented May 21, 2026 •

edited

Loading