[litellm-agent] Staging → litellm_internal_staging (5/20/2026) by oss-pr-review-agent-shin[bot] · Pull Request #28310 · BerriAI/litellm

oss-pr-review-agent-shin · 2026-05-20T00:29:45Z

Merged PRs (1)

#	Title
#28215	fix(router): wrap aresponses streaming iterator for mid-stream fallbacks

Auto-updated by litellm-agent on each merge.

…#27700) Squash-merged by litellm-agent from TorvaldUtne's PR.

Co-authored-by: shin-berri <shin-laptop@berri.ai>
Co-authored-by: yuneng-jiang <yuneng@berri.ai>

* feat(model_prices): add gemini-3.1-flash-lite pricing with standard/batch/flex/priority tiers
* fix pricing
* add service tier

---------

Co-authored-by: shin-berri <shin-laptop@berri.ai>

@greptile-apps

…dge (#28201)

* fix(anthropic): accept dict-shape reasoning_effort from Responses bridge

  Issue #28196: the Responses->Chat parser (transformation.py:184-200) keeps the full dict as reasoning_effort when summary is set; that branch was added in #25359. But the Anthropic transformation here still guarded on isinstance(value, str), silently dropping the param. Result: callers using the standard Reasoning(effort, summary) OpenAI-shaped object on Anthropic lose thinking entirely (0 reasoning_tokens, no thinking_blocks).

  Coerce dict -> string before mapping. Same shape tolerance that gpt_5_transformation._normalize_reasoning_effort_for_chat_completion already implements. summary is irrelevant for Anthropic's thinking_blocks.

  Adds two regression tests: one parametrized over string + dict shapes (with and without summary), one covering unparseable dict inputs (drops silently, no crash).

* test(anthropic): add non-adaptive model coverage for dict-shape reasoning_effort

  Per Greptile feedback on PR #28198: the original regression test only exercised the adaptive (4.6+) path. Add a parametrized test for the non-adaptive branch (claude-sonnet-4-5) verifying that dict-shape reasoning_effort still maps to thinking.type='enabled' + budget_tokens, and that output_config is NOT set on pre-4.6 models.

* test(anthropic): convert unparseable-dict test to @pytest.mark.parametrize

  Per @greptile-apps inline review on PR #28201: matches the parametrize style of the two adjacent dict-shape tests and produces clearer failure messages (test ID per case instead of one collapsing for-loop).
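The dict-to-string coercion described in this commit message can be illustrated with a small sketch. This is not the code from the PR: `normalize_reasoning_effort` is a hypothetical helper name, and the only assumption taken from the text is that the value is either a string or a dict with an `effort` key (and optionally `summary`), with unparseable input dropped silently.

```python
from typing import Any, Optional


def normalize_reasoning_effort(value: Any) -> Optional[str]:
    """Hypothetical helper: reduce a reasoning_effort value to its effort string.

    Returns None when no usable effort string can be extracted, so the caller
    can skip the parameter instead of raising.
    """
    if isinstance(value, str):
        return value
    if isinstance(value, dict):
        effort = value.get("effort")
        # "summary" is ignored here; only the effort level is mapped.
        if isinstance(effort, str):
            return effort
    return None
```

A caller would then guard on the normalized value rather than on `isinstance(value, str)`, so a `{"effort": "high", "summary": "auto"}` input is treated the same as `"high"`.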

…28280) Squash-merged by litellm-agent from ro31337's PR.

…cks (#28215) Squash-merged by litellm-agent from cwang-otto's PR.

oss-pr-review-agent-shin · 2026-05-20T00:29:47Z

@greptile please review

CLAassistant · 2026-05-20T00:29:56Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
5 out of 6 committers have signed the CLA.

✅ TorvaldUtne
✅ mubashir1osmani
✅ ro31337
✅ cwang-otto
✅ IshaMeera
❌ oss-agent-shin

greptile-apps · 2026-05-20T00:32:26Z

Greptile Summary

This PR merges PR #28215, which adds mid-stream fallback support for the Responses API (aresponses) streaming path in the Router, bringing it to parity with the existing chat-completions mid-stream fallback. It also fixes the Anthropic transformation to accept dict-shaped reasoning_effort (not just strings), adds stable pricing entries for gemini-3.1-flash-lite across all provider prefixes, and trims whitespace from MCP tool test panel string inputs.

router.py: Introduces _aresponses_with_streaming_fallbacks and four helpers that wrap the streaming iterator to catch MidStreamFallbackError, build a continuation prompt with partial assistant output, merge partial-stream usage onto the fallback's response.completed event, and clean up both streams in a shielded finally block. call_type == "aresponses" is now routed through this wrapper instead of the generic fallback handler.
transformation.py: Loosens the reasoning_effort guard from isinstance(value, str) to also coerce a {"effort": ..., "summary": ...} dict to its effort string before mapping — fixing silent drops when the Responses→Chat bridge forwards a dict-shaped reasoning field.
Model pricing / UI: gemini-3.1-flash-lite stable entries added for all four provider prefixes; ToolTestPanel.tsx trims string inputs before type-coercion.
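The wrapper behaviour summarized for router.py can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the PR's implementation: `MidStreamError`, `stream_with_fallback`, and `make_fallback` are hypothetical stand-ins, chunks are plain strings, and merging partial-stream usage onto the fallback's `response.completed` event is omitted.

```python
import asyncio
from typing import AsyncIterator, Callable, List, Optional


class MidStreamError(Exception):
    """Hypothetical stand-in for a failure raised after some chunks were sent."""


async def _close(stream: AsyncIterator[str]) -> None:
    # Async generators expose aclose(); other async iterators may not.
    aclose = getattr(stream, "aclose", None)
    if aclose is not None:
        await aclose()


async def stream_with_fallback(
    primary: AsyncIterator[str],
    make_fallback: Callable[[str], AsyncIterator[str]],
) -> AsyncIterator[str]:
    """Yield chunks from primary; on a mid-stream error, continue from a fallback."""
    parts: List[str] = []
    fallback: Optional[AsyncIterator[str]] = None
    try:
        try:
            async for chunk in primary:
                parts.append(chunk)
                yield chunk
        except MidStreamError:
            # Hand the partial output to the fallback so it can continue from it.
            fallback = make_fallback("".join(parts))
            async for chunk in fallback:
                yield chunk
    finally:
        # Close both streams; shield so cancellation of the consumer does not
        # interrupt the cleanup itself.
        for stream in (primary, fallback):
            if stream is not None:
                await asyncio.shield(_close(stream))
```

In this sketch the consumer sees one continuous stream: the chunks already emitted by the primary are not replayed, and the fallback receives them as context for its continuation.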

Confidence Score: 5/5

Safe to merge — the changes are well-scoped and all new code paths are covered by mock-only unit tests.

The core router change is a net-new code path that does not alter existing non-streaming or non-fallback flows. The transformation.py fix is a narrow guard relaxation with full parametrised test coverage. No existing tests were weakened, no real network calls were added, and the cleanup logic uses shielded cancellation scopes correctly.

litellm/router.py — specifically the FallbackResponsesStreamWrapper constructor and the stream_with_fallbacks finally block, which are the most complex parts of the new fallback path.

Important Files Changed

Filename	Overview
litellm/router.py	Adds `_aresponses_with_streaming_fallbacks` and supporting helpers to give the Responses-API path mid-stream fallback parity with the chat-completions path. Logic is sound and well-commented; `FallbackResponsesStreamWrapper._hidden_params` is copied from the source iterator so post-fallback wrapper metadata reflects the source's deployment, not the fallback's.
litellm/llms/anthropic/chat/transformation.py	Loosens the `reasoning_effort` guard to also accept a dict with an `effort` key, fixing silent drops when the Responses→Chat bridge passes a dict-shaped reasoning field.
tests/router_unit_tests/test_router_aresponses_streaming_fallback.py	New unit-test file covering all four helpers added to Router. All tests use mocks — no real network calls.
tests/test_litellm/test_router.py	Adds extensive mock-based tests for the aresponses streaming-fallback path. The only change to an existing test is a pure reformat — assertions are unchanged.
model_prices_and_context_window.json	Adds stable pricing entries for gemini-3.1-flash-lite across all four provider prefixes, consistent with the existing -preview entry.

Reviews (2): Last reviewed commit: "fix(router): unblock staging — mypy + co..."

greptile-apps · 2026-05-20T00:32:30Z

@@ -1519,12 +1531,12 @@ def map_openai_params(  # noqa: PLR0915
                    optional_params["thinking"] = mapped_thinking
                    if AnthropicConfig._is_adaptive_thinking_model(model):
                        mapped_effort = REASONING_EFFORT_TO_OUTPUT_CONFIG_EFFORT.get(
-                            value
+                            effort_value
                        )
                        if mapped_effort is None:
                            AnthropicConfig._raise_invalid_reasoning_effort(
                                model=model,
-                                value=value,
+                                value=effort_value,
                                llm_provider=self.custom_llm_provider or "anthropic",
                            )
                        optional_params["output_config"] = {"effort": mapped_effort}


Hardcoded model-specific flag check — should use model_prices_and_context_window.json

_is_effort_supported_model and _is_adaptive_thinking_model (called later in this block) are hardcoded model-name lists. The team rule is that model capability flags belong in model_prices_and_context_window.json and are read via get_model_info / supports_* helpers so they work automatically when new models are added. Putting them here means every new model that supports effort-based thinking requires a LiteLLM code change rather than a config-file entry.

Rule Used: What: Do not hardcode model-specific flags in the ... (source)
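The pattern the reviewer asks for, reading a capability flag from per-model metadata instead of matching model names in code, might look like the sketch below. Everything here is illustrative: `MODEL_INFO`, the model names, the flag name `supports_adaptive_thinking`, and `supports_flag` are hypothetical and are not taken from model_prices_and_context_window.json or from LiteLLM's helpers.

```python
from typing import Any, Dict

# Hypothetical metadata, standing in for entries loaded from a JSON config file.
MODEL_INFO: Dict[str, Dict[str, Any]] = {
    "example-model-new": {"supports_adaptive_thinking": True},
    "example-model-old": {"supports_adaptive_thinking": False},
    "example-model-unflagged": {},
}


def supports_flag(
    model: str, flag: str, model_info: Dict[str, Dict[str, Any]] = MODEL_INFO
) -> bool:
    """Return True only if the model's metadata sets the flag to a truthy value."""
    return bool(model_info.get(model, {}).get(flag, False))
```

With this shape, supporting a new model is a data change (one more metadata entry) rather than an edit to a hardcoded list, and unknown models default to "not supported".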

codecov · 2026-05-20T00:33:19Z

Codecov Report

❌ Patch coverage is 87.50000% with 16 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
litellm/router.py	86.88%	16 Missing ⚠️
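The percentages and miss counts above imply approximate line totals. The helper below is a back-of-the-envelope sketch, not something Codecov reports: `implied_patch_lines` is a hypothetical name, and the result is only as exact as the rounded percentage it is given.

```python
from typing import Tuple


def implied_patch_lines(coverage_pct: float, missing: int) -> Tuple[int, int]:
    """Return (total, covered) lines implied by a coverage percentage and miss count."""
    # missing = total * (1 - coverage), so total = missing / (1 - coverage).
    total = round(missing / (1 - coverage_pct / 100))
    return total, total - missing
```

Applied to the report, 87.5% with 16 missing lines implies about 128 changed lines (112 covered), and 86.88% with 16 missing implies about 122 changed lines in litellm/router.py (106 covered).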


…ng fallback (#28318) Squash-merged by litellm-agent from cwang-otto's PR.

oss-pr-review-agent-shin · 2026-05-20T04:04:51Z

@greptile please review

Sameerlite · 2026-05-20T10:47:54Z

Covered in #28337

TorvaldUtne and others added 7 commits May 19, 2026 02:38

feat: add Xiaomi MiMo-V2.5-Pro and MiMo-V2.5 OpenRouter model entries (…

43bc7d6

…#27700) Squash-merged by litellm-agent from TorvaldUtne's PR.

fix(ui): trim whitespace from MCP inspector tool call inputs (#28203)

91927eb

Co-authored-by: shin-berri <shin-laptop@berri.ai>
Co-authored-by: yuneng-jiang <yuneng@berri.ai>

gemini-3.1-flash-lite pricing (#27933)

6f83cb2

* feat(model_prices): add gemini-3.1-flash-lite pricing with standard/batch/flex/priority tiers
* fix pricing
* add service tier

---------

Co-authored-by: shin-berri <shin-laptop@berri.ai>

fix: incorrect /v1/agents request example (#28131)

249ec01

feat: add pricing entry for openrouter/google/gemini-3.1-flash-lite (#…

87a55f5

…28280) Squash-merged by litellm-agent from ro31337's PR.

fix(router): wrap aresponses streaming iterator for mid-stream fallba…

5039e63

…cks (#28215) Squash-merged by litellm-agent from cwang-otto's PR.

oss-pr-review-agent-shin Bot mentioned this pull request May 20, 2026

fix(router): wrap aresponses streaming iterator for mid-stream fallbacks #28215

Merged

greptile-apps Bot reviewed May 20, 2026


cwang-otto mentioned this pull request May 20, 2026

fix(router): unblock staging — mypy + coverage for aresponses streaming fallback #28318

Merged

fix(router): unblock staging — mypy + coverage for aresponses streami…

6ea1f57

…ng fallback (#28318) Squash-merged by litellm-agent from cwang-otto's PR.

Sameerlite closed this May 20, 2026

Sameerlite deleted the shin_agent_oss_staging_05_20_2026 branch May 22, 2026 12:07

oss-pr-review-agent-shin[bot] wants to merge 8 commits into litellm_internal_staging from shin_agent_oss_staging_05_20_2026

8 participants
