fix(bedrock): stop base_model label from stripping tools/tool_choice by andrey-dubnik · Pull Request #29621 · BerriAI/litellm

andrey-dubnik · 2026-06-03T21:26:28Z

Relevant issues

A Router/proxy Bedrock deployment whose model_info.base_model is a friendly label (e.g. claude-haiku-4-5) silently lost tools and tool_choice. The outgoing Bedrock Converse request was built without toolConfig, so the model behaved as if no tools had been provided. This worked in v1.84.0 and regressed in v1.85.0. With drop_params: true the params were dropped silently instead of surfacing an error.
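The silent failure mode can be illustrated with a small sketch. This is not LiteLLM's code: `filter_params` and the `supported_for_label` list are hypothetical stand-ins, and `ValueError` stands in for whatever error the library raises. Only the drop-versus-raise behaviour of drop_params is taken from the description above.

```python
from typing import Any


def filter_params(
    requested: dict[str, Any], supported: list[str], drop_params: bool
) -> dict[str, Any]:
    """Keep supported params; drop or reject the rest (hypothetical helper)."""
    unsupported = [name for name in requested if name not in supported]
    if unsupported and not drop_params:
        # Without drop_params the caller at least learns that something is wrong.
        raise ValueError(f"unsupported params: {unsupported}")
    # With drop_params the unsupported entries vanish without any signal.
    return {name: value for name, value in requested.items() if name in supported}


requested = {
    "temperature": 0.2,
    "tools": [{"type": "function"}],
    "tool_choice": "auto",
}
# Assumed param list for a label that is not recognized as tool-capable.
supported_for_label = ["temperature", "max_tokens"]

silently_filtered = filter_params(requested, supported_for_label, drop_params=True)
```

Under these assumptions `silently_filtered` keeps only `temperature`, which matches the symptom: the request succeeds, but the model never sees the tools.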

Root cause

Two changes combine to cause the bug. First, completion() passed model_info.base_model as the model argument to get_optional_params (introduced in #27720), so the real Bedrock model id never reached supported-param resolution. Second, get_supported_openai_params resolved the provider config's params from base_model or model (#28582), letting the label fully replace the real model. For Bedrock the label resolves to no tool support (Bedrock gates tools on recognizing the Converse model id), so tools/tool_choice were stripped before transformation and never reached toolConfig. Azure was unaffected because its deployment names are opaque, so base_model is the only signal it has.
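A minimal sketch of the replacement behaviour, assuming a toy capability table. `SUPPORTED`, `FALLBACK`, `supported_params`, and `pre_fix_resolution` are hypothetical names for illustration; the real lookup lives in LiteLLM's provider configs and is not reproduced here.

```python
from typing import Optional

REAL_MODEL = "eu.anthropic.claude-haiku-4-5-20251001-v1:0"

# Hypothetical capability table keyed by recognized model id.
SUPPORTED = {
    REAL_MODEL: ["max_tokens", "temperature", "tools", "tool_choice"],
}
# Assumed result for an id the provider does not recognize, such as a label.
FALLBACK = ["max_tokens", "temperature"]


def supported_params(model_id: str) -> list[str]:
    return SUPPORTED.get(model_id, FALLBACK)


def pre_fix_resolution(model: str, base_model: Optional[str]) -> list[str]:
    # The label, when present, fully replaces the real model id.
    return supported_params(base_model or model)


with_label = pre_fix_resolution(REAL_MODEL, "claude-haiku-4-5")
without_label = pre_fix_resolution(REAL_MODEL, None)
```

In this sketch `with_label` has no `tools` entry while `without_label` does, which is why only deployments that set a friendly base_model were affected.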

Fix

completion() now keeps model as the real deployment model and threads the resolved base_model (the kwarg, falling back to model_info) through separately. get_supported_openai_params treats base_model as additive: it returns the union of the params supported by model and by base_model. A hint can only add capabilities, never strip ones the real model already exposes. That keeps the original base_model behavior from #27717 (a registered base_model adding reasoning_effort/thinking for an unregistered model) and Azure's base_model-driven model-type detection working, while restoring Bedrock tool support.
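The additive behaviour can be sketched as an order-preserving union. The merge expression is the one quoted in the Greptile summary of this PR; the function name `additive_params` and the param lists are illustrative assumptions, not values read from LiteLLM.

```python
def additive_params(model_params: list[str], base_model_params: list[str]) -> list[str]:
    # dict.fromkeys de-duplicates while keeping first-seen order,
    # so the real model's params come first and the hint only appends.
    return list(dict.fromkeys([*model_params, *base_model_params]))


# Bedrock case: the real model supports tools, the label does not.
bedrock = additive_params(
    ["max_tokens", "temperature", "tools", "tool_choice"],
    ["max_tokens", "temperature"],
)

# #27717-style case: an unregistered model gains params from a registered base_model.
unregistered = additive_params(
    ["max_tokens"],
    ["max_tokens", "reasoning_effort", "thinking"],
)
```

Both directions hold in this sketch: the label cannot remove `tools` from the Bedrock list, and the registered base_model still contributes `reasoning_effort` and `thinking`.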

Pre-Submission checklist

I have added meaningful tests
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible; it only solves 1 specific problem
I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Screenshots / Proof of Fix

Verified against a live proxy hitting real Bedrock (real, billable calls).

The fix: a friendly `base_model` label no longer strips tools

This is the configuration from the issue: a Bedrock deployment whose model_info.base_model is the label claude-haiku-4-5, with drop_params: true.

model_list:
  - model_name: claude-haiku-4-5
    litellm_params:
      model: bedrock/eu.anthropic.claude-haiku-4-5-20251001-v1:0
    model_info:
      base_model: claude-haiku-4-5

litellm_settings:
  drop_params: true

general_settings:
  master_key: sk-1234

python litellm/proxy/proxy_cli.py --config bedrock_basemodel_test.yaml --port 4000

curl -s http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-haiku-4-5",
    "messages": [{"role": "user", "content": "What is the weather in Copenhagen?"}],
    "tools": [{"type": "function", "function": {"name": "get_weather", "description": "Get the weather for a city", "parameters": {"type": "object", "properties": {"city": {"type": "string"}}, "required": ["city"]}}}],
    "tool_choice": "auto"
  }' | jq '{model, finish_reason: .choices[0].finish_reason, tool_calls: .choices[0].message.tool_calls, usage}'

Response on this branch:

{
  "model": "claude-haiku-4-5",
  "finish_reason": "tool_calls",
  "tool_calls": [
    {
      "index": 0,
      "function": { "arguments": "{\"city\": \"Copenhagen\"}", "name": "get_weather" },
      "id": "tooluse_bBi2LpRSJZixywecGRb5MV",
      "type": "function"
    }
  ],
  "usage": { "completion_tokens": 54, "prompt_tokens": 564, "total_tokens": 618 }
}

The model selects get_weather with {"city": "Copenhagen"}, so tools/tool_choice reach Bedrock Converse. On the pre-fix code the same request drops both params before transformation, so no toolConfig is sent and the model cannot call the tool.
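For anyone scripting this check, the following sketch decodes the jq summary shown above and lists the tool calls it contains. `tool_calls_made` is a hypothetical helper, and the JSON is the response pasted verbatim from this PR.

```python
import json

# Output of the jq filter from the verification above, pasted verbatim.
JQ_OUTPUT = r"""
{
  "model": "claude-haiku-4-5",
  "finish_reason": "tool_calls",
  "tool_calls": [
    {
      "index": 0,
      "function": { "arguments": "{\"city\": \"Copenhagen\"}", "name": "get_weather" },
      "id": "tooluse_bBi2LpRSJZixywecGRb5MV",
      "type": "function"
    }
  ],
  "usage": { "completion_tokens": 54, "prompt_tokens": 564, "total_tokens": 618 }
}
"""


def tool_calls_made(summary: dict) -> list[tuple[str, dict]]:
    """Return (function name, decoded arguments) for each tool call, if any."""
    calls = summary.get("tool_calls") or []  # jq emits null when the field is absent
    return [
        (call["function"]["name"], json.loads(call["function"]["arguments"]))
        for call in calls
    ]


summary = json.loads(JQ_OUTPUT)
calls = tool_calls_made(summary)
print(calls)  # [('get_weather', {'city': 'Copenhagen'})]
```

An empty list from `tool_calls_made` would indicate that the model did not call any tool for that request.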

No regression for the common pattern (no `base_model`)

The same model registered the usual way, without model_info.base_model. This path was never affected, but it is worth confirming that it stays correct.

model_list:
  - model_name: claude-haiku-4-5
    litellm_params:
      model: bedrock/eu.anthropic.claude-haiku-4-5-20251001-v1:0
      aws_region_name: eu-west-1

litellm_settings:
  drop_params: true

Same curl, response on this branch:

{
  "model": "claude-haiku-4-5",
  "finish_reason": "tool_calls",
  "tool_calls": [
    {
      "index": 0,
      "function": { "arguments": "{\"city\": \"Copenhagen\"}", "name": "get_weather" },
      "id": "tooluse_OGbWS86RtufrlCBmBiZhC3",
      "type": "function"
    }
  ],
  "usage": { "completion_tokens": 54, "prompt_tokens": 564, "total_tokens": 618 }
}

Type

Bug Fix

Changes

litellm/main.py: in completion(), resolve base_model from the kwarg or model_info, and stop overwriting the model passed to get_optional_params with the base_model label, so the real model reaches capability resolution.

litellm/litellm_core_utils/get_supported_openai_params.py: make base_model additive for the provider-config path by returning the union of the params supported by model and by base_model, instead of letting base_model replace model.

Tests: a get_supported_openai_params suite covering the union in both directions (Bedrock label that would strip tools, gemini base_model that adds reasoning_effort, Azure detection preserved, no-base_model and base_model==model unchanged); a get_optional_params regression asserting that tools/tool_choice survive drop_params with the label; and an update to test_completion_optional_params_base_model to assert, per case, that model stays the real deployment model while base_model is threaded through as the additive hint.
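The main.py change can be sketched as follows. `build_optional_param_args` is a hypothetical helper, not a function in LiteLLM; it only illustrates the resolution order described above (kwarg first, then model_info) and the fact that model is left untouched.

```python
from typing import Any, Optional


def build_optional_param_args(
    model: str,
    base_model: Optional[str] = None,
    model_info: Optional[dict[str, Any]] = None,
) -> dict[str, Optional[str]]:
    """Keep the real model id and pass base_model alongside it as a hint."""
    # The explicit kwarg wins; model_info.base_model is the fallback.
    resolved_base_model = base_model or (model_info or {}).get("base_model")
    return {"model": model, "base_model": resolved_base_model}


args = build_optional_param_args(
    model="eu.anthropic.claude-haiku-4-5-20251001-v1:0",
    model_info={"base_model": "claude-haiku-4-5"},
)
```

Here `args["model"]` is still the Bedrock model id and `args["base_model"]` carries the label, so capability resolution sees both values instead of only the label.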

Fixes BerriAI#29618

codecov · 2026-06-03T21:30:05Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

greptile-apps · 2026-06-03T21:31:03Z

Greptile Summary

This PR fixes a regression introduced in v1.85.0 where Bedrock deployments with a friendly model_info.base_model label (e.g. claude-haiku-4-5) silently lost tools/tool_choice when drop_params: true was set. Two root-cause changes are reverted/replaced: main.py no longer overwrites the real deployment model with the base_model label when calling get_optional_params, and get_supported_openai_params now returns the union of params from both model and base_model so a hint can only add capabilities, never remove them.

litellm/main.py: base_model is resolved from the kwarg or model_info and threaded as a separate additive hint; the real deployment model id is preserved when building optional_param_args.
litellm/litellm_core_utils/get_supported_openai_params.py: The provider-config path now returns list(dict.fromkeys([*model_params, *base_model_params])) instead of letting base_model fully replace model, keeping Azure model-type detection and the existing #27717 reasoning-effort use-case intact.
Tests: New file test_get_supported_openai_params.py covers the regression, the additive direction, Azure detection, and edge cases; test_utils.py adds a get_optional_params-level regression; test_main.py asserts both model and base_model are threaded correctly per parametrized case.

Confidence Score: 5/5

Safe to merge; the fix is narrowly scoped to how base_model is resolved and applied in get_optional_params, all changed paths have explicit regression tests, and the additive-union design prevents any existing capability from being stripped.

The two-line change in main.py is mechanical and the union logic in get_supported_openai_params is straightforward. Tests cover the Bedrock regression, gemini reasoning-effort preservation, Azure detection, and edge cases. One small gap exists where spreading None could throw if a provider config returns None for an unrecognized model id, but this does not affect the intended code paths.

litellm/litellm_core_utils/get_supported_openai_params.py — the union spread lacks a None guard for the unlikely case where get_supported_openai_params returns None for an unrecognized model.
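One way to close that gap is sketched below, assuming either lookup may return None. `union_unguarded` and `union_guarded` are hypothetical names, and treating None as an empty list is a design choice of this sketch rather than something the PR specifies.

```python
from typing import Optional


def union_unguarded(model_params, base_model_params):
    # Spreading None raises TypeError because None is not iterable.
    return list(dict.fromkeys([*model_params, *base_model_params]))


def union_guarded(
    model_params: Optional[list[str]], base_model_params: Optional[list[str]]
) -> list[str]:
    # Treat a missing list as empty so the other side still contributes.
    return list(dict.fromkeys([*(model_params or []), *(base_model_params or [])]))


try:
    union_unguarded(None, ["tools"])
    unguarded_failed = False
except TypeError:
    unguarded_failed = True

guarded = union_guarded(None, ["tools", "tool_choice"])
```

A caller that needs to distinguish "no params" from "unknown model" would need a different return convention than the empty list used here.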

Important Files Changed

| Filename | Overview |
| --- | --- |
| litellm/litellm_core_utils/get_supported_openai_params.py | Core fix: `base_model` is now additive (union of both param sets) instead of a full replacement; one defensive null-check gap when either get_supported_openai_params call returns None. |
| litellm/main.py | Resolves `base_model` from kwarg or `model_info`, stops overwriting `model` with the base_model label when calling `get_optional_params`, threads base_model as a separate additive hint instead. |
| tests/test_litellm/litellm_core_utils/test_get_supported_openai_params.py | New test file with thorough coverage: Bedrock regression, additive direction, Azure detection preserved, base_model==model edge case, and the gemini capability-adding scenario. |
| tests/test_litellm/test_main.py | Updated parametrized test to assert both `model` stays the real deployment id and `base_model` is separately threaded through; adds `expected_base_model_param` per case. |
| tests/test_litellm/test_utils.py | Adds `TestBedrockBaseModelLabelKeepsTools` class with two integration-style tests confirming tools survive `drop_params` with the real model id, and confirming the label alone correctly drops them. |

Reviews (2): Last reviewed commit: "test(main): make base_model param test r..."

andrey-dubnik · 2026-06-04T07:19:57Z

@greptileai

Sameerlite

LGTM

* fix(azure): apply api_version fallback chain to image edit URL `AzureImageEditConfig.get_complete_url` only read `api_version` from `litellm_params`. When callers configured it via `litellm.api_version` or `AZURE_API_VERSION`, the constructed URL had no `?api-version=` and Azure responded `404 Resource not found`. Apply the same fallback chain the Azure chat path already uses in `common_utils.py`: litellm_params > litellm.api_version > AZURE_API_VERSION env > litellm.AZURE_DEFAULT_API_VERSION Adds 5 unit tests pinning each layer of the chain plus a regression guard for `api_base` that already carries `?api-version=`. * feat(mcp): core sampling and elicitation flow with security hardening - Add sampling_handler.py: full MCP sampling/createMessage flow with model selection (hint-based + priority-based), auth enforcement, budget checks, route restriction gates, and tag policy pre-auth - Add elicitation_handler.py: MCP elicitation/create relay with downstream client capability detection - Wire sampling/elicitation callbacks in mcp_server_manager.py gated behind allow_sampling/allow_elicitation config flags - Add allow_sampling/allow_elicitation fields to MCPServer type - Fix session lock deadlock: skip lock for JSON-RPC response POSTs (elicitation/sampling replies) with truncated-body heuristic - Extend client.py with sampling_callback and elicitation_callback - Security: RouteChecks gate, tag-budget bypass fix, x-forwarded-for spoofing fix, Latin-1 header encoding guard - Add 4 new test modules (model access, priority selection, request builder, tool conversion) + update existing MCP tests * fix(security): run pre-call guardrails before MCP sampling acompletion Without this, an upstream MCP server with allow_sampling enabled could send prompts that bypass every guardrail (content filtering, PII redaction, prompt-injection detection) configured on /chat/completions. 
- Call proxy_logging_obj.pre_call_hook(call_type='acompletion') before llm_router.acompletion so guardrails fire for sampling sub-calls - Add HTTPException to the re-raise list so guardrail rejections propagate correctly instead of being swallowed as generic errors * feat(bedrock_mantle): add Responses API support (/openai/v1/responses) (#29490) * feat(bedrock_mantle): add Responses API transformation config * test(bedrock_mantle): cover trailing-slash api_base normalization * feat(bedrock_mantle): export BedrockMantleResponsesAPIConfig * feat(bedrock_mantle): register gpt-5.x Responses config (gpt-oss unchanged) * feat(bedrock_mantle): add gpt-5.5/gpt-5.4 Responses price-map entries * refactor(bedrock_mantle): exclude gpt-oss instead of allow-listing gpt-5 for Responses routing Frontier OpenAI models on Bedrock Mantle are Responses-only on /openai/v1/responses; gpt-oss is the legacy family that also speaks chat-completions. Gate by excluding gpt-oss (which keeps its chat-completions emulation) and defaulting everything else to the native Responses config, so future frontier models (gpt-6, etc.) route correctly without a code change. Verified against the live us-east-2 Mantle endpoint: gpt-oss 400s on /openai/v1/responses while gpt-5.5 400s on both standard paths. * test(bedrock_mantle): cover supports_native_websocket opt-out Closes the one uncovered line flagged by codecov on the Responses config. The assertion documents that Mantle Responses has no realtime/websocket transport, so realtime routing must not attempt a socket it cannot serve. * fix(bedrock_mantle): route file_search through emulation instead of forwarding to Mantle BedrockMantleResponsesAPIConfig inherited supports_native_file_search() -> True from OpenAIResponsesAPIConfig but never overrode it. Mantle has no OpenAI vector stores, so a forwarded file_search tool is rejected with a 400 (verified upstream: Tool type 'file_search' is not supported). 
Opting out, like the existing supports_native_websocket override, routes the tool through LiteLLM's file_search emulation instead. * fix(bedrock_mantle): only route openai.gpt frontier models to Responses The previous gate excluded gpt-oss and routed every other model to the native Responses config. But on Mantle only the OpenAI gpt frontier models (gpt-5.x) are served on /openai/v1/responses; gpt-oss and the non-OpenAI families (nvidia, mistral, google, zai, ...) are chat-completions only and 400 on that path. Allow-list the openai.gpt- family (excluding gpt-oss) instead, so chat-only models fall through to the chat-completions emulation. Verified against the live us-east-2 endpoint: nvidia.nemotron-nano-9b-v2 returns 400 on /openai/v1/responses and 200 on /v1/chat/completions. * feat(custom_llm): allow streaming/astreaming to yield ModelResponseStream (#27580) * fix(custom_llm): allow streaming/astreaming to yield ModelResponseStream directly * fix(streaming): enhance ModelResponseStream handling for custom LLM providers * fix(streaming): strip finish_reason from content chunks and ensure tool_calls are preserved * fix(streaming): add type ignore for finish_reason assignment in CustomStreamWrapper * fix(proxy): strip stack trace from HTTP 503 responses (CWE-209) (#28330) * fix(proxy/cwe-209): strip Python traceback from HTTP 503 error responses The /cache/ping endpoint included a full Python traceback in its 503 error response body (inside the ProxyException message), leaking internal file paths, line numbers, and call stacks to any caller. Two MCP route handlers in proxy_server.py similarly interpolated str(e) into "Internal server error" detail strings. Fix: log the traceback server-side via verbose_proxy_logger.exception() and omit it from the ProxyException payload / HTTPException detail returned to clients. Tests updated to assert no "traceback" keyword or frame paths appear in the 503 body, with a new dedicated regression test. 
CWE-209: Generation of Error Message Containing Sensitive Information. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(proxy/cwe-209): apply Greptile P2 fixes and add MCP exception-path tests Greptile 4/5 review identified two remaining gaps and Codecov reported 0% coverage on the two MCP handler exception branches: 1. caching_routes.py — str(e) in "Service Unhealthy ({str(e)})" could still leak Redis hostnames/IPs; replaced with static "Service Unhealthy". HTTPException is now re-raised before the generic handler so the "cache not initialized" 503 still reaches callers with its detail. Removed the redundant str(e) arg from verbose_proxy_logger.exception() (exception() already appends the traceback automatically). 2. tests — two new unit tests cover the exception paths in dynamic_mcp_route and toolset_mcp_route that were previously at 0%: - test_dynamic_mcp_route_unexpected_exception_returns_500_without_traceback - test_toolset_mcp_route_unexpected_exception_returns_500_without_traceback All 25 tests pass (9 caching + 16 MCP). CWE-209: Generation of Error Message Containing Sensitive Information. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * test(caching_routes): restore precise assertion in test_cache_ping_no_cache_initialized The assertion was weakened to `"Cache not initialized" in str(data)`, which matches the raw string of the entire response dict and would pass even if the error moved to an unexpected field or changed structure. Restore a targeted check on the parsed response: assert the exact string in the correct field `data["detail"]`, matching FastAPI's HTTPException serialisation format {"detail": "<message>"}. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * test(caching_routes): restore precise assertion and add CWE-209 no-cache path test The assertion in test_cache_ping_no_cache_initialized was weakened to `"Cache not initialized" in str(data)`, which matched against the raw string representation of the entire response dict. This would pass silently even if the error message moved to an unexpected field or the structure changed. Restore a targeted assertion on the parsed field: assert data["detail"] == "Cache not initialized. litellm.cache is None" matching FastAPI's HTTPException serialisation format exactly. Add test_cache_ping_no_cache_does_not_expose_internals to show the code path is still working correctly after the CWE-209 fix: verifies that the HTTPException is re-raised as-is (no traceback, no source paths), and asserts the complete response structure is exactly {"detail": "Cache not initialized. litellm.cache is None"}. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(caching_routes): restore ProxyException envelope for null-cache 503 The except HTTPException: raise guard (added in the CWE-209 fix) caused the null-cache HTTPException to escape as FastAPI's {"detail": "..."} shape instead of the {"error": {...}} ProxyException envelope that callers expect. Move the null-cache guard before the try block and raise ProxyException directly so the response structure is consistent with all other /cache/ping 503s, and the except HTTPException: raise guard is only reachable by unexpected downstream HTTPExceptions. Update the two no-cache tests to assert the correct ProxyException envelope. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * Update utils.py (#26609) * feat(pricing): add Snowflake Cortex REST API model pricing (#26612) * feat(pricing): add Snowflake Cortex REST API model pricing ## Summary Adds pricing and context window information for 20+ Snowflake Cortex REST API models to `model_prices_and_context_window.json`. ## What's included - **7 Claude models** (sonnet-4-5, sonnet-4-6, 4-sonnet, 4-opus, haiku-4-5, 3-7-sonnet, 3-5-sonnet) — with prompt caching rates - **4 OpenAI models** (gpt-4.1, gpt-5, gpt-5-mini, gpt-5-nano) — with prompt caching rates - **5 Llama models** (3.1-8b, 3.1-70b, 3.1-405b, 3.3-70b, 4-maverick) - **1 DeepSeek model** (deepseek-r1) - **1 Mistral model** (mistral-large2) - **1 Snowflake model** (snowflake-llama-3.3-70b) - **2 Embedding models** (arctic-embed-l-v2.0, arctic-embed-m-v2.0) Each entry includes `input_cost_per_token`, `output_cost_per_token`, `cache_read_input_token_cost` (where applicable), `max_input_tokens`, `max_output_tokens`, and capability flags (`supports_function_calling`, `supports_vision`, `supports_prompt_caching`, `supports_reasoning`). ## Pricing source All prices are in USD per token, sourced from the official [Snowflake Service Consumption Table](https://www.snowflake.com/legal-files/CreditConsumptionTable.pdf) — Tables 6(b) (REST API with Prompt Caching) and 6(c) (REST API). ## Context The existing `snowflake/` provider has zero model entries in the pricing JSON, which means LiteLLM cannot track costs for Snowflake Cortex calls. This PR fills that gap. 
## Related - Existing provider: `litellm/llms/snowflake/` - Cortex REST API docs: https://docs.snowflake.com/en/user-guide/snowflake-cortex/cortex-rest-api * Update model_prices_and_context_window.json Fix the JSON parsing error * Update model_prices_and_context_window.json Removed the duplicate entry * fix(utils): copy extra_body before adding unknown params to prevent model config mutation (#29620) Fixes #29615. In add_provider_specific_params_to_optional_params, the line: extra_body = passed_params.pop("extra_body", None) or {} returns the original dict reference when extra_body is non-empty (truthy). Subsequent writes like extra_body[k] = passed_params[k] then mutate the shared model config object held by the router, poisoning /model/info and all subsequent requests for that deployment. The or {} short-circuit creates a new dict only when extra_body is falsy (None or {}), which is why the bug does not reproduce with extra_body: {}. Fix: wrap in dict() so we always work on a fresh shallow copy. * fix(vertex_ai): Bake tool_choice into Gemini CachedContent body to prevent silent drop (#29097) * fix(vertex_ai): bake tool_choice into Gemini CachedContent body to prevent silent drop * address greptile feedback on tool_choice cache test * adds test that uses ToolConfig(functionCallingConfig=FunctionCallingConfig(mode=ANY)) instead of a dict literal, mirroring what map_tool_choice_values actually produce * fix(gemini/veo): move image from parameters into instances[0] (#29501) * fix(gemini/veo): move image from parameters into instances[0] Veo's predictLongRunning schema puts image (and prompt) on the instances element; parameters is for aspectRatio/durationSeconds/etc. The Gemini path was leaving image in params_copy, so it ended up nested under parameters and the API silently ignored it. The Vertex path already builds the instance dict explicitly, so this just aligns the Gemini path with it. 
Fixes #29498 * address greptile: unconditional pop + BytesIO test - Pop `image` from params_copy unconditionally so it never reaches GeminiVideoGenerationParameters even when None, removing implicit reliance on Pydantic's extra-field-ignore. - Add test_transform_video_create_request_image_filelike_goes_to_instance covering the BytesIO path (_convert_image_to_gemini_format) — round-trips the base64 to confirm encoding. - Add test_transform_video_create_request_image_none_is_dropped covering the new None branch. * fix(huggingface): handle special token text in embedding usage (#29660) * fix(guardrails): recompile ToolPermissionGuardrail rules on update_in_memory_litellm_params (#29655) * fix(guardrails): recompile ToolPermissionGuardrail rules on update_in_memory_litellm_params ToolPermissionGuardrail builds self.rules and the compiled target/pattern maps only in __init__. The base update_in_memory_litellm_params re-sets raw attributes via setattr but never rebuilds those maps, so a guardrail updated in place (PUT /guardrails, or the immediate in-memory sync) keeps enforcing the construction-time rules until it is reinitialized (PATCH path, periodic DB poll, or restart). Extract the compile step into _load_rules and override update_in_memory_litellm_params to rebuild from it (dict- and model-safe), re-normalizing default_action / on_disallowed_action. Mirrors the existing PresidioGuardrail override of the same method. Adds regression tests. Fixes #29592. * fix(guardrails): handle dict params in ToolPermissionGuardrail in-memory update Delegate to super() only for LitellmParams input (the base setattr loop is model-only); apply the raw-dict case inline. Fixes the mypy arg-type error and makes the recompile work when the proxy passes the raw DB dict. * fix(guardrails): preserve tool-permission rules on a partial in-memory update A partial update (e.g. 
a LitellmParams whose rules field is None) ran through the generic setattr, which set self.rules to None, and the recompile was skipped, leaving the guardrail with no rules. Snapshot the previous rules and restore them when the update carries no rules; an explicit empty list still clears them. Adds a regression test for the rules-absent case. Addresses the Greptile review note on #29655. * fix(bedrock): stop base_model label from stripping tools/tool_choice (#29621) * fix(bedrock): stop base_model label from stripping tools/tool_choice A Router/proxy Bedrock deployment whose model_info.base_model is a friendly label (e.g. claude-haiku-4-5) silently lost tools/tool_choice: the outgoing Converse request was built without toolConfig, so the model behaved as if no tools were provided. Worked in v1.84.0, regressed in v1.85.0, and with drop_params=true it failed silently. Two changes compound into the bug. completion() passed model_info.base_model as the model argument to get_optional_params, so the real Bedrock model id never reached supported-param resolution; and get_supported_openai_params resolved the provider config's params from base_model or model, letting the label fully replace the real model. For Bedrock the label resolves to no tool support, so tools/tool_choice were dropped before transformation. completion() now keeps model as the real deployment model and threads the resolved base_model (kwarg or model_info) through separately, and get_supported_openai_params treats base_model as additive: it returns the union of the params supported by model and by base_model. A hint can only add capabilities, never strip ones the real model already exposes, which also preserves the original base_model behavior from #27717 and Azure's base_model driven model-type detection. 
Fixes #29618 * test(main): make base_model param test robust to new parametrize cases Restore an explicit per-case expected_model_param literal instead of hardcoding the gemini id, so a future case with a different model can't produce a misleading assertion failure. * fix(fireworks_ai): pass response_format json_schema through unchanged (#29606) FireworksAIConfig.map_openai_params was rewriting the OpenAI strict `{type: json_schema, json_schema: {name, strict, schema}}` shape into `{type: json_object, schema: ...}` before sending to Fireworks, dropping `strict` and `name` and changing the `type`. Per Fireworks' docs json_object means "force any valid JSON output (no specific schema)", so the schema constraint was effectively dropped and grammar-guided decoding never ran; model output silently violated the schema. The rewrite landed in #7085 (Dec 2024) when Fireworks did not yet accept native json_schema. Fireworks accepts the OpenAI strict shape natively now, so the rewrite has become a regression. Removes the rewrite. Passes response_format through unchanged. Updates the existing test_map_response_format to assert pass-through. Adds focused regression tests in tests/test_litellm/ covering preservation of type, strict, name, and schema body, plus that json_object alone still works. 
* fix(types): import Required from typing_extensions in gemini types
* style: reformat sampling_handler.py for py312 black compat
* refactor(mcp-sampling): extract helpers to fix PLR0915 too-many-statements in handle_sampling_create_message
* fix(proxy-server): add explicit ProxyLogging type annotation to proxy_logging_obj to fix mypy inference
* fix(mcp-sampling): suppress mypy assignment error on ImportError fallback for proxy_logging_obj
* fix(test): use .value when comparing LlmProviders enum against string in test_default_api_base
* fix(test): iterate LlmProviders enum in test_default_api_base to avoid str pollution from custom provider registration

  litellm.provider_list is a mutable global initialized to list(LlmProviders) but custom_llm_setup() appends plain provider strings to it. When a test_custom_llm.py test runs first in the same xdist worker, provider_list contains a str and calling .value on it raises AttributeError. Iterate the immutable LlmProviders enum instead, which is deterministic and what the check intends.
* fix(mcp): depth-aware JSON-RPC response detection and neutral speed-priority fallback

  Replace the flat substring check in the truncated-body routing path with a top-level-key scan so a JSON-RPC response whose result payload nests a "method" field is still detected as a response and skips the session lock, removing a deadlock against the in-flight tool call awaiting it. Drop the inverse max_output_tokens speed proxy when no model exposes output_tokens_per_second; context-window size does not track latency, so a neutral score avoids biasing speedPriority toward the smallest-context model.
* fix(guardrails): make ToolPermission rule reload atomic on invalid regex

  _load_rules appended each rule to self.rules before compiling its regex, so an invalid pattern raised mid-loop after the bad rule was already live but without a _compiled_rule_targets entry. _matches_regex reads a missing compiled target as a None pattern and returns True, turning the bad rule into a match-all that silently applies its decision to every tool. Via update_in_memory_litellm_params (PUT /guardrails) this corrupted the live guardrail. Build the parsed rules and compiled maps into locals and swap them in only after every regex compiles, and restore the previous ruleset if a live update is rejected, so an invalid regex now fails the update without leaving the guardrail enforcing a broken policy.
* test(mcp): cover sampling conversion, model resolution, and elicitation relay paths

  The MCP sampling and elicitation handlers shipped with partial test coverage, leaving the response-to-MCP conversion, the model resolution fallback chain, completion-kwargs assembly, guardrail routing, and the entire elicitation relay untested. That pulled the PR's diff (patch) coverage below the codecov threshold even though overall project coverage rose. Add focused unit tests for _convert_openai_response_to_mcp_result, _convert_mcp_tools_to_openai, _convert_mcp_tool_choice_to_openai, image and audio content conversion, the hint-matching and fallback branches of _resolve_model_from_preferences, _build_completion_kwargs, the router and guardrail-rejection paths of _run_guardrails_and_call_llm, the handle_sampling_create_message success and error-propagation flows, the marker-hoisting fallback for tool content on unexpected roles, and the elicitation form/url/generic relay together with its decline paths

---------

Co-authored-by: shin-berri <shin-laptop@berri.ai>
Co-authored-by: yuneng-jiang <yuneng@berri.ai>
Co-authored-by: lengkejun <lengkejun@xd.com>
Co-authored-by: Yug <yugborana000@gmail.com>
Co-authored-by: Kent <72616338+kingdoooo@users.noreply.github.com>
Co-authored-by: tanmay958 <53569547+tanmay958@users.noreply.github.com>
Co-authored-by: DrishnaTrivedi <142084770+DrishnaTrivedi@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Navnit Shukla <Navnit.shukla25@gmail.com>
Co-authored-by: PRABHU KIRAN VANDRANKI <72809214+VANDRANKI@users.noreply.github.com>
Co-authored-by: Adrian Lopez <109683617+adriangomez24@users.noreply.github.com>
Co-authored-by: hcl <chenglunhu@gmail.com>
Co-authored-by: JooHo Lee <96564470+BWAAEEEK@users.noreply.github.com>
Co-authored-by: Dinesh Girbide <85330597+Dinesh-Girbide@users.noreply.github.com>
Co-authored-by: cloudwiz <22098246+andrey-dubnik@users.noreply.github.com>
Co-authored-by: Ahmad Khan <ahmadkhan2508@gmail.com>
Co-authored-by: mateo-berri <277851410+mateo-berri@users.noreply.github.com>
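
The ToolPermission reload fix in the squashed commit message above comes down to a build-then-swap pattern. The following is a minimal sketch of that pattern only, not the guardrail's actual code: `RuleSet`, `load_rules`, and the `id`/`pattern` keys are hypothetical names chosen for illustration.

```python
import re
from typing import Dict, List, Pattern


class RuleSet:
    """Hypothetical stand-in for a guardrail that holds regex-based rules."""

    def __init__(self) -> None:
        self.rules: List[dict] = []
        self.compiled: Dict[str, Pattern[str]] = {}

    def load_rules(self, raw_rules: List[dict]) -> None:
        # Build everything into locals first; self.* is untouched until the end.
        new_rules: List[dict] = []
        new_compiled: Dict[str, Pattern[str]] = {}
        for rule in raw_rules:
            # re.compile raises re.error on an invalid pattern, which aborts
            # the load before any partial state becomes visible.
            new_compiled[rule["id"]] = re.compile(rule["pattern"])
            new_rules.append(rule)
        # Every pattern compiled: swap both structures in together.
        self.rules, self.compiled = new_rules, new_compiled


rules = RuleSet()
rules.load_rules([{"id": "allow_read", "pattern": r"^read_.*$"}])
try:
    rules.load_rules([{"id": "broken", "pattern": "("}])
except re.error:
    pass  # update rejected; the previous ruleset is still in place
```

Because the live attributes are only assigned after the loop finishes, a rejected update leaves the earlier rules and their compiled patterns exactly as they were, so no rule can end up live without a compiled target.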

andrey-dubnik · 2026-06-08T08:03:40Z

@Sameerlite do you know when this change is scheduled for release? We can't upgrade without it in place, as Bedrock no longer works in LiteLLM.

Sameerlite · 2026-06-08T08:06:18Z

v1.89.0-rc.1 has the change

greptile-apps Bot reviewed Jun 3, 2026

Comment thread tests/test_litellm/test_main.py

test(main): make base_model param test robust to new parametrize cases

0036fb1

Restore an explicit per-case expected_model_param literal instead of hardcoding the gemini id, so a future case with a different model can't produce a misleading assertion failure.
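
A rough sketch of the idea in that commit message, written as a plain table-driven check of the same shape a `pytest.mark.parametrize` table would take. The case values and `resolve_model_param` are hypothetical placeholders, not the contents of tests/test_litellm/test_main.py.

```python
# Each case carries its own expected value, so a newly added case with a
# different model cannot silently inherit another case's expectation.
CASES = [
    # (model, base_model, expected_model_param) -- all values illustrative
    ("gemini/example-model", None, "gemini/example-model"),
    ("bedrock/example-deployment", "friendly-label", "bedrock/example-deployment"),
]


def resolve_model_param(model, base_model):
    """Hypothetical stand-in for the code under test.

    Assumes the real deployment model is kept as the model param and the
    base_model hint never replaces it.
    """
    return model


def check_cases():
    for model, base_model, expected_model_param in CASES:
        actual = resolve_model_param(model, base_model)
        # Include the inputs in the failure message so the failing case is obvious.
        assert actual == expected_model_param, (model, base_model, actual)
    return len(CASES)
```

With a single hardcoded expected id, a failure in the second case would report a mismatch against the wrong model; a per-case literal keeps the assertion message tied to the case that actually failed.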

Sameerlite approved these changes Jun 4, 2026

Sameerlite changed the base branch from litellm_internal_staging to litellm_oss_staging_040626 June 4, 2026 12:33

Sameerlite merged commit 57e777d into BerriAI:litellm_oss_staging_040626 Jun 4, 2026
46 checks passed

andrey-dubnik deleted the litellm_fix_bedrock_base_model_label_drops_tools branch June 4, 2026 17:02

Sameerlite merged 2 commits into BerriAI:litellm_oss_staging_040626 from MaerskTech:litellm_fix_bedrock_base_model_label_drops_tools

2 participants

Screenshots / Proof of Fix

The fix: a friendly base_model label no longer strips tools

No regression for the common pattern (no base_model)
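
The two proof cases above follow from treating `base_model` as additive, as the PR description states. Below is a minimal sketch of that union behavior; the `SUPPORTED` table, its model names, and `supported_params` are hypothetical and do not reproduce LiteLLM's `get_supported_openai_params`.

```python
from typing import Dict, List, Optional

# Hypothetical capability table; real providers resolve this from their own configs.
SUPPORTED: Dict[str, List[str]] = {
    "real-converse-model-id": ["temperature", "tools", "tool_choice"],
    "registered-base-model": ["temperature", "reasoning_effort"],
}


def supported_params(model: str, base_model: Optional[str] = None) -> List[str]:
    """Return the union of params for model and base_model, in first-seen order."""
    merged = list(SUPPORTED.get(model, []))
    if base_model is not None:
        # The hint can only add params; it never removes what model exposes.
        for param in SUPPORTED.get(base_model, []):
            if param not in merged:
                merged.append(param)
    return merged
```

An unrecognized label such as `"friendly-label"` contributes nothing, so the real model's `tools` and `tool_choice` survive, while a recognized base model can still add a param like `reasoning_effort`.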

Greptile Summary

Confidence Score: 5/5
