chore(release): backport #30064, #29991, #30009 to stable/1.85.x #30149
…#30064) * Add Claude Fable 5 across Anthropic, Bedrock, Vertex AI, and Azure AI Adds cost map entries for claude-fable-5 ($10/$50 per MTok, 1M context, 128K output, adaptive thinking only) on the Anthropic API, Bedrock converse (base, global, and us/eu geo inference profiles at the 10% regional premium), Vertex AI, and Azure AI (Microsoft Foundry, which serves Fable 5 with the full 1M context window unlike Opus 4.8). Registers anthropic.claude-fable-5 in BEDROCK_CONVERSE_MODELS, lists the model in the setup wizard, and extends the reasoning effort e2e grid. The Bedrock, Vertex, and Azure grid cells carry fail_reason markers until the CI accounts are provisioned: Bedrock needs the provider data sharing opt-in Fable 5 requires, and the Foundry resource needs a claude-fable-5 deployment. The first-party entry carries provider_specific_entry {us: 1.1} for the inference_geo premium and deliberately no fast multiplier since Fable 5 has no fast mode. https://claude.ai/code/session_01MZarYYT3aS7DxaNjoax6Gm * Drop removed sampling params for Claude 4.7+ when drop_params is set Fable 5, Opus 4.7, and Opus 4.8 removed sampling params: the API rejects top_p, top_k, and any temperature other than 1 with a 400. LiteLLM was forwarding them even with drop_params enabled because the Anthropic and Bedrock converse transformations passed temperature/top_p through unconditionally. Mirror the GPT-5/o-series handling: temperature=1 still passes through, other values and any top_p are dropped when drop_params is set, and without drop_params a clean client-side UnsupportedParamsError tells the caller how to opt in, instead of surfacing the raw provider error. 
https://claude.ai/code/session_01MZarYYT3aS7DxaNjoax6Gm * Drive sampling param gating from the cost map and cover top_k Greptile review follow-ups on the sampling param fix: the restriction for Fable 5 / Opus 4.7 / 4.8 is now declared as supports_sampling_params: false on every affected cost map entry (perplexity excluded; that route is OpenAI-compatible and maps sampling params upstream) and read back through a tri-state map lookup, keeping the name check only as a fallback for provider-routed ids whose hosted map entries predate the flag, the same layering supports_adaptive_thinking uses. top_k bypasses map_openai_params as a provider-specific kwarg, so it is gated at the shared AnthropicConfig.transform_request boundary (direct, Bedrock invoke, Vertex, Azure) and in the Bedrock converse _handle_top_k_value path, with drop_params threaded through the converse transform helpers. Also updates the reasoning effort grid cell count assertion for the four Fable 5 rows added on this branch (29 x 11 cells). https://claude.ai/code/session_01MZarYYT3aS7DxaNjoax6Gm * Declare supports_sampling_params in the cost map schema The model map validation schema uses additionalProperties: false, so the new flag must be declared for the 28 entries that carry it; this was the one failing job (misc / Run tests) on the previous commit. https://claude.ai/code/session_01MZarYYT3aS7DxaNjoax6Gm * fix(bedrock): gate top_k=0 on converse to match Anthropic boundary Truthiness check let top_k=0 silently disappear on models that removed sampling params, while AnthropicConfig.transform_request treats 0 as present and raises UnsupportedParamsError (or drops when drop_params is set). Switch to 'is not None' so converse, direct Anthropic, invoke, Vertex, and Azure all behave the same for top_k=0. --------- Co-authored-by: Cursor Agent <cursoragent@cursor.com>
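The gating rules these commit messages describe can be sketched as below. This is a minimal illustration and not the code in this PR: `apply_sampling_param` and its signature are hypothetical, and `UnsupportedParamsError` is redefined locally as a stand-in so the snippet runs on its own.

```python
from typing import Any, Dict, Optional


class UnsupportedParamsError(Exception):
    """Local stand-in for the client-side error type, so the sketch is self-contained."""


def apply_sampling_param(
    optional_params: Dict[str, Any],
    name: str,
    value: Optional[Any],
    supports_sampling_params: bool,
    drop_params: bool,
) -> None:
    """Copy one sampling parameter into optional_params, or gate it (hypothetical helper)."""
    # `is None` rather than truthiness: top_k=0 is a value the caller set explicitly.
    if value is None:
        return
    if supports_sampling_params:
        optional_params[name] = value
        return
    # temperature=1 is the only value the restricted models still accept.
    if name == "temperature" and value == 1:
        optional_params[name] = value
        return
    if drop_params:
        return  # the caller opted in to silent dropping
    raise UnsupportedParamsError(
        f"{name}={value!r} is not supported by this model; "
        "set drop_params=True to drop it instead"
    )
```

With `supports_sampling_params=False`, `temperature=1` passes through, `top_p=0.9` is dropped when `drop_params=True`, and `top_k=0` raises without `drop_params` instead of silently disappearing.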
Greptile Summary

This backport brings three already-merged fixes onto
Confidence Score: 5/5

Safe to merge; all three cherry-picks adapt cleanly to this branch with no divergence beyond the documented conflict resolutions, and the targeted test suite shows 32 passing tests with no new failures. The model-map gating logic, CrowdStrike identity capture, and batch-file auth path are each covered by dedicated mock tests. The No files require special attention; the concerns already flagged in prior review threads (name-based fallback in
|
| Filename | Overview |
|---|---|
| litellm/llms/anthropic/common_utils.py | Added _supports_sampling_params, _apply_sampling_param, _model_map_lookup_candidates, _get_model_capability, and _supports_model_capability helpers; refactored _is_adaptive_thinking_model to use them. Model-map driven gating for Fable 5 / Opus 4.7/4.8 sampling params. |
| litellm/llms/anthropic/chat/transformation.py | Routes temperature/top_p through _apply_sampling_param; adds a top_k gating block at the transform_request boundary shared by Anthropic/Bedrock/Vertex/Azure paths. |
| litellm/llms/bedrock/chat/converse_transformation.py | Propagates drop_params through _prepare_request_params and _handle_top_k_value; fixes top_k=0 truthiness bug; delegates sampling-param gating to _apply_sampling_param. |
| litellm/proxy/guardrails/guardrail_hooks/crowdstrike_aidr/crowdstrike_aidr.py | Adds _merge_metadata_bags to merge both metadata and litellm_metadata bags; forwards user_id, model, and extra_info (with user_name) to the CrowdStrike AIDR payload. |
| litellm/proxy/hooks/batch_rate_limiter.py | Fixes false 403s by authorizing batch files using target_model_names decoded from the unified file ID instead of reverse-mapping body.model; removes the resolve_model_name_from_model_id call. |
| model_prices_and_context_window.json | Adds 8 Claude Fable 5 entries (Anthropic, Bedrock base + 3 geo profiles, Vertex AI + @default, Azure AI) and adds supports_sampling_params: false to Opus 4.7/4.8 entries. |
| tests/test_litellm/proxy/guardrails/guardrail_hooks/test_crowdstrike_aidr.py | Adds 5 new mock tests covering identity capture, empty extra_info, missing metadata, and dual-bag identity reads; also strengthens two existing tests with monkeypatch.delenv guards. |
| tests/test_litellm/proxy/hooks/test_batch_file_validation.py | Updates the existing auth alias test to pass target_model_names directly and assert resolve_model_name_from_model_id is never called; adds a parametrized regression canary for the multi-deployment false-403 scenario. |
| tests/test_litellm/test_claude_fable_5_config.py | New test file validating Fable 5 pricing, capabilities, cost-map parity between root and backup, adaptive thinking flag presence, and sampling-params flag across all variants. |
| tests/test_litellm/llms/anthropic/chat/test_anthropic_chat_transformation.py | Appends 137 lines of sampling-param gating tests (drop, raise, temperature=1 passthrough, model-map override, top_k boundary) with no modifications to existing tests. |
| tests/test_litellm/llms/bedrock/chat/test_converse_transformation.py | Adds converse-specific sampling-param and top_k gating tests, including a top_k=0 truthiness regression canary. |
Reviews (2): Last reviewed commit: "fix(proxy): authorize batch files using ..."

    model_lower = model.lower()
    return not any(
        v in model_lower
        for v in (
            "fable",
            "opus-4-7",
            "opus_4_7",
            "opus-4.7",
            "opus_4.7",
            "opus-4-8",
            "opus_4_8",
            "opus-4.8",
            "opus_4.8",
        )
    )

Hardcoded model-name fallback for sampling-params gating
The fallback block hard-codes "fable", "opus-4-7", "opus-4-8" (and their variant spellings) directly in source code. Per the project rule, capability flags should live exclusively in model_prices_and_context_window.json and be read through get_model_info/_get_model_capability, so that a future model addition only needs a JSON edit rather than a code change. Every affected entry in this PR already carries "supports_sampling_params": false in the map and _get_model_capability will find it (including after Bedrock/Vertex prefix-stripping via _model_map_lookup_candidates), making the name-match branch dead code for any model that exists in the map. The only gap left is a model whose map entry is entirely absent — in that case falling back to True (supported) is also safer than a false-positive reject.
Rule Used: What: Do not hardcode model-specific flags in the ... (source)
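The layering under discussion (map flag first, name match only as a fallback) can be sketched as follows. This is an assumption-laden illustration: `COST_MAP`, `lookup_candidates`, `get_capability`, and the prefix-stripping rule are simplified stand-ins, not the helpers added in `common_utils.py`.

```python
from typing import Dict, List, Optional

# Hypothetical, trimmed-down cost map; the real data lives in
# model_prices_and_context_window.json.
COST_MAP: Dict[str, Dict[str, object]] = {
    "claude-fable-5": {"supports_sampling_params": False},
    "claude-haiku-4-5": {},
}

# Name fragments consulted only when no map entry declares the flag.
RESTRICTED_NAME_FRAGMENTS = ("fable", "opus-4-7", "opus-4-8")


def lookup_candidates(model: str) -> List[str]:
    """Return the id as given, then with a leading provider prefix stripped."""
    candidates = [model]
    if "/" in model:
        candidates.append(model.split("/", 1)[1])
    return candidates


def get_capability(model: str, key: str) -> Optional[bool]:
    """Tri-state lookup: True/False when an entry declares the flag, None otherwise."""
    for candidate in lookup_candidates(model):
        entry = COST_MAP.get(candidate)
        if entry is not None:
            declared = entry.get(key)
            if isinstance(declared, bool):
                return declared
    return None


def supports_sampling_params(model: str) -> bool:
    declared = get_capability(model, "supports_sampling_params")
    if declared is not None:
        return declared  # the map wins whenever it declares the flag
    # Fallback for ids whose map entry is absent or predates the flag.
    lowered = model.lower()
    return not any(fragment in lowered for fragment in RESTRICTED_NAME_FRAGMENTS)
```

Under the reviewer's proposal, the last two lines of `supports_sampling_params` would simply become `return True`, so an id with no map entry is treated as supporting sampling params.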

          if _supports_factory(
              model=model,
    -         custom_llm_provider=None,
    -         key="supports_adaptive_thinking",
    +         custom_llm_provider="anthropic",
    +         key=key,
          ):
              return True
      except Exception:
          pass
      return AnthropicModelInfo._get_model_capability(model, key) is True

custom_llm_provider changed from None to "anthropic" in _supports_factory call
The original _is_adaptive_thinking_model called _supports_factory with custom_llm_provider=None, which allows the factory to resolve the model against all providers in the cost map. The refactored _supports_model_capability now passes custom_llm_provider="anthropic", which may cause _supports_factory to narrow its search to Anthropic-keyed entries and miss Bedrock- or Vertex AI-routed Claude 4.6/4.7 models (e.g., vertex_ai/claude-opus-4-6) that exist under a non-Anthropic provider key. The _get_model_capability fallback at line 388 recovers for most cases, but if _supports_factory had previously returned True for a model that _get_model_capability cannot resolve (e.g., a cross-region inference ARN with no cost-map entry), the name-based fallback _is_claude_4_6_model/_is_claude_4_7_model becomes the only safety net for adaptive-thinking detection.
(cherry picked from commit 6fc715c)
…gs (#29991) Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first (cherry picked from commit 1bbaf1c)
…T-3593) (#30009) * fix(proxy): authorize batch files using upload target_model_names (LIT-3593) After replace_model_in_jsonl, body.model is a stripped provider id. Reverse-mapping it via resolve_model_name_from_model_id is first-match on model_list and caused false 403s when multiple deployments share the same stripped name. Use target_model_names from the unified file id instead. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(proxy): restore resolve_model_name_from_model_id for JSONL fallback path (LIT-3593) Restores the reverse-lookup for the JSONL body.model fallback path so that legacy/pre-target_model_names managed files still map stripped provider IDs back to proxy aliases before auth. Also cleans up redundant `or None`. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Revert "fix(proxy): restore resolve_model_name_from_model_id for JSONL fallback path (LIT-3593)" This reverts commit 30d2e96. --------- Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> (cherry picked from commit 2cd7e87)
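The false 403 described in this commit message can be reproduced in miniature. The sketch below is illustrative only: `MODEL_LIST`, `reverse_map_first_match`, and `key_can_call` are hypothetical simplifications, not the proxy's actual data structures or functions.

```python
from typing import Dict, List, Optional

# Hypothetical proxy config: two aliases whose deployments share one
# stripped provider model id.
MODEL_LIST: List[Dict[str, str]] = [
    {"model_name": "team-a-gpt", "provider_model": "gpt-4o"},
    {"model_name": "team-b-gpt", "provider_model": "gpt-4o"},
]


def reverse_map_first_match(provider_model: str) -> Optional[str]:
    """Old approach: map a stripped provider id back to the first matching alias."""
    for deployment in MODEL_LIST:
        if deployment["provider_model"] == provider_model:
            return deployment["model_name"]
    return None


def key_can_call(allowed_models: List[str], requested: List[Optional[str]]) -> bool:
    """Simplified access check: every requested alias must be allowed for the key."""
    return all(name in allowed_models for name in requested)


key_models = ["team-b-gpt"]  # the key is scoped to the second alias only

# Old path: body.model is the stripped id, and first-match picks the wrong alias.
old_decision = key_can_call(key_models, [reverse_map_first_match("gpt-4o")])

# New path: the aliases recorded at upload time are used directly.
target_model_names = ["team-b-gpt"]
new_decision = key_can_call(key_models, target_model_names)
```

Here `old_decision` is `False` (the false 403) because the reverse lookup resolves to `team-a-gpt`, while `new_decision` is `True` because the upload alias is checked as-is.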

    ai_guard_payload["user_id"] = user_id

    extra_info: dict[str, str] = {}
    user_email = metadata.get("user_api_key_user_email")

Low: Spoofable guardrail user identity
metadata can include caller-supplied fields. If extra_info.user_name is populated from metadata.user_api_key_user_email, an authenticated caller can send that metadata key and make CrowdStrike AIDR receive an arbitrary user name. Only forward this value from internal auth-derived metadata, or omit it when the authenticated email is not available.
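For context, the dual-bag read this comment concerns can be sketched roughly as below. The function name, the bag order, and the first-value-wins precedence are assumptions made for illustration; the actual `_merge_metadata_bags` helper is not shown on this page, and this sketch does nothing to address the spoofing concern raised here.

```python
from collections.abc import Mapping
from typing import Any, Dict


def merge_metadata_bags(request_data: Mapping) -> Dict[str, Any]:
    """Combine `metadata` and `litellm_metadata` into one dict (hypothetical sketch).

    A bag that is missing, null, or not a mapping is skipped, so a present but
    null `litellm_metadata` no longer hides identity stored in `metadata`.
    """
    merged: Dict[str, Any] = {}
    # Assumed precedence: the first bag that supplies a key keeps it.
    for bag_name in ("metadata", "litellm_metadata"):
        bag = request_data.get(bag_name)
        if isinstance(bag, Mapping):
            for key, value in bag.items():
                merged.setdefault(key, value)
    return merged
```

Because the merged dict still mixes caller-supplied and auth-derived keys, a fix for the issue above would need to distinguish the two sources before forwarding `user_name`.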
PR overview

This pull request backports selected changes to the stable/1.85.x release branch, including updates around the CrowdStrike AIDR guardrail integration in the LiteLLM proxy. There is one open security concern involving how the CrowdStrike AIDR hook derives the user name it forwards from request metadata. An authenticated caller may be able to influence that forwarded identity value, which could affect downstream attribution or guardrail context, but the impact appears limited to spoofing that field rather than broader access or execution. No issues have been marked fixed or addressed yet.

Open issues (1)

Fixed/addressed: 0 · PR risk: 4/10
Relevant issues
Backports three changes to stable/1.85.x.

First, #30064 (Claude Fable 5 across Anthropic, Bedrock, Vertex AI, and Azure AI), cherry-picked from the squashed commit on litellm_internal_staging (e15b37a18e). Part of the Fable 5 backport set: v1.89.0-rc.2 (#30143), stable/1.88.x (#30144), stable/1.87.x (#30146), stable/1.86.x (#30148), and this line.

Second, the CrowdStrike AIDR identity capture plus its fix #29991, mirroring the stable/1.84.x backport #29994 that shipped in v1.84.6. Without these two commits, anyone upgrading from 1.84.6 to a 1.85.x release silently loses the user identity the guardrail sends to CrowdStrike; upgrades should never drop a working capability.

Third, #30009, which authorizes batch files using the upload target_model_names instead of reverse-mapping body.model, fixing false 403s when multiple deployments share the same stripped provider model name.

No version bump, per the release process.
What is included
In merge order on this branch:
- 9f45b538d8 (#30064) Claude Fable 5: all 8 model-map entries (Anthropic, Bedrock converse + global/us/eu profiles, Vertex AI + @default, Azure AI), BEDROCK_CONVERSE_MODELS registration, setup wizard entry, sampling-params gating (supports_sampling_params: false with drop_params support and clean UnsupportedParamsError), the top_k=0 truthiness fix, and the unit tests
- 79d96b740f feat(guardrails): capture user and model metadata in CrowdStrike AIDR; cherry-picked from stable/1.84.x 6fc715c5bd (#29994) because its staging counterpart lives inside the aggregator commit 32c88ca74f (#29932); content verified identical to the aggregator's crowdstrike hunks except the from collections.abc import Mapping import, which staging already had via #26658 and this line did not
- 1ef0ef45ec fix(guardrails): read CrowdStrike AIDR identity from both metadata bags; cherry-picked from staging 1bbaf1c39d (#29991)
- 38bfd32bff fix(proxy): authorize batch files using upload target_model_names; cherry-picked from staging 2cd7e87485 (#30009)

Conflict resolutions / branch adaptations
#30064 (unchanged from the original revision of this PR). Same shape as 1.87.x/1.86.x (#30146/#30148), with one difference: the reasoning-effort e2e grid does not exist on this line, so the two grid files from #30064 are not backported (11 files here vs 13 upstream). Model map JSONs applied entry-level; 8 Fable entries inserted verbatim from upstream (validated byte-identical), supports_sampling_params: false added to the 4.7-family entries that exist here (18 flagged total). Staging-only Opus 4.8 / jp.* entries not imported. constants.py / setup_wizard.py: Fable 5 lines only. common_utils.py resolved to upstream's final helper block. test_anthropic_chat_transformation.py: kept this branch's tail, appended exactly #30064's 137-line sampling/top_k test block.

#29991. The _merge_metadata_bags helper is anchored after CrowdStrikeAIDRGuardrailMissingSecrets because upstream anchors it after _extract_text_from_content, which is #26658 code absent on this line. The helper body and the call-site change are byte-identical to staging. The test hunk takes only the PR's own added test_apply_guardrail_reads_identity_from_either_metadata_bag; a neighboring test visible in the conflict region pre-exists only on staging and was not imported.

#30009. Applied cleanly; every patch difference is a context line because this line's _enforce_batch_file_model_access predates the team-scoped auth path and calls can_key_call_model directly. The fix's own hunks are unchanged: auth uses target_model_names decoded from the unified file id, and the resolve_model_name_from_model_id reverse-mapping is removed. Both helpers the pick references (get_models_from_unified_file_id, _is_base64_encoded_unified_file_id) exist on this line with the same signatures as staging.

Known noise on this line
None. The targeted baseline on the branch tip before the new picks (the two upstream-touched test files) was 22 passed, 0 failed.
Screenshots / Proof of Fix
The CrowdStrike picks were verified by replaying the reproducer from #29991 against a live proxy running this branch: real gpt-4o completions, a local HTTP sink standing in for the CrowdStrike AIDR endpoint recording exactly what the proxy sends, and a key bound to a user that has an email.
All three completions returned 200 with real model output. What the proxy sent to the CrowdStrike endpoint for every shape (on the unpatched line it sends no identity at all, and with only the feature commit the null and trace-dict shapes would drop it):

Proxy sanity on the same instance matches the pre-pick baseline captured on the branch tip: /health/liveliness, a real claude-haiku-4-5 completion, key generate plus a scoped-key completion, all green.

Targeted tests on the two upstream-touched files: 22 passed before the picks, 32 passed after, 0 new failures. The picks' own tests pass, including #30009's regression canary asserting can_key_call_model receives the upload alias and the reverse-lookup is never called. An adversarial audit of each pick against its upstream commit (diff fidelity, symbol resolution, ported tests, caller drift) found no divergence beyond the adaptation notes above.

Pre-Submission checklist

@greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type
🆕 New Feature (backport)
🐛 Bug Fix (backport)
Changes
Cherry-picks onto stable/1.85.x: #30064 (Fable 5), CrowdStrike AIDR metadata capture + #29991 (1.84.6 parity), #30009 (batch file auth via target_model_names). No version bump.