chore(release): backport #30064, #29991, #30009 to stable/1.86.x and cut 1.86.5 #30148
…#30064)

* Add Claude Fable 5 across Anthropic, Bedrock, Vertex AI, and Azure AI

Adds cost map entries for claude-fable-5 ($10/$50 per MTok, 1M context, 128K output, adaptive thinking only) on the Anthropic API, Bedrock converse (base, global, and us/eu geo inference profiles at the 10% regional premium), Vertex AI, and Azure AI (Microsoft Foundry, which serves Fable 5 with the full 1M context window unlike Opus 4.8). Registers anthropic.claude-fable-5 in BEDROCK_CONVERSE_MODELS, lists the model in the setup wizard, and extends the reasoning effort e2e grid. The Bedrock, Vertex, and Azure grid cells carry fail_reason markers until the CI accounts are provisioned: Bedrock needs the provider data sharing opt-in Fable 5 requires, and the Foundry resource needs a claude-fable-5 deployment. The first-party entry carries provider_specific_entry {us: 1.1} for the inference_geo premium and deliberately no fast multiplier since Fable 5 has no fast mode.

https://claude.ai/code/session_01MZarYYT3aS7DxaNjoax6Gm

* Drop removed sampling params for Claude 4.7+ when drop_params is set

Fable 5, Opus 4.7, and Opus 4.8 removed sampling params: the API rejects top_p, top_k, and any temperature other than 1 with a 400. LiteLLM was forwarding them even with drop_params enabled because the Anthropic and Bedrock converse transformations passed temperature/top_p through unconditionally. Mirror the GPT-5/o-series handling: temperature=1 still passes through, other values and any top_p are dropped when drop_params is set, and without drop_params a clean client-side UnsupportedParamsError tells the caller how to opt in, instead of surfacing the raw provider error.

https://claude.ai/code/session_01MZarYYT3aS7DxaNjoax6Gm

* Drive sampling param gating from the cost map and cover top_k

Greptile review follow-ups on the sampling param fix: the restriction for Fable 5 / Opus 4.7 / 4.8 is now declared as supports_sampling_params: false on every affected cost map entry (perplexity excluded; that route is OpenAI-compatible and maps sampling params upstream) and read back through a tri-state map lookup, keeping the name check only as a fallback for provider-routed ids whose hosted map entries predate the flag, the same layering supports_adaptive_thinking uses. top_k bypasses map_openai_params as a provider-specific kwarg, so it is gated at the shared AnthropicConfig.transform_request boundary (direct, Bedrock invoke, Vertex, Azure) and in the Bedrock converse _handle_top_k_value path, with drop_params threaded through the converse transform helpers. Also updates the reasoning effort grid cell count assertion for the four Fable 5 rows added on this branch (29 x 11 cells).

https://claude.ai/code/session_01MZarYYT3aS7DxaNjoax6Gm

* Declare supports_sampling_params in the cost map schema

The model map validation schema uses additionalProperties: false, so the new flag must be declared for the 28 entries that carry it; this was the one failing job (misc / Run tests) on the previous commit.

https://claude.ai/code/session_01MZarYYT3aS7DxaNjoax6Gm

* fix(bedrock): gate top_k=0 on converse to match Anthropic boundary

Truthiness check let top_k=0 silently disappear on models that removed sampling params, while AnthropicConfig.transform_request treats 0 as present and raises UnsupportedParamsError (or drops when drop_params is set). Switch to 'is not None' so converse, direct Anthropic, invoke, Vertex, and Azure all behave the same for top_k=0.

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
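The gating rule these commits describe can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: `apply_sampling_param`, its signature, and the locally defined `UnsupportedParamsError` are assumptions made for the example, and the real helper in `litellm/llms/anthropic/common_utils.py` may differ in shape.

```python
from typing import Any


class UnsupportedParamsError(Exception):
    """Local stand-in for the client-side error named in the commit message."""


def apply_sampling_param(
    optional_params: dict,
    name: str,
    value: Any,
    supports_sampling: bool,
    drop_params: bool,
) -> None:
    """Hypothetical helper: forward, drop, or reject one sampling param."""
    # 'is None' rather than truthiness, so top_k=0 counts as present.
    if value is None:
        return
    # Models that still accept sampling params forward everything;
    # temperature=1 is the one value the restricted models still accept.
    if supports_sampling or (name == "temperature" and value == 1):
        optional_params[name] = value
        return
    # Restricted model: drop silently only when the caller opted in.
    if drop_params:
        return
    raise UnsupportedParamsError(
        f"{name}={value!r} is not supported by this model; "
        "set drop_params=True to drop it instead"
    )
```

With `supports_sampling=False`, a call carrying `temperature=1` passes through, `top_p=0.9` is dropped when `drop_params` is set, and `top_k=0` raises when it is not.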
Greptile Summary
This PR backports four changes to

Confidence Score: 5/5
Safe to merge: all three backports carry their own mock-only tests, the touched paths are well-isolated, and the one pre-existing concern (hardcoded name fallback in

All changes are cherry-picks of already-merged staging commits with adaptation notes confirmed in the PR description. The Fable 5 sampling-param gating is correctly driven by the model-map flag (

No files require special attention.

| Filename | Overview |
|---|---|
| litellm/proxy/guardrails/guardrail_hooks/crowdstrike_aidr/crowdstrike_aidr.py | Adds _merge_metadata_bags and injects user identity + model into the CrowdStrike payload; Mapping is already imported, merge order (litellm_metadata wins) is correct for system-injected identity |
| litellm/proxy/hooks/batch_rate_limiter.py | Replaces buggy first-match reverse-lookup with upload-time target_model_names from unified file ID; correctly falls back to JSONL scan for non-managed files via target_model_names or None |
| litellm/llms/anthropic/common_utils.py | Adds _supports_sampling_params, _apply_sampling_param, _get_model_capability, and _model_map_lookup_candidates; primary path driven by model-map flag, name fallback handled as documented |
| litellm/llms/anthropic/chat/transformation.py | Routes temperature/top_p through _apply_sampling_param in map_openai_params and gates top_k at the transform_request boundary shared by Anthropic/Bedrock/Vertex/Azure paths |
| litellm/llms/bedrock/chat/converse_transformation.py | Propagates drop_params through _prepare_request_params and _handle_top_k_value; fixes silent top_k=0 drop (if val_top_k: → if val_top_k is not None:) |
| model_prices_and_context_window.json | Adds supports_sampling_params: false to existing Opus 4.7/4.8 entries and new Bedrock Fable 5 regional entries (eu/us/global/anthropic prefixes) with full capability flags |
| tests/test_litellm/proxy/guardrails/guardrail_hooks/test_crowdstrike_aidr.py | Adds 4 new identity-injection tests and a parametrized 4-case test for dual-bag reads; existing tests hardened with monkeypatch.delenv to prevent env-var leakage between runs |
| tests/test_litellm/proxy/hooks/test_batch_file_validation.py | Updates existing proxy-alias test to pass target_model_names directly and asserts resolve_model_name_from_model_id is no longer called; adds new parametrized multi-deployment test covering the LIT-3593 bug |
| litellm/constants.py | Adds anthropic.claude-fable-5 to BEDROCK_CONVERSE_MODELS |
Reviews (2): Last reviewed commit: "chore: refresh uv.lock for 1.86.5"
        return not any(
            v in model_lower
            for v in (
                "fable",
                "opus-4-7",
                "opus_4_7",
                "opus-4.7",
                "opus_4.7",
                "opus-4-8",
                "opus_4_8",
                "opus-4.8",
                "opus_4.8",
            )
        )

    @staticmethod
    def _apply_sampling_param(
Hardcoded model-name fallback in _supports_sampling_params
The team's convention is that model-specific capability flags belong exclusively in model_prices_and_context_window.json, read via get_model_info. The fallback name list here ("fable", "opus-4-7", "opus-4-8", …) diverges from that pattern: a future model that removes sampling params but uses an unfamiliar name would silently pass through until the hardcoded list is extended, while a model whose name accidentally contains "fable" would be incorrectly gated even if it supports sampling params.
The primary map-driven path (_get_model_capability) already handles all the newly-added entries correctly. Removing the name fallback and relying solely on the map would be the safest long-term approach — any caller on an old map that lacks supports_sampling_params would transparently forward params rather than silently dropping them, which is the safe default.
Rule Used: What: Do not hardcode model-specific flags in the ... (source)
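To make the trade-off in this comment concrete, here is a small sketch of a tri-state lookup with and without the name fallback. It assumes a toy in-memory map; `MODEL_MAP`, `get_capability`, the `use_name_fallback` switch, and the model ids other than `claude-fable-5` are hypothetical and do not reflect the actual helpers in the PR.

```python
from typing import Optional

# Hypothetical excerpt of a cost map; only the flag under discussion is shown.
MODEL_MAP = {
    "claude-fable-5": {"supports_sampling_params": False},
    "claude-older": {},  # entry that predates the flag
}

# Shortened version of the substring list quoted in the review.
NAME_FALLBACK = ("fable", "opus-4-7", "opus-4-8")


def get_capability(model: str, key: str) -> Optional[bool]:
    """Tri-state lookup: True/False when declared, None when unknown."""
    entry = MODEL_MAP.get(model)
    if entry is None or key not in entry:
        return None
    return bool(entry[key])


def supports_sampling_params(model: str, use_name_fallback: bool = True) -> bool:
    declared = get_capability(model, "supports_sampling_params")
    if declared is not None:
        return declared  # the map always wins when it has an answer
    if use_name_fallback:
        lowered = model.lower()
        return not any(v in lowered for v in NAME_FALLBACK)
    return True  # map-only behaviour: unknown models forward params
```

In this sketch a provider-routed id such as `some-route/claude-fable-5` is gated only when the fallback is on, while an unrelated id that happens to contain "fable" is gated as well, which is the false positive the reviewer describes.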
(cherry picked from commit 6fc715c)
…gs (#29991)

Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first.

(cherry picked from commit 1bbaf1c)
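A rough sketch of the dual-bag read described above, assuming the merge simply layers `litellm_metadata` over `metadata` (the review table notes that `litellm_metadata` wins). The function name `merge_metadata_bags` and the sample keys are illustrative; the real `_merge_metadata_bags` may handle more cases.

```python
from collections.abc import Mapping
from typing import Any, Optional


def merge_metadata_bags(
    metadata: Optional[Any], litellm_metadata: Optional[Any]
) -> dict:
    """Hypothetical merge: read both bags, later bag overrides earlier."""
    merged: dict = {}
    for bag in (metadata, litellm_metadata):
        # A null or non-mapping bag is skipped instead of hiding the other one.
        if isinstance(bag, Mapping):
            merged.update(bag)
    return merged


# The bug case: identity lives in metadata, litellm_metadata is present but null.
request = {"metadata": {"user_id": "u-123"}, "litellm_metadata": None}
identity = merge_metadata_bags(request["metadata"], request["litellm_metadata"])
```

A single-bag read that picks `litellm_metadata` whenever the field is present would return nothing useful for this request, whereas the merged view still carries `user_id`.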
…T-3593) (#30009)

* fix(proxy): authorize batch files using upload target_model_names (LIT-3593)

After replace_model_in_jsonl, body.model is a stripped provider id. Reverse-mapping it via resolve_model_name_from_model_id is first-match on model_list and caused false 403s when multiple deployments share the same stripped name. Use target_model_names from the unified file id instead.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(proxy): restore resolve_model_name_from_model_id for JSONL fallback path (LIT-3593)

Restores the reverse-lookup for the JSONL body.model fallback path so that legacy/pre-target_model_names managed files still map stripped provider IDs back to proxy aliases before auth. Also cleans up redundant `or None`.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Revert "fix(proxy): restore resolve_model_name_from_model_id for JSONL fallback path (LIT-3593)"

This reverts commit 30d2e96.

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

(cherry picked from commit 2cd7e87)
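The false 403 can be reproduced in miniature. The sketch below assumes a `model_list` in the usual alias-plus-`litellm_params` shape; the aliases, the provider id `gpt-x`, and the helper names are made up for illustration and are not the functions touched by the PR.

```python
from typing import Iterable, Optional

# Two deployments whose provider ids strip down to the same name (hypothetical).
MODEL_LIST = [
    {"model_name": "team-a-gpt", "litellm_params": {"model": "openai/gpt-x"}},
    {"model_name": "team-b-gpt", "litellm_params": {"model": "openai/gpt-x"}},
]


def first_match_alias(stripped_id: str) -> Optional[str]:
    """Old behaviour in spirit: return the first alias whose id matches."""
    for deployment in MODEL_LIST:
        provider_id = deployment["litellm_params"]["model"].split("/", 1)[-1]
        if provider_id == stripped_id:
            return deployment["model_name"]
    return None


def is_authorized(allowed: set, requested: Iterable[str]) -> bool:
    """A key may use the file only if every requested alias is allowed."""
    return all(name in allowed for name in requested)


key_models = {"team-b-gpt"}

# Reverse lookup resolves to team-a-gpt, so a team-b key is wrongly rejected.
via_reverse_lookup = is_authorized(key_models, [first_match_alias("gpt-x")])

# The aliases recorded at upload time identify the deployment unambiguously.
via_target_model_names = is_authorized(key_models, ["team-b-gpt"])
```

Here `via_reverse_lookup` is `False` while `via_target_model_names` is `True`, which mirrors the multi-deployment case the commit message describes.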
Relevant issues
Backports #30064 (Claude Fable 5 across Anthropic, Bedrock, Vertex AI, and Azure AI), #29991 (CrowdStrike AIDR identity fix) together with its prerequisite identity-capture feature, and #30009 (batch file authorization via upload target_model_names) to stable/1.86.x, and cuts 1.86.5. Tag v1.86.4 and branch release/v1.86.4 both exist and point at this line's tip, so the tip version has shipped and the branch moves to 1.86.5 as the patch-to-be.

The CrowdStrike pair is an upgrade-monotonicity backport: stable/1.84.x and stable/1.85.x already carry both the identity-capture feature and the #29991 fix, so a customer upgrading from those lines to 1.86.x would silently lose CrowdStrike AIDR identity attribution. This replicates the stable/1.85.x treatment exactly: the feature is cherry-picked from the stable/1.84.x commit 6fc715c5bd (its staging counterpart exists only inside aggregator commit 32c88ca74f, content-verified against the aggregator's hunks), and the fix from the staging squash of #29991. The same gap exists on stable/1.87.x and stable/1.88.x.

What is included
In merge order on top of the original #30064 pick (59ae344f9e, cherry-picked from staging e15b37a18e):

- 1038a68f34 feat(guardrails): capture user and model metadata in CrowdStrike AIDR (from stable/1.84.x 6fc715c5bd)
- 2d2fb8c131 fix(guardrails): read CrowdStrike AIDR identity from both metadata bags (#29991, from staging 1bbaf1c39dda)
- 5cea68f3ae fix(proxy): authorize batch files using upload target_model_names (#30009, from staging 2cd7e874859e)
- 400f1786cc bump: version 1.86.4 -> 1.86.5
- 0fd4cdf74e chore: refresh uv.lock for 1.86.5

Adaptation notes
All three picks carry `-x` footers. None are byte-verbatim because this line's files drifted; the divergences are:

- Identity feature (1038a68f34): the `from collections.abc import Mapping` line is dropped because this branch already imports Mapping; the `ai_guard_payload` hunk resolves to `ai_guard_payload: dict[str, Any] = {"guard_input": guard_input.model_dump(mode="json"), ...}` because this branch carries the multimodal guard-input rework, which matches staging's final shape at that site byte for byte; the new tests are appended at the test file tail (this branch has newer tests mid-file). The appended test block is byte-identical to the upstream block.
- #29991 (2d2fb8c131): source hunks are identical to the staging commit; the only divergence is the appended parametrized test landing at the test file tail instead of mid-file.
- #30009 (5cea68f3ae): the +/- hunk lines are byte-identical to the staging commit (empty interdiff, same 19/10 numstat); only diff context differs because this line's `_enforce_batch_file_model_access` predates staging's team-aware access checks. The buggy reverse-mapping block the fix removes (`resolve_model_name_from_model_id` first-match lookup) was present on this line in the same shape, and `get_models_from_unified_file_id` is identical on this line and staging.

Known noise on this line
None. The targeted baseline (both touched test files, captured on the line tip before any pick) was 23/23 green.
Screenshots / Proof of Fix
Targeted tests went from 23 passed (baseline) to 33 passed (post-pick, zero new failures); the 10 new tests are the ones the picks carry. Ruff, mypy, and Black are clean on the touched modules. Gauntlet adversarial verification (deep): SURVIVED on all four sub-claims (diff fidelity vs source commits, symbol resolution, picked tests passing, no broken on-tree caller), 10 agents, zero refutations.
Live proxy on this branch (worktree proxy, port 4002), baseline and post-pick:
CrowdStrike reproducer replay (the #29991 bug case). A local sink stood in for the CrowdStrike endpoint, the guardrail was registered via config, and a real user and key were created so authenticated identity flows through /chat/completions:

Before these picks this line's guardrail had no identity capture at all, so these fields could never reach CrowdStrike; with only the feature and not #29991, the null and dict variants would drop them. #30009 has no HTTP reproducer in its PR body; its behavioral proof is the four batch-authorization tests it carries, which pass on this branch.
Pre-Submission checklist
Type
🆕 New Feature (backport)
🐛 Bug Fix (backport)
Changes
Cherry-picks onto stable/1.86.x as listed above, plus the 1.86.5 version bump and uv.lock refresh