chore(release): backport #30064, #29991, #30009 to stable/1.85.x #30149

Merged
yuneng-berri merged 4 commits into stable/1.85.x from litellm_fable5_stable_1_85_x
Jun 11, 2026

@mateo-berri mateo-berri commented Jun 10, 2026

Collaborator

Relevant issues

Backports three changes to stable/1.85.x.

First, #30064 (Claude Fable 5 across Anthropic, Bedrock, Vertex AI, and Azure AI), cherry-picked from the squashed commit on litellm_internal_staging (e15b37a18e). Part of the Fable 5 backport set: v1.89.0-rc.2 (#30143), stable/1.88.x (#30144), stable/1.87.x (#30146), stable/1.86.x (#30148), and this line.

Second, the CrowdStrike AIDR identity capture plus its fix #29991, mirroring the stable/1.84.x backport #29994 that shipped in v1.84.6. Without these two commits, anyone upgrading from 1.84.6 to a 1.85.x release silently loses the user identity the guardrail sends to CrowdStrike; upgrades should never drop a working capability.

Third, #30009, which authorizes batch files using the upload target_model_names instead of reverse-mapping body.model, fixing false 403s when multiple deployments share the same stripped provider model name.

No version bump, per the release process.

What is included

In merge order on this branch:

Conflict resolutions / branch adaptations

#30064 (unchanged from the original revision of this PR). Same shape as 1.87.x/1.86.x (#30146/#30148), with one difference: the reasoning-effort e2e grid does not exist on this line, so the two grid files from #30064 are not backported (11 files here vs 13 upstream). Model map JSONs applied entry-level; 8 Fable entries inserted verbatim from upstream (validated byte-identical), supports_sampling_params: false added to the 4.7-family entries that exist here (18 flagged total). Staging-only Opus 4.8 / jp.* entries not imported. constants.py / setup_wizard.py: Fable 5 lines only. common_utils.py resolved to upstream's final helper block. test_anthropic_chat_transformation.py: kept this branch's tail, appended exactly #30064's 137-line sampling/top_k test block.

#29991. The _merge_metadata_bags helper is anchored after CrowdStrikeAIDRGuardrailMissingSecrets because upstream anchors it after _extract_text_from_content, which is #26658 code absent on this line. The helper body and the call-site change are byte-identical to staging. The test hunk takes only the PR's own added test_apply_guardrail_reads_identity_from_either_metadata_bag; a neighboring test visible in the conflict region pre-exists only on staging and was not imported.

#30009. Applied cleanly; every patch difference is a context line because this line's _enforce_batch_file_model_access predates the team-scoped auth path and calls can_key_call_model directly. The fix's own hunks are unchanged: auth uses target_model_names decoded from the unified file id, and the resolve_model_name_from_model_id reverse-mapping is removed. Both helpers the pick references (get_models_from_unified_file_id, _is_base64_encoded_unified_file_id) exist on this line with the same signatures as staging.
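The false 403 that #30009 removes can be reproduced with a toy model list. The sketch below is illustrative only: `MODEL_LIST`, `strip_provider`, `reverse_map_first_match`, and `key_can_call` are hypothetical stand-ins rather than the proxy's actual helpers, and the two aliases are made up.

```python
from typing import Optional

# Hypothetical model list: two proxy aliases backed by the same provider model.
MODEL_LIST = [
    {"model_name": "team-a-gpt", "litellm_params": {"model": "azure/gpt-4o"}},
    {"model_name": "team-b-gpt", "litellm_params": {"model": "azure/gpt-4o"}},
]


def strip_provider(model: str) -> str:
    """Drop the 'provider/' prefix, e.g. 'azure/gpt-4o' -> 'gpt-4o'."""
    return model.split("/", 1)[-1]


def reverse_map_first_match(stripped: str) -> Optional[str]:
    """Old approach (sketched): return the first alias whose stripped model matches."""
    for deployment in MODEL_LIST:
        if strip_provider(deployment["litellm_params"]["model"]) == stripped:
            return deployment["model_name"]
    return None


def key_can_call(allowed: set[str], models: list[str]) -> bool:
    """A key is authorized only if every requested alias is in its allow-list."""
    return all(m in allowed for m in models)


key_allowed = {"team-b-gpt"}         # key scoped to the second alias only
body_model = "gpt-4o"                # stripped id left in the JSONL body
target_model_names = ["team-b-gpt"]  # alias recorded when the file was uploaded

# Reverse mapping resolves to 'team-a-gpt' (first match), so auth fails: a false 403.
resolved = reverse_map_first_match(body_model) or body_model
old_ok = key_can_call(key_allowed, [resolved])

# Authorizing against the upload-time alias succeeds.
new_ok = key_can_call(key_allowed, target_model_names)

print(old_ok, new_ok)  # False True
```

The point of the sketch is that the stripped id is ambiguous once two deployments share it, whereas the alias recorded at upload time is not.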

Known noise on this line

None. The targeted baseline on the branch tip before the new picks (the two upstream-touched test files) was 22 passed, 0 failed.

Screenshots / Proof of Fix

The CrowdStrike picks were verified by replaying the reproducer from #29991 against a live proxy running this branch: real gpt-4o completions, a local HTTP sink standing in for the CrowdStrike AIDR endpoint recording exactly what the proxy sends, and a key bound to a user that has an email.

KEY=...   # key bound to user cs-e2e-user (email alice@example.com)
for BODY in \
  '{"model":"gpt-4o","messages":[{"role":"user","content":"hello"}],"max_tokens":5}' \
  '{"model":"gpt-4o","messages":[{"role":"user","content":"hello"}],"max_tokens":5,"litellm_metadata":null}' \
  '{"model":"gpt-4o","messages":[{"role":"user","content":"hello"}],"max_tokens":5,"litellm_metadata":{"trace_id":"t1"}}'; do
  curl -s http://localhost:4002/v1/chat/completions \
    -H "Authorization: Bearer $KEY" -H "Content-Type: application/json" -d "$BODY" -o /dev/null
done

All three completions returned 200 with real model output. What the proxy sent to the CrowdStrike endpoint for every shape (on the unpatched line it sends no identity at all, and with only the feature commit the null and trace-dict shapes would drop it):

/v1/guard_chat_completions | user_id: cs-e2e-user | extra_info: {'user_name': 'alice@example.com'}
/v1/guard_chat_completions | user_id: cs-e2e-user | extra_info: {'user_name': 'alice@example.com'}
/v1/guard_chat_completions | user_id: cs-e2e-user | extra_info: {'user_name': 'alice@example.com'}
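The three request shapes can also be modelled offline. The following is a minimal sketch of the behaviour described in #29991, not the body of `_merge_metadata_bags`: the function names, the merge order (here `metadata` wins on key collisions), and the `user_api_key_user_id` key are assumptions made for illustration; `user_api_key_user_email`, `user_id`, and `extra_info.user_name` are taken from the PR.

```python
from typing import Any


def single_bag_read(data: dict[str, Any]) -> dict[str, Any]:
    """Model of the pre-fix read: a present litellm_metadata field shadows metadata."""
    if "litellm_metadata" in data:
        return data["litellm_metadata"] or {}
    return data.get("metadata") or {}


def merged_bag_read(data: dict[str, Any]) -> dict[str, Any]:
    """Model of the fix: merge both bags, skipping any that is missing or not a dict."""
    merged: dict[str, Any] = {}
    for bag_name in ("litellm_metadata", "metadata"):  # later bag wins (assumption)
        bag = data.get(bag_name)
        if isinstance(bag, dict):
            merged.update(bag)
    return merged


def identity_fields(bag: dict[str, Any]) -> dict[str, Any]:
    """Build the identity part of the guardrail payload from one metadata bag."""
    payload: dict[str, Any] = {}
    if bag.get("user_api_key_user_id"):  # key name is an assumption
        payload["user_id"] = bag["user_api_key_user_id"]
    if bag.get("user_api_key_user_email"):
        payload["extra_info"] = {"user_name": bag["user_api_key_user_email"]}
    return payload


auth_metadata = {
    "user_api_key_user_id": "cs-e2e-user",
    "user_api_key_user_email": "alice@example.com",
}
shapes = [
    {"metadata": dict(auth_metadata)},
    {"metadata": dict(auth_metadata), "litellm_metadata": None},
    {"metadata": dict(auth_metadata), "litellm_metadata": {"trace_id": "t1"}},
]

for shape in shapes:
    before = identity_fields(single_bag_read(shape))
    after = identity_fields(merged_bag_read(shape))
    print(before, "->", after)
```

In this model only the first shape carries identity under the single-bag read, while all three carry it once both bags are merged.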

Proxy sanity on the same instance matches the pre-pick baseline captured on the branch tip: /health/liveliness, a real claude-haiku-4-5 completion, key generate plus a scoped-key completion, all green.

Targeted tests on the two upstream-touched files: 22 passed before the picks, 32 passed after, 0 new failures. The picks' own tests pass, including #30009's regression canary asserting can_key_call_model receives the upload alias and the reverse-lookup is never called. An adversarial audit of each pick against its upstream commit (diff fidelity, symbol resolution, ported tests, caller drift) found no divergence beyond the adaptation notes above.

Pre-Submission checklist

  • The cherry-picked PRs carry their own tests
  • My PR passes the targeted unit tests on this line (32 passed, 0 new failures vs baseline)
  • Scope is limited to backporting already-merged changes
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type

🆕 New Feature (backport)
🐛 Bug Fix (backport)

Changes

Cherry-picks onto stable/1.85.x: #30064 (Fable 5), CrowdStrike AIDR metadata capture + #29991 (1.84.6 parity), #30009 (batch file auth via target_model_names). No version bump

…#30064)

* Add Claude Fable 5 across Anthropic, Bedrock, Vertex AI, and Azure AI

Adds cost map entries for claude-fable-5 ($10/$50 per MTok, 1M context,
128K output, adaptive thinking only) on the Anthropic API, Bedrock
converse (base, global, and us/eu geo inference profiles at the 10%
regional premium), Vertex AI, and Azure AI (Microsoft Foundry, which
serves Fable 5 with the full 1M context window unlike Opus 4.8).

Registers anthropic.claude-fable-5 in BEDROCK_CONVERSE_MODELS, lists the
model in the setup wizard, and extends the reasoning effort e2e grid.
The Bedrock, Vertex, and Azure grid cells carry fail_reason markers
until the CI accounts are provisioned: Bedrock needs the provider data
sharing opt-in Fable 5 requires, and the Foundry resource needs a
claude-fable-5 deployment.

The first-party entry carries provider_specific_entry {us: 1.1} for the
inference_geo premium and deliberately no fast multiplier since Fable 5
has no fast mode.

https://claude.ai/code/session_01MZarYYT3aS7DxaNjoax6Gm

* Drop removed sampling params for Claude 4.7+ when drop_params is set

Fable 5, Opus 4.7, and Opus 4.8 removed sampling params: the API rejects
top_p, top_k, and any temperature other than 1 with a 400. LiteLLM was
forwarding them even with drop_params enabled because the Anthropic and
Bedrock converse transformations passed temperature/top_p through
unconditionally.

Mirror the GPT-5/o-series handling: temperature=1 still passes through,
other values and any top_p are dropped when drop_params is set, and
without drop_params a clean client-side UnsupportedParamsError tells the
caller how to opt in, instead of surfacing the raw provider error.

https://claude.ai/code/session_01MZarYYT3aS7DxaNjoax6Gm

* Drive sampling param gating from the cost map and cover top_k

Greptile review follow-ups on the sampling param fix: the restriction for
Fable 5 / Opus 4.7 / 4.8 is now declared as supports_sampling_params: false
on every affected cost map entry (perplexity excluded; that route is
OpenAI-compatible and maps sampling params upstream) and read back through
a tri-state map lookup, keeping the name check only as a fallback for
provider-routed ids whose hosted map entries predate the flag, the same
layering supports_adaptive_thinking uses. top_k bypasses map_openai_params
as a provider-specific kwarg, so it is gated at the shared
AnthropicConfig.transform_request boundary (direct, Bedrock invoke, Vertex,
Azure) and in the Bedrock converse _handle_top_k_value path, with
drop_params threaded through the converse transform helpers.

Also updates the reasoning effort grid cell count assertion for the four
Fable 5 rows added on this branch (29 x 11 cells).

https://claude.ai/code/session_01MZarYYT3aS7DxaNjoax6Gm

* Declare supports_sampling_params in the cost map schema

The model map validation schema uses additionalProperties: false, so the
new flag must be declared for the 28 entries that carry it; this was the
one failing job (misc / Run tests) on the previous commit.

https://claude.ai/code/session_01MZarYYT3aS7DxaNjoax6Gm

* fix(bedrock): gate top_k=0 on converse to match Anthropic boundary

Truthiness check let top_k=0 silently disappear on models that removed
sampling params, while AnthropicConfig.transform_request treats 0 as
present and raises UnsupportedParamsError (or drops when drop_params is
set). Switch to 'is not None' so converse, direct Anthropic, invoke,
Vertex, and Azure all behave the same for top_k=0.

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
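
The gating rule these commits describe (temperature=1 passes through, other sampling values are dropped under drop_params, and otherwise a client-side error is raised) can be sketched as below. This is a simplified illustration under stated assumptions, not the `_apply_sampling_param` helper itself: the function signature is hypothetical, and `UnsupportedParamsError` is redefined locally so the snippet runs on its own.

```python
from typing import Any


class UnsupportedParamsError(Exception):
    """Local stand-in for the client-side error named in the commit message."""


def apply_sampling_param(
    optional_params: dict[str, Any],
    param: str,
    value: Any,
    *,
    supports_sampling_params: bool,
    drop_params: bool,
) -> None:
    """Forward, drop, or reject one sampling param."""
    # Models that still accept sampling params, and temperature=1 everywhere,
    # pass straight through.
    if supports_sampling_params or (param == "temperature" and value == 1):
        optional_params[param] = value
        return
    # Restricted model and the caller opted in to dropping: omit the param.
    if drop_params:
        return
    # Restricted model without the opt-in: fail before calling the provider.
    raise UnsupportedParamsError(
        f"{param}={value!r} is not supported by this model; "
        "set drop_params=True to drop it instead"
    )


params: dict[str, Any] = {}
apply_sampling_param(
    params, "temperature", 1, supports_sampling_params=False, drop_params=False
)
apply_sampling_param(
    params, "top_p", 0.9, supports_sampling_params=False, drop_params=True
)
print(params)  # {'temperature': 1}
```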
@codecov

codecov Bot commented Jun 10, 2026

@mateo-berri mateo-berri marked this pull request as ready for review June 10, 2026 22:14
@mateo-berri mateo-berri requested a review from a team June 10, 2026 22:14
@greptile-apps

greptile-apps Bot commented Jun 10, 2026

Contributor

Greptile Summary

This backport brings three already-merged fixes onto stable/1.85.x: Claude Fable 5 model support across all four providers, CrowdStrike AIDR user identity capture (1.84.6 parity), and a batch-file authorization fix that eliminates false 403s when multiple deployments share a stripped provider model name.

Confidence Score: 5/5

Safe to merge; all three cherry-picks adapt cleanly to this branch with no divergence beyond the documented conflict resolutions, and the targeted test suite shows 32 passing tests with no new failures.

The model-map gating logic, CrowdStrike identity capture, and batch-file auth path are each covered by dedicated mock tests. The _apply_sampling_param helper is driven by cost-map flags with name-based fallbacks; edge cases (top_k=0, temperature=1, provider-prefixed model IDs) all have explicit test coverage. The _merge_metadata_bags merge order is validated across four parametrized shapes matching the live reproducer. The batch auth simplification correctly scopes the resolve_model_name_from_model_id removal to managed files and leaves non-managed files on the existing body.model fallback path.
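
The top_k=0 edge case mentioned above comes down to the difference between a truthiness check and an explicit `is not None` check. A tiny sketch with hypothetical function names:

```python
from typing import Optional


def present_by_truthiness(top_k: Optional[int]) -> bool:
    """Treats 0 as absent, so top_k=0 would silently skip the gate."""
    return bool(top_k)


def present_by_identity(top_k: Optional[int]) -> bool:
    """Treats 0 as present, so top_k=0 reaches the gate like any other value."""
    return top_k is not None


for value in (None, 0, 5):
    print(value, present_by_truthiness(value), present_by_identity(value))
# None False False
# 0 False True
# 5 True True
```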

No files require special attention; the concerns already flagged in prior review threads (name-based fallback in _supports_sampling_params, custom_llm_provider="anthropic" narrowing in _supports_model_capability) are pre-existing threads the author is already tracking.

Important Files Changed

| Filename | Overview |
| --- | --- |
| litellm/llms/anthropic/common_utils.py | Added _supports_sampling_params, _apply_sampling_param, _model_map_lookup_candidates, _get_model_capability, and _supports_model_capability helpers; refactored _is_adaptive_thinking_model to use them. Model-map driven gating for Fable 5 / Opus 4.7/4.8 sampling params. |
| litellm/llms/anthropic/chat/transformation.py | Routes temperature/top_p through _apply_sampling_param; adds a top_k gating block at the transform_request boundary shared by Anthropic/Bedrock/Vertex/Azure paths. |
| litellm/llms/bedrock/chat/converse_transformation.py | Propagates drop_params through _prepare_request_params and _handle_top_k_value; fixes top_k=0 truthiness bug; delegates sampling-param gating to _apply_sampling_param. |
| litellm/proxy/guardrails/guardrail_hooks/crowdstrike_aidr/crowdstrike_aidr.py | Adds _merge_metadata_bags to merge both metadata and litellm_metadata bags; forwards user_id, model, and extra_info (with user_name) to the CrowdStrike AIDR payload. |
| litellm/proxy/hooks/batch_rate_limiter.py | Fixes false 403s by authorizing batch files using target_model_names decoded from the unified file ID instead of reverse-mapping body.model; removes the resolve_model_name_from_model_id call. |
| model_prices_and_context_window.json | Adds 8 Claude Fable 5 entries (Anthropic, Bedrock base + 3 geo profiles, Vertex AI + @default, Azure AI) and adds supports_sampling_params: false to Opus 4.7/4.8 entries. |
| tests/test_litellm/proxy/guardrails/guardrail_hooks/test_crowdstrike_aidr.py | Adds 5 new mock tests covering identity capture, empty extra_info, missing metadata, and dual-bag identity reads; also strengthens two existing tests with monkeypatch.delenv guards. |
| tests/test_litellm/proxy/hooks/test_batch_file_validation.py | Updates the existing auth alias test to pass target_model_names directly and assert resolve_model_name_from_model_id is never called; adds a parametrized regression canary for the multi-deployment false-403 scenario. |
| tests/test_litellm/test_claude_fable_5_config.py | New test file validating Fable 5 pricing, capabilities, cost-map parity between root and backup, adaptive thinking flag presence, and sampling-params flag across all variants. |
| tests/test_litellm/llms/anthropic/chat/test_anthropic_chat_transformation.py | Appends 137 lines of sampling-param gating tests (drop, raise, temperature=1 passthrough, model-map override, top_k boundary) with no modifications to existing tests. |
| tests/test_litellm/llms/bedrock/chat/test_converse_transformation.py | Adds converse-specific sampling-param and top_k gating tests, including a top_k=0 truthiness regression canary. |

Reviews (2): Last reviewed commit: "fix(proxy): authorize batch files using ..."

Comment on lines +288 to +302
    model_lower = model.lower()
    return not any(
        v in model_lower
        for v in (
            "fable",
            "opus-4-7",
            "opus_4_7",
            "opus-4.7",
            "opus_4.7",
            "opus-4-8",
            "opus_4_8",
            "opus-4.8",
            "opus_4.8",
        )
    )

P2 Hardcoded model-name fallback for sampling-params gating

The fallback block hard-codes "fable", "opus-4-7", "opus-4-8" (and their variant spellings) directly in source code. Per the project rule, capability flags should live exclusively in model_prices_and_context_window.json and be read through get_model_info/_get_model_capability, so that a future model addition only needs a JSON edit rather than a code change. Every affected entry in this PR already carries "supports_sampling_params": false in the map and _get_model_capability will find it (including after Bedrock/Vertex prefix-stripping via _model_map_lookup_candidates), making the name-match branch dead code for any model that exists in the map. The only gap left is a model whose map entry is entirely absent — in that case falling back to True (supported) is also safer than a false-positive reject.

Rule Used: What: Do not hardcode model-specific flags in the ... (source)
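
For context, the layering under discussion (a tri-state cost-map lookup first, a name check only when the map is silent) can be sketched as follows. This is an illustration with a hypothetical two-entry `COST_MAP` and hypothetical function names, not the code in common_utils.py; the fragment list is shortened from the one quoted above.

```python
from typing import Optional

# Hypothetical, trimmed-down cost map; real entries carry many more fields.
COST_MAP: dict[str, dict[str, bool]] = {
    "claude-fable-5": {"supports_sampling_params": False},
    "claude-haiku-4-5": {},
}

# Shortened version of the name fragments in the reviewed fallback.
RESTRICTED_NAME_FRAGMENTS = ("fable", "opus-4-7", "opus-4-8")


def map_capability(model: str, key: str) -> Optional[bool]:
    """Tri-state lookup: True/False if the map declares the flag, None if silent."""
    entry = COST_MAP.get(model)
    if entry is None or key not in entry:
        return None
    return bool(entry[key])


def supports_sampling_params(model: str) -> bool:
    """Prefer the declared flag; fall back to the name check only when it is absent."""
    declared = map_capability(model, "supports_sampling_params")
    if declared is not None:
        return declared
    model_lower = model.lower()
    return not any(fragment in model_lower for fragment in RESTRICTED_NAME_FRAGMENTS)


print(supports_sampling_params("claude-fable-5"))         # False, from the map
print(supports_sampling_params("claude-haiku-4-5"))       # True, map silent, no name match
print(supports_sampling_params("custom/claude-fable-5"))  # False, name fallback
```

Under the reviewer's suggestion, the final `return` in `supports_sampling_params` would become `return True`, so only the map could mark a model as restricted.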

Comment on lines 380 to +388
         if _supports_factory(
             model=model,
-            custom_llm_provider=None,
-            key="supports_adaptive_thinking",
+            custom_llm_provider="anthropic",
+            key=key,
         ):
             return True
     except Exception:
         pass
+    return AnthropicModelInfo._get_model_capability(model, key) is True

P2 custom_llm_provider changed from None to "anthropic" in _supports_factory call

The original _is_adaptive_thinking_model called _supports_factory with custom_llm_provider=None, which allows the factory to resolve the model against all providers in the cost map. The refactored _supports_model_capability now passes custom_llm_provider="anthropic", which may cause _supports_factory to narrow its search to Anthropic-keyed entries and miss Bedrock- or Vertex AI-routed Claude 4.6/4.7 models (e.g., vertex_ai/claude-opus-4-6) that exist under a non-Anthropic provider key. The _get_model_capability fallback at line 388 recovers for most cases, but if _supports_factory had previously returned True for a model that _get_model_capability cannot resolve (e.g., a cross-region inference ARN with no cost-map entry), the name-based fallback _is_claude_4_6_model/_is_claude_4_7_model becomes the only safety net for adaptive-thinking detection.

kenany and others added 3 commits June 10, 2026 17:12
…gs (#29991)

Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first

(cherry picked from commit 1bbaf1c)
…T-3593) (#30009)

* fix(proxy): authorize batch files using upload target_model_names (LIT-3593)

After replace_model_in_jsonl, body.model is a stripped provider id. Reverse-mapping it via resolve_model_name_from_model_id is first-match on model_list and caused false 403s when multiple deployments share the same stripped name. Use target_model_names from the unified file id instead.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(proxy): restore resolve_model_name_from_model_id for JSONL fallback path (LIT-3593)

Restores the reverse-lookup for the JSONL body.model fallback path so that
legacy/pre-target_model_names managed files still map stripped provider IDs
back to proxy aliases before auth. Also cleans up redundant `or None`.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Revert "fix(proxy): restore resolve_model_name_from_model_id for JSONL fallback path (LIT-3593)"

This reverts commit 30d2e96.

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
(cherry picked from commit 2cd7e87)
@yuneng-berri yuneng-berri changed the title from "chore(release): backport #30064 (Claude Fable 5) to stable/1.85.x" to "chore(release): backport #30064, #29991, #30009 to stable/1.85.x" Jun 11, 2026
@yuneng-berri

Collaborator

@greptileai

ai_guard_payload["user_id"] = user_id

extra_info: dict[str, str] = {}
user_email = metadata.get("user_api_key_user_email")

Low: Spoofable guardrail user identity

metadata can include caller-supplied fields. If extra_info.user_name is populated from metadata.user_api_key_user_email, an authenticated caller can send that metadata key and make CrowdStrike AIDR receive an arbitrary user name. Only forward this value from internal auth-derived metadata, or omit it when the authenticated email is not available.
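
One way to read this suggestion is to build the identity fields from values the proxy derived during authentication and never from the request body. The sketch below assumes such values are available in a separate object; `AuthContext` and `identity_from_auth` are hypothetical names, not part of the guardrail hook.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class AuthContext:
    """Hypothetical container for values the proxy derived from the API key."""

    user_id: Optional[str] = None
    user_email: Optional[str] = None


def identity_from_auth(auth: AuthContext) -> dict[str, Any]:
    """Build identity fields from auth-derived values only; omit unknown ones."""
    payload: dict[str, Any] = {}
    if auth.user_id:
        payload["user_id"] = auth.user_id
    if auth.user_email:
        payload["extra_info"] = {"user_name": auth.user_email}
    return payload


# Caller-supplied metadata is never consulted, so a spoofed email has no effect.
request_metadata = {"user_api_key_user_email": "mallory@example.com"}
auth = AuthContext(user_id="cs-e2e-user", user_email="alice@example.com")
print(identity_from_auth(auth))
# {'user_id': 'cs-e2e-user', 'extra_info': {'user_name': 'alice@example.com'}}
```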

@veria-ai

veria-ai Bot commented Jun 11, 2026

Contributor

PR overview

This pull request backports selected changes to the stable/1.85.x release branch, including updates around the CrowdStrike AIDR guardrail integration in the LiteLLM proxy.

There is one open security concern involving how the CrowdStrike AIDR hook derives the user name it forwards from request metadata. An authenticated caller may be able to influence that forwarded identity value, which could affect downstream attribution or guardrail context, but the impact appears limited to spoofing that field rather than broader access or execution. No issues have been marked fixed or addressed yet.

Open issues (1)

Fixed/addressed: 0 · PR risk: 4/10

@yuneng-berri yuneng-berri enabled auto-merge June 11, 2026 01:14
@yuneng-berri yuneng-berri merged commit 44623a6 into stable/1.85.x Jun 11, 2026
67 of 75 checks passed
@yuneng-berri yuneng-berri deleted the litellm_fable5_stable_1_85_x branch June 11, 2026 01:15