chore(release): backport five staged fixes into stable/1.87.x and cut 1.87.1 #29631
* fix(azure): preserve AD token refresh in v1 OpenAI client path
The /openai/v1/ code path (api_version in {"v1", "latest", "preview"})
constructs a plain OpenAI/AsyncOpenAI client, but only forwarded
`api_key` from `azure_client_params`. When `enable_azure_ad_token_refresh`
is set (or any AD-only auth), `api_key` is None and the client
constructor raised "The api_key client option must be set...", breaking
every Azure call with a v1 api_version.
The OpenAI SDK (>=2.20.0) accepts a callable for `api_key` and re-invokes
it on every request via `_refresh_api_key`, so we now forward
`azure_ad_token_provider` directly — preserving the per-request token
refresh behavior of the regular AzureOpenAI client and avoiding the
expiry hole that resolving the token once at client-creation time would
introduce. Static `azure_ad_token` strings fall through to `api_key`.
For the async path we wrap the sync provider returned by azure-identity
in an async function since AsyncOpenAI expects `Callable[[], Awaitable[str]]`.
Fixes #27945
https://claude.ai/code/session_01UnzrDSFUUgp5T2wRoPMxq5
* fix(azure): offload sync token provider to thread in v1 async wrapper
* fix(azure): include AD credential identity in v1 client cache key
---------
Co-authored-by: Claude <noreply@anthropic.com>
(cherry picked from commit 96a2e8b)
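The forwarding rule described in this commit can be sketched as follows. This is a minimal illustration, not the code in `common_utils.py`: `resolve_v1_api_key` and `make_async_token_provider` are hypothetical helper names, and the precedence of an explicit `api_key` over the token provider is an assumption. The thread offload follows the second commit in the series.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Union

SyncTokenProvider = Callable[[], str]
AsyncTokenProvider = Callable[[], Awaitable[str]]


def make_async_token_provider(provider: SyncTokenProvider) -> AsyncTokenProvider:
    """Wrap a blocking token provider so an async client can await it."""

    async def _async_provider() -> str:
        # Run the blocking call in a worker thread so the event loop stays free.
        return await asyncio.to_thread(provider)

    return _async_provider


def resolve_v1_api_key(
    api_key: Optional[str],
    azure_ad_token: Optional[str],
    azure_ad_token_provider: Optional[SyncTokenProvider],
    is_async: bool,
) -> Union[str, SyncTokenProvider, AsyncTokenProvider, None]:
    """Pick the value to pass as the client's `api_key` option (hypothetical helper)."""
    if api_key is not None:
        # Assumption: an explicit key wins over AD auth.
        return api_key
    if azure_ad_token_provider is not None:
        # Forward the callable itself so the token is re-resolved per request
        # instead of being frozen at client-creation time.
        if is_async:
            return make_async_token_provider(azure_ad_token_provider)
        return azure_ad_token_provider
    # A static AD token string falls through to `api_key`.
    return azure_ad_token
```

Passing the callable rather than its result is what keeps tokens fresh: resolving once would pin a token that later expires.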
…9264)
* fix(proxy): map stripped batch body.model to proxy alias for auth
replace_model_in_jsonl rewrites JSONL body.model to the provider id before
upload; batch file access checks must resolve that id back to model_name so
keys granted the proxy alias are not rejected with 403.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(proxy): surface resolved proxy alias in batch file 403 detail
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: mateo-berri <277851410+mateo-berri@users.noreply.github.com>
(cherry picked from commit 70d2748)
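A rough sketch of the reverse lookup this fix describes, assuming deployments are dicts with `model_name` and `litellm_params.model` fields as in a proxy model list. The helper names are hypothetical and the real change goes through the router rather than a plain dict.

```python
from typing import Dict, List, Set


def build_provider_to_alias(model_list: List[dict]) -> Dict[str, str]:
    """Map each provider model id back to the proxy alias that exposes it."""
    mapping: Dict[str, str] = {}
    for deployment in model_list:
        alias = deployment["model_name"]
        provider_id = deployment["litellm_params"]["model"]
        # Assumption: the first alias registered for a provider id wins.
        mapping.setdefault(provider_id, alias)
    return mapping


def resolve_alias_for_auth(body_model: str, provider_to_alias: Dict[str, str]) -> str:
    """Return the proxy alias for a rewritten body.model, or the value unchanged."""
    return provider_to_alias.get(body_model, body_model)


def can_key_call(
    body_model: str, allowed_models: Set[str], provider_to_alias: Dict[str, str]
) -> bool:
    """Check access against the alias rather than the provider id."""
    return resolve_alias_for_auth(body_model, provider_to_alias) in allowed_models
```

With this shape, a key granted only the alias still passes the check after the JSONL body has been rewritten to the provider id.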
* fix(proxy): resolve managed video model ids for auth
Co-authored-by: Cursor <cursoragent@cursor.com>
* test(proxy): cover character_id router model resolution
Co-authored-by: Cursor <cursoragent@cursor.com>
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
(cherry picked from commit d45e9e4)
…ams (#29310)
* fix(key_generate): allow team members to create keys on org-scoped teams
When a virtual key is created for a team, enterprise logic inherits the
team's organization_id onto the key (add_team_organization_id). Since the
VERIA-55 org-IDOR fix, /key/generate then required the caller to be an
explicit LiteLLM_OrganizationMembership member of that org, returning 403
"Caller is not a member of organization_id=<uuid>". Admins normally only add
users to teams (not orgs), so self-serve key creation regressed for any user
on an org-scoped team (regression since v1.84.0-rc.1).
Skip the org-membership check when organization_id was inherited from the
key's team (organization_id == team_table.organization_id). Team-level
authorization already gates this path, so team membership is sufficient. The
membership check still runs when a caller assigns an organization_id that did
not come from the key's team, preserving the IDOR protection.
Adds regression tests covering both the team-inherited (allowed) and
foreign-org (still blocked) cases.
Co-authored-by: Cursor <cursoragent@cursor.com>
* test(key_generate): cover mismatched team org IDOR path on generate
Add test_generate_key_foreign_org_with_mismatched_team_still_enforces_membership
for the case where a team is present but request organization_id differs from
team_table.organization_id. Enterprise inheritance is no-op'd in the test so
the guard is exercised directly; membership validation must still run.
Addresses Greptile review on #29310.
Co-authored-by: Cursor <cursoragent@cursor.com>
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
(cherry picked from commit b11833c)
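The bypass condition can be illustrated with a small predicate. This is a sketch of the rule as stated in the commit message, not the code in `key_management_endpoints.py`; both function names are hypothetical, and skipping the check when no `organization_id` is set is an assumption.

```python
from typing import Optional


def org_inherited_from_team(
    request_org_id: Optional[str], team_org_id: Optional[str]
) -> bool:
    """True when the key's organization_id matches the organization of its team."""
    return (
        request_org_id is not None
        and team_org_id is not None
        and request_org_id == team_org_id
    )


def must_check_org_membership(
    request_org_id: Optional[str], team_org_id: Optional[str]
) -> bool:
    """Decide whether the caller must be an explicit member of the organization."""
    if request_org_id is None:
        # Assumption: with no organization on the key there is nothing to check.
        return False
    # Team-level authorization already gates the inherited case; any other
    # organization_id still requires membership, which keeps the IDOR guard.
    return not org_inherited_from_team(request_org_id, team_org_id)
```

The three cases the tests in this PR describe map onto it directly: team-inherited org (no check), foreign org with no team (check), and foreign org with a mismatched team (check).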
… reject it (Haiku 4.5) (#29585)
* fix(vertex): strip output_config.effort for models that reject it
Haiku 4.5 on Vertex AI does not support output_config.effort and 400s with
"output_config.effort: Extra inputs are not permitted". PR #27074 emptied
VERTEX_UNSUPPORTED_OUTPUT_CONFIG_KEYS so effort would forward for Opus/Sonnet
4.6+, but that made the forwarding unconditional across every Vertex
Anthropic model, including ones that don't support it. Claude Code injects
effort into its default Messages payload, so `claude --model claude-haiku-4.5`
started failing.
Make the sanitizer model-aware: drop output_config.effort for models that
don't advertise output_config support (or any reasoning effort level) while
forwarding it for those that do. The fix covers both the chat-completion and
Messages pass-through transformation paths since they share the helper.
* chore(vertex): log at debug when dropping unsupported output_config.effort
Operators using an unregistered Vertex Claude alias that does support effort
would otherwise see it stripped with no signal. Debug level keeps it out of
normal logs since Claude Code sends effort on every request.
(cherry picked from commit cc55662)
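A minimal sketch of a model-aware sanitizer along these lines. The function name, the boolean `model_supports_effort` argument, and the removal of an emptied `output_config` are assumptions for illustration; the actual helper decides support from the model cost map.

```python
import copy
import logging

logger = logging.getLogger("vertex_output_config_sketch")


def sanitize_output_config(payload: dict, model_supports_effort: bool) -> dict:
    """Return a copy of the payload with output_config.effort dropped if unsupported."""
    sanitized = copy.deepcopy(payload)
    output_config = sanitized.get("output_config")
    if (
        isinstance(output_config, dict)
        and "effort" in output_config
        and not model_supports_effort
    ):
        output_config.pop("effort")
        # Debug level: clients may send effort on every request.
        logger.debug("Dropping unsupported output_config.effort")
        if not output_config:
            # Assumption: an empty output_config is removed entirely.
            sanitized.pop("output_config")
    return sanitized
```

Working on a copy keeps the caller's request untouched, which matters when the same payload is reused across both transformation paths.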
Greptile Summary
This release backport cherry-picks five targeted bug fixes from
Confidence Score: 4/5
All five cherry-picks are narrow, well-tested fixes with no structural changes; safe to merge as a patch release.
Each fix is a direct backport of an already-validated change. The auth and key-generation paths are touched, but the scope is limited: passing a router reference through existing call chains, and adding a single bypass condition tightly gated by team membership. Tests cover the happy path, the no-team case, and the mismatched-team-org case. The org-membership bypass in key_management_endpoints.py is logically sound and well-covered, sitting in a security-sensitive path but with no failing invariants found.
litellm/proxy/management_endpoints/key_management_endpoints.py: the new org-membership bypass; litellm/llms/azure/common_utils.py: the AD token provider cache-key construction and async wrapping.
| Filename | Overview |
|---|---|
| litellm/llms/azure/common_utils.py | Preserves AD token refresh for v1 OpenAI client path; adds stable cache-key derivation for callables and hashes sensitive credential strings |
| litellm/llms/vertex_ai/vertex_ai_partner_models/anthropic/output_params_utils.py | Adds per-model gating for output_config.effort; delegates to AnthropicConfig._model_supports_effort_param which reads from model cost map, not hardcoded values |
| litellm/proxy/management_endpoints/key_management_endpoints.py | Fixes regression where team members on org-scoped teams could not generate keys; adds _org_inherited_from_team bypass for the org membership check |
| litellm/proxy/auth/auth_utils.py | Threads llm_router into managed resource ID resolution so video/character model IDs are resolved to proxy aliases before auth and budget checks |
| litellm/proxy/auth/user_api_key_auth.py | Passes llm_router to all _get_model_from_request_context call sites so model ID resolution is consistent throughout the auth builder |
| litellm/proxy/hooks/batch_rate_limiter.py | Maps stripped batch body.model back to proxy model_name via router before can_key_call_model check |
| litellm/proxy/spend_tracking/budget_reservation.py | Passes llm_router to get_model_from_request in both reserve_budget_for_request and estimate_request_max_cost |
| litellm/proxy/auth/auth_checks.py | Adds llm_router to common_checks call forwarded to _extract_model_candidates_from_request |
Reviews (1): Last reviewed commit: "chore: update uv.lock for 1.87.1"
Relevant issues
Backports five fixes that already merged on `litellm_internal_staging` but never reached the 1.87.x line. Each one is cherry-picked from the squashed commit that landed for its PR. This closes that gap and cuts 1.87.1
Linear ticket
N/A
What is included
Cherry-picked in merge order:
* … `body.model` back to the proxy alias so key access checks pass
* … `output_config.effort` for Vertex Claude models that reject it, such as Haiku 4.5
The last two commits are the version bump (1.87.0 to 1.87.1) and the matching uv.lock refresh
#29598 (passthrough duplicate logs) is deliberately excluded from this line. The fix works by setting the logging object stream flag so the streaming dedup guard fires, but that guard (`has_dispatched_final_stream_success` and `_is_assembled_stream_success` in `litellm_logging.py`) does not exist on 1.87.x, which branched separately from 1.86.x and carries a different passthrough logging path. Cherry-picking it here would be a no-op at best, so it is held back rather than applied blindly. The other four lines that do carry the guard (1.84.x, 1.85.x, 1.86.x, and the 1.88 rc) receive it
Pre-Submission checklist
`make test-unit`
CI (LiteLLM team)
Screenshots / Proof of Fix
These are backports of changes already merged, released, and validated on other lines; each linked PR carries its own proof and review
Type
Bug Fix
Infrastructure
Changes
See the commit list above. No new code beyond the cherry-picks, the version bump, and the lockfile refresh