Skip to content

fix(azure): preserve AD token refresh in v1 OpenAI client path#28627

Merged
mateo-berri merged 3 commits into
litellm_internal_stagingfrom
litellm_claude/busy-edison-90ZbL
May 26, 2026
Merged

fix(azure): preserve AD token refresh in v1 OpenAI client path#28627
mateo-berri merged 3 commits into
litellm_internal_stagingfrom
litellm_claude/busy-edison-90ZbL

Conversation

@mateo-berri

@mateo-berri mateo-berri commented May 22, 2026

Copy link
Copy Markdown
Collaborator

Automated copy of #28625 into litellm_internal_staging for pr-babysitter.

Original head: BerriAI/litellm:claude/busy-edison-90ZbL @ abfdb72798a7


Note

Medium Risk
Touches Azure authentication and client caching behavior; mistakes could break Azure v1 requests or accidentally reuse clients across different credentials, though changes are localized and test-covered.

Overview
Ensures the Azure v1 OpenAI client path forwards Azure AD auth by setting the OpenAI/AsyncOpenAI api_key to api_key || azure_ad_token_provider || azure_ad_token, preserving token refresh behavior when using a callable provider.

For async usage, wraps sync Azure Identity token providers with an async asyncio.to_thread wrapper to avoid blocking the event loop, and updates the in-memory client cache key to incorporate AD provider identity and hashed credential inputs (token, client secret, password) to prevent cross-credential client reuse. Adds regression tests covering provider forwarding, async token rotation, precedence rules, and cache separation/reuse.

Reviewed by Cursor Bugbot for commit ca50412. Bugbot is set up for automated code reviews on this repo. Configure here.

The /openai/v1/ code path (api_version in {"v1", "latest", "preview"})
constructs a plain OpenAI/AsyncOpenAI client, but only forwarded
`api_key` from `azure_client_params`. When `enable_azure_ad_token_refresh`
is set (or any AD-only auth), `api_key` is None and the client
constructor raised "The api_key client option must be set...", breaking
every Azure call with a v1 api_version.

The OpenAI SDK (>=2.20.0) accepts a callable for `api_key` and re-invokes
it on every request via `_refresh_api_key`, so we now forward
`azure_ad_token_provider` directly — preserving the per-request token
refresh behavior of the regular AzureOpenAI client and avoiding the
expiry hole that resolving the token once at client-creation time would
introduce. Static `azure_ad_token` strings fall through to `api_key`.

For the async path we wrap the sync provider returned by azure-identity
in an async function since AsyncOpenAI expects `Callable[[], Awaitable[str]]`.

Fixes #27945

https://claude.ai/code/session_01UnzrDSFUUgp5T2wRoPMxq5
@CLAassistant

CLAassistant commented May 22, 2026

Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ mateo-berri
❌ claude
You have signed the CLA already but the status is still pending? Let us recheck it.

@greptile-apps

greptile-apps Bot commented May 22, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR fixes Azure AD token refresh for the v1 API path (api_version in {"v1", "latest", "preview"}) by forwarding the azure_ad_token_provider (or azure_ad_token) as api_key to the OpenAI/AsyncOpenAI client. Sync providers used by async clients are wrapped with asyncio.to_thread to prevent event-loop blocking on token refresh.

  • Auth forwarding (common_utils.py): when no explicit api_key is set, the v1 branch now falls back to azure_ad_token_provider then azure_ad_token, preserving AD token rotation on every request via the OpenAI SDK's _refresh_api_key mechanism.
  • Cache-key hardening: the client cache key is extended with a hash of azure_ad_token, client_secret, azure_password, and an identity string for the token provider, so distinct AD configs don't share a cached client.
  • Regression tests: six new mock-only tests cover provider forwarding, token rotation, static AD token, api_key precedence, and cache isolation/reuse.

Confidence Score: 3/5

The auth-forwarding logic itself is sound, but the cache key uses id() of the AD token provider callable — a memory address that Python recycles after GC. Under concurrent or sequential workloads where provider objects are short-lived, a new caller's provider can land at the same address as a previously evicted one, causing the cache to return a client that carries a different tenant's credentials.

The core fix correctly threads the AD token provider through to the OpenAI client and the async wrapping with asyncio.to_thread is appropriate. However, the id()-based cache key introduces a real authentication cross-contamination risk for callers that supply azure_ad_token_provider directly without co-locating tenant_id/client_id/client_secret in litellm_params. Because those credentials aren't long-lived references in every usage pattern, the GC-alias scenario is plausible in production traffic.

litellm/llms/azure/common_utils.py — specifically the cache-key construction around line 464–471 where id(_ad_provider) is used as the sole discriminator for externally supplied callable providers.

Security Review

  • Cache aliasing via id() in common_utils.py line 465: the cache key for callable AD token providers is built from id(_ad_provider), the CPython object memory address. After the provider is garbage-collected, a new provider with different credentials can occupy the same address, causing the cache to return a client authenticated with the previous caller's Azure AD identity. Affects callers who supply azure_ad_token_provider directly without also setting tenant_id/client_id/client_secret.

Important Files Changed

Filename Overview
litellm/llms/azure/common_utils.py Extends the v1 API path to forward Azure AD token providers as api_key, and extends the cache key to distinguish AD configs. The cache key uses id() of the callable provider, which is a memory address that Python can reuse after GC, creating a risk of cache aliasing between callers with different credentials.
tests/test_litellm/llms/azure/test_azure_common_utils.py Adds six focused regression tests covering provider forwarding, token rotation, static AD token, api_key precedence, cache separation for distinct providers/credentials, and cache reuse for identical configs. All tests use mocks; no real network calls.

Reviews (3): Last reviewed commit: "fix(azure): include AD credential identi..." | Re-trigger Greptile

Comment thread litellm/llms/azure/common_utils.py Outdated
@mateo-berri

Copy link
Copy Markdown
Collaborator Author

@greptileai

@codecov

codecov Bot commented May 22, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@veria-ai

veria-ai Bot commented May 22, 2026

Copy link
Copy Markdown
Contributor

PR overview

Azure v1 AD token client handling updated

This PR preserves Azure AD token providers on the OpenAI v1 client path and adds AD credential material into the Azure client cache key. I checked the surrounding client cache construction, Azure SDK initialization, and call sites and did not find a new attacker-reachable security issue.

Security review

  • No new security issues were flagged in the latest review.
  • 1 previously flagged issue(s) appear fixed in the latest changes.
  • No review issues remain open on this pull request.

Risk: 2/10

Comment thread litellm/llms/azure/common_utils.py
@mateo-berri

Copy link
Copy Markdown
Collaborator Author

@greptileai

@mateo-berri

mateo-berri commented May 22, 2026

Copy link
Copy Markdown
Collaborator Author

The code path mentioned in the Greptile is not reachable. The OpenAI client stores the callable provider on self._api_key_provider (and the async wrapper's closure pins it as well), and that client sits in cache_dict[K1]. So cache_dict[K1] -> client -> provider keeps P1 alive for the entire lifetime of the cache entry, which means id(P1) cannot be reused while K1 is in the dict. Once the entry is evicted, cache_dict.pop(K1) removes it, so any later lookup at a string-equal key returns nothing regardless of whether a new provider happens to land at the same address. The strong-ref chain serializes the id() lifetime against the cache entry lifetime, so cross-credential reuse cannot occur.

@mateo-berri mateo-berri changed the title [internal copy of #28625] fix(azure): preserve AD token refresh in v1 OpenAI client path fix(azure): preserve AD token refresh in v1 OpenAI client path May 22, 2026
@mateo-berri mateo-berri requested a review from Sameerlite May 22, 2026 18:32
@Sameerlite

Copy link
Copy Markdown
Collaborator

nit: Code looks good. Can you add ss/video showing this works as expected? Thanks!

@mateo-berri

mateo-berri commented May 26, 2026

Copy link
Copy Markdown
Collaborator Author

End-to-end verification (before / after) — real Azure calls, no mocking

Reproduced the exact scenario this PR fixes: an Azure config that authenticates with an azure_ad_token_provider() callable and no static api_key, on a v1 api_version ("preview") so it routes through the plain-OpenAI-client branch in BaseAzureLLM.get_azure_openai_client. To prove the credential is the only thing authenticating the request, every fallback env var (OPENAI_API_KEY, AZURE_API_KEY, …) is removed before the client is built. The returned client then makes a real responses.create call against a live Azure OpenAI resource — both sync and async.

sync async
Before (base, no fix) OpenAIError: Missing credentials OpenAIError: Missing credentials
After (this PR) ✅ Azure replied 'hello from azure' ✅ Azure replied 'hello from azure async'

Before the fix the v1 branch forwarded only api_key (which is None for AD-only configs), so the OpenAI client can't even be constructed. After the fix the AD provider is forwarded as the client's callable api_key (and wrapped for the async client), so AD auth works end to end.

before / after

▶️ animated run (before → after)

run

Reproduction script and raw output captures: tests/e2e_verification/pr_28627/ (the v1 endpoint accepts the API key as a Bearer token, so it stands in for a real AD token here).

@mateo-berri mateo-berri merged commit 96a2e8b into litellm_internal_staging May 26, 2026
118 checks passed
@mateo-berri mateo-berri deleted the litellm_claude/busy-edison-90ZbL branch May 26, 2026 19:01
Jason869905 pushed a commit to Jason869905/litellm that referenced this pull request Jun 6, 2026
…AI#28627)

* fix(azure): preserve AD token refresh in v1 OpenAI client path

The /openai/v1/ code path (api_version in {"v1", "latest", "preview"})
constructs a plain OpenAI/AsyncOpenAI client, but only forwarded
`api_key` from `azure_client_params`. When `enable_azure_ad_token_refresh`
is set (or any AD-only auth), `api_key` is None and the client
constructor raised "The api_key client option must be set...", breaking
every Azure call with a v1 api_version.

The OpenAI SDK (>=2.20.0) accepts a callable for `api_key` and re-invokes
it on every request via `_refresh_api_key`, so we now forward
`azure_ad_token_provider` directly — preserving the per-request token
refresh behavior of the regular AzureOpenAI client and avoiding the
expiry hole that resolving the token once at client-creation time would
introduce. Static `azure_ad_token` strings fall through to `api_key`.

For the async path we wrap the sync provider returned by azure-identity
in an async function since AsyncOpenAI expects `Callable[[], Awaitable[str]]`.

Fixes BerriAI#27945

https://claude.ai/code/session_01UnzrDSFUUgp5T2wRoPMxq5

* fix(azure): offload sync token provider to thread in v1 async wrapper

* fix(azure): include AD credential identity in v1 client cache key

---------

Co-authored-by: Claude <noreply@anthropic.com>
(cherry picked from commit 96a2e8b)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants