fix(azure): preserve AD token refresh in v1 OpenAI client path by mateo-berri · Pull Request #28627 · BerriAI/litellm

mateo-berri · 2026-05-22T17:10:39Z

Automated copy of #28625 into litellm_internal_staging for pr-babysitter.

Original head: BerriAI/litellm:claude/busy-edison-90ZbL @ abfdb72798a7

Note

Medium Risk
Touches Azure authentication and client caching behavior; mistakes could break Azure v1 requests or accidentally reuse clients across different credentials, though changes are localized and test-covered.

Overview
Ensures the Azure v1 OpenAI client path forwards Azure AD auth by setting the OpenAI/AsyncOpenAI api_key to api_key || azure_ad_token_provider || azure_ad_token, preserving token refresh behavior when using a callable provider.

For async usage, wraps sync Azure Identity token providers with an async asyncio.to_thread wrapper to avoid blocking the event loop, and updates the in-memory client cache key to incorporate AD provider identity and hashed credential inputs (token, client secret, password) to prevent cross-credential client reuse. Adds regression tests covering provider forwarding, async token rotation, precedence rules, and cache separation/reuse.

^{Reviewed by Cursor Bugbot for commit ca50412. Bugbot is set up for automated code reviews on this repo. Configure here.}

The /openai/v1/ code path (api_version in {"v1", "latest", "preview"}) constructs a plain OpenAI/AsyncOpenAI client, but only forwarded `api_key` from `azure_client_params`. When `enable_azure_ad_token_refresh` is set (or any AD-only auth), `api_key` is None and the client constructor raised "The api_key client option must be set...", breaking every Azure call with a v1 api_version. The OpenAI SDK (>=2.20.0) accepts a callable for `api_key` and re-invokes it on every request via `_refresh_api_key`, so we now forward `azure_ad_token_provider` directly — preserving the per-request token refresh behavior of the regular AzureOpenAI client and avoiding the expiry hole that resolving the token once at client-creation time would introduce. Static `azure_ad_token` strings fall through to `api_key`. For the async path we wrap the sync provider returned by azure-identity in an async function since AsyncOpenAI expects `Callable[[], Awaitable[str]]`. Fixes #27945 https://claude.ai/code/session_01UnzrDSFUUgp5T2wRoPMxq5

CLAassistant · 2026-05-22T17:10:48Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ mateo-berri
❌ claude
_{You have signed the CLA already but the status is still pending? Let us recheck it.}

greptile-apps · 2026-05-22T17:14:19Z

Greptile Summary

This PR fixes Azure AD token refresh for the v1 API path (api_version in {"v1", "latest", "preview"}) by forwarding the azure_ad_token_provider (or azure_ad_token) as api_key to the OpenAI/AsyncOpenAI client. Sync providers used by async clients are wrapped with asyncio.to_thread to prevent event-loop blocking on token refresh.

Auth forwarding (common_utils.py): when no explicit api_key is set, the v1 branch now falls back to azure_ad_token_provider then azure_ad_token, preserving AD token rotation on every request via the OpenAI SDK's _refresh_api_key mechanism.
Cache-key hardening: the client cache key is extended with a hash of azure_ad_token, client_secret, azure_password, and an identity string for the token provider, so distinct AD configs don't share a cached client.
Regression tests: six new mock-only tests cover provider forwarding, token rotation, static AD token, api_key precedence, and cache isolation/reuse.

Confidence Score: 3/5

The auth-forwarding logic itself is sound, but the cache key uses id() of the AD token provider callable — a memory address that Python recycles after GC. Under concurrent or sequential workloads where provider objects are short-lived, a new caller's provider can land at the same address as a previously evicted one, causing the cache to return a client that carries a different tenant's credentials.

The core fix correctly threads the AD token provider through to the OpenAI client and the async wrapping with asyncio.to_thread is appropriate. However, the id()-based cache key introduces a real authentication cross-contamination risk for callers that supply azure_ad_token_provider directly without co-locating tenant_id/client_id/client_secret in litellm_params. Because those credentials aren't long-lived references in every usage pattern, the GC-alias scenario is plausible in production traffic.

litellm/llms/azure/common_utils.py — specifically the cache-key construction around line 464–471 where id(_ad_provider) is used as the sole discriminator for externally supplied callable providers.

Security Review

Cache aliasing via id() in common_utils.py line 465: the cache key for callable AD token providers is built from id(_ad_provider), the CPython object memory address. After the provider is garbage-collected, a new provider with different credentials can occupy the same address, causing the cache to return a client authenticated with the previous caller's Azure AD identity. Affects callers who supply azure_ad_token_provider directly without also setting tenant_id/client_id/client_secret.

Important Files Changed

Filename	Overview
litellm/llms/azure/common_utils.py	Extends the v1 API path to forward Azure AD token providers as `api_key`, and extends the cache key to distinguish AD configs. The cache key uses `id()` of the callable provider, which is a memory address that Python can reuse after GC, creating a risk of cache aliasing between callers with different credentials.
tests/test_litellm/llms/azure/test_azure_common_utils.py	Adds six focused regression tests covering provider forwarding, token rotation, static AD token, api_key precedence, cache separation for distinct providers/credentials, and cache reuse for identical configs. All tests use mocks; no real network calls.

_{Reviews (3): Last reviewed commit: "fix(azure): include AD credential identi..." | Re-trigger Greptile}

mateo-berri · 2026-05-22T17:16:51Z

@greptileai

codecov · 2026-05-22T17:20:46Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

veria-ai · 2026-05-22T17:47:03Z

PR overview

Azure v1 AD token client handling updated

This PR preserves Azure AD token providers on the OpenAI v1 client path and adds AD credential material into the Azure client cache key. I checked the surrounding client cache construction, Azure SDK initialization, and call sites and did not find a new attacker-reachable security issue.

Security review

No new security issues were flagged in the latest review.
1 previously flagged issue(s) appear fixed in the latest changes.
No review issues remain open on this pull request.

Risk: 2/10

mateo-berri · 2026-05-22T18:01:36Z

@greptileai

mateo-berri · 2026-05-22T18:12:36Z

The code path mentioned in the Greptile is not reachable. The OpenAI client stores the callable provider on self._api_key_provider (and the async wrapper's closure pins it as well), and that client sits in cache_dict[K1]. So cache_dict[K1] -> client -> provider keeps P1 alive for the entire lifetime of the cache entry, which means id(P1) cannot be reused while K1 is in the dict. Once the entry is evicted, cache_dict.pop(K1) removes it, so any later lookup at a string-equal key returns nothing regardless of whether a new provider happens to land at the same address. The strong-ref chain serializes the id() lifetime against the cache entry lifetime, so cross-credential reuse cannot occur.

Sameerlite · 2026-05-26T03:40:41Z

nit: Code looks good. Can you add ss/video showing this works as expected? Thanks!

mateo-berri · 2026-05-26T03:56:06Z

End-to-end verification (before / after) — real Azure calls, no mocking

Reproduced the exact scenario this PR fixes: an Azure config that authenticates with an azure_ad_token_provider() callable and no static api_key, on a v1 api_version ("preview") so it routes through the plain-OpenAI-client branch in BaseAzureLLM.get_azure_openai_client. To prove the credential is the only thing authenticating the request, every fallback env var (OPENAI_API_KEY, AZURE_API_KEY, …) is removed before the client is built. The returned client then makes a real responses.create call against a live Azure OpenAI resource — both sync and async.

	sync	async
Before (base, no fix)	❌ `OpenAIError: Missing credentials`	❌ `OpenAIError: Missing credentials`
After (this PR)	✅ Azure replied `'hello from azure'`	✅ Azure replied `'hello from azure async'`

Before the fix the v1 branch forwarded only api_key (which is None for AD-only configs), so the OpenAI client can't even be constructed. After the fix the AD provider is forwarded as the client's callable api_key (and wrapped for the async client), so AD auth works end to end.

▶️ animated run (before → after)

Reproduction script and raw output captures: tests/e2e_verification/pr_28627/ (the v1 endpoint accepts the API key as a Bearer token, so it stands in for a real AD token here).

…AI#28627) * fix(azure): preserve AD token refresh in v1 OpenAI client path The /openai/v1/ code path (api_version in {"v1", "latest", "preview"}) constructs a plain OpenAI/AsyncOpenAI client, but only forwarded `api_key` from `azure_client_params`. When `enable_azure_ad_token_refresh` is set (or any AD-only auth), `api_key` is None and the client constructor raised "The api_key client option must be set...", breaking every Azure call with a v1 api_version. The OpenAI SDK (>=2.20.0) accepts a callable for `api_key` and re-invokes it on every request via `_refresh_api_key`, so we now forward `azure_ad_token_provider` directly — preserving the per-request token refresh behavior of the regular AzureOpenAI client and avoiding the expiry hole that resolving the token once at client-creation time would introduce. Static `azure_ad_token` strings fall through to `api_key`. For the async path we wrap the sync provider returned by azure-identity in an async function since AsyncOpenAI expects `Callable[[], Awaitable[str]]`. Fixes BerriAI#27945 https://claude.ai/code/session_01UnzrDSFUUgp5T2wRoPMxq5 * fix(azure): offload sync token provider to thread in v1 async wrapper * fix(azure): include AD credential identity in v1 client cache key --------- Co-authored-by: Claude <noreply@anthropic.com> (cherry picked from commit 96a2e8b)

greptile-apps Bot reviewed May 22, 2026

View reviewed changes

Comment thread litellm/llms/azure/common_utils.py Outdated

fix(azure): offload sync token provider to thread in v1 async wrapper

068e5f6

veria-ai Bot reviewed May 22, 2026

View reviewed changes

Comment thread litellm/llms/azure/common_utils.py

fix(azure): include AD credential identity in v1 client cache key

ca50412

mateo-berri changed the title ~~[internal copy of #28625] fix(azure): preserve AD token refresh in v1 OpenAI client path~~ fix(azure): preserve AD token refresh in v1 OpenAI client path May 22, 2026

mateo-berri requested a review from Sameerlite May 22, 2026 18:32

mateo-berri mentioned this pull request May 26, 2026

test(azure): e2e verification evidence for v1 AD token fix (#28627) #28838

Closed

Sameerlite approved these changes May 26, 2026

View reviewed changes

mateo-berri merged commit 96a2e8b into litellm_internal_staging May 26, 2026
118 checks passed

mateo-berri deleted the litellm_claude/busy-edison-90ZbL branch May 26, 2026 19:01

oss-agent-shin mentioned this pull request May 27, 2026

fix(bedrock): surface JSON error bodies from streaming Bedrock responses (LIT-3274) #29052

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(azure): preserve AD token refresh in v1 OpenAI client path#28627

fix(azure): preserve AD token refresh in v1 OpenAI client path#28627
mateo-berri merged 3 commits into
litellm_internal_stagingfrom
litellm_claude/busy-edison-90ZbL

mateo-berri commented May 22, 2026 •

edited by cursor Bot

Loading

Uh oh!

CLAassistant commented May 22, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot commented May 22, 2026 •

edited

Loading

Security Review

Important Files Changed

Uh oh!

Uh oh!

mateo-berri commented May 22, 2026

Uh oh!

codecov Bot commented May 22, 2026 •

edited

Loading

Uh oh!

veria-ai Bot commented May 22, 2026 •

edited

Loading

Uh oh!

Uh oh!

mateo-berri commented May 22, 2026

Uh oh!

mateo-berri commented May 22, 2026 •

edited

Loading

Uh oh!

Sameerlite commented May 26, 2026

Uh oh!

mateo-berri commented May 26, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Uh oh!

Conversation

mateo-berri commented May 22, 2026 • edited by cursor Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

CLAassistant commented May 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

greptile-apps Bot commented May 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 3/5

Security Review

Important Files Changed

Uh oh!

Uh oh!

mateo-berri commented May 22, 2026

Uh oh!

codecov Bot commented May 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

veria-ai Bot commented May 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR overview

Azure v1 AD token client handling updated

Security review

Uh oh!

Uh oh!

mateo-berri commented May 22, 2026

Uh oh!

mateo-berri commented May 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Sameerlite commented May 26, 2026

Uh oh!

mateo-berri commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

End-to-end verification (before / after) — real Azure calls, no mocking

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

mateo-berri commented May 22, 2026 •

edited by cursor Bot

Loading

CLAassistant commented May 22, 2026 •

edited

Loading

greptile-apps Bot commented May 22, 2026 •

edited

Loading

codecov Bot commented May 22, 2026 •

edited

Loading

veria-ai Bot commented May 22, 2026 •

edited

Loading

mateo-berri commented May 22, 2026 •

edited

Loading

mateo-berri commented May 26, 2026 •

edited

Loading