Litellm OpenAI double prefix bug #28661
Router configs may expose models like openai/openai/<model>; normalize those strings before joining provider/model so model_cost resolves correctly. Co-authored-by: Cursor <cursoragent@cursor.com>
…atch state Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Greptile Summary
This PR fixes a double-provider-prefix bug in
Confidence Score: 4/5

The change is narrowly scoped to the cost-calculation path and the core dedup logic is correct; two small rough edges remain but neither causes wrong costs in normal operation.

The dedup loop and the guarded model_with_provider join work correctly for the targeted scenario and for all existing single-prefix callers. The assert statement used as a runtime invariant check is the main concern: it is silently removed under Python's -O flag, leaving downstream code to surface confusing errors if the provider can't be resolved. The region-based lookup key is a pre-existing pattern that now also applies to normalised double-prefix models, causing regional price variations to be skipped for that narrow combination; it doesn't affect correctness of cost totals.

litellm/cost_calculator.py: the assert guard and the region key construction around lines 440–451 deserve a second look.

| Filename | Overview |
|---|---|
| litellm/cost_calculator.py | Adds a dedup loop and a smarter model_with_provider join to eliminate doubled provider prefixes (e.g. openai/openai/gpt-5.5). Logic is correct for the happy path; two minor issues: an assert used for a runtime invariant (disabled by -O), and the region-based lookup key still inherits the provider segment from the normalised model string. |
| tests/test_litellm/test_cost_calculator.py | Adds one new mock-only unit test covering the double-prefix scenario. Uses monkeypatch/local cost map correctly; no real network calls. |
Reviews (1): Last reviewed commit: "fix(mypy): narrow custom_llm_provider af..."
        else:
            _, custom_llm_provider, _, _ = litellm.get_llm_provider(model=model)

        assert custom_llm_provider is not None  # caller-supplied or get_llm_provider
Using assert for a runtime invariant check is fragile: Python's -O (optimize) flag silently removes all assert statements, turning what looks like a guard into a no-op. If get_llm_provider ever returns None for an unrecognised model, the assertion disappears in optimised builds and downstream code proceeds with a None provider, producing hard-to-diagnose errors. Raise a descriptive ValueError instead.
Suggested change:

    -   assert custom_llm_provider is not None  # caller-supplied or get_llm_provider
    +   if custom_llm_provider is None:
    +       raise ValueError(
    +           f"Could not determine custom_llm_provider for model={model!r}. "
    +           "Pass custom_llm_provider explicitly or use a recognisable model string."
    +       )
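The effect of the -O flag described in this comment can be checked with a small standalone sketch. Nothing here is LiteLLM code: `SNIPPET` and `run_snippet` are hypothetical names, and the snippet simply runs the same one-line program with and without -O.

```python
import subprocess
import sys

# A one-line program whose only guard is an assert statement.
SNIPPET = "assert False, 'guard'; print('reached')"


def run_snippet(flags):
    """Run SNIPPET in a fresh interpreter with the given command-line flags."""
    return subprocess.run(
        [sys.executable, *flags, "-c", SNIPPET],
        capture_output=True,
        text=True,
    )


normal = run_snippet([])        # assert fires, the process exits with an error
optimized = run_snippet(["-O"])  # assert is compiled away, print() is reached
```

An explicit `if custom_llm_provider is None: raise ValueError(...)` is unaffected by -O, and type checkers also treat it as narrowing an `Optional[str]` to `str`.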
    @@ -425,6 +447,9 @@ def cost_per_token(  # noqa: PLR0915
         model_with_provider = model_with_provider_and_region
Region key includes a doubled provider segment after dedup
After the dedup loop strips "openai/openai/gpt-5.5" down to "openai/gpt-5.5", model still carries the provider prefix. The region key is then built as f"{custom_llm_provider}/{region_name}/{model}" → "openai/us-east-1/openai/gpt-5.5", which will never be found in model_cost_ref. The intended key is "openai/us-east-1/gpt-5.5". In practice the code falls through to the non-region pricing, so cost is not silently zeroed — but regional price variations are silently skipped for any caller using double-prefix deployments with a region_name.
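A minimal sketch of the key construction described in this comment, assuming the key format quoted above. `build_region_key` is a hypothetical helper written for illustration, not a function in LiteLLM.

```python
def build_region_key(custom_llm_provider: str, region_name: str, model: str) -> str:
    """Build provider/region/model, dropping one leading provider/ from model first.

    Hypothetical helper; the real cost calculator may structure this differently.
    """
    prefix = f"{custom_llm_provider}/"
    bare_model = model[len(prefix):] if model.startswith(prefix) else model
    return f"{custom_llm_provider}/{region_name}/{bare_model}"


provider, region, model = "openai", "us-east-1", "openai/gpt-5.5"

# Key as described in the review comment: the provider segment appears twice.
naive_region_key = f"{provider}/{region}/{model}"

# Key with the provider prefix removed from the model before joining.
intended_region_key = build_region_key(provider, region, model)
```

Here `naive_region_key` is `openai/us-east-1/openai/gpt-5.5`, while `intended_region_key` is `openai/us-east-1/gpt-5.5`.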
…itellm_openai_double_prefix_bug
…model
The provider-prefix dedup loop assumed `model` is always a string. When a non-string is passed (e.g. a MagicMock from a mocked transport in router tests), `model.startswith(...)` is always truthy and each slice returns a new object, so the loop never terminates: it spins and OOM-kills the test worker (observed as the litellm_router_testing CI regression, e.g. test_router_pattern_match_e2e). Only run the string-based dedup and prefix-join when `model` is actually a str, preserving the previous graceful behavior for non-string inputs.
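The failure mode in this commit message can be illustrated with a short sketch. `strip_duplicate_prefix` is a hypothetical stand-in for the dedup loop, not the actual LiteLLM code; it only shows why an `isinstance` guard keeps a MagicMock out of the loop.

```python
from unittest.mock import MagicMock


def strip_duplicate_prefix(model, custom_llm_provider: str):
    """Collapse repeated provider/ prefixes, leaving non-str inputs untouched."""
    if not isinstance(model, str):
        return model  # e.g. a MagicMock from a mocked transport
    doubled = f"{custom_llm_provider}/{custom_llm_provider}/"
    while model.startswith(doubled):
        # Each pass removes one "provider/" segment, so a str input always terminates.
        model = model[len(custom_llm_provider) + 1:]
    return model


mock_model = MagicMock()

# Why an unguarded loop never ends on a mock: startswith() returns a truthy
# MagicMock, and slicing returns yet another MagicMock.
startswith_is_truthy = bool(mock_model.startswith("openai/openai/"))
slice_is_mock = isinstance(mock_model[7:], MagicMock)

# With the guard, the mock is returned as-is without entering the loop.
guarded_result = strip_duplicate_prefix(mock_model, "openai")
```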
bd2d0ad into litellm_internal_staging
Resolves LIT-3057
Summary
Proxy/router setups sometimes use deployment names where the LiteLLM provider prefix appears twice (for example openai/openai/gpt-5.5). Token spend uses cost_per_token, which builds a lookup key by combining custom_llm_provider with the model string. That combination assumed either a bare model name or a single prefix, so doubled prefixes broke model_cost lookup and produced wrong or failed pricing (often observed as $0 cost when lookup falls through incorrectly).
Cause
In cost_calculator.cost_per_token:
- When custom_llm_provider is already set (common on proxy/router paths), the code did provider + "/" + model even when model already started with provider/.
- That produced strings like openai/openai/openai/gpt-5.5 for lookups.
- Stripping only the first path segment left openai/gpt-5.5, which often does not exist in model_prices_and_context_window.json for OpenAI-style entries keyed as gpt-5.5 (not openai/gpt-5.5), so the resolver never matched a priced row.
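The string handling described above can be reproduced in a few lines. This is a sketch of the behavior as the description states it, not the original code; `toy_model_cost` is a stand-in for model_prices_and_context_window.json, and its price is a made-up placeholder.

```python
custom_llm_provider = "openai"
model = "openai/openai/gpt-5.5"  # doubled prefix from a router config

# The old join prepended the provider unconditionally.
naive_key = custom_llm_provider + "/" + model

# Stripping only the first path segment still leaves a prefixed name.
stripped_once = model.split("/", 1)[1]

# Toy cost map keyed by the bare model name; the price is a placeholder.
toy_model_cost = {"gpt-5.5": {"input_cost_per_token": 1e-06}}

# Neither candidate key matches the bare-name entry.
naive_found = naive_key in toy_model_cost
stripped_found = stripped_once in toy_model_cost
```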
Fix
- Resolve custom_llm_provider via get_llm_provider when it is missing (same behavior as before, but done earlier in the function).
- Normalize deployment strings by collapsing repeated {provider}/ prefixes at the start when the same provider is duplicated (e.g. openai/openai/gpt-5.5 → openai/gpt-5.5), leaving a single-prefix form suitable for downstream stripping.
- Build model_with_provider as provider/model only if the model string does not already begin with provider/, avoiding triple prefixes.
- Add unit test: test_cost_per_token_duplicate_openai_prefix_matches_model_cost.
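The normalization and guarded join can be sketched as follows. `normalize_for_lookup` and `toy_lookup` are hypothetical names, the lookup order is an assumption, and the cost map passed in the usage note is a toy with a placeholder price; the actual change in litellm/cost_calculator.py may differ in detail.

```python
from typing import Dict, Optional, Tuple


def normalize_for_lookup(model: str, custom_llm_provider: str) -> Tuple[str, str]:
    """Return (model, model_with_provider) with at most one provider prefix."""
    prefix = f"{custom_llm_provider}/"
    # Collapse openai/openai/... down to openai/...
    while model.startswith(prefix + prefix):
        model = model[len(prefix):]
    # Prepend the provider only when it is not already there.
    model_with_provider = model if model.startswith(prefix) else prefix + model
    return model, model_with_provider


def toy_lookup(
    cost_map: Dict[str, dict], model: str, custom_llm_provider: str
) -> Optional[dict]:
    """Try provider/model first, then the bare name (assumed lookup order)."""
    _, model_with_provider = normalize_for_lookup(model, custom_llm_provider)
    bare = model_with_provider[len(custom_llm_provider) + 1:]
    for key in (model_with_provider, bare):
        if key in cost_map:
            return cost_map[key]
    return None
```

For example, `toy_lookup({"gpt-5.5": {"input_cost_per_token": 1e-06}}, "openai/openai/gpt-5.5", "openai")` resolves to the bare gpt-5.5 entry rather than returning None.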
Steps to repro:
Add a model with a doubled provider prefix (for example openai/openai/gpt-5.5) to the config and make a request from the playground. It won't show spend.
Before:

After:
