Litellm OpenAI double prefix bug#28661

Merged
mateo-berri merged 6 commits into litellm_internal_staging from litellm_openai_double_prefix_bug
May 26, 2026

@shivamrawat1 shivamrawat1 commented May 22, 2026

Collaborator

Resolves LIT-3057
Summary
Proxy/router setups sometimes use deployment names where the LiteLLM provider prefix appears twice (for example openai/openai/gpt-5.5). Token spend uses cost_per_token, which builds a lookup key by combining custom_llm_provider with the model string. That combination assumed either a bare model name or a single prefix, so doubled prefixes broke model_cost lookup and produced wrong or failed pricing (often observed as $0 cost when lookup falls through incorrectly).

Cause
In cost_calculator.cost_per_token:

When custom_llm_provider is already set (common on proxy/router paths), the code did provider + "/" + model even when model already started with provider/.
That produced strings like openai/openai/openai/gpt-5.5 for lookups.
Stripping only the first path segment left openai/gpt-5.5, which often does not exist in model_prices_and_context_window.json for OpenAI-style entries keyed as gpt-5.5 (not openai/gpt-5.5), so the resolver never matched a priced row.
Fix
Resolve custom_llm_provider via get_llm_provider when it is missing (same behavior as before, but performed earlier in the function).
Normalize deployment strings by removing repeated {provider}/ chains at the start when the same provider is duplicated (e.g. openai/openai/gpt-5.5 → openai/gpt-5.5, a single-prefix form suitable for downstream stripping).
Build model_with_provider as provider/model only if the model string does not already begin with provider/, avoiding triple prefixes.
Added unit test: test_cost_per_token_duplicate_openai_prefix_matches_model_cost.
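
The normalization and guarded join described above can be sketched roughly as follows. This is a minimal standalone illustration, not the merged code: the helper names `strip_repeated_provider_prefix` and `join_provider_and_model` are hypothetical, and the real logic lives inline in `cost_per_token`.

```python
def strip_repeated_provider_prefix(model: str, provider: str) -> str:
    """Collapse repeated leading '{provider}/' segments down to a single one.

    Hypothetical helper for illustration only.
    """
    prefix = f"{provider}/"
    doubled = prefix + prefix
    # Drop one leading prefix at a time while the prefix is still duplicated.
    while model.startswith(doubled):
        model = model[len(prefix):]
    return model


def join_provider_and_model(model: str, provider: str) -> str:
    """Prepend '{provider}/' only if the model does not already start with it."""
    prefix = f"{provider}/"
    if model.startswith(prefix):
        return model
    return prefix + model


normalized = strip_repeated_provider_prefix("openai/openai/gpt-5.5", "openai")
lookup_key = join_provider_and_model(normalized, "openai")
print(normalized, lookup_key)
```

With the duplicated prefix collapsed first, the join no longer produces a triple-prefixed key, and a bare model name such as `gpt-5.5` still receives exactly one prefix.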

Steps to repro:
Add this model to the config and make a request from the playground. It won't show spend.

model_list:
  - model_name: openai/openai/gpt-5.5
    litellm_params:
      model: openai/gpt-5.5
      api_base: https://api.openai.com/v1
      api_key: os.environ/OPENAI_API_KEY
      tags: ["openai-test-it"]

    model_info:
      id: openai/openai/gpt-5.5
      mode: chat
      base_model: gpt-5.5
      access_groups:
        - default-models

Before:
Screenshot 2026-05-14 at 6 58 25 PM

After:
Screenshot 2026-05-14 at 6 58 08 PM

shivamrawat1 and others added 3 commits May 22, 2026 13:42
Router configs may expose models like openai/openai/<model>; normalize those
strings before joining provider/model so model_cost resolves correctly.

Co-authored-by: Cursor <cursoragent@cursor.com>
…atch state

Co-authored-by: Cursor <cursoragent@cursor.com>

codecov Bot commented May 22, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

greptile-apps Bot commented May 22, 2026

Contributor

Greptile Summary

This PR fixes a double-provider-prefix bug in cost_per_token where deployment names like openai/openai/gpt-5.5 caused the model cost lookup key to be built as openai/openai/openai/gpt-5.5, missing the price entry and silently returning $0 cost.

  • Introduces a dedup loop that strips repeated {provider}/ chains from the model string before any key construction, normalising e.g. openai/openai/gpt-5.5 → openai/gpt-5.5.
  • Changes model_with_provider assignment to skip the provider prepend when the model already starts with the provider prefix, preventing triple-prefix keys.
  • Adds a focused unit test using the local cost map via monkeypatch to confirm non-zero cost is returned for the double-prefix case.

Confidence Score: 4/5

The change is narrowly scoped to the cost-calculation path and the core dedup logic is correct; two small rough edges remain but neither causes wrong costs in normal operation.

The dedup loop and the guarded model_with_provider join work correctly for the targeted scenario and for all existing single-prefix callers. The assert statement used as a runtime invariant check is the main concern — it is silently removed under Python's -O flag, leaving downstream code to surface confusing errors if the provider can't be resolved. The region-based lookup key is a pre-existing pattern that now also applies to normalised double-prefix models, causing regional price variations to be skipped for that narrow combination; it doesn't affect correctness of cost totals.

litellm/cost_calculator.py — the assert guard and the region key construction around lines 440–451 deserve a second look.

Important Files Changed

| Filename | Overview |
| --- | --- |
| litellm/cost_calculator.py | Adds a dedup loop and a smarter model_with_provider join to eliminate doubled provider prefixes (e.g. openai/openai/gpt-5.5). Logic is correct for the happy path; two minor issues: an assert used for a runtime invariant (disabled by -O), and the region-based lookup key still inherits the provider segment from the normalised model string. |
| tests/test_litellm/test_cost_calculator.py | Adds one new mock-only unit test covering the double-prefix scenario. Uses monkeypatch/local cost map correctly; no real network calls. |

Reviews (1): Last reviewed commit: "fix(mypy): narrow custom_llm_provider af..."

else:
    _, custom_llm_provider, _, _ = litellm.get_llm_provider(model=model)

assert custom_llm_provider is not None  # caller-supplied or get_llm_provider

P2 Using assert for a runtime invariant check is fragile: Python's -O (optimize) flag silently removes all assert statements, turning what looks like a guard into a no-op. If get_llm_provider ever returns None for an unrecognised model, the assertion disappears in optimised builds and downstream code proceeds with a None provider, producing hard-to-diagnose errors. Raise a descriptive ValueError instead.

Suggested change
assert custom_llm_provider is not None # caller-supplied or get_llm_provider
if custom_llm_provider is None:
    raise ValueError(
        f"Could not determine custom_llm_provider for model={model!r}. "
        "Pass custom_llm_provider explicitly or use a recognisable model string."
    )
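
To make the concern concrete, here is a small self-contained sketch of how an `assert` guard disappears under optimization while an explicit check does not. It uses the built-in `compile(..., optimize=...)` to mimic running with and without `-O` in one process; the names `GUARD_SOURCE`, `guard_lets_none_through`, and `require_provider` are hypothetical and not part of LiteLLM.

```python
# Source for a tiny "guarded" snippet: the assert is the only protection
# against a None provider reaching the downstream line.
GUARD_SOURCE = "assert provider is not None\nreached_downstream = True\n"


def guard_lets_none_through(optimize: int) -> bool:
    """Return True if a None provider gets past the assert at this optimize level."""
    namespace = {"provider": None}
    # optimize=0 keeps asserts; optimize=1 strips them, like `python -O`.
    code = compile(GUARD_SOURCE, "<guard>", "exec", optimize=optimize)
    try:
        exec(code, namespace)
    except AssertionError:
        return False
    return bool(namespace.get("reached_downstream"))


def require_provider(provider, model):
    """Explicit check that survives optimization (hypothetical helper)."""
    if provider is None:
        raise ValueError(
            f"Could not determine custom_llm_provider for model={model!r}."
        )
    return provider


print(guard_lets_none_through(0), guard_lets_none_through(1))
```

At optimize level 0 the assert stops the `None` provider; at level 1 the same source runs straight through to the downstream line, which is the failure mode the suggested `ValueError` avoids.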

Comment on lines 440 to 447
@@ -425,6 +447,9 @@ def cost_per_token( # noqa: PLR0915
model_with_provider = model_with_provider_and_region

P2 Region key includes a doubled provider segment after dedup

After the dedup loop strips "openai/openai/gpt-5.5" down to "openai/gpt-5.5", model still carries the provider prefix. The region key is then built as f"{custom_llm_provider}/{region_name}/{model}" → "openai/us-east-1/openai/gpt-5.5", which will never be found in model_cost_ref. The intended key is "openai/us-east-1/gpt-5.5". In practice the code falls through to the non-region pricing, so cost is not silently zeroed, but regional price variations are silently skipped for any caller using double-prefix deployments with a region_name.
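
One possible way to build the region key described in this comment is sketched below. It assumes the desired key shape is `{provider}/{region}/{bare_model}`, as stated above; `build_region_key` is a hypothetical helper, not a function in `cost_calculator.py`, and this PR does not necessarily adopt this approach.

```python
def build_region_key(provider: str, region_name: str, model: str) -> str:
    """Build '{provider}/{region}/{bare_model}' without repeating the provider.

    Hypothetical helper: strips any leading '{provider}/' segments from the
    model before composing the regional lookup key.
    """
    prefix = f"{provider}/"
    bare_model = model
    while bare_model.startswith(prefix):
        bare_model = bare_model[len(prefix):]
    return f"{provider}/{region_name}/{bare_model}"


key_from_prefixed = build_region_key("openai", "us-east-1", "openai/gpt-5.5")
key_from_bare = build_region_key("openai", "us-east-1", "gpt-5.5")
print(key_from_prefixed, key_from_bare)
```

Both a prefixed and a bare model string resolve to the same regional key, so a regional price entry would be found regardless of how the deployment was named.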

mateo-berri previously approved these changes May 26, 2026

@mateo-berri mateo-berri left a comment

LGTM; thanks!

@mateo-berri mateo-berri dismissed their stale review May 26, 2026 17:28

Taking a look at unit test failures

…model

The provider-prefix dedup loop assumed `model` is always a string. When a
non-string is passed (e.g. a MagicMock from a mocked transport in router
tests), `model.startswith(...)` is always truthy and each slice returns a new
object, so the loop never terminates — it spins and OOM-kills the test worker
(observed as the litellm_router_testing CI regression, e.g.
test_router_pattern_match_e2e). Only run the string-based dedup and prefix-join
when `model` is actually a str, preserving the previous graceful behavior for
non-string inputs.
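
The failure mode and the type guard described in this commit message can be reproduced in isolation. The sketch below is an illustration under stated assumptions: `dedup_prefix` is a hypothetical stand-in for the inline loop in `cost_per_token`, and only standard-library `unittest.mock` behavior is relied on.

```python
from unittest.mock import MagicMock


def dedup_prefix(model, provider: str):
    """Collapse repeated '{provider}/' prefixes, but only for real strings.

    Hypothetical stand-in for the inline loop; non-string inputs are
    returned untouched so the loop cannot spin on a mock.
    """
    if not isinstance(model, str):
        return model
    prefix = f"{provider}/"
    while model.startswith(prefix + prefix):
        model = model[len(prefix):]
    return model


mock_model = MagicMock()
# Any method call on a MagicMock returns another (truthy) MagicMock, so a
# startswith() loop condition would never become false.
looks_prefixed = bool(mock_model.startswith("openai/openai/"))
# Slicing a MagicMock also yields a MagicMock rather than a shorter string.
sliced = mock_model[len("openai/"):]

print(looks_prefixed, type(sliced).__name__)
print(dedup_prefix("openai/openai/gpt-5.5", "openai"))
```

Because the loop condition stays truthy and slicing never shortens anything, an unguarded loop has no exit for a mock; the `isinstance` check restores pass-through behavior for such inputs.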
@mateo-berri mateo-berri enabled auto-merge (squash) May 26, 2026 18:40

@mateo-berri mateo-berri left a comment

LGTM; thanks

@mateo-berri mateo-berri merged commit bd2d0ad into litellm_internal_staging May 26, 2026
116 of 118 checks passed
@mateo-berri mateo-berri deleted the litellm_openai_double_prefix_bug branch May 26, 2026 18:40