feat(openai): apply regional-processing cost uplift for EU/US data residency #28622
Draft
mateo-berri wants to merge 2 commits into
…sidency OpenAI charges a 10% uplift on the latest GPT models when requests are served from a regionalized hostname (eu./us.api.openai.com). Infer the region from `api_base`, expose it on `kwargs["litellm_params"]["data_residency"]`, and multiply the computed cost by a per-model `regional_processing_uplift_multiplier_<region>` field. https://claude.ai/code/session_012ebH44s7ohYxjoix5CXzTW
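The host-based inference described in this commit can be sketched roughly as follows. This is a minimal illustration, not the PR's actual helper: the function name `infer_region_from_api_base` and the lookup table are hypothetical, and only the two hostnames and the "everything else is global" rule come from the description.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical lookup table; the hostnames are the ones named in this PR.
_REGION_BY_HOST = {
    "eu.api.openai.com": "eu",
    "us.api.openai.com": "us",
}


def infer_region_from_api_base(api_base: Optional[str]) -> Optional[str]:
    """Return "eu", "us", or None (global, no uplift) for a given api_base."""
    if not api_base:
        return None
    # urlparse only populates .hostname when a scheme or "//" is present,
    # so bare hosts such as "eu.api.openai.com/v1" get a "//" prefix first.
    candidate = api_base if "//" in api_base else f"//{api_base}"
    host = urlparse(candidate).hostname  # already lower-cased by urlparse
    if host is None:
        return None
    # Exact match only, so look-alike hosts fall through to None.
    return _REGION_BY_HOST.get(host)
```

Matching on the exact hostname (rather than a substring) keeps a host like `eu.api.openai.com.example.org` from being treated as regional.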
…r_* fields The schema validation in test_aaamodel_prices_and_context_window_json_is_valid runs with additionalProperties=False, so the new regional_processing_uplift_multiplier_eu / _us fields on gpt-5, gpt-5-mini, gpt-5-nano, gpt-5-pro, gpt-4.1 family, and gpt-4o family entries must be declared in the test schema. https://claude.ai/code/session_012ebH44s7ohYxjoix5CXzTW
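To illustrate why undeclared fields fail under `additionalProperties=False`, here is a small hand-rolled stand-in for that check. It is not the real test and does not use a JSON Schema library; the helper name `unexpected_keys`, the declared-property set, and the prices are all made up for the example.

```python
from typing import Iterable, List, Mapping


def unexpected_keys(entry: Mapping[str, object], declared: Iterable[str]) -> List[str]:
    """Mimic additionalProperties=False: list keys the schema does not declare."""
    return sorted(set(entry) - set(declared))


# Hypothetical subset of the schema's declared properties.
declared_properties = {"input_cost_per_token", "output_cost_per_token"}

# Hypothetical price-map entry; the numbers are illustrative only.
entry = {
    "input_cost_per_token": 1e-06,
    "output_cost_per_token": 2e-06,
    "regional_processing_uplift_multiplier_eu": 1.10,
    "regional_processing_uplift_multiplier_us": 1.10,
}

# Before the schema is updated, both new fields are rejected.
rejected_before = unexpected_keys(entry, declared_properties)

# Declaring the new fields makes the entry valid again.
declared_properties |= {
    "regional_processing_uplift_multiplier_eu",
    "regional_processing_uplift_multiplier_us",
}
rejected_after = unexpected_keys(entry, declared_properties)
```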
Summary
OpenAI charges a flat 10% uplift on the latest GPT models when requests are served from a regionalized hostname (docs). LiteLLM didn't account for this, so cost tracking under-reported for any project on `eu.api.openai.com` / `us.api.openai.com`.

This PR adds regional-processing cost tracking that is fully driven by `api_base`; no new request parameter is required.

Approach

1. Infer the region from the `api_base` host. OpenAI enforces a hostname-per-region for regionalized projects and rejects mismatches (confirmed by the requester in both directions), so the host is an authoritative signal:
   - `eu.api.openai.com` → `"eu"`
   - `us.api.openai.com` → `"us"`
   - otherwise → `None` (global, no uplift)
2. Add `regional_processing_uplift_multiplier_eu` / `_us` fields (e.g. `1.10`) to the latest GPT models (gpt-5 family, gpt-4.1 family, gpt-4o family). The multiplier is applied at the end of `generic_cost_per_token`, after service-tier resolution, so it composes cleanly with `flex` / `priority`.
3. Expose the region on `litellm_params`. Custom callbacks can read `kwargs["litellm_params"]["data_residency"]` without having to parse the URL themselves.

Why a multiplier instead of explicit `_eu` / `_us` cost-key variants like `_flex` / `_priority`? OpenAI's uplift is a uniform flat percentage across every token type, so a single multiplier captures it without bloating the price map with ~18 new keys per affected model.

Files

| File | Change |
| --- | --- |
| `litellm/litellm_core_utils/data_residency.py` | `infer_openai_data_residency(api_base)` helper |
| `litellm/litellm_core_utils/get_litellm_params.py` | Populates `litellm_params["data_residency"]` |
| `litellm/litellm_core_utils/llm_cost_calc/utils.py` | Applies the multiplier in `generic_cost_per_token` |
| `litellm/litellm_core_utils/litellm_logging.py` | Passes `data_residency` into the cost calculator |
| `litellm/cost_calculator.py`, `litellm/llms/openai/cost_calculation.py` | Threads `data_residency` through `cost_per_token` / `completion_cost` / `response_cost_calculator` |
| `litellm/types/utils.py` | `DataResidency` enum, new `ModelInfo` fields |
| `model_prices_and_context_window.json` (+ backup) | `regional_processing_uplift_multiplier_{eu,us}` on gpt-5, gpt-5-mini, gpt-5-nano, gpt-5-pro, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-4o, gpt-4o-2024-08-06, gpt-4o-2024-11-20, gpt-4o-mini |

Docs PR: BerriAI/litellm-docs#claude/epic-goodall-T4V9Y

Test plan

- `pytest tests/test_litellm/litellm_core_utils/test_data_residency.py`: host-parsing helper (12 cases)
- `pytest tests/test_litellm/litellm_core_utils/test_get_litellm_params.py::TestGetLitellmParamsDataResidency`: `litellm_params["data_residency"]` plumbing
- `pytest tests/test_litellm/litellm_core_utils/llm_cost_calc/test_llm_cost_calc_utils.py -k data_residency`: uplift applied, composes with `priority`, no-op for unmarked models / `None`
- `pytest tests/test_litellm/test_cost_calculator.py`: full cost-calculator regression, 53 passing
- `pytest tests/test_litellm/llms/openai/`: OpenAI provider tests, 500 passing

https://claude.ai/code/session_012ebH44s7ohYxjoix5CXzTW
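As a rough sketch of the multiplier step from the Approach, the snippet below applies a per-region uplift to an already-computed cost. It is an assumption-laden illustration rather than the code in `generic_cost_per_token`: the function name `apply_regional_uplift` and the example cost are hypothetical; only the field-name pattern `regional_processing_uplift_multiplier_<region>` and the no-op behaviour for `None` or unmarked models come from the PR text.

```python
from typing import Mapping, Optional


def apply_regional_uplift(
    cost: float,
    model_info: Mapping[str, float],
    data_residency: Optional[str],
) -> float:
    """Scale a computed cost by the model's regional multiplier, if any."""
    if data_residency is None:
        # Global endpoint: no uplift.
        return cost
    key = f"regional_processing_uplift_multiplier_{data_residency}"
    multiplier = model_info.get(key)
    if multiplier is None:
        # Model entry carries no multiplier for this region: leave cost as-is.
        return cost
    return cost * multiplier


# Hypothetical model entry that only declares the EU multiplier.
example_info = {"regional_processing_uplift_multiplier_eu": 1.10}

# The input cost stands for whatever was computed after service-tier
# resolution (e.g. priority pricing); the value 2.0 is illustrative only.
eu_cost = apply_regional_uplift(2.0, example_info, "eu")
global_cost = apply_regional_uplift(2.0, example_info, None)
```

Because the uplift is a single multiplication on the final figure, it is independent of how the base cost was derived, which is why it composes with service tiers without extra price-map keys.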
Generated by Claude Code