fix(pricing): add 1h cache-write cost for Anthropic Sonnet 4.5/4.6 #30474
The native Anthropic claude-sonnet-4-5/4-6 price-map entries were missing cache_creation_input_token_cost_above_1hr (and, for 4.5, the >200K long-context sub-tier), so 1-hour-TTL cache writes were costed at the 5-minute rate. This PR adds 6e-06 for the regular tier and 1.2e-05 for the long-context tier, each 2x the base input rate, matching the vertex_ai/azure_ai/bedrock siblings and the older claude-sonnet-4-20250514 entry. It also adds a regression test.
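To make the undercount concrete, here is a minimal sketch of the arithmetic. It assumes a simple "use the 1-hour rate if present, otherwise the 5-minute rate" lookup. The helper `one_hour_write_rate` and the 5-minute key name `cache_creation_input_token_cost` are illustrative assumptions, not taken from this PR, and the 5-minute rate of 3.75e-06 is derived from the stated 1.6x ratio (6e-06 / 1.6).

```python
# Key name taken from the PR text.
ONE_HR_KEY = "cache_creation_input_token_cost_above_1hr"
# Assumed name for the 5-minute cache-write field (not stated in this PR).
FIVE_MIN_KEY = "cache_creation_input_token_cost"


def one_hour_write_rate(entry: dict) -> float:
    """Hypothetical lookup: prefer the 1h rate, else fall back to the 5-minute rate."""
    return entry.get(ONE_HR_KEY, entry[FIVE_MIN_KEY])


# 5-minute rate implied by the PR: 6e-06 / 1.6 = 3.75e-06 per token.
before = {FIVE_MIN_KEY: 3.75e-06}
after = {**before, ONE_HR_KEY: 6e-06}

tokens = 1_000_000  # one million tokens written with a 1-hour TTL
cost_before = tokens * one_hour_write_rate(before)
cost_after = tokens * one_hour_write_rate(after)
undercount = cost_after - cost_before

print(f"before: ${cost_before:.2f}, after: ${cost_after:.2f}, undercount: ${undercount:.2f}")
```

Under these assumptions, each million tokens written with a 1-hour TTL is costed about $2.25 too low when the field is missing.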
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Greptile Summary

This PR fixes missing 1-hour prompt-cache write pricing fields on four native Anthropic/Bedrock Sonnet 4.5 and 4.6 model entries, which caused 1-hour-TTL cache writes to silently fall back to the cheaper 5-minute rate and undercount spend.
Confidence Score: 5/5

Safe to merge: the change is a data-only addition to two JSON price files, with no logic changes, plus a new regression test that confirms the values. The only non-test files modified are the two JSON files, which gain fields that every sibling entry already carries, and the new test independently validates the values and ratio from the source file. There are no production code edits and no risk of breaking existing behavior. No files require special attention; the single style note on the test docstring does not affect correctness.

| Filename | Overview |
|---|---|
| model_prices_and_context_window.json | Adds the missing cache_creation_input_token_cost_above_1hr (6e-06) and cache_creation_input_token_cost_above_1hr_above_200k_tokens (1.2e-05) fields to four Sonnet 4.5/4.6 entries; values are consistent with all sibling entries and the 1.6× ratio holds. |
| litellm/model_prices_and_context_window_backup.json | Identical pricing additions as the primary JSON; changes mirror the primary file correctly. |
| tests/test_litellm/test_anthropic_sonnet_1hr_cache_pricing.py | New regression test validates the 1hr pricing fields and the 1.6× ratio invariant; no network calls. Minor: module docstring describes all tested models as "non-bedrock" but claude-sonnet-4-5-20250929-v1:0 has litellm_provider: bedrock. |
Relevant issues
N/A — pricing-data correctness fix; no existing issue located.
Pre-Submission checklist
- Tests: `tests/test_litellm/test_anthropic_sonnet_1hr_cache_pricing.py` (parametrized; fails on current `litellm_internal_staging`, passes with this change)
- `make test-unit`
- `@greptileai` review requested and received a Confidence Score of at least 4/5: Greptile returned 5/5. (The repo's Veria AI reviewer also ran and passed: "No security issues found".)

Type
🐛 Bug Fix
Changes
The native (first-party) Anthropic `claude-sonnet-4-5*` / `claude-sonnet-4-6` entries in the price map were missing the 1-hour prompt-cache write tier, so 1-hour-TTL cache writes were costed at the 5-minute rate (undercounting spend). Every sibling already carried it (`vertex_ai/claude-sonnet-4-*`, `azure_ai/claude-sonnet-4-*`, the `*.anthropic.*` Bedrock profiles), as did the older `claude-sonnet-4-20250514`.

This adds, in both `model_prices_and_context_window.json` and `litellm/model_prices_and_context_window_backup.json`:

- `cache_creation_input_token_cost_above_1hr = 6e-06` to `claude-sonnet-4-5`, `claude-sonnet-4-5-20250929`, `claude-sonnet-4-5-20250929-v1:0`, `claude-sonnet-4-6` (= 2× the `3e-06` base input; 1.6× the 5-minute write, the standard ratio).
- `cache_creation_input_token_cost_above_1hr_above_200k_tokens = 1.2e-05` to the three Sonnet 4.5 variants, which carry a long-context (>200K) tier (Sonnet 4.6 has no >200K tier).

Plus a parametrized regression test asserting the values and the 1.6× ratio invariant.
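The invariants described above can be sketched as a standalone check. This is not the test file from this PR: the inline `ENTRIES` dict is an illustrative stand-in for the four patched price-map entries, the helper `problems` is hypothetical, and the 5-minute write rate of 3.75e-06 is inferred from the stated 1.6× ratio rather than read from the JSON.

```python
import math

ONE_HR = "cache_creation_input_token_cost_above_1hr"
ONE_HR_200K = "cache_creation_input_token_cost_above_1hr_above_200k_tokens"

BASE_INPUT = 3e-06         # base input cost per token, from the PR text
FIVE_MIN_WRITE = 3.75e-06  # inferred: 6e-06 / 1.6

# Illustrative stand-in for the patched entries (only the new fields are shown).
ENTRIES = {
    "claude-sonnet-4-5": {ONE_HR: 6e-06, ONE_HR_200K: 1.2e-05},
    "claude-sonnet-4-5-20250929": {ONE_HR: 6e-06, ONE_HR_200K: 1.2e-05},
    "claude-sonnet-4-5-20250929-v1:0": {ONE_HR: 6e-06, ONE_HR_200K: 1.2e-05},
    "claude-sonnet-4-6": {ONE_HR: 6e-06},  # no >200K tier
}


def problems(name: str, entry: dict) -> list[str]:
    """Return a list of invariant violations for one entry (empty if none)."""
    found = []
    rate = entry.get(ONE_HR)
    if rate is None:
        found.append(f"{name}: missing {ONE_HR}")
    else:
        if not math.isclose(rate, 2 * BASE_INPUT):
            found.append(f"{name}: 1h write is not 2x base input")
        if not math.isclose(rate, 1.6 * FIVE_MIN_WRITE):
            found.append(f"{name}: 1h write is not 1.6x the 5-minute write")
    # Only the Sonnet 4.5 variants carry the long-context tier.
    if name.startswith("claude-sonnet-4-5"):
        if not math.isclose(entry.get(ONE_HR_200K, 0.0), 1.2e-05):
            found.append(f"{name}: missing or wrong {ONE_HR_200K}")
    return found


all_problems = [p for name, entry in ENTRIES.items() for p in problems(name, entry)]
print(all_problems or "all invariants hold")
```

Passing an entry without the 1-hour field, for example `problems("claude-sonnet-4-6", {})`, returns a single "missing" violation, which mirrors the pre-fix state described in this PR.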
Proof of fix
Before (current `litellm_internal_staging`):

After (this branch): all four entries return `6e-06` (and `1.2e-05` for the 4.5 >200K tier); `test_anthropic_sonnet_1hr_cache_write_pricing` passes. `black --check` and `ruff check` are clean.