fix(pricing): add 1h cache-write cost for Anthropic Sonnet 4.5/4.6 #30474

Open
Bochenski wants to merge 1 commit into BerriAI:litellm_internal_staging from Bochenski:fix/sonnet-4-5-4-6-1h-cache-write-cost-staging

Bochenski commented Jun 15, 2026

Relevant issues

N/A — pricing-data correctness fix; no existing issue located.

Pre-Submission checklist

  • I have added meaningful tests — tests/test_litellm/test_anthropic_sonnet_1hr_cache_pricing.py (parametrized; fails on current litellm_internal_staging, passes with this change)
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible; it only solves 1 specific problem (Sonnet 4.5/4.6 1-hour cache-write pricing)
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 — Greptile returned 5/5. (The repo's Veria AI reviewer also ran and passed: "No security issues found".)

Type

🐛 Bug Fix

Changes

The native (first-party) Anthropic claude-sonnet-4-5* / claude-sonnet-4-6 entries in the
price map were missing the 1-hour prompt-cache write tier, so 1-hour-TTL cache writes were
costed at the cheaper 5-minute rate, undercounting spend. Every sibling entry already carried
the field (vertex_ai/claude-sonnet-4-*, azure_ai/claude-sonnet-4-*, the *.anthropic.* Bedrock
profiles), as did the older claude-sonnet-4-20250514.
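
To make the undercount concrete, here is a minimal sketch of the fallback described above. The helper `cache_write_cost` is hypothetical and is not LiteLLM's actual cost calculator. The key `cache_creation_input_token_cost` for the 5-minute write rate and its value 3.75e-06 are assumptions; the value is derived from the ratio stated in this PR (6e-06 / 1.6).

```python
def cache_write_cost(entry: dict, tokens_5m: int, tokens_1h: int) -> float:
    """Cost of cache-write tokens, split by TTL (hypothetical helper)."""
    rate_5m = entry["cache_creation_input_token_cost"]  # assumed key name
    # Assumed fallback: when the entry has no 1-hour rate, reuse the 5-minute rate.
    rate_1h = entry.get("cache_creation_input_token_cost_above_1hr", rate_5m)
    return tokens_5m * rate_5m + tokens_1h * rate_1h


# Entry as it looked before the fix (no 1-hour field) and after it.
before = {"cache_creation_input_token_cost": 3.75e-06}
after = {**before, "cache_creation_input_token_cost_above_1hr": 6e-06}

# One million tokens written with a 1-hour TTL, none with a 5-minute TTL.
cost_before = cache_write_cost(before, 0, 1_000_000)
cost_after = cache_write_cost(after, 0, 1_000_000)
print(f"before: {cost_before:.2f}  after: {cost_after:.2f}")
```

Under these assumptions the same workload is costed at about 3.75 before the fix and 6.00 after it, so the missing field under-reported 1-hour cache writes by 37.5%.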

This adds, in both model_prices_and_context_window.json and
litellm/model_prices_and_context_window_backup.json:

  • cache_creation_input_token_cost_above_1hr = 6e-06 to claude-sonnet-4-5,
    claude-sonnet-4-5-20250929, claude-sonnet-4-5-20250929-v1:0, claude-sonnet-4-6
    (2× the 3e-06 base input rate and 1.6× the 5-minute write rate, the same ratio the
    sibling entries use).
  • cache_creation_input_token_cost_above_1hr_above_200k_tokens = 1.2e-05 to the three
    Sonnet 4.5 variants, which carry a long-context (>200K) tier. Sonnet 4.6 has no >200K
    tier, so it does not get this field.

It also adds a parametrized regression test asserting the exact values and the 1.6× ratio invariant.
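
The shape of such a check can be sketched as follows. This is not the test file from the PR: it uses an inline `SAMPLE_PRICES` dict instead of reading the JSON, and a plain loop instead of `pytest.mark.parametrize`, so it runs without dependencies. The 5-minute keys (`cache_creation_input_token_cost` and its `_above_200k_tokens` variant) and their values are assumptions derived from the 1.6× ratio.

```python
import math

F_5M = "cache_creation_input_token_cost"  # assumed key for the 5-minute rate
F_1H = "cache_creation_input_token_cost_above_1hr"
LONG = "_above_200k_tokens"  # suffix for the >200K long-context tier

# Illustrative stand-in for the price-map entries after this change.
SAMPLE_PRICES = {
    "claude-sonnet-4-5": {
        F_5M: 3.75e-06,
        F_1H: 6e-06,
        F_5M + LONG: 7.5e-06,
        F_1H + LONG: 1.2e-05,
    },
    "claude-sonnet-4-6": {F_5M: 3.75e-06, F_1H: 6e-06},
}

# (model, whether the entry carries a >200K tier)
CASES = [("claude-sonnet-4-5", True), ("claude-sonnet-4-6", False)]


def check_one_hour_pricing(entry: dict, has_long_context: bool) -> None:
    """Assert the exact 1-hour values and the 1.6x ratio to the 5-minute rate."""
    assert entry[F_1H] == 6e-06
    assert math.isclose(entry[F_1H] / entry[F_5M], 1.6)
    if has_long_context:
        assert entry[F_1H + LONG] == 1.2e-05
        assert math.isclose(entry[F_1H + LONG] / entry[F_5M + LONG], 1.6)


for model, has_long_context in CASES:
    check_one_hour_pricing(SAMPLE_PRICES[model], has_long_context)
```

Checking the ratio as well as the literal value means a future price change that updates only one of the two tiers would fail the check rather than pass silently.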

Proof of fix

Before (current litellm_internal_staging):

claude-sonnet-4-5 : cache_creation_input_token_cost_above_1hr = None
claude-sonnet-4-6 : cache_creation_input_token_cost_above_1hr = None

After (this branch): all four entries return 6e-06 (and 1.2e-05 for the 4.5 >200K tier);
test_anthropic_sonnet_1hr_cache_write_pricing passes. black --check and ruff check clean.
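
The before/after lookup can be reproduced with a few lines. This sketch takes the price map as a plain dict so it is self-contained; the helper `one_hour_write_rates` and the `staging` / `branch` sample dicts are hypothetical. In a checkout you would instead pass the result of `json.load` on model_prices_and_context_window.json.

```python
from typing import Dict, List, Optional

FIELD = "cache_creation_input_token_cost_above_1hr"


def one_hour_write_rates(
    price_map: Dict[str, dict], models: List[str]
) -> Dict[str, Optional[float]]:
    """Return the 1-hour cache-write rate per model, or None when it is absent."""
    return {m: price_map.get(m, {}).get(FIELD) for m in models}


MODELS = ["claude-sonnet-4-5", "claude-sonnet-4-6"]

# Illustrative stand-ins: entries without the field, then with it added.
staging = {m: {"input_cost_per_token": 3e-06} for m in MODELS}
branch = {m: {**entry, FIELD: 6e-06} for m, entry in staging.items()}

for label, price_map in (("before", staging), ("after", branch)):
    for model, rate in one_hour_write_rates(price_map, MODELS).items():
        print(f"{label}: {model} : {FIELD} = {rate}")
```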

The native Anthropic claude-sonnet-4-5/4-6 price-map entries were missing
cache_creation_input_token_cost_above_1hr (and the >200K long-context
sub-tier for 4.5), so 1-hour-TTL cache writes were costed at the 5-minute
rate. Adds 6e-06 for the regular tier (and 1.2e-05 for long-context), each
2x the corresponding base input rate, matching the vertex_ai/azure_ai/bedrock
siblings and the older claude-sonnet-4-20250514 entry. Adds a regression test.

codecov Bot commented Jun 15, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

@Bochenski

@greptileai

CLAassistant commented Jun 15, 2026

CLA assistant check
All committers have signed the CLA.

greptile-apps Bot commented Jun 15, 2026

Greptile Summary

This PR fixes missing 1-hour prompt-cache write pricing fields on four native Anthropic/Bedrock Sonnet 4.5 and 4.6 model entries, which caused 1-hour-TTL cache writes to silently fall back to the cheaper 5-minute rate and undercount spend.

  • Adds cache_creation_input_token_cost_above_1hr: 6e-06 to claude-sonnet-4-5, claude-sonnet-4-5-20250929, claude-sonnet-4-5-20250929-v1:0, and claude-sonnet-4-6 in both JSON price files; also adds the >200K long-context tier (1.2e-05) to the three Sonnet 4.5 entries that publish it.
  • Values follow the established 1.6× (5-minute → 1-hour) ratio already present on all sibling entries (vertex_ai/, azure_ai/, *.anthropic.* Bedrock profiles, claude-sonnet-4-20250514).
  • Adds a parametrized regression test that reads the JSON directly and asserts both the exact values and the ratio invariant; no network calls.

Confidence Score: 5/5

Safe to merge: the change is purely additive and data-only (new fields in two JSON price files), with no logic changes and a new regression test that confirms the values.

Only two JSON files are modified (adding missing fields that every sibling entry already carries), and the new test independently validates the values and ratio from the source file. There are no logic changes, no production code edits, and no risk of breaking existing behavior.

No files require special attention; the single style note on the test docstring does not affect correctness.

Important Files Changed

| Filename | Overview |
| --- | --- |
| model_prices_and_context_window.json | Adds the missing cache_creation_input_token_cost_above_1hr (6e-06) and cache_creation_input_token_cost_above_1hr_above_200k_tokens (1.2e-05) fields to four Sonnet 4.5/4.6 entries; values are consistent with all sibling entries and the 1.6× ratio holds. |
| litellm/model_prices_and_context_window_backup.json | Identical pricing additions as the primary JSON; changes mirror the primary file correctly. |
| tests/test_litellm/test_anthropic_sonnet_1hr_cache_pricing.py | New regression test validates the 1hr pricing fields and the 1.6× ratio invariant; no network calls. Minor: module docstring describes all tested models as "non-bedrock" but claude-sonnet-4-5-20250929-v1:0 has litellm_provider: bedrock. |
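
Because the same fields are added to two files, one way to guard against the primary and backup maps drifting apart is a field-by-field comparison. The sketch below is an illustration only: `mirror_mismatches` is a hypothetical helper, and the inline dicts stand in for the two parsed JSON files.

```python
from typing import Dict, List, Tuple

FIELDS = [
    "cache_creation_input_token_cost_above_1hr",
    "cache_creation_input_token_cost_above_1hr_above_200k_tokens",
]


def mirror_mismatches(
    primary: Dict[str, dict],
    backup: Dict[str, dict],
    models: List[str],
    fields: List[str],
) -> List[Tuple[str, str, object, object]]:
    """List (model, field, primary_value, backup_value) where the files differ."""
    mismatches = []
    for model in models:
        for field in fields:
            p = primary.get(model, {}).get(field)
            b = backup.get(model, {}).get(field)
            if p != b:
                mismatches.append((model, field, p, b))
    return mismatches


# Illustrative stand-ins for the two parsed JSON files.
primary = {
    "claude-sonnet-4-5": {FIELDS[0]: 6e-06, FIELDS[1]: 1.2e-05},
    "claude-sonnet-4-6": {FIELDS[0]: 6e-06},
}
backup = {model: dict(entry) for model, entry in primary.items()}

assert mirror_mismatches(primary, backup, list(primary), FIELDS) == []
```

A field that is absent from both files compares as None on each side, so Sonnet 4.6's missing >200K tier is not reported as a mismatch.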

Reviews (2): Last reviewed commit: "fix(pricing): add 1h cache-write cost fo..."
