fix: completion_cost AttributeError on streaming Anthropic web_search responses (#26153) by ishaan-berri · Pull Request #27346 · BerriAI/litellm

ishaan-berri · 2026-05-07T00:06:51Z

Relevant issues

Fixes #26153

Linear ticket

(no Linear ticket — bug fix from external user)

Pre-Submission checklist

I have added testing in tests/test_litellm/ — 3 new test files, 18 new tests
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible — fixes one specific bug
I have requested a Greptile review by commenting @greptileai

CI (LiteLLM team)

Branch creation CI run — link:
CI run for the last commit — link:
Merge / cherry-pick CI run — links:

Screenshots / Proof of Fix

Before:

>>> r = stream_chunk_builder(chunks)
>>> completion_cost(completion_response=r)
AttributeError: 'dict' object has no attribute 'web_search_requests'

After:

>>> r = stream_chunk_builder(chunks)
>>> type(r.usage.server_tool_use)
<class 'litellm.types.utils.ServerToolUse'>
>>> completion_cost(completion_response=r)
0.00347  # works

Type

Bug Fix

Changes

Two-pronged fix per the issue reporter's recommendation:

Root cause: Usage.__init__ and stream_chunk_builder now coerce server_tool_use to the ServerToolUse Pydantic model when reconstructing usage from streaming chunks. Previously the streaming path round-tripped through model_dump() and rebuilt Usage(**dump), which left server_tool_use as a plain dict.
Defensive at call sites: added a _get_web_search_requests helper that handles None / dict / ServerToolUse uniformly. It is used in tool_call_cost_tracking.py and anthropic/cost_calculation.py.

Both fixes shipped together. Tests cover the regression + the helper's tolerance for all input types.
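The round-trip failure described above can be reproduced without litellm. The following is a minimal sketch that uses standard-library dataclasses as stand-ins: `ServerToolUse` here is an illustrative class rather than litellm's Pydantic model, `NaiveUsage` and `CoercingUsage` are hypothetical names, and `asdict` plays the role of `model_dump()`.

```python
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class ServerToolUse:
    """Illustrative stand-in, not litellm's ServerToolUse model."""
    web_search_requests: Optional[int] = None


@dataclass
class NaiveUsage:
    """Hypothetical: stores server_tool_use as given, so a dict stays a dict."""
    server_tool_use: Any = None


@dataclass
class CoercingUsage:
    """Hypothetical: coerces a dict-shaped server_tool_use into the typed object."""
    server_tool_use: Any = None

    def __post_init__(self) -> None:
        if isinstance(self.server_tool_use, dict):
            self.server_tool_use = ServerToolUse(**self.server_tool_use)


original = CoercingUsage(server_tool_use=ServerToolUse(web_search_requests=2))
dump = asdict(original)  # the nested object is serialized to a plain dict

naive = NaiveUsage(**dump)        # rebuilt usage holds a dict
coerced = CoercingUsage(**dump)   # rebuilt usage holds a typed object again

try:
    naive.server_tool_use.web_search_requests  # attribute access on a dict
    naive_error = None
except AttributeError as exc:
    naive_error = str(exc)
```

In this sketch `naive_error` carries the same kind of message as the traceback in the "Before" section, while `coerced.server_tool_use.web_search_requests` can be read normally.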

CLAassistant · 2026-05-07T00:06:58Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ mateo-berri
❌ ishaan-berri

greptile-apps · 2026-05-07T00:08:53Z

Greptile Summary

Fixes AttributeError: 'dict' object has no attribute 'web_search_requests' (#26153) that occurred when calling completion_cost on a streaming Anthropic web-search response rebuilt by stream_chunk_builder. The root cause was that the streaming chunk processor stored server_tool_use as a plain dict rather than coercing it to the ServerToolUse Pydantic model, and downstream cost-calculation code accessed .web_search_requests as an attribute.

Coercion at source: streaming_chunk_builder_utils.py now coerces any incoming dict to ServerToolUse before storing it in UsagePerChunk, so all downstream consumers see a typed object.
Defensive helper: A new _get_web_search_requests utility in llm_cost_calc/utils.py tolerates None, plain dicts, and ServerToolUse instances uniformly; both tool_call_cost_tracking.py and anthropic/cost_calculation.py now use it instead of direct attribute access.
Tests: Three new test files and an updated existing test assert the coercion, the helper's edge cases, and the end-to-end completion_cost call — no real-network calls, consistent with the repo's test policy.
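A tolerant reader of the kind the summary describes could look roughly like the sketch below. This is not the code from llm_cost_calc/utils.py; `read_web_search_requests` and `ToolUseStandIn` are hypothetical names chosen for illustration, and the only behavior assumed is what the summary states: None, a plain dict, and a typed object are all accepted.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ToolUseStandIn:
    """Hypothetical typed object standing in for ServerToolUse."""
    web_search_requests: Optional[int] = None


def read_web_search_requests(server_tool_use: Any) -> Optional[int]:
    """Return web_search_requests from None, a dict, or an attribute-style object."""
    if server_tool_use is None:
        return None
    if isinstance(server_tool_use, dict):
        # Dict-shaped usage, e.g. after a dump/rebuild round trip.
        return server_tool_use.get("web_search_requests")
    # Typed object: fall back to None if the attribute is absent.
    return getattr(server_tool_use, "web_search_requests", None)


results = [
    read_web_search_requests(None),
    read_web_search_requests({"web_search_requests": 3}),
    read_web_search_requests({}),
    read_web_search_requests(ToolUseStandIn(web_search_requests=1)),
]
```

Routing every call site through one reader like this means a later change in how usage is stored only has to be handled in a single place.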

Confidence Score: 5/5

Safe to merge — changes are narrowly scoped to the streaming cost-calculation path with no effect on request handling or auth.

The fix touches only the streaming chunk accumulator and two cost-calculation call sites. The defensive helper is trivially correct, the coercion in the chunk processor is well-guarded with isinstance checks, and the existing Usage.__init__ dict→ServerToolUse coercion already provided a backstop at the final round-trip. The modified existing test is strictly stronger (type check added), new tests cover None/dict/pydantic inputs end-to-end, and no real-network calls are made.

No files require special attention.

Important Files Changed

Filename	Overview
litellm/litellm_core_utils/llm_cost_calc/utils.py	Added `_get_web_search_requests` helper that safely reads `web_search_requests` from None, dict, or ServerToolUse — canonical definition used by both downstream modules.
litellm/litellm_core_utils/streaming_chunk_builder_utils.py	Coerces incoming dict/arbitrary `server_tool_use` to `ServerToolUse` pydantic during chunk accumulation, fixing the root cause of the AttributeError.
litellm/litellm_core_utils/llm_cost_calc/tool_call_cost_tracking.py	Two attribute-access guard chains replaced with `_get_web_search_requests` helper; logic is equivalent and now dict-safe.
litellm/llms/anthropic/cost_calculation.py	Replaced chained `usage.server_tool_use.web_search_requests` attribute access with `_get_web_search_requests` and `getattr` guard; no logic change.
tests/test_litellm/litellm_core_utils/test_streaming_chunk_builder_utils.py	Existing test strengthened: dict-index assertion replaced with explicit `isinstance(ServerToolUse)` + attribute access, correctly catching the regression.
tests/test_litellm/litellm_core_utils/test_streaming_chunk_builder_server_tool_use.py	New regression test file; exercises the full `stream_chunk_builder` → `completion_cost` path with a dict-shaped `server_tool_use` chunk.
tests/test_litellm/litellm_core_utils/llm_cost_calc/test_tool_call_cost_tracking_dict_safety.py	New unit tests for dict/pydantic/None handling in `StandardBuiltInToolCostTracking`; imports `_get_web_search_requests` via `tool_call_cost_tracking` re-export rather than its canonical `utils` module.
tests/test_litellm/llms/anthropic/test_cost_calculation_dict_safety.py	New unit tests for `get_cost_for_anthropic_web_search` with dict/pydantic/None inputs; imports `_get_web_search_requests` via the `cost_calculation` re-export rather than its canonical `utils` module.

Reviews (5): Last reviewed commit: "refactor(types): drop redundant server_t..."

codspeed-hq · 2026-05-07T00:08:53Z

Merging this PR will not alter performance

✅ 16 untouched benchmarks

Comparing litellm_fix-issue-26153 (0c3c626) with main (6ff668c)

DmitriyAlergant · 2026-05-12T18:54:11Z

@ishaan-berri this fixes the prior regression (spend calculation no longer fails), but the cost of Anthropic Web Search is still not included in the total spend.

mateo-berri · 2026-06-11T02:43:27Z

@greptileai

OpenAI shut down the gpt-4o-realtime-preview family (incl. the undated alias) on 2026-05-07, causing the live realtime test to fail with a 4000 invalid_request_error.invalid_model close. gpt-realtime is the GA successor; switch the live-call tests to it, matching the base branch.

mateo-berri · 2026-06-11T02:47:14Z

@greptileai

mateo-berri · 2026-06-11T03:03:02Z

@greptileai

codecov · 2026-06-11T03:05:43Z

Codecov Report

❌ Patch coverage is 88.88889% with 2 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
...itellm_core_utils/streaming_chunk_builder_utils.py	60.00%	2 Missing ⚠️
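The percentages in the report follow from simple line counts. The sketch below assumes coverage is computed as covered lines divided by total changed lines; the totals of 18 (whole patch) and 5 (the listed file) are inferred from the reported figures and do not appear in the report itself.

```python
def coverage_pct(total_lines: int, missing_lines: int) -> float:
    """Percentage of changed lines that are covered, rounded to 5 decimals."""
    covered = total_lines - missing_lines
    return round(100 * covered / total_lines, 5)


# Inferred totals (assumptions): 18 changed lines overall, 5 in the listed file.
patch_pct = coverage_pct(total_lines=18, missing_lines=2)
file_pct = coverage_pct(total_lines=5, missing_lines=2)
```

Under these assumptions both values match the report: 16 of 18 lines covered for the patch and 3 of 5 for streaming_chunk_builder_utils.py.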

mateo-berri · 2026-06-11T03:13:59Z

@greptileai

mateo-berri

LGTM; thanks!

… responses (#26153) (#27346) Cherry-picked from staging squash 4a3860d. stable/1.88.x predates the Usage.__init__ server_tool_use dict->ServerToolUse coercion that staging carries (it landed via the squashed OSS sync #29932 / 32c88ca, not as a standalone commit). The calculate_usage Usage(**returned_usage.model_dump()) round-trip on this line re-serializes server_tool_use to a plain dict, so without that coercion the rebuilt usage holds a dict and the regression test asserting a ServerToolUse type fails. Restored the coercion in litellm/types/utils.py to satisfy the prerequisite -- it matches #27346's own first commit (coerce server_tool_use dict to ServerToolUse in Usage.__init__), which was dropped from the squash only because staging already carried it.

… responses (#26153) (#27346) Cherry-picked from staging squash 4a3860d. The rc line predates the Usage.__init__ server_tool_use dict->ServerToolUse coercion that staging carries (it landed via the squashed OSS sync #29932 / 32c88ca, not as a standalone commit). The calculate_usage Usage(**returned_usage.model_dump()) round-trip re-serializes server_tool_use to a plain dict, so without that coercion the rebuilt usage holds a dict and the regression test asserting a ServerToolUse type fails. Restored the coercion in litellm/types/utils.py to satisfy the prerequisite -- it matches #27346's own first commit (coerce server_tool_use dict to ServerToolUse in Usage.__init__), which was dropped from the squash only because staging already carried it.

…AIDR, Mantle SigV4, NetApp streaming-cost fix, and team-scoped Datadog toward v1.89.0-rc.3 (#30179)

* fix(proxy): authorize batch files using upload target_model_names (LIT-3593) (#30009)
* fix(proxy): authorize batch files using upload target_model_names (LIT-3593) After replace_model_in_jsonl, body.model is a stripped provider id. Reverse-mapping it via resolve_model_name_from_model_id is first-match on model_list and caused false 403s when multiple deployments share the same stripped name. Use target_model_names from the unified file id instead. Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(proxy): restore resolve_model_name_from_model_id for JSONL fallback path (LIT-3593) Restores the reverse-lookup for the JSONL body.model fallback path so that legacy/pre-target_model_names managed files still map stripped provider IDs back to proxy aliases before auth. Also cleans up redundant `or None`. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Revert "fix(proxy): restore resolve_model_name_from_model_id for JSONL fallback path (LIT-3593)" This reverts commit 30d2e96.

--------- Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> (cherry picked from commit 2cd7e87)

* feat(guardrails): capture user and model metadata in CrowdStrike AIDR (cherry picked from commit 6fc715c)
* fix(guardrails): read CrowdStrike AIDR identity from both metadata bags (#29991) Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first (cherry picked from commit 1bbaf1c)
* feat(bedrock_mantle): add SigV4/IAM auth to Responses API route (#29788) Applied as the squash diff of PR #29788 (head 9800b2f), which landed upstream inside the litellm_oss_staging_080626 sync (32c88ca, #29932) and has no standalone commit to cherry-pick. The rc line already carries the prerequisite #29490 Responses route via the 040626 sync.
* fix: completion_cost AttributeError on streaming Anthropic web_search responses (#26153) (#27346) Cherry-picked from staging squash 4a3860d. The rc line predates the Usage.__init__ server_tool_use dict->ServerToolUse coercion that staging carries (it landed via the squashed OSS sync #29932 / 32c88ca, not as a standalone commit). The calculate_usage Usage(**returned_usage.model_dump()) round-trip re-serializes server_tool_use to a plain dict, so without that coercion the rebuilt usage holds a dict and the regression test asserting a ServerToolUse type fails. Restored the coercion in litellm/types/utils.py to satisfy the prerequisite -- it matches #27346's own first commit (coerce server_tool_use dict to ServerToolUse in Usage.__init__), which was dropped from the squash only because staging already carried it.
* feat(datadog): add team-scoped Datadog callback support (#29947) Cherry-picked from the PR head 9c049da (single-commit PR, merged to litellm_oss_branch). Applied cleanly; no conflicts. Note: black --check in this worktree flags pre-existing multi-line string formatting in litellm_core_utils/litellm_logging.py (lines ~1006-1050) that is already present on the patch/v1.89.0-rc.1 base and is untouched by this pick -- left as-is to avoid reformatting unrelated lines.

--------- Co-authored-by: Sameer Kankute <sameer@berri.ai> Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Kenan Yildirim <kenan@kenany.me> Co-authored-by: yuneng-jiang <yuneng@berri.ai> Co-authored-by: Kent <kingdooo@gmail.com> Co-authored-by: ishaan-berri <155045088+ishaan-berri@users.noreply.github.com> Co-authored-by: aanchal22 <12680748+aanchal22@users.noreply.github.com>

… responses (BerriAI#26153) (BerriAI#27346)

* fix: coerce server_tool_use dict to ServerToolUse in Usage.__init__ (BerriAI#26153)
* fix: coerce server_tool_use to ServerToolUse in stream_chunk_builder (BerriAI#26153)
* fix: dict/pydantic-tolerant access in tool_call_cost_tracking (BerriAI#26153)
* fix: dict/pydantic-tolerant access in anthropic cost_calculation (BerriAI#26153)
* test: assert ServerToolUse type in existing stream_chunk_builder anthropic web search test
* test: regression test for BerriAI#26153 (stream_chunk_builder server_tool_use type)
* test: dict/pydantic safety for tool_call_cost_tracking helper
* test: dict/pydantic safety for anthropic web_search cost
* refactor: consolidate _get_web_search_requests into shared cost-calc utils
* test(realtime): use gpt-realtime; openai retired gpt-4o-realtime-preview OpenAI shut down the gpt-4o-realtime-preview family (incl. the undated alias) on 2026-05-07, causing the live realtime test to fail with a 4000 invalid_request_error.invalid_model close. gpt-realtime is the GA successor; switch the live-call tests to it, matching the base branch.
* refactor(types): drop redundant server_tool_use coercion in Usage.__init__

--------- Co-authored-by: mateo-berri <277851410+mateo-berri@users.noreply.github.com>

… responses (BerriAI#26153) (BerriAI#27346)

* fix: coerce server_tool_use dict to ServerToolUse in Usage.__init__ (BerriAI#26153)
* fix: coerce server_tool_use to ServerToolUse in stream_chunk_builder (BerriAI#26153)
* fix: dict/pydantic-tolerant access in tool_call_cost_tracking (BerriAI#26153)
* fix: dict/pydantic-tolerant access in anthropic cost_calculation (BerriAI#26153)
* test: assert ServerToolUse type in existing stream_chunk_builder anthropic web search test
* test: regression test for BerriAI#26153 (stream_chunk_builder server_tool_use type)
* test: dict/pydantic safety for tool_call_cost_tracking helper
* test: dict/pydantic safety for anthropic web_search cost
* refactor: consolidate _get_web_search_requests into shared cost-calc utils
* test(realtime): use gpt-realtime; openai retired gpt-4o-realtime-preview OpenAI shut down the gpt-4o-realtime-preview family (incl. the undated alias) on 2026-05-07, causing the live realtime test to fail with a 4000 invalid_request_error.invalid_model close. gpt-realtime is the GA successor; switch the live-call tests to it, matching the base branch.
* refactor(types): drop redundant server_tool_use coercion in Usage.__init__

--------- Co-authored-by: mateo-berri <277851410+mateo-berri@users.noreply.github.com> (cherry picked from commit 4a3860d)

Prerequisite for BerriAI#27346 (completion_cost AttributeError on streaming Anthropic web_search). BerriAI#27346's per-chunk coercion is undone by the existing Usage(**returned_usage.model_dump()) reconstruction in calculate_usage, which round-trips server_tool_use back to a plain dict; without this Usage.__init__ coercion the cost path still does attribute access on a dict and raises. The same prerequisite was bundled into the stable/1.88.x backport of BerriAI#27346 (24b9655). Content-verified present on litellm_internal_staging via aggregator 32c88ca (Litellm oss staging 080626, BerriAI#29932); this restores only the two-line coercion, not the rest of that aggregator. (cherry picked from commit 32c88ca)

ishaan-berri added 8 commits May 6, 2026 17:05

fix: coerce server_tool_use dict to ServerToolUse in Usage.__init__ (#26153)

0baf01b

fix: coerce server_tool_use to ServerToolUse in stream_chunk_builder (#26153)

e43c5e6

fix: dict/pydantic-tolerant access in tool_call_cost_tracking (#26153)

f9836fc

fix: dict/pydantic-tolerant access in anthropic cost_calculation (#26153)

99e9192

test: assert ServerToolUse type in existing stream_chunk_builder anthropic web search test

b80ada3

test: regression test for #26153 (stream_chunk_builder server_tool_use type)

32f86aa

test: dict/pydantic safety for tool_call_cost_tracking helper

16e6ee7

test: dict/pydantic safety for anthropic web_search cost

0c3c626

ishaan-berri mentioned this pull request May 7, 2026

[Bug]: completion_cost() raises AttributeError on streaming Anthropic responses with web_search in v1.83.10 #26153

Closed

1 task

greptile-apps Bot reviewed May 7, 2026

Comment thread litellm/llms/anthropic/cost_calculation.py Outdated

shivamrawat1 changed the base branch from main to litellm_internal_staging May 12, 2026 19:05

ririnto mentioned this pull request Jun 6, 2026

fix: normalize Anthropic passthrough server tool usage #29827

Merged

6 tasks

mateo-berri self-requested a review June 11, 2026 02:37

refactor: consolidate _get_web_search_requests into shared cost-calc utils

0c71231

Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_fix-issue-26153

e072749

mateo-berri mentioned this pull request Jun 11, 2026

chore(release): backport Fable 5, batch-file auth, CrowdStrike AIDR, Mantle Responses SigV4, and NetApp streaming-cost fix to stable/1.88.x and cut 1.88.2 #30144

Merged

2 tasks

refactor(types): drop redundant server_tool_use coercion in Usage.__init__

740d638

mateo-berri mentioned this pull request Jun 11, 2026

chore(release): patch v1.89.0-rc.2 with batch-file auth, CrowdStrike AIDR, Mantle SigV4, NetApp streaming-cost fix, and team-scoped Datadog toward v1.89.0-rc.3 #30179

Merged

2 tasks

mateo-berri enabled auto-merge (squash) June 11, 2026 04:19

mateo-berri approved these changes Jun 11, 2026

mateo-berri merged commit 4a3860d into litellm_internal_staging Jun 11, 2026
118 checks passed

mateo-berri deleted the litellm_fix-issue-26153 branch June 11, 2026 04:20

This was referenced Jun 14, 2026

chore(release): backport DB-resilience, passthrough, model-info, budget, and deps fixes to stable/1.88.x #30408

Merged

chore(release): backport 1.84.8 patch set + MCP/model-info/DB fixes to stable/1.89.x and cut 1.89.1 #30502

Merged

mateo-berri merged 12 commits into litellm_internal_staging from litellm_fix-issue-26153
