fix(advisor): attribute advisor sub-call spend to the originating key/user#30481
fix(advisor): attribute advisor sub-call spend to the originating key/user#30481bse-ai wants to merge 1 commit into
Conversation
…/user The advisor orchestration sub-call did not forward the parent request's proxy auth/attribution context (litellm_metadata / user_api_key_dict / proxy_server_request) that the executor leg already spreads via **kwargs. With no key/user/team in scope the proxy cost-tracking callback skips the SpendLogs write entirely, so advisor spend is attributed to nobody — it runs on resolved provider credentials and is visible only in raw provider invocation logs, never in per-user litellm logs. Forward the proxy context to the advisor leg, excluding litellm_logging_obj so the advisor sub-call mints its own logging object and its spend is not double-counted against the parent request's call id (api_key/api_base are also excluded as they are passed explicitly). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Greptile SummaryThis PR fixes advisor sub-call spend attribution by forwarding the parent request's proxy auth context (
Confidence Score: 4/5The production change is a small, well-scoped dict filter that adds context to one previously context-free call; it is additive and backward-compatible when no proxy context is present. The implementation correctly excludes the three keys that would conflict with explicit arguments or cause double-counting, and keys that could cause duplicate parameter errors were already removed before the loop. The only gap is that the new test does not verify api_key and api_base are absent from the forwarded dict. The test file would benefit from adding assertions that api_key and api_base are excluded from the advisor passthrough.
|
| Filename | Overview |
|---|---|
| litellm/llms/anthropic/experimental_pass_through/messages/interceptors/advisor.py | Forwards proxy auth/attribution context to the advisor sub-call by building advisor_passthrough from kwargs minus litellm_logging_obj, api_key, and api_base. Keys that could conflict with explicit params (metadata, litellm_call_id) are already popped before the loop, so the passthrough is clean. |
| tests/test_litellm/llms/anthropic/experimental_pass_through/messages/test_advisor_integration.py | Adds test_advisor_subcall_forwards_proxy_attribution which verifies the three attribution fields are forwarded and litellm_logging_obj is excluded. Does not assert that api_key/api_base are excluded from the passthrough. |
Reviews (1): Last reviewed commit: "fix(advisor): attribute advisor sub-call..." | Re-trigger Greptile
| with patch( | ||
| "litellm.llms.anthropic.experimental_pass_through.messages.interceptors.advisor._call_messages_handler", | ||
| side_effect=mock_handler, | ||
| ): | ||
| await AdvisorOrchestrationHandler().handle( | ||
| model="openai/gpt-4o-mini", | ||
| messages=MESSAGES, | ||
| tools=[ADVISOR_TOOL], | ||
| stream=False, | ||
| max_tokens=512, | ||
| custom_llm_provider="openai", | ||
| litellm_metadata=sentinel_meta, | ||
| user_api_key_dict=sentinel_key, | ||
| proxy_server_request=sentinel_psr, | ||
| litellm_logging_obj=object(), | ||
| ) | ||
|
|
||
| advisor_legs = [c for c in captured if c["tools"] is None] | ||
| assert advisor_legs, "advisor sub-call (tools=None) must have fired" | ||
| adv = advisor_legs[0]["kwargs"] | ||
| assert adv.get("litellm_metadata") == sentinel_meta | ||
| assert adv.get("user_api_key_dict") is sentinel_key | ||
| assert adv.get("proxy_server_request") == sentinel_psr | ||
| # Own logging object → not stamped onto the parent request. | ||
| assert "litellm_logging_obj" not in adv |
There was a problem hiding this comment.
Test doesn't cover
api_key/api_base exclusion
The filter in advisor_passthrough explicitly excludes api_key and api_base to prevent a caller-supplied key from silently overriding the advisor's own credentials (advisor_api_key/advisor_api_base). If that filter were accidentally removed, the advisor would call the provider with the wrong key, but this test wouldn't catch it. Adding api_key="sk-parent" and api_base="https://parent.example.com" to the handle() invocation and asserting they don't appear in adv would close this gap.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
| # is excluded so the advisor leg gets its own logging object and its | ||
| # spend is not double-counted against the parent request's call id; | ||
| # api_key/api_base are excluded because they are passed explicitly. | ||
| advisor_passthrough = { |
There was a problem hiding this comment.
Medium: Credential forwarding into advisor sub-call
advisor_tool["model"] comes from the request, but this forwards every parent kwarg except api_key/api_base. For routed deployments that use other credential fields like aws_access_key_id, aws_secret_access_key, vertex_credentials, or litellm_credential_name, a caller can invoke an arbitrary advisor model with the parent deployment's cloud credentials instead of only inheriting attribution context.
| advisor_passthrough = { | |
| advisor_attribution_keys = { | |
| "litellm_metadata", | |
| "user_api_key_dict", | |
| "proxy_server_request", | |
| } | |
| advisor_passthrough = { | |
| k: v for k, v in kwargs.items() if k in advisor_attribution_keys | |
| } |
PR overviewThis PR updates the Anthropic experimental pass-through advisor interceptor so advisor sub-call spend can be attributed back to the originating key or user. The touched code handles how request context is passed into advisor tool model calls. There is one open security concern around the advisor sub-call inheriting more parent request parameters than intended. Because the advisor model can come from the request, routed deployments that use nonstandard credential fields could allow a caller to run an advisor call using the parent deployment’s cloud credentials rather than only receiving attribution context. No issues have been fixed yet, so the PR still needs tightening around which fields are forwarded into advisor calls. Open issues (1)
Fixed/addressed: 0 · PR risk: 6/10 |
|
Thanks for the contribution! A couple of things to address before this is ready for merge:
Once those are in, we'll take another look! |
What
The advisor orchestration sub-call (
AdvisorOrchestrationHandler.handle) does not forward the parent request's proxy auth/attribution context to the advisor leg, so advisor spend is never attributed to the originating key/user in SpendLogs.Why
The executor leg dispatches with
**kwargs, solitellm_metadata/user_api_key_dict/proxy_server_requestreach the@clientwrapper and the call is logged + cost-attributed. The advisor leg calls_call_messages_handler(...)without spreadingkwargs, so none of that context is present. The proxy cost-tracking callback's gate (_should_track_cost_callback) requires a non-Noneuser_api_key/user_id/team_id/end_user_id, so for the advisor leg it returns False and the SpendLogs write is skipped entirely. The advisor sub-call still runs on resolved provider credentials, so its spend is real and visible in raw provider invocation logs but invisible in per-user litellm logs.Impact
For any deployment using the advisor tool on a non-native provider, every advisor sub-call's spend is missing from SpendLogs and unattributable to a key/user/team — so per-user cost reporting silently undercounts, by the full cost of the advisor (typically a larger/more expensive reviewer model) on every invocation. The spend is real (it bills against provider credentials); it is simply not recorded by the gateway.
Fix
Forward the parent proxy context to the advisor leg, excluding:
litellm_logging_obj— so the advisor leg mints its own logging object and its spend is not double-counted against the parent request'slitellm_call_id;api_key/api_base— passed explicitly as the advisor's own credentials.Additive and behaviour-preserving when there is no proxy context (SDK / native use): the passthrough dict is simply empty.
Test plan
test_advisor_subcall_forwards_proxy_attribution— asserts the advisor leg receiveslitellm_metadata/user_api_key_dict/proxy_server_requestand does not receivelitellm_logging_objtests/test_litellm/llms/anthropic/experimental_pass_through/messages/test_advisor_integration.pytests/test_litellm/llms/anthropic/messages/test_advisor_orchestration.py