fix(otel): export SERVER span on management-endpoint success without http_request#28792
Closed
yassin-berriai wants to merge 4 commits into
Closed
fix(otel): export SERVER span on management-endpoint success without http_request#28792yassin-berriai wants to merge 4 commits into
yassin-berriai wants to merge 4 commits into
Conversation
…ut http_request management_endpoint_wrapper only ran async_management_endpoint_success_hook — which stamps http.response.status_code=200 and end()s the parent SERVER span — when the handler declared an http_request param. 45 of 65 wrapped endpoints (all /key/*, /user/*, /mcp/*, ...) lack it, so on success their SERVER span, created in auth, was never ended and thus never exported. Mirror the failure branch: invoke the success hook regardless, falling back to func.__name__ for the route. Add a wrapper-level regression test (fails before this change) and an otel_verify/ end-to-end harness (config + curl matrix + span verifier) verifying PRs #28273, #28362, #28364, #28405 against a running proxy. https://claude.ai/code/session_016u6Pe2S2zBVrUuFF1N6GkJ
|
|
Contributor
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
|
|
||
| def main(): | ||
| traces = {} | ||
| for line in open(sys.argv[1] if len(sys.argv) > 1 else "spans.jsonl"): |
| spans = traces.get(tid, []) | ||
| by_name = {} | ||
| for s in spans: | ||
| by_name.setdefault(s["name"], s).update() if False else by_name.setdefault(s["name"], s) |
| cats_ok = False | ||
| try: | ||
| cats_ok = sorted(json.loads(cats_raw)) == sorted(g["categories"]) | ||
| except Exception: |
Re-runs CI against the corrected base branch and tidies the verification harness (Black formatting + removal of an unused by_name block). https://claude.ai/code/session_016u6Pe2S2zBVrUuFF1N6GkJ
Adding the success-path fix pushed management_endpoint_wrapper to 52 statements (ruff PLR0915 limit is 50). Extract the shared success/failure OTEL span emission into _emit_management_endpoint_otel_span, which also de-duplicates the two near-identical inline blocks. Behavior is unchanged: the parent SERVER span is stamped + ended on both paths, including for handlers without an http_request param. https://claude.ai/code/session_016u6Pe2S2zBVrUuFF1N6GkJ
…guards Bring patch coverage of _emit_management_endpoint_otel_span and the wrapper to 100%: add tests for the failure path, the http_request-present path, the no-OTEL-logger early return, and the non-blocking post-success error path. Remove two unreachable `if kwargs is None` guards (**kwargs is always a dict). https://claude.ai/code/session_016u6Pe2S2zBVrUuFF1N6GkJ
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
While building an end-to-end verification matrix for the four OTEL-fidelity PRs shipped in v1.87.0 — #28273 (team attrs on all spans), #28362 (serialize
guardrail_response), #28364 (guardrail span on failure + status/categories), #28405 (http.response.status_codeon the SERVER span) — the matrix surfaced one real gap in #28405's coverage, fixed here.The bug
management_endpoint_wrapperonly invokesasync_management_endpoint_success_hook(which stampshttp.response.status_code=200andend()s the parent SERVER span) when the handler declares anhttp_requestparameter — theif _http_request:guard atlitellm/proxy/management_helpers/utils.py:471.45 of 65 wrapped management endpoints — including all
/key/*,/user/*,/mcp/*, and several/team/member*— have nohttp_requestparam. On their success path the SERVER span (created inuser_api_key_auth) was therefore never ended and never exported. The failure path was unaffected (it ends the parent via afunc.__name__fallback), so dashboards saw failed admin calls but not successful ones — contradicting the goal of "HTTP status + path + duration on the root span for all statuses."The fix
Mirror the failure branch in the success branch: run the success hook regardless, falling back to
func.__name__for the route whenhttp_requestis absent. One file:litellm/proxy/management_helpers/utils.py.Verification
Unit: new
test_management_wrapper_success_ends_server_span_without_http_requestdrives the real wrapper around anhttp_request-less handler and asserts the SERVER span exports with 200. It fails before this change ("SERVER span never finished") and passes after. Fulltests/test_litellm/integrations/open_telemetry/suite: 110 passed.End-to-end (
otel_verify/): a self-contained harness (mock model + custom file span-exporter + a Bedrock-style guardrail) drives a 12-case curl matrix against a running proxy, each with a deterministictraceparent, and asserts the spans:litellm_request/raw_gen_ai_requestFailed Proxy Server Request/v1/messagessuccess/key/generatesuccess/key/generatefailure / 422violation_categories; #28362guardrail_responsevalid JSONsuccess; #28362 valid JSONResult: 12/12 pass after the fix (was 11/12). All four original PRs are confirmed working at runtime.
Harness files:
config.yaml,setup.sh(team+key),otel_file_exporter.py,verify_guardrail.py,run_matrix.sh,verify_spans.py. Runtime artifacts (incl. the generated virtual key) are gitignored.Notes
/v1/responseswas not exercised live (the matrix focuses on chat/messages/admin). The existing unit suite covers it.Generated by Claude Code