fix(v1.87.0): unblock e2e --mock-only — 3 fails → 0 (Layer 1) by songkuan-zheng · Pull Request #67 · GhishaDev/litellm

songkuan-zheng · 2026-06-04T13:15:18Z

Layer 1 of post-bump audit: e2e --mock-only had 3 fails. Root causes diagnosed by agents, one real prod regression (case 20 streaming fast_path) + 2 e2e harness fixes (case 07 auth default, case 23 in_flight drain). After: 15 PASS / 0 FAIL / 7 SKIP.

Layer 1 of the post-bump audit. The e2e mock-only run had 3 failures. The root causes were independent: one is a real production regression (case 20), and the other two are e2e-harness wiring. After this commit, `e2e/tools/run-all-cases --mock-only` reports 15 PASS / 0 FAIL / 7 SKIP (the skips are Tier=real cases that require a real provider).

## 1. case 07 metrics-endpoint-smoke (e2e config)

Upstream v1.84.0 flipped the default of `litellm.require_auth_for_metrics_endpoint` to True. Anonymous `curl /metrics` now returns 401 in the e2e harness, so the test's `families=0 ct_ok=0` result was an auth failure misread as a Prometheus emission bug.

The case 07 runbook explicitly assumes anonymous /metrics access (the scrape posture for a trusted-network local Prometheus). This change restores that posture in the e2e config. Production is unaffected, because the setting only ships in the autogenerated `e2e/_config/.litellm.rendered.yaml`.

File: `e2e/tools/proxy`. Added `require_auth_for_metrics_endpoint: false` to the rendered `litellm_settings` block.

## 2. case 20 returned_model_name streaming /v1/messages (real bug)

**Production regression.** Wave 5b placed the `message_start.message.model` SSE rewrite AFTER upstream's `async_post_call_streaming_hook` call. Upstream PR BerriAI#28289 (in v1.87.0) introduced a `fast_path` short-circuit before that hook for the dominant config (no guardrails, default `include_cost_in_streaming_usage`). As a result, the rewrite was skipped on every streaming /v1/messages request where `returned_model_name` is set, and the upstream model id leaked to the client.

Fix: move the rewrite block BEFORE the `fast_path` short-circuit. The overhead in the unset-override case is near zero (one dict get + one substring test in the SSE byte rewriter).

File: `litellm/proxy/common_request_processing.py:2113-2173`.

## 3. case 23 mock-memory-pressure (e2e harness ordering)

Case 23 reset the mock counter without waiting for the mock-side in_flight to drain. Case 20's last assertion (A4 streaming /v1/chat/completions) returns `[DONE]` to the client well before the mock-side handler finishes writing chunks (the mock counter is bumped post-`wfile.flush()`). The leaked stream from case 20 finished during case 23's burst window, bumping the counter and producing a false 6-of-5 failure.

Fix: in case 23, poll `/__mock__/state` for `in_flight == 0` (50 × 100ms, bounded ~5s) before issuing the reset.

File: `e2e/cases/data/23_mock_memory_pressure.sh`.

## Verification

```bash
e2e/tools/run-all-cases --mock-only
# → 15 PASS / 0 FAIL / 7 SKIP (skipped are Tier=real, expected)
```

Tier: C (case 20: universal bug fix on streaming SSE) + B (cases 07 and 23: internal e2e infrastructure).
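For reviewers who want the case 20 ordering issue in miniature, here is a rough Python sketch. It is not the code in `common_request_processing.py`. The function names, the `slow_hook` parameter, and the idea that `returned_model_name` is read from a per-request dict are all assumptions made for illustration. It only shows why the rewrite has to run before the `fast_path` return.

```python
import json
from typing import Callable, Optional


def rewrite_message_start_model(chunk: bytes, request_data: dict) -> bytes:
    """Hypothetical SSE byte rewriter: swap message_start.message.model for the override."""
    override: Optional[str] = request_data.get("returned_model_name")  # one dict get
    if override is None or b"message_start" not in chunk:  # one substring test
        return chunk  # cheap exit for the unset-override case and for other events
    out_lines = []
    for line in chunk.split(b"\n"):
        if line.startswith(b"data:"):
            payload = json.loads(line[len(b"data:"):].decode("utf-8"))
            message = payload.get("message")
            if payload.get("type") == "message_start" and isinstance(message, dict):
                message["model"] = override
                line = b"data: " + json.dumps(payload).encode("utf-8")
        out_lines.append(line)
    return b"\n".join(out_lines)


def process_chunk(
    chunk: bytes,
    request_data: dict,
    fast_path: bool,
    slow_hook: Callable[[bytes], bytes],
) -> bytes:
    """Toy stand-in for the streaming loop body (all names are illustrative)."""
    # The rewrite runs first, so the fast-path return below cannot skip it.
    chunk = rewrite_message_start_model(chunk, request_data)
    if fast_path:
        return chunk  # short-circuit: the post-call hook is never reached
    return slow_hook(chunk)
```

If the rewrite call sat after the `if fast_path` return instead, every fast-path request would emit the chunk untouched, which matches the leak described in section 2.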
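The bounded drain wait from the case 23 fix can be sketched in Python as follows. The actual fix lives in a shell script, so this is only a rendition of the same idea. `wait_for_drain`, the `fetch_state` callable, and the assumption that `/__mock__/state` returns a JSON object with an integer `in_flight` field are hypothetical.

```python
import time
from typing import Callable, Mapping


def wait_for_drain(
    fetch_state: Callable[[], Mapping[str, int]],
    attempts: int = 50,
    interval_s: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the mock reports in_flight == 0; give up after `attempts` polls."""
    for attempt in range(attempts):
        # fetch_state is assumed to return the parsed /__mock__/state payload.
        if fetch_state().get("in_flight") == 0:
            return True
        if attempt < attempts - 1:
            sleep(interval_s)  # no sleep after the final poll
    return False
```

With the defaults (50 polls, 100 ms apart) the wait is bounded at roughly 5 seconds. The caller would issue the counter reset only after the function returns `True`, and can treat `False` as a harness error instead of resetting while a stream is still in flight.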

songkuan-zheng merged commit e1031db into ship/v1.87.0 Jun 4, 2026

songkuan-zheng deleted the fix/v1.87.0-e2e-mock-only-greens branch June 4, 2026 13:15

songkuan-zheng merged 1 commit into ship/v1.87.0 from fix/v1.87.0-e2e-mock-only-greens
