test(proxy): pin proxy_server.py non-route surface behavior (PR1)#28856

Merged
yuneng-berri merged 2 commits into litellm_yj_may29 from claude/epic-gauss-SY9hx on May 29, 2026

@yuneng-berri

Collaborator

Summary

PR1 of the proxy_server.py behavior-pinning project (Notion plan). Fills the seven PR1 placeholder files under tests/test_litellm/proxy/proxy_server/ (created by the harness PR #28827) with behavior pins for the non-route surface of litellm/proxy/proxy_server.py.

Pinned domains (101 symbols total, from the PR1 pin list):

  • Lifecycle (8): proxy_startup_event, proxy_shutdown_event, _initialize_shared_aiohttp_session, cleanup_router_config_variables, save_worker_config, initialize, load_from_azure_key_vault, cost_tracking
  • Misc helpers (5): check_request_disconnection, _resolve_typed_dict_type, _resolve_pydantic_type, get_litellm_model_info, run_ollama_serve
  • OpenAPI customization (8): _generate_stable_operation_id, _strip_operation_id_method_suffix, ensure_unique_openapi_operation_ids, _inject_websocket_stubs_into_openapi_schema, get_openapi_schema, custom_openapi, mount_swagger_ui, _get_cors_config
  • Exception handlers (4): openai_exception_handler, _close_dangling_otel_server_span, otel_request_validation_exception_handler, otel_unhandled_exception_handler
  • Spend counters (15): get_current_spend, increment_spend_counters, update_cache, and 12 supporting helpers
  • Background health (8): _get_process_rss_mb, _rss_mb_for_log, _run_direct_health_check_with_instrumentation, _schedule_background_health_check_db_save, _get_endpoint_exception_status, _write_health_state_to_router_cache, _adaptive_router_flusher_loop, _run_background_health_check
  • Config scrubbers (3): _is_remote_module_url, _scrub_guardrail_inner, _scrub_db_overlay_remote_module_loads
  • Streaming helpers (10): data_generator, async_data_generator, async_assistants_data_generator, select_data_generator, and 6 supporting helpers
  • ProxyConfig class methods (40): __init__, is_yaml, _load_yaml_file, _get_config_from_file, _process_includes, save_config, _check_for_os_environ_vars, _get_team_config, load_team_config, _init_cache, switch_on_llm_response_caching, get_config, update_config_state, get_config_state, load_credential_list, parse_search_tools, _load_environment_variables, load_config, _init_non_llm_configs, _init_policy_engine, _load_alerting_settings, initialize_secret_manager, get_model_info_with_id, _delete_deployment, _add_deployment, decrypt_model_list_from_db, _update_llm_router, _add_callback_from_db_to_in_memory_litellm_callbacks, _add_callbacks_from_db_config, _encrypt_env_variables, _decrypt_and_set_db_env_variables, _decrypt_db_variables, _encrypt_env_variables_for_db, _parse_router_settings_value, _get_hierarchical_router_settings, _add_router_settings_from_db_config, _add_general_settings_from_db_config, _reschedule_spend_log_cleanup_job, _update_general_settings, _update_config_fields

Each pin has ≥1 happy-path test with a strong assertion (normalize(...) == {...}, .model_validate(...), or dict-eq with ≥3 keys) and ≥1 error-path test (raises, error-named, or 4xx/5xx).
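
As a rough illustration of one happy/error pair, here is a minimal sketch. Both names in it are assumptions: `normalize` is a hypothetical stand-in for the harness helper (its real behavior is not shown in this description), and `parse_reload_schedule` is a toy function invented for the example, not a symbol from proxy_server.py. The real suite would normally use `pytest.raises` for the error path; plain `try/except` is used here only to keep the sketch free of third-party imports.

```python
def normalize(value):
    """Hypothetical stand-in for the harness helper: reduce a value to
    plain dicts and lists so it can be compared against a literal."""
    if isinstance(value, dict):
        return {str(k): normalize(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [normalize(v) for v in value]
    return value


def parse_reload_schedule(raw):
    """Toy function under test (invented for this sketch)."""
    hours = int(raw["interval_hours"])
    if hours <= 0:
        raise ValueError("interval_hours must be positive")
    return {"interval_hours": hours, "enabled": True, "sources": ("config",)}


def test_parse_reload_schedule_happy():
    # Strong assertion: the whole three-key dict is pinned, not one field.
    assert normalize(parse_reload_schedule({"interval_hours": "12"})) == {
        "interval_hours": 12,
        "enabled": True,
        "sources": ["config"],
    }


def test_parse_reload_schedule_error_raises():
    # Error path: the pin fails if bad input stops being rejected.
    try:
        parse_reload_schedule({"interval_hours": 0})
    except ValueError as exc:
        assert "positive" in str(exc)
    else:
        raise AssertionError("expected ValueError")
```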

Numbers

| Gate | Target | This PR |
| --- | --- | --- |
| Pin coverage (_pin_check.py) | all 101 pins | PASS (101 pins, 214 happy+error pairs) |
| Line coverage on proxy_server.py (new-tests-only) | ≥25% | 32.80% |
| Branch coverage (new-tests-only) | ≥18% | 20.91% |
| Runtime, -n 4 | <3 min | ~22 s (233 tests) |

Test plan

  • uv run pytest tests/test_litellm/proxy/proxy_server/ -n 4: 233 passed, 0 failed
  • python tests/test_litellm/proxy/proxy_server/_coverage_check.py --pr-target 1: PASS (line 32.80% / 25%, branch 20.91% / 18%)
  • python tests/test_litellm/proxy/proxy_server/_pin_check.py --list .pin_list.txt: PASS
  • uv run black tests/test_litellm/proxy/proxy_server/: clean

Out of scope

  • Forwarding routes (PR2) and control-plane routes (PR3) — separate PRs branched off this one's merge per the plan's sequencing.
  • Cleanup of pre-existing scattered tests: explicitly deferred (Decision #11 in the plan).

Generated by Claude Code

Fills the 7 PR1 placeholder files under tests/test_litellm/proxy/proxy_server/
with behavior pins for the non-route surface of proxy_server.py:
lifecycle/init/shutdown, ProxyConfig class methods, DB-overlay config scrubbers,
spend counters, background-health helpers, OpenAPI customization, exception
handlers, and streaming-generator helpers.

233 tests cover 101 pin-list symbols (1+ happy + 1+ error each). New-tests-only
coverage on litellm/proxy/proxy_server.py: 32.80% line / 20.91% branch (PR1
gate: 25% line / 18% branch). Full directory runs in ~22s with -n 4.

Plan: https://www.notion.so/Plan-Pin-proxy_server-py-behavior-2026-05-25-36c43b8acdab81ee845fd5365128a2fc

@codecov

codecov Bot commented May 26, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.


@greptile-apps

greptile-apps Bot commented May 26, 2026

Contributor

Greptile Summary

This PR fills seven placeholder test files under tests/test_litellm/proxy/proxy_server/ with 233 behavior-pin tests covering 101 symbols from the non-route surface of litellm/proxy/proxy_server.py. All changes are additive test code; no production source is modified.

  • 101 symbols pinned across lifecycle, spend counters, OpenAPI customization, exception handlers, background health, config scrubbers, streaming helpers, and all 40+ ProxyConfig methods — each with at least one happy-path and one error-path assertion.
  • Coverage gates met: 32.80% line coverage and 20.91% branch coverage on proxy_server.py from the new tests alone (targets were 25% / 18%); .gitignore gains entries for the two local artifact files produced by the pin-check and coverage scripts.

Confidence Score: 4/5

Safe to merge — all changes are additive test code with no production source touched. The two findings are minor test-quality issues with no runtime impact.

The PR adds 233 new tests with no modifications to existing tests or production code. One assertion in test_lifecycle.py is tautological (both sides of a param_count comparison use the same runtime expression), so it will never detect a signature regression on that axis. A second test in the same file spins for a real 1.2-second wall-clock window because asyncio.sleep is not patched, which could add latency or flakiness under CI load. Both are confined to a single test file and do not affect other tests or the production path.

tests/test_litellm/proxy/proxy_server/test_lifecycle.py — tautological param_count assertion and unmocked sleep timeout

Important Files Changed

| Filename | Overview |
| --- | --- |
| tests/test_litellm/proxy/proxy_server/test_lifecycle.py | Replaces placeholder with 26 behavior-pin tests for lifecycle, helpers, and small utilities. Two minor issues: a tautological param_count assertion in the initialize signature test, and a real 1.2-second asyncio.wait_for timeout in the disconnection test. |
| tests/test_litellm/proxy/proxy_server/test_spend_counters.py | Replaces placeholder with 30 behavior-pin tests covering all 15 spend-counter symbols. Good use of helper factories and consistent assertion patterns; no issues found. |
| tests/test_litellm/proxy/proxy_server/test_proxy_config.py | Replaces placeholder with 80+ behavior-pin tests covering all 40+ ProxyConfig methods plus 3 module-level scrubbers. All external dependencies properly mocked; no issues found. |
| tests/test_litellm/proxy/proxy_server/test_background_health.py | Replaces placeholder with behavior-pin tests for all 8 background health-check helpers. asyncio.sleep is correctly patched in every loop test; no issues found. |
| tests/test_litellm/proxy/proxy_server/test_exception_handlers.py | Replaces placeholder with behavior-pin tests for the 4 exception handlers. Well-structured; tests both happy path and edge cases such as empty error lists and missing OTEL spans. |
| tests/test_litellm/proxy/proxy_server/test_openapi_customization.py | Replaces placeholder with behavior-pin tests for all 8 OpenAPI/CORS helpers. Fresh FastAPI instances are used in place of the global app, keeping tests fully isolated. |
| tests/test_litellm/proxy/proxy_server/test_streaming_helpers.py | Replaces placeholder with behavior-pin tests for all 10 streaming helper symbols. Real async generators are used correctly; all mocks are scoped to the function under test. |
| .gitignore | Adds .pin_list.txt and .cov_new.xml to gitignore (local artifacts from the pin-check and coverage scripts). |

Reviews (1): Last reviewed commit: "test(proxy): pin proxy_server.py non-rou..."

Comment on lines +224 to +235
    observed = {
        "is_async": inspect.iscoroutinefunction(initialize),
        "param_count": len(sig.parameters),
        "has_model": "model" in sig.parameters,
        "has_config": "config" in sig.parameters,
    }
    assert normalize(observed) == {
        "is_async": True,
        "param_count": len(sig.parameters),
        "has_model": True,
        "has_config": True,
    }

P2 The param_count entry in both the observed dict and the expected dict uses len(sig.parameters) — the same expression — so the assertion param_count == param_count is always true regardless of the actual parameter count. If initialize gains or loses parameters this test will not catch it. Replace the right-hand side with a hard-coded integer that reflects the current signature.

Suggested change
-    observed = {
-        "is_async": inspect.iscoroutinefunction(initialize),
-        "param_count": len(sig.parameters),
-        "has_model": "model" in sig.parameters,
-        "has_config": "config" in sig.parameters,
-    }
-    assert normalize(observed) == {
-        "is_async": True,
-        "param_count": len(sig.parameters),
-        "has_model": True,
-        "has_config": True,
-    }
+    expected_param_count = len(sig.parameters)  # pin the exact count here
+    observed = {
+        "is_async": inspect.iscoroutinefunction(initialize),
+        "param_count": len(sig.parameters),
+        "has_model": "model" in sig.parameters,
+        "has_config": "config" in sig.parameters,
+    }
+    assert normalize(observed) == {
+        "is_async": True,
+        "param_count": expected_param_count,
+        "has_model": True,
+        "has_config": True,
+    }
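
To make the point of this comment concrete, the following sketch contrasts a self-referential comparison with a literal pin. The two `initialize` functions are toy stand-ins invented for illustration; the real `initialize` in proxy_server.py has a different signature, so the literal `3` here is an assumption and not the value the real test should use. Note that the pin only has teeth once `expected_param_count` is an integer literal rather than another call to `len(sig.parameters)`.

```python
import inspect


async def initialize(model=None, config=None, debug=False):
    """Toy stand-in with three parameters (not the real signature)."""


async def initialize_v2(model=None, config=None, debug=False, extra=None):
    """The same toy function after gaining one parameter."""


def signature_facts(func):
    sig = inspect.signature(func)
    return {
        "is_async": inspect.iscoroutinefunction(func),
        "param_count": len(sig.parameters),
        "has_model": "model" in sig.parameters,
        "has_config": "config" in sig.parameters,
    }


def tautological_check(func):
    # Both sides come from the same signature, so this can never fail.
    observed = signature_facts(func)
    return observed["param_count"] == len(inspect.signature(func).parameters)


def pinned_check(func, expected_param_count):
    # The expected count is supplied by the test author as a literal.
    return signature_facts(func) == {
        "is_async": True,
        "param_count": expected_param_count,
        "has_model": True,
        "has_config": True,
    }


print(tautological_check(initialize), tautological_check(initialize_v2))  # True True
print(pinned_check(initialize, 3), pinned_check(initialize_v2, 3))        # True False
```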

Comment on lines +357 to +368
async def test_check_request_disconnection_invalid_when_connected_times_out():
    """With a connected request the function loops for up to 10 minutes —
    wrap in wait_for and assert it times out (using real asyncio.sleep)."""
    request = MagicMock()
    request.is_disconnected = AsyncMock(return_value=False)
    task = MagicMock()

    with pytest.raises(asyncio.TimeoutError):
        await asyncio.wait_for(
            check_request_disconnection(request=request, llm_api_call_task=task),
            timeout=1.2,
        )

P2 Unmocked 1.2-second wall-clock wait per run

asyncio.sleep is not patched here, so check_request_disconnection spins through real sleeps until asyncio.wait_for cancels it after 1.2 s. This adds real elapsed time to every suite run and can be flaky under CPU pressure on CI. Consider patching asyncio.sleep to AsyncMock(return_value=None) at the start of the test and lowering the wait_for timeout to something like 0.05 s; the test would still verify the TimeoutError path without blocking for over a second.
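
One detail may be worth checking when applying this: awaiting a bare `AsyncMock` completes without suspending, so if nothing else in the polling loop yields to the event loop, the `wait_for` timeout cannot fire. The sketch below sidesteps that by patching `asyncio.sleep` with a replacement that skips the delay but still yields once. `poll_until_disconnected` is a hypothetical stand-in for `check_request_disconnection`, written only from the behavior described in the test docstring above; the real function may differ.

```python
import asyncio
import time
from unittest.mock import AsyncMock, MagicMock, patch

_real_sleep = asyncio.sleep  # captured before any patching


async def _yielding_sleep(_delay):
    """Replacement for asyncio.sleep: skip the delay, but still yield once
    so the event loop can run timers such as the wait_for timeout."""
    await _real_sleep(0)


async def poll_until_disconnected(request, max_seconds=600):
    """Hypothetical stand-in for the polling loop under test."""
    start = time.monotonic()
    while time.monotonic() - start < max_seconds:
        await asyncio.sleep(1)
        if await request.is_disconnected():
            raise ConnectionError("client disconnected")


async def run_timeout_check():
    request = MagicMock()
    request.is_disconnected = AsyncMock(return_value=False)
    started = time.monotonic()
    with patch("asyncio.sleep", new=_yielding_sleep):
        try:
            await asyncio.wait_for(poll_until_disconnected(request), timeout=0.05)
        except asyncio.TimeoutError:
            return True, time.monotonic() - started
    return False, time.monotonic() - started


if __name__ == "__main__":
    timed_out, elapsed = asyncio.run(run_timeout_check())
    print(timed_out, f"{elapsed:.3f}s")
```

The timeout path is still exercised, while the wall-clock cost drops from about 1.2 s to roughly the 0.05 s timeout.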

test(proxy): address Greptile review comments on test_lifecycle.py

- test_initialize_signature_is_async_with_expected_params: hard-code
  expected_param_count so a signature change actually trips the gate
  (previously both sides of the comparison were len(sig.parameters)).
- test_check_request_disconnection_invalid_when_connected_times_out:
  patch asyncio.sleep so the test no longer spins for ~1.2 s of real
  wall-clock; timeout lowered to 0.05 s.
yuneng-berri pushed a commit that referenced this pull request May 26, 2026
Apply the same review feedback Greptile gave on PR1 (#28856) and PR3
(#28850) to PR2's forwarding-route tests:

- Replace permissive `>= 400` / `in (X, Y)` status assertions with the
  exact 500/405 the handler actually returns, so a regression that
  silently shifts the code now fails the pin.
- Add a body-presence check alongside each tightened status assertion
  to satisfy _pin_check.py's no-status-only rule.
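
The difference between a permissive and an exact status pin can be sketched as follows. `FakeResponse` is a stand-in invented for this example (the real tests presumably use an HTTP test client), and the body-length check is only an assumed reading of what `_pin_check.py`'s no-status-only rule accepts.

```python
from dataclasses import dataclass


@dataclass
class FakeResponse:
    """Minimal stand-in for a test-client response (invented for this sketch)."""

    status_code: int
    content: bytes


def permissive_pin(response):
    # Passes for any 4xx/5xx, so a silent 500 -> 404 shift goes unnoticed.
    return response.status_code >= 400


def exact_pin(response, expected_status):
    # Pins the exact code and also requires a non-empty body.
    return response.status_code == expected_status and len(response.content) > 0


before = FakeResponse(500, b'{"error": "boom"}')
after = FakeResponse(404, b'{"error": "not found"}')

print(permissive_pin(before), permissive_pin(after))  # True True
print(exact_pin(before, 500), exact_pin(after, 500))  # True False
```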
yuneng-berri added a commit that referenced this pull request May 29, 2026
* test(proxy): pin proxy_server.py forwarding-route behavior

PR2 of the proxy_server.py behavior-pinning project: fills the 12
forwarding-route test files added by the harness PR with happy + error
pins for all 52 LLM-facing routes (models, chat/completions, completions,
embeddings, moderations, audio, assistants, threads, utils, model-info,
model-metrics, queue). Every happy-path test asserts the full response
dict via normalize() so the gate enforces real shape pinning rather
than status codes.

* test(proxy): drop task-plumbing comments from PR2 test files

* test(proxy): tighten PR2 error-path status-code pins

Apply the same review feedback Greptile gave on PR1 (#28856) and PR3
(#28850) to PR2's forwarding-route tests:

- Replace permissive `>= 400` / `in (X, Y)` status assertions with the
  exact 500/405 the handler actually returns, so a regression that
  silently shifts the code now fails the pin.
- Add a body-presence check alongside each tightened status assertion
  to satisfy _pin_check.py's no-status-only rule.

---------

Co-authored-by: Claude <noreply@anthropic.com>
@yuneng-berri yuneng-berri changed the base branch from litellm_internal_staging to litellm_yj_may29 May 29, 2026 22:41
@yuneng-berri yuneng-berri merged commit 2a77947 into litellm_yj_may29 May 29, 2026
46 checks passed
yuneng-berri added a commit that referenced this pull request May 30, 2026
* test(proxy/proxy_server): pin forwarding routes (PR2) (#28887)

* test(proxy): pin proxy_server.py forwarding-route behavior

PR2 of the proxy_server.py behavior-pinning project: fills the 12
forwarding-route test files added by the harness PR with happy + error
pins for all 52 LLM-facing routes (models, chat/completions, completions,
embeddings, moderations, audio, assistants, threads, utils, model-info,
model-metrics, queue). Every happy-path test asserts the full response
dict via normalize() so the gate enforces real shape pinning rather
than status codes.

* test(proxy): drop task-plumbing comments from PR2 test files

* test(proxy): tighten PR2 error-path status-code pins

Apply the same review feedback Greptile gave on PR1 (#28856) and PR3
(#28850) to PR2's forwarding-route tests:

- Replace permissive `>= 400` / `in (X, Y)` status assertions with the
  exact 500/405 the handler actually returns, so a regression that
  silently shifts the code now fails the pin.
- Add a body-presence check alongside each tightened status assertion
  to satisfy _pin_check.py's no-status-only rule.

---------

Co-authored-by: Claude <noreply@anthropic.com>

* test(proxy): pin proxy_server.py non-route surface behavior (PR1) (#28856)

* test(proxy): pin proxy_server.py non-route surface behavior (PR1)

Fills the 7 PR1 placeholder files under tests/test_litellm/proxy/proxy_server/
with behavior pins for the non-route surface of proxy_server.py:
lifecycle/init/shutdown, ProxyConfig class methods, DB-overlay config scrubbers,
spend counters, background-health helpers, OpenAPI customization, exception
handlers, and streaming-generator helpers.

233 tests cover 101 pin-list symbols (1+ happy + 1+ error each). New-tests-only
coverage on litellm/proxy/proxy_server.py: 32.80% line / 20.91% branch (PR1
gate: 25% line / 18% branch). Full directory runs in ~22s with -n 4.

Plan: https://www.notion.so/Plan-Pin-proxy_server-py-behavior-2026-05-25-36c43b8acdab81ee845fd5365128a2fc

* test(proxy): address Greptile review comments on test_lifecycle.py

- test_initialize_signature_is_async_with_expected_params: hard-code
  expected_param_count so a signature change actually trips the gate
  (previously both sides of the comparison were len(sig.parameters)).
- test_check_request_disconnection_invalid_when_connected_times_out:
  patch asyncio.sleep so the test no longer spins for ~1.2 s of real
  wall-clock; timeout lowered to 0.05 s.

---------

Co-authored-by: Claude <noreply@anthropic.com>

* test(proxy/proxy_server): pin control-plane routes (PR3) (#28850)

* test(proxy/proxy_server): pin misc routes (PR3, partial)

Adds happy + error tests for the misc control-plane routes:
GET /, /routes, /adaptive_router/state, /get_logo_url,
/get_image, /get_favicon.

Also gitignores .pin_list.txt (used by the pin gate).

* test(proxy/proxy_server): pin login/SSO routes (PR3, partial)

Adds happy + error tests for the 5 login/SSO control-plane routes:
GET /fallback/login, POST /login, POST /v2/login, POST /v3/login,
POST /v3/login/exchange. Mocks authenticate_user and
create_ui_token_object at their imported location.

* test(proxy/proxy_server): pin onboarding routes (PR3, partial)

Adds happy + error tests for the 2 onboarding control-plane routes:
GET /onboarding/get_token, POST /onboarding/claim_token. Wires a
MagicMock async context manager for prisma_client.db.tx() and
signs the onboarding JWT with the patched master_key.
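
A minimal sketch of wiring a MagicMock as an async context manager for `prisma_client.db.tx()` might look like this. Only `prisma_client.db.tx()` comes from the commit message; `some_table`, its `update` method, and the `claim` function are hypothetical names used purely for illustration.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


def make_prisma_with_tx():
    """Build a mock whose db.tx() can be used in `async with`."""
    tx = MagicMock(name="tx")
    # Hypothetical table and method, named only for this sketch.
    tx.some_table.update = AsyncMock(return_value={"id": "abc"})

    tx_cm = MagicMock(name="tx_context_manager")
    tx_cm.__aenter__.return_value = tx    # value bound by `async with ... as`
    tx_cm.__aexit__.return_value = False  # do not swallow exceptions

    prisma_client = MagicMock(name="prisma_client")
    prisma_client.db.tx.return_value = tx_cm
    return prisma_client, tx


async def claim(prisma_client):
    """Toy code under test: one write inside a transaction."""
    async with prisma_client.db.tx() as tx:
        return await tx.some_table.update(where={"id": "abc"}, data={"used": True})


prisma_client, tx = make_prisma_with_tx()
result = asyncio.run(claim(prisma_client))
print(result)  # {'id': 'abc'}
tx.some_table.update.assert_awaited_once()
```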

* test(proxy/proxy_server): pin model_cost_map reload routes (PR3, partial)

Adds happy + error tests for the 5 model-cost-map control-plane routes:
POST /reload/model_cost_map, POST|DELETE|GET
/schedule/model_cost_map_reload(/status), GET /model/cost_map/source.
Attaches litellm_config to mock_prisma per-test (the table is not in
the default _PRISMA_TABLES fixture).

* test(proxy/proxy_server): pin anthropic_beta_headers reload routes (PR3, partial)

Adds happy + error tests for the 4 anthropic-beta-headers control-plane
routes: POST /reload/anthropic_beta_headers, POST|DELETE|GET
/schedule/anthropic_beta_headers_reload(/status). Stubs
db.litellm_config (not in default _PRISMA_TABLES) and monkeypatches
reload_beta_headers_config so no network calls fire.

* test(proxy/proxy_server): pin invitation routes (PR3, partial)

Adds happy + error tests for the 4 invitation control-plane routes:
POST /invitation/new, GET /invitation/info, POST /invitation/update,
POST /invitation/delete. Patches _user_has_admin_privileges /
_user_has_admin_view to avoid extensive get_user_object mocking.

* test(proxy/proxy_server): pin config CRUD routes (PR3, partial)

Adds happy + error tests for the 8 config-CRUD control-plane routes:
POST /config/update, POST|GET /config/field/update|info, GET /config/list,
POST /config/field/delete, POST /config/callback/delete,
GET /get/config/callbacks, GET /config/yaml. Attaches litellm_config
to mock_prisma per-test.

* test(proxy/proxy_server): tighten pin assertions per review

- test_routes_misc.py: `b"" in response.content` is trivially true;
  replace with `len(response.content) > 0` so an empty 405 body trips
  the gate.
- test_routes_login_sso.py: `len(response.content) >= 0` is trivially
  true; tighten to `> 0`.
- test_routes_anthropic_beta.py: replace brittle string-literal checks
  on the serialized JSON (`'"interval_hours": 12' in payload`) with
  `json.loads` + dict access so the assertion survives any serializer
  spacing.
- test_routes_config.py: `assert status_code in (404, 500)` was too
  permissive; the handler re-raises HTTPException(404) verbatim, so
  pin 404 strictly.
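
The first three points can be reproduced in isolation. The snippet below uses an invented payload and does not touch the real routes; it only shows why the original assertions could not fail and why parsing the JSON is more robust than matching a string literal.

```python
import json

empty_body = b""
# Compact separators: no space after the colon.
real_body = json.dumps({"interval_hours": 12}, separators=(",", ":")).encode()

# The empty byte string is contained in every byte string, so this
# membership check passes even when the body is empty.
print(b"" in empty_body)  # True

# A length is never negative, so this also passes for an empty body.
print(len(empty_body) >= 0)  # True

# The tightened form does fail on an empty body.
print(len(empty_body) > 0)  # False

# A string-literal check depends on serializer spacing...
print('"interval_hours": 12' in real_body.decode())  # False

# ...while parsing first does not.
print(json.loads(real_body)["interval_hours"] == 12)  # True
```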

---------

Co-authored-by: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>