docs(release): expand Anthropic streaming perf depth-item write-up (v1.87.0rc1)#225

Merged
yassin-berriai merged 2 commits into yuneng/release-notes-v1-87-0-rc-1 from claude/gracious-edison-1mBAx on May 25, 2026

Conversation

yassin-berriai (Contributor) commented May 25, 2026

Resolves LIT-3334

Stacked on top of yuneng/release-notes-v1-87-0-rc-1 — contributes my depth-item write-up to the v1.87.0rc1 release notes (per @yuneng-jiang's request in #214).

Summary

Expands the Anthropic streaming hot-path perf entry (#28289) — my depth item for this release — from a one-liner into a full write-up:

  • The four optimization groups (skip no-op per-chunk work, de-duplicate per-request work, cheaper end-of-stream reconstruction, cheaper hot-path logging).
  • Source-of-truth metrics from internal Week-4 real-deployment testing (4-pod m7i.xlarge, no HPA, 256 text_delta chunks/request, validated on both Anthropic direct and Bedrock Invoke): TTFT overhead ~90% lower (p50 2220% → 165%, p95 3057% → 316%, p99 3111% → 328%) and TPM +12% / +6% / +4% (p50 / p95 / p99).
  • Notes the wire-output parity testing.
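
For reference, the "~90% lower" headline follows directly from the before/after TTFT overhead percentages quoted above. A minimal sketch of that arithmetic, using only the figures in this description (the helper name `relative_reduction` is illustrative and is not part of LiteLLM or the release notes):

```python
# Relative reduction in TTFT overhead, computed from the before/after
# percentages quoted in this PR description. Illustrative only.

def relative_reduction(before: float, after: float) -> float:
    """Return how much lower `after` is than `before`, as a percentage."""
    return (before - after) / before * 100.0

# TTFT overhead (%) before and after the change, per percentile.
ttft_overhead = {
    "p50": (2220.0, 165.0),
    "p95": (3057.0, 316.0),
    "p99": (3111.0, 328.0),
}

reductions = {
    pct: round(relative_reduction(before, after), 1)
    for pct, (before, after) in ttft_overhead.items()
}

for pct, value in reductions.items():
    print(f"{pct}: {value}% lower")
# p50: 92.6% lower
# p95: 89.7% lower
# p99: 89.5% lower
```

All three percentiles land close to 90%, which is consistent with the rounded headline figure.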

Also leads the matching Key Highlights line with the headline result.

Metrics now come from the internal Week-4 performance doc (real deployment), replacing the local mock-SSE benchmark figures from the PR description.

Not included

  • #28794 (management-endpoint SERVER span) — merged 2026-05-25, after this RC was cut, so it is not part of v1.87.0rc1. It belongs in the next release's notes.
  • The other PRs I merged this week (#28273, #28362, #28364, #28395) were already documented by Yuneng and need no change.

Test plan

  • Docusaurus builds and the note renders at /release_notes

vercel Bot commented May 25, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| litellm | Ready | Preview, Comment | May 25, 2026 9:11pm |

yassin-berriai force-pushed the claude/gracious-edison-1mBAx branch from 5430012 to a6e4dba on May 25, 2026 20:45
yassin-berriai changed the title from "docs(release): add v1.87.0-rc.1 release notes" to "docs(release): add PR #28794 to v1.87.0rc1 release notes" on May 25, 2026
yassin-berriai changed the base branch from main to yuneng/release-notes-v1-87-0-rc-1 on May 25, 2026 20:45

Flesh out the #28289 entry in the v1.87.0rc1 notes with the specific
optimizations and benchmark numbers (depth-item write-up), and lead the
Key Highlights line with the headline speedup.

https://claude.ai/code/session_01HDDqHBK46d5bLsFih3WkEu
yassin-berriai force-pushed the claude/gracious-edison-1mBAx branch from a6e4dba to dbb30a6 on May 25, 2026 20:52
yassin-berriai changed the title from "docs(release): add PR #28794 to v1.87.0rc1 release notes" to "docs(release): expand Anthropic streaming perf depth-item write-up (v1.87.0rc1)" on May 25, 2026

Replace the local mock-benchmark figures with the source-of-truth
metrics from internal Week-4 testing (4-pod m7i.xlarge, Anthropic +
Bedrock Invoke): TTFT overhead ~90% lower, TPM +12/6/4%.

https://claude.ai/code/session_01HDDqHBK46d5bLsFih3WkEu
yassin-berriai marked this pull request as ready for review on May 25, 2026 21:11
yassin-berriai merged commit c7a358b into yuneng/release-notes-v1-87-0-rc-1 on May 25, 2026
1 of 2 checks passed
yassin-berriai deleted the claude/gracious-edison-1mBAx branch on May 25, 2026 21:11
ryan-crabbe-berri pushed a commit that referenced this pull request May 30, 2026
…table (#214)

* docs(release-notes): add v1.87.0rc1

* docs(release): expand Anthropic streaming perf depth-item write-up (v1.87.0rc1) (#225)

* docs(release): expand Anthropic streaming perf depth-item write-up

Flesh out the #28289 entry in the v1.87.0rc1 notes with the specific
optimizations and benchmark numbers (depth-item write-up), and lead the
Key Highlights line with the headline speedup.

https://claude.ai/code/session_01HDDqHBK46d5bLsFih3WkEu

* docs(release): use internal real-deployment perf numbers for #28289

Replace the local mock-benchmark figures with the source-of-truth
metrics from internal Week-4 testing (4-pod m7i.xlarge, Anthropic +
Bedrock Invoke): TTFT overhead ~90% lower, TPM +12/6/4%.

https://claude.ai/code/session_01HDDqHBK46d5bLsFih3WkEu

---------

Co-authored-by: Claude <noreply@anthropic.com>

* docs(release): add perf metrics table for #28289 write-up (#226)

Present the internal Week-4 before/after numbers (TPM, TTFT overhead)
as a markdown table and move the exact figures out of the prose to
avoid duplication.

https://claude.ai/code/session_01HDDqHBK46d5bLsFih3WkEu

Co-authored-by: Claude <noreply@anthropic.com>

* docs(release): add MCP Credential Store entry for #28917

Per-server env vars with admin-managed global scope and per-user
dashboard scope, interpolated as ${NAME} into static_headers; tool
listing stays best-effort when per-user values are unset.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(release): add Loom demo link for MCP Credential Store

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(release): promote MCP Credential Store with Loom embed

Add a featured subsection between Key Highlights and the catalog
(matching the v1.86.0rc1 'Claude Code compatibility coverage' layout)
with a Loom walkthrough of the env-vars flow. Drop the redundant
demo link from the MCP Gateway bullet now that the embed lives in
the dedicated section.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(release): rewrite MCP Credential Store as user-facing 'why upgrade'

- Video moved above the paragraph.
- Paragraph rewritten from implementation detail (${NAME} interpolation,
  MCPMissingUserEnvVarsError) to the user value: per-user credential
  storage for non-OAuth MCP servers (GitHub PAT example) and split
  instance/per-user variables for partially-shared credentials (DB
  protocol/host shared, user/password per-user).
- Both 'instance variables' and 'per-user variables' named explicitly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(release): drop em dashes from MCP Credential Store copy

Replace em dashes with periods, commas, and sentence breaks across
the featured subsection, the Key Highlights bullet, and the MCP
Gateway bullet. Drop the ' - ' separator before the PR link.
Keep standard compound-modifier hyphens (per-user, admin-managed).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(release): remove MCP Credential Store from v1.87.0rc1

The feature is not landing in this RC. Reverts the additions in
bd849fc, 2e38e68, 29d4271, 51fa7e0, and 85f223b.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(release-notes): promote v1.86.0rc1 to v1.86.0 stable; wire v1.87.0rc1 onto landing page

The v1.86.0 git tag points at the same SHA as v1.86.0-rc.1, so the rc1
write-up is the stable write-up. Rename release_notes/v1.86.0rc1/ to
release_notes/v1.86.0/, drop the rc markers from the frontmatter
(title, slug), the Deploy block (Docker tag and pip pin both bare 1.86.0),
the Full Changelog comparison (v1.85.0...v1.86.0), and the bottom date
stamp. No "Changes since rc1" section because there are zero post-rc
commits on the stable tag.

Update release_notes/index.md: v1.86.0 is now the Latest Release entry,
v1.87.0rc1 takes over the Latest Release Candidate slot (the write-up
for it already landed earlier on this branch), and v1.86.0 is added at
the top of the Recent Releases table.

---------

Co-authored-by: Yassin Kortam <yassin@berri.ai>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: mateo-berri <277851410+mateo-berri@users.noreply.github.com>