docs(benchmarking): Record GLM 5.2 corpus run #423

Merged
dcramer merged 1 commit into main from docs/glm-52-benchmark on Jul 2, 2026
dcramer commented Jul 2, 2026

Record the OpenRouter GLM 5.2 high Sentry corpus benchmark result in the docs benchmark data and add the matching readout. The row captures the validated high-effort run after failed shard repair, including the lower recall and the run cleanup needed to make the artifacts comparable.

Benchmark Result

Add the traced GLM 5.2 result JSON with 15 of 86 known corpus findings found, 18 emitted findings, and the validated token, cost, and timing summaries.
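
For readers who want the recall as a ratio, here is a minimal sketch that derives it from the counts stated above. The function and variable names are illustrative assumptions, not fields from the result JSON, and precision is deliberately not computed because this description does not say how the 18 emitted findings map onto the 15 matched ones.

```python
def recall(found: int, known: int) -> float:
    """Fraction of known corpus findings that the run recovered."""
    if known <= 0:
        raise ValueError("known must be a positive count")
    return found / known


# Counts taken from the PR description; names are hypothetical.
known_findings = 86    # known corpus findings
found_findings = 15    # known findings the run recovered
emitted_findings = 18  # total findings the run emitted (reported, not used here)

print(f"recall: {recall(found_findings, known_findings):.1%}")  # recall: 17.4%
```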

Run Notes

Document the GLM 5.2 no-finding parser issue, the combined-clean shard artifacts, and the seer_rpc.py lower-parallelism repair so the result is interpreted as benchmark data with an operational caveat.

Add the OpenRouter GLM 5.2 high Sentry corpus benchmark result and the matching docs readout. Note the no-finding JSON parser issue and the repaired shard handling so reviewers can distinguish benchmark performance from run cleanup.

Co-Authored-By: GPT-5 Codex <noreply@anthropic.com>
dcramer force-pushed the docs/glm-52-benchmark branch from 7fa7449 to 2a6d84d on July 2, 2026 22:17
dcramer marked this pull request as ready for review on July 2, 2026 22:19
dcramer merged commit c16f264 into main on Jul 2, 2026
22 checks passed
dcramer deleted the docs/glm-52-benchmark branch on July 2, 2026 22:20