fix(code-reviews): show model and tokens in review summary for v2 reviews #978

Merged
alex-alecu merged 2 commits into main from fix/code-review-token-usage
Mar 10, 2026

@alex-alecu alex-alecu commented Mar 10, 2026

Summary

PR #407 added model + token info to the PR review summary comment (the "Reviewed by claude-sonnet-4.6 · 12,345 tokens" footer). But it only works for v1 (SSE-based) reviews. All reviews now run on v2 (cloud-agent-next), which is callback-based — it never streams SSE events, so usage data is never collected and the model, total_tokens_in, total_tokens_out columns stay null.

The fix: when the code_reviews record has no usage data, we query the billing tables (microdollar_usage + microdollar_usage_metadata) by cli_session_id. The billing system already tracks every LLM call with model, tokens, and cost — we just aggregate it per session. We also back-fill the code_reviews record so future reads don't repeat the aggregation.

Verification

  • pnpm typecheck — no new errors (pre-existing errors in kiloclaw only)
  • pnpm test usage-footer — 10/10 pass
  • Read through the billing schema to confirm microdollar_usage_metadata.session_id matches code_reviews.cli_session_id

Visual Changes

N/A

Reviewer Notes

  • The billing query groups by model and picks the one with the most tokens (the primary review model). This handles sessions that use multiple models (e.g. a cheaper model for sub-tasks).
  • The back-fill write to code_reviews is fire-and-forget (.catch()) — if it fails, the footer still shows correctly; we just won't cache the result.
  • Long-term, cloud-agent-next could include usage data in its ExecutionCallbackPayload, but that's a bigger change. This fix works today with no changes outside the Next.js app.
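
For illustration, the fallback and back-fill flow described above might look roughly like the sketch below. This is not the code in this PR: `getUsageWithFallback`, `fetchBillingUsage`, and `saveUsage` are hypothetical names, and the billing query and the `code_reviews` update are replaced by injected functions so the sketch stays self-contained.

```typescript
interface Usage {
  model: string;
  tokensIn: number;
  tokensOut: number;
}

interface ReviewRecord {
  id: string;
  cliSessionId: string | null;
  usage: Usage | null; // null when model / total_tokens_* were never written
}

// Hypothetical dependencies standing in for the billing aggregation and
// the code_reviews update.
interface Deps {
  fetchBillingUsage(sessionId: string): Promise<Usage | null>;
  saveUsage(reviewId: string, usage: Usage): Promise<void>;
}

async function getUsageWithFallback(
  review: ReviewRecord,
  deps: Deps
): Promise<Usage | null> {
  // Usage already recorded on the review: no fallback needed.
  if (review.usage) return review.usage;
  if (!review.cliSessionId) return null;

  const usage = await deps.fetchBillingUsage(review.cliSessionId);
  if (!usage) return null;

  // Fire-and-forget back-fill: the write is not awaited and a failure is
  // only logged, so the caller still gets usage for the footer.
  deps.saveUsage(review.id, usage).catch((err) => {
    console.warn("usage back-fill failed", err);
  });

  return usage;
}
```

Because the write is not awaited, a failed back-fill only means the next read repeats the aggregation.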

…eviews

PR #407 added model/token tracking to the review summary, but it only
works for v1 (SSE) reviews. All reviews now use v2 (cloud-agent-next),
which skips SSE stream processing entirely, so usage is never reported.

Fix: when the code_reviews record has no usage data, query the billing
tables (microdollar_usage + metadata) by cli_session_id to get the
model, tokens, and cost. Back-fill the code_reviews record so future
reads skip the aggregation.

kilo-code-bot bot commented Mar 10, 2026

Code Review Summary

Status: 2 Issues Found | Recommendation: Address before merge

Overview

| Severity | Count |
| --- | --- |
| CRITICAL | 0 |
| WARNING | 2 |
| SUGGESTION | 0 |
Issue Details

WARNING

| File | Line | Issue |
| --- | --- | --- |
| src/app/api/internal/code-review-status/[reviewId]/route.ts | 136 | The new polling loop adds a fixed ~1.4s delay to every v2 completion callback before the billing fallback runs. |
| src/lib/code-reviews/db/code-reviews.ts | 515 | The new billing fallback filters microdollar_usage_metadata.session_id without an index, so review completion will scan the billing tables as they grow. |


Other Observations (not in diff)

None.

Files Reviewed (2 files)
  • src/app/api/internal/code-review-status/[reviewId]/route.ts - 1 issue
  • src/lib/code-reviews/db/code-reviews.ts - 1 issue

The billing query grouped by model and picked one row, so tokens from
other models in the same session were lost. Split into two queries:
one for session-wide totals, one for the dominant model name.
@alex-alecu alex-alecu enabled auto-merge March 10, 2026 11:25
let review = await getCodeReviewById(reviewId);

// Short poll: usage may arrive from the orchestrator just before the callback
for (let attempt = 0; attempt < MAX_RETRIES && review && !review.model; attempt++) {

WARNING: This adds a fixed delay to every v2 completion callback

The comment above says cloud-agent-next reviews never receive orchestrator usage, but this loop still waits through the full exponential backoff whenever review.model is empty. For v2 reviews that means every completion path pays ~1.4s before we even try the billing fallback, which delays the reaction/comment update for every successful review.
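
To make the cost concrete, here is a small sketch of how a doubling backoff adds up. The constants are assumptions: the real `MAX_RETRIES` and base delay are not shown in this thread. Three retries starting at 200 ms would give 200 + 400 + 800 = 1400 ms, which is consistent with the ~1.4s figure. `shouldPollForUsage` is a hypothetical guard showing one way to skip the loop for v2 reviews.

```typescript
// Assumed constants; the real values are not shown in this thread.
const ASSUMED_MAX_RETRIES = 3;
const ASSUMED_BASE_DELAY_MS = 200;

// Delay before each retry under plain exponential backoff: base * 2^attempt.
function backoffDelays(maxRetries: number, baseDelayMs: number): number[] {
  return Array.from({ length: maxRetries }, (_, attempt) => baseDelayMs * 2 ** attempt);
}

// Worst-case wait when usage never arrives, i.e. every retry is used.
function worstCaseWaitMs(maxRetries: number, baseDelayMs: number): number {
  return backoffDelays(maxRetries, baseDelayMs).reduce((sum, d) => sum + d, 0);
}

// Hypothetical guard: only v1 (SSE) reviews can receive streamed usage,
// so only they are worth polling for.
function shouldPollForUsage(agentVersion: "v1" | "v2", hasModel: boolean): boolean {
  return agentVersion === "v1" && !hasModel;
}

console.log(worstCaseWaitMs(ASSUMED_MAX_RETRIES, ASSUMED_BASE_DELAY_MS));
```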

  cliSessionId: string
): Promise<SessionUsageSummary | null> {
  try {
    const sessionFilter = eq(microdollar_usage_metadata.session_id, cliSessionId);

WARNING: This fallback filters on an unindexed column

microdollar_usage_metadata only has a created_at index in the schema today, so both aggregation queries will end up scanning the metadata table by session_id on every completed v2 review. Because this runs on the completion callback path, larger billing tables will make review completion and summary updates noticeably slower.

@alex-alecu alex-alecu merged commit 7e81de7 into main Mar 10, 2026
18 checks passed
@alex-alecu alex-alecu deleted the fix/code-review-token-usage branch March 10, 2026 11:29
alex-alecu added a commit that referenced this pull request Mar 10, 2026
…on v2 reviews (#979)

## Summary

Follow-up to [PR #978](#978). The
billing fallback query that fetches token/model data for v2 reviews was
timing out in production, so the usage footer ("Reviewed by model · X
tokens") was never shown.

**Root cause:** the query filters `microdollar_usage_metadata` by
`session_id`, but that column has no index. The table has ~469M rows, so
every query did a full table scan and timed out. The `catch` block
silently returned `null`, and the footer was skipped.

**Fix:**
- Add a `created_at >= reviewCreatedAt` lower bound to the billing
query. This lets Postgres use the existing `created_at` index (query
cost drops from that of a full scan to ~288). Billing rows can't exist before the
review was created, so the bound excludes no rows for that review.
- Skip the v1 poll loop for v2 reviews (saves ~1.4s of wasted retries).
- Remove the `session_id` index migration — with the time bound, it's
not needed.
- Clean up the admin dashboard: remove agent version filter and
performance chart that are no longer useful now that all reviews are v2.
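
A minimal in-memory sketch of the time-bounded filter from the first bullet, assuming hypothetical names (`MetadataRow`, `rowsForReview`); the real change applies the same two conditions in the SQL query against `microdollar_usage_metadata`:

```typescript
interface MetadataRow {
  sessionId: string;
  createdAt: Date;
}

// Keep rows for the session that were created at or after the review.
// In SQL terms: session_id = $1 AND created_at >= $2. The created_at
// bound is the part an index on created_at can serve; session_id is then
// checked only on the rows that pass the bound.
function rowsForReview(
  rows: MetadataRow[],
  sessionId: string,
  reviewCreatedAt: Date
): MetadataRow[] {
  const lowerBound = reviewCreatedAt.getTime();
  return rows.filter(
    (row) => row.createdAt.getTime() >= lowerBound && row.sessionId === sessionId
  );
}
```

The bound only narrows the scan. Rows from other sessions created after the review still have to be read and discarded, so the cost grows with the amount of billing traffic since the review started.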

## Verification

- [x] `pnpm typecheck` — no new errors (only pre-existing kiloclaw
errors)
- [x] `pnpm test usage-footer` — 10/10 pass
- [x] `pnpm test schema` — 15/15 pass (no unmigrated schema changes)
- [x] Checked `EXPLAIN` plan on prod DB — query uses
`idx_microdollar_usage_metadata_created_at` with cost ~288
- [x] Confirmed billing data exists for test session
`ses_3282e02f5ffe2vPRBSqdpc0e40` (PR #981 review) — 8 rows returned in
<1s with time-bounded query

## Visual Changes

N/A

## Reviewer Notes

- Every completed v2 review in prod has `model = NULL` — the billing
fallback has never worked. This fix unblocks all future v2 reviews.
- The back-fill write (fire-and-forget) still runs after fetching
billing data, so repeat reads skip the aggregation.