Tracked under ADR 0002 — Phase A. Detail: implementation plan §A2.
Problem
Every LLM call blocks the UI behind a full-screen spinner (static/js/script.js:407). A 3000-character draft on gemma4:31b can take 60–180 seconds, during which the user has no signal that progress is happening. Comparable products (Claude, ChatGPT, Gemini) stream tokens as they are generated, so output appears as soon as the first tokens arrive instead of after the full response.
Scope
- Add SSE support to internal/infrastructure/llamacpp/client.go (OpenAI-compatible stream: true).
- Wire streaming into:
  - application/brief.InterviewService.GenerateFollowUp: chunks append to the live deep-dive bubble.
  - application/draft.GenerateDraftService.Generate: chunks append to the right-rail draft preview.
  - application/authorstyle.AnalyzeAuthorStyleService stays non-streaming (single short call, no UX win).
- Frontend swaps the global spinner for token-by-token append.
- New endpoints (or an Accept: text/event-stream overload of the existing endpoints), clearly documented in the implementation plan.
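Acceptance criteria
- [...] gemma4:31b.
- [...] text/event-stream content type with incremental chunks.
- [...] go test ./...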
Out of scope
- Persona/format awareness (Phase B).
- Retry-on-disconnect resilience (note as follow-up if needed).
Files likely touched
internal/infrastructure/llamacpp/client.go
internal/application/brief/service.go
internal/application/draft/service.go
internal/handlers/workflow.go
static/js/script.js