2 changes: 2 additions & 0 deletions README.md
@@ -4,6 +4,8 @@ Enter a Note article URL or username, and the local LLM …

The direction for the upcoming large-scale overhaul is laid out in [ADR 0001](docs/adrs/0001-three-phase-local-article-generation.md) and the [three-phase implementation plan](docs/implementation-plans/three-phase-local-article-generation.md). The project moves from generating articles through a single form to three phases: style analysis, a one-question-at-a-time interview on the article requirements, and article generation with evaluation.

The next evolution building on ADR 0001 is summarized in [ADR 0002: Multi-Persona, Multi-Format Article Generation](docs/adrs/0002-multi-persona-multi-format-extension.md) and the [Multi-persona / multi-format implementation plan](docs/implementation-plans/multi-persona-multi-format.md). Terisuke himself and the fictional character 宇宙野クラウディア (Cloudia) are treated as separate personas, and the output can be switched between note, the cor-jp.com blog, Zenn, Qiita, and homepage HTML.

The mapping between implementation issues and ADRs, the responsibilities of each layer, and the test conditions are summarized in [Issue and ADR guardrails](docs/implementation-plans/issue-adr-guardrails.md).

## Main features
2 changes: 1 addition & 1 deletion docs/adrs/0001-three-phase-local-article-generation.md
@@ -4,7 +4,7 @@ Date: 2026-05-01

## Status

-Accepted
+Accepted. Extended by [ADR 0002](0002-multi-persona-multi-format-extension.md), which adds the orthogonal axes of author persona and output format on top of the three-phase pipeline. The phases described below remain authoritative; the registries and strategies introduced by ADR 0002 plug into them rather than replacing them.

## Context

233 changes: 233 additions & 0 deletions docs/adrs/0002-multi-persona-multi-format-extension.md
@@ -0,0 +1,233 @@
# ADR 0002: Multi-Persona, Multi-Format Article Generation

Date: 2026-05-02

## Status

Accepted. Extends and partially supersedes [ADR 0001](0001-three-phase-local-article-generation.md). The three-phase workflow (style analysis → interview → draft) is preserved. This ADR adds two orthogonal axes — **author persona** and **output format** — and reshapes UX, persistence, and prompt construction accordingly.

## Context

ADR 0001 produced a working three-phase pipeline for `cor_instrument` (Terisuke) on note.com. Real usage by the project owner exposed three structural gaps:

1. **The owner publishes under multiple identities and to multiple platforms.**
   - Terisuke (the owner himself): [note.com/cor_instrument](https://note.com/cor_instrument), [cor-jp.com/blog](https://cor-jp.com/blog/). Reflective entrepreneur essays mixed with technical experience reports. First person 「僕」 or 「私」, narrative title arcs such as "~した話" ("the story of how I did ...") and "~てしまった件" ("the time I ended up ..."), philosophical framing.
   - Cloudia / 宇宙野クラウディア (a fictional character, also operated by the owner): [zenn.dev/cloudia](https://zenn.dev/cloudia), [qiita.com/Cloudia_Cor_Inc](https://qiita.com/Cloudia_Cor_Inc). Character-branded technical tutorials. Hakata dialect (博多弁) flavour, exclamation-driven titles ("クラウディア流!…", "AI探検記【前編】〜最強のお助けAIを探せ!〜"), strong code-block density.
   - Treating these as one author profile contaminates the style guide and produces drafts that read as neither voice.

2. **The output format is hard-coded to "note.com paste-ready Markdown article".**
   - `internal/domain/article/draft.go:25-35` requires a `# title` first line and rejects code fences.
   - `internal/application/draft/prompt.go:12-51` injects "Noteにそのまま貼り付けられる日本語Markdown記事" ("a Japanese Markdown article that can be pasted into Note as-is") into every system message.
   - `internal/infrastructure/note/fetcher.go:234-246` rejects any host other than `note.com` for source ingestion.
   - HTML homepage sections, Zenn/Qiita Markdown articles (which carry frontmatter), and Astro-blog posts cannot be produced today.

3. **The current UX does not feel like working with a writing partner.**
   - The interview is a forward-only, edit-locked Q&A; users cannot revise an earlier answer.
   - The completed brief is rendered as `JSON.stringify` (`static/js/script.js:178`); accumulated knowledge is not legible.
   - The draft textarea is `readonly` (`static/index.html:153`); there is no per-section regenerate.
   - The full-screen spinner blocks the page during every LLM call; nothing streams.
   - There is no visible history of past style profiles or past sessions; reuse is invisible.

The user's stated comparators are Claude, ChatGPT, and Gemini chat UIs — continuous transcripts with editable history, streaming output, and a side-pinned working artifact.

## Decision

Add two orthogonal first-class concepts to the domain and let the rest of the system depend on them via strategy boundaries.

### Persona

A `Persona` is a named, reusable bundle of:

- one or more `AuthorSource` adapters that supply training material,
- one canonical `WritingStyleGuide` (regenerable when sources change),
- a default `OutputFormat` (overridable per article),
- a default question-set variant for the interview phase,
- optional preset values for the brief (preferred first person, signature opening patterns, recurring themes, anti-patterns).

Two personas ship pre-loaded:

| Persona | Display name | Sources | Default format | Voice notes |
|---|---|---|---|---|
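| `terisuke` | てりすけ | `note.com/cor_instrument`, `cor-jp.com/blog/*` | `note_article` | First person 「僕」 / 「私」; introspection combined with lived-experience narrative; entrepreneurship, career, AI-driven development; title patterns 「~した話」 and 「~てしまった件」 |
| `cloudia` | 宇宙野クラウディア | `zenn.dev/cloudia`, `qiita.com/Cloudia_Cor_Inc` | `zenn_article` | First person 「クラウディア」 / 「うち」; Hakata dialect mixed in; exclamation marks and decorations such as 【前編】 ("Part 1"); AI/JS/Python tutorials; emotional appeal (「劇的に」 "dramatically", 「最強の」 "the strongest") |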

Personas are user-extensible. Adding a third persona requires only registering it (no code changes inside the prompt builder).
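
As a rough illustration of "registering only", the following Go sketch models a persona bundle and an in-memory registry seeded with the two shipped personas. The struct fields, `NewSeededRegistry`, `ErrDuplicatePersona`, and the `guest` persona are all hypothetical names chosen for this example; the ADR fixes the concepts, not these signatures.

```go
package main

import (
	"errors"
	"fmt"
)

// Persona is an illustrative shape for the bundle described above.
// Field names are assumptions, not the project's actual types.
type Persona struct {
	ID            string
	DisplayName   string
	Sources       []string // source hosts or URL patterns
	DefaultFormat string   // OutputFormat id, overridable per article
}

// ErrDuplicatePersona is a hypothetical sentinel error for this sketch.
var ErrDuplicatePersona = errors.New("persona already registered")

// PersonaRegistry is a minimal in-memory registry keyed by persona id.
type PersonaRegistry struct {
	byID map[string]Persona
}

// Register adds a persona; nothing else in the system needs to change.
func (r *PersonaRegistry) Register(p Persona) error {
	if p.ID == "" {
		return errors.New("persona id is required")
	}
	if _, exists := r.byID[p.ID]; exists {
		return ErrDuplicatePersona
	}
	r.byID[p.ID] = p
	return nil
}

// Get looks a persona up by id.
func (r *PersonaRegistry) Get(id string) (Persona, bool) {
	p, ok := r.byID[id]
	return p, ok
}

// NewSeededRegistry returns a registry pre-loaded with the two shipped personas.
func NewSeededRegistry() *PersonaRegistry {
	r := &PersonaRegistry{byID: map[string]Persona{}}
	seeds := []Persona{
		{ID: "terisuke", DisplayName: "てりすけ",
			Sources:       []string{"note.com/cor_instrument", "cor-jp.com/blog/*"},
			DefaultFormat: "note_article"},
		{ID: "cloudia", DisplayName: "宇宙野クラウディア",
			Sources:       []string{"zenn.dev/cloudia", "qiita.com/Cloudia_Cor_Inc"},
			DefaultFormat: "zenn_article"},
	}
	for _, p := range seeds {
		if err := r.Register(p); err != nil {
			panic(err) // seeds are static, so a failure is a programming error
		}
	}
	return r
}

func main() {
	r := NewSeededRegistry()
	// A third persona is added by registration alone.
	err := r.Register(Persona{ID: "guest", DisplayName: "Guest writer", DefaultFormat: "markdown_blog"})
	fmt.Println("registered guest:", err == nil)
	p, _ := r.Get("cloudia")
	fmt.Println(p.DisplayName, "defaults to", p.DefaultFormat)
}
```

Keeping the registry as the only place that knows the set of personas is what would let the prompt builder stay free of per-persona conditionals.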

### OutputFormat

An `OutputFormat` is a typed strategy with:

- a Markdown/HTML template surface (frontmatter, heading rules, code-fence policy, length envelope),
- a validator (e.g., Zenn requires frontmatter; note rejects code fences; homepage sections forbid `# title`),
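- a default question-set extension (technical formats add "対象スタック" (target stack), "実行環境" (runtime environment), and "想定読者の前提知識" (assumed reader background); narrative formats add "導入エピソード" (opening episode) and "結末で読者に届けたい感情" (the emotion the ending should leave with the reader)),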
- a system-prompt fragment merged into the draft prompt (instead of a single hard-coded fragment).

Formats shipped in v2:

| Format | Validator highlights | Source/target |
|---|---|---|
| `note_article` | Existing rules: starts with `# `, no code fences, polite `ですます` (desu/masu) style by default | note.com paste |
| `markdown_blog` | Frontmatter optional, allows code fences, `## ` first heading allowed (Astro blogs at `cor-jp.com/blog`) | cor-jp.com |
| `zenn_article` | Frontmatter required (`title`, `emoji`, `type`, `topics`, `published`), code fences allowed, technical tone bias | zenn.dev/cloudia |
| `qiita_article` | Frontmatter required (`title`, `tags`), code blocks expected, technical tone bias | qiita.com/Cloudia_Cor_Inc |
| `homepage_section` | HTML output, no `# title`, semantic sectioning, CTA placement guidance | static HTML embed |

Comment on lines +69 to +70

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

MD038 (spaces inside code spans): change `# ` / `## ` to `#` / `##`.

The Validator descriptions for `note_article` and `markdown_blog` use inline code with a trailing space, such as `# ` and `## `. This falls under MD038 in markdownlint-cli2, so it can produce warnings or failures in CI.

The intent is "the heading starts with `#`", so it is better to change these to `#` / `##` and add a note in the surrounding text such as "a space follows `#`".

🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 69-69: Spaces inside code span elements (MD038, no-space-in-code)

[warning] 70-70: Spaces inside code span elements (MD038, no-space-in-code)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/adrs/0002-multi-persona-multi-format-extension.md` around lines 69 - 70,
The inline code in the Validator descriptions for the `note_article` and
`markdown_blog` entries uses trailing-space code spans like `# ` and `## ` which
triggers markdownlint MD038; update those code spans to `#` and `##`
respectively and add a short clarifying phrase in the surrounding text (e.g.,
"heading markers `#` / `##` followed by a space") to preserve the original
meaning that a space must follow the marker; update the `note_article` and
`markdown_blog` description lines accordingly.

The two axes compose: any persona can produce any format. The application layer rejects nonsensical combinations only when the persona explicitly excludes a format (configurable, not hard-coded).
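
To make the validator strategy concrete, here is a minimal Go sketch of a `Validator` interface with two per-format implementations and a dispatch function keyed by format id. The rules encoded are the ones listed in the table above; `ValidateDraft`, the `validators` map, and the line-based frontmatter check are assumptions for illustration, not the project's actual implementation.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Validator is the per-format strategy; the method name is an assumption.
type Validator interface {
	Validate(body string) error
}

// fence is built at runtime so this example contains no literal code fence.
var fence = strings.Repeat("`", 3)

// NoteValidator encodes the existing note rules: "# " title first, no code fences.
type NoteValidator struct{}

func (NoteValidator) Validate(body string) error {
	if !strings.HasPrefix(body, "# ") {
		return errors.New("note_article: first line must be a level-1 heading")
	}
	if strings.Contains(body, fence) {
		return errors.New("note_article: code fences are not allowed")
	}
	return nil
}

// ZennValidator requires a frontmatter block carrying the keys from the table.
type ZennValidator struct{}

var zennKeys = []string{"title", "emoji", "type", "topics", "published"}

func (ZennValidator) Validate(body string) error {
	if !strings.HasPrefix(body, "---\n") {
		return errors.New("zenn_article: frontmatter is required")
	}
	end := strings.Index(body[4:], "\n---")
	if end < 0 {
		return errors.New("zenn_article: frontmatter is not closed")
	}
	lines := strings.Split(body[4:4+end], "\n")
	for _, key := range zennKeys {
		found := false
		for _, line := range lines {
			// Naive key check; a real validator would parse the YAML.
			if strings.HasPrefix(line, key+":") {
				found = true
				break
			}
		}
		if !found {
			return fmt.Errorf("zenn_article: frontmatter key %q is missing", key)
		}
	}
	return nil
}

// validators plays the role of a tiny format registry for this sketch.
var validators = map[string]Validator{
	"note_article": NoteValidator{},
	"zenn_article": ZennValidator{},
}

// ValidateDraft dispatches to the validator registered for formatID.
func ValidateDraft(formatID, body string) error {
	v, ok := validators[formatID]
	if !ok {
		return fmt.Errorf("unknown output format %q", formatID)
	}
	return v.Validate(body)
}

func main() {
	fmt.Println(ValidateDraft("note_article", "# Title\n\nBody text."))
	fmt.Println(ValidateDraft("zenn_article", "# Title\n\nBody text."))
}
```

Because the same draft body passes one validator and fails the other, the format id has to travel with the draft, which is consistent with `Draft` gaining a `format_id`.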

### UX direction

The single-page form becomes a **conversation-first workspace**:

- **Left rail**: persona switcher, project history, "previous style guides" picker.
- **Centre**: chat-style transcript. Past answers are clickable and editable, which forks a new session draft from that point. Deep-dive questions render with the parent question quoted as context.
- **Right rail (artifact panel)**: live brief card and live draft preview, replacing the current JSON dump. Updates as the conversation progresses.
- **Streaming**: SSE for follow-up generation and draft generation. The full-screen spinner is removed.
- **Draft editor**: editable in-place, with "regenerate this section" (selection-based) and "copy to clipboard" buttons for both raw Markdown and rendered HTML output.

The static-JS prototype is preserved as a fallback. The new UX can ship as a progressive overlay (no SPA-framework rewrite required for v2; if desired, a later ADR can introduce one).
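
The streaming bullet can be pictured with a small Go sketch that writes LLM output chunks as server-sent events and flushes after each one. `streamChunks` and `renderSSE` are hypothetical helpers, and the final `done` event name is an assumption; only the `text/event-stream` framing (`data:` lines separated by a blank line) comes from the SSE format itself.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// streamChunks writes each chunk as one SSE "data:" event and flushes it
// immediately so the browser can render partial output.
// Assumption: chunks contain no newlines; multi-line chunks would need
// one "data:" line per line of text.
func streamChunks(w http.ResponseWriter, chunks []string) error {
	flusher, ok := w.(http.Flusher)
	if !ok {
		return errors.New("streaming is not supported by this ResponseWriter")
	}
	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Cache-Control", "no-cache")
	for _, c := range chunks {
		if _, err := fmt.Fprintf(w, "data: %s\n\n", c); err != nil {
			return err
		}
		flusher.Flush()
	}
	// Hypothetical terminal event so the client knows the draft is complete.
	_, err := fmt.Fprint(w, "event: done\ndata: {}\n\n")
	flusher.Flush()
	return err
}

// renderSSE runs streamChunks against an in-memory recorder and returns
// the raw event stream, which is convenient for tests.
func renderSSE(chunks []string) (string, error) {
	rec := httptest.NewRecorder()
	if err := streamChunks(rec, chunks); err != nil {
		return "", err
	}
	return rec.Body.String(), nil
}

func main() {
	body, err := renderSSE([]string{"# Draft title", "First paragraph."})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(body)
}
```

On the client, the static-JS prototype could consume such a stream with the browser's `EventSource` or a streamed `fetch`, which fits the progressive-overlay approach.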

### Persistence direction

The flat `data/workflow_store.json` snapshot is replaced by a SQLite-backed store with explicit aggregates:

- `personas` — id, display name, default format, source bundle, current canonical guide id.
- `author_sources` — persona-scoped fetched articles with raw text snapshot for re-derivation.
- `writing_style_guides` — versioned per persona; previous versions retained.
- `projects` — a writing project (e.g., "Q3 cor-jp blog series") that owns multiple articles.
- `articles` — one target deliverable: persona id, output format, brief id, draft history.
- `brief_sessions`, `brief_answers` — unchanged in shape, gain a `parent_answer_id` for fork-on-edit.
- `drafts` — versioned per article with score history.

Acceptance criterion: any prior session can be reopened, its accumulated context shown as a transcript, and a new draft regenerated from any point in history.
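
One way to read the acceptance criterion is that a transcript is the chain of answers reachable through `parent_answer_id` from whichever answer the user reopens. The Go sketch below works on an in-memory map rather than SQLite. The `Answer` shape and the convention that parent id `0` marks the first answer are assumptions made for the example.

```go
package main

import "fmt"

// Answer is an illustrative row shape for brief_answers.
// ParentID == 0 means "first answer in the session" (an assumption).
type Answer struct {
	ID       int
	ParentID int
	Question string
	Text     string
}

// Transcript walks parent links from leafID back to the root and
// returns the answers oldest first.
func Transcript(all map[int]Answer, leafID int) ([]Answer, error) {
	var chain []Answer
	seen := map[int]bool{}
	for id := leafID; id != 0; {
		if seen[id] {
			return nil, fmt.Errorf("cycle detected at answer %d", id)
		}
		seen[id] = true
		a, ok := all[id]
		if !ok {
			return nil, fmt.Errorf("answer %d not found", id)
		}
		chain = append(chain, a)
		id = a.ParentID
	}
	// Reverse in place so the transcript reads top to bottom.
	for i, j := 0, len(chain)-1; i < j; i, j = i+1, j-1 {
		chain[i], chain[j] = chain[j], chain[i]
	}
	return chain, nil
}

func main() {
	all := map[int]Answer{
		1: {ID: 1, Question: "Theme?", Text: "Local LLM workflows"},
		2: {ID: 2, ParentID: 1, Question: "Reader?", Text: "Beginners"},
		3: {ID: 3, ParentID: 2, Question: "Tone?", Text: "Casual"},
		// Editing answer 2 forks a new branch from answer 1.
		4: {ID: 4, ParentID: 1, Question: "Reader?", Text: "Senior engineers"},
	}
	for _, leaf := range []int{3, 4} {
		t, err := Transcript(all, leaf)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("branch ending at %d has %d answers\n", leaf, len(t))
	}
}
```

Under this reading an edit never overwrites history: the original branch (1, 2, 3) and the fork (1, 4) both remain queryable.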

This subsumes Issue [#14](https://github.com/terisuke/note_maker/issues/14) (queryable database). The issue stays open as the umbrella tracker, and completing the SQLite migration becomes its acceptance criterion.

### Source acquisition direction

`SourceFetcher` becomes a strategy interface with concrete implementations:

- `NoteFetcher` — existing `note.com` page + RSS (kept).
- `ZennFetcher` — public articles (`https://zenn.dev/{user}/articles/{slug}`) and the user's article list.
- `QiitaFetcher` — public REST (no auth needed for public posts; respect rate limits).
- `GenericRSSFetcher` — for Astro/Jekyll/Hugo blogs that expose RSS (cor-jp.com is included here once its RSS is confirmed).
- `HTMLFetcher` — fallback for arbitrary HTML pages with semantic-content extraction.

Per-fetcher rate-limit and User-Agent policy live alongside each adapter.
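
The strategy boundary could look roughly like the following Go sketch, which selects a fetcher by URL host and falls back to a generic HTML fetcher. The actual fetching is deliberately left out. `CanFetch`, `SelectFetcher`, and the host lists are assumptions for illustration; only the fetcher names come from the list above.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// SourceFetcher is a trimmed-down strategy interface for this sketch.
// A real one would also expose a Fetch method that returns article text.
type SourceFetcher interface {
	Name() string
	CanFetch(u *url.URL) bool
}

// hostFetcher accepts URLs whose host matches one of its known hosts.
type hostFetcher struct {
	name  string
	hosts []string
}

func (f hostFetcher) Name() string { return f.name }

func (f hostFetcher) CanFetch(u *url.URL) bool {
	host := strings.TrimPrefix(strings.ToLower(u.Hostname()), "www.")
	for _, h := range f.hosts {
		if host == h {
			return true
		}
	}
	return false
}

// fallbackFetcher accepts any http(s) URL; it must be registered last.
type fallbackFetcher struct{ name string }

func (f fallbackFetcher) Name() string { return f.name }

func (f fallbackFetcher) CanFetch(u *url.URL) bool {
	return u.Scheme == "http" || u.Scheme == "https"
}

// fetchers is ordered from most specific to least specific.
var fetchers = []SourceFetcher{
	hostFetcher{name: "NoteFetcher", hosts: []string{"note.com"}},
	hostFetcher{name: "ZennFetcher", hosts: []string{"zenn.dev"}},
	hostFetcher{name: "QiitaFetcher", hosts: []string{"qiita.com"}},
	fallbackFetcher{name: "HTMLFetcher"},
}

// SelectFetcher returns the name of the first fetcher that accepts rawURL.
func SelectFetcher(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	for _, f := range fetchers {
		if f.CanFetch(u) {
			return f.Name(), nil
		}
	}
	return "", fmt.Errorf("no fetcher accepts %q", rawURL)
}

func main() {
	for _, raw := range []string{
		"https://note.com/cor_instrument",
		"https://zenn.dev/cloudia",
		"https://cor-jp.com/blog/",
	} {
		name, err := SelectFetcher(raw)
		fmt.Println(raw, "->", name, err)
	}
}
```

With this shape, the hard-coded `note.com` host check would turn into one adapter's `CanFetch` rather than a global rejection, and `cor-jp.com` could move from the fallback to `GenericRSSFetcher` once its RSS is confirmed.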

## Domain Model Changes

New domain types under `internal/domain`:

- `persona` package
  - `Persona` (id, display_name, sources, default_format, default_questions_extension)
  - `PersonaRegistry` (in-memory + persisted, seeded with `terisuke` and `cloudia`)
- `format` package
  - `OutputFormat` (id, validator, template fragment, default_question_extension)
  - `FormatRegistry`
  - `Validator` interface; per-format implementations (e.g., `NoteValidator`, `ZennValidator`)
- `domain/article`
  - `Draft` gains a `format_id`; `NewDraft` dispatches to the format's validator instead of hard-coded rules.
- `domain/brief`
  - `ArticleBriefSession` gains `persona_id`, `output_format_id`, `parent_answer_id`.
  - `FixedQuestions` becomes a base list extended by the persona and the format.
- `domain/project` (new)
  - `Project`, `Article` aggregates as defined in the persistence section.
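
The change to `FixedQuestions` can be sketched as a simple ordered merge: base questions first, then the persona's extension, then the format's extension, with duplicates dropped. `AssembleQuestions` and the sample base and persona questions are hypothetical; the three technical-format labels are the ones quoted earlier in this ADR.

```go
package main

import "fmt"

// AssembleQuestions merges the base list with the persona and format
// extensions, preserving order and skipping empty or repeated entries.
func AssembleQuestions(base, personaExt, formatExt []string) []string {
	seen := map[string]bool{}
	var out []string
	for _, group := range [][]string{base, personaExt, formatExt} {
		for _, q := range group {
			if q == "" || seen[q] {
				continue
			}
			seen[q] = true
			out = append(out, q)
		}
	}
	return out
}

func main() {
	base := []string{"テーマ", "想定読者"}          // hypothetical base questions
	personaExt := []string{"想定読者", "決め台詞"} // hypothetical persona extension
	formatExt := []string{"対象スタック", "実行環境", "想定読者の前提知識"}

	for i, q := range AssembleQuestions(base, personaExt, formatExt) {
		fmt.Printf("%d. %s\n", i+1, q)
	}
}
```

Deduplicating by exact label is the simplest possible policy; a real implementation might key questions by a stable id instead so that wording can change without creating duplicates.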

## Application Service Changes

- `AnalyzeAuthorStyleService` accepts a `persona_id` and persists the resulting guide as a new version under that persona; previous versions are preserved.
- `InterviewService` consults the active persona and format to assemble the question list before the first question.
- `GenerateDraftService` resolves the prompt template fragment from the format's strategy and merges persona-specific tone hints.
- New `RegenerateSectionService` accepts a draft id, a section selector (heading anchor or character range), the brief, and the persona+format; returns a candidate replacement for human review.
- New `StreamingDraftService` produces SSE chunks for the draft phase.
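
For the heading-anchor variant of the section selector, the replacement step might look like the Go sketch below: locate a level-2 heading, replace everything up to the next heading, and leave the rest of the draft untouched. `ReplaceSection` is a hypothetical helper; it assumes `## ` headings and does not account for heading-like lines inside code fences.

```go
package main

import (
	"fmt"
	"strings"
)

// ReplaceSection swaps the body under the level-2 heading `heading`
// (up to the next "# " or "## " heading, or end of file) for candidate.
// Limitation of this sketch: heading-like lines inside code fences are
// treated as real headings.
func ReplaceSection(markdown, heading, candidate string) (string, error) {
	lines := strings.Split(markdown, "\n")
	start, end := -1, len(lines)
	for i, line := range lines {
		if start < 0 {
			if line == "## "+heading {
				start = i + 1 // body begins after the heading line
			}
			continue
		}
		if strings.HasPrefix(line, "## ") || strings.HasPrefix(line, "# ") {
			end = i
			break
		}
	}
	if start < 0 {
		return "", fmt.Errorf("section %q not found", heading)
	}
	out := append([]string{}, lines[:start]...)
	out = append(out, "", strings.TrimSpace(candidate), "")
	out = append(out, lines[end:]...)
	return strings.Join(out, "\n"), nil
}

func main() {
	draft := "# Title\n\n## Intro\n\nOld intro.\n\n## Body\n\nOld body.\n"
	updated, err := ReplaceSection(draft, "Intro", "New intro from the LLM.")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(updated)
}
```

Since the service returns a candidate for human review, a function like this would only run after the user accepts the candidate, and the result would be stored as a new draft version.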

## Infrastructure Changes

- `internal/infrastructure/repository/sqlite` — new package implementing every repository interface; the JSON file repository becomes an export/import utility for portability.
- `internal/infrastructure/source/{note,zenn,qiita,rss,html}` — per-source fetchers behind a common interface in `internal/domain/source` (or kept under `infrastructure` and bound by interface in `domain/persona`).
- `internal/infrastructure/llamacpp` — gains streaming (SSE) support; existing non-streaming path retained for tests.

## API Changes

Additions:

- `GET /api/personas` / `POST /api/personas` / `PATCH /api/personas/{id}` — persona CRUD.
- `GET /api/formats` — read-only registry of available formats.
- `POST /api/projects` / `GET /api/projects` / `GET /api/projects/{id}` — project management.
- `GET /api/sessions/{id}/transcript` — chat-style transcript including parent links.
- `POST /api/sessions/{id}/answers/{answer_id}/edit` — fork-on-edit for past answers.
- `POST /api/drafts/{id}/regenerate-section` — section-level regenerate.
- `POST /api/author-style/analyze`, `POST /api/drafts` — gain `Accept: text/event-stream` for streaming.

Existing `/api/generate` remains a compatibility facade.
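
Since the same endpoints serve both streaming and non-streaming clients, a handler has to decide per request which path to take. A minimal Go sketch of that decision, assuming it is driven purely by the `Accept` header, follows; `WantsEventStream` is a hypothetical helper, and splitting the header on commas is a simplification that ignores quoted parameters.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// WantsEventStream reports whether an Accept header value lists
// text/event-stream. Quality values (q=...) are parsed but ignored.
func WantsEventStream(accept string) bool {
	for _, part := range strings.Split(accept, ",") {
		mediaType, _, err := mime.ParseMediaType(strings.TrimSpace(part))
		if err == nil && mediaType == "text/event-stream" {
			return true
		}
	}
	return false
}

func main() {
	for _, accept := range []string{
		"text/event-stream",
		"application/json, text/event-stream;q=0.9",
		"application/json",
	} {
		mode := "JSON response"
		if WantsEventStream(accept) {
			mode = "SSE stream"
		}
		fmt.Printf("%-45q -> %s\n", accept, mode)
	}
}
```

Defaulting to the JSON response when the header is absent keeps existing clients, including the `/api/generate` facade, working unchanged.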

## Testing Strategy

- Unit tests added per format validator, per persona seed, per fetcher.
- Integration tests for full transcript edit-and-fork flow.
- Scenario tests:
  - `cmd/scenario/persona_terisuke_note`: current behaviour.
  - `cmd/scenario/persona_terisuke_blog`: new, targets cor-jp blog format.
  - `cmd/scenario/persona_cloudia_zenn`: new, validates Zenn frontmatter and code-fence presence.
  - `cmd/scenario/persona_cloudia_qiita`: new.
- HTTP handler tests added for every endpoint in `internal/handlers/workflow.go` (currently uncovered).
- Playwright E2E (extends Issue [#13](https://github.com/terisuke/note_maker/issues/13)) covers persona switch, format switch, edit-and-fork, streaming completion, copy-clipboard, regenerate-section.

## Phased Rollout

The full work is broken into four phases tracked by issues. Each phase is independently shippable and behind a UI toggle until ready.

- **Phase A: Conversation UX (1–2 weeks)**
  - Chat transcript with editable answers, streaming, deep-dive context display, draft editor + per-section regenerate.
  - Issues: [#17](https://github.com/terisuke/note_maker/issues/17), [#18](https://github.com/terisuke/note_maker/issues/18), [#19](https://github.com/terisuke/note_maker/issues/19), [#20](https://github.com/terisuke/note_maker/issues/20).
- **Phase B: Multi-persona, multi-format (2–3 weeks)**
  - Persona registry seeded with `terisuke` and `cloudia`, OutputFormat strategy, source fetcher generalisation, format-aware question sets.
  - Issues: [#21](https://github.com/terisuke/note_maker/issues/21), [#22](https://github.com/terisuke/note_maker/issues/22), [#23](https://github.com/terisuke/note_maker/issues/23), [#24](https://github.com/terisuke/note_maker/issues/24), [#25](https://github.com/terisuke/note_maker/issues/25).
- **Phase C: Memory & history (2 weeks, integrates Issue [#14](https://github.com/terisuke/note_maker/issues/14))**
  - SQLite store with project/article schema, profile/session reuse UI, brief and guide rendered as cards.
  - Issues: [#26](https://github.com/terisuke/note_maker/issues/26), [#27](https://github.com/terisuke/note_maker/issues/27), [#28](https://github.com/terisuke/note_maker/issues/28).
- **Phase D: Quality & ops**
  - Handler test coverage, Issue [#11](https://github.com/terisuke/note_maker/issues/11) (style threshold), Issue [#13](https://github.com/terisuke/note_maker/issues/13) (Playwright), Issue [#15](https://github.com/terisuke/note_maker/issues/15) (desktop packaging) follow-up.
  - Issue: [#29](https://github.com/terisuke/note_maker/issues/29).

Recommended order: A → C → B → D. Phase C is sequenced before B because the persona registry needs durable storage to be useful; running B on the JSON store would force a second migration.

## Tracked issues

Filed 2026-05-02 as part of the PR that introduced this ADR.

- A1 — [#17](https://github.com/terisuke/note_maker/issues/17) Refactor interview UI to chat-style transcript with editable past answers
- A2 — [#18](https://github.com/terisuke/note_maker/issues/18) Stream LLM responses via SSE for follow-up and draft generation
- A3 — [#19](https://github.com/terisuke/note_maker/issues/19) Editable draft Markdown + per-section regenerate API
- A4 — [#20](https://github.com/terisuke/note_maker/issues/20) Surface deep-dive question rationale in prompt and UI
- B1 — [#21](https://github.com/terisuke/note_maker/issues/21) Introduce Persona and OutputFormat domain concepts (registry + strategy)
- B2 — [#22](https://github.com/terisuke/note_maker/issues/22) Generalize SourceFetcher beyond note.com (Zenn, Qiita, RSS, HTML)
- B3 — [#23](https://github.com/terisuke/note_maker/issues/23) Format-specific prompt templates and draft validators (note / markdown_blog / zenn / qiita / homepage_section)
- B4 — [#24](https://github.com/terisuke/note_maker/issues/24) Seed persona library with `terisuke` and `cloudia` profiles
- B5 — [#25](https://github.com/terisuke/note_maker/issues/25) Format- and persona-aware fixed question sets
- C1 — [#26](https://github.com/terisuke/note_maker/issues/26) Replace JSON store with SQLite-backed schema (extends [#14](https://github.com/terisuke/note_maker/issues/14))
- C2 — [#27](https://github.com/terisuke/note_maker/issues/27) Persona / past-session picker UI
- C3 — [#28](https://github.com/terisuke/note_maker/issues/28) Render brief and style guide as human-readable cards
- D1 — [#29](https://github.com/terisuke/note_maker/issues/29) HTTP handler tests for `internal/handlers/workflow.go` (currently 0% coverage)

## Consequences

Positive:

- The product can serve both Terisuke (the owner himself) and Cloudia (the character) without contaminating either voice.
- Adding a new platform (e.g., Substack, dev.to) becomes one fetcher + one format, not a fork.
- Memory becomes legible: persona library, project history, and brief cards make accumulated context visible.
- The conversation feel matches the Claude/ChatGPT/Gemini comparator class.

Tradeoffs:

- Domain surface grows. The cost is mitigated by registries and strategy boundaries instead of conditionals.
- SQLite migration is unavoidable; the JSON file becomes export-only.
- Streaming requires keeping the non-streaming paths for tests, which carries minor duplication.

## Rejected alternatives

- **One persona with `style_variant` flag.** Rejected because the two voices share *no* tonal substrate; mixing them in one guide demonstrably degrades both. Two distinct guides are operationally cleaner than one guide with branches.
- **Output format as a free-text instruction in the brief.** Rejected because validators must be code-enforced (Zenn frontmatter, note `# title`, HTML semantics). Free text gives no validator handle.
- **Single-page rewrite to React/Vue first.** Rejected as premature; the UX wins (chat transcript, streaming, editable draft) are achievable as progressive enhancements. A framework rewrite is worth a later ADR if profiling shows the static prototype slows iteration.
- **Skip persona model and let the user paste a custom style guide.** Rejected because re-deriving the guide from real articles is the strongest signal we have; manual paste defeats the style-comparator scoring.