C4 Model Architecture for European Parliament Intelligence Platform
📐 System Context • 📦 Container View • 🔧 Component Design
📋 Document Owner: CEO | 📄 Version: 1.2 | 📅 Last Updated: 2026-04-20 (UTC) | 📦 Release: v0.8.40
🔄 Review Cycle: Quarterly | ⏰ Next Review: 2026-07-20
The April-2026 release migrated from an AI-authored-HTML pipeline to a deterministic aggregator pipeline. Article HTML is now rendered by `src/aggregator/**` from committed Stage-B analysis artifacts. There is no AI-authored HTML step, no per-article-type strategies, no `AI_MARKER`/`FALLBACK_TEMPLATE` sentinel contract, and no `src/utils/content-validator.ts` / `validate-articles.ts` / `validate-analysis-completeness.ts` runtime validators.

Canonical references for the current release:
- 🟢 Render entry point: `src/aggregator/article-generator.ts` (CLI: `npm run generate-article -- --run <analysis-run-dir>`)
- 📦 Aggregator modules: `artifact-order.ts`, `clean-artifact.ts`, `analysis-aggregator.ts`, `markdown-renderer.ts`, `article-html.ts`, `article-metadata.ts` (5-tier editorial-highlight resolver for `<title>` / `<meta description>`: manifest override → first artifact H1 → aggregated H1 → first strong prose → localized template)
- 🤖 Agentic workflows: 8 unified `news-<type>.md` files (Stages A → B → C → D → E in one session) + `news-translate.md`; the split-family workflows (`news-<type>-analysis.md` + `news-<type>-article.md`) and the manual `news-article-generator.md` helper were deleted
- 💰 Economic-context enforcement: editorial Stage-C agent-side review over `intelligence/economic-context.md` (the Wave-2 OR-gate and Wave-3/Wave-4 strict runtime gates in `src/utils/content-validator.ts` were purged with the rest of the validator layer; enforcement moved to the Stage-C completeness review protocol in `.github/prompts/03-analysis-completeness-gate.md` and the depth floors in `analysis/methodologies/reference-quality-thresholds.json`)

The C4 Container and Component diagrams in this document have been rewritten against the post-migration aggregator stack; they no longer reference deleted strategies, builders, or `content-validator.ts`.
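The five-tier resolution order listed for `article-metadata.ts` reads as a first-non-empty fallback chain. The following is a minimal sketch under that assumption: the `TitleSources` shape, the `resolveTitle` function, and the sample strings are hypothetical names chosen for illustration and are not taken from the module itself.

```typescript
// Hypothetical input shape: one optional candidate per tier, in priority order.
interface TitleSources {
  manifestOverride?: string; // tier 1: explicit override in manifest.json
  firstArtifactH1?: string;  // tier 2: H1 of the first artifact
  aggregatedH1?: string;     // tier 3: H1 of the aggregated document
  firstStrongProse?: string; // tier 4: first substantive prose sentence
  localizedTemplate: string; // tier 5: always-available localized fallback
}

interface ResolvedTitle {
  tier: number; // 1-based tier that supplied the title
  title: string;
}

function resolveTitle(sources: TitleSources): ResolvedTitle {
  const candidates: Array<string | undefined> = [
    sources.manifestOverride,
    sources.firstArtifactH1,
    sources.aggregatedH1,
    sources.firstStrongProse,
    sources.localizedTemplate,
  ];
  for (let i = 0; i < candidates.length; i++) {
    // Whitespace-only candidates are treated as missing.
    const value = candidates[i]?.trim();
    if (value) return { tier: i + 1, title: value };
  }
  throw new Error("no usable title source (localized template was empty)");
}

// Sample data only: the override is blank, so tier 2 wins.
const resolved = resolveTitle({
  manifestOverride: "   ",
  firstArtifactH1: "Plenary adopts budget resolution",
  localizedTemplate: "European Parliament update",
});
console.log(resolved.tier, resolved.title);
```

Returning the winning tier alongside the title is a design choice of this sketch; it makes it easy to audit how often the generic template fallback is reached.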
This document serves as the primary entry point for the EU Parliament Monitor's architectural documentation. It provides a comprehensive view of the system's design using the C4 model approach, starting from a high-level system context and drilling down to component interactions.
| Document | Focus | Description | Documentation Link |
|---|---|---|---|
| Architecture | 🏛️ Architecture | C4 model showing current system structure | View Source |
| Future Architecture | 🏛️ Architecture | C4 model showing future system structure | View Source |
| Mindmaps | 🧠 Concept | Current system component relationships | View Source |
| Future Mindmaps | 🧠 Concept | Future capability evolution | View Source |
| SWOT Analysis | 💼 Business | Current strategic assessment | View Source |
| Future SWOT Analysis | 💼 Business | Future strategic opportunities | View Source |
| Data Model | 📊 Data | Current data structures and relationships | View Source |
| Future Data Model | 📊 Data | Enhanced European Parliament data architecture | View Source |
| Flowcharts | 🔄 Process | Current data processing workflows | View Source |
| Future Flowcharts | 🔄 Process | Enhanced AI-driven workflows | View Source |
| State Diagrams | 🔄 Behavior | Current system state transitions | View Source |
| Future State Diagrams | 🔄 Behavior | Enhanced adaptive state transitions | View Source |
| Security Architecture | 🛡️ Security | Current security implementation | View Source |
| Future Security Architecture | 🛡️ Security | Security enhancement roadmap | View Source |
| Threat Model | 🎯 Security | STRIDE threat analysis | View Source |
| Classification | 🏷️ Governance | CIA classification & BCP | View Source |
| CRA Assessment | 🛡️ Compliance | Cyber Resilience Act | View Source |
| Workflows | ⚙️ DevOps | CI/CD documentation | View Source |
| Future Workflows | 🚀 DevOps | Planned CI/CD enhancements | View Source |
| Business Continuity Plan | 🔄 Resilience | Recovery planning | View Source |
| Financial Security Plan | 💰 Financial | Cost & security analysis | View Source |
| End-of-Life Strategy | 📦 Lifecycle | Technology EOL planning | View Source |
| Unit Test Plan | 🧪 Testing | Unit testing strategy | View Source |
| E2E Test Plan | 🔍 Testing | End-to-end testing | View Source |
| Performance Testing | ⚡ Performance | Performance benchmarks | View Source |
| Security Policy | 🔒 Security | Vulnerability reporting & security policy | View Source |
EU Parliament Monitor is developed and maintained in accordance with Hack23 AB's Information Security Management System (ISMS), which is aligned with ISO 27001:2022, NIST CSF 2.0, and CIS Controls v8.1.
| Policy | Description | Relevance to EU Parliament Monitor |
|---|---|---|
| Information Security Policy | Establishes organization-wide security governance and risk management framework | Defines overall security posture, risk assessment methodology, and management responsibilities for the project |
| Secure Development Policy | Defines secure coding standards, code review requirements, and SDLC security gates | Mandates security-first development practices: input validation, dependency scanning, SAST/DAST integration, secure CI/CD pipelines |
| Open Source Policy | Governs use, contribution, and licensing of open source software | Ensures compliance with Apache-2.0 License, dependency license compatibility, and transparent open source contribution practices |
| Classification Policy | Defines data classification scheme (Public, Internal, Confidential, Restricted) and handling requirements | All project content classified as PUBLIC; establishes data handling controls for any future sensitive data integration |
| AI Policy | Governs responsible AI usage, transparency, and human oversight requirements | Governs LLM usage for content generation: transparency requirements, human review workflows, bias mitigation, prompt injection protection |
| Access Control Policy | Defines authentication, authorization, least privilege, and privileged access management | Controls GitHub repository access, branch protection rules, secret management, and deployment permissions |
| Cryptography Policy | Establishes cryptographic standards for data protection (algorithms, key management, TLS) | Mandates HTTPS-only content delivery, TLS 1.2+ (TLS 1.3 where supported) for outbound HTTPS API communications; EP MCP integration uses a local stdio JSON-RPC channel (no TLS); ensures secure secret storage for LLM API keys |
ISO 27001:2022 Controls Implemented:
- A.5.10 - Information Security Policy (documented and reviewed quarterly)
- A.8.3 - Secure Coding (ESLint security rules, CodeQL SAST scanning)
- A.8.23 - Web Filtering (planned CSP headers via CloudFront, XSS prevention)
- A.8.24 - Cryptography (HTTPS-only, TLS 1.2+ / TLS 1.3 where supported, site delivery via CloudFront)
- A.8.28 - Secure Coding (input validation, dependency scanning)
NIST CSF 2.0 Functions Addressed:
- Identify (ID): Asset inventory, risk assessment, vulnerability management
- Protect (PR): Access control, data security, secure development
- Detect (DE): Security monitoring, vulnerability scanning, anomaly detection
- Respond (RS): Incident response procedures, GitHub Security Advisories
- Recover (RC): Business Continuity Plan, backup/restore procedures
CIS Controls v8.1 Implemented:
- Control 1: Inventory and Control of Enterprise Assets (documented in repo)
- Control 4: Secure Configuration (branch protection, security policies)
- Control 6: Access Control Management (GitHub RBAC, least privilege)
- Control 8: Audit Log Management (GitHub audit logs, workflow logs)
- Control 10: Malware Defenses (Dependabot, npm audit, CodeQL)
- Control 16: Application Software Security (SAST, dependency scanning, secure coding)
Evidence of ISMS compliance is maintained through:
- Policy Documents: All policies stored in Hack23/ISMS-PUBLIC
- Security Architecture: SECURITY_ARCHITECTURE.md maps controls to implementations
- Threat Model: THREAT_MODEL.md documents STRIDE analysis and mitigations
- Classification: CLASSIFICATION.md defines data classification and handling
- Audit Trail: GitHub audit logs, workflow execution logs, dependency scan reports
- Security Scanning: CodeQL results, Dependabot alerts, npm audit reports
EU Parliament Monitor is a TypeScript-first static site generator and political intelligence platform that creates multi-language news articles about European Parliament activities. Content is produced by a fleet of 9 agentic GitHub Workflows (gh-aw — 8 unified news-<type>.md + news-translate.md) that drive AI agents (Claude Opus 4.7 via GitHub Copilot) through the Stage A→E protocol, consuming structured data from three data surfaces:
- European Parliament MCP Server `v1.2.18+` (primary: 60+ tools including plenary, MEPs, votes, committees, procedures, adopted texts, sliding-window and fixed-window feeds, analytical tools, and a three-state voting fallback to the EP Open Data Portal)
- World Bank Open Data MCP (non-economic only: WDI social/health/education/environment/governance indicators)
- IMF REST (SDMX 3.0) native TypeScript fetch client (primary economic source: WEO + Fiscal Monitor + IFS + BOP + ER + PCPS + GFSR + EREO + FSI + GFS + DOT)
TypeScript code handles data acquisition, analysis orchestration, HTML structure, and validation; AI agents author all narrative content under a strict two-pass AI-First Quality regime.
Enable democratic transparency by providing automated, multilingual coverage of European Parliament activities through a secure, maintainable static site architecture.
- Minimal Runtime Dependencies: Pure static HTML/CSS output with no server-side execution; one pinned production dependency (`european-parliament-mcp-server@1.2.18`) plus one optional dependency (`worldbank-mcp@1.0.1`) used only at build time; `markdown-it` + plugins (`markdown-it-anchor`, `markdown-it-footnote`, `markdown-it-attrs`, `markdown-it-deflist`) vendored in the aggregator for deterministic artifact rendering
- TypeScript Source: All source in `src/` written in TypeScript 6.0.3 (strict, ESM, `"type": "module"`), compiled via `tsc` with `rootDir: ./src`, `outDir: ./scripts`, `target: ES2025`, `module: NodeNext`
- Multi-Language Support: Generates content in 14 languages (`en, sv, da, no, fi, de, fr, es, nl, ar, he, ja, ko, zh`), defined in `src/constants/language-core.ts::ALL_LANGUAGES`
- Article Types: 8 production content types (`breaking`, `committee-reports`, `month-ahead`, `month-in-review`, `motions`, `propositions`, `week-ahead`, `week-in-review`); each type is a slug, not a strategy module; the aggregator renders the same canonical artifact order for every type, and per-type content differences are carried by the Stage-B artifacts themselves
- Agentic Workflows: 9 unified gh-aw markdown workflows: 8 `news-<type>.md` article types (Stages A → B → C → D → E in one session, active-work budget 22–27 min before the single safe-outputs `create_pull_request` call, 75-min hard timeout) + `news-translate.md` (14-language flush translation, exempt from the single-PR rule), compiled to `.lock.yml` via `gh aw compile --validate` (pinned `GH_AW_VERSION: v0.69.0`)
- Analysis-Artifact-Driven Article Pipeline: Agents author the full Stage-B artifact set under `analysis/daily/<date>/<slug>-run<NN>/` and commit it. The deterministic aggregator (`src/aggregator/**`, invoked via `npm run generate-article -- --run <analysis-run-dir>` for a single run or `npm run generate-article:all` for batch regen) walks `manifest.json`, cleans each artifact, and emits the final HTML with the shared site chrome (stacked header + embedded 14-language switcher + TOC sidebar + footer stats) and 14-language hreflang entries. There is no AI-authored HTML step, no strategies, no builders, no section-builders
- Economic Data (IMF-primary, Wave-4 strict default editorial): IMF REST is the primary source for every economic claim in `intelligence/economic-context.md`; World Bank MCP provides complementary non-economic context only. Enforcement is editorial at the Stage-C completeness review; the legacy runtime gates (`articlePolicyHasEconomicContext`, `articlePolicyHasIMFEconomicEvidence`, `isWave3IMFStrictEnabled`) in `src/utils/content-validator.ts` were purged in April-2026, and the Stage-C reviewer applies the IMF-required-for-policy rule directly over the committed artifact
- Quality-Through-Artifact Principle: Mandatory 2-pass iterative improvement during Stage B (~60% pass 1, ~40% pass 2); ≥ 80 words/SWOT item, ≥ 150 words/stakeholder perspective, ≥ 1 Mermaid or Chart.js visualisation per core artifact, 0 `[AI_ANALYSIS_REQUIRED]` sentinel markers in any committed file (enforced at Stage-C agent-side review against `reference-quality-thresholds.json`)
- MCP Integration: Spawned as local child processes via stdio JSON-RPC at build time; inside agentic workflows via the `awmg` gateway at `http://host.docker.internal:8080/mcp/european-parliament`
- Security by Design: Minimal attack surface through static architecture; 5-layer gh-aw security (AWF Squid firewall allowlist, sandboxed Docker, safe-output constraints, JSONL audit trail, lock file compilation); agent prose-injection class of defects eliminated at the root by the aggregator migration (no AI-authored HTML step means no template-prose leak vector)
- AWS Hosted: AWS S3 + CloudFront (primary, via `deploy-s3.yml` with OIDC auth); GitHub Pages retained as documented fallback; npm package published to `registry.npmjs.org/euparliamentmonitor` with SLSA Level 3 provenance
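The depth floors in the Quality-Through-Artifact item are enforced editorially at Stage C, not by runtime code. Purely to illustrate the arithmetic behind two of them (the 80-word SWOT floor and the zero-sentinel rule), here is a minimal sketch. The function and type names are hypothetical, and whitespace-delimited word counting is an assumption of this sketch, not something the project defines.

```typescript
// Floors quoted from the list above; everything else here is illustrative.
const MIN_WORDS_PER_SWOT_ITEM = 80;
const SENTINEL = "[AI_ANALYSIS_REQUIRED]";

// Assumption: a "word" is any whitespace-delimited token.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

interface FloorReport {
  shortItems: number[];  // indexes of SWOT items below the word floor
  sentinelCount: number; // total occurrences of the sentinel marker
  passes: boolean;       // true only if both rules hold
}

function checkSwotFloors(items: readonly string[]): FloorReport {
  const shortItems: number[] = [];
  let sentinelCount = 0;
  items.forEach((item, index) => {
    if (countWords(item) < MIN_WORDS_PER_SWOT_ITEM) shortItems.push(index);
    // split() yields one more piece than there are occurrences.
    sentinelCount += item.split(SENTINEL).length - 1;
  });
  return {
    shortItems,
    sentinelCount,
    passes: shortItems.length === 0 && sentinelCount === 0,
  };
}

// Sample data: one item exactly at the floor, one short item with a sentinel.
const longItem = Array.from({ length: 80 }, () => "word").join(" ");
const report = checkSwotFloors([longItem, "Too short. [AI_ANALYSIS_REQUIRED]"]);
console.log(report);
```

A reviewer could run something like this as a quick aid before reading an artifact, but the authoritative thresholds remain those in `reference-quality-thresholds.json`.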
👤 User Focus: Shows how different user types interact with the EU Parliament Monitor system and what external systems it depends on.
🌐 Integration Focus: Illustrates the relationships with GitHub infrastructure, European Parliament APIs, and LLM services.
C4Context
title EU Parliament Monitor - System Context Diagram
Person(citizen, "European Citizen", "Reads news about European Parliament activities in their native language")
Person(journalist, "Journalist", "Uses site as research source for European political coverage")
Person(researcher, "Political Researcher", "Analyzes EP activities and trends")
Person(contributor, "Developer/Contributor", "Maintains and improves the news generation system")
System(epmonitor, "EU Parliament Monitor", "Static site with multilingual news about European Parliament activities")
System_Ext(github, "GitHub", "Hosts repository, runs CI/CD (GitHub Actions)")
System_Ext(aws, "AWS (S3 + CloudFront)", "Serves static site globally via CDN")
System_Ext(ep_mcp, "European Parliament MCP Server", "Provides structured access to EP data")
System_Ext(ep_api, "European Parliament APIs", "Official EP data sources (plenary, committees, documents)")
System_Ext(llm, "LLM Service", "Generates article content from structured EP data")
Rel(citizen, epmonitor, "Reads news", "HTTPS")
Rel(journalist, epmonitor, "Researches stories", "HTTPS")
Rel(researcher, epmonitor, "Analyzes data", "HTTPS")
Rel(contributor, github, "Contributes code", "Git/HTTPS")
Rel(epmonitor, github, "Built and deployed via", "GitHub Actions")
Rel(epmonitor, aws, "Hosted on", "S3 + CloudFront")
Rel(github, epmonitor, "Generates site via", "GitHub Actions")
Rel(epmonitor, ep_mcp, "Fetches EP data via", "MCP Protocol")
Rel(ep_mcp, ep_api, "Queries EP data", "HTTPS/JSON")
Rel(epmonitor, llm, "Generates content via", "API/SDK")
UpdateLayoutConfig($c4ShapeInRow="3", $c4BoundaryInRow="2")
| Element | Type | Description | Technology |
|---|---|---|---|
| European Citizen | User | Primary audience seeking EP news in native language | Web Browser |
| Journalist | User | Professional using site for research and story development | Web Browser |
| Political Researcher | User | Academic or analyst studying EP activities | Web Browser |
| Developer/Contributor | User | Maintainer improving system | Git, Node.js, VS Code |
| EU Parliament Monitor | System | Core static site generator | Node.js, TypeScript |
| GitHub | External System | Source control, CI/CD | GitHub Actions |
| EP MCP Server | External System | Structured EP data access | MCP Protocol, TypeScript |
| EP APIs | External System | Official data sources | REST APIs, JSON |
| LLM Service | External System | Content generation | API (OpenAI/Anthropic/etc.) |
graph TB
subgraph "Public Internet - Untrusted Zone"
Users["Web Users\nCitizens, Journalists, Researchers"]
end
subgraph "GitHub Infrastructure - Trusted Zone"
subgraph "Build Environment"
Actions["GitHub Actions Runner\nGitHub-hosted Ubuntu runner\nubuntu-latest + Node.js 25"]
EPServer["European Parliament\nMCP Server\nLocal process, stdio JSON-RPC"]
end
subgraph "Source Control"
Repo["Git Repository\nVersion Control"]
end
end
subgraph "AWS Hosting - Cloud Infrastructure Zone"
Pages["AWS S3 + CloudFront CDN\nHTTPS via ACM"]
end
subgraph "External Services - Partially Trusted Zone"
EPAPI["European Parliament\nOfficial APIs"]
LLM["LLM Service\nOpenAI/Anthropic"]
end
Users -->|"HTTPS GET\nRead-Only"| Pages
Actions -->|"Spawns locally\nstdio JSON-RPC"| EPServer
EPServer -->|"HTTPS/JSON\nData Queries"| EPAPI
Actions -->|"API Calls\nContent Gen"| LLM
Actions -->|"Git Push\nAuthenticated"| Repo
Actions -->|"S3 Sync + CF Invalidation\nAuthenticated via OIDC"| Pages
classDef users fill:#CE93D8,stroke:#6A1B9A,stroke-width:2px,color:#000000
classDef hosting fill:#A5D6A7,stroke:#2E7D32,stroke-width:2px,color:#000000
classDef actions fill:#90CAF9,stroke:#1565C0,stroke-width:2px,color:#000000
classDef external fill:#FFE082,stroke:#F57C00,stroke-width:2px,color:#000000
class Users users
class Pages hosting
class Actions actions
class EPServer,EPAPI,LLM external
Trust Boundary Analysis:
| Zone | Trust Level | Security Controls | Threat Model |
|---|---|---|---|
| Public Internet | Untrusted | HTTPS-only, planned CSP headers, static content only | DDoS, XSS attempts (mitigated by static architecture) |
| GitHub Infrastructure | Trusted | GitHub authentication, branch protection, optional signed commits, secret scanning | Supply chain attacks (mitigated by Dependabot, CodeQL) |
| AWS Hosting | Trusted | ACM certificate, HTTPS redirect, DDoS protection via CloudFront | Hosting infrastructure compromise (mitigated by AWS security controls, OIDC deploy auth) |
| External Services | Partially Trusted | API authentication, basic input parsing/shape validation; planned systematic sanitization/escaping and rate limiting | Data poisoning, API compromise (mitigated by validation, monitoring, planned hardening) |
Key Security Boundaries:
- User → CloudFront: Read-only HTTPS access, no authentication required (public content)
- GitHub Actions → External APIs: Authenticated API calls, input validation, error handling
- GitHub Actions → AWS S3: Authenticated S3 sync + CloudFront invalidation, only static files deployed
- External Services → System: Data parsed and basic shape-validated before use; comprehensive sanitization/escaping and rate limiting are planned controls
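As an illustration of the "parsed and basic shape-validated before use" boundary, the sketch below parses an untrusted JSON payload and keeps only records that match an expected shape. The `VoteRecord` interface and its fields are hypothetical and do not reflect the EP MCP server's actual schema.

```typescript
// Hypothetical record shape used only for this example.
interface VoteRecord {
  id: string;
  date: string; // ISO calendar date, YYYY-MM-DD
  votesFor: number;
  votesAgainst: number;
}

const isCount = (n: unknown): n is number =>
  typeof n === "number" && Number.isInteger(n) && n >= 0;

// Type guard: accepts only objects whose fields have the expected types.
function isVoteRecord(value: unknown): value is VoteRecord {
  if (typeof value !== "object" || value === null) return false;
  const { id, date, votesFor, votesAgainst } = value as Record<string, unknown>;
  return (
    typeof id === "string" &&
    id.length > 0 &&
    typeof date === "string" &&
    /^\d{4}-\d{2}-\d{2}$/.test(date) &&
    isCount(votesFor) &&
    isCount(votesAgainst)
  );
}

// Malformed JSON or a non-array payload yields an empty list instead of throwing.
function parseVoteRecords(rawJson: string): VoteRecord[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawJson);
  } catch {
    return [];
  }
  if (!Array.isArray(parsed)) return [];
  return parsed.filter(isVoteRecord);
}

// Sample payload: the second entry is missing fields and is dropped.
const raw =
  '[{"id":"v1","date":"2026-04-20","votesFor":10,"votesAgainst":2},{"id":42}]';
const records = parseVoteRecords(raw);
console.log(records.length);
```

Shape validation of this kind only rejects structurally wrong data; it does not escape or sanitize string content, which the list above identifies as a planned control.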
📦 Container Focus: Major containers (applications, data stores, MCP clients) of the post-April-2026 aggregator pipeline.
🔄 Data Flow Focus: How agentic workflows produce analysis artifacts and how the deterministic aggregator renders them into 14-language HTML.
%%{init: {"theme":"dark","themeVariables":{"primaryColor":"#1565C0","primaryTextColor":"#fff","lineColor":"#90CAF9","fontFamily":"Inter, Helvetica, Arial, sans-serif"}}}%%
C4Container
title EU Parliament Monitor — Container Diagram (April-2026 aggregator pipeline)
Person(user, "Reader", "Reads multilingual EP news at euparliamentmonitor.com")
Person(contributor, "Contributor", "Maintains code, methodologies, translations")
Person(researcher, "Researcher / Journalist", "Audits analysis artifacts via the Political Intelligence Hub")
Container_Boundary(epmonitor, "EU Parliament Monitor") {
Container(aw_orchestrator, "gh-aw Orchestrator", "Agentic Workflows (Claude Opus 4.7)", "9 agentic workflows: 8 unified news-<type>.md + news-translate.md")
Container(prompt_lib, "Prompt Library", "10 bounded contexts", ".github/prompts/00-scope … 09-troubleshooting; lint:prompts drift-guard")
Container(methodology_lib, "Methodology Library", "Markdown methodologies + JSON thresholds", "17 methodologies + reference-quality-thresholds.json (analysis/methodologies/)")
Container(template_lib, "Template Library", "51 Markdown templates", "39 core + 12 extended (analysis/templates/)")
ContainerDb(analysis_runs, "Analysis Runs", "Markdown + JSON", "analysis/daily/YYYY-MM-DD/<type>/{manifest.json,intelligence/,classification/,risk-scoring/,threat-assessment/,documents/,extended/}")
Container(aggregator, "Aggregator (7 modules)", "TypeScript", "src/aggregator/**: artifact-order · clean-artifact · analysis-aggregator · markdown-renderer · article-html · article-metadata · article-generator (CLI)")
Container(ep_client, "EP MCP Client", "TypeScript", "Stdio JSON-RPC to european-parliament-mcp-server@1.2.18+; 60+ tools; getVotingRecordsWithFallback() to EP Open Data Portal (src/mcp/ep-mcp-client.ts)")
Container(wb_client, "World Bank MCP Client", "TypeScript", "WORLD_BANK_MCP_TOOLS — non-economic indicators only (src/mcp/wb-mcp-client.ts)")
Container(imf_client, "IMF REST Client", "TypeScript", "IMF_MCP_TOOLS — primary economic source: WEO/Fiscal Monitor/IFS/BOP/ER/PCPS (src/mcp/imf-mcp-client.ts)")
Container(stage_c_review, "Stage-C Review", "Editorial agent + thresholds", "Reads .github/prompts/03-analysis-completeness-gate.md + reference-quality-thresholds.json — replaces purged content-validator.ts")
Container(news_indexes, "News Indexes & Sitemap", "TypeScript", "Per-language index pages + sitemap.xml + sitemap_<lang>.html (src/generators/news-indexes.ts, sitemap.ts)")
ContainerDb(static_files, "Static Site Output", "HTML/CSS/JS/JSON", "news/<slug>-<lang>.html (14 langs) · news/<slug>.en.md · article.md per run · sitemap.xml · articles-metadata.json")
}
Container_Boundary(github_infra, "GitHub Infrastructure") {
Container(actions, "GitHub Actions", "CI/CD + gh-aw runtime", "9 news + ~15 standard workflows; SHA-pinned actions; OpenSSF Scorecard")
ContainerDb(repo, "Git Repository", "Version control", "Source + analysis runs + generated content; SLSA L3 provenance")
}
Container_Boundary(aws_infra, "AWS Infrastructure") {
Container(cf_s3, "CloudFront + S3", "CDN / object storage", "Primary hosting · ACM HTTPS · OIDC GithubWorkFlowRole · cache: HTML 1h, immutable assets 1y")
}
System_Ext(ep_mcp, "European Parliament MCP Server v1.2.18+", "60+ tools — plenary, voting, motions, committee, MEPs, declarations, procedures, analytical (voting-anomaly, coalition, MEP-influence)")
System_Ext(ep_open_data, "EP Open Data Portal", "https://data.europarl.europa.eu — voting-records fallback (/api/v2/decision)")
System_Ext(wb_mcp, "World Bank Open Data MCP", "Non-economic WDI indicators (health, education, environment, governance)")
System_Ext(imf_api, "IMF SDMX 3.0 REST", "https://dataservices.imf.org/REST/SDMX_3.0/")
System_Ext(copilot, "GitHub Copilot / Claude Opus 4.7", "Authors analysis Markdown under 2-pass AI-First Quality regime — never authors HTML")
Rel(user, cf_s3, "Reads HTML in 14 langs", "HTTPS")
Rel(researcher, repo, "Audits analysis/daily/", "Git/HTTPS")
Rel(contributor, repo, "Commits code + methodologies", "Git/HTTPS")
Rel(actions, aw_orchestrator, "Triggers on schedule / manual", "gh-aw engine")
Rel(aw_orchestrator, copilot, "Delegates analysis authoring", "Copilot CLI")
Rel(aw_orchestrator, prompt_lib, "Imports prompts", "Markdown")
Rel(aw_orchestrator, methodology_lib, "Reads methodologies", "Markdown")
Rel(aw_orchestrator, template_lib, "Fills templates", "Markdown")
Rel(aw_orchestrator, ep_client, "Stage A — fetch", "fn")
Rel(aw_orchestrator, wb_client, "Stage A — context (optional)", "fn")
Rel(aw_orchestrator, imf_client, "Stage A — economic context", "fn")
Rel(aw_orchestrator, analysis_runs, "Stage B — write artifacts", "fs.write")
Rel(aw_orchestrator, stage_c_review, "Stage C — completeness gate", "agent review")
Rel(stage_c_review, analysis_runs, "Reads + grades", "fn")
Rel(aw_orchestrator, aggregator, "Stage D — npm run generate-article", "CLI")
Rel(aggregator, analysis_runs, "Reads manifest.json + artifacts", "fs.read")
Rel(aggregator, static_files, "Writes 14 HTML + Markdown", "fs.write")
Rel(news_indexes, static_files, "Writes index pages", "fs.write")
Rel(ep_client, ep_mcp, "stdio JSON-RPC", "MCP")
Rel(ep_client, ep_open_data, "Voting fallback", "HTTPS REST")
Rel(wb_client, wb_mcp, "stdio JSON-RPC", "MCP")
Rel(imf_client, imf_api, "HTTPS / SDMX", "REST")
Rel(static_files, repo, "Stage E — single PR", "Git")
Rel(actions, cf_s3, "Deploy via OIDC", "S3 sync + CloudFront invalidation")
UpdateLayoutConfig($c4ShapeInRow="3", $c4BoundaryInRow="2")
| 🧱 Container | ⚙️ Technology | 🎯 Purpose | 🔄 Data flow |
|---|---|---|---|
| 🤖 gh-aw Orchestrator | Claude Opus 4.7 + gh-aw v0.69.0+ | Runs 9 agentic workflows; produces analysis artifacts | Triggers via cron / manual; commits one PR per run |
| 📚 Prompt / Methodology / Template libraries | Markdown + JSON | Bounded-context prompts (10), methodologies (17), templates (51) | Read by every agentic workflow at start-of-session |
| 🧠 Analysis Runs | Markdown + JSON | Per-run intelligence tree under `analysis/daily/<date>/<type>/` | Written by Stage B agents; consumed by Stage C and aggregator |
| 🟢 Aggregator (7 modules) | TypeScript | Reads `manifest.json` and Markdown artifacts; renders 14-language HTML deterministically | `npm run generate-article -- --run <dir>` |
| 🔌 EP MCP Client | TypeScript | 60+ EP tools + voting fallback to EP Open Data Portal `/api/v2/decision` | Stage A data collection |
| 💰 IMF / 🌱 World Bank Clients | TypeScript | Economic context (IMF) + non-economic indicators (WB) | Stage A wave-2 context |
| ⚖️ Stage-C Review | Editorial agent + JSON thresholds | Per-artifact line floors + tradecraft signals (Admiralty / WEP / ICD-203) | Replaces the purged `content-validator.ts` runtime gate |
| 🌐 News Indexes & Sitemap | TypeScript | 14-language index pages, `sitemap.xml`, hreflang alternates | `npm run prebuild` |
| 📦 Static Site Output | HTML / CSS / JS / JSON / Markdown | Public deliverable: news pages + `article.md` source per run | Committed to main, deployed to S3 |
| 🚀 GitHub Actions | CI/CD + gh-aw | 9 news + ~15 standard workflows | Daily news + on-PR validation |
| ☁️ CloudFront + S3 | CDN / object storage | Primary hosting via OIDC `GithubWorkFlowRole` | HTTPS + immutable asset cache |
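To make the "deterministic" property of the aggregator concrete, the sketch below keeps only Markdown artifacts from a manifest file list and sorts them into a fixed order. The section names mirror the directories of an analysis run, but their order here, the `orderArtifacts` function, and the sample file names are assumptions for illustration; the canonical order is defined in `artifact-order.ts` and is not reproduced here.

```typescript
// Assumed section order, for illustration only.
const SECTION_ORDER = [
  "intelligence",
  "classification",
  "risk-scoring",
  "threat-assessment",
  "documents",
  "extended",
] as const;

// Unknown top-level directories sort after all known sections.
function sectionRank(path: string): number {
  const top = path.split("/")[0] ?? "";
  const index = (SECTION_ORDER as readonly string[]).indexOf(top);
  return index === -1 ? SECTION_ORDER.length : index;
}

function orderArtifacts(manifestFiles: readonly string[]): string[] {
  return manifestFiles
    .filter((file) => file.endsWith(".md")) // Markdown artifacts only
    .sort((a, b) => {
      const byRank = sectionRank(a) - sectionRank(b);
      if (byRank !== 0) return byRank;
      // Locale-independent tie-break keeps output identical on every machine.
      return a < b ? -1 : a > b ? 1 : 0;
    });
}

// Hypothetical manifest entries in arbitrary order.
const ordered = orderArtifacts([
  "risk-scoring/risk-matrix.md",
  "manifest.json",
  "intelligence/economic-context.md",
  "intelligence/analysis-index.md",
]);
console.log(ordered);
```

Because the result depends only on the file list and a fixed order table, the same committed run produces the same section sequence on every regeneration.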
| Container | Security responsibility | Implementation | Controls |
|---|---|---|---|
| 🤖 gh-aw Orchestrator | Sandboxed AWF runtime, Squid egress allowlist, capability-bounded safe outputs | Runs in GitHub-hosted ephemeral VMs; `safe-outputs.create-pull-request.max: 1`; `step-security/harden-runner` egress block | A.5.10, A.8.28 (ISO 27001), CIS 16 |
| 🟢 Aggregator | Deterministic Markdown→HTML; explicit markdown-it plugin allowlist; `clean-artifact.ts` strips SPDX/banners; `script-src 'self'` CSP | No AI-authored HTML; vendored Mermaid/Chart.js/D3 under `js/vendor/` | A.8.23, A.8.28 (ISO 27001), OWASP A03 |
| 🔌 EP MCP Client | Local stdio JSON-RPC; per-request timeout + retry backoff; envelope validation; voting-records three-state fallback | `safeCallTool()` + `callToolWithRetry()` wrappers in `ep-mcp-client.ts` | A.8.24 (ISO 27001), CIS 16 |
| 🧠 Analysis Artifacts | Tradecraft grading (Admiralty A1–F6 + WEP) + provenance manifest | `manifest.json` cross-reference map; `methodology-reflection.md` audit | A.5.12 (ISO 27001) |
| 📦 Static Files | Public-data only; integrity via Git + SLSA L3 attestation | All EP/IMF/WB content is public; SBOM, REUSE licence headers | A.5.10 (ISO 27001) |
| 🚀 GitHub Actions | OIDC for AWS deploy; Secrets at job-scope; SHA-pinned third-party actions | `GithubWorkFlowRole` IAM with least privilege; harden-runner egress allowlist | A.8.3, CIS 6 |
| ☁️ CloudFront + S3 | HTTPS-only via ACM; bucket policy denies public ACLs; CloudFront cache-control by file class | Long-cache immutable assets, short-cache HTML | A.13.1, A.5.23 (ISO 27001) |
| Amazon CloudFront + S3 | HTTPS-only, CDN security, DDoS protection | Forces HTTPS redirect via ACM certificate, CloudFront with DDoS mitigation, HSTS headers (configured externally in CloudFront distribution) | A.8.24 (ISO 27001) |
| Git Repository | Access control, branch protection, signed commits | RBAC with least privilege, protected main branch, optional signed commits | CIS Control 6, A.8.3 |
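The EP MCP Client row mentions a per-request timeout combined with retry backoff. A generic version of that pattern is sketched below. It is not the code of `safeCallTool()` or `callToolWithRetry()`; the helper names, the exponential schedule, and the default delays are all assumptions made for this example.

```typescript
// Exponential backoff: baseMs, 2*baseMs, 4*baseMs, ... capped at capMs.
// attempt is 0-based. Defaults are illustrative, not project settings.
function backoffDelayMs(attempt: number, baseMs = 200, capMs = 5_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Rejects if the wrapped promise does not settle within timeoutMs.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${timeoutMs} ms`)),
      timeoutMs,
    );
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

// Tries once, then retries up to options.retries times with backoff between tries.
async function callWithRetry<T>(
  call: () => Promise<T>,
  options: { retries: number; timeoutMs: number; baseMs?: number },
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= options.retries; attempt++) {
    try {
      return await withTimeout(call(), options.timeoutMs);
    } catch (error) {
      lastError = error;
      if (attempt < options.retries) {
        const delay = backoffDelayMs(attempt, options.baseMs);
        await new Promise((resume) => setTimeout(resume, delay));
      }
    }
  }
  throw lastError;
}

// Usage: a fake tool call that fails twice, then succeeds on the third call.
let calls = 0;
const flaky = async (): Promise<string> => {
  calls += 1;
  if (calls < 3) throw new Error("transient failure");
  return "ok";
};
callWithRetry(flaky, { retries: 3, timeoutMs: 1_000, baseMs: 1 }).then((result) => {
  console.log(result, "after", calls, "calls");
});
```

Capping the delay keeps a long outage from stretching a single call indefinitely, which matters when the whole workflow session has a hard timeout.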
graph TB
subgraph "Generation Layer - Build Time Security"
NewsGen["News Generator\nInput Validation\nData Sanitization"]
MCPClient["MCP Client\nLocal stdio JSON-RPC\nConnection Retry\nRequest Timeout"]
Template["Template Engine\nXSS Prevention\nCSP Generation\nHTML Sanitization"]
end
subgraph "Storage Layer - Version Control Security"
GitRepo["Git Repository\nBranch Protection\nCode Review\nAudit Logs"]
Secrets["GitHub Secrets\nEncrypted Storage\nLeast Privilege"]
end
subgraph "Delivery Layer - Runtime Security"
Pages["Amazon CloudFront + S3\nHTTPS-Only\nHSTS Headers\nDDoS Protection"]
CDN["CloudFront Edge\nTLS Termination\nEdge Caching\nGeographic Distribution"]
end
subgraph "External Layer - Third-Party Security"
EPMCP["EP MCP Server\nMCP Protocol\nData Validation"]
LLM["LLM Service\nAPI Key Auth\nPrompt Injection Prevention"]
end
NewsGen -->|Validated Data| Template
NewsGen -->|"Spawns locally via stdio"| MCPClient
MCPClient -->|JSON-RPC| EPMCP
NewsGen -->|Secured API Calls| LLM
Template -->|Safe HTML| GitRepo
Secrets -->|Inject at Runtime| NewsGen
GitRepo -->|Deploy to S3| Pages
Pages -->|Cached Content| CDN
classDef generation fill:#90CAF9,stroke:#1565C0,stroke-width:2px,color:#000000
classDef storage fill:#A5D6A7,stroke:#2E7D32,stroke-width:2px,color:#000000
classDef delivery fill:#FFCC80,stroke:#F57C00,stroke-width:2px,color:#000000
classDef external fill:#CE93D8,stroke:#6A1B9A,stroke-width:2px,color:#000000
class NewsGen,Template,MCPClient generation
class GitRepo,Secrets storage
class Pages,CDN delivery
class EPMCP,LLM external
🔧 Component Focus: Internal components of the deterministic aggregator (src/aggregator/**) and supporting MCP / methodology modules.
🎯 Responsibility Focus: How analysis Markdown artifacts produced by the agentic workflows become 14-language HTML deliverables.
%%{init: {"theme":"dark","themeVariables":{"primaryColor":"#1565C0","primaryTextColor":"#fff","lineColor":"#90CAF9","fontFamily":"Inter, Helvetica, Arial, sans-serif"}}}%%
C4Component
title EU Parliament Monitor — Aggregator Components (post-April-2026)
Container_Boundary(aggregator_c, "Aggregator (src/aggregator/)") {
Component(article_generator, "article-generator.ts", "TypeScript CLI", "Entry point: npm run generate-article -- --run <dir>; walks manifest.json")
Component(artifact_order, "artifact-order.ts", "TypeScript", "ARTIFACT_SECTIONS — canonical 19-section order")
Component(clean_artifact, "clean-artifact.ts", "TypeScript", "Strips SPDX/banner/provenance front matter from each artifact before merge")
Component(analysis_aggregator, "analysis-aggregator.ts", "TypeScript", "aggregateAnalysisRun() — filters manifestFiles to .md only excluding data/runs/pass1; emits Provenance & Audit block at END")
Component(markdown_renderer, "markdown-renderer.ts", "TypeScript", "markdown-it + plugins (anchor, footnote, attrs, deflist); explicit allowlist; renderMarkdown()")
Component(article_html, "article-html.ts", "TypeScript", "HTML5 wrapper: stacked header, language switcher, TOC sidebar, JSON-LD NewsArticle, isBasedOn provenance, hreflang alternates, footer")
Component(article_metadata, "article-metadata.ts", "TypeScript", "5-tier editorial-highlight resolver for <title>/<meta description>: manifest override → first-artifact H1 → aggregated H1 → first strong prose → localized template")
}
Container_Boundary(mcp_c, "MCP & Data Clients (src/mcp/)") {
Component(ep_client, "ep-mcp-client.ts", "TypeScript", "60+ tools; safeCallTool + callToolWithRetry; recess-mode detection; slow-feed warnings")
Component(ep_open_data, "ep-open-data-client.ts", "TypeScript", "EPOpenDataClient + getVotingRecordsWithFallback() three-state fallback")
Component(wb_client, "wb-mcp-client.ts", "TypeScript", "WORLD_BANK_MCP_TOOLS — non-economic indicators")
Component(imf_client, "imf-mcp-client.ts", "TypeScript", "class IMFMCPClient + IMF_MCP_TOOLS; native fetch SDMX 3.0; primary economic source")
Component(mcp_health, "mcp-health.ts / mcp-retry.ts / mcp-connection.ts", "TypeScript", "Health probes, retry backoff, connection lifecycle")
}
Container_Boundary(intel_c, "Intelligence Utilities (src/utils/, src/generators/)") {
Component(political_classification, "political-classification.ts", "TypeScript", "7-dimension EP event classification")
Component(political_threat, "political-threat-assessment.ts", "TypeScript", "5-framework political threat (Landscape 6D + Attack Trees + Kill Chain + Diamond + ICO)")
Component(political_risk, "political-risk-assessment.ts", "TypeScript", "5×5 Likelihood × Impact scoring")
Component(significance, "significance-scoring.ts", "TypeScript", "Publication priority score per artifact")
Component(quality_scorer, "article-quality-scorer.ts", "TypeScript", "Editorial quality signals")
Component(news_indexes, "news-indexes.ts + sitemap.ts", "TypeScript", "14-language indexes + sitemap.xml + per-language sitemap_<lang>.html")
}
Container_Boundary(scripts_c, "Workflow Scripts (scripts/aggregator/)") {
Component(prior_run_diff, "prior-run-diff.js", "Node.js", "Re-run improve/extend helper; classifies prior-run artifacts as must-extend (carryForward[]) or rewrite; always-on (no env flag); emits priorRunDiff JSON with priorLines+extendFloor")
Component(forward_statements, "forward-statements-registry.js", "Node.js", "Forward-looking-statement JSONL registry; week/month-ahead seeding")
Component(checkpoint, "checkpoint-analysis-to-memory.sh", "Bash", "Pre-audited helper; replaces inline expansion-heavy bash in workflows (shell-safety)")
}
System_Ext(ep_mcp, "EP MCP Server v1.2.18+", "60+ tools — plenary, voting, motions, committee, MEPs, declarations, procedures, analytical")
System_Ext(ep_portal, "EP Open Data Portal", "/api/v2/decision — voting fallback")
System_Ext(wb_mcp, "World Bank Open Data MCP", "Non-economic WDI")
System_Ext(imf_api, "IMF SDMX 3.0", "WEO / FM / IFS / BOP / ER / PCPS")
ContainerDb(analysis_dir, "analysis/daily/<date>/<type>/", "Markdown + JSON", "manifest.json + intelligence/ + classification/ + risk-scoring/ + threat-assessment/ + extended/")
ContainerDb(news_dir, "news/<slug>(-<lang>).{md,html}", "Markdown + HTML", "Per-language deliverables")
Rel(article_generator, analysis_dir, "reads manifest.json", "fs.readFileSync")
Rel(article_generator, artifact_order, "uses ARTIFACT_SECTIONS", "import")
Rel(article_generator, clean_artifact, "cleans each artifact", "fn")
Rel(article_generator, analysis_aggregator, "aggregateAnalysisRun()", "fn")
Rel(analysis_aggregator, markdown_renderer, "renderMarkdown()", "fn")
Rel(markdown_renderer, article_html, "wraps in HTML5 chrome", "fn")
Rel(article_html, article_metadata, "5-tier metadata resolver", "fn")
Rel(article_html, news_dir, "writes 14 HTML + 1 .md", "fs.writeFileSync")
Rel(news_indexes, news_dir, "writes index pages + sitemaps", "fs.writeFileSync")
Rel(ep_client, ep_mcp, "stdio JSON-RPC", "MCP")
Rel(ep_open_data, ep_portal, "voting fallback", "HTTPS")
Rel(wb_client, wb_mcp, "stdio JSON-RPC", "MCP")
Rel(imf_client, imf_api, "HTTPS/SDMX", "REST")
Rel(ep_client, mcp_health, "health + retry", "fn")
Rel(prior_run_diff, analysis_dir, "carry-forward plan", "JSON")
Rel(forward_statements, analysis_dir, "JSONL registry seeding", "fs")
UpdateLayoutConfig($c4ShapeInRow="3", $c4BoundaryInRow="2")
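The three-state voting fallback attributed to ep-open-data-client.ts in the diagram above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the VotingRecord shape, the injected fetcher functions and the marker string are hypothetical, and only the three states themselves come from this document.

```typescript
// Hypothetical record shape; the real types in src/types/ may differ.
interface VotingRecord {
  id: string;
  title: string;
}

type VotingResult =
  | { state: "mcp"; records: VotingRecord[] }
  | { state: "portal"; records: VotingRecord[] }
  | { state: "unavailable"; records: VotingRecord[]; marker: string };

type Fetcher = () => Promise<VotingRecord[]>;

/**
 * Three-state fallback:
 *  (a) MCP has data -> use it
 *  (b) MCP empty    -> query the Open Data Portal
 *  (c) both empty   -> return an explicit unavailability marker
 * Assumption: a fetcher that throws is treated like one that returns nothing.
 */
async function votingRecordsWithFallback(
  fromMcp: Fetcher,
  fromPortal: Fetcher,
): Promise<VotingResult> {
  const safe = async (fetcher: Fetcher): Promise<VotingRecord[]> => {
    try {
      return await fetcher();
    } catch {
      return [];
    }
  };

  const mcpRecords = await safe(fromMcp);
  if (mcpRecords.length > 0) return { state: "mcp", records: mcpRecords };

  // The portal is only queried when the MCP source produced nothing.
  const portalRecords = await safe(fromPortal);
  if (portalRecords.length > 0) return { state: "portal", records: portalRecords };

  // Hypothetical marker text; the document only says a red unavailability marker is emitted.
  return { state: "unavailable", records: [], marker: "🔴 voting records unavailable" };
}
```

Returning a tagged result rather than a bare array lets downstream analysis distinguish "no votes happened" from "no source could answer", which is the point of the third state.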
| 🧩 Component | 🎯 Responsibility | 🔗 Dependencies | 📂 File location |
|---|---|---|---|
| 🟢 Aggregator pipeline | Discover manifest.json → clean artifacts → aggregate (19-section canonical order, Provenance & Audit at end, .md only excluding data/runs/pass1/) → render Markdown → wrap HTML with TOC sidebar + shared chrome → write <slug>.en.md + 14 <slug>-<lang>.html | markdown-it + markdown-it-anchor/-footnote/-attrs/-deflist | src/aggregator/{article-generator,analysis-aggregator,markdown-renderer,article-html,artifact-order,clean-artifact,article-metadata}.ts |
| 🧠 Analysis artifacts | 51 templates per run (39 core + 12 extended) under analysis/daily/<date>/<type>/ with manifest.json declaring articleType + files map. 3-variant manifest schema (articleType / articleTypes[] / legacy runType) handled by resolveArticleTypeFromManifest() | 17 methodologies (10-step protocol, Rules 1–22) | analysis/methodologies/*.md, analysis/templates/**, analysis/daily/** |
| 🔌 EP MCP Client | 60+ EP tools via stdio JSON-RPC; safeCallTool() + callToolWithRetry() wrappers; recess-mode detection ([1952,2100] year window); slow-feed warning downgrade for get_events_feed | european-parliament-mcp-server@1.2.18+ (PR #405 normalises political-group codes) | src/mcp/ep-mcp-client.ts |
| 🗳️ EP Open Data fallback | Three-state voting fallback: (a) MCP has data → use it · (b) MCP empty → query /api/v2/decision · (c) both empty → 🔴 unavailability marker via virtual tool name ep-get-voting-records | EP Open Data Portal | src/mcp/ep-open-data-client.ts (see getVotingRecordsWithFallback()) |
| 💰 IMF Client | class IMFMCPClient + IMF_MCP_TOOLS; primary economic source per IMF Indicator Mapping; native Node 25 fetch SDMX 3.0; env IMF_API_BASE_URL, IMF_API_TIMEOUT_MS | None (REST) | src/mcp/imf-mcp-client.ts |
| 🌱 World Bank Client | WORLD_BANK_MCP_TOOLS; non-economic WDI indicators only (health, education, environment, governance, innovation) | worldbank-mcp (optional) | src/mcp/wb-mcp-client.ts |
| ⚖️ Stage-C completeness gate | Editorial agent-side review against .github/prompts/03-analysis-completeness-gate.md and analysis/methodologies/reference-quality-thresholds.json line floors. Replaces the purged runtime content-validator.ts | Methodology library + per-artifact thresholds | .github/prompts/03-…, analysis/methodologies/reference-quality-thresholds.json |
| 🔁 Prior-Run Diff | Re-run improve/extend helper; classifies prior-run artifacts as must-extend (carryForward[]) or below-floor rewrite; always-on (no env flag); emits priorRunDiff JSON with priorLines+extendFloor consumed by Stage B and Stage C | None | scripts/aggregator/prior-run-diff.js |
| 📜 Forward-statements registry | Canonical last-occurrence-per-id JSONL registry; week/month-ahead seeds data/forward-statements-open.json; Stage C enforces a "carried-forward forward statements" section when open items exist | JSONL registry | scripts/aggregator/forward-statements-registry.js, analysis/forward-statements/ |
| 🛡️ Shell-safety helper | Pre-audited bash helper for checkpoint-to-memory; replaces expansion-heavy inline workflow bash that the sandbox shell-safety filter would block | Bash | scripts/checkpoint-analysis-to-memory.sh |
| 🧠 Intelligence utilities | political-classification (7D), political-threat-assessment (5-framework), political-risk-assessment (5×5 L×I), significance-scoring, article-quality-scorer | Types | src/utils/*.ts |
| 🌐 News Indexes & Sitemap | Per-language news index pages, sitemap.xml, per-language sitemap_<lang>.html, hreflang alternates | Metadata, file-utils | src/generators/news-indexes.ts, src/generators/sitemap.ts |
| 🔢 Constants | ALL_LANGUAGES (14), LANGUAGE_PRESETS (all, eu-core, nordic), article-type slugs, committee indicator map | None | src/constants/*.ts |
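The forward-statements registry is described in the table above as a canonical last-occurrence-per-id JSONL registry. A minimal sketch of that idea follows, assuming a hypothetical record shape in which only the id field is implied by this document; the status and text fields are illustrative.

```typescript
// Hypothetical record shape; only `id` is implied by the registry description.
interface ForwardStatement {
  id: string;
  status: "open" | "closed";
  text: string;
}

/**
 * Parse a JSONL string and keep the LAST occurrence of each id, so a later
 * line supersedes every earlier line for the same statement.
 */
function canonicalRegistry(jsonl: string): Map<string, ForwardStatement> {
  const latest = new Map<string, ForwardStatement>();
  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // tolerate blank lines and a trailing newline
    const entry = JSON.parse(trimmed) as ForwardStatement;
    latest.set(entry.id, entry); // overwrite: last occurrence wins
  }
  return latest;
}

/** Statements whose most recent entry is still open (candidates for carry-forward). */
function openStatements(jsonl: string): ForwardStatement[] {
  return Array.from(canonicalRegistry(jsonl).values()).filter(
    (statement) => statement.status === "open",
  );
}
```

An append-only file with last-write-wins reads keeps the full history in Git while making the current state cheap to derive.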
sequenceDiagram
autonumber
participant CLI as CLI Interface
participant Gen as Article Generator
participant MCP as MCP Client
participant EPMCP as EP MCP Server
participant Tmpl as HTML Template
participant Meta as Metadata Manager
participant FS as File System Writer
CLI->>Gen: generate(type, languages)
Gen->>MCP: fetchEPData(type)
MCP->>EPMCP: query(endpoint, params)
EPMCP-->>MCP: return EP data
MCP-->>Gen: return parsed EP data
loop For each language (sequential)
Gen->>Tmpl: renderHTML(epData, lang)
Note over Gen,Tmpl: Current: placeholder English content<br/>Future (ADR-004): native LLM generation per language
Tmpl-->>Gen: return HTML
Gen->>FS: writeFile(path, html)
Gen->>Meta: recordGeneration(article, lang)
end
Meta->>FS: writeMetadata(json)
Gen-->>CLI: generation complete
| Pattern | Components Involved | Purpose | Error Handling |
|---|---|---|---|
| Cache-Aside (Planned) | MCP Client → LRU Cache → EP MCP Server | Reduce API calls, improve performance | Planned: cache miss triggers fresh fetch; current: direct calls to EP MCP Server |
| MCP Connection Retry with Backoff (Current) | MCP Client → EP MCP Server | Handle transient MCP connection failures | Connection attempts retried with backoff; individual MCP requests use a fixed timeout and are not retried |
| Validation Pipeline (Planned) | Content Validator → Article Generator | Ensure content quality | Planned: failed validation triggers regeneration (max 2 attempts); current: single-pass generation without regeneration loop |
| Sequential Multi-Language | Article Generator → HTML Template (per language) | Content generation per language | Current: failure in one language aborts remaining languages; Planned: per-language failures logged while other languages still generate; parallel generation planned (ADR-004) |
| Template Method | Article Generator → HTML Template → File System Writer | Consistent HTML generation | Template errors logged and propagated to prevent partial writes |
| Metadata Aggregation | Metadata Manager → File System Writer | Track generation history | Current: metadata written synchronously via writeFileSync; failures throw and fail the run. Planned: non-blocking, best-effort writes |
☁️ Infrastructure Focus: Shows how the system is deployed on GitHub infrastructure.
🚀 CI/CD Focus: Illustrates the automated deployment pipeline.
C4Deployment
title EU Parliament Monitor - Deployment Diagram
Deployment_Node(github_cloud, "GitHub Cloud", "GitHub Infrastructure") {
Deployment_Node(actions_runner, "GitHub Actions Runner", "Ubuntu 24.04") {
Container(workflow, "News Generation Workflow", "GitHub Actions YAML", "Daily scheduled workflow")
Container(node_runtime, "Node.js Runtime", "Node.js 25", "Executes generation scripts")
}
Deployment_Node(pages_cdn, "AWS Infrastructure", "S3 + CloudFront") {
Container(web_server, "Amazon CloudFront", "CDN / HTTPS", "Serves HTTPS content globally")
ContainerDb(static_content, "Amazon S3 Bucket", "Object Storage", "Generated articles and pages")
}
Deployment_Node(repo_storage, "GitHub Repository", "Git Storage") {
ContainerDb(git_repo, "Git Repository", "Version Control", "Source code and generated content")
}
}
Deployment_Node(user_device, "User Device", "Desktop/Mobile") {
Container(browser, "Web Browser", "Chrome/Firefox/Safari", "Renders news articles")
}
Deployment_Node(external_services, "External Services", "Cloud") {
System_Ext(ep_mcp, "EP MCP Server", "EP data access")
System_Ext(llm, "LLM Service", "Content generation")
}
Rel(workflow, node_runtime, "Executes", "Process")
Rel(node_runtime, ep_mcp, "Fetches data", "stdio/JSON-RPC")
Rel(node_runtime, llm, "Generates content", "HTTPS/API")
Rel(node_runtime, git_repo, "Commits files", "Git")
Rel(git_repo, static_content, "Deploys via", "S3 sync + CloudFront invalidation")
Rel(browser, web_server, "Requests pages", "HTTPS")
Rel(web_server, static_content, "Serves", "HTTP/2")
UpdateLayoutConfig($c4ShapeInRow="2", $c4BoundaryInRow="1")
| Infrastructure Component | Technology | Purpose | Configuration |
|---|---|---|---|
| GitHub Actions Runner | ubuntu-latest, Node.js 25 | Execute generation workflow | .github/workflows/news-*.lock.yml |
| Amazon CloudFront | AWS CDN | Serve static content globally | CloudFront distribution (deploy-s3.yml) |
| Amazon S3 | AWS Object Storage | Host static site files | S3 bucket (deploy-s3.yml) |
| Git Repository | GitHub Storage | Version control + content storage | public repository |
| Web Browser | Modern browsers | Render news articles | HTML5, CSS3, ES6+ |
| EP MCP Server | Local Node process | EP data access | Spawned locally via stdio JSON-RPC |
| LLM Service | External API | Content generation | API key authentication |
8 production article types are driven by 8 unified news-<type>.md workflows (Stage A→E in one ~45-min session, single PR per run). Article HTML is rendered deterministically by src/aggregator/article-generator.ts from committed Stage-B analysis artifacts — there are no per-type strategy modules in the post-April-2026 pipeline.
| 🏷️ Article Type | 🤖 gh-aw Workflow | 📅 Cadence |
|---|---|---|
| 🚨 breaking | news-breaking.md | Every 6 hours |
| 🔮 week-ahead | news-week-ahead.md | Fri 07:00 UTC |
| 📋 week-in-review | news-week-in-review.md | Sat 09:00 UTC |
| 📊 month-ahead | news-month-ahead.md | 1st of month 08:00 UTC |
| 📈 month-in-review | news-month-in-review.md | 28th of month 10:00 UTC |
| 🏛️ committee-reports | news-committee-reports.md | Mon–Fri 04:00 UTC |
| 🗳️ motions | news-motions.md | Mon–Fri 06:00 UTC |
| ⚖️ propositions | news-propositions.md | Mon–Fri 05:00 UTC |
Plus: news-translate.md (14-language translation helper, manual dispatch only).
All 9 news workflows are markdown source files compiled to YAML (.md → .lock.yml) via the GitHub Agentic Workflows CLI (gh aw compile --validate) with pinned GH_AW_VERSION: v0.69.0 in .github/workflows/compile-agentic-workflows.yml. See WORKFLOWS.md for the full surface.
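For orientation, the cadences listed for the eight article types could be written as five-field cron expressions of the kind GitHub Actions schedules accept (evaluated in UTC). The mapping below is one possible translation, not a copy of the real workflow files: the exact expressions and minute offsets are assumptions.

```typescript
// One possible cron translation of the cadence table.
// Field order: minute hour day-of-month month day-of-week (0 = Sunday).
// The expressions actually used in the workflow sources may differ.
const CADENCE_CRON: Record<string, string> = {
  "breaking": "0 */6 * * *",          // every 6 hours
  "week-ahead": "0 7 * * 5",          // Fri 07:00 UTC
  "week-in-review": "0 9 * * 6",      // Sat 09:00 UTC
  "month-ahead": "0 8 1 * *",         // 1st of month 08:00 UTC
  "month-in-review": "0 10 28 * *",   // 28th of month 10:00 UTC
  "committee-reports": "0 4 * * 1-5", // Mon-Fri 04:00 UTC
  "motions": "0 6 * * 1-5",           // Mon-Fri 06:00 UTC
  "propositions": "0 5 * * 1-5",      // Mon-Fri 05:00 UTC
};

/** Cheap shape check: a schedule must have exactly five whitespace-separated fields. */
function isFiveFieldCron(expression: string): boolean {
  return expression.trim().split(/\s+/).length === 5;
}
```

Staggering the three weekday workflows by an hour (04:00, 05:00, 06:00) keeps their runs from overlapping on the same runner pool.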
5-layer security model:
- AWF Squid firewall allowlist: egress HTTP allowlist per workflow
- Sandboxed Docker with restricted shell: bash tool-call contract (every call requires command + description); shell expansion restrictions
- Safe-output constraints: create-pull-request with max-patch-size (default 1024 KB; news-translate.md sets 10240 KB at top level for 14-language fan-out)
- JSONL audit trail: per-step structured logs
- Lock file compilation: .lock.yml is the immutable executed artifact; .md is the human source under review
MCP gateway (containerised): EP_MCP_GATEWAY_URL=http://host.docker.internal:80/mcp/european-parliament, provisioned by scripts/mcp-setup.sh.
Validator gates (Stage-C completeness review, agent-side; replaces purged runtime validators):
- .github/prompts/03-analysis-completeness-gate.md: protocol that the editorial agent runs
- analysis/methodologies/reference-quality-thresholds.json: per-artifact line floors
- Dynamic file resolution pattern (must not hallucinate file names):
ls -t "news/${TODAY}-${TYPE}"*"-en.html" | head -1
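The same "resolve, never guess" rule can be expressed in TypeScript. The sketch below mirrors the shell pattern above (newest file matching a prefix and the -en.html suffix); the Candidate shape and function name are hypothetical, and the selection step is kept pure so it can be tested without touching the file system.

```typescript
// Hypothetical listing entry: a file name plus its modification time in ms.
interface Candidate {
  name: string;
  mtimeMs: number;
}

/**
 * Equivalent of `ls -t "<prefix>"*"<suffix>" | head -1`:
 * keep names that start with the prefix and end with the suffix
 * (without the two overlapping), then return the most recently modified one.
 */
function pickNewest(
  files: Candidate[],
  prefix: string,
  suffix: string,
): string | undefined {
  const matches = files
    .filter(
      (file) =>
        file.name.length >= prefix.length + suffix.length &&
        file.name.startsWith(prefix) &&
        file.name.endsWith(suffix),
    )
    .sort((a, b) => b.mtimeMs - a.mtimeMs); // newest first, like `ls -t`
  return matches.length > 0 ? matches[0].name : undefined; // like `head -1`
}
```

In a real script the candidates would come from a directory listing of news/ (for example via Node's fs.readdirSync and fs.statSync); an undefined result signals that no article exists yet and should be handled explicitly rather than replaced with a guessed name.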
| Layer | Technology | Version | Purpose | Rationale |
|---|---|---|---|---|
| Runtime | Node.js | 25.x (engines: >=25); Node.js 26 LTS migration scheduled upon release (~Apr 2026) | JavaScript execution environment | Current release for latest features, performance improvements; ESM-native ("type": "module") |
| Language | TypeScript | 6.0.3 | Primary development language | Strict type safety; compiles from src/ → scripts/ targeting ES2025, module: NodeNext |
| Package Manager | npm | 10.x | Dependency management | Native Node.js package manager, security audit integration |
| Testing | Vitest | 4.1.4 | Unit and integration testing | Fast, ESM-native; happy-dom env; happy-dom@20.9.0 |
| E2E Testing | Playwright | 1.59.1 | End-to-end browser testing | @axe-core/playwright@4.11.2 for WCAG 2.1 AA |
| Linting | ESLint | 10.2.1 | Code quality and security | Flat config; plugins: eslint-plugin-sonarjs@4.0.3, eslint-plugin-security@4.0.0, eslint-plugin-jsdoc@62.9.0 |
| Formatting | Prettier | 3.8.3 | Code formatting | Opinionated formatter, consistent code style |
| Visualization | Chart.js | 4.5.1 | Dashboard charts in articles | Vendored into js/vendor/ via npm run copy-vendor |
| Visualization | D3 | 7.9.0 | Advanced visualizations | Used for specific intelligence views |
| Documentation | TypeDoc | 0.28.19 | API documentation generation | Generates docs/ pages from TypeScript sources |
| HTML Validation | HTMLHint | 1.9.2 | HTML5 validation | Pre-commit + CI |
| Duplicate Check | jscpd | 4.0.9 | Copy-paste detection | Scheduled quality audits |
| Technology | Current Version | Minimum Version | End-of-Life | Update Policy |
|---|---|---|---|---|
| Node.js | 25.x (current) | 25.0.0 (engines: >=25) | ~Apr 2026 (Current EOL; upgrading to Node.js 26 LTS) | Update to Node.js 26 LTS within days of release (~Apr 2026) |
| npm | 10.x (latest) | 10.0.0 | Follows Node.js lifecycle | Auto-updated with Node.js |
| TypeScript | 6.0.3 | 6.0.0 | N/A | Update to latest minor within 14 days, major within 90 days |
| Vitest | 4.1.4 | 4.0.0 | N/A | Update to latest minor within 14 days, major within 60 days |
| Playwright | 1.59.1 | 1.55.0 | N/A | Update to latest minor within 14 days, major within 60 days |
| ESLint | 10.2.1 | 10.0.0 | N/A | Update to latest minor within 14 days, major within 90 days |
| Prettier | 3.8.3 | 3.0.0 | N/A | Update to latest minor within 14 days, major within 90 days |
| Chart.js | 4.5.1 | 4.0.0 | N/A | Vendored; update with copy-vendor script |
| D3 | 7.9.0 | 7.0.0 | N/A | Vendored; update with copy-vendor script |
| TypeDoc | 0.28.19 | 0.28.0 | N/A | Major within 60 days |
| european-parliament-mcp-server | 1.2.13 (pinned) | 1.2.13 | Per upstream | Track releases; 1.2.11 (2026-04-20) fixes #377/#378 (fixed-window feeds, uniform unavailable envelope); 1.2.13 (2026-04-23) adds non-retryable UPSTREAM_404 for get_procedures, fixes search_documents envelope, enriches track_legislation timeline, improves get_procedures_feed error classification |
| worldbank-mcp | 1.0.1 (optional) | 1.0.0 | Per upstream | Biannual WDI refresh cadence |
| gh-aw CLI | v0.69.0 (pinned GH_AW_VERSION) | v0.69.0 | Per upstream | Workflow-level pin in compile-agentic-workflows.yml |
Production Dependencies (1 required + 1 optional):
- european-parliament-mcp-server@1.2.18: Primary data surface; 6 sliding-window feed tools (timeframe + startDate when custom) and 7 fixed-window feed tools (limit/offset only: documents, plenary_documents, committee_documents, plenary_session_documents, parliamentary_questions, corporate_bodies, controlled_vocabularies); returns uniform {status:"unavailable", items:[]} envelope on upstream failure.
- worldbank-mcp@1.0.1 (optionalDependencies): WDI macro/social/environment/health indicators.
IMF REST is integrated via native TypeScript fetch in src/mcp/imf-mcp-client.ts (class IMFMCPClient) — this is NOT an MCP server; calls go directly to https://dataservices.imf.org/REST/SDMX_3.0/. Env: IMF_API_BASE_URL, IMF_API_TIMEOUT_MS. Supplies WEO + FM monthly forecasts up to five years ahead.
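As a rough illustration of the environment-driven configuration described above, the sketch below resolves IMF_API_BASE_URL and IMF_API_TIMEOUT_MS and issues a request with native fetch under a timeout. It is not the IMFMCPClient code: the function names and the 30-second default timeout are assumptions, and no specific SDMX resource path is implied (the caller supplies it).

```typescript
interface ImfConfig {
  baseUrl: string;
  timeoutMs: number;
}

const DEFAULT_IMF_BASE_URL = "https://dataservices.imf.org/REST/SDMX_3.0/"; // from this document
const DEFAULT_IMF_TIMEOUT_MS = 30_000; // hypothetical default, not confirmed by the source

/** Resolve configuration from an env-like object (pass process.env in Node). */
function resolveImfConfig(env: Record<string, string | undefined>): ImfConfig {
  const base = env.IMF_API_BASE_URL || DEFAULT_IMF_BASE_URL;
  const rawTimeout = Number(env.IMF_API_TIMEOUT_MS);
  return {
    // A trailing slash makes relative paths append instead of replacing the last segment.
    baseUrl: base.endsWith("/") ? base : `${base}/`,
    timeoutMs:
      Number.isFinite(rawTimeout) && rawTimeout > 0 ? rawTimeout : DEFAULT_IMF_TIMEOUT_MS,
  };
}

/** Fetch a caller-supplied relative path; aborts once the configured timeout elapses. */
async function fetchImfJson(relativePath: string, config: ImfConfig): Promise<unknown> {
  const url = new URL(relativePath.replace(/^\/+/, ""), config.baseUrl);
  const response = await fetch(url, { signal: AbortSignal.timeout(config.timeoutMs) });
  if (!response.ok) {
    throw new Error(`IMF request failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

Keeping config resolution separate from the network call makes the defaults unit-testable without any HTTP traffic.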
Dev dependencies (notable): vitest@4.1.4, @vitest/ui, @vitest/coverage-v8, happy-dom@20.9.0, @playwright/test@1.59.1, @axe-core/playwright@4.11.2, typescript@6.0.3, eslint@10.2.1, eslint-plugin-sonarjs@4.0.3, eslint-plugin-security@4.0.0, eslint-plugin-jsdoc@62.9.0, prettier@3.8.3, htmlhint@1.9.2, typedoc@0.28.19, chart.js@4.5.1, d3@7.9.0, papaparse@5.5.3, husky@9.1.7, jscpd@4.0.9.
| Tool | Purpose | Integration | Configuration |
|---|---|---|---|
| CodeQL | SAST scanning | GitHub Actions (weekly + PR) | .github/workflows/codeql.yml |
| Dependabot | Dependency vulnerability scanning | GitHub native (daily) | .github/dependabot.yml |
| npm audit | Dependency security check | Pre-commit + CI | package.json scripts |
| ESLint Security | Security-focused linting | Pre-commit + CI | eslint.config.js (security plugin) |
| HTMLHint | HTML validation | CI pipeline | .htmlhintrc |
| Husky | Git hooks | Pre-commit, pre-push | .husky/ directory |
| Playwright | Accessibility testing | E2E test suite | playwright.config.js (axe integration) |
| Service | Purpose | Configuration | Cost |
|---|---|---|---|
| GitHub Actions | CI/CD automation | .github/workflows/ | Free (public repo) |
| AWS S3 | Static site hosting | S3 bucket + static website | Pay-per-use (storage, requests) |
| Amazon CloudFront | Content delivery | CloudFront distribution (S3) | Pay-per-use (data transfer, requests) |
| Git | Version control | Repository | Free (public repo) |
| Service | Purpose | Protocol | Authentication | Rate Limits | Cost Model |
|---|---|---|---|---|---|
| European Parliament MCP Server | EP data access | Local process (stdio JSON-RPC) | None (local process) | N/A (handled by MCP server / EP APIs) | Free (EP open data via MCP server) |
| LLM Service (OpenAI/Anthropic) | Content generation | HTTPS/JSON | API key (required) | Varies by provider | Pay-per-token |
| GitHub API | Repository operations | REST/GraphQL | GitHub token | 5000 req/hr | Free (authenticated) |
| Browser | Minimum Version | Features Required | Testing Coverage |
|---|---|---|---|
| Chrome/Edge | 90+ | ES2020, CSS Grid, Flexbox | ✅ Playwright E2E (Chromium in CI) |
| Firefox | 88+ | ES2020, CSS Grid, Flexbox | 🧪 Manual regression (no Playwright CI) |
| Safari | 14+ | ES2020, CSS Grid, Flexbox | 🧪 Manual regression (no Playwright CI) |
| Mobile Chrome | 90+ | ES2020, Responsive Design | 🧪 Manual responsive testing |
| Mobile Safari | 14+ | ES2020, Responsive Design | 🧪 Manual responsive testing |
No support for:
- Internet Explorer (EOL June 2022)
- Legacy Edge (EdgeHTML); only Chromium-based Edge is supported
TypeScript source in src/ is compiled to JavaScript in scripts/ via tsc. The generated JavaScript files are executed by Node.js during news generation. The public npm entry point is src/index.ts (published as euparliamentmonitor with SLSA Level 3 provenance attestations).
src/ → scripts/ (tsc compilation)
├── index.ts → index.js npm package entry point
├── constants/ → constants/
│ ├── config.ts Project paths, BASE_URL, filename patterns
│ ├── analysis-constants.ts Shared analysis thresholds
│ ├── committee-indicator-map.ts Committee → indicator mapping
│ ├── language-core.ts ALL_LANGUAGES (14), LANGUAGE_PRESETS
│ ├── language-articles.ts Per-language article-type labels
│ ├── language-ui.ts Per-language UI strings
│ └── languages.ts Language metadata (name, flag, direction)
├── mcp/ → mcp/
│ ├── ep-mcp-client.ts EP MCP stdio client; feed option types (no canonical EP_MCP_TOOLS export yet)
│ ├── wb-mcp-client.ts World Bank MCP client; exports WORLD_BANK_MCP_TOOLS
│ ├── imf-mcp-client.ts IMFMCPClient class (native fetch/SDMX 3.0); exports IMF_MCP_TOOLS
│ ├── mcp-connection.ts Connection lifecycle
│ ├── mcp-health.ts Health probes
│ └── mcp-retry.ts Exponential backoff retry
├── templates/ → templates/
│ ├── article-template.ts HTML5 article shell (SEO, JSON-LD, Open Graph)
│ └── section-builders.ts buildSiteFooter (single source of truth, 14-lang), stakeholder grid
├── aggregator/ → aggregator/ ⭐ April-2026 deterministic article renderer
│ ├── article-generator.ts Entry point CLI (`npm run generate-article`)
│ ├── analysis-aggregator.ts aggregateAnalysisRun() — manifest discovery, .md filter, Provenance & Audit at END
│ ├── artifact-order.ts ARTIFACT_SECTIONS — canonical 19-section order
│ ├── clean-artifact.ts Strips SPDX/banner/provenance front matter
│ ├── markdown-renderer.ts markdown-it + plugin allowlist (anchor, footnote, attrs, deflist)
│ ├── article-html.ts HTML5 wrapper: header, language switcher, TOC sidebar, JSON-LD, hreflang
│ └── article-metadata.ts 5-tier editorial-highlight resolver for <title> / <meta description>
├── generators/ → generators/ (post-aggregator-migration: only indexes & sitemap remain)
│ ├── news-indexes.ts Per-language index pages
│ └── sitemap.ts XML sitemap generator + per-language sitemap_<lang>.html
├── types/ → types/
│ ├── analysis.ts, common.ts, generation.ts, imf.ts, intelligence.ts, mcp.ts,
│ │ parliament.ts, political-classification.ts, political-risk.ts,
│ │ political-threats.ts, quality.ts, significance.ts, stakeholder.ts,
│ │ visualization.ts, world-bank.ts, index.ts
└── utils/ → utils/
├── article-category.ts, article-quality-scorer.ts, content-metadata.ts,
├── file-utils.ts, html-sanitize.ts, imf-data.ts,
├── intelligence-analysis.ts, intelligence-index.ts, metadata-utils.ts,
├── news-metadata.ts, political-classification.ts,
├── political-risk-assessment.ts, political-threat-assessment.ts,
├── significance-scoring.ts, world-bank-data.ts
(content-validator.ts, validate-articles.ts, validate-analysis-completeness.ts
PURGED in April-2026 — replaced by Stage-C agent-side review)
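The 5-tier editorial-highlight resolver listed for article-metadata.ts in the tree above (manifest override → first-artifact H1 → aggregated H1 → first strong prose → localized template) amounts to a first-non-empty-wins chain. The sketch below illustrates that ordering only; the TitleSources shape and function names are hypothetical and do not reflect the module's real exports.

```typescript
// Hypothetical input: one optional candidate per tier, plus a guaranteed template.
interface TitleSources {
  manifestOverride?: string;
  firstArtifactH1?: string;
  aggregatedH1?: string;
  firstStrongProse?: string;
  localizedTemplate: string; // final tier, always available
}

type Tier =
  | "manifest-override"
  | "first-artifact-h1"
  | "aggregated-h1"
  | "first-strong-prose"
  | "localized-template";

/** Extract the text of the first level-1 Markdown heading, if any. */
function firstH1(markdown: string): string | undefined {
  const match = /^#\s+(.+?)\s*$/m.exec(markdown);
  return match ? match[1] : undefined;
}

/** Walk the tiers in priority order and return the first non-blank candidate. */
function resolveTitle(sources: TitleSources): { title: string; tier: Tier } {
  const tiers: Array<[Tier, string | undefined]> = [
    ["manifest-override", sources.manifestOverride],
    ["first-artifact-h1", sources.firstArtifactH1],
    ["aggregated-h1", sources.aggregatedH1],
    ["first-strong-prose", sources.firstStrongProse],
  ];
  for (const [tier, candidate] of tiers) {
    const cleaned = candidate ? candidate.trim() : "";
    if (cleaned !== "") return { title: cleaned, tier };
  }
  return { title: sources.localizedTemplate, tier: "localized-template" };
}
```

Returning the winning tier alongside the title makes it easy to log how often the resolver falls through to the generic template.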
Key build / generation commands:
- npm run build: Runs tsc (TypeScript compilation src/ → scripts/)
- npm run lint: ESLint on src/
- npm run generate-news: Orchestrates strategies via the pipeline
- npm run generate-news-indexes: Executes scripts/generators/news-indexes.js (prebuild hook)
- npm run generate-sitemap: Executes scripts/generators/sitemap.js (prebuild hook)
- npm run copy-vendor: Vendors chart.js and d3 assets into js/vendor/
- npm run test / test:unit / test:integration / test:e2e / test:coverage: Test suite (52 test files, 3061+ passing tests)
Runtime JS (browser):
- js/index-runtime.js: Index page filter + theme toggle
- js/article-runtime.js: Reading progress + theme toggle
- External scripts only (no inline scripts; CSP-ready)
- js/vendor/: Vendored Chart.js (4.5.1) and D3 (7.9.0)
TypeScript configuration (tsconfig.json):
- target: ES2025 (modern JavaScript output)
- module: NodeNext (Node.js native ESM resolution)
- strict: true (full strict mode enabled)
- rootDir: ./src (TypeScript source root)
- outDir: ./scripts (compiled JavaScript output)
sequenceDiagram
participant GHA as GitHub Actions
participant CLI as CLI Interface
participant Gen as Article Generator
participant MCP as MCP Client
participant EP as EP MCP Server
participant TPL as Template Engine
participant FS as File System
GHA->>CLI: Trigger daily workflow
CLI->>Gen: generate-news --types=week-ahead --languages=all
Gen->>MCP: getPlenarySessions
Note over MCP,EP: MCP client spawns EP MCP Server as local process via stdio JSON-RPC
MCP->>EP: JSON-RPC request via stdio
EP-->>MCP: EP data as JSON-RPC response
MCP-->>Gen: Parsed EP data with basic shape checks
loop For each language sequentially
Gen->>TPL: Render HTML with EP data and language
Note over Gen,TPL: Placeholder English body content - native per-language LLM generation planned
TPL-->>Gen: HTML output
Gen->>FS: Write article file
end
Gen->>FS: Write metadata.json via writeFileSync
GHA->>GHA: Commit and push changes
GHA->>GHA: Deploy to S3 + invalidate CloudFront
sequenceDiagram
participant User as User Browser
participant CDN as CloudFront CDN
participant S3 as Amazon S3
participant Repo as Git Repository
User->>CDN: GET /index.html
CDN->>S3: Forward request (cache miss)
S3-->>CDN: HTML response
CDN-->>User: Cached HTML
User->>CDN: GET /news/week-ahead-2026-02-17-en.html
CDN-->>User: Cached article (or fetch from S3)
Cross-cutting concerns are aspects of the system that affect multiple components and layers. These concerns are implemented consistently across the entire architecture.
Logging Levels:
| Level | Usage | Output | Retention |
|---|---|---|---|
| ERROR | Unrecoverable errors (API failures, file write errors) | console.error(), GitHub Actions logs | 90 days (GitHub) |
| WARN | Recoverable issues (MCP connection retry/backoff, MCP tool fallback, JSON.parse recovery) | console.warn(), GitHub Actions logs | 90 days (GitHub) |
| INFO | Normal operations (generation start/complete, article count) | console.log(), GitHub Actions logs | 90 days (GitHub) |
| DEBUG | Detailed diagnostics (API responses, intermediate data) | Disabled in production | Dev only |
Structured Logging Format:
{
timestamp: "2026-02-20T10:30:00.000Z",
level: "INFO",
component: "ArticleGenerator",
action: "generate_article",
language: "en",
article_type: "week-ahead",
duration_ms: 1234,
status: "success"
}
Logging Implementation:
- Build Logs: All GitHub Actions workflow logs (generation, deployment, tests)
- Error Tracking: Errors logged to GitHub Actions workflow logs for visibility
- Performance Metrics: Generation time per article, API call durations
- Audit Trail: Git commit history serves as audit log for all content changes
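A small helper that produces entries in the structured format shown above might look like the following. It is a sketch under stated assumptions: the function names and the debugEnabled switch are hypothetical, while the field names and the mapping of levels to console methods follow the logging table.

```typescript
type LogLevel = "ERROR" | "WARN" | "INFO" | "DEBUG";

// Field names follow the structured logging format shown in this section.
interface LogFields {
  component: string;
  action: string;
  language?: string;
  article_type?: string;
  duration_ms?: number;
  status?: "success" | "failure";
}

type LogEntry = { timestamp: string; level: LogLevel } & LogFields;

/** Build one structured entry; `now` is injectable so tests are deterministic. */
function buildLogEntry(level: LogLevel, fields: LogFields, now: Date = new Date()): LogEntry {
  return { timestamp: now.toISOString(), level, ...fields };
}

/** Write the entry as a single JSON line using the console method for its level. */
function emitLogEntry(entry: LogEntry, debugEnabled = false): void {
  const line = JSON.stringify(entry);
  switch (entry.level) {
    case "ERROR":
      console.error(line);
      break;
    case "WARN":
      console.warn(line);
      break;
    case "INFO":
      console.log(line);
      break;
    case "DEBUG":
      // DEBUG is disabled in production per the table; opt in explicitly for development.
      if (debugEnabled) console.log(line);
      break;
  }
}
```

One JSON object per line keeps the output greppable in GitHub Actions logs and easy to post-process for the generation metrics mentioned above.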
graph TB
subgraph "Generation Monitoring"
Workflow[GitHub Actions Workflow]
GenMetrics[Generation Metrics<br/>Article count, Duration, Errors]
TestResults[Test Results<br/>Unit, Integration, E2E]
end
subgraph "Application Monitoring"
Pages[Amazon CloudFront + S3]
Analytics[Web Analytics<br/>Visits, Bounce Rate, Countries]
Uptime[Uptime Monitoring<br/>AWS Health Dashboard]
end
subgraph "Security Monitoring"
Dependabot[Dependabot Alerts]
CodeQL[CodeQL Security Scans]
Audit[npm audit]
end
subgraph "Alerting"
Email[Email Notifications]
GitHubUI[GitHub UI Alerts]
Status[Status Checks]
end
Workflow -->|Logs| GenMetrics
Workflow -->|Results| TestResults
Pages -->|Metrics| Analytics
Pages -->|Health| Uptime
Dependabot -->|Alerts| Email
CodeQL -->|Findings| GitHubUI
Audit -->|Vulnerabilities| Status
GenMetrics -->|Failures| Email
TestResults -->|Failures| Status
style Dependabot fill:#f99,stroke:#333,stroke-width:2px
style CodeQL fill:#f99,stroke:#333,stroke-width:2px
Monitoring Tools:
| Metric | Tool | Threshold | Alert |
|---|---|---|---|
| Build Success Rate | GitHub Actions | <95% over 7 days | Email to maintainers |
| Generation Duration | Workflow logs | >15 minutes | Warning annotation |
| Test Pass Rate | Vitest + Playwright | <100% | Block merge |
| Security Vulnerabilities | Dependabot + CodeQL | Any high/critical | Email + PR |
| Site Availability | AWS Health Dashboard | <99.9% | AWS Health event notification |
| Page Load Time | Lighthouse (manual runs) | >3 seconds | Warning annotation |
Error Handling Strategy:
flowchart TD
Start([API Call / Operation])
Try{Try Operation}
Success[✅ Success]
Catch{Catch Error}
Transient{Transient<br/>Error?}
Retry[Retry with<br/>Exponential Backoff]
MaxRetries{Max Retries<br/>Reached?}
Fallback{Fallback<br/>Available?}
UseFallback[Use Fallback Data]
LogError[Log Error]
PropagateError[Propagate Error]
GracefulDegradation[Graceful Degradation]
Start --> Try
Try -->|Success| Success
Try -->|Error| Catch
Catch --> Transient
Transient -->|Yes| Retry
Transient -->|No| Fallback
Retry --> MaxRetries
MaxRetries -->|No| Try
MaxRetries -->|Yes| Fallback
Fallback -->|Yes| UseFallback
Fallback -->|No| LogError
UseFallback --> GracefulDegradation
LogError --> PropagateError
style Success fill:#9f9,stroke:#333,stroke-width:2px
style LogError fill:#f99,stroke:#333,stroke-width:2px
style PropagateError fill:#f99,stroke:#333,stroke-width:2px
style GracefulDegradation fill:#ff9,stroke:#333,stroke-width:2px
Error Categories and Handling:
| Error Category | Examples | Retry Strategy | Fallback | User Impact |
|---|---|---|---|---|
| Transient Network Errors | MCP connection failure during startup, LLM API rate limit | Exponential backoff (1s, 2s, 4s), max 3 retries for MCP connection establishment and LLM calls; individual MCP requests use a single fixed timeout with no retry | Use placeholder events or skip affected items (no cache) | Missing or placeholder content for affected items |
| Permanent API Errors | Invalid API key, malformed request | No retry | Skip article generation for affected language | Missing article for specific language |
| Data Validation Errors | Invalid EP data structure, missing required fields | No automatic regeneration loop | Skip invalid items (no cached-data fallback) | Missing content for invalid items |
| File System Errors | Disk full, permission denied | No retry | Fail workflow | Build failure (no deployment) |
| Content Generation Errors | LLM refusal, prompt injection detected | Single generation attempt (no automatic regeneration loop) | Insert placeholder events when content generation fails | Reduced content quality or placeholder content |
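The retry policy stated in the table above (exponential backoff of 1s, 2s, 4s with at most 3 retries, applied only to transient errors) can be sketched as below. This is an illustrative helper, not the contents of mcp-retry.ts: the function names, the isTransient predicate and the injectable sleep are assumptions.

```typescript
/** Delay schedule: baseMs doubled on each retry, e.g. 1000 and 3 retries give 1s, 2s, 4s. */
function backoffDelays(baseMs: number, maxRetries: number): number[] {
  return Array.from({ length: maxRetries }, (_, attempt) => baseMs * 2 ** attempt);
}

interface RetryOptions {
  baseMs?: number;
  maxRetries?: number;
  sleep?: (ms: number) => Promise<void>; // injectable so tests do not wait
}

/**
 * Run `operation`, retrying only when `isTransient(error)` is true.
 * Non-transient errors, and the error from the final attempt, are rethrown
 * so the caller can decide on a fallback.
 */
async function withRetry<T>(
  operation: () => Promise<T>,
  isTransient: (error: unknown) => boolean,
  options: RetryOptions = {},
): Promise<T> {
  const { baseMs = 1000, maxRetries = 3 } = options;
  const sleep =
    options.sleep ??
    ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  const delays = backoffDelays(baseMs, maxRetries);

  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      if (!isTransient(error) || attempt >= delays.length) throw error;
      await sleep(delays[attempt]);
    }
  }
}
```

With the defaults this makes one initial attempt plus up to three retries; a caller would wrap only connection establishment in it, since individual MCP requests are described as not retried.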
Error Propagation:
- Component Level: Catch and log errors, attempt recovery
- Service Level: Propagate if unrecoverable, aggregate errors for reporting
- Workflow Level: Fail fast if critical (file system), continue if non-critical (single article failure)
14 Languages Supported:
- 🇬🇧 English (en) - 67 million
- 🇸🇪 Swedish (sv) - 10 million
- 🇩🇰 Danish (da) - 6 million
- 🇳🇴 Norwegian (no) - 5 million
- 🇫🇮 Finnish (fi) - 5 million
- 🇩🇪 German (de) - 95 million
- 🇫🇷 French (fr) - 67 million
- 🇪🇸 Spanish (es) - 47 million
- 🇳🇱 Dutch (nl) - 24 million
- 🇸🇦 Arabic (ar) - 420 million
- 🇮🇱 Hebrew (he) - 9 million
- 🇯🇵 Japanese (ja) - 125 million
- 🇰🇷 Korean (ko) - 77 million
- 🇨🇳 Chinese (zh) - 1.3 billion
i18n Architecture:
graph LR
subgraph "Content Generation"
EPData[EP Data<br/>Language-Neutral]
LLM[LLM Service]
Prompt[Language-Specific Prompt]
end
subgraph "14 Language Variants"
EN[English Article]
SV[Swedish Article]
DA[Danish Article]
NO[Norwegian Article]
FI[Finnish Article]
DE[German Article]
FR[French Article]
ES[Spanish Article]
NL[Dutch Article]
AR[Arabic Article]
HE[Hebrew Article]
JA[Japanese Article]
KO[Korean Article]
ZH[Chinese Article]
end
subgraph "Delivery"
Index[Language-Specific<br/>Index Pages]
Sitemap[Multilingual<br/>Sitemap.xml]
end
EPData --> LLM
Prompt --> LLM
LLM --> EN
LLM --> SV
LLM --> DA
LLM --> NO
LLM --> FI
LLM --> DE
LLM --> FR
LLM --> ES
LLM --> NL
LLM --> AR
LLM --> HE
LLM --> JA
LLM --> KO
LLM --> ZH
EN --> Index
DE --> Index
FR --> Index
ES --> Index
Index --> Sitemap
style EPData fill:#9cf,stroke:#333,stroke-width:2px
style LLM fill:#fc9,stroke:#333,stroke-width:2px
i18n Implementation:
| Aspect | Implementation | Example |
|---|---|---|
| Content Generation | Placeholder English content for all languages (current); native LLM per-language generation planned (ADR-004) | Current: shared English body with localized titles/subtitles; Future: each article written directly in target language |
| File Naming | Language suffix in filename | week-ahead-2026-02-17-en.html, week-ahead-2026-02-17-de.html |
| HTML lang Attribute | Set per page | <html lang="en">, <html lang="de"> |
| Navigation | Language-specific index pages | index.html, index-de.html |
| SEO | hreflang tags for alternate languages | <link rel="alternate" hreflang="de" href="..."> |
| Date Formatting | Locale-specific date formats | EN: "February 17, 2026", DE: "17. Februar 2026" |
| Character Encoding | UTF-8 for all languages | <meta charset="UTF-8"> |
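The file-naming, hreflang, and date-formatting rows can be illustrated with a few small helpers. This is a sketch under stated assumptions: the helper names and the `baseUrl` parameter are hypothetical, and the locale tags passed to `Intl.DateTimeFormat` are chosen for illustration rather than taken from the project.

```typescript
// Sketch only: helper names and the base URL are assumptions for illustration.
function articleFilename(slug: string, isoDate: string, lang: string): string {
  // Language suffix in the filename, e.g. week-ahead-2026-02-17-de.html
  return `${slug}-${isoDate}-${lang}.html`;
}

function hreflangLinks(
  baseUrl: string,
  slug: string,
  isoDate: string,
  langs: string[],
): string[] {
  // One alternate link per language variant of the same article.
  return langs.map(
    (lang) =>
      `<link rel="alternate" hreflang="${lang}" href="${baseUrl}/${articleFilename(slug, isoDate, lang)}">`,
  );
}

function formatArticleDate(isoDate: string, locale: string): string {
  // Format in UTC so the calendar day does not shift with the build host's timezone.
  return new Intl.DateTimeFormat(locale, {
    year: 'numeric',
    month: 'long',
    day: 'numeric',
    timeZone: 'UTC',
  }).format(new Date(`${isoDate}T00:00:00Z`));
}
```

For example, `articleFilename('week-ahead', '2026-02-17', 'de')` produces the second filename in the table, and `formatArticleDate('2026-02-17', 'en-US')` produces the English date format shown there.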
Language Quality Assurance:
- Current State: Placeholder English body content with localized metadata (title, subtitle, HTML lang attribute, date formats) per language
- Target State (ADR-004): LLM generates content natively in each language (not machine translation)
- Cultural Adaptation: Planned — prompts will include cultural context for each language/region
- Terminology Consistency: EP terminology to be used consistently per language
- Quality Metrics: Human review of sample articles per language quarterly
Architecture Decision Records document significant architectural decisions made during the design and development of EU Parliament Monitor. Each ADR captures the context, decision, and consequences of a specific architectural choice.
Status: Accepted
Date: 2025-12-01
Decision Makers: CEO, Development Team
Context:
- Need to display European Parliament news to public audience
- Security is paramount (public-facing system)
- Limited development resources
- GitHub Pages available as free hosting solution; AWS S3 + CloudFront chosen for production (see ADR-002)
Decision: We will build EU Parliament Monitor as a static site generator rather than a dynamic web application with backend services.
Rationale:
- Security: Static sites eliminate entire classes of vulnerabilities (SQL injection, XSS via server-side rendering, authentication bypass)
- Scalability: Static content scales infinitely via CDN with no server infrastructure
- Cost: Static hosting on AWS S3 + CloudFront is low-cost, no server infrastructure
- Maintainability: Simpler architecture with fewer moving parts
- Reliability: No database or server downtime risks
Alternatives Considered:
- WordPress: Rejected due to security vulnerabilities, plugin maintenance overhead
- Node.js/Express backend: Rejected due to hosting costs, operational complexity
- JAMstack with headless CMS: Rejected due to unnecessary complexity for simple content
Consequences:
- ✅ Positive: Minimal attack surface, no server infrastructure to operate, CDN-backed scalability
- ✅ Positive: Fast page loads, excellent SEO, simple deployment
- ⚠️ Negative: Content updates require regeneration (acceptable for daily news)
- ⚠️ Negative: No real-time interactivity (not required for news consumption)
Compliance: Aligns with ISO 27001 A.8.28 (Secure Development), NIST CSF PR.DS-5 (Minimal Attack Surface)
Status: Accepted
Date: 2025-12-05
Decision Makers: CEO, DevOps Team
Context:
- Static site architecture chosen (ADR-001)
- Need reliable, secure hosting with global CDN
- Budget constraints (low-cost solution preferred)
- Already using GitHub for source control and CI/CD
Decision:
We will host EU Parliament Monitor on AWS S3 with Amazon CloudFront as the global CDN (see .github/workflows/deploy-s3.yml).
Rationale:
- Cost: Low-cost static hosting within current traffic and budget constraints
- Integration: GitHub Actions CI/CD deploys to S3 and invalidates the CloudFront distribution
- Security: HTTPS via AWS Certificate Manager, TLS termination at CloudFront edge
- Reliability: AWS S3 and CloudFront SLAs provide high availability and durability
- Performance: CloudFront global edge network with caching for low-latency delivery
Alternatives Considered:
- GitHub Pages: Considered for simplicity and zero direct hosting cost; kept as a documented alternative but not chosen due to less flexible edge configuration
- Netlify: Rejected due to build minute limits on free tier
- Vercel: Rejected due to commercial focus, potential future costs
- Self-hosted Nginx: Rejected due to operational burden, security maintenance
Consequences:
- ✅ Positive: Globally distributed static hosting with strong reliability and performance
- ✅ Positive: Automated deployments from GitHub Actions to S3 with CloudFront cache invalidation
- ✅ Positive: Integration with AWS security services (WAF, Shield, ACM)
- ⚠️ Negative: Ongoing AWS hosting costs and need to manage AWS credentials securely
- ⚠️ Negative: Increased operational complexity compared to GitHub Pages
Compliance: Aligns with ISO 27001 A.8.24 (Cryptography - HTTPS), CIS Control 1 (Asset Management)
Status: Accepted
Date: 2025-12-10
Decision Makers: CEO, Data Team
Context:
- Need structured access to European Parliament data (MEPs, plenary sessions, votes, documents)
- Official EP APIs are fragmented, inconsistent, and poorly documented
- Data schemas vary across endpoints
- Need caching, validation, and error handling
Decision: We will access European Parliament data via the European Parliament MCP Server using the Model Context Protocol (MCP) rather than calling official EP APIs directly.
Rationale:
- Abstraction: MCP Server provides unified interface to fragmented EP APIs
- Data Normalization: Consistent data structures across EP data sources
- Error Handling: Connection retry logic and graceful degradation
- Maintainability: API changes isolated to MCP Server, not news generator
- Local Process: Spawned as stdio JSON-RPC process during build, no separate deployment needed
Alternatives Considered:
- Direct EP API calls: Rejected due to fragmentation, lack of validation, poor error handling
- Custom wrapper library: Rejected due to development overhead, maintenance burden
- Third-party EP data services: Rejected due to cost, data freshness concerns
Consequences:
- ✅ Positive: Clean separation of concerns, reusable data layer
- ✅ Positive: Standardized data structures, no direct EP API fragmentation
- ✅ Positive: MCP Server maintained separately, used by multiple clients
- ⚠️ Negative: Additional dependency (mitigated by fallback data strategy)
- ⚠️ Negative: Requires MCP Server process availability during build
Compliance: Aligns with ISO 27001 A.8.3 (Input Validation), NIST CSF PR.DS-2 (Data in Transit Protection)
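To make the "stdio JSON-RPC process" point concrete, the sketch below frames a single MCP tool call as a newline-delimited JSON-RPC 2.0 message, which is how the MCP stdio transport exchanges messages. The helper names are hypothetical, and the tool name and arguments in the usage note are assumptions rather than confirmed tools of the European Parliament MCP Server.

```typescript
// Sketch only: encodeToolCall/decodeLine are illustrative helpers, not project code.
interface JsonRpcRequest {
  jsonrpc: '2.0';
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

interface JsonRpcResponse {
  jsonrpc: '2.0';
  id: number;
  result?: unknown;
  error?: { code: number; message: string };
}

function encodeToolCall(
  id: number,
  toolName: string,
  args: Record<string, unknown>,
): string {
  const request: JsonRpcRequest = {
    jsonrpc: '2.0',
    id,
    method: 'tools/call', // MCP method for invoking a server tool
    params: { name: toolName, arguments: args },
  };
  // One JSON document per line; JSON.stringify emits no raw newlines.
  return JSON.stringify(request) + '\n';
}

function decodeLine(line: string): JsonRpcResponse {
  return JSON.parse(line) as JsonRpcResponse;
}
```

A client would write `encodeToolCall(1, 'get_plenary_sessions', { limit: 5 })` (tool name assumed) to the child process's stdin and match the response read from stdout by its `id`.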
Status: Accepted
Date: 2025-12-15
Decision Makers: CEO, Content Team
Context:
- Need to support 14 languages
- Machine translation often produces unnatural, awkward phrasing
- European Parliament terminology requires domain expertise
- Budget available for LLM API costs
Decision: We will generate content natively in each language using LLMs rather than translating from a base language.
Rationale:
- Quality: Native generation produces natural, idiomatic language
- Cultural Adaptation: LLM can adapt content for cultural context per language
- Terminology: LLM trained on EP documents uses correct terminology
- Flexibility: Different article structures possible per language/culture
- Scalability: Parallel generation for all languages
Alternatives Considered:
- Machine Translation (Google Translate, DeepL): Rejected due to unnatural phrasing, terminology issues
- Human Translation: Rejected due to cost (~€0.10/word x 14 languages), time delays
- English-only: Rejected due to accessibility concerns, limited audience
Consequences:
- ✅ Positive: High-quality, natural language content in all 14 languages
- ✅ Positive: Cultural adaptation, correct terminology
- ⚠️ Negative: Higher LLM API costs ($5-10/day) vs translation ($1-2/day)
- ⚠️ Negative: Content may vary slightly across languages (acceptable, even beneficial)
Compliance: Aligns with Hack23 AI Policy (Transparency, Human Oversight), ISO 27001 A.5.10 (Information Processing)
Status: Accepted
Date: 2026-01-05
Decision Makers: CEO, Development Team
Context:
- Building news generation scripts and static site generator
- Need compile-time type safety for complex data structures from EP MCP Server
- Multiple article categories, 14 languages, and complex data pipelines
- Small development team (1-2 developers) benefits from IDE support
Decision:
We will use TypeScript (strict mode) as the primary development language, compiling from src/ to scripts/ targeting ES2025.
Rationale:
- Type Safety: Strict mode catches errors at compile time, especially important for complex EP data structures and MCP client interfaces
- IDE Support: Full IntelliSense, refactoring, and navigation in VS Code
- Self-Documenting: TypeScript interfaces serve as living documentation for data models (ArticleCategory, LanguageCode, MCPToolResult, etc.)
- Build Pipeline: `tsc` compiles `src/*.ts` → `scripts/*.js`; `rootDir: ./src`, `outDir: ./scripts`, `target: ES2025`, `module: NodeNext`
- Ecosystem: Full access to Node.js and npm ecosystem with type definitions
Alternatives Considered:
- JavaScript (ES2025) with JSDoc: Rejected due to weaker type guarantees, less comprehensive IDE support for complex interfaces
- Flow: Rejected due to declining community support
- JavaScript ES2015: Rejected due to lack of modern features (optional chaining, nullish coalescing)
Consequences:
- ✅ Positive: Compile-time error detection, comprehensive IDE support, self-documenting code
- ✅ Positive: Strict null checks prevent runtime errors with optional EP data fields
- ⚠️ Negative: Requires build step (`npm run build` / `tsc`) before execution
- ⚠️ Negative: Slightly higher learning curve for contributors unfamiliar with TypeScript
Compliance: Aligns with Hack23 Secure Development Policy (Type Safety Principle), ISO 27001 A.8.28 (Secure Coding)
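As a small illustration of the strict-null-check benefit named above, the sketch below handles optional fields explicitly. The `PlenaryEventSketch` shape is hypothetical; the project's real interfaces (`ArticleCategory`, `LanguageCode`, `MCPToolResult`) are not reproduced in this document.

```typescript
// Sketch only: a made-up payload shape with optional fields.
interface PlenaryEventSketch {
  title: string;
  date: string;
  location?: string; // may be absent in the assumed payload
  documents?: { reference: string }[];
}

function describeEvent(event: PlenaryEventSketch): string {
  // Under strictNullChecks, using event.location or event.documents directly
  // as if always present is a compile-time error, so defaults are explicit.
  const location = event.location ?? 'location to be confirmed';
  const documentCount = event.documents?.length ?? 0;
  return `${event.date}: ${event.title} (${location}, ${documentCount} document(s))`;
}
```

The same pattern is what lets optional chaining and nullish coalescing replace ad hoc runtime checks on incomplete data.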
Status: Accepted
Date: 2026-04-27
Decision Makers: CEO, Development Team
Context:
- The EP publishes roll-call voting records with a 2–6 week lag after each plenary sitting.
- The previous `week-in-review` data window was D-0 → D-7 (the most recent 7 days).
- A D-0 → D-7 window structurally never contains published voting data, making the article vote-blind in every run regardless of content quality: a permanently empty input.
- `analysis/daily/2026-04-26/week-in-review/intelligence/methodology-reflection.md` §3.1 recommended shifting to a D-36 → D-8 window to systematically capture voting data.
Decision:
We shift the week-in-review analysis window to D-36 → D-8 (start = D-36, end = D-8 — a 28-day window ending 8 days ago, relative to the run date). This direction matches the workflow's DATE_FROM (start = D-36) → DATE_TO (end = D-8) variables. It is a 4-week look-back that consistently captures at least one full EP plenary week with published roll-call votes.
Rationale:
- Data depth over recency: A vote-populated analysis is more valuable than a vote-empty analysis that is 7 days more recent. Readers of the week-in-review expect vote coverage.
- Systematic: The window is deterministic and reproducible — it always yields voting data regardless of EP publication lag variance (2–6 weeks).
- Complementary to fallback: This window shift works alongside any future EP Open Data Portal fallback for historical roll-calls; the two are not mutually exclusive.
- Article framing updated: The `WEEKLY_REVIEW_TITLES` subtitles (all 14 languages) now read "last full reporting week" instead of "past week" to accurately describe the shifted window to readers.
- SEO metadata: The title date range already shows the exact `dateFrom`–`dateTo` window, so canonical URLs remain accurate without additional changes.
Alternatives Considered:
- Keep D-0→D-7 + add EP Open Data Portal fallback query for historical roll-calls: Complementary approach; can be combined with this shift but does not solve the structural vote-empty problem without the window shift.
- D-14 → D-8 (7-day window, offset by 8 days): Narrower window; may miss vote publication for sittings right at the 8-day boundary given the 2–6 week lag variance. Rejected in favour of the wider 28-day window.
Consequences:
- ✅ Positive: Every `week-in-review` run now reliably contains roll-call voting data.
- ✅ Positive: Analysis depth improves without increasing Stage B budget.
- ✅ Positive: Article subtitles accurately describe the reporting window in all 14 languages.
- ⚠️ Trade-off: Articles cover events from 8–36 days ago rather than the most recent 7 days; the workflow is less "breaking" but more analytically complete.
- ⚠️ Negative: In this ADR, the `DATE_FROM`/`DATE_TO` variables replace `LAST_WEEK` in `week-in-review` Stage A bash blocks; other workflows still using `LAST_WEEK` require separate migration if their reporting windows are changed.
Implementation:
- `src/aggregator/article-metadata.ts`: New `deriveReportingWindowForWeekInReview()` export computes D-36/D-8 from the article date; `buildTemplateFallback` uses it for `week-in-review`.
- `.github/workflows/news-week-in-review.md`: Stage A sets `DATE_FROM` (D-36) and `DATE_TO` (D-8); all MCP tool calls use these variables; `LAST_WEEK` removed.
- `src/constants/language-articles.ts`: `WEEKLY_REVIEW_TITLES` subtitles updated (14 languages).
Compliance: Aligns with Hack23 AI Policy (unambiguous date semantics in published articles), GDPR (accurate published metadata).
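The D-36 → D-8 arithmetic from this ADR can be sketched as follows. The signature of the real `deriveReportingWindowForWeekInReview()` export is not shown in this document, so the sketch uses a hypothetical function name and return shape, and does the date math in UTC to avoid daylight-saving shifts.

```typescript
// Sketch only: reportingWindowSketch and ReportingWindow are hypothetical names.
interface ReportingWindow {
  dateFrom: string; // start of window (D-36), ISO YYYY-MM-DD
  dateTo: string; // end of window (D-8), ISO YYYY-MM-DD
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

function reportingWindowSketch(
  articleDate: string,
  startOffsetDays = 36,
  endOffsetDays = 8,
): ReportingWindow {
  const anchor = Date.parse(`${articleDate}T00:00:00Z`);
  if (Number.isNaN(anchor)) {
    throw new RangeError(`Invalid ISO date: ${articleDate}`);
  }
  const toIsoDate = (ms: number): string =>
    new Date(ms).toISOString().slice(0, 10);
  return {
    dateFrom: toIsoDate(anchor - startOffsetDays * MS_PER_DAY),
    dateTo: toIsoDate(anchor - endOffsetDays * MS_PER_DAY),
  };
}
```

For an article dated 2026-04-27, this yields `dateFrom` 2026-03-22 and `dateTo` 2026-04-19, a 28-day span that ends 8 days before the run date.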
Non-functional requirements define system qualities that are not directly related to specific features but are critical to overall system success.
| Requirement | Target | Measurement | Current Status |
|---|---|---|---|
| Page Load Time (Desktop) | <1 second (LCP) | Lighthouse (manual runs) | ✅ 0.6s average |
| Page Load Time (Mobile) | <2 seconds (LCP) | Lighthouse (manual runs) | ✅ 1.2s average |
| Build Time (All Languages) | <15 minutes | GitHub Actions logs | ✅ 8-12 minutes |
| Article Generation (Single) | <30 seconds | Script logs | ✅ 15-25 seconds |
| MCP API Response Time | <2 seconds (p95) | Client logs | ✅ 1.1s average |
| CDN Cache Hit Rate | >95% | CloudFront metrics (planned) | ⏳ TBD — instrumentation planned |
Performance Optimization Strategies:
- Static Content: All content pre-generated, no server-side processing
- CDN Caching: Tiered caching strategy (1 hour for HTML, 1 day for metadata, 1 year for immutable assets)
- Image Optimization: None required (no images in MVP)
- Minification: HTML minification (future), CSS minification (future)
- HTTP/2: Enabled by default on Amazon CloudFront
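The tiered caching strategy above could be expressed as a mapping from file path to `Cache-Control` value, for example when setting object metadata during the S3 upload. This is a sketch: the function name, the file-extension groupings, and the treatment of XML/JSON files as "metadata" are assumptions, not the project's deploy configuration.

```typescript
// Sketch only: extension groupings are assumptions about what each tier covers.
const ONE_HOUR = 60 * 60;
const ONE_DAY = 24 * ONE_HOUR;
const ONE_YEAR = 365 * ONE_DAY;

function cacheControlFor(path: string): string {
  if (/\.html?$/i.test(path)) {
    return `public, max-age=${ONE_HOUR}`; // HTML: 1 hour
  }
  if (/\.(xml|json)$/i.test(path)) {
    return `public, max-age=${ONE_DAY}`; // metadata (assumed: sitemaps, feeds): 1 day
  }
  if (/\.(css|js|woff2?|svg|png|ico)$/i.test(path)) {
    return `public, max-age=${ONE_YEAR}, immutable`; // immutable assets: 1 year
  }
  return `public, max-age=${ONE_HOUR}`; // conservative default for unknown types
}
```

Defaulting unknown file types to the shortest tier keeps a misclassified file from being cached for a year.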
| Dimension | Current Capacity | Target Capacity | Scaling Strategy |
|---|---|---|---|
| Concurrent Users | Unlimited (static content) | Unlimited | CDN auto-scales |
| Daily Visitors | 10,000+ | 100,000+ | CDN bandwidth increase |
| Articles per Day | 14 (one per language) | 140 (ten per language) | Parallel generation, workflow optimization |
| Supported Languages | 14 | 24+ (expanded markets) | Add language configs, LLM prompts |
| Repository Size | 150 MB | 800 MB (GitHub limit) | Archive old articles annually |
Scalability Constraints:
- AWS S3: No repository size limit for static hosting; storage costs increase linearly
- GitHub Actions: 2,000 minutes/month on the free plan for private repositories; unlimited on standard GitHub-hosted runners for public repositories
- LLM API: Rate limits vary by provider (typically 3000 RPM for tier 2)
| Requirement | Target | Measurement | Consequence of Failure |
|---|---|---|---|
| Site Availability | 99.9% (AWS CloudFront/S3 SLA) | GitHub Status + AWS Health Dashboard | Users cannot access news |
| Build Success Rate | >98% | GitHub Actions logs | No new content deployed |
| MCP API Availability | >99% (best effort) | Health checks | Fallback to placeholder events (no cached/previous data) |
| LLM API Availability | >99.5% (provider SLA) | API logs | Generation fails once retries are exhausted |
| Recovery Time Objective (RTO) | <15 minutes | Manual testing | Time to restore service after outage |
| Recovery Point Objective (RPO) | <24 hours | Git history | Maximum data loss acceptable |
High Availability Strategies:
- Static Architecture: No single point of failure (SPOF) in runtime
- CDN Redundancy: Amazon CloudFront with multiple edge locations globally
- Fallback Data: Use placeholder events if EP MCP Server unavailable (no cache/previous-data reuse)
- Retry Logic: Exponential backoff for transient failures
- Monitoring: GitHub Status, Dependabot alerts, workflow notifications
| Requirement | Implementation | Verification | Compliance |
|---|---|---|---|
| HTTPS-Only | CloudFront enforces HTTPS redirect via ACM certificate | Manual testing | ISO 27001 A.8.24 |
| Content Security Policy (CSP) | Planned strict CSP via CloudFront response headers (no CSP meta tag in HTML templates currently) | CSP Evaluator (staging/production) | ISO 27001 A.8.23 |
| No Secrets in Repository | GitHub Secrets for API keys | Git history scan | ISO 27001 A.8.3 |
| Dependency Vulnerability Scanning | Dependabot daily scans | GitHub Security tab | CIS Control 10 |
| SAST (Static Application Security Testing) | CodeQL weekly + PR | GitHub Code Scanning | ISO 27001 A.8.28 |
| Access Control | GitHub RBAC, branch protection | Repository settings | CIS Control 6 |
| Audit Logging | GitHub audit logs, workflow logs | Logs API | ISO 27001 A.8.15 |
| Data Classification | All content PUBLIC | CLASSIFICATION.md | ISO 27001 A.5.10 |
| Incident Response | SECURITY.md procedures | Quarterly reviews | NIST CSF RS.RP |
Security Testing:
- SAST: CodeQL (weekly + PR) - JavaScript/TypeScript, HTML
- Dependency Scanning: Dependabot (daily) + npm audit (pre-commit)
- Manual Penetration Testing: Not required (static site, no user input)
- Security Reviews: Quarterly architecture review
| Criterion | Requirement | Implementation | Testing |
|---|---|---|---|
| Perceivable | Text alternatives, adaptable content, distinguishable | Semantic HTML5, alt text, contrast ratios | Playwright axe tests |
| Operable | Keyboard accessible, enough time, navigable, input modalities | Focus management, skip links, ARIA labels | Manual keyboard testing |
| Understandable | Readable, predictable, input assistance | lang attributes, consistent navigation, form labels | Lighthouse accessibility |
| Robust | Compatible with assistive technologies | Valid HTML5, ARIA roles | HTML validator |
Accessibility Targets:
- WCAG 2.1 AA Compliance: 100% (mandatory)
- Lighthouse Accessibility Score: >95% (target 100%)
- Keyboard Navigation: All interactive elements accessible
- Screen Reader Support: JAWS, NVDA, VoiceOver tested quarterly
Accessibility Testing:
- Automated: Playwright with axe-core (every PR)
- Manual: Quarterly screen reader testing, keyboard navigation
- Tools: Lighthouse (manual runs), axe DevTools, HTML validator
| Metric | Target | Current | Tool |
|---|---|---|---|
| Code Coverage | >80% lines | 82% | Vitest |
| Branch Coverage | >80% branches | 83% | Vitest |
| Cognitive Complexity | <15 per function | <10 average | ESLint sonarjs cognitive-complexity rule |
| Code Duplication | <3% | <2% | Manual review |
| Documentation Coverage | 100% public APIs | 95% | JSDoc, manual review |
| Build Time | <5 minutes (tests only) | 3-4 minutes | GitHub Actions |
Maintainability Practices:
- Code Review: All PRs require approval
- Documentation: Architecture, security, process docs maintained
- Testing: Unit (Vitest 4.1.4), Integration (incl. MCP contract tests), E2E (Playwright 1.59.1 + axe-core)
- Linting: ESLint 10.2.1 with `eslint-plugin-sonarjs@4.0.3`, `eslint-plugin-security@4.0.0`, `eslint-plugin-jsdoc@62.9.0`; Prettier 3.8.3 formatting
- Dependencies: Minimal (1 required production, 1 optional, ~40 dev), weekly Dependabot updates
- Minimal Attack Surface: Static architecture eliminates server-side vulnerabilities
- No Runtime Execution: Pure HTML/CSS with no backend processing
- Content Security Policy: Strict CSP via CloudFront response headers is planned to mitigate XSS (not yet enforced)
- HTTPS Only: All content delivered over HTTPS
- Generation: News generation scripts (TypeScript → Node.js)
- Presentation: Static HTML/CSS
- Data Access: MCP Client abstraction
- Infrastructure: GitHub Actions CI/CD with AWS S3 + CloudFront hosting
- 14 Languages Supported: Full multi-language coverage including RTL support
- Language-Specific Indexes: Separate navigation for each language
- SEO Per Language: Individual sitemaps and metadata
- Minimal Dependencies: One production dependency (`european-parliament-mcp-server` for build-time data access), only dev dependencies otherwise
- Standard Technologies: HTML5, CSS3, TypeScript (compiled to ES2025 JavaScript)
- Comprehensive Testing: Unit, integration, and E2E tests
- Documentation: Architecture, security, and process docs
- Static Content: Infinite scalability via CDN
- No Database: No scaling bottlenecks
- Cacheable: All content highly cacheable
- Managed Infrastructure: GitHub Actions for builds, Amazon CloudFront's global edge network for delivery
- Cold Start: N/A (static site, no cold starts)
- Page Load: < 1s (static HTML, CDN cached)
- Build Time: ~5-10 minutes (generation for all languages)
- Deployment Time: ~1-2 minutes (S3 sync + CloudFront invalidation)
- Target: 99.9% (AWS CloudFront/S3 SLA)
- Redundancy: CloudFront with multiple edge locations globally
- Failover: Automatic via AWS infrastructure
- Monitoring: AWS Health Dashboard, GitHub Status page
- Attack Surface: Minimal (static files only)
- Vulnerability Scanning: Daily (Dependabot + npm audit)
- SAST: Weekly (CodeQL)
- Compliance: ISO 27001, GDPR, NIS2, EU CRA aligned
- Code Complexity: Moderate (5-stage pipeline + deterministic aggregator modules in `src/aggregator/`; no SPA framework)
- Test Coverage: 82%+ lines, 83%+ branches across 52 test files; 3061+ passing tests (unit, integration incl. EP/IMF/WB MCP contract tests, E2E Playwright)
- Documentation: Comprehensive (25+ architecture & ISMS docs — see Architecture Documentation Map)
- Dependencies: 1 pinned production (`european-parliament-mcp-server@1.2.18`), 1 optional (`worldbank-mcp@1.0.1`), ~40 dev dependencies
- Security Architecture - Detailed security implementation and threat model
- Future Architecture - Architectural evolution roadmap
- Data Model - Data structures and EP/IMF/WB contracts
- Workflows - All 9 gh-aw + ~15 standard workflows, AI-First 2-pass enforcement
- End-of-Life Strategy - Technology lifecycle & EOL planning
- Flowcharts - Detailed process workflows
- State Diagrams - System state transitions
- Mindmaps - Conceptual system relationships
- SWOT Analysis - Strategic analysis and positioning
- README.md - Getting started guide and features overview
Document Status: Living Document
Last Updated: 2026-04-20
Next Review: 2026-07-20
Project Release: v0.8.40
Owner: CEO
This architecture documentation follows the C4 model methodology and complies with Hack23 ISMS Secure Development Policy.