@nearestnabors nearestnabors commented Feb 4, 2026

Currently, our public markdown files are littered with janky JSX components that LLMs can't read, and they link to HTML files instead of other markdown files. This PR fixes that.

Summary

  • Generate clean markdown from rendered HTML pages during build time
  • Update /api/markdown endpoint to serve pre-generated clean markdown (falls back to real-time conversion)
  • Add CopyPageOverride component that intercepts the "Copy page" button to fetch clean markdown from the API
  • Add frontmatter (title, description) extracted from HTML <meta> tags to generated markdown files
  • Add public/_markdown/ to .gitignore
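
The frontmatter step above could look roughly like the sketch below. It is not the code from this PR: `extractMeta` and `toFrontmatter` are hypothetical names, the title is assumed to come from the `<title>` element, and the description is assumed to come from a `<meta name="description" content="...">` tag with double-quoted attributes in that order.

```typescript
// Hypothetical helpers; the PR's actual implementation is not shown in this description.
type PageMeta = { title?: string; description?: string };

// Regexes are declared at the top level so they are compiled once.
// Assumption: attributes are double-quoted and `name` precedes `content`.
const TITLE_RE = /<title[^>]*>([^<]*)<\/title>/i;
const DESCRIPTION_RE = /<meta\s+name="description"\s+content="([^"]*)"[^>]*>/i;

// Pull the title and description out of a rendered HTML document.
function extractMeta(html: string): PageMeta {
  const title = TITLE_RE.exec(html)?.[1]?.trim();
  const description = DESCRIPTION_RE.exec(html)?.[1]?.trim();
  return { title: title || undefined, description: description || undefined };
}

// Render the metadata as a YAML frontmatter block.
// JSON.stringify yields a double-quoted string, which is also a valid YAML scalar.
function toFrontmatter(meta: PageMeta): string {
  const lines = ["---"];
  if (meta.title) lines.push(`title: ${JSON.stringify(meta.title)}`);
  if (meta.description) lines.push(`description: ${JSON.stringify(meta.description)}`);
  lines.push("---", "");
  return lines.join("\n");
}
```

Prepending `toFrontmatter(extractMeta(html))` to the converted body would give each generated file a small header. A regex is enough for two well-known tags, but a real HTML parser would be more robust against attribute reordering.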

Changes

  • scripts/generate-clean-markdown.ts: New script that runs a production server, fetches rendered HTML, and converts to clean markdown using Turndown
  • app/api/markdown/[[...slug]]/route.ts: Updated to serve pre-generated markdown first, with fallback
  • app/_components/copy-page-override.tsx: Client component that intercepts copy button clicks
  • app/_components/custom-layout.tsx: Includes the CopyPageOverride component
  • scripts/generate-llmstxt.ts: Uses pre-generated clean markdown if available
  • package.json: Added build scripts for generating clean markdown
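
As an illustration of the "pre-generated first, with fallback" behavior in the route, here is a minimal sketch. It is not the handler from this PR: `pregeneratedPath` and `resolveMarkdown` are hypothetical names, the file layout under `public/_markdown/` is assumed, and the file reader and live converter are passed in as parameters so the logic stays independent of the framework.

```typescript
// Hypothetical sketch; the real handler lives in app/api/markdown/[[...slug]]/route.ts.
type MarkdownSource = "pregenerated" | "realtime";
type MarkdownResult = { body: string; source: MarkdownSource };

const MD_EXT_RE = /\.md$/;

// Map URL slug segments to a file under the pre-generated output directory.
// Assumption: /api/markdown/en/home.md maps to public/_markdown/en/home.md.
function pregeneratedPath(slug: string[], root = "public/_markdown"): string {
  const joined = slug.join("/").replace(MD_EXT_RE, "");
  return `${root}/${joined}.md`;
}

// Try the pre-generated file first; if reading it fails, convert on the fly.
// Note: this falls back on any read error. A real handler might only fall back
// when the file is missing and surface other errors.
async function resolveMarkdown(
  slug: string[],
  readFile: (path: string) => Promise<string>,
  convertLive: (slug: string[]) => Promise<string>,
): Promise<MarkdownResult> {
  try {
    const body = await readFile(pregeneratedPath(slug));
    return { body, source: "pregenerated" };
  } catch {
    const body = await convertLive(slug);
    return { body, source: "realtime" };
  }
}
```

Returning the `source` alongside the body makes it easy to log or assert which path served a request, which helps when checking that the build step actually produced the files.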

Test plan

  • Run pnpm build to verify the build succeeds
  • Run pnpm generate:clean-markdown to verify markdown generation works
  • Visit any page and click "Copy page" to verify it copies clean markdown
  • Check that /api/markdown/en/home.md returns markdown with frontmatter

🤖 Generated with Claude Code

- Generate clean markdown from rendered HTML pages during build
- Update /api/markdown endpoint to serve pre-generated clean markdown
- Add CopyPageOverride component to fetch clean markdown on "Copy page"
- Add frontmatter (title, description) extracted from HTML meta tags
- Fix linting issues with top-level regex and simplified logic

Co-Authored-By: Claude Opus 4.5 <[email protected]>

vercel bot commented Feb 4, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| docs | Ready | Preview, Comment | Feb 4, 2026 0:46am |

@evantahler evantahler left a comment

Looks like the test has been running for a few hours... there might be something preventing the process from exiting (a dangling promise?)

I'd also love to see a test for one of the clean markdown files that initially had some HTML that was removed successfully
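
One possible shape for the requested test is sketched below. It is only an assumption about what "clean" means here: `findLeftoverTags` is a hypothetical helper that reports HTML or JSX tags remaining outside fenced code blocks and inline code, so a test can assert that the result is empty for a generated file.

```typescript
// Hypothetical helper for a "no leftover HTML/JSX" test; not part of this PR.
const FENCE_RE = /^(`{3,}|~{3,})/;
const INLINE_CODE_RE = /`[^`]*`/g;
// Matches opening, closing, and self-closing tags such as <Callout type="info">,
// </Callout>, <br/>, and <Tabs.Tab>. Autolinks like <https://example.com> do not match.
const TAG_RE = /<\/?[A-Za-z][A-Za-z0-9.]*(\s[^<>]*)?\/?>/g;

function findLeftoverTags(markdown: string): string[] {
  const found: string[] = [];
  let inFence = false;
  for (const line of markdown.split("\n")) {
    // Toggle on fence delimiters and skip everything inside fenced code.
    if (FENCE_RE.test(line.trim())) {
      inFence = !inFence;
      continue;
    }
    if (inFence) continue;
    // Tags inside inline code are legitimate documentation, so drop them first.
    const withoutInlineCode = line.replace(INLINE_CODE_RE, "");
    const matches = withoutInlineCode.match(TAG_RE);
    if (matches) found.push(...matches);
  }
  return found;
}
```

A test could read a source page known to contain JSX, read the matching generated file, and assert that `findLeftoverTags` returns an empty array for the generated output while returning at least one tag for the source.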
