
add paradedb sample app #125

Open
HarshCasper wants to merge 3 commits into main from paradedb-sample-app

HarshCasper commented Feb 4, 2026

Summary

  • Add ParadeDB movie search sample application demonstrating full-text search with LocalStack
  • Use AWS OpenSearch sample dataset with 5,000 movies including posters, ratings, runtime, and plot descriptions
  • Implement serverless architecture with Lambda, API Gateway, S3, and ParadeDB BM25 search index
  • Include web UI with movie poster thumbnails, fuzzy search, and pagination
  • Provide Makefile automation for dataset download, deployment, initialisation, and seeding

HarshCasper requested a review from whummer February 4, 2026 15:00

whummer left a comment

Overall looks great, kudos for churning out this sample app @HarshCasper ! 🚀

I had some issues running the web app. Can you please confirm whether script.js is missing from the repo, or did I miss one of the installation commands?

}

const pool = new Pool({
  host: process.env.PARADEDB_HOST || "paradedb.localhost.localstack.cloud",

nitpick: General observation (also noticed this in the Wiremock sample recently) - would be great if we could avoid these "if local then ..." switches, and rather put the environment configuration at a higher level, and keep the production code free of "if local" switches as much as possible. 👍 (probably something that could be added as a general agent instruction to AGENT.md somewhere..)
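
One way to read this suggestion, as a minimal sketch rather than the PR's actual code: the handler reads every connection setting from the environment and fails fast when one is missing, so no LocalStack hostname lives in the Lambda source. Only `PARADEDB_HOST` appears in the diff; the other variable names, `DbConfig`, `requireEnv`, and `loadDbConfig` are hypothetical.

```typescript
// Sketch only: PARADEDB_HOST appears in the diff; the other variable names are hypothetical.
type Env = Record<string, string | undefined>;

interface DbConfig {
  host: string;
  port: number;
  user: string;
  password: string;
  database: string;
}

// Fail fast instead of silently falling back to an environment-specific default.
function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Build the connection settings purely from the environment passed in.
function loadDbConfig(env: Env): DbConfig {
  const port = Number(requireEnv(env, "PARADEDB_PORT"));
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error("PARADEDB_PORT must be a positive integer");
  }
  return {
    host: requireEnv(env, "PARADEDB_HOST"),
    port,
    user: requireEnv(env, "PARADEDB_USER"),
    password: requireEnv(env, "PARADEDB_PASSWORD"),
    database: requireEnv(env, "PARADEDB_DATABASE"),
  };
}
```

In the handler this could be wired up as `new Pool(loadDbConfig(process.env))`, with the CDK stack supplying the LocalStack values and a production stack supplying its own.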

});

const s3Client = new S3Client({
  endpoint: "http://s3.localhost.localstack.cloud:4566",

Similar to the above, would be great to externalize this endpoint configuration, to keep the Lambda code as close to production as possible, and only manage overrides via the environment..
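
A possible shape for this, sketched under assumptions: the endpoint becomes an optional override that is applied only when an environment variable is set, so the same Lambda code runs unchanged against real AWS. `S3_ENDPOINT_URL` and `s3OverridesFromEnv` are hypothetical names, not part of the PR.

```typescript
// Sketch only: S3_ENDPOINT_URL is a hypothetical variable name, not one used in the PR.
interface S3Overrides {
  endpoint?: string;
}

function s3OverridesFromEnv(env: Record<string, string | undefined>): S3Overrides {
  const endpoint = env["S3_ENDPOINT_URL"];
  // No override set: return an empty object so the SDK uses its default endpoint resolution.
  if (endpoint === undefined || endpoint === "") {
    return {};
  }
  // Reject obviously malformed values early rather than on the first S3 call.
  if (!/^https?:\/\//.test(endpoint)) {
    throw new Error("S3_ENDPOINT_URL must start with http:// or https://");
  }
  return { endpoint };
}
```

The client construction would then be `new S3Client(s3OverridesFromEnv(process.env))`, and the LocalStack URL would be set only in the local stack's environment configuration.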

whummer commented Feb 6, 2026

Btw, just asked Claude for a review of the branch, just out of curiosity, and for comparison. Some of the points I've mentioned above are also included in there, and there are some additional nitpick comments (which I don't think are that important, tbh). Just wanted to share.. 🙂 👍

  Summary: Adds a sample movie search app demonstrating ParadeDB BM25 full-text search on LocalStack, using CDK (Lambda + API Gateway + S3) with a static web UI.

  Issues

  1. Missing web/script.js: web/index.html:53 loads <script src="script.js"> but no script.js file exists in the web/ directory. The web UI is non-functional as shipped.
  2. getLambdaCode() called 4 times: lib/movie-search-stack.ts:47-68 defines a getLambdaCode() function that bundles the Lambda on every invocation. This means esbuild runs 4 times during cdk synth and produces 4 separate asset copies. Should be called once and the result reused.
  3. Seed handler is not batched: lambda/index.ts:266-291 issues individual INSERT queries in a loop (5000 sequential queries). Should use multi-row INSERT or COPY for meaningful performance improvement. The batchSize = 100 variable at line 261 is misleading since there's no actual batching; each movie is still a separate query.
  4. No transaction around seed: The DELETE FROM movies at line 258 followed by individual inserts means a failure mid-seed leaves the table in a partial state. Wrap in a transaction (BEGIN/COMMIT).
  5. Pool defaults vs. stack defaults mismatch: lambda/index.ts:24-28 has hardcoded defaults (postgres/postgres/5432) that differ from the CDK environment variables in movie-search-stack.ts:33-38 (myuser/mypassword/4566/mydatabase). The defaults are never used at runtime since the env vars are set, but they're misleading.
  6. Hardcoded S3 endpoint: lambda/index.ts:31-37 hardcodes s3.localhost.localstack.cloud:4566. This should come from an environment variable for consistency with the rest of the config.
  7. package-lock.json committed for lambda: The 2357-line lambda/package-lock.json is committed. Consider whether this is intentional for a sample app, or if it should be in .gitignore.
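
Items 3 and 4 above could be addressed together. The following is a rough sketch, not the handler from the PR: it groups rows into multi-row parameterized INSERT statements and wraps the whole seed in BEGIN/COMMIT with ROLLBACK on failure. The `Movie` columns (`id`, `title`, `plot`) are hypothetical, since the real schema is not shown in this thread, and `Queryable` is a hypothetical minimal type that a node-postgres client's `query(text, values)` method is expected to fit.

```typescript
// Hypothetical row shape; the real columns in lambda/index.ts are not shown here.
interface Movie {
  id: string;
  title: string;
  plot: string;
}

// Hypothetical minimal client type, so the sketch needs no database driver import.
interface Queryable {
  query(text: string, values?: unknown[]): Promise<unknown>;
}

// Build one INSERT covering every row in the batch, with $1, $2, ... placeholders.
function buildMultiRowInsert(movies: Movie[]): { text: string; values: unknown[] } {
  const values: unknown[] = [];
  const tuples = movies.map((m, i) => {
    values.push(m.id, m.title, m.plot);
    const base = i * 3;
    return `($${base + 1}, $${base + 2}, $${base + 3})`;
  });
  return {
    text: `INSERT INTO movies (id, title, plot) VALUES ${tuples.join(", ")}`,
    values,
  };
}

// Clear and reseed atomically: a failure in any batch rolls back the DELETE as well.
async function seedMovies(client: Queryable, movies: Movie[], batchSize = 100): Promise<void> {
  await client.query("BEGIN");
  try {
    await client.query("DELETE FROM movies");
    for (let i = 0; i < movies.length; i += batchSize) {
      const { text, values } = buildMultiRowInsert(movies.slice(i, i + batchSize));
      await client.query(text, values);
    }
    await client.query("COMMIT");
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  }
}
```

With node-postgres, a transaction has to run on a single client checked out via `pool.connect()` rather than through `pool.query`, because the pool may dispatch each call to a different connection. With a batch size of 100, seeding 5,000 rows takes 50 INSERT statements instead of 5,000.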

  Minor / Style

  - The README is thorough and well-structured — unusual for AI-generated content, no issues there.
  - The Makefile get-api-url target is a useful utility but isn't listed in make help.
  - CORS is correctly configured in both the API Gateway and Lambda response headers.

  Verdict

  The main blocker is the missing script.js; the web UI literally can't work without it. The getLambdaCode() duplication and unbatched inserts are worth fixing. Otherwise the structure is reasonable for a sample app.
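
For the getLambdaCode() duplication, one option is simply to call it once and store the result in a constant. If the call sites are spread out, a small memoizing wrapper gives the same effect. The sketch below is generic and does not reproduce the stack code; `once` is a hypothetical helper, and it assumes the bundled code object can safely be shared by all four Lambda definitions.

```typescript
// Hypothetical helper: run an expensive factory at most once and reuse its result.
function once<T>(factory: () => T): () => T {
  // Wrap the value so that a factory legitimately returning undefined is still cached.
  let cached: { value: T } | undefined;
  return () => {
    if (cached === undefined) {
      cached = { value: factory() };
    }
    return cached.value;
  };
}
```

In the stack this might look like `const getLambdaCode = once(() => /* existing bundling logic */)`, so that four calls trigger a single esbuild run and yield one asset.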
