genlayerlabs · jmlago · Jun 22, 2026 · Jun 22, 2026
diff --git a/.env.example b/.env.example
@@ -32,3 +32,29 @@ PUBLIC_BASE_URL=http://127.0.0.1:8080/v1
 RATE_PER_MIN=600
 BURST=200
 LOG_LEVEL=INFO
+
+# ============================================================================
+# AntSeed marketplace (OPTIONAL — only with `docker compose --profile antseed`)
+# ============================================================================
+# AntSeed lets the router buy inference from a decentralized marketplace, paid in
+# REAL USDC on Base mainnet from a hot wallet you control. Off by default.
+#
+# ⚠️  ALWAYS use a DEDICATED DEV WALLET here with a tiny balance — NEVER your
+#     production wallet key. This var IS a private key: treat it like a password,
+#     never commit it (.env / .env.secrets are gitignored).
+#
+# Setup (3 steps):
+#   1. Generate a dev wallet:   ./scripts/gen-dev-wallet.sh   (prints the two
+#      lines below; paste them into .env)
+#   2. Bring it up:             docker compose --profile antseed up -d --build
+#   3. Get the address to fund: docker compose exec antseed antseed buyer balance --json
+#      then send a little USDC + ETH (gas) on **Base mainnet** to that address,
+#      and `deposit` it into escrow from the dashboard Catalog (wallet cell).
+ANTSEED_IDENTITY_HEX=
+# Shared secret enabling the dashboard's wallet self-service (deposit/withdraw).
+# Same value on the router and the antseed sidecar. Unset => those endpoints 503.
+ANTSEED_CONTROL_TOKEN=
+# Wide outer spend ceilings (USD per million tokens). The real per-call price gate
+# is the caller's Σ_pol policy; these are just rails.
+ANTSEED_MAX_INPUT=1000
+ANTSEED_MAX_OUTPUT=1000
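
A preflight check along these lines could catch a half-filled AntSeed block before the sidecar starts. This is a minimal sketch and not part of the diff: `check_antseed_env` is a hypothetical helper, and the key format (32 bytes written as 64 hex characters, optionally `0x`-prefixed) is an assumption the diff does not confirm. It never echoes the key itself.

```python
import re

# Assumption: the identity is a 32-byte private key as 64 hex chars, optional 0x prefix.
_HEX_KEY = re.compile(r"^(0x)?[0-9a-fA-F]{64}$")


def check_antseed_env(env):
    """Return a list of problems; an empty list means the block looks usable."""
    problems = []
    identity = env.get("ANTSEED_IDENTITY_HEX", "")
    if not identity:
        # AntSeed is off by default: with no identity there is nothing to validate.
        return problems
    if not _HEX_KEY.match(identity):
        problems.append("ANTSEED_IDENTITY_HEX is not 64 hex characters")
    if not env.get("ANTSEED_CONTROL_TOKEN"):
        problems.append("ANTSEED_CONTROL_TOKEN unset: deposit/withdraw endpoints return 503")
    for name in ("ANTSEED_MAX_INPUT", "ANTSEED_MAX_OUTPUT"):
        raw = env.get(name, "1000")  # default taken from .env.example
        try:
            if float(raw) <= 0:
                problems.append(f"{name} must be positive")
        except ValueError:
            problems.append(f"{name} is not a number: {raw!r}")
    return problems
```

Passing `dict(os.environ)` (or a parsed `.env`) would be the typical call; an empty result means the three settings are at least well-formed.
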
diff --git a/behave.ini b/behave.ini
@@ -0,0 +1,4 @@
+[behave]
+paths = features
+tags = -manual
+show_timings = true
diff --git a/docs/PROVIDERS.md b/docs/PROVIDERS.md
@@ -46,6 +46,20 @@ The host pins the policy-selected peer per request via `x-antseed-pin-peer`
 (the browse-mode buyer disables auto-selection), keeping peer choice inside
 Σ_pol rather than an opaque buyer-side router.
 
+### Local dev wallet (testing)
+
+For local testing use a **dedicated dev wallet**, never your production key.
+`./scripts/gen-dev-wallet.sh` prints a fresh `ANTSEED_IDENTITY_HEX` and
+`ANTSEED_CONTROL_TOKEN` to paste into `.env`. Bring the sidecar up
+(`docker compose --profile antseed up -d --build`), then read the derived address
+with `docker compose exec antseed antseed buyer balance --json`. Fund it with a
+little USDC + ETH (gas) on **Base mainnet**, then **Deposit** into escrow from the
+dashboard Catalog (wallet cell). Keep dev and prod wallet secrets separate. See `.env.example`.
+
+> Note: the AntSeed deposits contract **locks** deposited funds — an immediate
+> `withdraw` after a `deposit` reverts. Funds are safe in escrow and become
+> withdrawable later, or are spent as the buyer routes paid calls.
+
 ### Running the node (vendored sidecar)
 
 Built from `Dockerfile.antseed` (pinned `@antseed/cli`, `socat`) and run by

diff --git a/features/01_onboarding.feature b/features/01_onboarding.feature
@@ -0,0 +1,34 @@
+Feature: Onboarding & setup — a new user gets a running, healthy stack
+  The clone/compose steps themselves are environment-level (@manual: cannot be
+  re-run inside the suite); here we assert their OUTCOME on the running stack.
+
+  @p0 @onboarding
+  Scenario: The core engine submodule is populated (recursive clone outcome)
+    Then the file "core/router.lua" exists
+    And the file "core/llm_policy.lua" exists
+
+  @p0 @onboarding
+  Scenario: The stack is up and healthy (compose up outcome)
+    Given the stack is healthy
+    When I GET "/healthz" as none
+    Then the status is 200
+    And the field "ok" equals "True"
+
+  @p0 @onboarding
+  Scenario: The router loaded its catalog (engine embedded + config.live.lua)
+    Given I have a caller token
+    When I GET "/v1/models" as consumer
+    Then the status is 200
+    And the array "data" has at least 5 items
+
+  @manual @onboarding
+  Scenario: Recursive clone (manual — run once on a fresh machine)
+    # git clone --recursive https://github.com/genlayerlabs/unhardcoded.git
+    # -> core/ submodule populated; covered by the 'submodule populated' outcome above.
+    Given the stack is healthy
+
+  @manual @onboarding
+  Scenario: docker compose up --build (manual — environment setup)
+    # cp .env.example .env.secrets; fill secrets; docker compose up -d --build
+    # -> router + ingress healthy; covered by the 'stack up and healthy' outcome above.
+    Given the stack is healthy
diff --git a/features/02_auth.feature b/features/02_auth.feature
@@ -0,0 +1,39 @@
+Feature: Authentication — dashboard sessions and the caller bearer contract
+
+  Background:
+    Given the stack is healthy
+
+  @p0 @auth
+  Scenario: DASHBOARD_NO_AUTH grants local admin to the console API
+    When I GET "/dashboard/api/stats" as admin
+    Then the status is 200
+    And the field "viewer_role" equals "admin"
+
+  @p0 @auth
+  Scenario: A valid caller bearer token is accepted on /v1
+    Given I have a caller token
+    When I GET "/v1/models" as consumer
+    Then the status is 200
+
+  @p0 @auth
+  Scenario: A missing caller token is rejected on /v1
+    When I GET "/v1/models" as none
+    Then the status is 401
+    And the field "error.code" equals "caller_auth"
+
+  @p1 @auth
+  Scenario: A consumer can log into the dashboard with their API key (scoped session)
+    Given I have a caller token
+    When I log into the dashboard with my caller key
+    Then the status is 200
+    And the field "role" equals "consumer"
+
+  @manual @auth
+  Scenario: Admin password login (manual — needs DASHBOARD_PASSWORD_SHA256 set and NO_AUTH off)
+    # POST /dashboard/login {password} -> sets an admin session cookie.
+    # Not auto-tested: the local dev stack runs with DASHBOARD_NO_AUTH=1.
+    Given the stack is healthy
+
+  @manual @auth
+  Scenario: Trusted-header SSO admin (manual — needs a reverse proxy injecting the header+secret)
+    Given the stack is healthy
diff --git a/features/03_consumer_api.feature b/features/03_consumer_api.feature
@@ -0,0 +1,94 @@
+Feature: Consumer API flows (/v1) — the calling service's surface
+  As a consuming service I call /v1 with my bearer token and the router
+  decides/falls-back over the operator's provider keys. All end-to-end chats
+  here route to codex ($0) so the suite is free.
+
+  Background:
+    Given the stack is healthy
+    And I have a caller token
+
+  @p0 @api
+  Scenario: List the routable model catalog
+    When I GET "/v1/models" as consumer
+    Then the status is 200
+    And the field "object" equals "list"
+    And the array "data" has at least 5 items
+    And the array "data" includes an item where "id" equals "profile:default"
+
+  @p0 @api
+  Scenario: Chat completion runs a policy and returns a real answer + trace
+    When I POST a free chat as consumer
+    Then the status is 200
+    And the field "object" equals "chat.completion"
+    And the field "choices[0].message.content" is non-empty
+    And the field "usage.total_tokens" is a number
+    And the field "x_router.provider" is non-empty
+    And the field "x_router.served_model_id" is non-empty
+    And the field "x_router.decision_trace" is present
+
+  @p0 @api
+  Scenario: Per-call policy_ir is admitted and executed
+    When I POST "/v1/chat/completions" as consumer with json
+      """
+      {"model":"","max_tokens":16,"messages":[{"role":"user","content":"hi"}],
+       "policy_ir":["policy",
+         ["and",["meets_req"],["not",["is","disabled"]],["family_eq","gpt-5.5"]],
+         ["neg",["normalize",["field","price_in"]]],
+         ["argmax"],["id"],["always",{"action":"next_candidate"}]]}
+      """
+    Then the status is 200
+    And the field "x_router.policy_fingerprint" is present
+    And the field "choices[0].message.content" is non-empty
+
+  @p1 @api
+  Scenario: Malformed policy_ir is rejected cleanly at admission (no spend)
+    When I POST "/v1/chat/completions" as consumer with json
+      """
+      {"model":"","messages":[{"role":"user","content":"hi"}],
+       "policy_ir":["policy","not-a-valid-term"]}
+      """
+    Then the status is 400
+    And the field "error.type" equals "invalid_request_error"
+    And the field "error.message" contains "policy_ir"
+
+  @p0 @api
+  Scenario: Sigma_flow DAG runs and returns the sink answer with a per-node trace
+    When I POST a free flow as consumer
+    Then the status is 200
+    And the field "x_router.provider" equals "flow"
+    And the field "choices[0].message.content" is non-empty
+    And the array "x_router.decision_trace.flow_nodes" has at least 2 items
+    And every item in "x_router.decision_trace.flow_nodes" has a "provider"
+    And every item in "x_router.decision_trace.flow_nodes" has a "served_model_id"
+
+  @p1 @api
+  Scenario: Malformed flow_ir is rejected at admission
+    When I POST "/v1/chat/completions" as consumer with json
+      """
+      {"model":"","messages":[{"role":"user","content":"hi"}],
+       "flow_ir":["flow",{"out":{"kind":"output","inputs":["missing"]}}]}
+      """
+    Then the status is 400
+    And the field "error.message" contains "flow_ir"
+
+  @p1 @api
+  Scenario: Per-key usage self-service is scoped and sanitized
+    When I POST a free chat as consumer
+    And I GET "/v1/usage?window=24h" as consumer
+    Then the status is 200
+    And the field "kind" equals "router_key_usage"
+    And the field "key_sha256_prefix" is non-empty
+    And the field "totals.requests" is at least 1
+    And the field "consumer_settings.status" is present
+
+  @p0 @api
+  Scenario: Missing bearer token is rejected
+    When I GET "/v1/models" as none
+    Then the status is 401
+    And the field "error.code" equals "caller_auth"
+
+  @p0 @api
+  Scenario: Unknown bearer token is rejected
+    When I GET "/v1/models" as bad
+    Then the status is 401
+    And the field "error.code" equals "caller_auth"
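
The `the field "…"` steps above address JSON with dotted paths such as `choices[0].message.content`. The step implementations are not shown in this part of the diff, so the following is only a sketch of how such a lookup could work; `resolve` and `is_non_empty` are hypothetical names.

```python
import re

# One token is either a bare key (no dots or brackets) or a [n] list index.
_TOKEN = re.compile(r"([^.\[\]]+)|\[(\d+)\]")


def resolve(doc, path):
    """Walk a dotted path with optional [n] indexes; raise KeyError if a hop is missing."""
    current = doc
    for key, index in _TOKEN.findall(path):
        try:
            current = current[int(index)] if index else current[key]
        except (KeyError, IndexError, TypeError):
            raise KeyError(path) from None
    return current


def is_non_empty(value):
    """Treat None, empty strings and empty containers as empty; 0 and False count as values."""
    return value not in (None, "", [], {})
```

For example, `resolve(body, "error.code")` on a 401 response body would yield the string compared against `caller_auth`, and a missing hop surfaces as a `KeyError` naming the full path.
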
diff --git a/features/04_dashboard.feature b/features/04_dashboard.feature
@@ -0,0 +1,109 @@
+Feature: Dashboard data — what the operator console renders MUST be present and correct
+  The dashboard is a thin renderer of /dashboard/api/*. These scenarios assert the
+  backing data is complete and correct (so the frontend shows real, correct values
+  in Analytics, Activity, Catalog, Config, Consumers, Provider keys). Seeded
+  activity (one chat + one flow) is created in before_all.
+
+  Background:
+    Given the stack is healthy
+
+  @p0 @dashboard
+  Scenario: The dashboard HTML page loads with all its tabs and renderers
+    When I GET "/dashboard" as admin
+    Then the status is 200
+    And the response text contains "Analytics"
+    And the response text contains "Builder"
+    And the response text contains "Activity"
+    And the response text contains "Catalog"
+    And the response text contains "Config"
+    And the response text contains "renderActivity"
+    And the response text contains "renderAnalytics"
+
+  @p0 @dashboard
+  Scenario: Analytics — totals, breakdowns and health are populated
+    When I GET "/dashboard/api/stats" as admin
+    Then the status is 200
+    And the field "viewer_role" equals "admin"
+    And the field "totals.requests" is at least 1
+    And the field "totals.tokens_total" is a number
+    And the field "totals.cost_usd" is a number
+    And the field "by_provider" is non-empty
+    And the field "by_status" is non-empty
+    And the field "health_summary" is present
+    And the array "daily_totals" has at least 1 items
+
+  @p0 @dashboard
+  Scenario: Activity — recent requests carry a full, correct per-request trace
+    When I POST a free chat as consumer
+    And I POST a free flow as consumer
+    And I GET "/dashboard/api/stats" as admin
+    Then the status is 200
+    And the array "recent" has at least 2 items
+    And every item in "recent" has a "status"
+    And every item in "recent" has a "ts"
+    And the array "recent" includes an item where "provider" equals "flow"
+    And the array "recent" includes an item where "provider" equals "openai"
+
+  @p0 @dashboard
+  Scenario: Catalog (Market) — families list with prices and per-seller perf
+    When I GET "/dashboard/api/market" as admin
+    Then the status is 200
+    And the array "families" has at least 3 items
+    And every item in "families" has a "family"
+    And every item in "families" has a "quality"
+    And every item in "families" has a "rows"
+    And the array "families" includes an item where "family" equals "gpt-5.5"
+
+  @p0 @dashboard
+  Scenario: Policies — the default profile and live providers with health
+    When I GET "/dashboard/api/policies" as admin
+    Then the status is 200
+    And the array "profiles" includes an item where "name" equals "default"
+    And the field "providers" is non-empty
+    And every item in "providers" has a "health"
+
+  @p1 @dashboard
+  Scenario: Builder field vocabulary is available
+    When I GET "/dashboard/api/fields" as admin
+    Then the status is 200
+    And the array "fields" includes an item where "name" equals "price_in"
+    And the array "fields" includes an item where "name" equals "latency_ms"
+    And the array "fields" includes an item where "name" equals "success_rate"
+
+  @p1 @dashboard
+  Scenario: Config — per-provider tunable knobs are present
+    When I GET "/dashboard/api/config" as admin
+    Then the status is 200
+    And the field "knobs" is non-empty
+
+  @p1 @dashboard
+  Scenario: Consumers — the test consumer is listed with stats
+    When I GET "/dashboard/api/keys" as admin
+    Then the status is 200
+    And the array "keys" includes an item where "consumer" equals "bdd-test"
+
+  @p1 @dashboard
+  Scenario: Provider keys — credentials snapshot is privatized but present
+    When I GET "/dashboard/api/provider-keys" as admin
+    Then the status is 200
+    And the field "rows" is non-empty
+
+  @p1 @dashboard
+  Scenario: Codex accounts — an active account is configured
+    When I GET "/dashboard/api/codex/accounts" as admin
+    Then the status is 200
+    And the field "accounts" is non-empty
+    And the field "active" is non-empty
+    And the field "activity" is present
+
+  @p1 @dashboard
+  Scenario: Builder dry-run ranking (policy preview) returns an ordering (no spend)
+    When I POST "/dashboard/api/policy/preview" as admin with json
+      """
+      {"policy_ir":["policy",
+        ["and",["meets_req"],["not",["is","disabled"]],["family_eq","gpt-5.5"]],
+        ["neg",["normalize",["field","price_in"]]],
+        ["argmax"],["id"],["always",{"action":"next_candidate"}]]}
+      """
+    Then the status is 200
+    And the field "ranked" is non-empty
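
The array steps used throughout this feature (`includes an item where … equals …`, `every item in … has a …`) reduce to two small predicates. The sketch below is an assumption about their semantics, not the suite's actual step code: both function names are hypothetical, and values are compared as strings because Gherkin parameters arrive as text (the `"ok" equals "True"` step in the onboarding feature suggests the same).

```python
def includes_item_where(items, key, expected):
    """True if at least one dict in items has key and str(value) == expected."""
    return any(
        isinstance(item, dict) and key in item and str(item[key]) == expected
        for item in items
    )


def every_item_has(items, key):
    """True only for a non-empty list whose items are all dicts containing key."""
    return bool(items) and all(isinstance(item, dict) and key in item for item in items)
```

Requiring a non-empty list in `every_item_has` is a deliberate choice in this sketch: an empty `providers` array would otherwise pass `every item has a "health"` vacuously.
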
diff --git a/features/05_providers.feature b/features/05_providers.feature
@@ -0,0 +1,42 @@
+Feature: Providers — OpenRouter, Codex, discovery and registered model traits
+  Asserts the configured providers are live and the registered benchmark/modality
+  fields (model_meta) are part of the field vocabulary the builder/policies use.
+
+  Background:
+    Given the stack is healthy
+
+  @p0 @providers
+  Scenario: OpenRouter and Codex providers are present with health
+    When I GET "/dashboard/api/policies" as admin
+    Then the status is 200
+    And the array "providers" includes an item where "name" equals "openrouter"
+    And the array "providers" includes an item where "name" equals "openai"
+    And every item in "providers" has a "health"
+
+  @p1 @providers
+  Scenario: Codex is configured as a ChatGPT-subscription (openai_codex) provider
+    When I GET "/dashboard/api/policies" as admin
+    Then the status is 200
+    And the array "providers" includes an item where "name" equals "openai"
+    And the matched item field "api_kind" equals "openai_codex"
+
+  @p1 @providers
+  Scenario: A Codex account is active (auth wired through)
+    When I GET "/dashboard/api/codex/accounts" as admin
+    Then the status is 200
+    And the field "accounts" is non-empty
+    And the field "active" is non-empty
+
+  @p1 @providers
+  Scenario: Registered model traits (model_meta benchmarks) are in the field vocabulary
+    When I GET "/dashboard/api/fields" as admin
+    Then the status is 200
+    And the array "fields" includes an item where "name" equals "bench_intelligence"
+    And the array "fields" includes an item where "name" equals "bench_coding"
+
+  @p1 @providers
+  Scenario: The discovered catalog exposes routable families
+    Given I have a caller token
+    When I GET "/v1/models" as consumer
+    Then the status is 200
+    And the array "data" includes an item where "id" equals "family:gpt-5.5"