
[Hackathon] feat: Publish Workflow as REST API #5121

Open
EmilySun621 wants to merge 49 commits into apache:main from EmilySun621:hackathon/workflow-api

@EmilySun621

The Problem

You built an amazing diabetes prediction workflow. It works perfectly. But now your hospital's patient intake system needs to call it in real-time — and there's no way to expose your workflow as an API.

Texera workflows are locked inside the browser. No external system can call them.

The Solution

One click turns any workflow into a REST endpoint.

┌─────────────────────────────────────────────────────────┐
│  Workflow Editor                                        │
│                                                         │
│  [CSV] → [Clean] → [Split] → [Model] → [Eval]         │
│                                                         │
│                              Click: 🌐 Publish as API  │
└──────────────────────────────┬──────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│  Published API                                          │
│                                                         │
│  POST /api/published/7/run                              │
│  X-API-Key: tex_bbfe69fb...                             │
│                                                         │
│  → Returns cached execution results as JSON             │
└─────────────────────────────────────────────────────────┘

How It Works

  1. Run your workflow — execute it once to cache results
  2. Click 🌐 Publish as API — generates endpoint URL + API key
  3. Share the curl command — any external system can fetch your results
```bash
curl -X POST 'http://localhost:4200/api/published/7/run' \
  -H 'X-API-Key: tex_bbfe69fb...' \
  -H 'Content-Type: application/json'
```

Response:

```json
{
  "workflowId": 7,
  "workflowName": "[Example] Data Exploration on Movies Dataset",
  "executedAt": "2026-05-17T05:47:29.799Z",
  "results": { ... }
}
```

Key Design Decisions

| Decision | Choice | Why |
| -- | -- | -- |
| Execution model | Return cached results | No long-running HTTP requests; instant response |
| Auth | Per-workflow API key | Simple, stateless, no OAuth complexity |
| Storage | In-memory + localStorage | Hackathon speed; production would use DB |
| Scope | Read-only | Safe: external clients can't modify workflows |

What's Included

Backend (agent-service):

  • published-workflow-api.ts — POST /register + POST /:workflowId/run
  • API key validation (401 missing / 403 mismatch / 404 unpublished)
  • 5 unit tests, all passing
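
For reviewers, the status-code mapping above could look roughly like the sketch below. This is not the code in published-workflow-api.ts: `PublishedWorkflow`, `RunOutcome`, and `authorizeRun` are hypothetical names, and the order of the checks (missing key, then unpublished, then mismatch) is an assumption.

```typescript
// Hypothetical registry entry; the PR does not show the real shape.
interface PublishedWorkflow {
  workflowId: number;
  apiKey: string;
  results: unknown; // cached execution results captured at publish time
}

type RunOutcome =
  | { status: 200; body: unknown }
  | { status: 401 | 403 | 404; error: string };

// Assumed check order: 401 (no key), 404 (not published), 403 (wrong key).
function authorizeRun(
  registry: Map<number, PublishedWorkflow>,
  workflowId: number,
  apiKey: string | undefined
): RunOutcome {
  if (!apiKey) {
    return { status: 401, error: "Missing X-API-Key header" };
  }
  const entry = registry.get(workflowId);
  if (!entry) {
    return { status: 404, error: "Workflow is not published" };
  }
  if (entry.apiKey !== apiKey) {
    return { status: 403, error: "API key does not match" };
  }
  // Read-only: the cached results are returned, nothing is executed.
  return { status: 200, body: entry.results };
}
```

Keeping the decision in a pure function like this makes each of the three failure codes easy to cover in a unit test without starting an HTTP server.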

Frontend:

  • 🌐 icon button in workflow editor toolbar
  • Publish dialog: endpoint URL, masked API key (show/hide + copy), sample curl
  • Refuses to publish if no cached results exist

Use Cases

  • 🏥 Hospital system calls your prediction workflow for real-time patient risk scoring
  • 📱 Mobile app calls your NLP workflow to analyze user-reported symptoms
  • 🔄 Another team's pipeline calls your data cleaning workflow
  • ⏰ Cron job curls your workflow daily for automated reporting

Any workflow becomes a microservice.

Demo

  1. Open a workflow that has been executed → toolbar shows 🌐 button
  2. Click 🌐 → dialog with endpoint, API key, curl command
  3. Copy curl → run in terminal → JSON results returned instantly

Emily Sun and others added 30 commits May 15, 2026 21:55
This bundles the feature work that built up on this branch:

- Custom agents: dashboard CRUD page and editor dialog (48px icon tile,
  chip-style guardrails, model selector). Each custom agent now carries a
  LiteLLM model_name (Opus 4.7 / Haiku 4.5) that is passed through to the
  agent-service so different agents can use different models.

- Conversation history is scoped per (workflowId, agentId): switching
  agent or workflow yields a different conversation list. localStorage
  key: texera.workflowConversations.v1.{workflowId}.{agentId}.

- Time machine: workflow snapshot list, revert, and agent-tagged
  checkpoints. New workflow-history-tool in agent-service backs the
  "undo my last change" flow; amber gains a WorkflowSnapshotResource;
  sql/updates/23.sql adds the snapshot table.

- Operator-aware custom-agent prompts: the system prompt now injects the
  full operator catalog with a "prefer built-in operators over Python
  UDFs" rule, sourced from WorkflowSystemMetadata at request time.

- LiteLLM: added the claude-opus-4.7 entry alongside claude-haiku-4.5
  and gpt-5-mini in bin/litellm-config.yaml.

- Agent panel rewritten around the (conversation list / chat) two-view
  model with subscription-managed list reloads and per-step persistence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rompt

Adds a new agent tool that queries dkNET, UCI ML Repository, and Kaggle in
parallel and returns up to 5 results per source. Failures from individual
sources degrade gracefully so the rest still return. Kaggle is skipped when
KAGGLE_USERNAME / KAGGLE_KEY are not set.
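
A minimal sketch of the fan-out described above, assuming each source is wrapped in a function that returns a promise of hits. `DatasetHit`, `SearchFn`, and `searchAllSources` are hypothetical names, the environment is passed in as a plain object for testability, and the real tool may be structured differently.

```typescript
interface DatasetHit {
  source: string;
  title: string;
  url: string;
}

type SearchFn = (query: string) => Promise<DatasetHit[]>;

const MAX_PER_SOURCE = 5;

async function searchAllSources(
  query: string,
  sources: { [name: string]: SearchFn },
  env: { [key: string]: string | undefined }
): Promise<{ hits: DatasetHit[]; failed: string[] }> {
  // Kaggle is only queried when both credentials are present.
  const kaggleConfigured = Boolean(env["KAGGLE_USERNAME"] && env["KAGGLE_KEY"]);
  const names = Object.keys(sources).filter(
    name => name !== "kaggle" || kaggleConfigured
  );

  const failed: string[] = [];
  // All sources start at once; a failing source resolves to an empty list
  // so the remaining sources still contribute their results.
  const perSource = await Promise.all(
    names.map(name =>
      Promise.resolve()
        .then(() => sources[name](query))
        .then(hits => hits.slice(0, MAX_PER_SOURCE))
        .catch(() => {
          failed.push(name);
          return [] as DatasetHit[];
        })
    )
  );

  const hits: DatasetHit[] = [];
  perSource.forEach(list => list.forEach(hit => hits.push(hit)));
  return { hits, failed };
}
```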

Also fetches the user's accessible datasets via /api/dataset/list when an
agent is bound to a workflow (delegate config), and renders them in a "Your
Datasets" section of the system prompt with the path prefix a File Scan
operator would use to reference files in each one.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a new floating right-side panel that displays the most recent
agent-generated analysis report (model comparison tables, key metrics,
winner/recommendation, train-vs-test). The agent system prompt now instructs
the model to wrap structured result summaries in `<!-- REPORT_START -->` /
`<!-- REPORT_END -->` markers.

Flow:
- agent emits content wrapped in the markers
- agent-chat strips the marker block from inline rendering and shows a
  compact "Results ready — View Report" card in its place
- card click and new-report arrival both surface the Results Dashboard panel
- panel renders the markdown via ngx-markdown, with copy-to-clipboard and
  export-as-markdown buttons plus the generation timestamp
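
The marker handling in the flow above can be sketched as a small pure function. `splitReport` and the `[report-card]` placeholder are hypothetical; the actual agent-chat component swaps in a rendered card, not a text token.

```typescript
const REPORT_BLOCK = /<!-- REPORT_START -->([\s\S]*?)<!-- REPORT_END -->/;
const CARD_PLACEHOLDER = "[report-card]"; // stand-in for the "View Report" card

interface SplitMessage {
  inline: string;        // what the chat bubble renders
  report: string | null; // markdown handed to the Results Dashboard panel
}

function splitReport(content: string): SplitMessage {
  const match = REPORT_BLOCK.exec(content);
  if (!match) {
    return { inline: content, report: null };
  }
  return {
    inline: content.replace(REPORT_BLOCK, CARD_PLACEHOLDER),
    report: match[1].trim(),
  };
}
```

The non-greedy `[\s\S]*?` keeps the match to the first marker pair, so text after `REPORT_END` stays in the inline message.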

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New page that lists popular public datasets from dkNET, UCI, and Kaggle with
search and category filters. Backed by a hardcoded seed of ~30 well-known
datasets (Iris, Titanic, MNIST, COCO, TCGA, …) so the page always has
content even when the live catalog APIs are unavailable.

DatasetBankService:
- BehaviorSubjects for search query and active category, plus a combined
  filteredDatasets$ stream the component subscribes to.
- Best-effort live refresh from dkNET + UCI on first visit; results merge
  with the seed (Kaggle is skipped in the browser — CORS/auth).
- Hour-long localStorage cache for the merged list.

DatasetBankComponent:
- Standalone component with title/subtitle, full-width search bar,
  horizontal category chips (All / Biomedical / NLP / CV / Finance /
  Social Science / Time Series / Tabular), and a responsive card grid
  with name, source badge, description, rows/cols/size/format stats,
  tags, and an Import button.
- Import currently opens the source download / catalog page in a new tab
  and surfaces a toast — backend wiring to copy the file into the user's
  Texera datasets is left as a stretch goal.

Route registered at /dashboard/user/dataset-bank with a "Dataset Bank"
sidebar link under Your Work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…pload

Card actions now show two equal-width buttons side by side:

- **Download** (left, outline): opens the bank entry's direct download URL
  (or source catalog page) in a new tab — the previous behavior.
- **Import** (right, primary): fetches the file in-browser and registers it
  as a Texera dataset under the current user.

Import goes through the existing user-dataset upload pipeline:
  DatasetService.createDataset()          → new dataset metadata
  DatasetService.multipartUpload()        → stages the file via LakeFS
  DatasetService.createDatasetVersion()   → publishes as v1

The button reflects state per card: idle → "Importing…" (loading) →
"Imported" (disabled, ✓). Failures (most commonly CORS on the source fetch)
re-enable the button so the user can retry, and surface a clear toast
suggesting the Download fallback.
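
The per-card button behavior described above amounts to a three-state machine. A minimal sketch, with hypothetical type and function names:

```typescript
type ImportState = "idle" | "importing" | "imported";
type ImportEvent = "click" | "success" | "failure";

function nextImportState(state: ImportState, event: ImportEvent): ImportState {
  if (state === "idle" && event === "click") return "importing";
  if (state === "importing" && event === "success") return "imported";
  // A failure (e.g. CORS on the source fetch) re-enables the button for a retry.
  if (state === "importing" && event === "failure") return "idle";
  // Anything else is ignored, e.g. a click while already importing.
  return state;
}
```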

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds POST /api/dataset-bank/import-from-url to agent-service. The endpoint
takes { url, name, description } plus a bearer token; server-side fetches
the source file (no browser CORS), then drives the existing dashboard
endpoints with the caller's token:

  /api/dataset/create
  /api/dataset/multipart-upload?type=init   → returns missingParts[]
  /api/dataset/multipart-upload/part        (per chunk, 5 MB)
  /api/dataset/multipart-upload?type=finish
  /api/dataset/{did}/version/create  body "v1"

Returns { did, datasetName, fileName, fileSize }.
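
To illustrate the per-chunk step, here is a sketch of how a fetched file could be split into 5 MB parts and filtered against `missingParts[]`. The 1-based part numbering and the reading of `missingParts` as a list of part numbers are assumptions; `planParts` and `partsToUpload` are hypothetical helpers.

```typescript
const PART_SIZE = 5 * 1024 * 1024; // 5 MB, as in the pipeline above

interface PartRange {
  partNumber: number; // assumed to be 1-based
  start: number;      // inclusive byte offset
  end: number;        // exclusive byte offset
}

function planParts(fileSize: number, partSize: number = PART_SIZE): PartRange[] {
  const parts: PartRange[] = [];
  let partNumber = 1;
  for (let start = 0; start < fileSize; start += partSize) {
    parts.push({ partNumber, start, end: Math.min(start + partSize, fileSize) });
    partNumber++;
  }
  return parts;
}

// Only upload the parts the init call reported as missing.
function partsToUpload(all: PartRange[], missingParts: number[]): PartRange[] {
  return all.filter(part => missingParts.indexOf(part.partNumber) !== -1);
}
```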

DatasetBankService.importToTexera() now calls this proxy instead of
doing the browser-side fetch + multipart upload itself; the per-card
Import button flow on the Dataset Bank page is unchanged from the user's
perspective (idle → Importing… → ✓ Imported), but actually succeeds for
catalogs that don't send CORS headers (UCI, Kaggle direct downloads, etc.).

The Angular proxy.config.json routes /api/dataset-bank/* to localhost:3001
in dev so the existing relative-URL pattern keeps working.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The /api/dataset/* routes live in file-service (port 9092 per
file-service-web-config.yaml), not amber/dashboard (8080). The new
dataset-bank import proxy and the Feature-1 user-dataset list fetch
were both hitting 8080 and getting 404.

Adds TEXERA_FILE_SERVICE_ENDPOINT to env (default http://localhost:9092)
and exposes it on BackendConfig.fileServiceEndpoint. Both call sites now
read from there.
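
A sketch of the config lookup, assuming the environment is read once at startup. `loadBackendConfig` and `datasetUrl` are hypothetical helpers; only `TEXERA_FILE_SERVICE_ENDPOINT`, its default, and `fileServiceEndpoint` come from this commit.

```typescript
interface BackendConfig {
  fileServiceEndpoint: string;
}

const DEFAULT_FILE_SERVICE_ENDPOINT = "http://localhost:9092";

function loadBackendConfig(env: { [key: string]: string | undefined }): BackendConfig {
  const raw = env["TEXERA_FILE_SERVICE_ENDPOINT"];
  const endpoint = raw && raw.trim() !== "" ? raw.trim() : DEFAULT_FILE_SERVICE_ENDPOINT;
  // Strip trailing slashes so URL joins below never produce "//".
  return { fileServiceEndpoint: endpoint.replace(/\/+$/, "") };
}

function datasetUrl(config: BackendConfig, path: string): string {
  return config.fileServiceEndpoint + "/api/dataset/" + path.replace(/^\/+/, "");
}
```

In agent-service the argument would typically be `process.env`; passing it in keeps the function testable.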

Also logs the exact URL + upstream status/body at every step of the import
pipeline so future endpoint drift is obvious from the agent-service logs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Card footer now has three actions side by side:

  [🔗 View on UCI]   [↓ Download]   [☁ Import]

The View link is a plain anchor (always visible, not on hover) opening
the dataset's catalog page in a new tab so users can verify the source
before importing. Layout is a 1.4fr / 1fr / 1fr grid that gives the
"View" pill room for the longer label without crowding Download or Import.

Also fixed the Human Protein Atlas seed entry to actually point at the
dkNET catalog (RRID:SCR_006710) instead of proteinatlas.org.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…detail

Replaces the utilitarian Projects list with a card-based gallery that surfaces
icon + name + description + stats (📄 workflows · 📊 datasets · 👥 members)
+ relative-time updated. Adds a dedicated Create/Edit Project modal with
emoji icon picker and preset color swatches, and rewrites the project detail
page as a tabbed view (Workflows / Datasets / Members) — Workflows tab reuses
the existing saved-workflow-section, Members tab renders ShareAccessComponent
inline (component now accepts @Input bindings as a fallback to NZ_MODAL_DATA),
Datasets tab shows a "Coming soon" placeholder.

Workflow list items now badge their project memberships (emoji + name chip)
regardless of whether the project has a custom color. Per-project emoji icons
are stored in localStorage (no DB migration required for hackathon scope).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ence

Builds the UI on top of the existing CollaborationService scaffold. Adds a
slide-in panel on the right of the workspace with three tabs:

  • Chat — Yjs-synced team chat for the current workflow, Enter-to-send
  • Comments — per-operator threaded comments with resolve / reopen and
    inline reply composer. Right-click an operator → "add comment" opens
    a yellow new-thread composer at the top of that operator's section.
  • Online — presence list derived from Yjs awareness, idle status,
    "you" tag, and per-user current-edit hints.

The menu bar gets a team-icon toggle button with an nz-badge counting
unresolved comment threads across the workflow. The context menu on
single-operator selection gains an "add comment" item that opens the
panel pre-targeted at the clicked operator (uses a new
startNewThreadForOperator action on CollaborationService).

CollaborationService gains a draftThreadOperatorId stream so the panel
can surface a draft composer for operators with zero existing threads;
the service casts the readonly WorkflowGraph to access sharedModel /
newYDocLoadedSubject from the existing real-time edit infrastructure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Project Gallery polish on top of the earlier card redesign:
  • Workflows tab in project detail now fetches via the dedicated
    /api/project/:pid/workflows endpoint and renders a clean card list,
    bypassing the generic search index which silently dropped project-
    filtered queries.  Cards expose Open + Remove-from-project actions and
    the project's accessLevel as a badge.  "Create Workflow" creates an
    empty workflow, adds it to the project, and routes to the editor.
  • Datasets tab fully wired: localStorage-backed pid→dids overlay (no DB
    migration), card grid with size/owner/visibility, popconfirm remove,
    and a dedicated Add-Existing-Dataset modal that filters out datasets
    already in the project.
  • Members tab replaced the dual-card ShareAccessComponent UI with a
    dedicated ProjectMembersComponent: single-row invite (email + Editor/
    Viewer dropdown + Invite button) on top, member cards below with
    deterministic-color avatar, role badge, inline role dropdown, and
    popconfirm remove.
  • Add-Workflows and Add-Datasets modals rebuilt as searchable card lists
    with select-all, hover/selected states, primary "Add Selected" button
    with a white count pill, consistent 10px-row / 12px-gap styling.
  • Sidebar Projects link unconditional — *ngIf removed, projects_enabled
    default flipped to true, loadTabs() skips it so admin config can't
    flip it back.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a community gallery under /dashboard/hub/workflow-hub where users
browse, star, fork, and publish data science workflows. Backed by 15
seed entries and localStorage for stars/forks/views/publishes so the
page is never empty.

- List page: search, sort (trending/stars/forks/recent), category
  chips, featured grid, DAG-chain preview cards, agent badges.
- Detail page: SVG DAG preview, stats panel, fork-to-my-workflows
  (uses WorkflowPersistService.duplicateWorkflow when a backend wid is
  attached, otherwise falls back to a local stub), star toggle, and an
  optional 'Agent Included' card.
- Publish dialog: pulls the user's workflows via the persist service,
  derives operator chain from workflow content, writes a hub entry to
  localStorage.
- Sidebar: 'Workflow Hub' link added to the Hub submenu.
Seed entries don't have a workflowId, so the previous code only
incremented a localStorage counter and navigated to /dashboard/user/workflow
without actually writing to the backend — the forked workflow never showed
up in the Workflows page. Now the seed path calls
WorkflowPersistService.createWorkflow with empty content named
"[Fork] <title>", waits for the backend to return the new wid, and routes
straight into the new workflow's workspace. The duplicate-workflow path
for real-wid entries is unchanged.
…, role detection

Adds a Data Profiling Panel triggered from data-source operator properties
(CSV/JSON/Parquet/FileScan). The panel surfaces three derived views on top of
a single profile response — no new backend calls:

  - Data Quality Score (0–100): completeness, duplicates, outliers, constant
    columns, high-cardinality categoricals, and class-imbalance penalties,
    with a colored progress bar and sub-score badges.
  - Auto-Suggest Cleaning Actions: severity-sorted rules (drop sparse/ID/
    constant cols, impute via median/mode, deduplicate, review outliers) with
    an Add-to-Workflow button that copies an operator hint to the clipboard.
  - Column Relationship Detector: heuristic ID/target/feature/datetime/
    constant classification with badges per column and an auto-detected
    summary section.

Wires a small "📊 Profile Data" button into the operator property editor that
opens the panel as a draggable modal seeded with the operator's file path.
Backend integration is intentionally a follow-up; the service ships a
deterministic mock so the UX is fully exercised.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ble rule

Adds a console.debug so we can see what operatorType is on the selected
operator (helps when the rule doesn't match an unexpected name). Also
broadens the profileable regex to include Text/File so anything that looks
remotely like a data source shows the button.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a "My Operators" section under Your Work where users can author
Python operators by writing UDF code, configuring ports/properties,
and saving them as draggable items in the workflow editor. Each saved
operator wraps a PythonUDFV2 node pre-filled with the saved code and
ports — no engine or operator-registry changes required.

- Data model + localStorage CRUD (custom-operator.{interface,service})
- "My Operators" list page + full-page editor with Monaco code editor,
  port/property builder, and a Test Run that validates UDF shape
- Operator panel surfaces a "My Operators" collapse at the top of the
  workflow editor; dragging a custom operator builds a PythonUDFV2
  predicate via CustomOperatorFactoryService and reuses the existing
  DragDropService flow via a new dragStartedCustom() entry point
- Custom properties get rendered as a PROPS dict prepended to the saved
  code so user UDFs can read self-configured values without extending
  the Python UDF schema

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous fix used createWorkflow with empty content, so forking a seed
entry produced a workflow with the right name but zero operators — the
"Executions doesn't exist" 403 the user saw was just the workspace trying
to load nonexistent executions for an empty workflow.

Now seed entries carry a sampleOperators field listing REAL Texera operator
types from the running backend's metadata (verified against the 163
operators the deployed build exposes). When the user forks:

1. Wait for OperatorMetadataService to publish the schema list.
2. For each known sampleOperators type, build a proper OperatorPredicate via
   WorkflowUtilService.getNewOperatorPredicate (which fills in ports, default
   properties, and the correct operatorVersion).
3. Connect consecutive operators by their first output→input ports.
4. Lay them out in a horizontal chain (200px apart).
5. POST to /workflow/create with the populated WorkflowContent and navigate
   to the new wid.

Any sampleOperators not present in the running build land in a single
comment box at the top of the canvas so the user can see what was intended.
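
Steps 2 to 4 and the unknown-operator fallback can be sketched without any Texera services. `planChain` is a hypothetical helper that only computes ids, positions, and links; the real code builds predicates through WorkflowUtilService, and the operator type names in the usage below are illustrative.

```typescript
interface Point { x: number; y: number; }
interface ChainLink { source: string; target: string; }
interface ChainPlan {
  positions: { [operatorId: string]: Point };
  links: ChainLink[];
  skipped: string[]; // types missing from the running build
}

const SPACING_X = 200; // horizontal distance between consecutive operators

function planChain(sampleOperators: string[], knownTypes: string[]): ChainPlan {
  const plan: ChainPlan = { positions: {}, links: [], skipped: [] };
  let previousId: string | null = null;
  let index = 0;
  for (const opType of sampleOperators) {
    if (knownTypes.indexOf(opType) === -1) {
      plan.skipped.push(opType); // would end up in the canvas comment box
      continue;
    }
    const id = opType + "-" + index;
    plan.positions[id] = { x: index * SPACING_X, y: 0 };
    if (previousId !== null) {
      plan.links.push({ source: previousId, target: id });
    }
    previousId = id;
    index++;
  }
  return plan;
}
```

For example, `planChain(["CSVFileScan", "Bogus", "Filter"], ["CSVFileScan", "Filter"])` places two operators 200px apart, links them, and reports `"Bogus"` as skipped.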

For real (published) hub entries with a workflowId, the path is still
WorkflowPersistService.duplicateWorkflow — unchanged.
DataProfilingService now fetches the actual dataset file via
DatasetService.retrieveDatasetVersionSingleFile (presign-download endpoint),
parses with papaparse (first 5000 rows for performance), and runs a new
pure-TS profiler that computes:

  - dtype inference per column (numeric / datetime / boolean / categorical / text)
  - per-column: count, missing, missingPercent, unique, plus dtype-specific stats
  - numeric: mean, median, std, min, max, ±3σ outlier count, 10-bin histogram
  - categorical/boolean: top-5 value counts
  - dataset-level: row-key duplicate count
  - Pearson correlation matrix across (up to 8) numeric columns

If the source isn't a dataset path or any step fails (fetch / parse / empty
headers), we fall back to the deterministic mock so the panel always renders.
The panel header now shows a short filename (full path on hover) and surfaces
fetch/parse errors inline.
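
A sketch of the numeric part of such a profiler, for illustration only. The function names are hypothetical, and the use of population variance (divide by n) and equal-width bins are assumptions; the commit does not say which variants the real profiler uses.

```typescript
interface NumericStats {
  count: number; mean: number; median: number; std: number;
  min: number; max: number; outliers: number; histogram: number[];
}

function profileNumeric(values: number[], bins: number = 10): NumericStats | null {
  if (values.length === 0) return null;
  const sorted = values.slice().sort((a, b) => a - b);
  const count = sorted.length;
  const mean = sorted.reduce((sum, v) => sum + v, 0) / count;
  const mid = Math.floor(count / 2);
  const median = count % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  // Population variance (an assumption, see the note above).
  const variance = sorted.reduce((sum, v) => sum + (v - mean) * (v - mean), 0) / count;
  const std = Math.sqrt(variance);
  const min = sorted[0];
  const max = sorted[count - 1];
  // Outliers: values more than three standard deviations from the mean.
  const outliers = std === 0 ? 0 : sorted.filter(v => Math.abs(v - mean) > 3 * std).length;

  const histogram: number[] = [];
  for (let i = 0; i < bins; i++) histogram.push(0);
  const width = (max - min) / bins;
  for (const v of sorted) {
    // The maximum value is clamped into the last bin.
    const idx = width === 0 ? 0 : Math.min(bins - 1, Math.floor((v - min) / width));
    histogram[idx]++;
  }
  return { count, mean, median, std, min, max, outliers, histogram };
}

// Pearson correlation of two equally indexed columns; NaN when undefined.
function pearson(xs: number[], ys: number[]): number {
  const n = Math.min(xs.length, ys.length);
  if (n === 0) return NaN;
  let sumX = 0;
  let sumY = 0;
  for (let i = 0; i < n; i++) { sumX += xs[i]; sumY += ys[i]; }
  const meanX = sumX / n;
  const meanY = sumY / n;
  let cov = 0;
  let varX = 0;
  let varY = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - meanX;
    const dy = ys[i] - meanY;
    cov += dx * dy; varX += dx * dx; varY += dy * dy;
  }
  return varX === 0 || varY === 0 ? NaN : cov / Math.sqrt(varX * varY);
}
```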

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…board

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…llback

- New WorkflowDataService: lists user workflows, parses operators from
  workflow content, fetches latest-execution runtime stats per operator
  via existing /executions REST endpoints.
- Widgets now record a WidgetSource ({manual} or {workflow, wid, opId?,
  metric, scope}). Workflow-sourced widgets refresh on dashboard load.
- Add Widget modal now has two paths:
    From a Workflow — pick workflow + widget type + metric (and operator
    for Metric Card); supports metric/bar/hbar/donut/table.
    Manual Entry — unchanged form for all 6 widget types.
- Seed dashboard is now empty with a clear CTA instead of hardcoded demo
  data.

Note: Texera does not expose operator tuple data via REST (results live
only in the WebSocket cache during a workspace session). We render the
data that IS persisted and REST-queryable: per-operator tuple counts,
sizes, processing times, worker counts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Switch to Texera's light theme: white card surfaces, #f5f6f8 page bg,
  light-mode chart colors, #1677ff accent matching the rest of the app.
  Removed every dark-theme rule.
- Add Widget modal redesigned for multi-select:
  - Checkbox gallery of all 6 widget types with inline config forms.
  - Quick metric presets (Accuracy / F1 / Precision / Recall) pre-fill
    the Metric Card.
  - "Add N widgets" submits everything checked at once; modal stays
    open with a confirmation badge so users can compose another batch.
  - "Done" closes the modal.
- Dropped the operator runtime-stats path. Texera does not expose tuple
  data via REST, and runtime stats (tuple counts, sizes, latencies)
  aren't what researchers want on a results dashboard. Deleted the
  workflow-data service and the WidgetSource discriminator.
- Removed seed data entirely. List page now shows a real empty state
  with a Create Dashboard CTA. Service no longer auto-creates anything;
  legacy seed-* dashboards from earlier iterations get purged on load.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add Widget is now a tabbed modal:

- Tab 1 "From Workflow" — pick a workflow that has been executed,
  see what's available, pick a data point, pick a widget type, add.
  Two data sources:
    * Saved results from localStorage at texera.workflow.results.{wid}
      (WorkflowResultsSnapshot shape) — picks up evaluation metrics
      (accuracy, F1, …) and tabular outputs whenever the workspace
      writes them. Allows Metric Card / Bar / Donut / Table widgets.
    * Runtime stats from REST — output rows, input rows, output size,
      processing time per operator. Always available after a run.
      Renders as Metric Card.

- Tab 2 "Manual Input" — pick widget type, fill in data, add.
  Five widget types: Metric / Bar / Donut / Table / Text.

Widgets added from a workflow carry a WidgetSource and display a
"From <workflow>" pill in the top-right of the widget frame.

Modal stays open after each add with a confirmation badge so users
can compose several widgets in one session; Done closes it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Result Panel already receives operator output via WebSocket during
execution. We tap the same stream — no new endpoint, no duplicate fetch.

- New DashboardResultCacheService subscribes to
  WorkflowResultService.getResultUpdateStream() at app load. Each
  update is snapshotted to localStorage:
    key   : texera.results.{wid}.{opId}
    value : { columns: string[], rows: any[][], timestamp: string }
  Caps at 100 rows per operator; LRU-evicts on QuotaExceededError. The
  current wid is tracked from /dashboard/user/workflow/{wid} URLs.
  Paginated operators are fetched via OperatorPaginationResultService
  .selectPage(1, 100) (debounced per op) since pagination updates only
  carry metadata, not rows.

- workflow-data.service.ts now reads per-operator keys and merges them
  with any legacy texera.workflow.results.{wid} bundle. Auto-detects
  metric shapes:
    * Single-row table → every numeric column becomes a metric
      (catches evaluation outputs like {accuracy, f1, precision}).
    * 2-column ≤20-row name|value table → each row becomes a metric
      (catches aggregations like {model: "RF", score: 0.94}).
  Rows are always exposed for Table / Bar / Donut widgets.

- AppComponent injects the cache service so it instantiates at boot.
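
The two metric-shape rules above could be expressed as the sketch below. `CachedResult` mirrors the stored value shape from this commit (with `unknown` in place of `any`); `resultKey` and `detectMetrics` are hypothetical names, and treating a non-numeric value in a name|value table as "not a metric table" is an assumption.

```typescript
interface CachedResult {
  columns: string[];
  rows: unknown[][];
  timestamp: string;
}

interface Metric { name: string; value: number; }

function resultKey(wid: number, opId: string): string {
  return "texera.results." + wid + "." + opId;
}

function detectMetrics(result: CachedResult): Metric[] {
  const columns = result.columns;
  const rows = result.rows;

  // Shape 1: single-row table, every numeric column becomes a metric.
  if (rows.length === 1) {
    const metrics: Metric[] = [];
    columns.forEach((name, i) => {
      const value = rows[0][i];
      if (typeof value === "number" && isFinite(value)) {
        metrics.push({ name, value });
      }
    });
    return metrics;
  }

  // Shape 2: name|value table with at most 20 rows, each row becomes a metric.
  if (columns.length === 2 && rows.length > 1 && rows.length <= 20) {
    const metrics: Metric[] = [];
    for (const row of rows) {
      const value = row[1];
      if (typeof value !== "number" || !isFinite(value)) return [];
      metrics.push({ name: String(row[0]), value });
    }
    return metrics;
  }

  return []; // anything else is only exposed as rows
}
```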

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… Results

Two bugs in the Add Widget modal:

1. The workflow dropdown was sized by nz-select's default min-width
   (~120px), truncating names to "[Example] Da...". Made it stretch to
   the full modal width and ellipsize at the actual edge.

2. Data cards were single-select via a separate "Widget type" step,
   which felt unresponsive — users had to click twice to enable Add.
   Switched to multi-select with toggle-on-click:
   - Each card has a checkmark in the top-right when picked.
   - "Add N Widget(s)" creates one widget per selected card in one go;
     metric/stat cards become Metric Cards, row cards become Tables.
   - Modal stays open afterward with confirmation; selection clears so
     users can compose another batch.

Also restructured the sections:
- Results — actual run data — gets a primary blue-tinted card with a
  prominent header at the top.
- Runtime stats moved into a collapsed section (a count pill + caret).
  Expand only when needed; most users don't care about input row
  counts and processing time.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The empty-grid had `span[nz-icon] { font-size: 56px }` as a descendant
selector, which cascaded onto the button's "+" icon too — that's what
made it render at 56px on its own line above the "Add Widget" label.
Scoped that rule to a new `.empty-grid-icon` class on the dashboard
icon only, and gave `.empty-cta` explicit inline-flex so the icon and
text render side-by-side.

Also gated the CTA button (and its "click + Add Widget" prompt) behind
mode === 'edit'. In view mode the empty state shows just an icon and
"No widgets yet." — no button to click since adding widgets requires
edit mode.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When the Result Panel receives output from a visualization operator
(Scatterplot, ConfusionMatrix, etc.), the row contains an html-content
column with a self-contained Plotly HTML document. We were storing it
in localStorage but then showing the raw HTML source in a Table widget.

- New "html" widget type. Renders via <iframe [srcdoc]> with
  sandbox="allow-scripts" + bypassSecurityTrustHtml so inline Plotly
  scripts actually execute. Mirrors what VisualizationFrameContentComponent
  does in the workspace.
- The Add Widget modal now scans cached operator outputs for cells
  that look like an HTML document (starts with <!doctype/<html/<script/
  <svg, or contains script/plotly markers) and surfaces each as its own
  HTML data point alongside the regular metrics/rows. Default widget
  type for those points is "html" with a chart-area icon.
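
The detection heuristic can be sketched as a single predicate. The prefixes come from the description above; the exact "contains" markers (`<script` and `plotly.newplot`) are an assumption, and `looksLikeHtmlDocument` is a hypothetical name.

```typescript
const HTML_PREFIXES = ["<!doctype", "<html", "<script", "<svg"];
const HTML_MARKERS = ["<script", "plotly.newplot"]; // assumed marker strings

function looksLikeHtmlDocument(cell: unknown): boolean {
  if (typeof cell !== "string") return false;
  const text = cell.trim().toLowerCase();
  if (text === "") return false;
  // Rule 1: the cell starts like an HTML or SVG document.
  if (HTML_PREFIXES.some(prefix => text.indexOf(prefix) === 0)) return true;
  // Rule 2: the cell embeds a script or a Plotly call somewhere inside.
  return HTML_MARKERS.some(marker => text.indexOf(marker) !== -1);
}
```

A heuristic like this can produce false positives on text columns that merely mention these strings, so it is best limited to cells from visualization operators.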

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Root cause: safeHtmlContent was a getter that ran
sanitizer.bypassSecurityTrustHtml(...) on every change-detection cycle,
returning a fresh SafeHtml each time. Angular saw a "new" srcdoc value
and rebound it; the browser reloaded the iframe; Plotly redrew. Hence
the flicker.

- Cache the SafeHtml in a private field via an @Input setter. We only
  recompute when the underlying htmlContent string actually changes.
- ChangeDetectionStrategy.OnPush on the widget component so CD only
  fires when its input reference actually changes.
- trackBy by widget id on the editor's widget *ngFor so a fresh
  widgets array reference doesn't destroy + recreate every iframe.
- min-height 240px on .html-iframe so a still-loading chart doesn't
  shift layout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Emily Sun and others added 19 commits May 16, 2026 15:11
Previous attempt (cached SafeHtml + OnPush + trackBy) wasn't enough —
the user still saw the Plotly chart appear/disappear/reappear.

Switched to an imperative approach:

- Drop [srcdoc] binding entirely. Get the iframe via @ViewChild and
  set element.srcdoc = htmlContent imperatively. Track the currently-
  applied content; only re-set when the html string actually changes.
  Angular's change detection now has no way to retrigger a srcdoc
  attribute write, no matter how many times the parent's array
  reference flips.

- Memoize widgetStyle() in the editor — keyed on widget id + layout
  signature — so [ngStyle] receives a referentially-stable object
  when nothing has moved. Prevents Angular from re-applying inline
  styles on every CD cycle, which in turn keeps the widget-cell from
  retriggering CSS transitions that could resize the iframe and force
  Plotly to redraw.
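
The memoization idea can be sketched outside Angular as a small cache keyed on widget id and a layout signature. `StyleMemo` is a hypothetical class and uses `JSON.stringify` as the signature, which is an assumption about how the real `widgetStyle()` computes it.

```typescript
interface StyleObject { [prop: string]: string; }

class StyleMemo<L> {
  private cache = new Map<string, { signature: string; style: StyleObject }>();

  constructor(private readonly build: (layout: L) => StyleObject) {}

  get(widgetId: string, layout: L): StyleObject {
    const signature = JSON.stringify(layout);
    const hit = this.cache.get(widgetId);
    // Same id and same layout values: hand back the identical object so a
    // reference check (as [ngStyle] performs) sees no change.
    if (hit && hit.signature === signature) return hit.style;
    const style = this.build(layout);
    this.cache.set(widgetId, { signature, style });
    return style;
  }
}
```

The point is referential stability: two calls with equal layout values return the same object, so nothing downstream is asked to re-apply styles.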

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drop the 12-column grid in favor of pixel-perfect x/y/width/height
layouts. Several problems with the grid:

- Newly-added widgets, dragged in grid units with a re-assigned mouse
  origin on every snap, gave a "glued to the cursor" feel — the widget
  appeared to chase the mouse after Add.
- Snapping made the resize handle feel sluggish.
- CSS transitions on the cell's width/height meant a resize redrew
  Plotly continuously.

New layout:
- WidgetLayout is { x, y, width, height } in pixels.
- nextLayout() places new widgets at x=16, y=(max bottom + 16). They
  always land below the existing stack — never under the cursor.
- Drag captures (mouseStart, widgetStart) on mousedown and applies
  newX = widgetStart + (mouseNow - mouseStart). Standard pattern; no
  origin re-assignment, no jitter.
- Resize handle: same pattern on bottom-right; clamped to MIN 160x120.
- CSS transitions on the cell's geometry removed.
- Edit-mode-only drag/resize via mode check at the top of startMove
  / startResize.
- Persists on every layout update through localStorage.
- Legacy {x,y,w,h} dashboards migrate to pixels on load.
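
The placement, drag, and resize rules above reduce to a few pure functions. This is a sketch with hypothetical helper names (`moveTo`, `resizeTo`) and an assumed default widget size of 320x240; only the 16px gap, the delta-from-start pattern, and the 160x120 minimum come from this commit.

```typescript
interface Point { x: number; y: number; }
interface WidgetLayout { x: number; y: number; width: number; height: number; }
interface GestureStart { mouse: Point; widget: WidgetLayout; } // captured on mousedown

const GAP = 16;
const MIN_WIDTH = 160;
const MIN_HEIGHT = 120;

// New widgets always land below the existing stack.
function nextLayout(existing: WidgetLayout[], width: number = 320, height: number = 240): WidgetLayout {
  const maxBottom = existing.reduce((max, l) => Math.max(max, l.y + l.height), 0);
  return { x: GAP, y: maxBottom + GAP, width, height };
}

// Position = widget start + (mouse now - mouse start); the origin never moves.
function moveTo(start: GestureStart, mouseNow: Point): WidgetLayout {
  return {
    x: start.widget.x + (mouseNow.x - start.mouse.x),
    y: start.widget.y + (mouseNow.y - start.mouse.y),
    width: start.widget.width,
    height: start.widget.height,
  };
}

// Same pattern for the bottom-right handle, clamped to the minimum size.
function resizeTo(start: GestureStart, mouseNow: Point): WidgetLayout {
  return {
    x: start.widget.x,
    y: start.widget.y,
    width: Math.max(MIN_WIDTH, start.widget.width + (mouseNow.x - start.mouse.x)),
    height: Math.max(MIN_HEIGHT, start.widget.height + (mouseNow.y - start.mouse.y)),
  };
}
```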

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Resize handle is now an always-visible diagonal grip in the
  bottom-right corner of every widget when in Edit mode. Drawn with
  two CSS pseudo-element stripes (no icon font). Hover and active
  drag turn it blue.
- Blue selection border on the widget frame no longer appears on
  hover; it only shows on .widget-cell.dragging, which is toggled
  via isWidgetActive(w.id) when the user is actively moving or
  resizing the widget.
- iframe in the HTML widget drops min-height: 240px and gets
  min-height: 0 plus height: 100%, so it shrinks freely with the
  widget. The flex column on .widget already handles the title +
  iframe split, so the iframe fills whatever vertical space the
  user resizes to.
- Min widget size raised to 200x150 per spec.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…L rendering

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…set UI

Preserves work-in-progress changes before switching branches:
agent-service gains a data-source router with format utilities, and the
user-dataset frontend gains UI/styles backed by new dataset service
helpers. Saved so the snippets-quicksteps branch can be resumed cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…report generation

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ent tool

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PubMed (live search) — when the user types a query of 3+ characters in
the Dataset Bank search box, DatasetBankService debounces 400ms then hits
NCBI eSearch + eFetch directly (NCBI sends CORS headers). Each returned
paper appears as a card with title, abstract, authors, journal, year.
Source badge is green "PubMed". Importing a paper sends its PMID to the
backend proxy, which re-fetches via eFetch server-side and emits a 1-row
CSV with columns (pmid, title, abstract, authors, journal, year).

WHO Global Health Observatory — 5 hardcoded seed entries with real GHO
indicator codes:
  - Life Expectancy at Birth      (WHOSIS_000001)
  - HIV Prevalence Adults 15-49   (HIV_0000000001)
  - Tuberculosis Incidence        (MDG_0000000020)
  - Malaria Estimated Deaths      (MALARIA_EST_DEATHS)
  - Under-Five Mortality Rate     (MDG_0000000007)

Source badge is geekblue "WHO". Import fetches the GHO indicator API
(https://ghoapi.azureedge.net/api/<indicator>) server-side and converts
the rows into a (country, year, sex, numeric_value, value) CSV.
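
The row-to-CSV conversion described above might look roughly like the sketch below. The output columns come from this commit message. The input field names (`SpatialDim`, `TimeDim`, `Dim1`, `NumericValue`, `Value`) are an assumption about the GHO payload, and `csvCell` and `ghoRowsToCsv` are hypothetical helper names.

```typescript
// Sketch only. The GhoRow field names are assumed, not taken from this PR.
interface GhoRow {
  SpatialDim?: string;          // assumed: country code
  TimeDim?: number;             // assumed: year
  Dim1?: string;                // assumed: sex dimension
  NumericValue?: number | null; // assumed: numeric value
  Value?: string;               // assumed: display value
}

// Quote a cell only when it contains a comma, quote, or line break.
function csvCell(value: unknown): string {
  const text = value === null || value === undefined ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Emit the (country, year, sex, numeric_value, value) CSV named in the commit.
function ghoRowsToCsv(rows: GhoRow[]): string {
  const header = ["country", "year", "sex", "numeric_value", "value"];
  const lines = rows.map(row =>
    [row.SpatialDim, row.TimeDim, row.Dim1, row.NumericValue, row.Value]
      .map(csvCell)
      .join(","),
  );
  return [header.join(","), ...lines].join("\n");
}
```

Missing fields become empty cells rather than the string "undefined", so sparse indicator rows still produce a rectangular CSV.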

New "Public Health" category chip groups WHO entries (and biomedical
seeds that touch population health) for filtering.

Backend proxy refactor: the existing /api/dataset-bank/import-from-url
now accepts a sourceType discriminator. "url" (default) keeps the
existing fetch-arbitrary-URL behavior. "pubmed" and "who" each fetch
their canonical API server-side, build a CSV, then feed the shared
createDataset → multipart-upload → createDatasetVersion pipeline.
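
As a rough sketch of the `sourceType` dispatch, the function below picks the upstream URL for each source. The request shape (`ImportRequest` with a single `target` field) and the name `resolveFetchUrl` are hypothetical. The WHO URL pattern is the one quoted above, and the eFetch URL is the standard NCBI E-utilities form, assumed rather than copied from this PR.

```typescript
// Sketch only. ImportRequest and resolveFetchUrl are hypothetical names.
type SourceType = "url" | "pubmed" | "who";

interface ImportRequest {
  sourceType?: SourceType; // omitted means "url", matching the stated default
  target: string;          // a URL, a PMID, or a GHO indicator code
}

function resolveFetchUrl(req: ImportRequest): string {
  const sourceType: SourceType = req.sourceType ?? "url";
  switch (sourceType) {
    case "pubmed":
      // Assumed eFetch form; the PR may build this URL differently.
      return "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi" +
        `?db=pubmed&id=${encodeURIComponent(req.target)}&retmode=xml`;
    case "who":
      return `https://ghoapi.azureedge.net/api/${encodeURIComponent(req.target)}`;
    default:
      // "url": fetch the caller-supplied URL unchanged.
      return req.target;
  }
}
```

Whatever the branch, the fetched content would then be turned into a CSV and handed to the same createDataset → multipart-upload → createDatasetVersion pipeline.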

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Angular dev-server proxy was misrouting requests to /api/dataset-bank/*
because the catch-all /api/dataset rule (file-service, port 9092) shares a
common string prefix with /api/dataset-bank and was winning the proxy match
race despite the more-specific rule being declared first.

Avoid the collision by giving the agent-service endpoint a distinct path.
Component/directory names remain `dataset-bank` (it's the user-facing page
identity); only the HTTP path changes:

  proxy.config.json:   "/api/databank" → http://localhost:3001
  agent-service:       new Elysia({ prefix: "/databank" })
  frontend service:    POST /api/databank/import-from-url

A dev-server restart is required when proxy.config.json changes, since
webpack-dev-server does not hot-reload it.
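
The prefix overlap can be illustrated with a plain string check. This is not how the dev-server resolves rules internally; `matchingContexts` is a hypothetical helper that only shows which context strings are prefixes of a request path before and after the rename.

```typescript
// Sketch only: lists every proxy context that is a string prefix of the path.
function matchingContexts(path: string, contexts: string[]): string[] {
  return contexts.filter(context => path.startsWith(context));
}

const contextsBeforeRename = ["/api/dataset-bank", "/api/dataset"];
const contextsAfterRename = ["/api/databank", "/api/dataset"];

// Before: both rules are prefixes of the request, so the match is ambiguous.
const ambiguous = matchingContexts("/api/dataset-bank/import-from-url", contextsBeforeRename);

// After: only the agent-service rule is a prefix of the request.
const unambiguous = matchingContexts("/api/databank/import-from-url", contextsAfterRename);

console.log(ambiguous.length, unambiguous.length);
```

With the renamed path, no ordering or specificity rule is needed to route the request, because only one context can match.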

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Last comment site that still mentioned the pre-rename /api/dataset-bank
path. No behavior change — the actual http.post() call already used the
new path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts:
#	frontend/src/app/app-routing.module.ts
#	frontend/src/app/dashboard/component/dashboard.component.html
#	frontend/src/app/dashboard/component/dashboard.component.ts
#	frontend/src/app/workspace/component/left-panel/operator-menu/operator-menu.component.ts
#	frontend/src/app/workspace/component/menu/menu.component.ts
#	frontend/src/app/workspace/component/workflow-editor/context-menu/context-menu/context-menu.component.ts
# Conflicts:
#	agent-service/src/agent/texera-agent.ts
#	agent-service/src/agent/tools/index.ts
#	agent-service/src/server.ts
#	frontend/proxy.config.json
…ures

# Conflicts:
#	frontend/src/app/app-routing.constant.ts
#	frontend/src/app/app-routing.module.ts
#	frontend/src/app/dashboard/component/dashboard.component.ts
Adds a "Publish as API" button to the workflow editor toolbar that
exposes the last cached execution results as a read-only HTTP
endpoint on agent-service. The endpoint returns the same rows the
Dashboard Visualizer reads (texera.results.* cache) — no real
execution is triggered.

- agent-service: POST /api/published/register and
  POST /api/published/:workflowId/run with X-API-Key validation
  (401 missing / 403 mismatch / 404 unpublished), backed by an
  in-process Map and covered by 5 new tests.
- frontend: PublishApiService gathers the cached snapshot from
  localStorage, generates an API key, registers with the backend,
  and persists the published list under texera.published.v1. The
  dialog shows endpoint URL, masked API key with show/hide + copy,
  and a sample curl command.
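
The key check on the run endpoint could be sketched as below, framework-free. The status codes and the in-process Map come from the commit message. The function names, the response body shape, and the order in which the three checks run are assumptions.

```typescript
// Sketch only. Names, body shapes, and check order are hypothetical.
interface PublishedEntry { apiKey: string; rows: unknown[]; }
interface ApiResponse { status: number; body: unknown; }

// In-process registry: entries are lost when the service restarts.
const published = new Map<string, PublishedEntry>();

function register(workflowId: string, apiKey: string, rows: unknown[]): void {
  published.set(workflowId, { apiKey, rows });
}

function run(workflowId: string, apiKey: string | undefined): ApiResponse {
  if (!apiKey) {
    return { status: 401, body: { error: "missing X-API-Key" } };
  }
  const entry = published.get(workflowId);
  if (!entry) {
    return { status: 404, body: { error: "workflow is not published" } };
  }
  if (entry.apiKey !== apiKey) {
    return { status: 403, body: { error: "API key mismatch" } };
  }
  // Returns the cached snapshot only; no workflow execution is triggered.
  return { status: 200, body: { rows: entry.rows } };
}
```

Since the registry lives in memory, a published workflow would need to be registered again after an agent-service restart.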

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The button previously combined the 🌐 emoji with the text "Publish as
API" and sat inside the execution-buttons nz-space-compact group.
Because [nz-button] is force-sized to 32px in the toolbar, the long
label overflowed and visually crashed into the Run button.

- Strip the text; render only the 🌐 emoji inside the 32px button so
  it matches the other icon-only utility buttons.
- Use the native title attribute for the "Publish as API" tooltip,
  consistent with the rest of the toolbar.
- Lift the button out of nz-space-compact and place it directly under
  #button-groups, where the 16px flex gap separates it from the
  Computing Unit / Share / Kill / Run cluster.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added engine ddl-change Changes to the TexeraDB DDL frontend Changes related to the frontend GUI dev common agent-service labels May 17, 2026