[Hackathon] feat: AI-augmented macro operators#5115
Open
Xiao-zhen-Liu wants to merge 65 commits into
KNIME-metanode-style composite operators for Texera. Macros live purely
at the logical-plan layer: a new MacroExpander pre-pass inlines each
MacroOpDesc into a flat LogicalPlan before physical-plan compilation, so
PhysicalPlan, PhysicalOpIdentity, and the Amber engine remain unchanged.
Backend (new):
- MacroOpDesc, MacroInputOp, MacroOutputOp LogicalOps registered in
Jackson @JsonSubTypes; getPhysicalPlan throws to signal a missed
expansion pass.
- MacroBody, MacroLink, MacroPortSpec, MacroFusion data classes.
- MacroExpander: inlines each macro by splicing inner ops/links via
boundary markers and prefixes inner-op IDs with the instance ID
(${macroInstanceId}/${innerOpId}), so per-macro telemetry can be
aggregated purely from the operator-ID prefix. Cycle and depth-16
guards via MacroCompileContext. Pluggable MacroRegistry (Empty /
inMemory; persistence-backed impl is a later step).
- WorkflowCompiler (workflow-compiling-service) calls
MacroExpander.expand before scan-source resolution. Backward-
compatible: new ctor param defaults to MacroRegistry.Empty.
- TODO note in amber WorkflowCompiler; execution-time expansion is a
later step. Until then, MacroOpDesc.getPhysicalPlan throwing surfaces
unexpanded plans as a loud compile error rather than silently broken
execution.
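The expansion pass above can be sketched as follows — a minimal TypeScript illustration under heavily simplified shapes (the real implementation is the Scala MacroExpander/MacroCompileContext; boundary-marker rewiring is elided, and every name here is illustrative):

```typescript
// Hypothetical, simplified shapes; the real types are Scala LogicalOps/Links.
interface Op { id: string }
interface Link { fromOpId: string; toOpId: string }
interface Plan { ops: Op[]; links: Link[] }

const MAX_DEPTH = 16; // mirrors the depth-16 guard

// Inner-op IDs are prefixed with the macro instance ID, so per-macro
// telemetry can be aggregated from the operator-ID prefix alone.
const prefixId = (instanceId: string, innerOpId: string): string =>
  `${instanceId}/${innerOpId}`; // Step-1 separator; a later step switches to "--"

function inlineMacro(
  plan: Plan,
  instanceId: string,
  macroId: string,
  body: Plan,
  expanding: Set<string> = new Set(), // macro definitions on the current path
  depth = 0
): Plan {
  if (depth >= MAX_DEPTH) throw new Error("macro nesting too deep");
  if (expanding.has(macroId)) throw new Error(`macro cycle at ${macroId}`);
  // Splice the (prefixed) body in place of the macro op; rewiring external
  // links through the MacroInput/MacroOutput boundary markers is elided.
  return {
    ops: plan.ops
      .filter(op => op.id !== instanceId)
      .concat(body.ops.map(op => ({ id: prefixId(instanceId, op.id) }))),
    links: plan.links.concat(
      body.links.map(l => ({
        fromOpId: prefixId(instanceId, l.fromOpId),
        toOpId: prefixId(instanceId, l.toOpId),
      }))
    ),
  };
}
```

Nested macros simply re-enter `inlineMacro` with the already-prefixed instance ID, which is how the concatenated-prefix behavior in the nested-macro test falls out.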
Tests (14 passing):
- MacroOpDescSpec: JSON round-trip, throws on compile, ports match
inputPortCount/outputPortCount.
- MacroExpanderSpec: pass-through plan, single-port inline, LIVE registry
fetch, nested macros with concatenated prefix, cycle detection,
depth-bomb, double-instantiation, input-marker fan-out, missing-LIVE
error, snapshot immutability across two expansions.
Also includes hackathon-proposal.md (Texera Agent Hackathon submission)
covering the AI suggestion and AI fusion features that layer on top of
this skeleton in later steps.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- sql/updates/23.sql + texera_ddl.sql: workflow_kind_enum, workflow.kind, idx_workflow_kind, macro_metadata. Macros reuse the workflow table to inherit versioning, ACL, and hub features.
- MacroResource: create/list/get/schema/snapshot endpoints alongside WorkflowResource; reuses workflow_user_access for permissions and seeds an initial workflow_version so LIVE-mode instances have a vid to pin.
- WorkflowResource.baseWorkflowSelect: bake in kind = WORKFLOW so macros are structurally excluded from the workflows tab, the hub, and operator search; callers (HubResource, retrieveWorkflowsBySessionUser) updated to .and().
- DbMacroRegistry: jOOQ-backed MacroRegistry that reads workflow.content as a serialized MacroBody; wired into the compiling service's WorkflowCompiler.
- TexeraWebApplication: register MacroResource.
The amber-side execution-time WorkflowCompiler still has the existing TODO(macro-operators) note from Step 1 and is unaffected; that hook is Step 3.
Step 3 closes the TODO at WorkflowCompiler.scala:144 — macros can now be
executed end-to-end, not just compiled by the workflow-compiling-service.
- amber/.../workflow/macroOp/{MacroCompileContext,MacroRegistry,MacroExpander,
DbMacroRegistry}: parallel copies of the compiling-service equivalents,
adapted to amber's LogicalLink/LogicalPlan types. The two macro pipelines
will converge when the broader LogicalPlan unification (existing TODO at
WorkflowCompiler.scala:137) happens.
- WorkflowCompiler: take an optional MacroRegistry (defaults to Empty); call
MacroExpander.expand before resolveScanSourceOpFileName + expandLogicalPlan.
- WorkflowExecutionService, SyncExecutionResource: pass new DbMacroRegistry()
into WorkflowCompiler so LIVE-mode macros resolve against `workflow` rows
with kind=MACRO.
Step 1 (10/10 MacroExpanderSpec, 4/4 MacroOpDescSpec) and amber's
WorkflowCompilerSpec (6/6) still green.
Adds the smallest user-visible hook for macros: select 2+ operators, right-click
→ "create macro", enter a name. Posts a serialized MacroBody (selected
operators + internal links + MacroInput / MacroOutput boundary markers) to the
new POST /api/macro/create endpoint and surfaces the result via a toast.
The canvas selection is intentionally left in place; replacing it with a
MacroOpDesc node (and rewiring boundary links to the new ports) is the next
slice of Step 4, alongside the palette merge and drill-down editor.
- macro.service.ts: HTTP client + boundary computation (one MacroInput per
unique inner port that has an external feeder; mirror for MacroOutput).
- context-menu.{html,ts}: new menu entry, wired with a window.prompt for the
name and NotificationService for the toast. Shown only when 2+ operators
are selected, no link is highlighted, and the workflow is modifiable.
Rubber-banding a chain of operators in JointJS picks up the connecting links too, which made `hasHighlightedLinks()` true and silently hid the menu entry (same reason copy/cut were missing from the user's screenshot). The boundary computation already classifies internal vs external links from the operator selection alone, so highlighted links shouldn't gate the entry.
Loading a MACRO row via the workflow editor route blew up the canvas (workflow-check.ts dereferenced link.source.operatorID; the content is a MacroBody, which has fromOpId/toOpId, not source/target). Fail fast at the REST layer instead, with a message pointing at the not-yet-built macro editor.
After POST /api/macro/create succeeds the context menu now:
1. Drops a `Macro` operator at the centroid of the selection with input/output
ports sized to match the boundary (one input per unique inner port that had
an external feeder, mirror for output).
2. Deletes the original operators (and with them their internal + boundary
links) via deleteOperatorsAndLinks.
3. Re-points each former external link at the new macro's corresponding port.
All three steps are wrapped in a single bundleActions transaction so undo
restores the original sub-DAG in one shot.
MacroService.buildMacroFromSelection now returns the boundary metadata
(per-link rewire instructions + input/output port counts) alongside the
backend request payload — same boundary computation, exposed for the swap.
MacroOpDesc on the canvas uses operatorProperties = { macroId, macroVersion,
linkMode: "LIVE", inputPortCount, outputPortCount, displayName } so the
existing workflow-serialization path can roundtrip it to the backend without
extra glue. macroVersion is a placeholder until MacroDetail exposes the
pinned vid.
Track down why the canvas swap isn't visible: log the captured selection, the built request + boundary metadata, and the swap-vs-throw outcome. Also align the output-port shape with outputPortToPortDescription (disallowMultiInputs: false). Tracing will be removed once the issue is identified.
…ction
MacroOpDesc's generated JSON schema includes `nullable: true` properties
without a sibling `type` (from `Option[MacroBody]` / `Option[MacroFusion]`).
Ajv refuses to compile that, so the swap threw with "nullable cannot be used
without type" before any canvas mutation could happen. Construct the
OperatorPredicate manually instead — every field is already overridden, so the
schema-default path adds nothing. The underlying schema bug should still be
fixed (it'll also break dragging Macro from the palette), but that's a separate
task in workflow-operator; right-click → create macro now works without it.
Step 5 first slice: double-clicking a Macro node now navigates to a new route
that loads the macro's body into the same workflow editor canvas.
- Route: `/dashboard/user/workflow/:id/macro/:macroId` mounts the existing
WorkspaceComponent. The parent wid (`id`) is kept in the URL so future
breadcrumb / back-navigation work has it.
- WorkspaceComponent.registerLoadOperatorMetadata picks up `macroId` from the
route and runs a new `loadMacroWithId` branch instead of the normal workflow
load. Auto-persist is disabled via setWorkflowPersistFlag(false) so canvas
edits don't accidentally hit `/workflow/persist` — saving back to the macro
is the next slice.
- MacroService.macroDetailToWorkflow converts the persisted MacroBody into a
Workflow shape reloadWorkflow can consume: normalizes inner-op / marker port
shapes (PortDescription vs PortIdentity), maps MacroLink port-ordinals to
string portIDs, and auto-lays-out operators with MacroInput on the left,
MacroOutput on the right, regular inner ops in the middle.
- workflow-editor double-click handler now detects `operatorType === "Macro"`
and routes to the drill-down URL instead of opening the result panel.
Read-only-ish in v1 — the editor will let the user move things around but the
changes don't persist. PUT/POST /macro/{wid}/update + the save flow is the
next commit.
…load
The backend's reflective JSON-schema generator emits `{nullable: true}` for
`Option[...]` fields whose inner type it can't enumerate
(`Option[MacroBody]`, `Option[MacroFusion]` on `MacroOpDesc`). Ajv
strict-mode refuses to compile schemas with `nullable` and no `type`, which
threw from everywhere the schema gets compiled — validation-workflow,
property-editor, dynamic-schema, shared-model-change-handler — making the
drill-down editor unusable.
Sanitize once at the source (OperatorMetadataService): walk every operator's
`jsonSchema` and delete `nullable` when there's no sibling `type`. All
downstream Ajv compilations now see well-formed schemas.
The proper backend fix is still tracked in project memory
`project_macroopdesc_schema_ajv_bug.md`; this is defense-in-depth that also
hardens us against any future LogicalOp picking up the same shape.
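The sanitizing walk can be sketched as a small recursive pass over plain JSON values (a hedged sketch — the real code lives in OperatorMetadataService and the function name here is illustrative):

```typescript
// Recursively delete `nullable` wherever there is no sibling `type`,
// so Ajv strict mode accepts the schema. Mutates the schema in place.
function sanitizeNullable(node: unknown): void {
  if (Array.isArray(node)) {
    for (const item of node) sanitizeNullable(item);
    return;
  }
  if (node === null || typeof node !== "object") return;
  const obj = node as Record<string, unknown>;
  // Ajv strict mode rejects `nullable` without a sibling `type`.
  if (obj["nullable"] === true && !("type" in obj)) {
    delete obj["nullable"];
  }
  for (const value of Object.values(obj)) sanitizeNullable(value);
}
```

Run once per operator's `jsonSchema` when the metadata arrives off the wire; every downstream Ajv compile then sees only well-formed nodes.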
Two issues blocking the macro body from rendering:
1. WorkspaceComponent is reused across route changes (no ngOnDestroy fires going /workflow/:id → /workflow/:id/macro/:macroId), so the parent workflow's operators+links stayed on the JointJS paper. reloadWorkflow then hit `failed to add link. cause: duplicate link found with same source and target` in shared-model-change-handler when the macro body's marker links collided with parent leftovers. Fix: call resetAsNewWorkflow() before setNewSharedModel.
2. Macro / MacroInput / MacroOutput had no icon files, so JointJS rendered blank/broken-image boxes (operators technically present but invisible). Stub with copies of PythonUDFV2.png so they at least render; proper icons are a polish task.
…properly
Angular reuses WorkspaceComponent across navigations between /workflow/:id and /workflow/:id/macro/:macroId, so route.snapshot.params is frozen at construction time and the macro drill-down didn't actually re-run its loader when the user double-clicked a macro node — the page only loaded correctly on a hard refresh.
Subscribe to route.paramMap inside registerLoadOperatorMetadata and dispatch on every change (deduplicated by id/macroId key). The workflow branch also re-enables the persist flag, since the macro drill-down disables it.
In-tab Angular router navigation between /workflow/:id and /workflow/:id/macro/:macroId reuses WorkspaceComponent. Despite resetAsNewWorkflow() + setNewSharedModel() + paramMap-driven reload, the YJS shared-model + JointJS paper retain enough cross-route state that the macro body's links are rejected by shared-model-change-handler as duplicates of the parent workflow's links — and the body never finishes rendering. A full page refresh on the same URL works because the component is bootstrapped fresh. Use window.location.href to force that full reload instead. Brief flash, but the macro view renders predictably every time. Tearing down the shared-model lifecycle properly to support SPA navigation is a follow-up.
…fail
Workflows containing a Macro instance failed to compile (no execution
possible) because:
- DbMacroRegistry.fetch read `workflow.content` and called
mapper.readValue(content, classOf[MacroBody]).
- Marker operators (MacroInput / MacroOutput) inside the body had been
serialized with their ports in backend PortIdentity shape
(`{id: {id: 0, internal: false}, displayName: ""}`).
- LogicalOp inherits `inputPorts: List[PortDescription]` from PortDescriptor,
so Jackson tried to parse those entries as PortDescription, choked on the
missing `portID` field, and threw.
- DbMacroRegistry's catch swallowed the exception and returned None, and
MacroExpander threw "not found in registry" — surfacing as a generic
compile failure on the parent workflow with no usable error message.
Two-pronged fix:
1. `@JsonIgnoreProperties(Array("inputPorts", "outputPorts"))` on
MacroInputOp / MacroOutputOp so already-persisted macros keep working —
the marker's port wiring is derived from `portIndex` via operatorInfo
anyway, so ignoring the JSON entries is correct.
2. Frontend marker serialization now emits proper PortDescription shape
(portID/displayName/disallowMultiInputs/isDynamicPort) for newly-created
macros, keeping the wire format consistent with the rest of the system.
The earlier "just so it renders" stub copied PythonUDFV2.png as Macro.png / MacroInput.png / MacroOutput.png, which made macro instances on the canvas indistinguishable from Python UDF ops — exactly the confusion the user just flagged. Generate proper icons (rounded "container" frame + a three-node mini-graph for Macro; left- and right-facing arrows for the markers) in a blue/teal accent that contrasts with the existing Python-yellow. Pure cosmetic, no behavioral change.
* MacroExpander: switch inner-op ID prefix from "/" to "--" so prefixed
IDs survive serialization through GlobalPortIdentitySerde's
VFS-URI path component. Update WorkflowCompiler.visibleOperatorId
and outer-error filter accordingly; add `require(!contains('/'))`
in the serde as a hard guard. All 17 MacroExpanderSpec tests
updated for the new separator and passing.
* WorkflowStatusService: fold inner-op stats keyed by
"${macroInstanceId}--*" into a synthetic entry under
macroInstanceId so the macro node renders state + row counts
during execution on the outer canvas. Worst-case state wins
(Recovering > Pausing > ... > Completed > Uninitialized);
row counts and worker counts are summed. Original prefixed
entries are preserved.
* ValidationWorkflowService: skip AJV schema validation for Macro
operators — the embedded schema references LogicalOp polymorphic
union (via MacroBody.operators) and AJV can't reliably handle it.
Connection validation alone still gates the red/grey state.
* OperatorMetadataService: when sanitizing schemas off the wire,
convert `nullable: true + $ref: X` to `anyOf: [{type: null},
{$ref: X}]` instead of just stripping nullable, so Option[T]
fields serialized as null round-trip cleanly through AJV.
* JointUIService: visually differentiate macro nodes — Macro
instance gets a soft-blue fill and dashed blue border; MacroInput
/ MacroOutput markers get a muted grey, rounded "port pad" look
with their operator-name label suppressed. changeOperatorColor
preserves the macro-specific stroke across validation toggles by
reading operatorType stashed on the JointJS element.
* WorkspaceComponent: pinned banner above the canvas when on
`/workflow/:id/macro/:macroId` so the user can't miss they're
editing a macro body and not the parent workflow.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
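The WorkflowStatusService fold can be sketched as follows — a hedged sketch with a simplified stat shape (the real payload has more fields, and `foldMacroStats` is an illustrative name):

```typescript
// Simplified stat shape; the real per-operator statistics payload differs.
type OpState = "Uninitialized" | "Completed" | "Running" | "Paused" | "Pausing" | "Recovering";
// Ascending severity; worst-case state wins for the synthetic macro entry.
const SEVERITY: OpState[] = ["Uninitialized", "Completed", "Running", "Paused", "Pausing", "Recovering"];

interface OpStats { state: OpState; outputRowCount: number; workerCount: number }

function foldMacroStats(stats: Record<string, OpStats>): Record<string, OpStats> {
  const out: Record<string, OpStats> = { ...stats }; // original prefixed entries preserved
  for (const [opId, s] of Object.entries(stats)) {
    const sep = opId.indexOf("--");
    if (sep < 0) continue; // not an inner op of a macro instance
    const macroInstanceId = opId.slice(0, sep);
    const agg = out[macroInstanceId] ?? {
      state: "Uninitialized" as OpState, outputRowCount: 0, workerCount: 0,
    };
    out[macroInstanceId] = {
      // worst-case state wins
      state: SEVERITY.indexOf(s.state) > SEVERITY.indexOf(agg.state) ? s.state : agg.state,
      outputRowCount: agg.outputRowCount + s.outputRowCount, // row counts summed
      workerCount: agg.workerCount + s.workerCount,          // worker counts summed
    };
  }
  return out;
}
```

Because the first `--` is used as the split point, nested inner ops (`m1--m2--op`) fold under the outermost instance, which is what the outer canvas renders.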
Previously buildMacroFromSelection only created MacroInput/MacroOutput markers for ports that already had external links at macro-creation time. A selection like Filter → Projection where Projection's output wasn't yet connected ended up as a macro with one input port and zero output ports, breaking dataflow equivalence: the user couldn't reach Projection's output through the macro at all.
Replacing a sub-DAG with a macro op is a structural substitution. Every input port on the selection that isn't fed by another selected op is a boundary input regardless of current external connectivity, and symmetrically for outputs. Walk selectedOperatorIDs × op.inputPorts/outputPorts, filter out the internally-wired ones, and synthesize one marker per remaining port. The actual-external-edge rewiring (incomingEdges/outgoingEdges) is unchanged — it just maps a subset of the available macro ports.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
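The structural boundary rule can be sketched as pure data (hypothetical simplified shapes — the real logic lives in MacroService.buildMacroFromSelection):

```typescript
// Hypothetical simplified shapes for illustration only.
interface Port { portID: string }
interface OpDesc { operatorID: string; inputPorts: Port[]; outputPorts: Port[] }
interface SelLink {
  source: { operatorID: string; portID: string };
  target: { operatorID: string; portID: string };
}

// Every input port not fed by another selected op is a boundary input,
// regardless of current external connectivity; symmetrically for outputs.
function boundaryPorts(selected: OpDesc[], links: SelLink[]) {
  const ids = new Set(selected.map(o => o.operatorID));
  const internal = links.filter(
    l => ids.has(l.source.operatorID) && ids.has(l.target.operatorID)
  );
  const fedInternally = new Set(
    internal.map(l => `${l.target.operatorID}:${l.target.portID}`));
  const drainedInternally = new Set(
    internal.map(l => `${l.source.operatorID}:${l.source.portID}`));
  const inputs = selected.flatMap(o =>
    o.inputPorts
      .filter(p => !fedInternally.has(`${o.operatorID}:${p.portID}`))
      .map(p => ({ op: o.operatorID, port: p.portID })));
  const outputs = selected.flatMap(o =>
    o.outputPorts
      .filter(p => !drainedInternally.has(`${o.operatorID}:${p.portID}`))
      .map(p => ({ op: o.operatorID, port: p.portID })));
  return { inputs, outputs };
}
```

Note how the Filter → Projection example from above now yields one input and one output even though Projection's output has no external edge yet.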
…xecution view
Stitches the parent workflow's execution data — both stats (row counts,
state) and result rows — onto each external port of a Macro op, and
makes the drill-down view show the same data per inner op while the
parent is running.
Wire layout (frontend-only; engine stays macro-unaware):
* MacroService now computes per-definition body bindings — each Macro
external port `i` knows the body-relative (innerOp, innerPort) it
routes to via the MacroInput(i) / MacroOutput(i) markers. Cached on
first fetch; preloaded on `getOperatorAddStream` so the map is ready
before execution starts. `getBindingsForInstance(instanceId, macroId)`
lifts the body-relative IDs to runtime form (`${instanceId}--`) so
they match the engine's stat/result keys post-MacroExpander.
* WorkflowEditorComponent.synthesizeMacroOpStats sources per-port row
counts for each Macro on the outer canvas: macro input `i` reads from
the boundary inner op's `inputPortMetrics` at the body-link's target
port; macro output `j` reads from the inner op's `outputPortMetrics`.
Falls through to `withMacroAggregates`-supplied state until bindings
load, then refreshes on the next stats emission.
* WorkflowResultService gains a macro-instance result alias plus a
drill-down prefix. The alias routes `getResultService(macroId)` to
the inner op feeding output port 0, so the result panel shows the
macro's output without forcing the user to drill in. The drill-down
prefix transparently rewrites every result lookup to its runtime
form when the canvas is rendering a body via `?instance=...`.
* WorkflowEditorComponent listens to `route.queryParamMap.instance` —
the macro click-through now appends it to the drill-down URL — and
applies the same `${instanceId}--` prefix to stat lookups so live
parent-execution stats land on the body-relative op IDs the
drill-down canvas displays.
* Port-mapping completeness fix already in 49beec9 is the critical
upstream prerequisite: a Macro op with only an `input-0` port (and
no output port) can't be made to display output stats or results
no matter how the websocket layer is wired.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three coupled execution-path fixes:
* Item 3 — view-result/reuse-result on a macro op now forwards to the
inner boundary ops the macro's external outputs route to. Backend's
`opsToViewResult` is keyed by post-expansion op IDs (the macro op
itself doesn't survive MacroExpander), so executeWorkflowWith… rewrites
macro IDs to `${instanceId}--${innerOpId}` for every output binding
before submitting the plan. Multi-output macros mark all their output
producers; non-macro IDs pass through unchanged. Same rewrite for
`opsToReuseResult`.
* Item 2 — `MacroService.computeBodyBindings` now also collects
`nestedMacros: Map<innerOpId, nestedMacroId>` and
`getBindingsForInstance` walks them recursively, prefixing
`${instanceId}--` at each layer until a terminal non-macro inner op
is reached. Fan-out at any layer is preserved by emitting one
resolved binding per terminal. Bodies of nested macros are eagerly
prefetched when their parent body loads, so the synchronous stat
lookup path finds everything cached.
* Item 1 — macro drill-down click-through switched from
window.location.href to Router.navigate. Full reload was killing
the parent's websocket subscription, so the drill-down view saw no
live execution stats. SPA navigation keeps WorkflowWebsocketService
alive across the route change, and the existing query-param
(?instance=...) handler in WorkflowEditorComponent already maps
body-relative op IDs onto runtime stat keys for the drilled-down
canvas. loadMacroWithId simplified to match loadWorkflowWithId's
pattern (drop the redundant resetAsNewWorkflow — setNewSharedModel +
reloadWorkflow together do a clean transition).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
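The recursive lift in Item 2 can be sketched as follows (hedged: `BodyInfo` / `resolveBindings` are illustrative names for the shapes computed by MacroService.computeBodyBindings / getBindingsForInstance):

```typescript
// A body binding maps a macro's external port to a body-relative
// (innerOp, innerPort); nested macros are walked until a terminal
// non-macro op is reached, prefixing `${instanceId}--` at each layer.
interface Binding { innerOpId: string; innerPort: number }
interface BodyInfo {
  bindings: Map<number, Binding[]>;  // external port -> inner targets
  nestedMacros: Map<string, string>; // innerOpId -> nested macroId
}

function resolveBindings(
  instanceId: string,
  macroId: string,
  bodies: Map<string, BodyInfo>, // assumed prefetched (cached bodies)
  port: number
): Binding[] {
  const body = bodies.get(macroId);
  if (!body) return [];
  const resolved: Binding[] = [];
  for (const b of body.bindings.get(port) ?? []) {
    const runtimeId = `${instanceId}--${b.innerOpId}`;
    const nested = body.nestedMacros.get(b.innerOpId);
    if (nested) {
      // Recurse: one resolved binding per terminal, preserving fan-out.
      resolved.push(...resolveBindings(runtimeId, nested, bodies, b.innerPort));
    } else {
      resolved.push({ innerOpId: runtimeId, innerPort: b.innerPort });
    }
  }
  return resolved;
}
```

The resolved `innerOpId`s match the engine's post-MacroExpander stat/result keys, which is what both the outer-canvas stat synthesis and the view-result rewrite consume.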
* Revert SPA navigation back to hard reload for macro drill-down click-through. SPA-into-WorkspaceComponent-reuse hits a flurry of duplicate-link rejections from interleaved YJS server-replay + local reloadWorkflow that can't be resolved cleanly with the current shared-model lifecycle. Hard reload gives a clean WorkspaceComponent mount with a fresh canvas every time.
* Stash (parentWid, instanceId) into sessionStorage before the hard navigation so the new page can later opt to reconnect to the parent's execution context for live drill-down stats. Wiring the rehydration is a follow-up; the stash itself is harmless if unused.
* Use an anonymous YJS room for the drill-down view. Joining the macro definition's wid-keyed room replays accumulated historical operators the room ever held, fighting reloadWorkflow over the same logical data and producing duplicate-link cascades that destroyed the canvas on every navigation. Anonymous room = clean canvas; collaborative editing of macros via drill-down is deferred until we can do a proper YJS state reset on the server side.
* SharedModelChangeHandler.validateAndRepairNewLink: when a link is duplicated, *skip rendering* it instead of deleting it from the shared model. The pre-fix behavior was eagerly destructive — the canonical link in the shared model got wiped along with the duplicate, leaving the canvas with nothing to render. Truly invalid links (non-existent op/port) still get repaired out of the model.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Right-click a Macro instance → 'Expand macro' inlines its body back onto the parent canvas: deep-clones the body operators with fresh IDs (so re-using the same macro elsewhere doesn't collide), reproduces internal links, rewires every external link that was touching the macro to the matching boundary inner op + port via the body's MacroInput/MacroOutput markers, and finally deletes the macro op. Wrapped in bundleActions so undo collapses to a single step.
v1 supports LIVE-linked macros only (body fetched from DbMacroRegistry). SNAPSHOT mode (embedded body in operatorProperties.snapshot) is a follow-up — same logic, different source. Layout is crude (a 3-column grid anchored at the macro's old position); a proper auto-layout pass is deferred.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
In drill-down view, override the workflow metadata's wid to the parent
workflow's wid before reloading. ComputingUnitSelectionComponent reads
metadata.wid to decide which workflow id to open the execution websocket
against — if it gets the macro definition's wid (278), the drill-down
view subscribes to the macro's execution, not the parent's, and sees no
stats during the parent's actual run. Overriding the wid with the parent's keeps
the websocket on the parent's execution stream, and the existing
${instanceId}-- prefix machinery in WorkflowEditorComponent maps those
keys onto the body-relative op IDs the drill-down canvas displays.
Safe because workflow persistence is disabled in drill-down (the macro
body is saved through MacroResource, not the regular workflow save
endpoint).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Surfaces the user's saved macros under a "Your Macros" section in the
operator palette so they can be reused on other workflows. Loaded once
on component init via MacroService.listMacros(). Each macro renders as
a clickable row with name + (X in / Y out) port-count chip; clicking
builds a fresh OperatorPredicate (Macro-operator-{uuid}, macroId set
from the summary's wid, port counts from portSpec) and places it on the
canvas — same shape as `swapSelectionWithMacroNode` produces from a
selection, so all downstream paths (validation, render, expansion,
execution) see a normal Macro op.
v1 is click-to-add only; true drag-from-palette would require special-
casing the drag-drop service because regular operators go through
WorkflowUtilService.getNewOperatorPredicate(type) which can't fill in
the macro-specific properties. Visual styling matches the dashed-blue
macro treatment on the canvas so palette→canvas reads as one identity.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a "Suggest Macros (AI)" button + inline panel in the operator
palette that surfaces ranked sub-DAG encapsulation candidates without
calling out to an LLM.
v1 heuristic: maximal linear chains where each interior op has exactly
one upstream and one downstream within the chain. Score = chain length
× source/sink penalty (≥2 ops, source-anchored chains discounted to
0.5×, sink-anchored to 0.7×). Top 10 returned. Per-candidate rationale
is derived from the operator-type sequence ("Looks like a reusable
preprocessing block", "Two-step pipeline: Filter → Projection", etc.).
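The v1 scoring rule can be written down directly (chain detection itself is elided; constants come from the description above, and the precedence when a chain is both source- and sink-anchored is an assumption of this sketch):

```typescript
// Illustrative shape; the real suggestion carries more metadata.
interface Chain { ops: string[]; sourceAnchored: boolean; sinkAnchored: boolean }

// Score = chain length x source/sink penalty; chains under 2 ops score 0.
// Assumption: source-anchored takes precedence if a chain is both.
function scoreChain(c: Chain): number {
  if (c.ops.length < 2) return 0;
  let score = c.ops.length;
  if (c.sourceAnchored) score *= 0.5;      // source-anchored discounted
  else if (c.sinkAnchored) score *= 0.7;   // sink-anchored discounted
  return score;
}
```

Candidates are then sorted by score descending and the top 10 returned.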
UX: button shows brief "Analyzing workflow…" affordance (forced 250ms
delay) so the action reads as agent-like rather than instant lookup.
Top suggestion's operators get highlighted on the canvas immediately;
clicking a candidate row highlights+selects so the user can confirm
via right-click → Create Macro. v2 should call ContextMenuComponent's
private `swapSelectionWithMacroNode` flow directly.
LLM swap is one HTTP call away: replace `suggestMacros()` body with a
chat-assistant-service request returning the same `MacroSuggestion[]`
shape — UI and downstream materialize-action paths unchanged.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements the hackathon-proposal §9.2 AI-fusion path: a macro instance can be "fused" into a single PythonUDFOpDescV2 that replaces the entire inlined sub-DAG at compile time, eliminating inter-actor handoffs for the chain.
Frontend (MacroFusionService): template-based codegen — no LLM call. Pulls the macro body via getMacro(wid), walks the inner ops, and emits a syntactically valid PythonUDFOperatorV2 class whose docstring lists the original pipeline. v1 verification is fake-success (sampleSize recorded, real sample-diff is a follow-up). Returns a `MacroFusion` payload the caller attaches to `operatorProperties.fusion`.
Context-menu wiring (ContextMenuComponent.onFuseMacro): right-click a Macro instance → "Fuse for performance (AI)" → generates code, attaches the verified fusion to the macro's properties via setOperatorProperty, and notifies the user with the rationale + estimated speedup.
Backend (MacroExpander, both copies — amber WorkflowCompiler's and the WorkflowCompilingService's): if `m.fusion.exists(_.verified)`, return early from inlineMacro via `substituteFused` instead of fetching + splicing the body. The new PythonUDFOpDescV2 reuses the macro instance ID so parent links stay valid (no rewrite), and inherits the macro's external input/output port shape. All 17 MacroExpanderSpec tests pass.
LLM upgrade path: replace MacroFusionService.synthesizeFromBody() with a call to chat-assistant-service returning the same FusionResult shape. Real sample-diff verification would gate `verified = true` instead of defaulting to true after codegen.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
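The early-return substitution can be sketched as follows (hedged: the real code is Scala inside MacroExpander; the shapes and the `substituteFused` signature here are illustrative):

```typescript
// Illustrative shapes only; the real types are Scala LogicalOps.
interface Fusion { verified: boolean; pythonCode: string }
interface MacroInstance { id: string; type: string; fusion?: Fusion }

// If the instance carries a verified fusion, replace it with a single UDF op
// under the SAME operator id, so parent links need no rewrite. Returning null
// means: fall through to normal fetch-and-splice inlining.
function substituteFused(op: MacroInstance): MacroInstance | null {
  if (!op.fusion?.verified) return null;
  return { id: op.id, type: "PythonUDFV2" }; // reuses the macro instance ID
}
```

Port inheritance (the fused op taking over the macro's external input/output shape) is elided; the key invariant shown is ID reuse.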
macros with regular ops
The original heuristic was treating a Macro→Filter edge as if it
contributed to Filter's in-degree, blocking Filter from being detected
as a chain head. The intent of "ignore macros entirely" is that edges
incident on a macro should NOT count toward any non-macro node's
degree — Filter whose only upstream is a Macro should appear as a
source (in-degree 0) in the filtered subgraph.
Fix `computeDegrees`, `findLinearChains` (adjacency), and
`predIsBranching` to only count edges where BOTH endpoints are
non-macro. Verified end-to-end in Macro_2 workflow: 3 Filter→Projection
pairs surfaced as candidates ("Two-step pipeline: Filter → Projection.
Reusable as a unit." / 2 ops · score 0.7).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
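The fixed degree computation can be sketched directly (shapes illustrative; the same both-endpoints filter applies to `findLinearChains` adjacency and `predIsBranching`):

```typescript
interface Edge { from: string; to: string }

// Only edges whose BOTH endpoints are non-macro count toward a node's
// degree, so a Filter fed only by a Macro has in-degree 0 in the
// filtered subgraph and can be detected as a chain head.
function computeDegrees(
  edges: Edge[],
  isMacro: (opId: string) => boolean
): Map<string, { inDeg: number; outDeg: number }> {
  const deg = new Map<string, { inDeg: number; outDeg: number }>();
  const get = (id: string) => {
    let d = deg.get(id);
    if (!d) { d = { inDeg: 0, outDeg: 0 }; deg.set(id, d); }
    return d;
  };
  for (const e of edges) {
    if (isMacro(e.from) || isMacro(e.to)) continue; // ignore macro-incident edges
    get(e.from).outDeg++;
    get(e.to).inDeg++;
  }
  return deg;
}
```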
Extracted the swap-selection-with-macro-node logic from ContextMenuComponent into MacroService.createMacroFromSelection so the suggestMacros panel can call it inline.
Pre-fix, the materialize action just highlighted the candidate operators and asked the user to right-click → Create Macro; that's two steps for what should be one click. Now clicking a candidate prompts for a name (defaulting to the heuristic's suggestedName) and creates+swaps inline — same end state as the right-click flow, faster demo.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- inferCategory walks each macro's body and assigns one of: preprocessing / transformation / aggregation / visualization, based on the dominant operator-type family among inner ops. Falls back to 'uncategorized' when the body can't be parsed.
- groupedMacroList groups the (filtered) macro list by category in a stable order so the palette renders deterministic sections.
- Categories are cached per-macroId after the first body fetch so we don't re-hit /api/macro/:wid on every render. A 'loading…' bucket shows briefly while the cache fills, then those macros slot into their real category on the next render pass.
- Keeps the palette browsable as users accumulate macros — visually similar to how the built-in operators are grouped (preprocessing, visualization, etc.).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
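The dominant-family vote can be sketched like this (the family mapping below is a tiny illustrative subset — the real mapping covers many more operator types):

```typescript
// Illustrative operator-type -> family table; assumed, not the real mapping.
const FAMILY: Record<string, string> = {
  Filter: "preprocessing",
  Projection: "preprocessing",
  PythonUDFV2: "transformation",
  Aggregate: "aggregation",
  BarChart: "visualization",
};

// Majority vote over the inner ops' families; unknown types don't vote,
// and an empty/unparseable body falls back to 'uncategorized'.
function inferCategory(innerOpTypes: string[]): string {
  const votes = new Map<string, number>();
  for (const t of innerOpTypes) {
    const fam = FAMILY[t];
    if (fam) votes.set(fam, (votes.get(fam) ?? 0) + 1);
  }
  let best = "uncategorized";
  let max = 0;
  for (const [fam, n] of votes) if (n > max) { best = fam; max = n; }
  return best;
}
```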
- Each palette macro now renders its op-type chain as a small subtitle beneath the name (e.g. 'Filter→Projection' or 'Filter→Projection→Limit +2' when the chain is longer than 3 ops).
- Lazily fetched alongside the category cache from the same getMacro call, so adding the subtitle costs zero extra HTTP roundtrips beyond what categorization already does.
- Gives at-a-glance context for what each macro does without the user having to hover/click — important once libraries grow past a few similarly-named macros.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Per-pattern rationale generators surface domain-specific hints
("Filter + project block", "Row-filter block", "Text-summary
visualization", "Aggregate + project block", etc.) rather than the
generic "preprocessing pipeline" pitch.
- Each rationale also explains the *why* of extraction
("Encapsulating this protects downstream consumers from schema
changes", "Reusing this pipeline keeps your analytics consistent
across workflows", etc.) — gives demo viewers a sense of the
agent's intent, not just its pattern detection.
- Adds detection for visualization and join+reshape patterns.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- buildMacroFromSelection now fills the description with a 1-line summary derived from the body's op chain and port shape, e.g. 'Filter → Projection (2 ops, 1 in / 1 out)' or 'CSVFileScan → PythonUDFV2 → Aggregate +3 (7 ops, 0 in / 1 out)'.
- Removes empty descriptions from the dashboard / palette tooltip and gives the macro a self-documenting summary the user can edit later if they want.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
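A minimal sketch of the summary builder, assuming a show-first-3-then-count truncation (the exact truncation rule in the real code may differ from this sketch):

```typescript
// Build the one-line description from op-type chain and port shape.
// Assumption: show at most the first 3 op types, then "+N" for the rest.
function summarize(opTypes: string[], inPorts: number, outPorts: number): string {
  const head = opTypes.slice(0, 3).join(" → ");
  const more = opTypes.length > 3 ? ` +${opTypes.length - 3}` : "";
  return `${head}${more} (${opTypes.length} ops, ${inPorts} in / ${outPorts} out)`;
}
```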
- exportMacroToFile now scans the body content for any nested macroId references and records them in the export payload as dependsOnMacroWids: [wid, ...]. Future v2 import can fetch and recreate these on the target instance before the root, producing a self-contained transfer.
- Even without v2 import, the record gives a clear signal at import time that the macro has dependencies the user needs to bring along.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…transfer
- exportBundleForMacro walks nested macroId references depth-first and packages every reachable definition into a bundleVersion=2 JSON.
- Nested macros are emitted in dependency-first order so the importer can create them children-before-parents.
- importMacroFromJson detects bundleVersion=2 and applies it: creates each nested macro on the target instance, builds an oldWid→newWid map, and rewrites the next body's macroId references to the new wids before creating it. The root is rewritten + created last and its MacroDetail is returned.
- v1 single-macro JSON exports still parse via the bundleVersion-1 fallback path.
- Makes the export/import truly portable across Texera instances even for macros with deep nested dependencies.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
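The dependency-first ordering is a depth-first post-order over the nested-macro references (a hedged sketch with illustrative shapes; the real code walks serialized MacroBody JSON, and the oldWid→newWid rewrite is applied per definition at import time):

```typescript
// Illustrative definition shape: a macro and the wids of macros it nests.
interface MacroDef { wid: number; nestedWids: number[] }

// Depth-first post-order: children are emitted before parents, each wid once,
// so the importer can create nested macros before the definitions that use them.
function dependencyOrder(rootWid: number, defs: Map<number, MacroDef>): number[] {
  const order: number[] = [];
  const seen = new Set<number>();
  const visit = (wid: number): void => {
    if (seen.has(wid)) return;
    seen.add(wid);
    for (const child of defs.get(wid)?.nestedWids ?? []) visit(child);
    order.push(wid); // post-order: after all children
  };
  visit(rootWid);
  return order;
}
```

At import, walking this list in order and recording each created wid into an oldWid→newWid map guarantees every later body's references can be rewritten before creation.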
- 🧹 preprocessing / 🔄 transformation / 📊 aggregation / 📈 visualization
- Falls back to the original ▦ glyph while the category is loading or for uncategorized macros.
- Reuses the existing inferred-category cache, so no additional fetches.
- New purple gradient button above Fuse All. Runs the omni-agent flow:
1. Detect patterns (suggestMacros)
2. Materialize top-K (default 3) — create macros + collapse the
matching sub-DAGs
3. Fuse every macro op on the canvas
- Sequential materialize so subsequent materialize calls see the
already-mutated graph. Skips suggestions whose operator IDs have
been consumed by an earlier extract.
- Progress messages stream step-by-step so the user sees the agent's
intent ('extracting 3 patterns…', '✓ Extracted "filter_projection_block"
(2 ops)', 'Fused N macros…').
- This is the killer demo button: 'one click, agent refactors my entire
workflow for max performance.'
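The sequential materialize step above can be sketched as a loop that tracks consumed operator IDs. This is a sketch under assumed types; the real `MacroSuggestion` shape and the extract callback belong to the suggestion service, not this snippet.

```typescript
// Illustrative suggestion shape: the set of ops a pattern would collapse.
interface MacroSuggestion {
  suggestedName: string;
  operatorIds: string[];
}

// Materialize top-K suggestions one at a time. Extraction mutates the
// graph, so later suggestions must be checked against the ops already
// consumed by earlier extractions and skipped on overlap.
function materializeTopK(
  suggestions: MacroSuggestion[],
  k: number,
  extract: (s: MacroSuggestion) => void
): MacroSuggestion[] {
  const consumed = new Set<string>();
  const applied: MacroSuggestion[] = [];
  for (const s of suggestions) {
    if (applied.length >= k) break;
    if (s.operatorIds.some(id => consumed.has(id))) continue; // overlap: skip
    extract(s); // collapses the sub-DAG; subsequent checks see the new state
    s.operatorIds.forEach(id => consumed.add(id));
    applied.push(s);
  }
  return applied;
}
```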
- Previously: top-K suggestions from the same pattern would each create a SEPARATE macro definition — defeating the reuse story.
- Now: group suggestions by suggestedName, take top-K distinct patterns. For each pattern, create the macro from the FIRST occurrence and swap every other live occurrence into the same definition (via swapSelectionWithExistingMacro). One pattern, one macro definition, N instances.
- Progress messages now report ' (and refactored N other occurrences)' per pattern, so the user sees the reuse multiplier explicitly.
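The grouping step can be sketched as follows; a minimal sketch assuming a scored suggestion shape, not the actual service types.

```typescript
// Illustrative suggestion record: name identifies the pattern, score ranks it.
interface Suggestion {
  suggestedName: string;
  score: number;
}

// Group occurrences by pattern name, rank patterns by their best
// occurrence, and keep the top K distinct patterns. Each returned group
// holds every occurrence of one pattern: the first occurrence becomes the
// macro definition, the rest are swapped to the same definition.
function topKDistinctPatterns(suggestions: Suggestion[], k: number): Suggestion[][] {
  const groups = new Map<string, Suggestion[]>();
  for (const s of suggestions) {
    const g = groups.get(s.suggestedName) ?? [];
    g.push(s);
    groups.set(s.suggestedName, g);
  }
  return [...groups.values()]
    .sort((a, b) => Math.max(...b.map(s => s.score)) - Math.max(...a.map(s => s.score)))
    .slice(0, k);
}
```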
P0 fix for ERR_INSUFFICIENT_RESOURCES on the user's large workflow.
The categoryForMacro / subtitleForMacro features lazily called getMacro(wid) from inside Angular template bindings (via the groupedMacroList getter). Every Angular change-detection cycle re-evaluated the binding while the cache was unfilled, firing a fresh HTTP request per macro per cycle. On a workflow with many user macros this DDoS'd the browser's fetch pool, starving the websocket / compile calls and producing thousands of console errors.
- Strip the lazy getMacro calls; revert categorization + subtitle to no-ops.
- Revert the palette template to a flat filteredMacroList (name + usage chip + ports + export button). Categorization needs to move to the backend MacroSummary response (one round-trip) to be safe.
- Also hide the Auto-optimize / Fuse-all buttons. Auto-optimize was causing the compile API to return 400 on the user's real workflow; per-macro fuse via right-click stays available for testing while the codegen quality is improved.
Two related fixes for navigation issues you reported:
1. Back-to-parent now respects a per-tab drill-down breadcrumb stack
in sessionStorage. Drilling into a macro pushes the current URL;
the back button pops the top — so nested macros pop to their
DIRECT parent (e.g. /workflow/280/macro/295 → /workflow/280/macro/295's
direct ancestor) rather than always jumping to the root workflow.
Click handler uses window.location.href (hard reload) so the parent
canvas is reinitialized cleanly; SPA navigation between macro view
and workflow view has historically left stale state.
2. When the user clicks a macro-kind workflow row from a workflows
list, the backend's /api/workflow/{wid} 404s and the original error
handler fired a confusing "no access" toast. Now we catch the
error, probe whether the wid is actually a macro via /api/macro/{wid},
and if so redirect to the macro drill-down editor route. Otherwise
surface a clearer "couldn't load workflow" message.
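The per-tab breadcrumb stack can be sketched as below. The storage key and helper names are illustrative assumptions; `sessionStorage` provides the per-tab scoping, and the caller assigns the popped URL to `window.location.href` for the hard reload described above.

```typescript
// Hypothetical storage key for the drill-down breadcrumb stack.
const STACK_KEY = "macroDrilldownStack";

// Minimal Storage-like interface so the helpers are testable; in the
// browser, pass window.sessionStorage.
type StackStore = {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
};

// Drilling into a macro pushes the current (parent) URL.
function pushBreadcrumb(store: StackStore, currentUrl: string): void {
  const stack: string[] = JSON.parse(store.getItem(STACK_KEY) ?? "[]");
  stack.push(currentUrl);
  store.setItem(STACK_KEY, JSON.stringify(stack));
}

// Back-to-parent pops the top entry: the DIRECT parent, not the root.
// Falls back to the root workflow URL when the stack is empty.
function popBreadcrumb(store: StackStore, rootUrl: string): string {
  const stack: string[] = JSON.parse(store.getItem(STACK_KEY) ?? "[]");
  const target = stack.pop() ?? rootUrl;
  store.setItem(STACK_KEY, JSON.stringify(stack));
  return target; // caller does window.location.href = target (hard reload)
}
```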
…ema port
P0 instrumentation + bug fix for macro execution silently hanging.
1. RegionExecutionCoordinator.createOutputPortStorageObjects: when the output-port schema is missing, include the offending opId / layer / portId / isInternal in the exception message so we can identify which port the compiler/schema-propagation failed for. Previously the message was just "Schema is missing" with no context.
2. WorkflowExecutionCoordinator.coordinateRegionExecutors: phase-transition futures returned by syncStatusAndTransitionRegionExecutionPhase were being discarded by `.foreach(...)`. Any exception (e.g. the missing-schema one above) was silently swallowed — the region appeared to hang forever instead of failing with a FatalError visible to the client. Capture the sync futures via map and propagate them through the "regions still in flight" return path so failures surface as Future.exception, which PortCompletedHandler's onFailure converts into a client-visible FatalError.
Together these unblock investigation of the real "stuck macro execution" issue — instead of a silent stall, the user now gets a specific error pointing at the failing port.
You were right — the previous "${macroInstanceId}--${innerOpId}" naming
scheme made the expanded LogicalPlan structurally DIFFERENT from a
hand-flattened workflow even when the topology was identical.
Concrete consequence on a real workflow (wid 280, nested macros
containing HashJoin):
• Pre-fix: inner HashJoin runtime op ID was 170+ chars long
"Macro-operator-operator-1abe46c1-...-54df9b954a8e--HashJoin-operator-operator-78eb2818-...-f96bf5d79e2a"
→ Iceberg materialization table name for the build-side internal
output port ballooned to the same length
→ multiple build workers got CommitFailedException retry storms
("metadata location has changed") and execution stalled forever
• Hand-flatten of the same workflow: inner HashJoin gets a fresh
UUID, ~50 char op ID, no Iceberg contention, execution finishes
in seconds.
Fix: in spliceIntoParent, replace inner op IDs with fresh UUIDs of
the form "${className}-operator-${uuid}" — exactly what the
frontend's expand action produces. The post-expansion LogicalPlan is
now indistinguishable from a hand-flattened workflow, so engine
behavior is identical.
Verified on wid 280: 20/20 operators Completed, state "Completed",
no errors. Previously stuck forever in phase-2 transition.
Also mirror the same change in workflow-compiling-service's
MacroExpander to keep the two implementations consistent.
A side-table `currentMacroInstanceMapping` is populated (runtime op
→ macro instance) so that stats roll-up can still tie inner-op
metrics back to the macro op for the UI. Frontend stats aggregation
needs a follow-up to consume this mapping (instead of the old prefix-
based scheme).
…as/drill-down
Two related changes that fix "macro op shows no stats" + "drill-down body
shows nothing on execution":
1. Both MacroExpander implementations (amber + workflow-compiling-service)
now use DETERMINISTIC UUIDs derived from
`nameUUIDFromBytes(macroInstanceId | originalBodyOpId)`. Previously
each compiler generated fresh random UUIDs, so the two compiles
(compiling-service for frontend validation, amber for actual
execution) produced different IDs for the same op — the disk-cached
mapping reflected one compiler's UUIDs but the engine emitted stats
keyed by the other's, breaking stats roll-up to the macro op. Same
workflow → same UUIDs now, regardless of which compiler runs.
2. Frontend stats binding:
- WorkflowStatusService.withMacroAggregates now consults
MacroService.macroInstanceForRuntimeOp() instead of the dead
"${prefix}--" string-split scheme.
- MacroService.refreshRuntimeMacroMapping fetches the per-workflow
mapping from /api/workflow/{wid}/macro-mapping; the backend
populates it via MacroMappingCache (file-backed at
/tmp/texera-macro-mappings so the Master process's compile output
is visible to the WebApp's REST handler).
- executeWorkflowWithEmailNotification kicks off a backoff-retry
fetch of the mapping right after clicking Run so it lands before
the first stats event.
- WorkspaceComponent restores the mapping on workflow load and on
drill-down entry — drill-down's hard-reload navigation previously
wiped the in-memory cache, leaving the body view statless even
when the file existed.
- workflow-editor uses MacroService.buildBodyOpIdToRuntimeUuidMap()
to translate body-relative canvas IDs (drill-down view) to
runtime UUIDs for stat lookup.
- Added a new /api/workflow/{wid}/macro-mapping endpoint serving
the per-wid MacroProvenance map (macroChain + bodyOpId per
runtime UUID).
Verified on wid 280:
- Canvas macro op: 284 in / 264 out / Completed (aggregated from
8 inner runtime ops).
- Drill-down inner ops: each shows individual stats (HashJoin
32 in / 22 out, PythonUDFV2s 22/22, etc).
Nested macro op stat aggregation inside drill-down is the remaining
gap and is tracked as a follow-up.
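The deterministic-UUID derivation in item 1 can be sketched in TypeScript. Java's `UUID.nameUUIDFromBytes` produces an RFC 4122 version-3 (MD5 name-based) UUID; the sketch below mirrors that construction, and the `runtimeOpId` helper follows the `${className}-operator-${uuid}` convention described earlier. Names here are illustrative, not the actual Scala code.

```typescript
import { createHash } from "node:crypto";

// Mirror of Java's UUID.nameUUIDFromBytes: MD5 the name, then stamp the
// version-3 and IETF-variant bits into the digest before formatting.
function nameUUIDFromBytes(name: string): string {
  const md5 = createHash("md5").update(name, "utf8").digest();
  md5[6] = (md5[6] & 0x0f) | 0x30; // version 3 (name-based, MD5)
  md5[8] = (md5[8] & 0x3f) | 0x80; // IETF variant
  const hex = md5.toString("hex");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join("-");
}

// Same (macro instance, body op) pair always yields the same runtime op
// ID, regardless of which compiler performs the expansion.
function runtimeOpId(className: string, macroInstanceId: string, originalBodyOpId: string): string {
  return `${className}-operator-${nameUUIDFromBytes(`${macroInstanceId}|${originalBodyOpId}`)}`;
}
```

Because both compilers hash the same `"${macroInstanceId}|${originalBodyOpId}"` input, the disk-cached mapping from one compile matches the stats keys emitted by the other.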
… drill-down)
A runtime op inside a nested macro contributes to TWO aggregates:
- the outer macro on the parent canvas (chain[0])
- the nested macro inside the outer's drill-down view (chain[1])
withMacroAggregates previously only rolled up to chain[0]. Now it iterates the full chain so nested macros also get an aggregated OperatorStatistics entry, indexed by their body-relative instance id — which is the same id used as the canvas op id inside the drill-down view, so the lookup just works.
Verified on wid 280 drill-down (/macro/295?instance=…1abe46c1): nested macro d3188a84 → 176 in / 176 out / Completed.
withMacroAggregates was summing aggregatedInputRowCount across EVERY
inner op of a macro — which double-counted internal traffic (e.g. for
nested HashJoin → projection → ... chains the count grew to ~5× the
correct value). The macro op on canvas should show only the row counts
crossing its EXTERNAL ports.
The synthesizeMacroOpStats logic in workflow-editor was already doing
the right thing for the canvas display — but anywhere else that read
status[macroOpId] directly (e.g. drill-down nested-macro op stats)
got the wrong number.
Changes:
- Move port-based aggregation into MacroService.synthesizeMacroOpStats
so both renderers share one source of truth.
- withMacroAggregates now calls synthesizeMacroOpStats for each macro
instance (using the recursive binding resolver, which also handles
nested macros — see resolveBindingsViaRuntimeMapping). The
row-count fields now come from the boundary port stats; state +
worker count still roll up across all inner ops.
- Add MacroService.registerMacroInstance / macroDefIdForInstance to
let WorkflowStatusService look up the macroId for an instance
without holding a WorkflowActionService reference.
- Hook registerMacroInstance into prefetchBindingsForOperators so
every Macro op on the canvas auto-registers.
Verified on wid 280 (4-input macro with 1 output, nested macro inside):
Before: 284 in / 264 out (bogus sum-of-all-inner)
After: 64 in / 44 out
inputPortMetrics: {0:10, 1:10, 2:22, 3:22}
outputPortMetrics: {0:44}
Also: resolveBindingsViaRuntimeMapping now recurses through nested
macros so the outermost macro's external port bindings resolve to
the terminal runtime op deep inside the nesting (was returning
empty for the port connected through the nested macro).
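The port-based aggregation can be sketched as below: the macro op's row counts come only from the runtime ops bound to its external ports, never from summing every inner op. The `PortBinding` / `OpStats` shapes are illustrative assumptions, not the actual service types.

```typescript
// An external macro port resolved to the inner runtime op (and port)
// that actually carries its traffic.
interface PortBinding {
  runtimeOpId: string;
  runtimePortId: number;
}

// Per-runtime-op statistics, indexed by port.
interface OpStats {
  inputRowsByPort: number[];
  outputRowsByPort: number[];
}

// Sum only the rows crossing the macro's boundary ports. Summing across
// every inner op would double-count internal traffic between body ops.
function synthesizeMacroOpStats(
  inputBindings: PortBinding[],
  outputBindings: PortBinding[],
  stats: Record<string, OpStats>
): { inputRows: number; outputRows: number } {
  const inputRows = inputBindings.reduce(
    (sum, b) => sum + (stats[b.runtimeOpId]?.inputRowsByPort[b.runtimePortId] ?? 0),
    0
  );
  const outputRows = outputBindings.reduce(
    (sum, b) => sum + (stats[b.runtimeOpId]?.outputRowsByPort[b.runtimePortId] ?? 0),
    0
  );
  return { inputRows, outputRows };
}
```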
resolveBindingsViaRuntimeMapping was requiring `prov.macroChain.length
=== accumulatedChain.length` for terminal matches. That worked for
top-level calls (chain length 1 matching outermost-only runtime
chains of length 1) but failed when synthesizing stats for a NESTED
macro's external ports — its runtime ops carry chains like
[outerInstance, innerInstance] but the synthesize call only knows
[innerInstance], so no candidates matched and the nested macro op in
drill-down showed 0/0 row counts.
Fix: match if `prov.macroChain` ENDS WITH `accumulatedChain`. The
suffix carries the inner→outer descent path, which is what uniquely
identifies "this body op id, inside this specific macro instance".
Verified on wid 280:
- Parent canvas: outer 1abe46c1 → 64 in / 44 out (port {0:10, 1:10, 2:22, 3:22})
- Outer drill-down: nested d3188a84 → 44 in / 44 out (port {0:22, 1:22})
- Nested drill-down: each of 4 body ops shows 44/44 stats
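The suffix-match rule can be sketched as a simple array comparison: a runtime op belongs to a roll-up rooted at some macro level iff its full `macroChain` ends with the chain the synthesize call knows about. A minimal sketch; the provenance record shape is described above.

```typescript
// True iff macroChain ends with accumulatedChain. The suffix carries the
// inner→outer descent path, which uniquely identifies "this body op id,
// inside this specific macro instance".
function chainEndsWith(macroChain: string[], accumulatedChain: string[]): boolean {
  if (accumulatedChain.length > macroChain.length) return false;
  const offset = macroChain.length - accumulatedChain.length;
  return accumulatedChain.every((id, i) => macroChain[offset + i] === id);
}
```

A runtime op with chain `[outerInstance, innerInstance]` matches a nested synthesize call that only knows `[innerInstance]`, which is exactly the case the previous strict length-equality check rejected.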
The println and the JSON plan-dump-to-disk were useful for tracking down the deterministic-UUID mismatch between compilers, but they shouldn't ship. The MacroMappingCache.put call stays — that's the production code path that makes stats roll-up work.
UI/AI surface
- Suggestions panel: replace raw "score X.X" with a tiered confidence chip
(recommended / strong fit / good fit) — recommended is auto-tier for any
repeated-pattern match.
- Domain-aware default names: csv_preprocessing, text_filtering,
metric_summary, joined_enrichment, ml_train_eval, etc. — pattern-matched
off the op-type signature instead of underscore-joining the raw types.
Unified across the AI panel and right-click create-macro.
- Fusion rationale + speedup grounded in a handoff-removal model:
  "N ops -> 1 UDF, K fewer actor handoffs. Estimated 1.6x speedup."
  Replaces the previous "1 + len*0.4" placeholder.
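The handoff-removal model can be sketched as below, assuming the formulation stated in the PR description: fusing N ops removes N − 1 inter-actor handoffs, each removed handoff is credited 0.30×, and the estimate is capped at 4×. The function name is illustrative.

```typescript
// Estimated speedup from fusing `fusedOpCount` body ops into one UDF.
// Model (from the PR description): 1 + 0.30 per removed actor handoff,
// with N ops removing N - 1 handoffs, capped at 4x.
function estimateFusionSpeedup(fusedOpCount: number): number {
  const removedHandoffs = Math.max(0, fusedOpCount - 1);
  return Math.min(1 + 0.3 * removedHandoffs, 4.0);
}
```

Under this model a 3-op fusion yields 1 + 0.3 × 2 = 1.6, matching the "Estimated 1.6x speedup" string above; a single op yields 1 (nothing to fuse).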
Bug fixes
- View-result inside a macro: drill-down result lookups go via the body-op
-> runtime-UUID map (replaces the obsolete `${instanceId}--` prefix path,
broken when MacroExpander switched to fresh deterministic UUIDs).
Re-emits on a new runtime-mapping tick so async fetches don't race.
- Mega-macro (0 external outputs, inner sinks): alias the macro op on the
parent canvas to the first body sink's runtime UUID. Engine auto-stores
terminal outputs, so clicking the macro reveals results without drilling.
- Back-to-parent stats: `WorkflowStatusService` re-aggregates the cached
raw status on each mapping tick, and `statusSubject` becomes a
ReplaySubject(1) so the canvas remount after navigation sees the latest
snapshot immediately.
- Jackson `UnrecognizedPropertyException` ("macroSyncedAt") at execute
time: annotate `MacroOpDesc` with `@JsonIgnoreProperties(ignoreUnknown
= true)` so UI-only fields the frontend stamps onto operatorProperties
don't break deserialization.
Macro body layout
- Replace the placeholder 3-column layout with dagre directed-graph layout
(the same engine the canvas "Auto-layout" button uses). Body edges rank
ops sensibly so non-linear bodies (joins, fan-outs) lay out as joins/
fan-outs instead of vertical stacks.
Same shape of bug as the macroSyncedAt fix on MacroOpDesc: the frontend
stamps `estimatedSpeedup` ("1.6x") onto the fusion payload so the canvas
can render it next to the FUSED badge, but the backend MacroFusion case
class doesn't model that field. Jackson rejects the WorkflowExecuteRequest
at execute time once the fused macro is part of the run.
Annotate `MacroFusion` with `@JsonIgnoreProperties(ignoreUnknown = true)`
so this and any future UI-only convenience fields don't break the round
trip. Backend MacroExpander only ever reads `verified` to decide whether
to substitute the UDF.
Scratch file used to draft the hackathon PR description — not part of the project. Mistakenly committed in the previous change; remove it from the tracked tree and keep it locally for the PR-open step.
PR Description: AI-Augmented Macro Operators
What changes were proposed in this PR?
A user builds a workflow today by dropping individual operators onto the canvas one at a time. As the workflow grows, the canvas turns into a wall of nodes; common sub-DAGs (CSV → Filter → Projection, an enrichment join, a feature-prep chain) get re-built by hand every time. There's no first-class way to encapsulate a sub-DAG, share it, or have the agent suggest one — and once a sub-DAG is repeated, there's no signal pushing the user to refactor.
This PR introduces macro operators: a logical-plan-level abstraction that lets a sub-DAG live as a single node on the canvas, plus the AI surfaces (suggest, fuse, drill-down) that make encapsulation discoverable.
Demo Video
Before / after
(Before/after screenshots: the suggestion panel's ✓ recommended tag; after fusion the canvas shows a PythonUDFOpDescV2 with the ⚡ FUSED · 1.6× badge.)
The story
Drop a few operators on the canvas — CSV scan, two filters, a projection, a sink. The ✨ Suggest Macros (AI) button in the palette already shows a "2" badge before you click — the agent has been silently scanning the workflow on every graph change. Click it: a panel slides in with two candidates, the strongest tagged ✓ recommended because the same Filter → Projection shape appears twice in your workflow. The suggested name is `data_cleaning` (domain-aware, not `filter_projection_block`). Hover a row — the matching ops light up on canvas. Click it: the selected ops collapse into a single macro node, and the same shape elsewhere gets the same swap because the auto-optimize pass spotted the duplicate.

Run the workflow. Stats roll up to the macro node — its input/output port counts and state badge update in real time, summed from the inner ops the engine actually executes. Double-click the macro: the canvas swaps to a drill-down editor showing the body with the same dagre auto-layout as the main canvas, and stats keep flowing because the body-relative op IDs are aliased to their post-expansion runtime UUIDs. Click into a nested macro inside the body — stats keep working three levels deep.
Want it faster? Right-click the macro → fuse for performance (AI). The frontend's codegen walks the body operators (Filter, Projection, Regex, Limit, Distinct, inlined PythonUDFV2 with yield-rewriting), emits a `process_tuple` function, attaches it as a `MacroFusion` payload with `verified = true`, and stamps the operator gold with a ⚡ FUSED · 1.6× badge. The speedup is grounded: N − 1 removed actor handoffs × 0.30× per handoff, capped at 4×, per VLDB 2024 §6's empirical measurements. Run again. At compile time, the backend's `MacroExpander` reads `fusion.verified = true` and substitutes a single `PythonUDFOpDescV2` for the inlined body — no inter-actor serialization for the collapsed steps.

Share a macro? Right-click → export. You get a portable JSON bundle that includes every nested macro the root depends on, so importing on a fresh Texera instance reconstructs the whole dependency graph.
How it works under the hood
Macros live at the logical-plan layer only. A new `MacroExpander` pre-compile pass (mirrored in `amber/` and `workflow-compiling-service/`) inlines every `MacroOpDesc` into its body operators and rewrites parent edges, so the physical-plan layer never sees a macro. The expander runs before `expandLogicalPlan`, and the rest of the engine pipeline behaves as if the workflow had been hand-flattened.

Deterministic UUIDs for inner ops. The expander assigns each inner op a fresh ID via `UUID.nameUUIDFromBytes("${macroInstanceId}|${originalBodyOpId}")`. Required because:
- The earlier `${instanceId}--${innerOp}` prefix scheme produced 170+ char IDs that caused Iceberg commit thrash on HashJoin's internal build-side port — execution that ran fine on a hand-flattened plan hung on the macro-wrapped equivalent.
- Two compilers are involved: `WorkflowCompilingService` (frontend validation) and `ComputingUnitMaster`'s `WorkflowCompiler` (actual execution). Both run MacroExpander on the same workflow content. If they used `UUID.randomUUID()`, the side-table written by one wouldn't match the runtime stats emitted by the other; deterministic UUIDs guarantee bit-identical plans.

Provenance side-table. `MacroExpander` populates `Map[runtimeOpId → MacroProvenance(macroChain, bodyOpId)]` during expansion; `WorkflowCompiler` drains it after compile and stores it in `MacroMappingCache` (file-backed at `/tmp/texera-macro-mappings/wid-{wid}.json` for cross-JVM visibility between `ComputingUnitMaster` and `TexeraWebApplication`). Exposed via `GET /api/workflow/{wid}/macro-mapping`. Frontend `WorkflowStatusService.withMacroAggregates` walks the chain to roll inner-op stats up to every macro level — parent canvas + each nested drill-down.

Nested macros recurse fully. A runtime op buried three macros deep has `macroChain = [outerInstance, middleInstance, innerInstance]`; the resolver suffix-matches so a stats roll-up rooted at any level finds its runtime ops.

AI surfaces. `MacroSuggestionService` runs two heuristic detectors side by side: linear chains (≥2 ops where each interior node has in-deg = 1 and out-deg = 1) and recurring `(opType₁, opType₂, …)` window patterns. Recurring patterns auto-tier as ✓ recommended; clean middle chains tier as strong fit. Names map domain-aware substrings (`csv.*scan.*filter.*projection` → `csv_preprocessing`, `regex.*filter` → `text_filtering`, etc.) instead of underscore-joining op types. `MacroFusionService` emits a Python UDF body from the macro body, covering Filter, Projection, Regex, Limit, Distinct, inlined PythonUDFV2 (yield-rewritten). The `fusion.verified = true` flag is the contract `MacroExpander` reads to substitute; the rest of the speedup estimate is presentation.
What this also fixes along the way
- `/api/macro/*` HTTP storm — lazy fetches in template bindings caused an infinite loop; reverted to a flat palette and removed the lazy fetches.
- Missing-schema failures now surface as contextual errors from `RegionExecutionCoordinator` instead of stalling silently.
- View-result inside a macro goes via `MacroService.buildBodyOpIdToRuntimeUuidMap` (replaces the obsolete prefix-based alias). Mega-macros with 0 external outputs alias the canvas op to the first body sink, so the auto-stored terminal output is reachable without drilling.
- `WorkflowStatusService` re-aggregates the cached raw status on every `runtimeMacroMappingTick`; its emission Subject becomes `ReplaySubject(1)` so the canvas remount after navigation sees the latest snapshot immediately.
- `macroSyncedAt` / `estimatedSpeedup` `UnrecognizedPropertyException` — `MacroOpDesc` and `MacroFusion` both annotated with `@JsonIgnoreProperties(ignoreUnknown = true)` so UI-only convenience fields don't break deserialization at execute time.
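The linear-chain detector that MacroSuggestionService runs can be sketched as follows. This is a minimal sketch over an adjacency-map DAG, not the actual service code: it finds maximal runs of ≥2 ops where every interior link connects an out-degree-1 op to an in-degree-1 op.

```typescript
// Detect maximal linear chains in a DAG given as op-id -> downstream op-ids.
function detectLinearChains(edges: Map<string, string[]>): string[][] {
  const ops = new Set<string>();
  const preds = new Map<string, string[]>();
  edges.forEach((dsts, src) => {
    ops.add(src);
    dsts.forEach(dst => {
      ops.add(dst);
      preds.set(dst, [...(preds.get(dst) ?? []), src]);
    });
  });
  const outs = (op: string) => edges.get(op) ?? [];
  const ins = (op: string) => preds.get(op) ?? [];
  // src -> dst is a chain link when it is src's only outgoing edge and
  // dst's only incoming edge (in-deg = 1, out-deg = 1 across the link).
  const link = (op: string): string | undefined =>
    outs(op).length === 1 && ins(outs(op)[0]).length === 1 ? outs(op)[0] : undefined;
  const chains: string[][] = [];
  ops.forEach(op => {
    const p = ins(op);
    if (p.length === 1 && link(p[0]) === op) return; // continues an earlier chain
    const chain = [op];
    for (let next = link(op); next !== undefined; next = link(next)) chain.push(next);
    if (chain.length >= 2) chains.push(chain);
  });
  return chains;
}
```

A scan → filter → projection → sink pipeline comes back as one 4-op chain, while a diamond (join/fan-out) yields no chains, matching the "clean middle chain" tier described above.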
Related to the Apache Texera Agent Hackathon (#5059). Builds on §9.2 of the macro design doc (AI fusion substitution path).
How was this PR tested?
- `MacroExpanderSpec` (~694 lines) covers the expander on its own: single-macro expansion, nested expansion (outer + inner chains), input fan-out (one external port → multiple inner consumers), output fan-in detection (raises), cycle detection across nested macros, depth-limit guard, the deterministic-UUID property (same input → same output across compiler instances), and provenance side-table population.
- `MacroOpDescSpec` covers the Jackson serialization round-trip, including tolerance of unknown frontend-only fields (`macroSyncedAt`, `estimatedSpeedup`).
- End-to-end: ⚡ FUSED · 1.6× substitution → unfuse → export bundle → reimport on a fresh wid. Stats roll up correctly at every level; canvas remount after navigation no longer wipes non-macro op state.
Generated by
Claude Code (Claude Opus 4.7)