[Hackathon] feat: Multi-Source Data Import — URL, Local File, SQLite, REST API#5119
Open
EmilySun621 wants to merge 3 commits into
Conversation
This bundles the feature work that built up on this branch:
- Custom agents: dashboard CRUD page and editor dialog (48px icon tile,
chip-style guardrails, model selector). Each custom agent now carries a
LiteLLM model_name (Opus 4.7 / Haiku 4.5) that is passed through to the
agent-service so different agents can use different models.
- Conversation history is scoped per (workflowId, agentId): switching
agent or workflow yields a different conversation list. localStorage
key: texera.workflowConversations.v1.{workflowId}.{agentId}.
- Time machine: workflow snapshot list, revert, and agent-tagged
checkpoints. New workflow-history-tool in agent-service backs the
"undo my last change" flow; amber gains a WorkflowSnapshotResource;
sql/updates/23.sql adds the snapshot table.
- Operator-aware custom-agent prompts: the system prompt now injects the
full operator catalog with a "prefer built-in operators over Python
UDFs" rule, sourced from WorkflowSystemMetadata at request time.
- LiteLLM: added the claude-opus-4.7 entry alongside claude-haiku-4.5
and gpt-5-mini in bin/litellm-config.yaml.
- Agent panel rewritten around the (conversation list / chat) two-view
model with subscription-managed list reloads and per-step persistence.
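The per-(workflowId, agentId) scoping described above can be sketched as a key builder. The key format comes from this PR; the function name is illustrative, not the actual implementation:

```typescript
// Hypothetical helper sketching the conversation-history scoping: the
// localStorage key format is from the PR, the function name is made up here.
function conversationStorageKey(workflowId: string, agentId: string): string {
  return `texera.workflowConversations.v1.${workflowId}.${agentId}`;
}

// Switching agent or workflow yields a different key, hence a different list.
const keyA = conversationStorageKey("wf-1", "agent-a");
const keyB = conversationStorageKey("wf-1", "agent-b");
console.log(keyA !== keyB); // true
```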
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…set UI

Preserves in-progress changes before switching branches: agent-service gains a data-source router with format utilities, and the user-dataset frontend gains UI/styles backed by new dataset service helpers. Saved so the snippets-quicksteps branch can be resumed cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ent tool

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
What's New
🔗 URL Import — Paste any CSV/JSON URL on the Datasets page, click Import. Server-side fetch, auto-format detection.
📁 Local File Drop — Drag & drop CSV, JSON, XLSX, TSV, SQLite directly onto the Datasets page.
🗄️ SQLite Import — Drop a .sqlite file → pick tables from a list → each table becomes a dataset. Uses Bun's built-in bun:sqlite, no external dependencies.
⚡ REST API Agent Tool — The fetch_api_data tool lets the AI agent fetch from any API endpoint. Auto-flattens nested JSON to tabular format.

How It Works
Verified:

```shell
$ curl -X POST localhost:3001/api/data-source/fetch-url \
  -H "Content-Type: application/json" \
  -d '{"url":"https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"}'
→ {"rows":150, "columns":["5.1","3.5","1.4","0.2","Iris-setosa"], "format":"csv"} ✅
```

Files Changed
New:
- agent-service/src/api/data-source-api.ts (3 endpoints)
- data-source-tools.ts (agent tool)

Modified:
- user-dataset.component.* (URL input + drop zone)
- dataset.service.ts (fetch methods)
- proxy.config.json
- DatasetSearchQueryBuilder.scala (fix: new datasets now appear in the list immediately)