bluebox 🟦

Index the world's undocumented APIs.

Why "Blue Box"? The name comes from the phone-phreaking devices that let tech enthusiasts explore telephone networks in the 1960s and '70s.

You are in the right place if you ...

  • need to scrape data behind UI interactions
  • are dealing with closed APIs
  • want to reverse engineer websites

Tutorial

bluebox-agent-tutorial.mov

Prerequisites

  • Python 3.12+
  • Vectorly API key (required, used by bluebox agent for web data extraction)
    • Sign up at console.vectorly.app
    • macOS/Linux: export VECTORLY_SERVICE_TOKEN="your-key"
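    • Windows (PowerShell): setx VECTORLY_SERVICE_TOKEN "your-key" (setx only affects terminals opened afterwards, so open a new terminal before running the agent; the same applies to the setx commands below)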
    • Or add it to your .env file: VECTORLY_SERVICE_TOKEN=your-key
  • LLM provider API key (required, used by bluebox agent for orchestration)
    • Configure one of the following:
    • OpenAI (default):
      • macOS/Linux: export OPENAI_API_KEY="your-key"
      • Windows (PowerShell): setx OPENAI_API_KEY "your-key"
      • .env: OPENAI_API_KEY=your-key
    • Anthropic:
      • macOS/Linux: export ANTHROPIC_API_KEY="your-key"
      • Windows (PowerShell): setx ANTHROPIC_API_KEY "your-key"
      • .env: ANTHROPIC_API_KEY=your-key
  • uv (optional, for dependency management)
    • macOS/Linux: curl -LsSf https://astral.sh/uv/install.sh | sh
    • Windows (PowerShell): iwr https://astral.sh/uv/install.ps1 -UseBasicParsing | iex
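
Before launching the agent, it can help to confirm that the keys listed above are actually visible to the process. The following is a minimal sketch, not part of bluebox: the variable names come from the list above, while the missing_credentials helper is hypothetical. It reads the process environment only, so values that live solely in a .env file will not be seen unless something has loaded them.

```python
import os

# Variable names are taken from the prerequisites above.
# The helper itself is a hypothetical illustration, not a bluebox API.
REQUIRED = "VECTORLY_SERVICE_TOKEN"
LLM_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")


def missing_credentials(env=None):
    """Return a list of problems; an empty list means the environment looks ready."""
    env = os.environ if env is None else env
    problems = []
    if not env.get(REQUIRED):
        problems.append(f"{REQUIRED} is not set")
    # Only one LLM provider key is needed.
    if not any(env.get(key) for key in LLM_KEYS):
        problems.append("set one of " + " or ".join(LLM_KEYS))
    return problems


if __name__ == "__main__":
    for problem in missing_credentials():
        print("missing:", problem)
```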

Installation

# Clone the repository
git clone https://github.com/VectorlyApp/bluebox.git
cd bluebox

# Create and activate virtual environment
python3 -m venv bluebox-env
source bluebox-env/bin/activate  # On Windows: bluebox-env\Scripts\activate

# Install in editable mode
pip install -e .

# Alternatively, with uv (faster): run these instead of the venv and pip commands above
uv venv bluebox-env
source bluebox-env/bin/activate
uv pip install -e .

Bluebox agent

The bluebox agent is a conversational AI agent that automates web data extraction. It searches the Vectorly web routine index for relevant web APIs, executes matched endpoints in parallel, and falls back to a live AI browser agent when no suitable pre-built routine is available.
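
The search, parallel execution, and fallback flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the agent's actual implementation: search_index, run_routine, and browser_fallback are hypothetical callables, the routine names are made up, and using a thread pool for concurrency is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def run_request(query, search_index, run_routine, browser_fallback):
    """Search for routines, run the matches in parallel, or fall back to a browser agent.

    All three callables are hypothetical stand-ins, not bluebox APIs.
    """
    routines = search_index(query)
    if not routines:
        # No pre-built routine matched: hand the task to the live browser agent.
        return [browser_fallback(query)]
    with ThreadPoolExecutor(max_workers=len(routines)) as pool:
        # pool.map returns results in the same order as `routines`.
        return list(pool.map(run_routine, routines))


# Usage with stub callables and made-up routine names.
results = run_request(
    "rolex 16600 prices",
    search_index=lambda q: ["listing_search", "price_history"],
    run_routine=lambda name: {"routine": name, "ok": True},
    browser_fallback=lambda q: {"routine": "browser", "query": q},
)
```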

Quickstart

# run with OpenAI models
bluebox-agent --model gpt-5.2

# run with Anthropic models
bluebox-agent --model claude-opus-4-5

What it does:

  • Interprets natural language requests and maps them to relevant routines
  • Executes multiple routines concurrently for faster results
  • Falls back to an AI browser agent for tasks without predefined routines
  • Post-processes outputs using Python (CSV, JSON, etc.)
  • Saves generated files to a local workspace
  • Generates reusable context files to replay successful sessions instantly
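
As a rough idea of what the post-processing and file-saving steps might look like, here is a small sketch that writes routine output to JSON and CSV in a workspace directory. The save_outputs helper, the file names, and the assumption that results are a list of flat dictionaries are all hypothetical; the code the agent generates will depend on the request.

```python
import csv
import json
from pathlib import Path


def save_outputs(records, workspace, stem="results"):
    """Write a list of flat dicts to <workspace>/<stem>.json and <workspace>/<stem>.csv."""
    workspace = Path(workspace)
    workspace.mkdir(parents=True, exist_ok=True)

    json_path = workspace / f"{stem}.json"
    json_path.write_text(json.dumps(records, indent=2), encoding="utf-8")

    csv_path = workspace / f"{stem}.csv"
    # Use the union of keys so records with missing fields still fit the header.
    fieldnames = sorted({key for record in records for key in record})
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(records)
    return json_path, csv_path
```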

Example request: "Run a price analysis on Rolex Sea Dweller 16600". The agent selects a matching routine, runs it, and returns structured results.

Context (session replay)

After a successful session, run /generate_context to save a snapshot of what worked: the goal, the routines called (with exact parameters), any Python post-processing code, and output descriptions. Context files are saved to the workspace's context/ directory in both JSON and Markdown formats.

When the agent starts a new session, it automatically loads the most recent context file and injects it into the system prompt. This lets the agent skip trial and error and directly replay the known-good path, adjusting parameters as needed for the new request.
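
Picking up the most recent context file could look something like the sketch below. It assumes that "most recent" means the newest modification time and that context files are the *.json files in the context directory. The latest_context helper is hypothetical, and the structure of the JSON is not specified here, so the function returns whatever the file contains.

```python
import json
from pathlib import Path


def latest_context(context_dir):
    """Return the parsed contents of the newest *.json file in context_dir, or None."""
    candidates = list(Path(context_dir).glob("*.json"))
    if not candidates:
        return None
    # Assumption: recency is judged by file modification time.
    newest = max(candidates, key=lambda path: path.stat().st_mtime)
    return json.loads(newest.read_text(encoding="utf-8"))
```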

You can also load a specific context file explicitly:

bluebox-agent --context-file path/to/agent_context.json

Create your own routines

To learn about the core technology powering bluebox, including how routines are discovered, see routine_discovery.md.

Contributing 🤝

We welcome contributions! Here's how to get started:

  1. Report bugs or request features — Open an issue
  2. Submit code — Fork the repo and open a pull request
  3. Test your code — Add unit tests and make sure all tests pass:
python -m pytest tests/ -v

Please follow existing code style and include tests for new features.