Add image analysis capabilities to Grok TUI using the Grok Vision API.
# Clone the repo
git clone https://github.com/parkertoddbrooks/grok-tui-vision-mcp.git
cd grok-tui-vision-mcp
# Install dependencies
uv sync
# Copy .env.example to .env and add your API key
cp .env.example .env
# Edit .env: GROK_CHAT_KEY=your-api-keyAdd to ~/.grok/config.toml:
[mcp_servers.vision]
command = "uv"
args = ["run", "--directory", "/absolute/path/to/grok-tui-vision-mcp", "python", "server.py"]Replace /absolute/path/to/grok-tui-vision-mcp with where you cloned the repo.
The server loads the API key from .env in the project directory.
Once configured, ask Grok to analyze images:
- "What's in this image? /path/to/photo.png"
- "Analyze the UI in /screenshots/app.png"
- "Describe /images/diagram.jpg"
The vision__analyze_image tool will be available to the model.
How to verify the MCP server is working correctly:
When the path in config.toml is correct, you'll see the tool called directly:
✔ Other
vision__analyze_image
If the path is wrong, Grok TUI will fall back to manually exploring the codebase and running Python commands directly - you'll see it checking files, verifying dependencies, etc. If this happens, double-check the --directory path in your config.toml.
Analyze an image using Grok Vision API.
Parameters:
image_path(required): Absolute path to image file (PNG, JPG, GIF, WebP)prompt(optional): Custom analysis prompt. Default: "Describe what you see in this image. Be concise but thorough." Use for specific questions like "How many people are in this photo?" or "What text is visible?"save_to_file(optional): Save analysis to a timestamped file. Default: false. To save, ask Grok to "save the analysis to a file" or "analyze and save to file".
Returns: Description of image contents. When save_to_file is enabled, saves to analysis-{name}-{timestamp}.txt in the image directory.
- Python 3.10+
- uv
- API key from console.x.ai
Uses grok-2-vision-latest for image analysis.
