# LiteLLM Gateway on Vercel

Self-hosted AI gateway using [LiteLLM](https://www.litellm.ai/) deployed on Vercel with [Services](https://vercel.com/docs/services). A Next.js chat frontend talks to a LiteLLM proxy backend — both deployed as a single Vercel project.

[Deploy with Vercel](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fvercel%2Fexamples%2Ftree%2Fmain%2Fpython%2Flitellm-gateway&env=VERCEL_AI_GATEWAY_API_KEY&envDescription=Vercel%20AI%20Gateway%20API%20key%20from%20your%20dashboard&envLink=https%3A%2F%2Fvercel.com%2Fdocs%2Fai-gateway)

## Architecture

Two services deployed together via `experimentalServices` in `vercel.json`:

- **frontend** (Next.js) at `/` — Chat UI using the [AI SDK](https://sdk.vercel.ai)
- **gateway** (LiteLLM/FastAPI) at `/gateway` — OpenAI-compatible proxy that routes to any LLM provider

```
Browser → /api/chat (Next.js) → GATEWAY_URL/v1/chat/completions (LiteLLM) → Vercel AI Gateway → Provider
```

Vercel Services automatically generates a `GATEWAY_URL` environment variable so the frontend can reach the gateway without hardcoded URLs.

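To make the flow concrete, here is a minimal sketch (not code from this repo) of the request the frontend ends up sending to the gateway. The URL and model name are illustrative assumptions; in the real app the base URL comes from `GATEWAY_URL`, and the payload follows the OpenAI chat completions format that LiteLLM exposes under `/v1`:

```typescript
// Illustrative only: in the deployed app, the base URL is read from the
// GATEWAY_URL environment variable that Vercel Services injects.
const gatewayUrl = 'http://localhost:3000/gateway'

// LiteLLM serves an OpenAI-compatible chat completions endpoint under /v1.
const endpoint = `${gatewayUrl}/v1/chat/completions`

const payload = {
  model: 'gpt-4o-mini', // must match a model_name in litellm_config.yaml
  messages: [{ role: 'user', content: 'Hello!' }],
  stream: true,
}

// The real route then does: await fetch(endpoint, { method: 'POST', ... })
console.log(endpoint)
```
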
## Project structure

```
litellm-gateway/
├── gateway/
│   ├── app.py               # LiteLLM proxy entrypoint
│   ├── litellm_config.yaml  # Model + provider config
│   └── pyproject.toml       # Python dependencies
├── frontend/
│   ├── app/
│   │   ├── api/chat/route.ts  # Proxies to LiteLLM via GATEWAY_URL
│   │   ├── layout.tsx
│   │   ├── page.tsx           # Chat UI
│   │   └── globals.css
│   ├── package.json
│   └── next.config.js
└── vercel.json              # Services configuration
```

## Setup

### 1. Configure models

Edit `gateway/litellm_config.yaml` to define your model routing. The default config routes through the [Vercel AI Gateway](https://vercel.com/docs/ai-gateway):

```yaml
model_list:
  - model_name: gpt-4o-mini
    litellm_params:
      model: vercel_ai_gateway/openai/gpt-4o-mini
      api_key: os.environ/VERCEL_AI_GATEWAY_API_KEY
```

You can also route directly to providers (OpenAI, Anthropic, etc.) — see the [LiteLLM provider docs](https://docs.litellm.ai/docs/providers/) and the [config reference](https://docs.litellm.ai/docs/proxy/configs).
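For example, a direct-to-Anthropic entry might look like the following sketch (the `model_name` and model identifier are illustrative; check the provider docs for current model names):

```yaml
model_list:
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20240620
      api_key: os.environ/ANTHROPIC_API_KEY
```
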

### 2. Set environment variables

| Variable                    | Required                 | Description                                                     |
| --------------------------- | ------------------------ | --------------------------------------------------------------- |
| `VERCEL_AI_GATEWAY_API_KEY` | Yes (for default config) | [Vercel AI Gateway](https://vercel.com/docs/ai-gateway) API key |
| `LITELLM_MASTER_KEY`        | No                       | Require auth for LiteLLM proxy endpoints                        |

If routing directly to providers instead, set their API keys (`OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, etc.) and update `litellm_config.yaml`.

### 3. Update the frontend model list

The chat UI has a hardcoded model list in `frontend/app/page.tsx`. Update the `MODELS` array to match your `litellm_config.yaml`.

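As a sketch (the actual shape in `page.tsx` may differ), the array's entries should mirror the `model_name` values from the gateway config:

```typescript
// Illustrative only — keep these in sync with the model_name entries in
// gateway/litellm_config.yaml so the UI only offers models the proxy serves.
const MODELS = ['gpt-4o-mini']

console.log(MODELS)
```
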
### 4. Deploy

Set the project framework to **Services** in your Vercel project settings, then:

```bash
vercel deploy
```

## Local development

Install frontend dependencies:

```bash
cd frontend
npm install
```

Run all services together:

```bash
cd ..
vercel dev -L
```

Open http://localhost:3000 to use the chat UI. The LiteLLM gateway runs at `/gateway` — try `/gateway/health/liveliness` to verify it's up.

## How it works

The Next.js API route at `/api/chat` creates an OpenAI-compatible client pointed at the LiteLLM gateway:

```ts
const litellm = createOpenAI({
  baseURL: `${process.env.GATEWAY_URL}/v1`,
  apiKey: process.env.LITELLM_MASTER_KEY || 'not-needed',
})
```

`GATEWAY_URL` is auto-generated by Vercel Services — no hardcoded URLs needed. The AI SDK's `streamText` function handles streaming the LLM response back to the browser.

## Tuning

The gateway service is configured with `maxDuration: 120` (seconds) in `vercel.json`. You can also set `memory` (128–10240 MB) if your config requires more resources. See the [Services docs](https://vercel.com/docs/services) for all options.
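As a rough sketch only — the `experimentalServices` schema is experimental and the exact structure and field nesting may differ from what is shown here; the `maxDuration` and `memory` names come from the options mentioned above, and the `memory` value is an arbitrary example:

```json
{
  "experimentalServices": [
    {
      "name": "gateway",
      "maxDuration": 120,
      "memory": 1024
    }
  ]
}
```
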