[Bug]: /v1/messages/count_tokens for Vertex AI Claude models fails on missing vertexai (google-cloud-aiplatform) import #28084

@mateo-berri

What happened

POST /v1/messages/count_tokens against a Vertex AI–routed Claude model returns HTTP 500:

{"detail":{"error":"Internal server error: vertexai import failed please run `pip install -U \"google-cloud-aiplatform>=1.38\"`. Got error: No module named 'vertexai'"}}

The proxy is up, and all other Vertex AI routes (/v1/messages, tool calls, streaming, vision, web_search, etc.) work for the same model alias; only count_tokens requires the vertexai SDK module. The same count_tokens call against claude-sonnet-4-6 (anthropic-native) on the same proxy returns {"input_tokens": 9} as expected.

google-cloud-aiplatform is not a LiteLLM core dependency, nor is it in the proxy-dev group or the proxy extra installed by uv sync --frozen --group proxy-dev --extra proxy. A fresh stable-tag install therefore cannot count tokens on Vertex AI Claude unless the operator separately installs the Gemini SDK.
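
To check which case a given environment falls into, something like the following can be run inside the proxy's virtualenv. It is a minimal diagnostic sketch using only the standard library; the helper name `module_available` is made up for illustration and is not part of LiteLLM.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


# `vertexai` is the module named in the error message above; the failing
# count_tokens path needs it, the other Vertex AI Claude routes do not.
print("vertexai importable:", module_available("vertexai"))
```

If this prints `False` in the environment the proxy runs in, the count_tokens call described above would be expected to hit the import error.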

Reproducer

# proxy launched via `uv run litellm --config config.yaml --port 4101`
curl -sS -X POST http://127.0.0.1:4101/v1/messages/count_tokens \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{"model":"claude-sonnet-4-6-vertex","messages":[{"role":"user","content":"hello world"}]}'

Returns the 500 above. The claude-sonnet-4-6-vertex alias routes to Vertex AI's Claude API and works for every other endpoint without google-cloud-aiplatform.
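
For comparing the two aliases side by side, the curl call can be mirrored in Python. This is a rough sketch using only the standard library; the base URL, key, and model aliases are the ones from this report, while the helper names `build_request` and `count_tokens` are hypothetical.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "http://127.0.0.1:4101"  # port from the launch command above
API_KEY = "sk-1234"  # same placeholder key as the curl call


def build_request(model: str) -> urllib.request.Request:
    """Build the same POST the curl reproducer sends, for a given model alias."""
    body = {"model": model, "messages": [{"role": "user", "content": "hello world"}]}
    return urllib.request.Request(
        f"{BASE_URL}/v1/messages/count_tokens",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
            "anthropic-version": "2023-06-01",
        },
        method="POST",
    )


def count_tokens(model: str) -> tuple[int, str]:
    """Send the request and return (HTTP status, response body text)."""
    try:
        with urllib.request.urlopen(build_request(model), timeout=30) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # Non-2xx responses (such as the 500 above) arrive as HTTPError.
        return err.code, err.read().decode("utf-8")
```

Against the proxy described here, `count_tokens("claude-sonnet-4-6-vertex")` should come back with status 500 and the import error, while `count_tokens("claude-sonnet-4-6")` should come back with status 200 and the `input_tokens` payload.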

Expected

Counting tokens for Claude-on-Vertex should not require the Gemini-specific vertexai Python module. Vertex AI's Claude models are addressed via the Anthropic Messages API protocol; the token counter should use the same Anthropic-protocol path the rest of the vertex_ai Claude routing does (or call Vertex's :rawPredict count-tokens equivalent directly via HTTPS, like LiteLLM's other Vertex Claude transformations do).

Either:

  1. Make count_tokens for Claude-on-Vertex avoid `import vertexai` entirely (preferred: keeps deps minimal); or
  2. Add google-cloud-aiplatform to the default proxy extras, and keep a clear error message that points the operator at the install command for environments where it is still absent (worse: pulls in a heavy Gemini SDK for a feature that should be SDK-free).
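
Option 1 could look roughly like the sketch below: decide on the counting path first, and defer the optional import to the branch that actually needs it. This is an illustration only, not LiteLLM's code. The function names, the returned labels, and the substring-based model check are all assumptions made for the example.

```python
import importlib


def is_claude_model(model: str) -> bool:
    """Crude family check for illustration; real routing would use provider metadata."""
    return "claude" in model.lower()


def pick_token_counter(model: str) -> str:
    """Return a label for the counting path to use for a Vertex AI model."""
    if is_claude_model(model):
        # Anthropic-protocol path: plain HTTPS, so no Google SDK import here.
        return "anthropic_protocol"
    # Only the non-Claude path pays for the optional dependency.
    try:
        importlib.import_module("vertexai")
    except ImportError as exc:
        raise ImportError(
            'vertexai import failed; run `pip install -U "google-cloud-aiplatform>=1.38"`'
        ) from exc
    return "vertexai_sdk"
```

With this shape, a Claude alias never reaches the `vertexai` import, so the missing package only affects models that genuinely depend on it.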

Why this matters

Claude Code's headless mode (and any tooling that uses count_tokens for budget display) is silently broken against any LiteLLM proxy that routes Claude to Vertex AI unless the operator knows to install the Gemini SDK.

Relevant LiteLLM version

v1.83.14-stable (also reproduced against current main).

Surfaced by

Daily Claude Code × LiteLLM compatibility-matrix cron (PR BerriAI/litellm-docs#142 — see count_tokens × vertex_ai cell).

Labels: bug, claude code, llm translation
