[Bug]: `/v1/messages/count_tokens` for Vertex AI Claude models fails on missing `vertexai` (google-cloud-aiplatform) import #28084

## What happened

`POST /v1/messages/count_tokens` against a Vertex AI–routed Claude model returns HTTP 500:

```json
{"detail":{"error":"Internal server error: vertexai import failed please run `pip install -U \"google-cloud-aiplatform>=1.38\"`. Got error: No module named 'vertexai'"}}
```

The proxy is up, and every *other* Vertex AI route (`/v1/messages`, tool calls, streaming, vision, web_search, etc.) works for the same model alias; only `count_tokens` requires the `vertexai` SDK module. On the same proxy, the same `count_tokens` call against `claude-sonnet-4-6` (Anthropic-native) returns `{"input_tokens": 9}` as expected.

`google-cloud-aiplatform` is not a LiteLLM core dependency, nor is it in the `proxy-dev` group or the `proxy` extra installed by `uv sync --frozen --group proxy-dev --extra proxy`. So a fresh stable-tag install cannot count tokens on Vertex AI Claude unless the operator separately installs the Gemini SDK.
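
To confirm the missing dependency from inside the proxy's environment, a small check like the one below may help. It is a minimal sketch using only the standard library; the helper name `missing_modules` is made up for illustration and is not part of LiteLLM.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment.

    find_spec() locates a top-level module without importing it, so the check
    is cheap even for heavy SDKs.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["vertexai"])
    if missing:
        print("not installed:", ", ".join(missing))
    else:
        print("vertexai is importable in this environment")
```

Saved as, say, `check_vertexai.py` (a hypothetical filename), it can be run with `uv run python check_vertexai.py` so that it sees the same environment the proxy uses.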

## Reproducer

```bash
# proxy launched via `uv run litellm --config config.yaml --port 4101`
curl -sS -X POST http://127.0.0.1:4101/v1/messages/count_tokens \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{"model":"claude-sonnet-4-6-vertex","messages":[{"role":"user","content":"hello world"}]}'
```

Returns the 500 above. The `claude-sonnet-4-6-vertex` alias routes to Vertex AI's Claude API and works for every other endpoint without `google-cloud-aiplatform`.
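
For a side-by-side comparison of the two aliases mentioned in this report, the same request can be scripted. This is a sketch using only the standard library; the helper names `build_count_tokens_request` and `send` are illustrative, and the base URL and key are the local test values from the `curl` command above.

```python
import json
import urllib.error
import urllib.request


def build_count_tokens_request(base_url, api_key, model):
    """Build the same POST request as the curl reproducer, without sending it."""
    body = {"model": model, "messages": [{"role": "user", "content": "hello world"}]}
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/messages/count_tokens",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "anthropic-version": "2023-06-01",
        },
        method="POST",
    )


def send(req):
    """Send the request and return (status, body text), including for HTTP errors."""
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8", "replace")


if __name__ == "__main__":
    for alias in ("claude-sonnet-4-6", "claude-sonnet-4-6-vertex"):
        req = build_count_tokens_request("http://127.0.0.1:4101", "sk-1234", alias)
        try:
            status, text = send(req)
            print(alias, status, text)
        except urllib.error.URLError as err:
            # Proxy not running or unreachable.
            print(alias, "unreachable:", err.reason)
```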

## Expected

Counting tokens for Claude-on-Vertex should not require the Gemini-specific `vertexai` Python module. Vertex AI's Claude models are addressed via the Anthropic Messages API protocol; the token counter should use the same Anthropic-protocol path the rest of the `vertex_ai` Claude routing does (or call Vertex's `:rawPredict` count-tokens equivalent directly via HTTPS, like LiteLLM's other Vertex Claude transformations do).

Either:

1. Make `count_tokens` for Claude-on-Vertex avoid `import vertexai` entirely (preferred: keeps deps minimal); or
2. Add `google-cloud-aiplatform` to the default proxy extras, and keep a clear error message pointing the operator at the install command when it's absent (worse: pulls in a heavy Gemini SDK for a feature that should be SDK-free).
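
Option 1 could look roughly like the sketch below. It is not LiteLLM's actual code: every name here (`is_claude_model`, `count_tokens_vertex`, `load_gemini_counter`) is hypothetical, and matching on the substring `claude` is a stand-in for whatever model detection the router already does. The point is only that the `vertexai` import is deferred to the Gemini branch, so the Claude branch never touches it.

```python
from typing import Callable, Dict, List

Messages = List[Dict[str, object]]
Counter = Callable[[str, Messages], int]


def is_claude_model(model: str) -> bool:
    """Hypothetical heuristic: treat any model name containing 'claude' as Claude."""
    return "claude" in model.lower()


def load_gemini_counter() -> Counter:
    """Import the Vertex SDK lazily, only when a Gemini model needs it."""
    try:
        import vertexai  # noqa: F401  (deferred on purpose)
    except ImportError as exc:
        raise ImportError(
            'Gemini token counting needs the Vertex SDK: '
            'pip install -U "google-cloud-aiplatform>=1.38"'
        ) from exc

    def _count(model: str, messages: Messages) -> int:
        # The SDK-based Gemini counting is out of scope for this sketch.
        raise NotImplementedError

    return _count


def count_tokens_vertex(
    model: str,
    messages: Messages,
    anthropic_counter: Counter,
    gemini_counter_factory: Callable[[], Counter] = load_gemini_counter,
) -> int:
    """Route Claude models to an Anthropic-protocol counter; never import vertexai for them."""
    if is_claude_model(model):
        return anthropic_counter(model, messages)
    return gemini_counter_factory()(model, messages)
```

Here `anthropic_counter` stands for whichever SDK-free path is chosen, for example a direct HTTPS call. Passing the counters in as arguments also lets the dispatch be exercised without credentials or the Vertex SDK installed.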

## Why this matters

Claude Code's headless mode (and any tooling that uses `count_tokens` for budget display) is silently broken against any LiteLLM proxy that routes Claude to Vertex AI unless the operator knows to install the Gemini SDK.

## Relevant LiteLLM version

`v1.83.14-stable` (also reproduced against current `main`).

## Surfaced by

Daily Claude Code × LiteLLM compatibility-matrix cron (PR [BerriAI/litellm-docs#142](https://github.com/BerriAI/litellm-docs/pull/142) — see `count_tokens × vertex_ai` cell).
