
feat(ai-monitoring): Fetch model context size and rename task to fetch_ai_model_info #112656

Open

constantinius wants to merge 5 commits into master from constantinius/feat/tasks/ai-agent-monitoring-fetch-llm-context-size

Conversation

Contributor

@constantinius constantinius commented Apr 10, 2026

Closes https://linear.app/getsentry/issue/TET-2219/sentry-map-llm-context-size-to-relay-cost-calculation-config

Introduces a new AIModelMetadata schema that nests costs under a costs field and adds an optional contextSize. Context size is fetched from OpenRouter's context_length field and models.dev's limit.context field, following the same precedence logic as costs (OpenRouter takes priority).

Architecture

Both tasks run independently on the same cron schedule (every 30 min):

| Task | Cache key | Format |
| --- | --- | --- |
| `fetch_ai_model_costs` (legacy, TODO remove) | `ai-model-costs:v2` | Flat `AIModelCostV2` |
| `fetch_ai_model_metadata` (new) | `ai-model-metadata:v1` | Nested `AIModelMetadata` |

They share helper functions (_normalize_model_id, _create_prefix_glob_model_name, safe_float_conversion) but fetch and cache independently. The old task + cache key will be removed once all consumers have migrated.

GlobalConfig serves both fields side by side:

  • aiModelCosts — legacy flat format (read by Relay today)
  • aiModelMetadata — new nested format with contextSize (not yet consumed by Relay)

New schema (ai-model-metadata:v1)

```json
{
  "version": 1,
  "models": {
    "gpt-4": {
      "costs": {
        "inputPerToken": 0.0000003,
        "outputPerToken": 0.00000165,
        "outputReasoningPerToken": 0.0,
        "inputCachedPerToken": 0.0000015,
        "inputCacheWritePerToken": 0.00001875
      },
      "contextSize": 1000000
    },
    "claude-3-5-sonnet": {
      "costs": {
        "inputPerToken": 0.000003,
        "outputPerToken": 0.000015,
        "outputReasoningPerToken": 0.0,
        "inputCachedPerToken": 0.0000015,
        "inputCacheWritePerToken": 0.00000375
      }
    }
  }
}
```

contextSize is optional — only present when the source API provides it.

New types

| Type | Purpose |
| --- | --- |
| `AIModelCost` | Token cost fields (nested under `costs`) |
| `AIModelMetadata` | Per-model entry: `costs` + optional `contextSize` |
| `AIModelMetadataConfig` | Top-level config: `version` + `models` dict |

Legacy types (AIModelCostV2, AIModelCosts) are kept unchanged for the old task.

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>

…h_ai_model_info

Extend the fetch_ai_model_costs task to also fetch context size (context
window length) for each AI model alongside token costs. Context size is
sourced from OpenRouter's context_length field and models.dev's
limit.context field, following the same precedence logic as costs
(OpenRouter takes priority).

The task is renamed from fetch_ai_model_costs to fetch_ai_model_info
since it now fetches more than just cost data. The AIModelCostV2 type
gains an optional contextSize field (int).

Updated references:
- Task registration name in server.py cron schedule
- Logger metric names in warning messages
- All test imports, method names, and assertions

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
@constantinius constantinius requested review from a team as code owners April 10, 2026 10:01
@linear-code

linear-code bot commented Apr 10, 2026

@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Apr 10, 2026
Comment on lines +20 to +26

```python
class AIModelCostV2(TypedDict, total=False):
    inputPerToken: Required[float]
    outputPerToken: Required[float]
    outputReasoningPerToken: Required[float]
    inputCachedPerToken: Required[float]
    inputCacheWritePerToken: Required[float]
    contextSize: int
```
Contributor Author


Maybe we should update the config version, or use another structure.

Member


Maybe adding a new field in the config would be the best approach, because I can see us wanting to expand this even further for non-cost-related things. I am aware that it complicates this all a bit now, but it will allow us to make quick metadata changes in the future.

That new field could be called LLMModelMetadata and for each model would contain cost and context information for now, with a chance to expand it in the future.

Contributor Author


Added type defs for a new schema.


…acy task

Introduce a new LLMModelMetadata schema with costs nested under a
'costs' field and an optional contextSize. Context size is fetched from
OpenRouter's context_length and models.dev's limit.context fields.

Both tasks run independently on the same cron schedule:
- fetch_ai_model_costs -> writes ai-model-costs:v2 (flat AIModelCostV2)
- fetch_llm_model_metadata -> writes llm-model-metadata:v1 (nested LLMModelMetadata)

They share raw fetch helpers (_fetch_openrouter_raw, _fetch_models_dev_raw)
but format and cache independently. The old task + cache key will be
removed once all consumers have migrated.

GlobalConfig now serves both fields side by side:
- aiModelCosts: legacy flat format (TODO remove)
- llmModelMetadata: new nested format with contextSize

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
@github-actions
Contributor

github-actions bot commented Apr 10, 2026

Backend Test Failures

Failures on 80438bc in this run:

```
tests/sentry/api/endpoints/test_relay_globalconfig_v3.py::test_global_config
[gw0] linux -- Python 3.13.1 /home/runner/work/sentry/sentry/.venv/bin/python3
tests/sentry/api/endpoints/test_relay_globalconfig_v3.py:72: in test_global_config
    assert normalized == config
E   AssertionError: assert {'aiModelCost....0, ...}, ...} == {'aiModelCost.....]}]}}}, ...}
E
E     Omitting 5 identical items, use -vv to show
E     Right contains 1 more item:
E     {'llmModelMetadata': None}
E
E     Full diff:
E       {
E           'aiModelCosts': None,
E     -     'llmModelMetadata': None,
E           'measurements': {
E               'builtinMeasurements': [
E                   {
E                       'name': 'app_start_cold',
E                       'unit': 'millisecond',
E                   },
... (1493 more lines)
```
```
tests/sentry/api/endpoints/test_relay_globalconfig_v3.py::test_global_config_valid_with_generic_filters
[gw0] linux -- Python 3.13.1 /home/runner/work/sentry/sentry/.venv/bin/python3
tests/sentry/api/endpoints/test_relay_globalconfig_v3.py:127: in test_global_config_valid_with_generic_filters
    assert config == normalize_global_config(config)
E   AssertionError: assert {'aiModelCost...ts': 10}, ...} == {'aiModelCost.....]}]}}}, ...}
E
E     Omitting 5 identical items, use -vv to show
E     Left contains 1 more item:
E     {'llmModelMetadata': None}
E
E     Full diff:
E       {
E           'aiModelCosts': None,
E           'filters': {
E               'filters': [
E                   {
E                       'condition': {
E                           'inner': {
E                               'name': 'event.contexts.browser.name',
E                               'op': 'eq',
E                               'value': 'Firefox',
E                           },
E                           'op': 'not',
E                       },
E                       'id': 'test-id',
E                       'isEnabled': True,
E                   },
E               ],
E               'version': 1,
E           },
E     +     'llmModelMetadata': None,
E           'measurements': {
E               'builtinMeasurements': [
E                   {
E                       'name': 'app_start_cold',
E                       'unit': 'millisecond',
E                   },
... (1483 more lines)
```

Relay's normalize_global_config strips unknown fields, causing
test_relay_globalconfig_v3 failures. The new cache is still populated
and readable via llm_model_metadata_config() but should not be added
to Relay's GlobalConfig until Relay supports the field.

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
```diff
     "measurements": get_measurements_config(),
-    "aiModelCosts": ai_model_costs_config(),
+    "aiModelCosts": ai_model_costs_config(),  # TODO: Remove once all consumers use aiModelMetadata
+    "aiModelMetadata": ai_model_metadata_config(),
```
Member


Was this schema already added to Relay?

Contributor Author


There is a PR in Relay for that: getsentry/relay#5814

Member

jjbayer commented Apr 13, 2026

A general problem with config migrations like this is that customer Relays also receive the global config, and they might still expect the old format. So the safer way forward is to leave the old format in place and add an additional key.

@constantinius
Contributor Author

> A general problem with config migrations like this is that customer Relays also receive the global config, and they might still expect the old format. So the safer way forward is to leave the old format in place and add an additional key.

@jjbayer the previous task and config remain in place for a time

…nfig

Rename all LLM prefixes to AI for consistency with existing naming:
- LLMModelCost -> AIModelCost
- LLMModelMetadata -> AIModelMetadata
- LLMModelMetadataConfig -> AIModelMetadataConfig
- llm-model-metadata:v1 -> ai-model-metadata:v1
- fetch_llm_model_metadata -> fetch_ai_model_metadata
- llm_model_metadata_config -> ai_model_metadata_config

Add aiModelMetadata to GlobalConfig alongside aiModelCosts. Relay's
normalize_global_config strips unknown fields, so relay globalconfig
tests pop aiModelMetadata before comparing until Relay adds support.

Restore original fetch_ai_model_costs code untouched from master.
New fetch_ai_model_metadata task appended at the bottom of the file.

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
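The "tests pop aiModelMetadata before comparing" workaround amounts to something like this (hypothetical helper name; the real change lives in test_relay_globalconfig_v3.py):

```python
def strip_fields_unknown_to_relay(config: dict) -> dict:
    # Relay's normalize_global_config drops fields it doesn't recognize,
    # so the round-trip comparison must ignore them until Relay adds
    # support (tracked in getsentry/relay#5814).
    NOT_YET_IN_RELAY = {"aiModelMetadata"}
    return {k: v for k, v in config.items() if k not in NOT_YET_IN_RELAY}
```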
@constantinius constantinius force-pushed the constantinius/feat/tasks/ai-agent-monitoring-fetch-llm-context-size branch from ae0f9b1 to f81a89a Compare April 13, 2026 12:52
Contributor

@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


Reviewed by Cursor Bugbot for commit f81a89a.

```python
    outputPerToken: float
    outputReasoningPerToken: float
    inputCachedPerToken: float
    inputCacheWritePerToken: float
```
Contributor


Identical TypedDict duplicated under a new name

Low Severity

AIModelCost is field-for-field identical to AIModelCostV2. A simple type alias (AIModelCost = AIModelCostV2) or reusing the existing type in AIModelMetadata would avoid the duplication while still allowing the legacy type to be independently removed later.
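A self-contained illustration of the suggested alias (the field shapes here are simplified from the snippets in this PR):

```python
from typing import TypedDict


class AIModelCostV2(TypedDict):
    inputPerToken: float
    outputPerToken: float
    outputReasoningPerToken: float
    inputCachedPerToken: float
    inputCacheWritePerToken: float


# The reviewer's suggestion: alias instead of duplicating the class body,
# so the legacy name can still be removed on its own later.
AIModelCost = AIModelCostV2
```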


Member

jjbayer commented Apr 14, 2026

> A general problem with config migrations like this is that customer Relays also receive the global config, and they might still expect the old format. So the safer way forward is to leave the old format in place and add an additional key.

> @jjbayer the previous task and config remains for a time

OK as long as we double-write in sentry before we deploy the relay change this should be fine. I discussed with @Dav1dde that customer relays should not be a problem because they will just skip this part of normalization once the old config key is gone, which means our own relays will populate the attributes. Since metrics extraction is essentially dead, we no longer require doing this on the edge.

