Skip to content

feat(api): update API spec from langfuse/langfuse 6603be6#1612

Closed
langfuse-bot wants to merge 7 commits intomainfrom
api-spec-bot-6603be6
Closed

feat(api): update API spec from langfuse/langfuse 6603be6#1612
langfuse-bot wants to merge 7 commits intomainfrom
api-spec-bot-6603be6

Conversation

@langfuse-bot
Copy link
Copy Markdown
Collaborator

@langfuse-bot langfuse-bot commented Apr 7, 2026

Disclaimer: Experimental PR review

Greptile Summary

This PR is an auto-generated API spec update (from langfuse/langfuse commit 6603be6) that adds TEXT as a new score data type throughout the Python SDK. It introduces TextScore, TextScoreV1, Score_Text, ScoreV1_Text, GetScoresResponseData_Text, and GetScoresResponseDataText models, and updates both ScoreDataType and ScoreConfigDataType enums with the TEXT value.

  • All new types (TextScore, TextScoreV1) follow the established inheritance pattern from BaseScore/BaseScoreV1.
  • The discriminated union GetScoresResponseData is correctly extended with GetScoresResponseData_Text using dataType="TEXT" as the discriminator key.
  • ScoreDataType.TEXT includes a visit() branch; ScoreConfigDataType.TEXT also adds TEXT (note: CORRECTION remains intentionally absent from config data type, consistent with the pre-existing design).
  • All __init__.py index files and __all__ lists are updated consistently across langfuse/api/, langfuse/api/commons/, and langfuse/api/scores/.
  • Docstrings in ScoreBody, CreateScoreRequest, and the legacy score_v1 clients are updated to mention text score requirements (string value, 1–500 characters).
  • No manually editable code was modified — all changes are in langfuse/api/, which is the auto-generated Fern client zone per CLAUDE.md.

Confidence Score: 5/5

This PR is safe to merge — it is auto-generated from an updated API spec and introduces no manually written logic.

All changes are in the auto-generated Fern client directory. The new TEXT score type follows established patterns exactly (enum values, discriminated unions, model inheritance, init.py exports). No logic regressions, no broken contracts, no missing migrations.

No files require special attention.

Important Files Changed

Filename Overview
langfuse/api/commons/types/text_score.py New TextScore model inheriting BaseScore, adds required string_value field (1–500 chars).
langfuse/api/commons/types/text_score_v1.py New TextScoreV1 model inheriting BaseScoreV1, parallel legacy-API counterpart to TextScore.
langfuse/api/commons/types/score.py Adds Score_Text discriminated-union member; correctly included in Score type alias.
langfuse/api/commons/types/score_v1.py Adds ScoreV1_Text variant; correctly included in ScoreV1 union type alias.
langfuse/api/commons/types/score_data_type.py Adds TEXT enum value and visit() branch to ScoreDataType.
langfuse/api/commons/types/score_config_data_type.py Adds TEXT enum value and visit() branch to ScoreConfigDataType; CORRECTION intentionally absent.
langfuse/api/scores/types/get_scores_response_data.py Adds GetScoresResponseData_Text inline class and extends GetScoresResponseData discriminated union.
langfuse/api/scores/types/get_scores_response_data_text.py New GetScoresResponseDataText class inheriting TextScore with optional trace field.
langfuse/api/ingestion/types/score_body.py Docstring update only — mentions text scores and 1–500 character constraint.
langfuse/api/legacy/score_v1/types/create_score_request.py Docstring update only — mentions text score value requirements.

Class Diagram

%%{init: {'theme': 'neutral'}}%%
classDiagram
    class BaseScore {
        +id: str
        +name: str
        +source: ScoreSource
        +timestamp: datetime
        +created_at: datetime
        +updated_at: datetime
        +metadata: Any
        +environment: str
    }
    class TextScore {
        +string_value: str
    }
    class GetScoresResponseDataText {
        +trace: Optional[GetScoresResponseTraceData]
    }
    class Score_Text {
        +data_type: Literal["TEXT"]
        +string_value: str
    }
    class ScoreV1_Text {
        +data_type: Literal["TEXT"]
        +string_value: str
        +trace_id: str
    }
    class GetScoresResponseData_Text {
        +data_type: Literal["TEXT"]
        +string_value: str
        +trace: Optional[GetScoresResponseTraceData]
    }
    BaseScore <|-- TextScore
    TextScore <|-- GetScoresResponseDataText
    note for Score_Text "Used in discriminated union Score"
    note for ScoreV1_Text "Used in discriminated union ScoreV1"
    note for GetScoresResponseData_Text "Used in GetScoresResponseData union"
Loading

Reviews (1): Last reviewed commit: "feat(api): update API spec from langfuse..." | Re-trigger Greptile

@langfuse-bot langfuse-bot requested a review from wochinge April 7, 2026 14:36
@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 7, 2026

@claude review

wochinge and others added 4 commits April 7, 2026 16:57
Extend string-value overloads in create_score, score_current_span,
score_current_trace, score, and score_trace to accept TEXT alongside
CATEGORICAL. Update all related docstrings. Add ExperimentScoreType
to exclude TEXT from experiments/evals. Add integration test for TEXT
scores.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The spec update is now automated via PR from the langfuse repo.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update cast(Literal["CATEGORICAL"], ...) to include "TEXT" in
score/score_trace/score_current_span/score_current_trace impl bodies.
Add ExperimentScoreType to __all__ in types.py.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issues from the previous review have been addressed: ScoreDataType now includes TEXT, the string-value overloads accept both CATEGORICAL and TEXT, and the experiment layer correctly uses a separate ExperimentScoreType that excludes TEXT. LGTM.

Extended reasoning...

Overview

This PR adds TEXT as a new score data type throughout the Python SDK. Changes span the auto-generated Fern client (langfuse/api/), the high-level SDK types (langfuse/types.py), the client overloads (langfuse/_client/client.py, langfuse/_client/span.py), the experiment layer (langfuse/experiment.py), and a new integration test.

Security risks

No security-sensitive code is touched. The changes are purely additive type-system and data-model updates. No auth, crypto, or permission logic is affected.

Level of scrutiny

The auto-generated Fern client changes in langfuse/api/ are mechanical and follow established patterns exactly. The hand-maintained changes in langfuse/types.py, langfuse/_client/, and langfuse/experiment.py correctly thread TEXT through the type system, and the introduction of ExperimentScoreType cleanly excludes TEXT from the experiment/eval layer (consistent with the documented design intent). The new test in test_core_sdk.py exercises the full round-trip for TEXT scores.

Other factors

My prior review identified two gaps (missing TEXT in ScoreDataType and CATEGORICAL-only overloads) — both are resolved in this revision, and the inline comment was marked resolved. The tenacity dev dependency addition is a minor, appropriate change to support retry logic in the new test.

@wochinge wochinge force-pushed the api-spec-bot-6603be6 branch from 770e388 to 5eec695 Compare April 7, 2026 16:15
Copy link
Copy Markdown

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — previous concerns resolved: and overload signatures now include TEXT, correctly excludes it.

Extended reasoning...

Overview

The PR adds TEXT as a new score data type throughout the Python SDK. It covers: auto-generated Fern client types (, , , , discriminated union updates), public type aliases in , high-level SDK method overloads in and , a new alias that intentionally excludes TEXT, and a new integration test.

Security risks

None. This is a feature addition to score types with no auth, crypto, or permission changes.

Level of scrutiny

Moderate — the PR touches the public SDK API surface but all changes are mechanical and follow established patterns exactly. The auto-generated Fern client files are noise; the meaningful hand-edited changes (types.py, client.py, span.py, experiment.py) are small and correct.

Other factors

My prior review identified that still had the old and the overloads hardcoded . Both gaps are now fixed in this revision. The inline comment was resolved. The only remaining flag is a nit about module-level assignments in the test file, which is a pre-existing anti-pattern that does not affect production code or this PR's correctness.

@wochinge wochinge force-pushed the api-spec-bot-6603be6 branch from 5eec695 to 5e63dcf Compare April 7, 2026 16:50
Copy link
Copy Markdown

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previous concerns have been addressed: now includes TEXT, overload signatures updated, and correctly excludes TEXT for experiments. LGTM.

Extended reasoning...

Overview\nThis PR adds TEXT as a new score data type throughout the Python SDK. Changes span the auto-generated Fern API layer (new TextScore/TextScoreV1 models, Score_Text/ScoreV1_Text discriminated union members, enum updates) plus manually maintained files (langfuse/types.py, _client/client.py, _client/span.py, experiment.py).\n\n### Security risks\nNone. This is a purely additive data-type extension with no auth, crypto, or permission changes.\n\n### Level of scrutiny\nMy previous review flagged two gaps (missing TEXT in langfuse/types.py and hardcoded CATEGORICAL-only overloads). Both have been fixed in this revision: types.py now exports ScoreDataType with TEXT, ExperimentScoreType correctly excludes TEXT (since text scores are not supported in experiments), and all overloads/casts use Literal["CATEGORICAL", "TEXT"]. The fix is correct and complete.\n\n### Other factors\nAll inline comments from prior reviews are marked resolved. The new test_create_text_score test uses tenacity retries consistent with how similar tests are structured. No outstanding concerns remain.

Copy link
Copy Markdown

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — my previous concerns about missing TEXT in the high-level SDK types and overload signatures have been addressed.

Extended reasoning...

Overview

This PR adds TEXT as a new score data type across the Python SDK: auto-generated Fern client files under langfuse/api/ plus manual updates to langfuse/types.py, langfuse/_client/client.py, langfuse/_client/span.py, langfuse/experiment.py, and test files. The new ExperimentScoreType alias correctly excludes TEXT from evals/experiments where it is not supported.

Security risks

None — this is additive type/model plumbing with no auth, crypto, or permission changes.

Level of scrutiny

Moderate. Most changes are auto-generated Fern client boilerplate following established patterns exactly. The manual SDK edits (types, overloads, casts) are small and straightforward. The remaining inline bug about visit() adding a mandatory text parameter is intentional Fern exhaustive-visitor design and affects only low-level API consumers; the code is correct.

Other factors

Both prior blocking comments (missing TEXT in ScoreDataType literal and CATEGORICAL-only overloads) are resolved in this revision. Test coverage for the new TEXT score type is added. The visit() breaking-change concern is a documentation/semver note, not a code defect.

Comment on lines 18 to 26
self,
numeric: typing.Callable[[], T_Result],
boolean: typing.Callable[[], T_Result],
categorical: typing.Callable[[], T_Result],
correction: typing.Callable[[], T_Result],
text: typing.Callable[[], T_Result],
) -> T_Result:
if self is ScoreDataType.NUMERIC:
return numeric()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 The visit() methods on ScoreDataType and ScoreConfigDataType add a new mandatory text parameter without a default value, which will raise TypeError at the call site for any existing user code that calls these methods without the new text argument. Any SDK consumer who uses visit()-based score type dispatch (e.g. score_data_type.visit(numeric=..., boolean=..., categorical=..., correction=...)) will have silently broken code after upgrading to this version — this breaking change should be explicitly called out in a changelog entry or semver bump.

Extended reasoning...

What the bug is and how it manifests

The ScoreDataType.visit() method in langfuse/api/commons/types/score_data_type.py previously accepted four callables: numeric, boolean, categorical, and correction. This PR adds a fifth mandatory parameter text with no default value. Similarly, ScoreConfigDataType.visit() in score_config_data_type.py previously took three callables and now requires a fourth mandatory text parameter. Python raises TypeError at the call site when any required keyword argument is missing, regardless of which enum value self currently holds.

The specific code path that triggers it

Any existing caller that pattern-matches on score types via visit() will fail at runtime:

# Previously valid; now raises TypeError on upgrade:
score_data_type.visit(
    numeric=lambda: "numeric",
    boolean=lambda: "bool",
    categorical=lambda: "cat",
    correction=lambda: "corr",
)
# TypeError: visit() missing 1 required keyword argument: text

Python validates all required arguments before dispatching the call, so even if the ScoreDataType value is NUMERIC and text would never be invoked, the TypeError is raised immediately at the call site.

Addressing the refutation

The refuter is correct that the Fern exhaustive visitor pattern is intentionally designed this way: adding a new enum value is a breaking change for visit() callers, and requiring the new handler prevents silent misbehavior (where TEXT scores would be silently ignored). Adding text=None as a default would indeed be a regression — code that does not handle TEXT scores would silently produce wrong results. So the code itself is correct per Fern design contract.

Why this still deserves attention

While the code behavior is intentional, the practical impact is real: this is a breaking change on the public langfuse.api surface. The methods are exported via langfuse/api/init.py and langfuse/api/commons/init.py. Any user who has written visit()-based dispatch code will encounter a runtime TypeError after upgrading. The PR description says no manually editable code was modified, which is accurate but does not negate the user-visible break.

Impact

SDK users who use the Fern client low-level visit() pattern for exhaustive score-type dispatch will have broken code after upgrading. The high-level Langfuse() client does not call visit() internally, so server-side functionality works correctly. The impact is limited to users of the langfuse.api low-level layer who pattern-match on ScoreDataType or ScoreConfigDataType values.

Step-by-step proof

  1. User has existing code: score.data_type.visit(numeric=..., boolean=..., categorical=..., correction=...)
  2. User upgrades the SDK to this version.
  3. ScoreDataType.visit now requires text as a fifth mandatory keyword argument.
  4. Python raises TypeError: visit() missing 1 required keyword argument: text at the call site before any dispatch logic runs.
  5. The user must add text=lambda: ... to every visit() call site to restore functionality.

How to address

The code change itself is correct per Fern exhaustive visitor design. The appropriate remedy is not to change the generated code, but to ensure this breaking change is documented explicitly in the CHANGELOG as a breaking change requiring a major or minor semver bump, and that a migration note is added instructing users to add text=lambda: ... handlers to all visit() call sites.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants