feat(api): update API spec from langfuse/langfuse 6603be6#1612
feat(api): update API spec from langfuse/langfuse 6603be6#1612langfuse-bot wants to merge 7 commits intomainfrom
Conversation
|
@claude review |
Extend string-value overloads in create_score, score_current_span, score_current_trace, score, and score_trace to accept TEXT alongside CATEGORICAL. Update all related docstrings. Add ExperimentScoreType to exclude TEXT from experiments/evals. Add integration test for TEXT scores. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The spec update is now automated via PR from the langfuse repo. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update cast(Literal["CATEGORICAL"], ...) to include "TEXT" in score/score_trace/score_current_span/score_current_trace impl bodies. Add ExperimentScoreType to __all__ in types.py. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
The issues from the previous review have been addressed: ScoreDataType now includes TEXT, the string-value overloads accept both CATEGORICAL and TEXT, and the experiment layer correctly uses a separate ExperimentScoreType that excludes TEXT. LGTM.
Extended reasoning...
Overview
This PR adds TEXT as a new score data type throughout the Python SDK. Changes span the auto-generated Fern client (langfuse/api/), the high-level SDK types (langfuse/types.py), the client overloads (langfuse/_client/client.py, langfuse/_client/span.py), the experiment layer (langfuse/experiment.py), and a new integration test.
Security risks
No security-sensitive code is touched. The changes are purely additive type-system and data-model updates. No auth, crypto, or permission logic is affected.
Level of scrutiny
The auto-generated Fern client changes in langfuse/api/ are mechanical and follow established patterns exactly. The hand-maintained changes in langfuse/types.py, langfuse/_client/, and langfuse/experiment.py correctly thread TEXT through the type system, and the introduction of ExperimentScoreType cleanly excludes TEXT from the experiment/eval layer (consistent with the documented design intent). The new test in test_core_sdk.py exercises the full round-trip for TEXT scores.
Other factors
My prior review identified two gaps (missing TEXT in ScoreDataType and CATEGORICAL-only overloads) — both are resolved in this revision, and the inline comment was marked resolved. The tenacity dev dependency addition is a minor, appropriate change to support retry logic in the new test.
770e388 to
5eec695
Compare
There was a problem hiding this comment.
LGTM — previous concerns resolved: and overload signatures now include TEXT, correctly excludes it.
Extended reasoning...
Overview
The PR adds TEXT as a new score data type throughout the Python SDK. It covers: auto-generated Fern client types (, , , , discriminated union updates), public type aliases in , high-level SDK method overloads in and , a new alias that intentionally excludes TEXT, and a new integration test.
Security risks
None. This is a feature addition to score types with no auth, crypto, or permission changes.
Level of scrutiny
Moderate — the PR touches the public SDK API surface but all changes are mechanical and follow established patterns exactly. The auto-generated Fern client files are noise; the meaningful hand-edited changes (types.py, client.py, span.py, experiment.py) are small and correct.
Other factors
My prior review identified that still had the old and the overloads hardcoded . Both gaps are now fixed in this revision. The inline comment was resolved. The only remaining flag is a nit about module-level assignments in the test file, which is a pre-existing anti-pattern that does not affect production code or this PR's correctness.
5eec695 to
5e63dcf
Compare
There was a problem hiding this comment.
Previous concerns have been addressed: now includes TEXT, overload signatures updated, and correctly excludes TEXT for experiments. LGTM.
Extended reasoning...
Overview\nThis PR adds TEXT as a new score data type throughout the Python SDK. Changes span the auto-generated Fern API layer (new TextScore/TextScoreV1 models, Score_Text/ScoreV1_Text discriminated union members, enum updates) plus manually maintained files (langfuse/types.py, _client/client.py, _client/span.py, experiment.py).\n\n### Security risks\nNone. This is a purely additive data-type extension with no auth, crypto, or permission changes.\n\n### Level of scrutiny\nMy previous review flagged two gaps (missing TEXT in langfuse/types.py and hardcoded CATEGORICAL-only overloads). Both have been fixed in this revision: types.py now exports ScoreDataType with TEXT, ExperimentScoreType correctly excludes TEXT (since text scores are not supported in experiments), and all overloads/casts use Literal["CATEGORICAL", "TEXT"]. The fix is correct and complete.\n\n### Other factors\nAll inline comments from prior reviews are marked resolved. The new test_create_text_score test uses tenacity retries consistent with how similar tests are structured. No outstanding concerns remain.
There was a problem hiding this comment.
LGTM — my previous concerns about missing TEXT in the high-level SDK types and overload signatures have been addressed.
Extended reasoning...
Overview
This PR adds TEXT as a new score data type across the Python SDK: auto-generated Fern client files under langfuse/api/ plus manual updates to langfuse/types.py, langfuse/_client/client.py, langfuse/_client/span.py, langfuse/experiment.py, and test files. The new ExperimentScoreType alias correctly excludes TEXT from evals/experiments where it is not supported.
Security risks
None — this is additive type/model plumbing with no auth, crypto, or permission changes.
Level of scrutiny
Moderate. Most changes are auto-generated Fern client boilerplate following established patterns exactly. The manual SDK edits (types, overloads, casts) are small and straightforward. The remaining inline bug about visit() adding a mandatory text parameter is intentional Fern exhaustive-visitor design and affects only low-level API consumers; the code is correct.
Other factors
Both prior blocking comments (missing TEXT in ScoreDataType literal and CATEGORICAL-only overloads) are resolved in this revision. Test coverage for the new TEXT score type is added. The visit() breaking-change concern is a documentation/semver note, not a code defect.
| self, | ||
| numeric: typing.Callable[[], T_Result], | ||
| boolean: typing.Callable[[], T_Result], | ||
| categorical: typing.Callable[[], T_Result], | ||
| correction: typing.Callable[[], T_Result], | ||
| text: typing.Callable[[], T_Result], | ||
| ) -> T_Result: | ||
| if self is ScoreDataType.NUMERIC: | ||
| return numeric() |
There was a problem hiding this comment.
🔴 The visit() methods on ScoreDataType and ScoreConfigDataType add a new mandatory text parameter without a default value, which will raise TypeError at the call site for any existing user code that calls these methods without the new text argument. Any SDK consumer who uses visit()-based score type dispatch (e.g. score_data_type.visit(numeric=..., boolean=..., categorical=..., correction=...)) will have silently broken code after upgrading to this version — this breaking change should be explicitly called out in a changelog entry or semver bump.
Extended reasoning...
What the bug is and how it manifests
The ScoreDataType.visit() method in langfuse/api/commons/types/score_data_type.py previously accepted four callables: numeric, boolean, categorical, and correction. This PR adds a fifth mandatory parameter text with no default value. Similarly, ScoreConfigDataType.visit() in score_config_data_type.py previously took three callables and now requires a fourth mandatory text parameter. Python raises TypeError at the call site when any required keyword argument is missing, regardless of which enum value self currently holds.
The specific code path that triggers it
Any existing caller that pattern-matches on score types via visit() will fail at runtime:
# Previously valid; now raises TypeError on upgrade:
score_data_type.visit(
numeric=lambda: "numeric",
boolean=lambda: "bool",
categorical=lambda: "cat",
correction=lambda: "corr",
)
# TypeError: visit() missing 1 required keyword argument: textPython validates all required arguments before dispatching the call, so even if the ScoreDataType value is NUMERIC and text would never be invoked, the TypeError is raised immediately at the call site.
Addressing the refutation
The refuter is correct that the Fern exhaustive visitor pattern is intentionally designed this way: adding a new enum value is a breaking change for visit() callers, and requiring the new handler prevents silent misbehavior (where TEXT scores would be silently ignored). Adding text=None as a default would indeed be a regression — code that does not handle TEXT scores would silently produce wrong results. So the code itself is correct per Fern design contract.
Why this still deserves attention
While the code behavior is intentional, the practical impact is real: this is a breaking change on the public langfuse.api surface. The methods are exported via langfuse/api/init.py and langfuse/api/commons/init.py. Any user who has written visit()-based dispatch code will encounter a runtime TypeError after upgrading. The PR description says no manually editable code was modified, which is accurate but does not negate the user-visible break.
Impact
SDK users who use the Fern client low-level visit() pattern for exhaustive score-type dispatch will have broken code after upgrading. The high-level Langfuse() client does not call visit() internally, so server-side functionality works correctly. The impact is limited to users of the langfuse.api low-level layer who pattern-match on ScoreDataType or ScoreConfigDataType values.
Step-by-step proof
- User has existing code: score.data_type.visit(numeric=..., boolean=..., categorical=..., correction=...)
- User upgrades the SDK to this version.
- ScoreDataType.visit now requires text as a fifth mandatory keyword argument.
- Python raises TypeError: visit() missing 1 required keyword argument: text at the call site before any dispatch logic runs.
- The user must add text=lambda: ... to every visit() call site to restore functionality.
How to address
The code change itself is correct per Fern exhaustive visitor design. The appropriate remedy is not to change the generated code, but to ensure this breaking change is documented explicitly in the CHANGELOG as a breaking change requiring a major or minor semver bump, and that a migration note is added instructing users to add text=lambda: ... handlers to all visit() call sites.
Disclaimer: Experimental PR review
Greptile Summary
This PR is an auto-generated API spec update (from
langfuse/langfusecommit6603be6) that adds TEXT as a new score data type throughout the Python SDK. It introducesTextScore,TextScoreV1,Score_Text,ScoreV1_Text,GetScoresResponseData_Text, andGetScoresResponseDataTextmodels, and updates bothScoreDataTypeandScoreConfigDataTypeenums with theTEXTvalue.TextScore,TextScoreV1) follow the established inheritance pattern fromBaseScore/BaseScoreV1.GetScoresResponseDatais correctly extended withGetScoresResponseData_TextusingdataType="TEXT"as the discriminator key.ScoreDataType.TEXTincludes avisit()branch;ScoreConfigDataType.TEXTalso adds TEXT (note: CORRECTION remains intentionally absent from config data type, consistent with the pre-existing design).__init__.pyindex files and__all__lists are updated consistently acrosslangfuse/api/,langfuse/api/commons/, andlangfuse/api/scores/.ScoreBody,CreateScoreRequest, and the legacyscore_v1clients are updated to mention text score requirements (string value, 1–500 characters).langfuse/api/, which is the auto-generated Fern client zone perCLAUDE.md.Confidence Score: 5/5
This PR is safe to merge — it is auto-generated from an updated API spec and introduces no manually written logic.
All changes are in the auto-generated Fern client directory. The new TEXT score type follows established patterns exactly (enum values, discriminated unions, model inheritance, init.py exports). No logic regressions, no broken contracts, no missing migrations.
No files require special attention.
Important Files Changed
Class Diagram
%%{init: {'theme': 'neutral'}}%% classDiagram class BaseScore { +id: str +name: str +source: ScoreSource +timestamp: datetime +created_at: datetime +updated_at: datetime +metadata: Any +environment: str } class TextScore { +string_value: str } class GetScoresResponseDataText { +trace: Optional[GetScoresResponseTraceData] } class Score_Text { +data_type: Literal["TEXT"] +string_value: str } class ScoreV1_Text { +data_type: Literal["TEXT"] +string_value: str +trace_id: str } class GetScoresResponseData_Text { +data_type: Literal["TEXT"] +string_value: str +trace: Optional[GetScoresResponseTraceData] } BaseScore <|-- TextScore TextScore <|-- GetScoresResponseDataText note for Score_Text "Used in discriminated union Score" note for ScoreV1_Text "Used in discriminated union ScoreV1" note for GetScoresResponseData_Text "Used in GetScoresResponseData union"Reviews (1): Last reviewed commit: "feat(api): update API spec from langfuse..." | Re-trigger Greptile