Handle failed hosted eval creation responses #464

Open
d42me wants to merge 5 commits into main from bugfix/hosted-eval-create-failure

@d42me d42me (Contributor) commented Mar 23, 2026

Summary

  • treat hosted eval create responses with status FAILED as errors even when an evaluation_id is returned
  • surface the backend error to users instead of printing a false success message
  • add unit coverage for helper-level and CLI-level failure handling
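The first bullet can be illustrated with a minimal sketch. It is not the code in this PR: `HostedEvalCreateError` and `check_create_response` are hypothetical names, the `error` field name is an assumption about the response payload, and `PENDING` stands in for any non-failed status.

```python
from typing import Any, Mapping


class HostedEvalCreateError(RuntimeError):
    """Hypothetical exception for a create response that reports failure."""


def check_create_response(response: Mapping[str, Any]) -> Mapping[str, Any]:
    """Return the response unchanged, or raise if its status is FAILED."""
    # Check the status first: an evaluation_id in the payload does not
    # imply that creation succeeded.
    status = str(response.get("status", "")).upper()
    if status == "FAILED":
        # "error" is an assumed field name for the backend message.
        message = response.get("error") or "hosted eval creation failed"
        raise HostedEvalCreateError(message)
    return response
```

Raising here, rather than returning the ID, is what would keep a caller from printing a success message for a run the backend already marked as failed.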

Testing

  • pytest packages/prime/tests/test_hosted_eval.py -q

Note

Medium Risk
Changes hosted eval creation/launch flow to treat status=FAILED responses as errors and exit non-zero, which can alter CLI behavior for users and CI scripts. Risk is limited to CLI error-handling paths and is covered by new unit tests.

Overview
Hosted eval creation failures are now surfaced correctly in the CLI. When the hosted-eval create API returns status=FAILED, the command collects the returned evaluation_id/evaluation_ids, prints the backend error, provides viewer/log commands, and exits with code 1 instead of showing a false success.
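A rough sketch of the flow described above, under stated assumptions: `summarize_create` and `main` are hypothetical names, `error` is an assumed field name for the backend message, and the per-ID hint is a placeholder for whatever viewer/log commands the real CLI prints.

```python
import sys
from typing import Any, List, Mapping, Tuple


def summarize_create(response: Mapping[str, Any]) -> Tuple[int, List[str]]:
    """Return (exit_code, message_lines) for a create response."""
    # Accept either the plural or the singular ID field.
    ids = [str(i) for i in (response.get("evaluation_ids") or [])]
    if not ids and response.get("evaluation_id"):
        ids = [str(response["evaluation_id"])]

    if response.get("status") != "FAILED":
        return 0, [f"Created hosted evaluation(s): {', '.join(ids)}"]

    # "error" is an assumed field name for the backend message.
    lines = [f"Hosted evaluation failed: {response.get('error', 'unknown error')}"]
    for eval_id in ids:
        # Placeholder hint; the real CLI prints its own viewer/log commands.
        lines.append(f"Inspect evaluation {eval_id} with the viewer/log commands")
    return 1, lines


def main(response: Mapping[str, Any]) -> None:
    """Print the summary and exit non-zero when creation failed."""
    code, lines = summarize_create(response)
    stream = sys.stderr if code else sys.stdout
    for line in lines:
        print(line, file=stream)
    sys.exit(code)
```

Keeping the decision in a pure function and the `sys.exit` call in a thin wrapper is one way to make the exit code easy to assert in unit tests.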

Adds helper logic to normalize singular vs plural evaluation IDs and expands tests to cover failed create responses (including plural IDs and empty evaluation_ids fallback).
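The ID normalization might look roughly like the following. `collect_evaluation_ids` is a hypothetical name, and falling back to the singular `evaluation_id` when `evaluation_ids` is empty is an assumption about what the "empty evaluation_ids fallback" test covers.

```python
from typing import Any, List, Mapping


def collect_evaluation_ids(response: Mapping[str, Any]) -> List[str]:
    """Normalize singular and plural ID fields into one list of strings."""
    plural = response.get("evaluation_ids")
    if plural:
        # A non-empty plural field takes precedence.
        return [str(eval_id) for eval_id in plural]
    # Assumed fallback: an empty or missing plural field defers to the
    # singular field, if one is present.
    singular = response.get("evaluation_id")
    return [str(singular)] if singular else []
```

With this shape, callers always iterate over a list and do not need to know which field the backend populated.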

Written by Cursor Bugbot for commit 05651b0. This will update automatically on new commits.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 781c7cde6b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/prime/src/prime_cli/commands/evals.py Outdated
Comment thread packages/prime/src/prime_cli/commands/evals.py Outdated
@d42me d42me force-pushed the bugfix/hosted-eval-create-failure branch from 45671fa to dfc47c4 on March 24, 2026 00:12
Comment thread packages/prime/src/prime_cli/commands/evals.py

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Comment thread packages/prime/src/prime_cli/commands/evals.py Outdated
@d42me d42me requested a review from burnpiro March 24, 2026 05:59
@d42me d42me requested a review from xeophon April 17, 2026 06:02
1 participant