Commit 4a20abf

Add evals action example (#59)
2 parents 91e723b + 60134a6 commit 4a20abf

6 files changed: +94 -0 lines changed

README.md

Lines changed: 2 additions & 0 deletions
@@ -80,6 +80,8 @@ gh models eval my_prompt.prompt.yml --json
 
 The JSON output includes detailed test results, evaluation scores, and summary statistics that can be processed by other tools or CI/CD pipelines.
 
+Here's a sample GitHub Action that uses the `eval` command to automatically run the evals in any PR that updates a prompt file: [evals_action.yml](/examples/evals_action.yml).
+
 Learn more about `.prompt.yml` files here: [Storing prompts in GitHub repositories](https://docs.github.com/github-models/use-github-models/storing-prompts-in-github-repositories).
 
 ## Notice

examples/evals_action.yml

Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@
+# This is a sample GitHub Actions workflow file that runs prompt evaluations
+# on pull requests when prompt files are changed. It uses the `gh-models` CLI to evaluate prompts
+# and comments the results back on the pull request.
+# The workflow is triggered by pull requests that modify any `.prompt.yml` files.
+
+
+name: Run evaluations for changed prompts
+
+permissions:
+  models: read
+  contents: read
+  pull-requests: write
+
+on:
+  pull_request:
+    paths:
+      - '**/*.prompt.yml'
+
+jobs:
+  evaluate-model:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+
+      - name: Setup gh-models
+        run: gh extension install github/gh-models
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Find changed prompt files
+        id: find-prompts
+        run: |
+          # Get the list of changed files that match *.prompt.yml pattern
+          changed_prompts=$(git diff --name-only origin/${{ github.base_ref }}..HEAD | grep '\.prompt\.yml$' | head -1)
+
+          if [[ -z "$changed_prompts" ]]; then
+            echo "No prompt files found in the changes"
+            echo "skip_evaluation=true" >> "$GITHUB_OUTPUT"
+            exit 0
+          fi
+
+          echo "first_prompt=$changed_prompts" >> "$GITHUB_OUTPUT"
+          echo "Found changed prompt file: $changed_prompts"
+
+      - name: Run model evaluation
+        id: eval
+        run: |
+          set -e
+          PROMPT_FILE="${{ steps.find-prompts.outputs.first_prompt }}"
+          echo "## Model Evaluation Results" >> "$GITHUB_STEP_SUMMARY"
+          echo "Evaluating: $PROMPT_FILE" >> "$GITHUB_STEP_SUMMARY"
+          echo "" >> "$GITHUB_STEP_SUMMARY"
+
+          if gh models eval "$PROMPT_FILE" > eval_output.txt 2>&1; then
+            echo "✅ All evaluations passed!" >> "$GITHUB_STEP_SUMMARY"
+            cat eval_output.txt >> "$GITHUB_STEP_SUMMARY"
+            echo "eval_status=success" >> "$GITHUB_OUTPUT"
+          else
+            echo "❌ Some evaluations failed!" >> "$GITHUB_STEP_SUMMARY"
+            cat eval_output.txt >> "$GITHUB_STEP_SUMMARY"
+            echo "eval_status=failure" >> "$GITHUB_OUTPUT"
+            exit 1
+          fi
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Comment on PR with evaluation results
+        if: github.event_name == 'pull_request'
+        uses: actions/github-script@v7
+        with:
+          script: |
+            const fs = require('fs');
+            const output = fs.readFileSync('eval_output.txt', 'utf8');
+            const evalStatus = '${{ steps.eval.outputs.eval_status }}';
+            const statusMessage = evalStatus === 'success'
+              ? '✅ Evaluation passed'
+              : '❌ Evaluation failed';
+
+            github.rest.issues.createComment({
+              issue_number: context.issue.number,
+              owner: context.repo.owner,
+              repo: context.repo.repo,
+              body: `## ${statusMessage}
+
+            \`\`\`
+            ${output}
+            \`\`\`
+
+            Review the evaluation results above for more details.`
+            });
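
The `Find changed prompt files` step keeps only the first match (`head -1`), so a pull request that touches several prompt files has just one of them evaluated. Below is a minimal sketch of how the step's shell logic might be extended to cover every changed file. It is not part of this commit: the function names `filter_prompt_files` and `eval_all_prompts` and the `EVAL_CMD` and `BASE_REF` variables are hypothetical, and the sketch assumes `gh models eval` exits non-zero when an evaluation fails, which the committed `if`/`else` branch already presumes.

```shell
#!/usr/bin/env bash
# Sketch only: the names below are illustrative and do not appear in the commit.

# Command used to evaluate one prompt file. Defaults to the CLI call the
# workflow uses; overridable so the loop can be exercised without the CLI.
EVAL_CMD="${EVAL_CMD:-gh models eval}"

# Read candidate paths from stdin and print those ending in .prompt.yml.
filter_prompt_files() {
  # grep exits 1 when nothing matches; treat that as "no prompt files".
  grep '\.prompt\.yml$' || true
}

# Evaluate every prompt file listed on stdin; return 1 if any evaluation failed.
eval_all_prompts() {
  local failed=0 file
  while IFS= read -r file; do
    [ -n "$file" ] || continue
    echo "Evaluating: $file"
    # EVAL_CMD is deliberately unquoted so "gh models eval" splits into words.
    # stdin is redirected so the command cannot consume the remaining file list.
    if ! $EVAL_CMD "$file" < /dev/null; then
      echo "Failed: $file"
      failed=1
    fi
  done
  return "$failed"
}

# Possible wiring, mirroring the workflow's git diff call (BASE_REF is hypothetical):
#   git diff --name-only "origin/${BASE_REF}..HEAD" | filter_prompt_files | eval_all_prompts
```

Collecting failures in a flag rather than exiting on the first one lets every changed prompt file report its result before the step fails.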
