Copy the block below into a coding agent (e.g. Copilot) to investigate the failures:
You are triaging failures from the gh-aw-test E2E suite.
Run: https://github.com/githubnext/gh-aw-test/actions/runs/27969882871
Repository under test: github/gh-aw (the gh-aw CLI/compiler).
Test harness repository: githubnext/gh-aw-test (this repo; runner is e2e.sh).
Goal: for EACH failed test listed in this status report, access the GitHub
Actions logs for the run above (and the per-entry artifacts e2e-<label>-samples-<bool>,
which contain e2e-test-*.log, e2e-output.log and fails.txt), determine the root
cause, and categorize the failure as exactly one of:
1. TRANSIENT — flaky/infra/network/rate-limit/timing; not a real defect.
   Action: note it and recommend a re-run (./e2e.sh rerun).
2. TEST-FRAMEWORK BUG — a defect in this repo's harness (e2e.sh), a workflow
   source file (.github/workflows/test-*.md), a sample, or CI config.
   Action: propose a concrete fix (file + change) in githubnext/gh-aw-test.
3. GH-AW BUG — a defect in github/gh-aw itself (compiler output, runtime
   engine behaviour, safe-output handling, etc.).
   Action: open an issue in github/gh-aw with a minimal repro, the failing
   test name, the gh-aw ref/mode/samples combination, and links to the
   relevant log lines. Check for an existing open issue first and link it
   instead of filing a duplicate.
Steps:
- Use 'gh run view <run-id> --log' and 'gh run download <run-id>' to fetch logs/artifacts.
- Read AGENTS.md in githubnext/gh-aw-test for harness conventions before proposing fixes.
- Group failures by suspected root cause; the same gh-aw bug may explain several.
- Produce a table: test | category | root cause | recommended action | issue/PR link.
- Only open github/gh-aw issues for category 3, and only after confirming no duplicate exists.
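
The grouping and table steps above could be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not part of the harness: the `Triage` record, the `group_by_root_cause` and `render_table` helpers, and the placeholder test names and root causes are all hypothetical. Only the three category names and the column order come from the prompt.

```python
from dataclasses import dataclass

# Category names as listed in the triage prompt.
CATEGORIES = ("TRANSIENT", "TEST-FRAMEWORK BUG", "GH-AW BUG")


@dataclass(frozen=True)
class Triage:
    """One triaged failure (hypothetical record shape, for illustration only)."""
    test: str
    category: str
    root_cause: str
    action: str
    link: str = ""

    def __post_init__(self):
        # Enforce "exactly one of" the three categories.
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")


def group_by_root_cause(results):
    """Map (category, root cause) to the tests it explains, in input order."""
    groups = {}
    for r in results:
        groups.setdefault((r.category, r.root_cause), []).append(r.test)
    return groups


def render_table(results):
    """Render a Markdown table; rows sharing a root cause end up adjacent."""
    rows = [
        "| test | category | root cause | recommended action | issue/PR link |",
        "| --- | --- | --- | --- | --- |",
    ]
    ordered = sorted(
        results,
        key=lambda r: (CATEGORIES.index(r.category), r.root_cause, r.test),
    )
    for r in ordered:
        rows.append(
            f"| {r.test} | {r.category} | {r.root_cause} | {r.action} | {r.link or '-'} |"
        )
    return "\n".join(rows)


if __name__ == "__main__":
    # Placeholder entries only; real names come from fails.txt in the artifacts.
    results = [
        Triage("test-a", "GH-AW BUG", "placeholder cause X", "open issue in github/gh-aw"),
        Triage("test-b", "TRANSIENT", "placeholder cause Y", "re-run with ./e2e.sh rerun"),
        Triage("test-c", "GH-AW BUG", "placeholder cause X", "open issue in github/gh-aw"),
    ]
    print(render_table(results))
```

Keying the groups on the pair (category, root cause) means a single gh-aw bug that explains several failed tests maps to one entry, which makes it easier to file one issue rather than several duplicates.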
Run: https://github.com/githubnext/gh-aw-test/actions/runs/27969882871 · Trigger: workflow_dispatch · Generated: 2026-06-22T20:53:59Z · Outcome: failure
❌ Test errors — samples mode
✅ Test successes — samples mode
🤖 Agent triage prompt
Previous status reports (closed by this run)