Skip to content

ci(pr): classify PR test failures as new vs. known and post sticky comment#1194

Draft
rgsl888prabhu wants to merge 3 commits intomainfrom
ci/pr-test-classification-comment
Draft

ci(pr): classify PR test failures as new vs. known and post sticky comment#1194
rgsl888prabhu wants to merge 3 commits intomainfrom
ci/pr-test-classification-comment

Conversation

@rgsl888prabhu
Copy link
Copy Markdown
Collaborator

Summary

  • For every PR test job, cross-reference each failure against the target branch's nightly failure history and post a single sticky comment that splits failures into NEW (introduced by this PR) vs. KNOWN (recurring on nightly, known flaky on nightly, or flaked in this run via pytest-rerunfailures).
  • No changes to retry behavior; reuses the existing nightly_report.py JUnit parser and history schema.
  • Sticky single comment per PR, identified by hidden marker <!-- pr-test-classification -->, updated in place on each push. Skipped entirely when nothing failed or flaked.

How it works

  1. Per-matrix classification (PR mode). ci/utils/nightly_report_helper.sh now branches on RAPIDS_BUILD_TYPE. For PR runs it points --s3-history-uri at the target branch's nightly history (read-only — never writes back) and uploads a per-matrix summary to a PR-scoped prefix s3://.../ci_test_reports/pr/run-${GITHUB_RUN_ID}/.
  2. Read-only classifier. nightly_report.py --mode pr annotates each failure with pr_classification:
    • new — not in nightly history, or only there as resolved and not flagged flaky.
    • known_recurring — nightly history status=active.
    • known_flaky_nightly — nightly history is_flaky=true (regardless of status).
    • known_flaky_pr — passed on retry within this run, not previously known to flake.
  3. Aggregator. ci/utils/aggregate_pr.py lists the run-scoped prefix, merges per-matrix summaries (via shared helpers in the new aggregate_common.py), and produces a Markdown comment with two top-level sections (NEW, KNOWN) and three sub-groups inside KNOWN.
  4. Comment poster. ci/pr_summary.sh looks up an existing comment by its hidden marker and either PATCHes it in place or POSTs a new one.
  5. Workflow wiring. A new pr-test-summary job in .github/workflows/pr.yaml runs after every PR test job (if: always(), permissions: pull-requests: write). It is not added to pr-builder's needs: list — comment posting is informational and must not block merging.

Files

File Purpose
ci/utils/aggregate_common.py (new) Shared download_summaries, load_local_summaries, aggregate_summaries, html_escape
ci/utils/aggregate_nightly.py Imports from aggregate_common; behavior unchanged
ci/utils/aggregate_pr.py (new) Builds the Markdown PR comment body
ci/utils/nightly_report.py New --mode pr; preserves pr_classification in summary JSON
ci/utils/nightly_report_helper.sh PR-mode S3 wiring (read target-branch history; write run-scoped summary)
ci/pr_summary.sh (new) Aggregates and posts/updates the sticky comment
.github/workflows/pr.yaml New pr-test-summary job

Test plan

  • Trigger CI on this PR and confirm pr-test-summary runs after the test jobs and produces (or updates) a sticky comment when there are failures.
  • Force a known-recurring failure in the test matrix; confirm it lands in the KNOWN — Already broken on nightly section.
  • Force a brand-new failure (rename a passing test to fail); confirm it lands in NEW failures with classification new.
  • Confirm a flaky-passed-on-retry test (already supported by pytest-rerunfailures) lands in KNOWN — Flaked in this PR run.
  • Confirm an all-green PR run does not post a comment.
  • Confirm re-running the workflow updates the existing comment in place rather than creating a new one.
  • Confirm nightly runs are unaffected: history file still updates, consolidated dashboard renders, Slack notification still fires.
  • Confirm ${{ github.token }} with permissions: pull-requests: write is sufficient (the only existing token usage in pr.yaml is read-only via gh pr view); fall back to a PAT if not.
  • Verify CUOPT_AWS_* credentials have write access to the new ci_test_reports/pr/run-*/ prefix.

Follow-ups (not in this PR)

  • S3 bucket lifecycle rule to expire ci_test_reports/pr/run-*/ after ~14 days.

🤖 Generated with Claude Code

…mment

Cross-reference each PR test failure against the target branch's nightly
failure history and post a single sticky comment that splits failures
into NEW (introduced by this PR) and KNOWN (recurring on nightly, known
flaky on nightly, or flaked in this run via pytest-rerunfailures).

- Add `--mode pr` to nightly_report.py: read history without writing
  back, annotate each failure with `pr_classification`.
- Branch nightly_report_helper.sh on RAPIDS_BUILD_TYPE so PR runs read
  the target branch's history and write per-matrix summaries to a
  PR-scoped S3 prefix (`ci_test_reports/pr/run-{run_id}/`).
- Extract shared aggregator helpers into aggregate_common.py
  (download/load/aggregate/escape) so the PR aggregator can reuse them
  without behavioural drift on nightly.
- Add aggregate_pr.py to build the Markdown comment body and ci/pr_summary.sh
  to post or update the sticky comment via the GitHub API.
- Wire a new `pr-test-summary` job into pr.yaml: gated on `if: always()`,
  permissions `pull-requests: write`, runs after every PR test job.
  Not added to pr-builder needs — the comment is informational and must
  not block merging.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented May 8, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@rgsl888prabhu rgsl888prabhu self-assigned this May 8, 2026
@rgsl888prabhu rgsl888prabhu added non-breaking Introduces a non-breaking change improvement Improves an existing functionality labels May 8, 2026
rgsl888prabhu and others added 2 commits May 8, 2026 15:47
Mirror the nightly-summary pattern: keep job-level wiring (needs/if) in
pr.yaml and move the implementation into its own reusable workflow.
Use explicit secret pass-through (matching how test.yaml calls
nightly-summary.yaml) instead of `secrets: inherit`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The shell script had three blocks of inline Python plus a bash-specific
${VAR@Q} quoting trick.  Move all GitHub API interactions into
ci/utils/pr_comment_helper.py with two CLI subcommands:

  - base-ref: resolve the PR's target branch
  - post:     post or update the sticky comment by hidden marker

Stdlib only (urllib + json) so it runs in slim CI containers.  The
hidden marker now lives in pr_comment_helper.COMMENT_MARKER as the
single source of truth, imported by aggregate_pr.py.

ci/pr_summary.sh shrinks from ~120 lines of mixed shell+Python to ~70
lines of straight orchestration.  pr_test_summary.yaml's base-ref
lookup goes from a curl+inline-Python pipe to a one-liner.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@rgsl888prabhu
Copy link
Copy Markdown
Collaborator Author

/ok to test c4876d2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

improvement Improves an existing functionality non-breaking Introduces a non-breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant