ci(pr): classify PR test failures as new vs. known and post sticky comment #1194
Draft
rgsl888prabhu wants to merge 3 commits into main
…mment
Cross-reference each PR test failure against the target branch's nightly
failure history and post a single sticky comment that splits failures
into NEW (introduced by this PR) and KNOWN (recurring on nightly, known
flaky on nightly, or flaked in this run via pytest-rerunfailures).
- Add `--mode pr` to nightly_report.py: read history without writing
back, annotate each failure with `pr_classification`.
- Branch nightly_report_helper.sh on RAPIDS_BUILD_TYPE so PR runs read
the target branch's history and write per-matrix summaries to a
PR-scoped S3 prefix (`ci_test_reports/pr/run-{run_id}/`).
- Extract shared aggregator helpers into aggregate_common.py
(download/load/aggregate/escape) so the PR aggregator can reuse them
without behavioural drift on nightly.
- Add aggregate_pr.py to build the Markdown comment body and ci/pr_summary.sh
to post or update the sticky comment via the GitHub API.
- Wire a new `pr-test-summary` job into pr.yaml: gated on `if: always()`,
permissions `pull-requests: write`, runs after every PR test job.
Not added to pr-builder needs — the comment is informational and must
not block merging.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
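The NEW vs. KNOWN split described in this commit can be sketched as a small pure function. This is a minimal sketch, not the code in `nightly_report.py`: the function name `classify_failure` is hypothetical, the `status` and `is_flaky` field names are taken from the PR description, and the precedence between `known_recurring` and `known_flaky_pr` is an assumption.

```python
from typing import Optional


def classify_failure(history_entry: Optional[dict], flaked_in_run: bool) -> str:
    """Return a pr_classification label for one PR test failure.

    history_entry: the nightly-history record for this test, or None if the
        test has never appeared in nightly history.
    flaked_in_run: True if the test failed and then passed on retry in this run.
    """
    # A nightly flaky flag wins regardless of the entry's status.
    if history_entry is not None and history_entry.get("is_flaky"):
        return "known_flaky_nightly"
    # Still failing on nightly, so this PR did not introduce it.
    if history_entry is not None and history_entry.get("status") == "active":
        return "known_recurring"
    # Passed on retry here and not previously known to flake.
    # (Checking this after known_recurring is an assumed ordering.)
    if flaked_in_run:
        return "known_flaky_pr"
    # Absent from history, or present only as resolved and not flaky.
    return "new"
```

Keeping the rule free of I/O means it can be unit-tested without S3 access or JUnit files.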
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.
Mirror the nightly-summary pattern: keep job-level wiring (needs/if) in pr.yaml and move the implementation into its own reusable workflow. Use explicit secret pass-through (matching how test.yaml calls nightly-summary.yaml) instead of `secrets: inherit`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The shell script had three blocks of inline Python plus a bash-specific
${VAR@Q} quoting trick. Move all GitHub API interactions into
ci/utils/pr_comment_helper.py with two CLI subcommands:
- base-ref: resolve the PR's target branch
- post: post or update the sticky comment by hidden marker
Stdlib only (urllib + json) so it runs in slim CI containers. The
hidden marker now lives in pr_comment_helper.COMMENT_MARKER as the
single source of truth, imported by aggregate_pr.py.
ci/pr_summary.sh shrinks from ~120 lines of mixed shell+Python to ~70
lines of straight orchestration. pr_test_summary.yaml's base-ref
lookup goes from a curl+inline-Python pipe to a one-liner.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
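The post-or-update behaviour of the `post` subcommand could look roughly like the sketch below, using only `urllib` and `json`. It is an illustration under stated assumptions, not the contents of `pr_comment_helper.py`: the function names are hypothetical, the marker value is the one quoted in the PR description, and pagination of the comment list is left out. The endpoints are the standard GitHub REST issue-comment routes.

```python
import json
import urllib.request
from typing import Optional

API = "https://api.github.com"
COMMENT_MARKER = "<!-- pr-test-classification -->"


def find_sticky_comment(comments: list, marker: str = COMMENT_MARKER) -> Optional[int]:
    """Return the id of the first comment whose body contains the marker."""
    for comment in comments:
        if marker in (comment.get("body") or ""):
            return comment["id"]
    return None


def plan_request(repo: str, pr_number: int, body: str, existing_id: Optional[int]):
    """Pick PATCH (update in place) or POST (create) for the sticky comment."""
    if COMMENT_MARKER not in body:
        # Make sure the next run can find this comment again.
        body = f"{COMMENT_MARKER}\n{body}"
    if existing_id is not None:
        url = f"{API}/repos/{repo}/issues/comments/{existing_id}"
        return "PATCH", url, {"body": body}
    url = f"{API}/repos/{repo}/issues/{pr_number}/comments"
    return "POST", url, {"body": body}


def send(method: str, url: str, payload: dict, token: str) -> dict:
    """Send one JSON request to the GitHub API and decode the response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method=method,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Separating the request planning from the network call keeps the PATCH-or-POST decision testable without a token.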
/ok to test c4876d2
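Summary

- Classifies each PR test failure as NEW (introduced by this PR) or KNOWN (recurring on nightly, known flaky on nightly, or flaked in this run via `pytest-rerunfailures`).
- Reuses the existing `nightly_report.py` JUnit parser and history schema.
- Posts a single sticky comment marked with `<!-- pr-test-classification -->`, updated in place on each push. Skipped entirely when nothing failed or flaked.

How it works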

- `ci/utils/nightly_report_helper.sh` now branches on `RAPIDS_BUILD_TYPE`. For PR runs it points `--s3-history-uri` at the target branch's nightly history (read-only; it never writes back) and uploads a per-matrix summary to a PR-scoped prefix `s3://.../ci_test_reports/pr/run-${GITHUB_RUN_ID}/`.
- `nightly_report.py --mode pr` annotates each failure with `pr_classification`:
  - `new`: not in nightly history, or only there as `resolved` and not flagged flaky.
  - `known_recurring`: nightly history `status=active`.
  - `known_flaky_nightly`: nightly history `is_flaky=true` (regardless of status).
  - `known_flaky_pr`: passed on retry within this run, not previously known to flake.
- `ci/utils/aggregate_pr.py` lists the run-scoped prefix, merges per-matrix summaries (via shared helpers in the new `aggregate_common.py`), and produces a Markdown comment with two top-level sections (NEW, KNOWN) and three sub-groups inside KNOWN.
- `ci/pr_summary.sh` looks up an existing comment by its hidden marker and either PATCHes it in place or POSTs a new one.
- The `pr-test-summary` job in `.github/workflows/pr.yaml` runs after every PR test job (`if: always()`, `permissions: pull-requests: write`). It is not added to `pr-builder`'s `needs:` list, because comment posting is informational and must not block merging.

Files
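
- `ci/utils/aggregate_common.py` (new): shared helpers `download_summaries`, `load_local_summaries`, `aggregate_summaries`, `html_escape`.
- `ci/utils/aggregate_nightly.py`: now uses `aggregate_common`; behavior unchanged.
- `ci/utils/aggregate_pr.py` (new): builds the Markdown comment body.
- `ci/utils/nightly_report.py`: adds `--mode pr`; preserves `pr_classification` in summary JSON.
- `ci/utils/nightly_report_helper.sh`: branches on `RAPIDS_BUILD_TYPE`.
- `ci/pr_summary.sh` (new): posts or updates the sticky comment.
- `.github/workflows/pr.yaml`: adds the `pr-test-summary` job.

Test plan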

- `pr-test-summary` runs after the test jobs and produces (or updates) a sticky comment when there are failures.
- A failure with no nightly history is classified as `new`.
- A test that passes on retry (via `pytest-rerunfailures`) lands in KNOWN, under "Flaked in this PR run".
- Confirm `${{ github.token }}` with `permissions: pull-requests: write` is sufficient (the only existing token usage in `pr.yaml` is read-only via `gh pr view`); fall back to a PAT if not.
- Confirm `CUOPT_AWS_*` credentials have write access to the new `ci_test_reports/pr/run-*/` prefix.

Follow-ups (not in this PR)

- Expire `ci_test_reports/pr/run-*/` after ~14 days.

🤖 Generated with Claude Code
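
The comment layout described under "How it works" (two top-level sections, NEW and KNOWN, with three sub-groups inside KNOWN, and no comment at all when nothing failed or flaked) could be assembled along these lines. This is a hedged sketch rather than the logic in `aggregate_pr.py`: the `build_comment` name, the `name` key on each failure, the heading markup, and the sub-group titles other than "Flaked in this PR run" are all assumptions.

```python
import html
from typing import Optional

COMMENT_MARKER = "<!-- pr-test-classification -->"

# (pr_classification value, sub-group title); titles are illustrative.
KNOWN_GROUPS = [
    ("known_recurring", "Recurring on nightly"),
    ("known_flaky_nightly", "Known flaky on nightly"),
    ("known_flaky_pr", "Flaked in this PR run"),
]


def build_comment(failures: list) -> Optional[str]:
    """Build the sticky comment body, or None when there is nothing to report."""
    if not failures:
        return None

    # Group test names by their pr_classification label.
    by_class: dict = {}
    for failure in failures:
        by_class.setdefault(failure["pr_classification"], []).append(failure["name"])

    lines = [COMMENT_MARKER]

    new = sorted(by_class.get("new", []))
    lines.append(f"## NEW ({len(new)})")
    lines.extend(f"- {html.escape(name)}" for name in new)

    known_total = sum(len(by_class.get(key, [])) for key, _ in KNOWN_GROUPS)
    lines.append(f"## KNOWN ({known_total})")
    for key, title in KNOWN_GROUPS:
        names = sorted(by_class.get(key, []))
        if names:  # omit empty sub-groups
            lines.append(f"### {title} ({len(names)})")
            lines.extend(f"- {html.escape(name)}" for name in names)

    return "\n".join(lines)
```

Returning `None` for an empty input lets the caller skip the GitHub API call entirely, matching the "skipped when nothing failed or flaked" behaviour.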