[RDF] Always write nominal columns in Snapshot with variations. by hageboeck · Pull Request #21246 · root-project/root

hageboeck · 2026-02-11T12:52:08Z

Before, the nominal columns were omitted from the output if their filter
didn't pass. This is incorrect, however, because when at least two
variations are run, some of the nominal columns might still be in use.

One example is this setup:

df.Vary("x", [...], "var0")
  .Vary("y", [...], "var1")
  .Filter([...], {"x", "y"})
  .Snapshot([...], {"x", "y"})

In every variation universe, a Filter on x and y is used. If one writes down
what the values of the nominal and varied columns should be in the snapshot,
one gets conflicting requirements for the nominal columns if one keeps zeroing
the invalid columns to save space, as was done before.

Imagine a dataset where only the var0 variation passes the filter. One gets:

| | x | y | x_var0 | y_var1 |
|---|---|---|---|---|
| Filter | fail | fail | pass | fail |
| Should the columns be valid? | No, because nominal didn't pass | No, because nominal didn't pass, but yes because `var0` passed | Yes | No |
| What's written before this PR | 0 | 0 | x_var0 | 0 |
| What's written now | x | y | x_var0 | 0 (invalid) |

The problem is the column y:
In the nominal universe, it shouldn't be present, but since it's used as an unvaried column also in the var0 universe, it must be in the output dataset. The same argument can be made for x if var1 were to pass. Therefore, the nominal columns must remain visible in the output dataset as soon as any filter in any variation passes.
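
The argument can be checked with a small standalone model. This is only a sketch under stated assumptions, not ROOT code: the `Universe` struct, the `owner` map and the function names are hypothetical, and the three universes are assumed to read `{x, y}`, `{x_var0, y}` and `{x, y_var1}` as in the table above.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of one event, not ROOT code.
struct Universe {
   std::string name;                     // "nominal", "var0" or "var1"
   std::vector<std::string> columnsRead; // columns read in this universe
   bool passed;                          // did the Filter pass in this universe?
};

// The scenario from the table: only the var0 universe passes.
std::vector<Universe> ExampleEvent()
{
   return {{"nominal", {"x", "y"}, false},
           {"var0", {"x_var0", "y"}, true},
           {"var1", {"x", "y_var1"}, false}};
}

// Which universe each output column belongs to (assumed naming).
std::map<std::string, std::string> ExampleOwners()
{
   return {{"x", "nominal"}, {"y", "nominal"}, {"x_var0", "var0"}, {"y_var1", "var1"}};
}

// Columns read by at least one passing universe; these need valid values.
std::set<std::string> RequiredColumns(const std::vector<Universe> &universes)
{
   std::set<std::string> required;
   for (const auto &u : universes)
      if (u.passed)
         required.insert(u.columnsRead.begin(), u.columnsRead.end());
   return required;
}

// Old policy: keep a column only if the universe it belongs to passed.
std::set<std::string> WrittenBefore(const std::vector<Universe> &universes,
                                    const std::map<std::string, std::string> &owner)
{
   std::set<std::string> written;
   for (const auto &u : universes) {
      if (!u.passed)
         continue;
      for (const auto &entry : owner)
         if (entry.second == u.name)
            written.insert(entry.first);
   }
   return written;
}

// New policy: varied columns as before, nominal columns as soon as any universe passed.
std::set<std::string> WrittenNow(const std::vector<Universe> &universes,
                                 const std::map<std::string, std::string> &owner)
{
   std::set<std::string> written = WrittenBefore(universes, owner);
   bool anyPassed = false;
   for (const auto &u : universes)
      anyPassed = anyPassed || u.passed;
   if (anyPassed)
      for (const auto &entry : owner)
         if (entry.second == "nominal")
            written.insert(entry.first);
   return written;
}
```

For `ExampleEvent()`, `RequiredColumns` yields `{x_var0, y}`. The old policy writes only `{x_var0}` and therefore loses `y`, while the new policy writes `{x, x_var0, y}`, which covers everything the passing universe reads.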

The tests have been updated to reflect the fact that the nominal columns are not zeroed out.

github-actions · 2026-02-11T15:58:58Z

Test Results

22 files 22 suites 3d 9h 49m 23s ⏱️
3 849 tests 3 847 ✅ 1 💤 1 ❌
76 910 runs 76 891 ✅ 18 💤 1 ❌

For more details on these failures, see this check.

Results for commit d7414e6.

♻️ This comment has been updated with latest results.

vepadulano

Thanks for the fix! This will enable more use cases for the Snapshot with variations 👍 Changes LGTM, I left a couple of comments in the test only

When two variations are run, e.g. x_var and y_var, and the computation graph uses both x and y, the pairs {x, y_var} and {x_var, y} need to be visible in the output file. This means that the nominal columns always need to be written, even if {x, y} would not pass the filter.
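
The pairs mentioned above can be enumerated mechanically. The following is a minimal sketch assuming each variation replaces exactly one nominal column; `ColumnsPerUniverse` is a hypothetical helper, and `x_var` / `y_var` are the placeholder names used in the text rather than the names ROOT generates.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: for each varied column, list the columns read in that
// variation's universe. Only the varied column is swapped; all other columns
// keep their nominal name.
std::map<std::string, std::vector<std::string>>
ColumnsPerUniverse(const std::vector<std::string> &nominal,
                   const std::map<std::string, std::string> &variedName)
{
   std::map<std::string, std::vector<std::string>> result;
   result["nominal"] = nominal;
   for (const auto &entry : variedName) {
      std::vector<std::string> columns = nominal;
      for (auto &c : columns)
         if (c == entry.first)
            c = entry.second; // replace the nominal column with its variation
      result[entry.second] = columns;
   }
   return result;
}
```

Calling `ColumnsPerUniverse({"x", "y"}, {{"x", "x_var"}, {"y", "y_var"}})` gives `{x_var, y}` for the `x_var` universe and `{x, y_var}` for the `y_var` universe, so each nominal column is read by a varied universe as well.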

hageboeck · 2026-04-28T14:46:18Z

/backport to 6.40

root-project-bot · 2026-04-28T14:47:24Z

Preparing to backport PR #21246 to branch 6.40 requested by hageboeck

root-project-bot · 2026-04-28T14:48:06Z

This PR has been backported to branch 6.40: #22089

hageboeck self-assigned this Feb 11, 2026

hageboeck closed this Apr 21, 2026

hageboeck reopened this Apr 21, 2026

hageboeck force-pushed the RDF_snapshot_twoVariations branch from dab901e to e6d8632 Compare April 27, 2026 09:47

hageboeck marked this pull request as ready for review April 27, 2026 09:48

hageboeck requested review from dpiparo, martamaja10, pcanal and vepadulano as code owners April 27, 2026 09:48

hageboeck force-pushed the RDF_snapshot_twoVariations branch 2 times, most recently from 545855a to 899088e Compare April 27, 2026 09:51

vepadulano approved these changes Apr 27, 2026

hageboeck force-pushed the RDF_snapshot_twoVariations branch from 899088e to ce5b8f8 Compare April 27, 2026 14:15

hageboeck force-pushed the RDF_snapshot_twoVariations branch from ce5b8f8 to d7414e6 Compare April 27, 2026 14:16

hageboeck merged commit 6ce7930 into root-project:master Apr 28, 2026
51 of 53 checks passed

hageboeck deleted the RDF_snapshot_twoVariations branch April 28, 2026 14:42

root-project-bot mentioned this pull request Apr 28, 2026

[6.40] [RDF] Always write nominal columns in Snapshot with variations. #22089

Merged

hageboeck merged 2 commits into root-project:master from hageboeck:RDF_snapshot_twoVariations

3 participants
