Skip to content

C++: Support reasoning about whether a phi node overwrites the entire buffer#21836

Open
MathiasVP wants to merge 9 commits into
github:mainfrom
MathiasVP:uncertain-def-more-complete
Open

C++: Support reasoning about whether a phi node overwrites the entire buffer#21836
MathiasVP wants to merge 9 commits into
github:mainfrom
MathiasVP:uncertain-def-more-complete

Conversation

@MathiasVP
Copy link
Copy Markdown
Contributor

@MathiasVP MathiasVP commented May 12, 2026

This PR is a combination of two fixes with the overarching theme of computing which SSA definitions are certain (which, in this context, means "overwrites the entire buffer").

In order to model flow through something like:

p[i] = source();
p[j] = 0;
sink(p[k]);

(without knowing the concrete values for i, j, and k) we model the write to p[i] and p[j] as "uncertain" writes. That is, the write to the underlying SSA variable isn't overwritten. So whatever value was previously tracked by SSA is not killed by the SSA definition.

  • The first improvement is in 8585bb6 which changes some of the SSA writes to be certain (whereas before they were only certain if the address they were writing to didn't involve pointer arithmetic).

  • The next change is slightly less well-motivated. This change is most likely necessary if we ever want to switch the instantiation of the shared range analysis library over from the sound IR-based SSA to dataflow-based SSA.

    On main we currently consider any definition from a PhiNode to be an uncertain definition. This PR changes this so that we consider a PhiNode to be a certain definition whenever its inputs are all certain definitions.

    There are some technical things that make this slightly more complicated than what you would expect: Since a PhNode can be its own input we need recursion over a graph with cycles, and because of QL's least fixed-point semantics we don't get the right result when doing recursion over a graph with cycles. So we instead recurse over a graph whose nodes are the the recursive PhiNodes.

    As always, recursion through forall is also dangerous and should ideally be rewritten using explicit recursion over integers to reduce the risk of performance problems. However, I'll delay that until we actually see a performance problem on a database.

    (I expect this work to be necessary for switching the instantiation of the shared range analysis library over to dataflow-based SSA because I expect the relevant SSA variables for range analysis to be those for which any SSA definition is a certain SSA definition.)

DCA shows a couple of lost results. They're all FPs of the form added in 8e25240 (which this PR fixes) 🎉

@github-actions github-actions Bot added the C++ label May 12, 2026
@MathiasVP MathiasVP added no-change-note-required This PR does not need a change note and removed C++ labels May 12, 2026
@github-actions github-actions Bot added the C++ label May 12, 2026
@MathiasVP MathiasVP force-pushed the uncertain-def-more-complete branch from 0df8ed9 to 07b8d7e Compare May 13, 2026 12:15
@MathiasVP MathiasVP marked this pull request as ready for review May 13, 2026 12:22
@MathiasVP MathiasVP requested a review from a team as a code owner May 13, 2026 12:22
Copilot AI review requested due to automatic review settings May 13, 2026 12:22
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates the C++ IR-based SSA/dataflow implementation to better classify SSA definitions as “certain” (overwrites the entire buffer), including propagating certainty through PhiNodes, and adjusts/extends tests to cover the new behavior.

Changes:

  • Refines write-certainty tracking by replacing a boolean “certain” flag with a Certainty type and updating write handling accordingly.
  • Changes PhiNode certainty so phi definitions can be “certain” when their inputs are certain, including SCC-based handling for phi cycles.
  • Updates existing dataflow test expectations and adds a new inline-expectations test suite for “certain/uncertain” SSA definitions.
Show a summary per file
File Description
cpp/ql/lib/semmle/code/cpp/ir/dataflow/internal/SsaImplCommon.qll Introduces Certainty-typed write certainty and updates write classification logic.
cpp/ql/lib/semmle/code/cpp/ir/dataflow/internal/SsaImpl.qll Adjusts SSA source-variable stringification and implements phi-node certainty propagation (incl. cycle handling).
cpp/ql/test/library-tests/dataflow/dataflow-tests/test.cpp Adds a regression test case exercising behavior around certainty/uninitialized flow in a loop.
cpp/ql/test/library-tests/dataflow/certain/test.ql Adds an inline expectations harness to assert Ssa::Definition.isCertain() outcomes.
cpp/ql/test/library-tests/dataflow/certain/test.cpp Adds C++ scenarios to validate “certain vs uncertain” SSA defs and phi behavior.
cpp/ql/test/library-tests/dataflow/certain/test.expected Empty expected file for the new inline expectations test.
cpp/ql/test/library-tests/dataflow/dataflow-tests/uninitialized.expected Updates expected results for uninitialized/dataflow tests.
cpp/ql/test/library-tests/dataflow/dataflow-tests/test-source-sink.expected Updates expected source-to-sink flow results.
cpp/ql/test/library-tests/dataflow/dataflow-tests/localFlow-ir.expected Updates expected IR local flow output (notably SSA phi read formatting).
cpp/ql/test/library-tests/dataflow/dataflow-tests/dataflow-consistency.expected Updates expected dataflow consistency outputs.
cpp/ql/test/experimental/query-tests/Security/CWE/CWE-193/constant-size/ConstantSizeArrayOffByOne.expected Updates experimental query-test expectations to match new flow behavior.

Copilot's findings

  • Files reviewed: 10/11 changed files
  • Comments generated: 3

Comment thread cpp/ql/lib/semmle/code/cpp/ir/dataflow/internal/SsaImplCommon.qll
Comment thread cpp/ql/lib/semmle/code/cpp/ir/dataflow/internal/SsaImpl.qll Outdated
Comment thread cpp/ql/lib/semmle/code/cpp/ir/dataflow/internal/SsaImpl.qll
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

C++ no-change-note-required This PR does not need a change note

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants