C++: Support reasoning about whether a phi node overwrites the entire buffer#21836
Open
MathiasVP wants to merge 9 commits into
Open
C++: Support reasoning about whether a phi node overwrites the entire buffer#21836MathiasVP wants to merge 9 commits into
MathiasVP wants to merge 9 commits into
Conversation
…itialized instructions.
…ntire buffer is overwritten).
0df8ed9 to
07b8d7e
Compare
Contributor
There was a problem hiding this comment.
Pull request overview
Updates the C++ IR-based SSA/dataflow implementation to better classify SSA definitions as “certain” (overwrites the entire buffer), including propagating certainty through PhiNodes, and adjusts/extends tests to cover the new behavior.
Changes:
- Refines write-certainty tracking by replacing a boolean “certain” flag with a
Certaintytype and updating write handling accordingly. - Changes
PhiNodecertainty so phi definitions can be “certain” when their inputs are certain, including SCC-based handling for phi cycles. - Updates existing dataflow test expectations and adds a new inline-expectations test suite for “certain/uncertain” SSA definitions.
Show a summary per file
| File | Description |
|---|---|
| cpp/ql/lib/semmle/code/cpp/ir/dataflow/internal/SsaImplCommon.qll | Introduces Certainty-typed write certainty and updates write classification logic. |
| cpp/ql/lib/semmle/code/cpp/ir/dataflow/internal/SsaImpl.qll | Adjusts SSA source-variable stringification and implements phi-node certainty propagation (incl. cycle handling). |
| cpp/ql/test/library-tests/dataflow/dataflow-tests/test.cpp | Adds a regression test case exercising behavior around certainty/uninitialized flow in a loop. |
| cpp/ql/test/library-tests/dataflow/certain/test.ql | Adds an inline expectations harness to assert Ssa::Definition.isCertain() outcomes. |
| cpp/ql/test/library-tests/dataflow/certain/test.cpp | Adds C++ scenarios to validate “certain vs uncertain” SSA defs and phi behavior. |
| cpp/ql/test/library-tests/dataflow/certain/test.expected | Empty expected file for the new inline expectations test. |
| cpp/ql/test/library-tests/dataflow/dataflow-tests/uninitialized.expected | Updates expected results for uninitialized/dataflow tests. |
| cpp/ql/test/library-tests/dataflow/dataflow-tests/test-source-sink.expected | Updates expected source-to-sink flow results. |
| cpp/ql/test/library-tests/dataflow/dataflow-tests/localFlow-ir.expected | Updates expected IR local flow output (notably SSA phi read formatting). |
| cpp/ql/test/library-tests/dataflow/dataflow-tests/dataflow-consistency.expected | Updates expected dataflow consistency outputs. |
| cpp/ql/test/experimental/query-tests/Security/CWE/CWE-193/constant-size/ConstantSizeArrayOffByOne.expected | Updates experimental query-test expectations to match new flow behavior. |
Copilot's findings
- Files reviewed: 10/11 changed files
- Comments generated: 3
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR is a combination of two fixes with the overarching theme of computing which SSA definitions are certain (which, in this context, means "overwrites the entire buffer").
In order to model flow through something like:
(without knowing the concrete values for
i,j, andk) we model the write top[i]andp[j]as "uncertain" writes. That is, the write to the underlying SSA variable isn't overwritten. So whatever value was previously tracked by SSA is not killed by the SSA definition.The first improvement is in 8585bb6 which changes some of the SSA writes to be certain (whereas before they were only certain if the address they were writing to didn't involve pointer arithmetic).
The next change is slightly less well-motivated. This change is most likely necessary if we ever want to switch the instantiation of the shared range analysis library over from the sound IR-based SSA to dataflow-based SSA.
On
mainwe currently consider any definition from aPhiNodeto be an uncertain definition. This PR changes this so that we consider aPhiNodeto be a certain definition whenever its inputs are all certain definitions.There are some technical things that make this slightly more complicated than what you would expect: Since a
PhNodecan be its own input we need recursion over a graph with cycles, and because of QL's least fixed-point semantics we don't get the right result when doing recursion over a graph with cycles. So we instead recurse over a graph whose nodes are the the recursivePhiNodes.As always, recursion through
forallis also dangerous and should ideally be rewritten using explicit recursion over integers to reduce the risk of performance problems. However, I'll delay that until we actually see a performance problem on a database.(I expect this work to be necessary for switching the instantiation of the shared range analysis library over to dataflow-based SSA because I expect the relevant SSA variables for range analysis to be those for which any SSA definition is a certain SSA definition.)
DCA shows a couple of lost results. They're all FPs of the form added in 8e25240 (which this PR fixes) 🎉