Commit 30c56a8

yoff and Copilot committed
Python: visit function parameter and return annotations in new CFG
The new (shared-CFG-based) Python control flow graph in `semmle.python.controlflow.internal.Cfg` previously did not emit CFG nodes for parameter type annotations (`def f(x: T): ...`) or for the return type annotation (`-> T`). The legacy CFG emitted both, and a small number of framework models rely on this: `LocalSources.qll`'s `annotatedInstance` walks the parameter annotation expression by way of its CFG node to track that a parameter receives an instance of the annotated class.

After the dataflow flip to the new CFG/SSA, this regression manifested as lost flows in any test exercising annotation-based parameter tracking: FastAPI `Depends()` receivers, Pydantic request bodies, Starlette `WebSocket`, the call-graph type-annotation test, and so on.

Extend `FunctionDefExpr` to visit each annotation as a child of the function-def expression, in CPython evaluation order: positional parameter annotations, `*args` annotation, keyword-only parameter annotations, `**kwargs` annotation, then the return annotation. (Lambda expressions have no annotations in Python syntax, so `LambdaExpr` is unchanged.) PEP 695 type parameters remain out of scope; they belong to the inner annotation scope, not the enclosing CFG.

Restored test results across `framework/aiohttp`, `framework/fastapi`, `framework/lxml`, the `CallGraph-type-annotations` test, and `CWE-022-PathInjection`. Two FastAPI list-comprehension MISSING markers become positive (`taint_test.py:41,55`). CPython CFG consistency remains clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
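The annotation order described above can be illustrated outside CodeQL. The following is only a sketch using Python's standard `ast` module; the helper name `annotations_in_order` and the sample function are hypothetical and not part of this commit. It lists annotation expressions syntactically, without evaluating them.

```python
import ast


def annotations_in_order(source: str) -> list[str]:
    """List the annotations of the first function in `source`, in the order
    the commit message describes: positional parameters, *args,
    keyword-only parameters, **kwargs, then the return annotation."""
    func = next(
        node for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.FunctionDef)
    )
    args = func.args
    # Assumption: positional-only and regular positional parameters are
    # treated together, by argument position.
    ordered = [p.annotation for p in args.posonlyargs + args.args]
    ordered.append(args.vararg.annotation if args.vararg else None)
    ordered += [p.annotation for p in args.kwonlyargs]
    ordered.append(args.kwarg.annotation if args.kwarg else None)
    ordered.append(func.returns)
    # Unannotated slots are skipped, so each annotation appears at most once.
    return [ast.unparse(expr) for expr in ordered if expr is not None]


EXAMPLE = "def f(x: A, y, *args: B, k: C, **kw: D) -> E: ...\n"
print(annotations_in_order(EXAMPLE))
```

The unannotated parameter `y` contributes nothing to the list, mirroring the statement that only annotations that exist become children of the function-def expression.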
1 parent 7194c65 commit 30c56a8

2 files changed

Lines changed: 63 additions & 4 deletions

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+---
+category: minorAnalysis
+---
+* The new (shared-CFG-based) Python control flow graph now visits parameter and return type annotations as CFG nodes for function definitions, matching the legacy CFG. This restores annotation-based type tracking through framework models such as FastAPI's `Depends()`, Pydantic request models, Starlette `WebSocket` handlers, and any other models that flow a class reference through `Parameter.getAnnotation()` to identify instances of the annotated class.
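To make "flow a class reference through the annotation" concrete, here is a small Python sketch of the pattern the framework models look for. The `WebSocket` class below is a local stand-in, not the real Starlette type, and `handler` is a hypothetical function.

```python
import inspect


class WebSocket:
    """Local stand-in for a framework class (hypothetical, for illustration)."""


def handler(ws: WebSocket) -> None:
    # A framework would call this with an instance of the annotated class;
    # the analysis uses the annotation to infer that `ws` is such an instance.
    ...


param = inspect.signature(handler).parameters["ws"]
# The annotation expression is a reference to the class object itself.
print(param.annotation is WebSocket)
```

If the annotation expression has no CFG node, an analysis that walks from the class reference to the parameter has nothing to attach that flow to, which is the regression the change note describes.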

python/ql/lib/semmle/python/controlflow/internal/AstNodeImpl.qll

Lines changed: 59 additions & 4 deletions
@@ -1474,10 +1474,19 @@ module Ast implements AstSig<Py::Location> {

   /**
    * A function definition expression (visits positional and keyword
-   * defaults, but NOT PEP 695 type parameters — those bind in an
-   * annotation scope that nests the function body, so they belong to
-   * the inner scope's CFG, not the enclosing scope's; the legacy CFG
-   * also omitted them).
+   * defaults followed by parameter and return type annotations, but NOT
+   * PEP 695 type parameters — those bind in an annotation scope that
+   * nests the function body, so they belong to the inner scope's CFG,
+   * not the enclosing scope's; the legacy CFG also omitted them).
+   *
+   * Evaluation order follows CPython: defaults are pushed first, then
+   * keyword-only defaults, then annotations (the `__annotations__` dict
+   * is built last, before `MAKE_FUNCTION`). Annotations are emitted as
+   * CFG nodes so that flows from a class reference into a parameter's
+   * type annotation are visible to dataflow (e.g. so that framework
+   * models like FastAPI's `Depends()` can use a parameter's type hint
+   * to track that the parameter receives an instance of the annotated
+   * class — see `LocalSources::annotatedInstance`).
    */
   additional class FunctionDefExpr extends Expr {
     private Py::FunctionExpr funcExpr;
@@ -1501,15 +1510,61 @@ module Ast implements AstSig<Py::Location> {
         rank[n + 1](Py::Expr d, int i | d = funcExpr.getArgs().getKwDefault(i) | d order by i)
     }

+    /**
+     * Gets the `n`th annotation expression, in CPython evaluation
+     * order: positional parameter annotations (by argument position),
+     * `*args` annotation, keyword-only parameter annotations (by
+     * argument position), `**kwargs` annotation, then the return
+     * annotation. Each annotation appears at most once.
+     */
+    Expr getAnnotation(int n) {
+      result.asExpr() =
+        rank[n + 1](Py::Expr a, int subOrder, int subIndex |
+          functionAnnotation(funcExpr, a, subOrder, subIndex)
+        |
+          a order by subOrder, subIndex
+        )
+    }
+
     int getNumberOfDefaults() { result = count(funcExpr.getArgs().getADefault()) }

+    int getNumberOfKwDefaults() { result = count(funcExpr.getArgs().getAKwDefault()) }
+
+    int getNumberOfAnnotations() {
+      result = count(Py::Expr a | functionAnnotation(funcExpr, a, _, _))
+    }
+
     override AstNode getChild(int index) {
       result = this.getDefault(index)
       or
       result = this.getKwDefault(index - this.getNumberOfDefaults())
+      or
+      result = this.getAnnotation(index - this.getNumberOfDefaults() - this.getNumberOfKwDefaults())
     }
   }

+  /**
+   * Holds if `a` is an annotation of `funcExpr` in slot
+   * `(subOrder, subIndex)`. Slots are CPython evaluation order:
+   * positional param annotations (subOrder 0, subIndex = argument
+   * position), `*args` annotation (1, 0), keyword-only annotations
+   * (2, position), `**kwargs` annotation (3, 0), return annotation
+   * (4, 0).
+   */
+  private predicate functionAnnotation(
+    Py::FunctionExpr funcExpr, Py::Expr a, int subOrder, int subIndex
+  ) {
+    a = funcExpr.getArgs().getAnnotation(subIndex) and subOrder = 0
+    or
+    a = funcExpr.getArgs().getVarargannotation() and subOrder = 1 and subIndex = 0
+    or
+    a = funcExpr.getArgs().getKwAnnotation(subIndex) and subOrder = 2
+    or
+    a = funcExpr.getArgs().getKwargannotation() and subOrder = 3 and subIndex = 0
+    or
+    a = funcExpr.getReturns() and subOrder = 4 and subIndex = 0
+  }
+
   /** A lambda expression (has default args evaluated at definition time). */
   additional class LambdaExpr extends Expr {
     private Py::Lambda lambda;
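The child indexing in `getChild` lays out three consecutive ranges: defaults, then keyword-only defaults, then annotations, each offset by the sizes of the ranges before it. A rough Python model of that arithmetic follows. It is a sketch of the idea only, not a translation of the QL; `get_child` and the sample lists are hypothetical names.

```python
from typing import Optional, Sequence


def get_child(
    defaults: Sequence[str],
    kw_defaults: Sequence[str],
    annotations: Sequence[str],
    index: int,
) -> Optional[str]:
    """Return the child at `index`, or None if no branch matches."""

    def nth(seq: Sequence[str], n: int) -> Optional[str]:
        # Like a QL rank lookup: out-of-range positions have no result.
        return seq[n] if 0 <= n < len(seq) else None

    # Each branch covers a disjoint index range, so at most one matches.
    for candidate in (
        nth(defaults, index),
        nth(kw_defaults, index - len(defaults)),
        nth(annotations, index - len(defaults) - len(kw_defaults)),
    ):
        if candidate is not None:
            return candidate
    return None


# One default, two keyword-only defaults, then an annotation and a return type.
children = [get_child(["d0"], ["k0", "k1"], ["A", "R"], i) for i in range(6)]
print(children)
```

Because the subtracted offsets push earlier indices below zero, each index resolves to at most one child, which keeps the child numbering dense and unambiguous.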
