SSA: Add a shared signature for SSA and a module to implement it. by aschackmull · Pull Request #20646 · github/codeql

aschackmull · 2025-10-15T08:57:20Z

Unused so far, but this is intended as a step towards a unified SSA API, and a way to share more code by removing the need for individual languages to cache the SSA predicates and add wrapper classes.

Copilot

Pull Request Overview

This PR introduces a shared signature for SSA (Static Single Assignment) and a module to implement it, as a step towards a unified SSA API. The change enables code sharing across languages by providing standardized signatures and implementations that eliminate the need for individual languages to cache SSA predicates and add wrapper classes.

Key changes:

- Added a comprehensive SsaSig signature defining the complete SSA class hierarchy and predicates
- Introduced a MakeSsa module that implements the SsaSig signature using existing SSA infrastructure
- Updated the InputSig to use generic type signatures instead of specific BasicBlockSig

shared/ssa/codeql/ssa/Ssa.qll

hvitved

Overall LGTM, some comments.

hvitved · 2025-10-20T09:24:44Z

shared/ssa/codeql/ssa/Ssa.qll

+     * For SSA definitions occurring at the beginning of a basic block, such as
+     * phi definitions, this will get the first control flow node of the basic block.


I'm not sure that is the case for existing languages. Would it be problematic for this predicate to not return a value for phi nodes (as well as SsaImplicitEntryDefinitions)?

I know that this is a change for several languages, but Java depends on this in a couple of places, and I actually also made the change independently for C# over in #20558, since I needed it for nullness.
And in a vacuum I think it does make sense to always return a value - not returning a value for phis and entries presents a bit of a gotcha, I think.

hvitved · 2025-10-20T09:28:41Z

shared/ssa/codeql/ssa/Ssa.qll

+   * variable would be visible.
+   */
+  class SsaPhiDefinition extends SsaDefinition {
+    /** Holds if `inp` is an input to the phi definition along the edge originating in `bb`. */


the phi -> this phi

hvitved · 2025-10-20T09:30:53Z

shared/ssa/codeql/ssa/Ssa.qll

+   * point in the flow graph where otherwise two or more definitions for the
+   * variable would be visible.
+   */
+  class SsaPhiDefinition extends SsaDefinition {


I suggest copying the example from codeql/ruby/ql/lib/codeql/ruby/dataflow/SSA.qll (lines 370 to 381 in 7b32cd4):

   * A phi node. For example, in
   *
   * ```rb
   * if b
   *   x = 0
   * else
   *   x = 1
   * end
   * puts x
   * ```
   *
   * a phi node for `x` is inserted just before the call `puts x`.

hvitved · 2025-10-20T09:31:11Z

shared/ssa/codeql/ssa/Ssa.qll

+    /** Holds if `inp` is an input to the phi definition along the edge originating in `bb`. */
+    predicate hasInputFromBlock(SsaDefinition inp, BasicBlock bb);
+
+    /** Gets an input of this phi definition. */


Copy example from codeql/ruby/ql/lib/codeql/ruby/dataflow/SSA.qll (line 408 in 7b32cd4)?

    final Definition getAnInput() { this.hasInputFromBlock(result, _) }
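For readers less familiar with phi definitions, the Ruby example and the `hasInputFromBlock` / `getAnInput` relationship quoted above can be modelled with a small Python sketch. This is not CodeQL and not how the shared library computes SSA; the block names, the `def_at_end` table, and both helper functions are hypothetical and exist only for illustration.

```python
# Hand-written model of the control flow graph for:
#   if b then x = 0 else x = 1 end; puts x
# Block names and data layout are assumptions made for this sketch.

# Predecessors of each basic block.
preds = {
    "entry": [],
    "then": ["entry"],
    "else": ["entry"],
    "join": ["then", "else"],  # the block containing `puts x`
}

# The definition of `x` that is visible at the end of each block.
def_at_end = {"then": "x = 0", "else": "x = 1"}


def phi_inputs(block):
    """Return (input definition, predecessor block) pairs for a phi at `block`.

    A phi is only needed when two or more distinct definitions reach the block.
    """
    pairs = [(def_at_end[p], p) for p in preds[block] if p in def_at_end]
    distinct = {d for d, _ in pairs}
    return pairs if len(distinct) >= 2 else []


def get_an_input(block):
    """Project away the predecessor block, mirroring `hasInputFromBlock(result, _)`."""
    return {d for d, _ in phi_inputs(block)}


print(phi_inputs("join"))
print(sorted(get_an_input("join")))
```

In this model the phi for `x` sits at the start of the `join` block, with one input per incoming edge; `get_an_input` simply forgets which edge each input came from.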

hvitved · 2025-10-20T09:32:42Z

shared/ssa/codeql/ssa/Ssa.qll

+       */
+      Expr getValue();
+
+      /** Holds if this write is an initialization of a parameter. */


a parameter -> parameter p.

hvitved · 2025-10-20T09:44:41Z

shared/ssa/codeql/ssa/Ssa.qll

+     * pseudo-read of the captured variable at the point of capture.
+     */
+    predicate variableCapture(
+      SourceVariable captured, SourceVariable closureVar, BasicBlock bb, int i


I know that in C# and Java, we tag local variables with a callable to account for captures, but I forgot why that is needed? In both Ruby and Rust, we simply have SourceVariable = LocalVariable, and that works perfectly fine, but would that work with the signature above (i.e., captured = closureVar)?

I seem to recall that it was necessary - at least at some point, but I also forgot exactly why. Looking at the shared SSA lib, I don't think it's necessary there. It might be necessary (or perhaps just convenient) in the language-specific portions for C# and Java.
This signature is indeed an example where it would break if SourceVariables were not tied to a callable - we could fix it by adding a column with e.g. the first BasicBlock of the closure to implicitly identify the callable. Do you think we should do that here? Otherwise we'll indeed require tagging if this is to be implemented with anything other than none().

Yeah, I was also thinking if we should replace closureVar with the callable representing the closure or, as you suggest, its entry basic block.
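The concern in this thread can be sketched in Python under some loud assumptions: the row shapes, the block names (`bb1`, `f_entry`, `g_entry`), and the lookup helpers below are all hypothetical and are not part of the shared SSA library. The sketch only illustrates why a `closureVar` column that equals `captured` cannot identify the closure, and how an entry-block column would.

```python
from typing import NamedTuple


class CaptureUntagged(NamedTuple):
    """Row shape when SourceVariable = LocalVariable, so captured == closure_var."""
    captured: str
    closure_var: str
    bb: str  # block containing the capture point
    i: int   # index of the capture point within `bb`


class CaptureWithEntry(NamedTuple):
    """Alternative row shape: the closure is identified by its entry basic block."""
    captured: str
    closure_entry_bb: str
    bb: str
    i: int


# Implicit entry definitions inside two closures that both capture `x`:
# (variable, entry block of the closure). Purely illustrative data.
entry_defs = {("x", "f_entry"), ("x", "g_entry")}

untagged = CaptureUntagged("x", "x", "bb1", 3)
with_entry = CaptureWithEntry("x", "f_entry", "bb1", 3)


def targets_untagged(row):
    """Entry definitions matching the row when only the variable is known."""
    return {d for d in entry_defs if d[0] == row.closure_var}


def targets_with_entry(row):
    """Entry definitions matching the row when the closure's entry block is known."""
    return {d for d in entry_defs if d == (row.captured, row.closure_entry_bb)}


print(len(targets_untagged(untagged)))   # ambiguous: both closures match
print(targets_with_entry(with_entry))    # unambiguous: only closure f matches
```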

hvitved · 2025-10-20T09:47:19Z

shared/ssa/codeql/ssa/Ssa.qll

+       * phi nodes, this will get the first control flow node of the basic block.
+       */
+      ControlFlowNode getControlFlowNode() {
+        exists(BasicBlock bb, int i | this.definesAt(_, bb, i) | result = bb.getNode(0.maximum(i)))


Again, I'm not convinced that we want 0.maximum(i).
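To spell out what is being debated, here is a minimal Python sketch of the two behaviours, assuming that definitions at the start of a block (phi and entry definitions) use index -1. The function name and the list-of-strings block representation are hypothetical; `max(0, i)` plays the role of QL's `0.maximum(i)`.

```python
def get_control_flow_node(bb_nodes, i, clamp=True):
    """Return the node associated with a definition at index `i` of a block.

    With `clamp=True`, an index of -1 is mapped to the first node of the
    block. With `clamp=False`, such definitions have no associated node.
    """
    if clamp:
        return bb_nodes[max(0, i)]
    return bb_nodes[i] if i >= 0 else None


nodes = ["n0", "n1"]
print(get_control_flow_node(nodes, -1))               # clamped to the first node
print(get_control_flow_node(nodes, -1, clamp=False))  # no node for phi/entry
print(get_control_flow_node(nodes, 1))                # ordinary write definition
```

The clamped variant always yields a result, which is the behaviour the doc comment describes; the unclamped variant is the alternative raised in the review.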

hvitved · 2025-10-20T09:48:27Z

shared/ssa/codeql/ssa/Ssa.qll

+     */
+    class SsaWriteDefinition extends SsaDefinition instanceof WriteDefinition { }
+
+    private predicate explicitWrite(WriteDefinition def, VariableWrite write) {


I would cache this.

hvitved · 2025-10-20T09:49:06Z

shared/ssa/codeql/ssa/Ssa.qll

+      Expr getValue() { result = this.getDefinition().getValue() }
+    }
+
+    private predicate parameterInit(WriteDefinition def, Parameter p) {


I would cache this.

hvitved · 2025-10-20T09:54:51Z

shared/ssa/codeql/ssa/Ssa.qll

+      SsaImplicitEntryDefinition() { this.definesAt(_, any(EntryBasicBlock bb), -1) }
+
+      /** Holds if this is a closure definition that captures the value of `capturedvar`. */
+      predicate captures(SsaDefinition capturedvar) { captures(this, capturedvar) }


I worry that this predicate may give the wrong impression that this definition will always read the same value as capturedvar (I suspect you included it because that is actually the case in Java?).

I was a bit back and forth on this. Firstly, I think it's useful to be able to get to the values of the outer scope in ad-hoc def-use chains like getAnUltimateDefinition, and Java certainly makes good use of it, even though it's technically a bit wrong for Kotlin (Kotlin allows post-capture modifications IIRC).

Another option would be to extend it to get all the possible inputs from the outer scope in case the variable is modified post-capture - that way the Java case would stay the same and other languages would see potentially multiple origins for a capture init - a bit like a phi node.

    x = 0
    f = () => { .. x .. } // currently sees x=0, could be extended to also see x=1
    x = 1

What do you think? OTOH maybe that's silly, if we don't also include the following

    x = 0
    f () => { .. x .. } // can see x=0, but what about x=2 from g?
    g () => { .. x=2; .. }

C# used to have rather complex logic for doing this kind of thing, but I got rid of it when switching over to the shared capture library. I think it is hard to make a good analysis for this kind of thing without ultimately resorting to saying that all other SSA definitions could potentially be a source, which would make it useless.

So, if you still want it for Java, I think I would prefer to have that as a Java-specific thing.

Alright, then I think I'll just delete it from the shared class.
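The gotcha discussed in this thread is easy to reproduce in any language whose closures capture variables rather than values. The Python sketch below is only an analogy to the pseudocode above (it says nothing about Java, Kotlin, or C# specifically), and the function names are made up for the example.

```python
def post_capture_write():
    x = 0
    f = lambda: x  # captures the variable `x`, not the value 0
    x = 1          # post-capture modification in the enclosing scope
    return f()


def write_from_other_closure():
    x = 0

    def f():
        return x

    def g():
        nonlocal x
        x = 2  # a second closure writes to the shared variable

    before = f()
    g()
    after = f()
    return before, after


print(post_capture_write())
print(write_from_other_closure())
```

In both functions the value read inside `f` is not fixed at the point of capture, which is why a `captures` predicate pointing at a single outer definition could mislead in such languages.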

aschackmull requested a review from a team as a code owner October 15, 2025 08:57

Copilot AI review requested due to automatic review settings October 15, 2025 08:57

aschackmull added the no-change-note-required This PR does not need a change note label Oct 15, 2025

Copilot AI reviewed Oct 15, 2025

github-advanced-security bot found potential problems Oct 15, 2025

SSA: Add a shared signature for SSA and a module to implement it.

b196714

aschackmull force-pushed the ssa/ssa-sig branch from caaaf36 to b196714 Compare October 15, 2025 09:02

hvitved reviewed Oct 20, 2025

aschackmull added 2 commits October 20, 2025 14:02

SSA: Address some review comments.

be626bf

SSA: Remove variable capture reference from shared class.

242f12d

hvitved approved these changes Oct 21, 2025

aschackmull merged commit 414e5ec into github:main Oct 21, 2025
36 checks passed

aschackmull deleted the ssa/ssa-sig branch October 21, 2025 10:14
