Add alias-aware token threading for memory operations. #89

Draft
shreyas-omkar wants to merge 1 commit into JuliaGPU:main from shreyas-omkar:main

Conversation

@shreyas-omkar shreyas-omkar commented Feb 18, 2026

Feat #1

Introduce alias-analysis-based token threading:

  • Group pointers into alias sets.
  • Maintain per-alias-set token chains.
  • Thread tokens only between potentially aliasing operations.
  • Conservatively fall back to the global set for unknown pointers.
  • Preserve existing control flow token merging semantics.

This enables independent memory operations to execute without unnecessary serialization.
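The scheme in the list above can be illustrated with a toy model. This is a minimal sketch, not the PR's implementation: `MemOp`, `UNKNOWN_SET`, and `thread_tokens` are hypothetical names, each operation is assumed to already carry its alias set, and every operation is conservatively treated as a potential write (so even two loads in the same set are ordered).

```julia
# Hypothetical, simplified model of per-alias-set token chains.
# UNKNOWN_SET stands in for the "global set" used for unknown pointers.
const UNKNOWN_SET = :unknown

struct MemOp
    name::String
    alias_set::Symbol
end

# For each op, return the names of the earlier ops whose tokens it must consume.
function thread_tokens(ops::Vector{MemOp})
    last_op = Dict{Symbol,String}()           # most recent op on each chain
    deps = Dict{String,Vector{String}}()
    for op in ops
        if op.alias_set === UNKNOWN_SET
            # Unknown pointer: may alias anything, so join every live chain,
            # then restart all chains from this op.
            deps[op.name] = sort!(collect(values(last_op)))
            empty!(last_op)
        else
            # Known set: depend only on the previous op in the same set,
            # or on the last unknown-pointer op if the chain was restarted.
            prev = get(last_op, op.alias_set, get(last_op, UNKNOWN_SET, nothing))
            deps[op.name] = prev === nothing ? String[] : [prev]
        end
        last_op[op.alias_set] = op.name
    end
    return deps
end

ops = [
    MemOp("ld_a", :A),
    MemOp("ld_b", :B),        # independent of ld_a: no token dependency
    MemOp("st_a", :A),        # ordered after ld_a only
    MemOp("st_x", UNKNOWN_SET),
    MemOp("ld_b2", :B),       # ordered after the unknown-pointer store
]
deps = thread_tokens(ops)
```

In this model `ld_b` and `st_a` carry no dependency on each other, which is the serialization the change aims to avoid; control-flow token merging is deliberately left out of the sketch.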

Member

@maleadt maleadt left a comment

Did you test this with a concrete example that would benefit from it?

for arg in stmt.args[2:end]
    # Find the pointer argument and propagate
    arg_aliases = tracker[arg]
    if arg_aliases !== ALIAS_UNIVERSE || arg_aliases isa Set

Member

What else can arg_aliases be if not ALIAS_UNIVERSE or an AliasSet?

Author

Yes, this condition is redundant. Will fix it.
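To make the redundancy concrete, here is a small check under stated assumptions: `AliasUniverse`, `ALIAS_UNIVERSE`, and `AliasSet` below are hypothetical stand-ins, with the universe modeled as a sentinel singleton that is not itself a `Set`. The actual definitions in the PR may differ.

```julia
# Hypothetical stand-ins for the PR's types.
struct AliasUniverse end
const ALIAS_UNIVERSE = AliasUniverse()   # sentinel: "may alias anything"
const AliasSet = Set{Symbol}

original(a)   = a !== ALIAS_UNIVERSE || a isa Set   # condition as written in the diff
simplified(a) = a !== ALIAS_UNIVERSE                # second disjunct dropped

# The two predicates agree on every value the tracker can hold in this model.
candidates = (ALIAS_UNIVERSE, AliasSet(), AliasSet([:a, :b]))
agree = all(original(a) == simplified(a) for a in candidates)
```

If `ALIAS_UNIVERSE` were instead itself a `Set`, the original condition would be true for every tracked value, so the branch would never be skipped; either way the `isa Set` disjunct adds nothing.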

Comment on lines +149 to +153
function is_tile_array_constructor(func)
    # Check if this is a TileArray constructor call
    # You'll need to detect the specific GlobalRef for TileArray
    return false # TODO: implement
end

Member

TileArrays are never constructed in the kernel. Or do you mean tensor and partition views?

Author

You're right, that was a misnaming on my part. I'm renaming this to is_partition_or_tensor_view and implementing it to detect partition/tensor view call sites. The intent was to identify the point where a new SSA value gets a distinct alias set rooted at a specific base argument.
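One possible shape for such a predicate, as a sketch only: the module `HypotheticalTile` and the function names `make_tensor_view` and `make_partition_view` are placeholders invented for illustration, not the package's real intrinsics. The only real API used is `GlobalRef` with its `mod` and `name` fields.

```julia
# Placeholder module standing in for wherever the view constructors live.
module HypotheticalTile
make_tensor_view(x) = x
make_partition_view(x) = x
end

# Names of the (hypothetical) call targets that create a new view.
const VIEW_CONSTRUCTORS = (:make_tensor_view, :make_partition_view)

# True when `func` is a GlobalRef to one of the view constructors.
function is_partition_or_tensor_view(func)
    return func isa GlobalRef &&
           func.mod === HypotheticalTile &&
           func.name in VIEW_CONSTRUCTORS
end

is_view = is_partition_or_tensor_view(GlobalRef(HypotheticalTile, :make_tensor_view))
is_other = is_partition_or_tensor_view(GlobalRef(Base, :println))
```

A call site matched this way would be where the result SSA value is assigned an alias set rooted at the base argument, as described above.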

Comment on lines +83 to +88
# Block has args, body, terminator
# body is an iterator that yields (ssa_idx, entry) where entry has .stmt and .typ
for (ssa_idx, entry) in block.body
    analyze_statement!(tracker, SSAValue(ssa_idx), entry.stmt)
end
return

Member

No recursion into nested ops?

Author

The flat traversal was intentional as a first pass. I wanted to establish correct alias propagation at the top level before handling the loop and branch cases, since nested blocks raise questions about how loop-carried pointer SSA values should inherit alias sets across iterations. I will add the recursion now and descend into nested blocks from analyze_statement!. I have a benchmark with an interleaved multi-array kernel in progress to confirm that per-alias chains form correctly across branch boundaries before pushing.
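The recursion being discussed might look roughly like the following. This is a sketch over made-up IR shapes: `Block`, `NestedOp`, and `analyze_block!` are hypothetical, and recording each statement in `visited` stands in for the real `analyze_statement!` call. It does not address how loop-carried values inherit alias sets, which may need more than a single pass.

```julia
# Hypothetical IR shapes; the PR's real block and control-flow types may differ.
struct Block
    body::Vector{Any}        # plain statements and nested control-flow ops, in order
end

struct NestedOp              # stand-in for a loop or branch op
    regions::Vector{Block}   # e.g. a loop body, or then/else blocks
end

# Visit every statement in program order, descending into nested regions.
function analyze_block!(visited::Vector{Any}, block::Block)
    for stmt in block.body
        if stmt isa NestedOp
            for region in stmt.regions
                analyze_block!(visited, region)
            end
        else
            push!(visited, stmt)   # stand-in for analyze_statement!(tracker, ...)
        end
    end
    return visited
end

inner = Block(Any[:load_b])
outer = Block(Any[:load_a, NestedOp([inner]), :store_a])
order = analyze_block!(Any[], outer)
```

With the flat traversal, `:load_b` would never be seen; with the recursive one it is visited between the two top-level statements.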
