Pull requests: NVIDIA/TransformerEngine
Add head dim 256 support for SDPA on Blackwell
#2906
opened Apr 21, 2026 by
yaox12
Member
1 of 13 tasks
[PyTorch] Fix cuteDSL kernel incorrect numerics when K is 64 aligned
2.15.0
#2905
opened Apr 21, 2026 by
ksivaman
Member
6 of 13 tasks
Make NS coefficients parameter 2D in Python API
2.15.0
#2904
opened Apr 20, 2026 by
vcherepanov-nv
Collaborator
5 of 13 tasks
Fix CP crash with GQA + asymmetric KV head dims (#2868)
community-contribution
PRs from external contributor outside the core maintainers, representing community-driven work.
#2901
opened Apr 19, 2026 by
beccohov
7 of 13 tasks
[PyTorch] Expose function to bulk-allocate tensors backed by the same buffer
#2900
opened Apr 18, 2026 by
timmoon10
Collaborator
9 of 13 tasks
add support for enabling cuda graph under thd format in megatron.
#2898
opened Apr 17, 2026 by
HaochenYuan
13 tasks
Improve the dimension checks for the FP8 recipes
#2894
opened Apr 16, 2026 by
ptrendx
Member
13 tasks
Bias/Dbias Support for GroupedLinear
#2885
opened Apr 15, 2026 by
vthumbe1503
Collaborator
13 tasks
[Debug] Add AutoswitchGEmm for Debug Precision Tool
#2883
opened Apr 15, 2026 by
shangxiaokang
Draft
3 of 13 tasks
fix(readme): update broken links and modernize project description
#2879
opened Apr 14, 2026 by
sbhavani
Collaborator
3 of 13 tasks
[PyTorch] Split TE ops op_forward into op_forward and setup_context
#2877
opened Apr 14, 2026 by
pggPL
Collaborator
5 of 7 tasks
[DONOT MERGE] Wgrad cute dsl v2
#2872
opened Apr 13, 2026 by
vthumbe1503
Collaborator
Draft
13 tasks
[JAX] Add debug validation mode for runtime group size alignment
#2867
opened Apr 11, 2026 by
jberchtold-nvidia
Collaborator
Draft
13 tasks
Optimizations for MXFP8/NVFP4 dequantize kernels
#2865
opened Apr 10, 2026 by
YigongQin
8 of 13 tasks
Adds GEMM Profiling Guide to TE
#2863
opened Apr 9, 2026 by
jomitchellnv
Contributor
7 tasks
Add cpplint and ruff linter to pre-commit and fix lint violations
#2853
opened Apr 8, 2026 by
pstjohn
Contributor
Bump transformers from 4.55.0 to 5.0.0rc3 in /docs/examples/te_gemma
dependencies
Pull requests that update a dependency file
python
Pull requests that update python code
#2851
opened Apr 8, 2026 by
dependabot
bot
Bump transformers from 4.57.0 to 5.0.0rc3 in /docs/examples/te_llama
dependencies
Pull requests that update a dependency file
python
Pull requests that update python code
#2850
opened Apr 8, 2026 by
dependabot
bot
Skip activation kernels when tensor size is zero
bug
Something isn't working
#2848
opened Apr 8, 2026 by
timmoon10
Collaborator
8 of 13 tasks
[Core] Report CUDA versions when NVRTC compilation fails
enhancement
New feature or request
#2842
opened Apr 7, 2026 by
timmoon10
Collaborator
8 of 13 tasks