The Platform for Self-Improving Code. Ideal for GPU kernels, ML model development, feature engineering, prompt engineering, and other optimizable code.
Updated Feb 17, 2026 - Python
Extended TileLang as a unified DSL to enable high-performance kernel development for Near-Memory Computing, Distributed Memory AI Accelerators, and Networked Accelerators.
Forge: Swarm Agents That Turn Slow PyTorch Into Fast CUDA/Triton Kernels
A collection of high-performance CUDA kernels and experiments for learning and optimizing GPU compute primitives.
Optimized Ubuntu Touch for Lenovo Tab M8 HD (TB-8505F) - Kernel improvements, performance tuning, boot experience, and system optimizations for the MediaTek Helio A22 tablet
Skill pack for custom PyTorch MPS kernels on Apple Silicon (examples, tests, and optimization patterns).