Releases: stackav-oss/conch
v1.3
v1.2.1
This release updates tuning parameters for mixed-precision GEMM, giving better performance on all tested platforms.
v1.2.0
This release includes further fixes and optimizations for Varlen Attention on A10, H100, and MI300X.
v1.1.0
This release fixes microbenchmarks, speeds up Varlen Attention, and contains numerous fixes for vLLM compatibility.
v1.0.1
This release contains vLLM compatibility fixes for mixed-precision GEMM.
v1.0.0
This release adds Varlen Attention and switches the cache format for Paged Attention (FlashDecoding) and Varlen Attention (FlashAttention Varlen) to match the format used by Dao's FlashAttention.
v0.0.1
Initial OSS release 🎉