Releases: stackav-oss/conch

v1.3

05 Sep 16:35
accec84

This release adds three new kernels:

  • BEVPool
  • Non-Max Suppression (NMS)
  • Voxelization

It also adds the Conch CUDA extension, which allows compilation of a reference CUDA kernel for testing/benchmarking.
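For readers unfamiliar with one of the new kernels, Non-Max Suppression greedily keeps the highest-scoring box and discards any remaining box that overlaps it beyond an IoU threshold. The sketch below is a minimal pure-Python illustration of the algorithm itself, not Conch's kernel API; the box format and threshold are illustrative assumptions.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_threshold=0.5):
    """Greedily keep the highest-scoring box, drop boxes that overlap it."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep
```

A GPU kernel parallelizes the pairwise IoU computation, but the keep/suppress decision is the same.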

v1.2.1

18 Jun 20:44
daa2ab9

This release updates tuning parameters for mixed-precision GEMM, giving better performance on all tested platforms.

v1.2.0

13 Jun 20:52
2c95868

This release includes more fixes/optimizations for Varlen Attention on A10/H100/MI300X.

v1.1.0

12 Jun 20:48
30f3dec

This version fixes microbenchmarks, speeds up Varlen Attention, and contains numerous fixes for vLLM compatibility.

v1.0.1

10 Jun 16:15
b898dc6

This release contains vLLM compatibility fixes for mixed-precision GEMM.

v1.0.0

06 Jun 20:25
3685b96

This release adds Varlen Attention and switches the cache format for Paged Attention (FlashDecoding) and Varlen Attention (FlashAttention Varlen) to use the same format as Dao's FlashAttention.
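In Dao's FlashAttention varlen layout, sequences of different lengths are concatenated along a single token axis, with cumulative sequence lengths (`cu_seqlens`) marking each sequence's boundaries. The sketch below illustrates that packing convention in NumPy; the shapes and helper names are illustrative assumptions, not Conch's API.

```python
import numpy as np

def pack_varlen(seqs):
    """Concatenate per-sequence token arrays; build cu_seqlens offsets."""
    lens = [len(s) for s in seqs]
    cu_seqlens = np.cumsum([0] + lens)     # boundary offsets, e.g. [0, 2, 5]
    packed = np.concatenate(seqs, axis=0)  # (total_tokens, ...) buffer
    return packed, cu_seqlens

def unpack_one(packed, cu_seqlens, i):
    """Recover sequence i from the packed buffer via its offsets."""
    return packed[cu_seqlens[i]:cu_seqlens[i + 1]]
```

Sharing one layout lets Paged Attention and Varlen Attention index the same cache without per-kernel conversion.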

v0.0.1

22 Apr 18:19
2e49bfb

Initial OSS release 🎉