Releases: stackav-oss/conch

v1.3

05 Sep 16:35
accec84

This release adds three new kernels:

  • BEVPool
  • Non-Max Suppression (NMS)
  • Voxelization

It also adds the Conch CUDA extension, which allows compilation of a reference CUDA kernel for testing/benchmarking.
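For readers unfamiliar with one of the new kernels, Non-Max Suppression greedily keeps the highest-scoring box and discards any remaining box that overlaps it beyond an IoU threshold. The sketch below is a minimal pure-Python illustration of the algorithm itself, not Conch's kernel API; the box format and threshold are illustrative assumptions.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_threshold=0.5):
    """Greedily keep the highest-scoring box, drop boxes that overlap it."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep
```

A GPU kernel parallelizes the pairwise IoU computation, but the keep/suppress decision is the same.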

v1.2.1

18 Jun 20:44
daa2ab9

This release updates tuning parameters for mixed-precision GEMM, giving better performance on all tested platforms.

v1.2.0

13 Jun 20:52
2c95868

This release includes more fixes/optimizations for Varlen Attention on A10/H100/MI300X.

v1.1.0

12 Jun 20:48
30f3dec

This version fixes microbenchmarks, speeds up Varlen Attention, and contains numerous fixes for vLLM compatibility.

v1.0.1

10 Jun 16:15
b898dc6

This release contains vLLM compatibility fixes for mixed-precision GEMM.

v1.0.0

06 Jun 20:25
3685b96

This release adds Varlen Attention and switches the cache format for Paged Attention (FlashDecoding) and Varlen Attention (FlashAttention Varlen) to use the same format as Dao's FlashAttention.
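In Dao's FlashAttention varlen layout, sequences of different lengths are concatenated along a single token axis, with cumulative sequence lengths (`cu_seqlens`) marking each sequence's boundaries. The sketch below illustrates that packing convention in NumPy; the shapes and helper names are illustrative assumptions, not Conch's API.

```python
import numpy as np

def pack_varlen(seqs):
    """Concatenate per-sequence token arrays; build cu_seqlens offsets."""
    lens = [len(s) for s in seqs]
    cu_seqlens = np.cumsum([0] + lens)     # boundary offsets, e.g. [0, 2, 5]
    packed = np.concatenate(seqs, axis=0)  # (total_tokens, ...) buffer
    return packed, cu_seqlens

def unpack_one(packed, cu_seqlens, i):
    """Recover sequence i from the packed buffer via its offsets."""
    return packed[cu_seqlens[i]:cu_seqlens[i + 1]]
```

Sharing one layout lets Paged Attention and Varlen Attention index the same cache without per-kernel conversion.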

v0.0.1

22 Apr 18:19
2e49bfb

Initial OSS release 🎉