
TurboQuant encoding for Vectors #7269

Draft
connortsui20 wants to merge 2 commits into develop from ct/turboquant

Conversation

@connortsui20
Contributor

Continuation of #7167, authored by @lwwmanning

Summary

Lossy quantization for vector data (e.g., embeddings) based on TurboQuant (https://arxiv.org/abs/2504.19874). Supports both MSE-optimal and inner-product-optimal (Prod with QJL correction) variants at 1-8 bits per coordinate.

Key components:

  • Single TurboQuant array encoding with optional QJL correction fields, storing quantized codes, norms, centroids, and rotation signs as children.
  • Structured Random Hadamard Transform (SRHT) for O(d log d) rotation, fully self-contained with no external linear algebra library.
  • Max-Lloyd centroid computation on Beta(d/2, d/2) distribution.
  • Approximate cosine similarity and dot product are computed directly on quantized arrays without full decompression.
  • Pluggable TurboQuantScheme for BtrBlocks, exposed via WriteStrategyBuilder::with_vector_quantization().
  • Benchmarks covering common embedding dimensions (128, 768, 1024, 1536).
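The SRHT rotation above can be sketched with an in-place fast Walsh-Hadamard transform preceded by random sign flips, which is what gives the O(d log d) cost without any external linear algebra library. This is a minimal illustrative sketch, not the PR's actual code; the function names `fwht` and `srht_rotate` are made up for this example.

```rust
/// In-place fast Walsh-Hadamard transform; `data.len()` must be a power of two.
fn fwht(data: &mut [f32]) {
    let n = data.len();
    assert!(n.is_power_of_two());
    let mut h = 1;
    while h < n {
        // Butterfly pass: combine pairs (i, i + h) within each 2h-wide chunk.
        for chunk in data.chunks_mut(h * 2) {
            for i in 0..h {
                let (a, b) = (chunk[i], chunk[i + h]);
                chunk[i] = a + b;
                chunk[i + h] = a - b;
            }
        }
        h *= 2;
    }
}

/// Illustrative SRHT step: flip signs by a fixed random ±1 vector (the
/// "rotation signs" stored as a child array), apply the transform, then
/// normalize by 1/sqrt(d) so the rotation is orthogonal.
fn srht_rotate(vec: &mut [f32], signs: &[i8]) {
    for (x, &s) in vec.iter_mut().zip(signs) {
        *x *= s as f32;
    }
    fwht(vec);
    let scale = 1.0 / (vec.len() as f32).sqrt();
    for x in vec.iter_mut() {
        *x *= scale;
    }
}

fn main() {
    let signs = [1i8, -1, 1, 1, -1, 1, -1, -1];
    let mut v = [1.0f32, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0];
    let norm_before: f32 = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    srht_rotate(&mut v, &signs);
    let norm_after: f32 = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    // An orthogonal rotation preserves the Euclidean norm.
    println!("{norm_before:.3} {norm_after:.3}");
}
```

Because the normalized transform and the sign flips are both orthogonal, norms are preserved, which is why storing per-vector norms alongside the codes is enough to reconstruct magnitudes later.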

Also refactors CompressingStrategy to a single constructor, and adds vortex_tensor::initialize() for session registration of tensor types, encodings, and scalar functions.
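To illustrate computing similarity directly on quantized codes without decompression, here is a toy 1-bit variant with naive ±1 centroids and stored norms. This is a hedged sketch only: the PR's encoding uses 1-8 bit codes with Max-Lloyd centroids and an optional QJL correction, and `quantize_1bit`/`approx_dot` are invented names for this example.

```rust
/// Toy 1-bit quantizer: keep only each coordinate's sign plus the vector norm.
fn quantize_1bit(v: &[f32]) -> (Vec<u8>, f32) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    let bits = v.iter().map(|&x| (x >= 0.0) as u8).collect();
    (bits, norm)
}

/// Crude dot-product estimate from codes alone: each agreeing sign
/// contributes +1/d, each disagreeing sign -1/d, rescaled by the norms.
fn approx_dot(a: &(Vec<u8>, f32), b: &(Vec<u8>, f32)) -> f32 {
    let d = a.0.len() as f32;
    let agree: f32 = a
        .0
        .iter()
        .zip(&b.0)
        .map(|(&x, &y)| if x == y { 1.0 } else { -1.0 })
        .sum();
    a.1 * b.1 * agree / d
}

fn main() {
    let u = [0.5f32, -0.3, 0.8, -0.1];
    let v = [0.4f32, -0.2, 0.7, 0.2];
    let exact: f32 = u.iter().zip(&v).map(|(a, b)| a * b).sum();
    let approx = approx_dot(&quantize_1bit(&u), &quantize_1bit(&v));
    // The estimate is coarse at 1 bit; more bits and learned centroids
    // (as in the actual encoding) tighten it considerably.
    println!("exact={exact:.3} approx={approx:.3}");
}
```

The point of the sketch is the access pattern: the estimate touches only the code and norm children of the array, which is what lets similarity kernels skip full decompression.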

API Changes

Adds the new TurboQuant array encoding, a pluggable TurboQuantScheme for BtrBlocks (exposed via WriteStrategyBuilder::with_vector_quantization()), and vortex_tensor::initialize() for session registration. Also refactors CompressingStrategy to a single constructor. TODO

Testing

TODO

@connortsui20 connortsui20 added the changelog/feature A new feature label Apr 2, 2026
@connortsui20 connortsui20 force-pushed the ct/turboquant branch 4 times, most recently from 44ca104 to 2bbee51 Compare April 2, 2026 19:54
Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Will Manning <will@willmanning.io>
Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
