
fix: fail fast HNSW index when embedding dimension exceeds 2000 #361

Merged

nicoloboschi merged 1 commit into vectorize-io:main from slayoffer:fix/hnsw-dimension-limit on Feb 26, 2026

Conversation

@slayoffer (Contributor)

Problem

ensure_vector_extension() and ensure_embedding_dimension() unconditionally create HNSW indexes at startup. pgvector's HNSW implementation has a hard limit of 2000 dimensions — any column with more dimensions causes a ProgramLimitExceeded error:

psycopg2.errors.ProgramLimitExceeded: column cannot have more than 2000 dimensions for hnsw index

[SQL: CREATE INDEX IF NOT EXISTS idx_memory_units_embedding
      ON public.memory_units
      USING hnsw (embedding vector_cosine_ops)
      WITH (m = 16, ef_construction = 64)]

This crashes the application at startup for anyone using high-dimensional embeddings like OpenAI text-embedding-3-large (3072 dimensions) with standard pgvector (not VChord).

This was introduced by #350 and #355, which added VChord support and the ensure_vector_extension() startup check.

Fix

Before creating an HNSW index, check the embedding column dimension. If it exceeds 2000, log a warning and skip index creation — falling back to sequential scan, which works correctly at any dimension.

Applied in two places (a sketch of the check follows the list):

  1. ensure_vector_extension() — queries pg_attribute.atttypmod per table to get the embedding dimension
  2. ensure_embedding_dimension() — uses the already-available required_dimension parameter
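
A minimal sketch of this check, assuming a psycopg2 connection and the memory_units.embedding column from the traceback above. The helper name maybe_create_hnsw_index and the warning wording are illustrative, not the PR's actual code; for pgvector's vector type, pg_attribute.atttypmod holds the declared dimension directly:

```python
import logging

PGVECTOR_HNSW_MAX_DIM = 2000  # hard limit in pgvector's HNSW implementation


def maybe_create_hnsw_index(conn, table: str, column: str = "embedding") -> None:
    """Create an HNSW index only when the column's dimension is within pgvector's limit."""
    with conn.cursor() as cur:
        # For pgvector's vector type, atttypmod is the declared dimension
        # (it is -1 when the column was declared without one).
        cur.execute(
            "SELECT atttypmod FROM pg_attribute "
            "WHERE attrelid = %s::regclass AND attname = %s",
            (f"public.{table}", column),
        )
        row = cur.fetchone()
        dim = row[0] if row else -1

    if dim > PGVECTOR_HNSW_MAX_DIM:
        logging.warning(
            "Skipping HNSW index on %s: embedding dimension %d exceeds "
            "pgvector HNSW limit of %d dims",
            table, dim, PGVECTOR_HNSW_MAX_DIM,
        )
        return

    with conn.cursor() as cur:
        # Table/column names are trusted internal identifiers here.
        cur.execute(
            f"CREATE INDEX IF NOT EXISTS idx_{table}_{column} "
            f"ON public.{table} USING hnsw ({column} vector_cosine_ops) "
            f"WITH (m = 16, ef_construction = 64)"
        )
    conn.commit()
```

The warning mirrors the log line in the test plan below; note the merged PR later replaced the skip with a hard failure (see the review discussion).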

Context

  • pgvector 0.8.x HNSW limit: 2000 dimensions (pgvector docs)
  • Cloud SQL for PostgreSQL ships pgvector 0.8.1, which has this limit (see the version check after this list)
  • VChord (vchordrq) does not have this limitation — only the pgvector HNSW path is affected
  • Sequential scan performance is acceptable for small-to-medium datasets; users needing indexed performance at >2000 dims should use VChord
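
To confirm which pgvector build a given database runs (per the Cloud SQL bullet above), a small helper; the function name is illustrative and conn is assumed to be a psycopg2 connection:

```python
def pgvector_version(conn) -> str | None:
    """Return the installed pgvector version, or None if the extension is absent."""
    with conn.cursor() as cur:
        cur.execute("SELECT extversion FROM pg_extension WHERE extname = 'vector'")
        row = cur.fetchone()
    return row[0] if row else None  # e.g. "0.8.1" on current Cloud SQL
```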

Test plan

  • Verified on Cloud SQL pgvector 0.8.1 with 3072-dim OpenAI embeddings — startup succeeds
  • Log output shows: Skipping HNSW index on memory_units: embedding dimension 3072 exceeds pgvector HNSW limit of 2000 dims
  • Verify HNSW index is still created for ≤2000-dim embeddings (e.g., local sentence-transformers 384-dim)

🤖 Generated with Claude Code

@nicoloboschi (Collaborator) left a comment


we should raise the exception if pgvector can't handle that model

Instead of silently skipping HNSW index creation for embeddings > 2000
dimensions, raise a RuntimeError with an actionable message suggesting
pgvectorscale/DiskANN as an alternative.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@slayoffer force-pushed the fix/hnsw-dimension-limit branch from 3d98097 to 9b4401a on February 16, 2026 at 23:50
@slayoffer (Contributor, Author)

Updated: now raises RuntimeError instead of silently skipping HNSW index creation when dimensions exceed 2000. Both code paths (ensure_embedding_dimension and ensure_vector_extension) raise with an actionable message suggesting pgvectorscale/DiskANN as alternatives.
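
As a sketch of the updated behavior, assuming a hypothetical check_hnsw_dimension helper; only the 2000-dimension limit comes from pgvector, while the function name and message wording are illustrative:

```python
PGVECTOR_HNSW_MAX_DIM = 2000  # hard limit in pgvector's HNSW implementation


def check_hnsw_dimension(dim: int, table: str) -> None:
    """Fail fast at startup instead of silently skipping index creation."""
    if dim > PGVECTOR_HNSW_MAX_DIM:
        raise RuntimeError(
            f"Cannot create HNSW index on {table}: embedding dimension {dim} "
            f"exceeds pgvector's HNSW limit of {PGVECTOR_HNSW_MAX_DIM} dimensions. "
            f"Use an embedding model with <= {PGVECTOR_HNSW_MAX_DIM} dimensions, "
            f"or switch to pgvectorscale/DiskANN."
        )
```

Per the PR description, this would be called from both paths: ensure_embedding_dimension (which already has required_dimension) and ensure_vector_extension (after the atttypmod lookup sketched earlier).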

@slayoffer (Contributor, Author) left a comment


Done — updated both ensure_embedding_dimension and ensure_vector_extension to raise RuntimeError when embedding dimensions exceed 2000. The error message suggests switching to pgvectorscale/DiskANN, which now has native support after #378/#381.

@nicoloboschi changed the title from "fix: skip HNSW index when embedding dimension exceeds 2000" to "fix: fail fast HNSW index when embedding dimension exceeds 2000" on Feb 26, 2026
@nicoloboschi merged commit 8cd65b9 into vectorize-io:main on Feb 26, 2026
14 of 29 checks passed
r0gig0r pushed a commit to r0gig0r/hindsight that referenced this pull request Feb 26, 2026
…nd customization

Key upstream features:
- Batch observations consolidation (vectorize-io#430) - multiple facts per LLM call
- 18 new MCP tools + per-bank tool filtering (vectorize-io#435, vectorize-io#439)
- retain_mission, observations_mission, reflect_mission customization (vectorize-io#419)
- Disposition settings as hierarchical config (vectorize-io#419)
- Graph memories filtering with tags (vectorize-io#431)
- Observation invalidation on memory delete (vectorize-io#429)
- Reflect agent improvements (vectorize-io#428)
- Bank config API enabled by default (vectorize-io#426)
- pgvector HNSW dimension limit check (vectorize-io#361)
- Dynamic schema_getter for multi-tenant file storage (vectorize-io#440)

Conflict resolution:
- config.py: kept our tuned CONSOLIDATION_BATCH_SIZE=500 and MAX_TOKENS=1024
- consolidator.py, prompts.py, fact_extraction.py: took upstream (batch refactor)
- orchestrator.py: removed obsolete bank_mission kwarg (now config-based)
- Deleted test_consolidation_quality.py (tested removed _consolidate_with_llm)

Our primary LLM fallback (claude-code + OpenRouter) preserved in all files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>