From 04e370661d5b90c3be6f51d065d8147e8d8f6f15 Mon Sep 17 00:00:00 2001
From: "mintlify[bot]" <109931778+mintlify[bot]@users.noreply.github.com>
Date: Mon, 29 Jun 2026 21:35:43 +0000
Subject: [PATCH 1/3] refactor: use Steps component on vector-search KB page
---
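For context on the search step that this patch moves into the Steps layout: the SQL query ranks rows by `cosineDistance`, where a smaller score means a closer match. The following is a minimal Python sketch of that ranking, assuming cosine distance is computed as 1 minus cosine similarity. The helper names (`cosine_distance`, `top_k`) and the three sample rows are hypothetical and are not part of the page or of ClickHouse.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query, rows, k=10):
    """Score every row against the query and keep the k smallest distances.

    Mirrors ORDER BY score ASC LIMIT k. Each row is a
    (file_name, caption, embedding) tuple.
    """
    scored = [
        (cosine_distance(query, embedding), file_name, caption)
        for file_name, caption, embedding in rows
    ]
    scored.sort(key=lambda item: item[0])
    return scored[:k]


# Hypothetical toy data: 3-dimensional embeddings instead of real model output.
rows = [
    ("dog.jpg", "a dog in a park", [0.9, 0.1, 0.0]),
    ("cat.jpg", "a cat on a sofa", [0.1, 0.9, 0.0]),
    ("car.jpg", "a red car", [0.0, 0.0, 1.0]),
]

results = top_k([1.0, 0.0, 0.0], rows, k=2)
for score, file_name, caption in results:
    print(f"{score:.4f}  {file_name}  {caption}")
```

This is a brute-force scan over every row, which is also what the example query on the page does. The approximate-search options are covered by the page linked under "Further reading".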
.../general-faqs/vector-search.mdx | 69 ++++++++++---------
1 file changed, 36 insertions(+), 33 deletions(-)
diff --git a/resources/support-center/knowledge-base/general-faqs/vector-search.mdx b/resources/support-center/knowledge-base/general-faqs/vector-search.mdx
index edf0628cd..a3339e037 100644
--- a/resources/support-center/knowledge-base/general-faqs/vector-search.mdx
+++ b/resources/support-center/knowledge-base/general-faqs/vector-search.mdx
@@ -21,44 +21,47 @@ The main advantages of using ClickHouse for vector search compared to using more
Here is a quick tutorial on how to use ClickHouse for vector search.
-## 1. Create embeddings {#1-create-embeddings}
-Your data (documents, images, or structured data) must be converted to _embeddings_. We recommend creating embeddings using the [OpenAI Embeddings API](https://platform.openai.com/docs/api-reference/embeddings) or using the open-source Python library [SentenceTransformers](https://www.sbert.net/).
+
+
+ Your data (documents, images, or structured data) must be converted to _embeddings_. We recommend creating embeddings using the [OpenAI Embeddings API](https://platform.openai.com/docs/api-reference/embeddings) or using the open-source Python library [SentenceTransformers](https://www.sbert.net/).
-You can think of an embedding as a large array of floating-point numbers that represent your data. [Check out this guide from OpenAI to learn more about embeddings](https://platform.openai.com/docs/guides/embeddings/what-are-embeddings).
+ You can think of an embedding as a large array of floating-point numbers that represent your data. [Check out this guide from OpenAI to learn more about embeddings](https://platform.openai.com/docs/guides/embeddings/what-are-embeddings).
+
+
+ Once you have generated embeddings, you need to store them in ClickHouse. Each embedding should be stored in a separate row and can include metadata for filtering, aggregations, or analytics. Here's an example of a table that can store images with captions:
-## 2. Store the embeddings {#2-store-the-embeddings}
-Once you have generated embeddings, you need to store them in ClickHouse. Each embedding should be stored in a separate row and can include metadata for filtering, aggregations, or analytics. Here's an example of a table that can store images with captions:
+ ```sql
+ CREATE TABLE images
+ (
+ `_file` LowCardinality(String),
+ `caption` String,
+ `image_embedding` Array(Float32)
+ )
+ ENGINE = MergeTree;
+ ```
+
+
+ Let's say you want to search for pictures of dogs in your dataset. You can use a distance function like `cosineDistance` to take an embedding of a dog image and search for related images:
-```sql
-CREATE TABLE images
-(
- `_file` LowCardinality(String),
- `caption` String,
- `image_embedding` Array(Float32)
-)
-ENGINE = MergeTree;
-```
+ ```sql
+ SELECT
+ _file,
+ caption,
+ cosineDistance(
+ -- An embedding of your "input" dog picture
+ [0.5736801028251648, 0.2516217529773712, ..., -0.6825592517852783],
+ image_embedding
+ ) AS score
+ FROM images
+ ORDER BY score ASC
+ LIMIT 10
+ ```
-## 3. Search for related embeddings {#3-search-for-related-embeddings}
-Let's say you want to search for pictures of dogs in your dataset. You can use a distance function like `cosineDistance` to take an embedding of a dog image and search for related images:
+ This query returns the `_file` names and `caption` of the top 10 images most likely to be related to your provided dog image.
+
+
-```sql
-SELECT
- _file,
- caption,
- cosineDistance(
- -- An embedding of your "input" dog picture
- [0.5736801028251648, 0.2516217529773712, ..., -0.6825592517852783],
- image_embedding
- ) AS score
-FROM images
-ORDER BY score ASC
-LIMIT 10
-```
-
-This query returns the `_file` names and `caption` of the top 10 images most likely to be related to your provided dog image.
-
-## Further Reading {#further-reading}
+## Further reading {#further-reading}
To follow a more in-depth tutorial on vector search using ClickHouse, please see:
- [Exact and Approximate Vector Search](/reference/engines/table-engines/mergetree-family/annindexes)
From 1063ada978a500cef338dd6c65556bca6287c4b9 Mon Sep 17 00:00:00 2001
From: "mintlify[bot]" <109931778+mintlify[bot]@users.noreply.github.com>
Date: Mon, 29 Jun 2026 21:43:19 +0000
Subject: [PATCH 2/3] refactor: wrap existing headings with Step, drop numbering, keep anchors
---
.../knowledge-base/general-faqs/vector-search.mdx | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/resources/support-center/knowledge-base/general-faqs/vector-search.mdx b/resources/support-center/knowledge-base/general-faqs/vector-search.mdx
index a3339e037..078d4fd5d 100644
--- a/resources/support-center/knowledge-base/general-faqs/vector-search.mdx
+++ b/resources/support-center/knowledge-base/general-faqs/vector-search.mdx
@@ -22,12 +22,16 @@ The main advantages of using ClickHouse for vector search compared to using more
Here is a quick tutorial on how to use ClickHouse for vector search.
-
+
+ ## Create embeddings {#1-create-embeddings}
+
Your data (documents, images, or structured data) must be converted to _embeddings_. We recommend creating embeddings using the [OpenAI Embeddings API](https://platform.openai.com/docs/api-reference/embeddings) or using the open-source Python library [SentenceTransformers](https://www.sbert.net/).
You can think of an embedding as a large array of floating-point numbers that represent your data. [Check out this guide from OpenAI to learn more about embeddings](https://platform.openai.com/docs/guides/embeddings/what-are-embeddings).
-
+
+ ## Store the embeddings {#2-store-the-embeddings}
+
Once you have generated embeddings, you need to store them in ClickHouse. Each embedding should be stored in a separate row and can include metadata for filtering, aggregations, or analytics. Here's an example of a table that can store images with captions:
```sql
@@ -40,7 +44,9 @@ Here is a quick tutorial on how to use ClickHouse for vector search.
ENGINE = MergeTree;
```
-
+
+ ## Search for related embeddings {#3-search-for-related-embeddings}
+
Let's say you want to search for pictures of dogs in your dataset. You can use a distance function like `cosineDistance` to take an embedding of a dog image and search for related images:
```sql
From 793544290b5da17c0fc8317f5d66356101c90152 Mon Sep 17 00:00:00 2001
From: "mintlify[bot]" <109931778+mintlify[bot]@users.noreply.github.com>
Date: Tue, 30 Jun 2026 08:21:07 +0000
Subject: [PATCH 3/3] fix: use Step title prop and unindent markdown in vector-search Steps
---
.../general-faqs/vector-search.mdx | 86 +++++++++----------
1 file changed, 43 insertions(+), 43 deletions(-)
diff --git a/resources/support-center/knowledge-base/general-faqs/vector-search.mdx b/resources/support-center/knowledge-base/general-faqs/vector-search.mdx
index 078d4fd5d..70abfa34a 100644
--- a/resources/support-center/knowledge-base/general-faqs/vector-search.mdx
+++ b/resources/support-center/knowledge-base/general-faqs/vector-search.mdx
@@ -22,49 +22,49 @@ The main advantages of using ClickHouse for vector search compared to using more
Here is a quick tutorial on how to use ClickHouse for vector search.
-
- ## Create embeddings {#1-create-embeddings}
-
- Your data (documents, images, or structured data) must be converted to _embeddings_. We recommend creating embeddings using the [OpenAI Embeddings API](https://platform.openai.com/docs/api-reference/embeddings) or using the open-source Python library [SentenceTransformers](https://www.sbert.net/).
-
- You can think of an embedding as a large array of floating-point numbers that represent your data. [Check out this guide from OpenAI to learn more about embeddings](https://platform.openai.com/docs/guides/embeddings/what-are-embeddings).
-
-
- ## Store the embeddings {#2-store-the-embeddings}
-
- Once you have generated embeddings, you need to store them in ClickHouse. Each embedding should be stored in a separate row and can include metadata for filtering, aggregations, or analytics. Here's an example of a table that can store images with captions:
-
- ```sql
- CREATE TABLE images
- (
- `_file` LowCardinality(String),
- `caption` String,
- `image_embedding` Array(Float32)
- )
- ENGINE = MergeTree;
- ```
-
-
- ## Search for related embeddings {#3-search-for-related-embeddings}
-
- Let's say you want to search for pictures of dogs in your dataset. You can use a distance function like `cosineDistance` to take an embedding of a dog image and search for related images:
-
- ```sql
- SELECT
- _file,
- caption,
- cosineDistance(
- -- An embedding of your "input" dog picture
- [0.5736801028251648, 0.2516217529773712, ..., -0.6825592517852783],
- image_embedding
- ) AS score
- FROM images
- ORDER BY score ASC
- LIMIT 10
- ```
-
- This query returns the `_file` names and `caption` of the top 10 images most likely to be related to your provided dog image.
-
+
+
+Your data (documents, images, or structured data) must be converted to _embeddings_. We recommend creating embeddings using the [OpenAI Embeddings API](https://platform.openai.com/docs/api-reference/embeddings) or using the open-source Python library [SentenceTransformers](https://www.sbert.net/).
+
+You can think of an embedding as a large array of floating-point numbers that represent your data. [Check out this guide from OpenAI to learn more about embeddings](https://platform.openai.com/docs/guides/embeddings/what-are-embeddings).
+
+
+
+
+Once you have generated embeddings, you need to store them in ClickHouse. Each embedding should be stored in a separate row and can include metadata for filtering, aggregations, or analytics. Here's an example of a table that can store images with captions:
+
+```sql
+CREATE TABLE images
+(
+ `_file` LowCardinality(String),
+ `caption` String,
+ `image_embedding` Array(Float32)
+)
+ENGINE = MergeTree;
+```
+
+
+
+
+Let's say you want to search for pictures of dogs in your dataset. You can use a distance function like `cosineDistance` to take an embedding of a dog image and search for related images:
+
+```sql
+SELECT
+ _file,
+ caption,
+ cosineDistance(
+ -- An embedding of your "input" dog picture
+ [0.5736801028251648, 0.2516217529773712, ..., -0.6825592517852783],
+ image_embedding
+ ) AS score
+FROM images
+ORDER BY score ASC
+LIMIT 10
+```
+
+This query returns the `_file` names and `caption` of the top 10 images most likely to be related to your provided dog image.
+
+
## Further reading {#further-reading}