46 changes: 0 additions & 46 deletions docs/ai/conceptual/vector-databases.md

This file was deleted.

30 changes: 27 additions & 3 deletions docs/ai/toc.yml
Expand Up @@ -48,8 +48,6 @@ items:
href: conceptual/understanding-tokens.md
- name: Embeddings
href: conceptual/embeddings.md
- name: Vector databases
href: conceptual/vector-databases.md
- name: Data ingestion
href: conceptual/data-ingestion.md
- name: Prompt engineering
Expand Down Expand Up @@ -79,7 +77,7 @@ items:
- name: Get started with the RAG sample
href: get-started-app-chat-template.md
- name: Implement RAG using vector search
href: tutorials/tutorial-ai-vector-search.md
href: vector-stores/tutorial-vector-search.md
- name: Scale Azure OpenAI with Azure Container Apps
href: get-started-app-chat-scaling-with-azure-container-apps.md
- name: MCP client/server
Expand All @@ -94,6 +92,32 @@ items:
items:
- name: Use Microsoft.ML.Tokenizers
href: how-to/use-tokenizers.md
- name: Vector stores
items:
- name: Vector databases overview
href: vector-stores/overview.md
- name: How-to
items:
- name: Use vector stores
href: vector-stores/how-to/use-vector-stores.md
- name: Build a vector search app
href: vector-stores/how-to/build-vector-search-app.md
- name: Ingest data into a vector store
href: vector-stores/how-to/vector-store-data-ingestion.md
- name: Build a vector store connector
href: vector-stores/how-to/build-your-own-connector.md
- name: Define your data model
href: vector-stores/defining-your-data-model.md
- name: Define schema with record definitions
href: vector-stores/schema-with-record-definition.md
- name: Dynamic data model
href: vector-stores/dynamic-data-model.md
- name: Generate embeddings
href: vector-stores/embedding-generation.md
- name: Vector search
href: vector-stores/vector-search.md
- name: Hybrid search
href: vector-stores/hybrid-search.md
- name: Security and content safety
items:
- name: Authentication for Azure-hosted apps and services
Expand Down
117 changes: 117 additions & 0 deletions docs/ai/vector-stores/defining-your-data-model.md
@@ -0,0 +1,117 @@
---
title: Define your Vector Store data model
description: Describes how to create a data model with Microsoft.Extensions.VectorData to use when writing to or reading from a Vector Store.
ms.topic: reference
ms.date: 07/08/2024
---
# Define your data model

## Overview

The Vector Store connectors use a model-first approach to interacting with databases.

All methods to upsert or get records use strongly typed model classes.
The properties on these classes are decorated with attributes that indicate the purpose of each property.

> [!TIP]
> For an alternative to using attributes, see [defining your schema with a record definition](./schema-with-record-definition.md).

> [!TIP]
> For an alternative to defining your own data model, see [using Vector Store abstractions without defining your own data model](./dynamic-data-model.md).

Here is an example of a model that is decorated with these attributes.

```csharp
using Microsoft.Extensions.VectorData;

public class Hotel
{
    [VectorStoreKey]
    public ulong HotelId { get; set; }

    [VectorStoreData(IsIndexed = true)]
    public string HotelName { get; set; }

    [VectorStoreData(IsFullTextIndexed = true)]
    public string Description { get; set; }

    [VectorStoreVector(Dimensions: 4, DistanceFunction = DistanceFunction.CosineSimilarity, IndexKind = IndexKind.Hnsw)]
    public ReadOnlyMemory<float>? DescriptionEmbedding { get; set; }

    [VectorStoreData(IsIndexed = true)]
    public string[] Tags { get; set; }
}
```

## Attributes

### VectorStoreKeyAttribute

Use the <xref:Microsoft.Extensions.VectorData.VectorStoreKeyAttribute> attribute to indicate that your property is the key of the record.

```csharp
[VectorStoreKey]
public ulong HotelId { get; set; }
```

#### VectorStoreKeyAttribute parameters

| Parameter | Required | Description |
|---------------|:--------:|-------------|
| <xref:Microsoft.Extensions.VectorData.VectorStoreKeyAttribute.StorageName> | No | Can be used to supply an alternative name for the property in the database. This parameter isn't supported by all connectors, for example, where alternatives like `JsonPropertyNameAttribute` are supported. |

### VectorStoreDataAttribute

Use the <xref:Microsoft.Extensions.VectorData.VectorStoreDataAttribute> attribute to indicate that your property contains general data that is not a key or a vector.

```csharp
[VectorStoreData(IsIndexed = true)]
public string HotelName { get; set; }
```

#### VectorStoreDataAttribute parameters

| Parameter | Required | Description |
|-------------|:--------:|-------------|
| <xref:Microsoft.Extensions.VectorData.VectorStoreDataAttribute.IsIndexed> | No | Indicates whether the property should be indexed for filtering in cases where a database requires opting in to indexing per property. The default is `false`. |
| <xref:Microsoft.Extensions.VectorData.VectorStoreDataAttribute.IsFullTextIndexed> | No | Indicates whether the property should be indexed for full text search for databases that support full text search. The default is `false`. |
| <xref:Microsoft.Extensions.VectorData.VectorStoreDataAttribute.StorageName> | No | Can be used to supply an alternative name for the property in the database. This parameter is not supported by all connectors, for example, where alternatives like `JsonPropertyNameAttribute` are supported. |

> [!TIP]
> For more information on which connectors support <xref:Microsoft.Extensions.VectorData.VectorStoreDataAttribute.StorageName> and what alternatives are available, see [the documentation for each connector](./out-of-the-box-connectors/index.md).
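
As a rough sketch of how these parameters might be combined, the following model maps C# property names to snake_case storage names. The `Product` class and the storage names are hypothetical, and, as the table notes, some connectors ignore `StorageName` in favor of alternatives such as `JsonPropertyNameAttribute`.

```csharp
using Microsoft.Extensions.VectorData;

// Hypothetical model, for illustration only.
public class Product
{
    // Stored as "product_id" by connectors that honor StorageName.
    [VectorStoreKey(StorageName = "product_id")]
    public string ProductId { get; set; } = string.Empty;

    // Opts in to filter indexing and supplies an alternative storage name.
    [VectorStoreData(IsIndexed = true, StorageName = "product_name")]
    public string ProductName { get; set; } = string.Empty;

    // Opts in to full-text indexing for databases that support it.
    [VectorStoreData(IsFullTextIndexed = true, StorageName = "product_description")]
    public string Description { get; set; } = string.Empty;
}
```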

### VectorStoreVectorAttribute

Use the <xref:Microsoft.Extensions.VectorData.VectorStoreVectorAttribute> attribute to indicate that your property contains a vector.

```csharp
[VectorStoreVector(Dimensions: 4, DistanceFunction = DistanceFunction.CosineSimilarity, IndexKind = IndexKind.Hnsw)]
public ReadOnlyMemory<float>? DescriptionEmbedding { get; set; }
```

It's also possible to use <xref:Microsoft.Extensions.VectorData.VectorStoreVectorAttribute> on properties that don't have a vector type, for example, a property of type `string`.
When a property is decorated in this way, you need to provide an <xref:Microsoft.Extensions.AI.IEmbeddingGenerator> instance to the vector store.
When upserting the record, the text that is in the `string` property is automatically converted and stored as a vector in the database.
It's not possible to retrieve a vector using this mechanism.

```csharp
[VectorStoreVector(Dimensions: 4, DistanceFunction = DistanceFunction.CosineSimilarity, IndexKind = IndexKind.Hnsw)]
public string DescriptionEmbedding { get; set; }
```

> [!TIP]
> For more information on how to use built-in embedding generation, see [Let the Vector Store generate embeddings](./embedding-generation.md#letting-the-vector-store-generate-embeddings).
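
One way this might look in a complete model is a getter-only `string` property that returns the text to embed, so the source text and its vector can't drift apart. This is a sketch under assumptions: the `Article` class is hypothetical, the dimension count of 1536 is an assumed embedding size, and an `IEmbeddingGenerator` still has to be configured on the vector store for upserts to succeed.

```csharp
using Microsoft.Extensions.VectorData;

// Hypothetical model, for illustration only.
public class Article
{
    [VectorStoreKey]
    public string Id { get; set; } = string.Empty;

    [VectorStoreData(IsFullTextIndexed = true)]
    public string Text { get; set; } = string.Empty;

    // Typed as string: on upsert, the configured embedding generator
    // converts this text to a vector before it's stored.
    // 1536 is an assumed dimension count; match it to your embedding model.
    [VectorStoreVector(Dimensions: 1536)]
    public string TextEmbedding => this.Text;
}
```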

#### VectorStoreVectorAttribute parameters

| Parameter | Required | Description |
|------------|:--------:|-------------|
| `Dimensions` | Yes | The number of dimensions that the vector has. This is required when creating a vector index for a collection. |
| <xref:Microsoft.Extensions.VectorData.IndexKind> | No | The type of index to index the vector with. Default varies by vector store type. |
| <xref:Microsoft.Extensions.VectorData.DistanceFunction> | No | The type of function to use when doing vector comparison during vector search over this vector. Default varies by vector store type. |
| <xref:Microsoft.Extensions.VectorData.VectorStoreVectorAttribute.StorageName> | No | Can be used to supply an alternative name for the property in the database. This parameter is not supported by all connectors, for example, where alternatives like `JsonPropertyNameAttribute` are supported. |

Common index kinds and distance function types are supplied as static values on the <xref:Microsoft.Extensions.VectorData.IndexKind> and <xref:Microsoft.Extensions.VectorData.DistanceFunction> classes.
Individual Vector Store implementations might also use their own index kinds and distance functions, where the database supports unusual types.
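
To make the role of the distance function more concrete, here is a small standalone sketch of what cosine similarity computes for two vectors. It's an illustration of the math only, not the implementation that any connector or database uses, and the `SimilarityDemo` class is a hypothetical name.

```csharp
using System;

public static class SimilarityDemo
{
    // Cosine similarity: dot(a, b) / (|a| * |b|).
    // The result is NaN if either vector has zero length.
    public static double CosineSimilarity(ReadOnlySpan<float> a, ReadOnlySpan<float> b)
    {
        if (a.Length != b.Length)
        {
            throw new ArgumentException("Vectors must have the same number of dimensions.");
        }

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }
}
```

For example, `SimilarityDemo.CosineSimilarity(new float[] { 1, 0, 0, 0 }, new float[] { 0, 1, 0, 0 })` evaluates to 0 because the two vectors are orthogonal, while comparing a unit vector with itself evaluates to 1.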

> [!TIP]
> For more information on which connectors support <xref:Microsoft.Extensions.VectorData.VectorStoreDataAttribute.StorageName> and what alternatives are available, see [the documentation for each connector](./out-of-the-box-connectors/index.md).
77 changes: 77 additions & 0 deletions docs/ai/vector-stores/dynamic-data-model.md
@@ -0,0 +1,77 @@
---
title: Using Vector Store abstractions without defining your own data model
description: Describes how to use Vector Store abstractions without defining your own data model.
ms.topic: reference
ms.date: 10/16/2024
---
# Use Vector Store abstractions without defining your own data model

The Vector Store connectors use a model-first approach to interact with databases. This approach makes the connectors easy to use: your data model reflects the schema of your database records, and you can supply
any additional schema information by adding attributes to your data model properties.

However, there are cases where it isn't desirable or possible to define your own data model. For example, imagine that you don't know at compile time what your
database schema looks like, and the schema is only provided via configuration. Creating a data model that reflects the schema would be impossible in this case.

In this case, you can use a `Dictionary<string, object?>` for the record type. Each entry in the `Dictionary` uses the property name as the key and the property value as the value.
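
As a minimal sketch, a record held in a `Dictionary<string, object?>` might look like the following. The field names and values are hypothetical placeholders, and the all-zero vector stands in for a real embedding.

```csharp
using System;
using System.Collections.Generic;

// Each entry maps a property name to its value.
var record = new Dictionary<string, object?>
{
    ["Key"] = "API",
    ["Term"] = "Application Programming Interface",
    ["Definition"] = "A set of rules that lets programs talk to each other.",
    // Placeholder vector; a real record would hold a generated embedding.
    ["DefinitionEmbedding"] = new ReadOnlyMemory<float>(new float[4])
};

// Values are typed as object?, so cast to the expected type when reading.
var term = (string?)record["Term"];
Console.WriteLine(term);
```

Because values are untyped, mistakes such as a misspelled property name or a wrong cast surface at run time rather than at compile time.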

## Supply schema information when using `Dictionary`

When using a `Dictionary`, connectors still need to know what the database schema looks like. Without the schema information,
the connector can't create a collection or know how to map to and from the storage representation that each database uses.

A record definition can be used to provide the schema information. Unlike a data model, a record definition can be created from configuration at runtime, providing a solution for when schema information is not known at compile time.

> [!TIP]
> To see how to create a record definition, see [defining your schema with a record definition](./schema-with-record-definition.md).
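
The following sketch shows one way a record definition might be built from configuration at run time. The `FieldConfig` record and `DefinitionBuilder` class are hypothetical names, not part of the library, and the sketch assumes that key and data fields are strings and that vectors are `ReadOnlyMemory<float>`.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.VectorData;

// Hypothetical configuration entry, for example deserialized from a JSON file.
// Kind is "key", "data", or "vector"; Dimensions applies only to vectors.
public sealed record FieldConfig(string Name, string Kind, int Dimensions = 0);

public static class DefinitionBuilder
{
    public static VectorStoreCollectionDefinition Build(IEnumerable<FieldConfig> fields)
    {
        var properties = new List<VectorStoreProperty>();

        foreach (var field in fields)
        {
            // Assumption: keys and data fields are strings, vectors are ReadOnlyMemory<float>.
            VectorStoreProperty property = field.Kind switch
            {
                "key" => new VectorStoreKeyProperty(field.Name, typeof(string)),
                "data" => new VectorStoreDataProperty(field.Name, typeof(string)),
                "vector" => new VectorStoreVectorProperty(
                    field.Name, typeof(ReadOnlyMemory<float>), field.Dimensions),
                _ => throw new ArgumentException($"Unknown field kind '{field.Kind}'.")
            };

            properties.Add(property);
        }

        return new VectorStoreCollectionDefinition { Properties = properties };
    }
}
```

A real configuration format would likely also carry the property type, index kind, and distance function for each field.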

## Example

To use the `Dictionary` with a connector, simply specify it as your data model when creating a collection, and simultaneously provide a record definition.

```csharp
// Create the definition to define the schema.
VectorStoreCollectionDefinition definition = new()
{
    Properties = new List<VectorStoreProperty>
    {
        new VectorStoreKeyProperty("Key", typeof(string)),
        new VectorStoreDataProperty("Term", typeof(string)),
        new VectorStoreDataProperty("Definition", typeof(string)),
        new VectorStoreVectorProperty("DefinitionEmbedding", typeof(ReadOnlyMemory<float>), dimensions: 1536)
    }
};

// When getting your collection instance from a vector store instance
// specify the Dictionary, using object as the key type for your database
// and also pass your record definition.
// Note that you have to use GetDynamicCollection instead of the regular GetCollection method
// to get an instance of a collection using Dictionary<string, object?>.
var dynamicDataModelCollection = vectorStore.GetDynamicCollection(
    "glossary",
    definition);

// Since schema information is available from the record definition
// it's possible to create a collection with the right vectors,
// dimensions, indexes, and distance functions.
await dynamicDataModelCollection.EnsureCollectionExistsAsync();

// When retrieving a record from the collection, key, data, and vector values can
// now be accessed via the dictionary entries.
var record = await dynamicDataModelCollection.GetAsync("SK");
Console.WriteLine(record["Definition"]);
```
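
Writing a record works the same way as reading one. The helper below is a sketch, not part of the library: `GlossaryWriter` and the record values are hypothetical, and it assumes the collection was obtained from `GetDynamicCollection` with a definition whose vector property has 1536 dimensions, as in the example above.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Extensions.VectorData;

public static class GlossaryWriter
{
    // Upserts one dictionary-based record, then reads its definition back.
    public static async Task<string?> UpsertAndReadAsync(
        VectorStoreCollection<object, Dictionary<string, object?>> collection)
    {
        await collection.EnsureCollectionExistsAsync();

        await collection.UpsertAsync(new Dictionary<string, object?>
        {
            ["Key"] = "API",
            ["Term"] = "Application Programming Interface",
            ["Definition"] = "A set of rules that lets programs talk to each other.",
            // Placeholder vector; use a real embedding in practice,
            // because some databases reject all-zero vectors.
            ["DefinitionEmbedding"] = new ReadOnlyMemory<float>(new float[1536])
        });

        // GetAsync returns null when no record has the given key.
        var record = await collection.GetAsync("API");
        return record?["Definition"] as string;
    }
}
```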

Each vector store collection implementation has a separate `*DynamicCollection`
class that can be used with `Dictionary<string, object?>`.
The dynamic variants are separate classes because these implementations might support NativeAOT and trimming.

When constructing a collection instance directly, pass the record definition
as an option. For example, the following code constructs
an Azure AI Search collection instance that uses `Dictionary`.

```csharp
new AzureAISearchDynamicCollection(
    searchIndexClient,
    "glossary",
    new() { Definition = definition });
```