Class PgVectorEmbeddingStore

Namespace
LMKit.Data.Storage.PgVector
Assembly
LM-Kit.NET.Data.Connectors.PgVector.dll

Implements the IVectorStore interface using PostgreSQL with the pgvector extension as the backend. Provides operations for creating, deleting, updating, and querying vector data with associated metadata, leveraging pgvector's similarity search capabilities.

public sealed class PgVectorEmbeddingStore : IVectorStore, IDisposable
Inheritance
object
PgVectorEmbeddingStore
Implements
IVectorStore
IDisposable

Remarks

Each LM-Kit collection is mapped to a PostgreSQL table inside the configured schema. Every table has three columns: id (a text primary key), embedding (a vector(N) column whose dimension is fixed when the collection is created), and metadata (a jsonb column holding the key-value metadata as a flat JSON object).
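The table layout described above can be pictured with a small sketch that composes the equivalent DDL. This is an illustration only: the class and method names are hypothetical, the one-to-one use of the collection identifier as the table name is an assumption, and the statement the store actually issues (constraints, indexes, naming) may differ.

```csharp
using System;

// Hypothetical helper, not part of LM-Kit: it only illustrates the three-column layout.
public static class CollectionTableSketch
{
    // Quotes a PostgreSQL identifier: wrap in double quotes and double any embedded quote.
    public static string QuoteIdentifier(string name) =>
        "\"" + name.Replace("\"", "\"\"") + "\"";

    // Builds illustrative DDL with the id, embedding, and metadata columns.
    // Assumption: the collection identifier is used directly as the table name.
    public static string BuildCreateTableSql(string schema, string collection, uint dimension)
    {
        if (dimension == 0)
            throw new ArgumentOutOfRangeException(nameof(dimension), "Vector dimension must be positive.");

        return $"CREATE TABLE IF NOT EXISTS {QuoteIdentifier(schema)}.{QuoteIdentifier(collection)} (" +
               "id text PRIMARY KEY, " +
               $"embedding vector({dimension}), " +
               "metadata jsonb)";
    }
}
```

For example, BuildCreateTableSql("public", "docs", 384) yields a statement declaring embedding as vector(384), which is why the dimension cannot change after the collection is created.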

Similarity is computed with the cosine distance operator (<=>). Scores returned by SearchSimilarVectorsAsync(string, float[], uint, VectorRetrievalOptions, MetadataCollection, CancellationToken) are cosine similarities (1 - cosine_distance), so a higher score means a closer match, consistent with the other IVectorStore implementations.
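To make the relationship between the distance operator and the returned score concrete, the following sketch computes both in plain C#. It is a standalone illustration, not the store's code: the class and method names are hypothetical, and in practice the distance is evaluated by pgvector inside PostgreSQL.

```csharp
using System;

// Hypothetical helper, not part of LM-Kit: reproduces the arithmetic behind the score.
public static class CosineScoreSketch
{
    // Cosine distance, the quantity pgvector's <=> operator returns: 1 - (a·b) / (|a| |b|).
    // A zero-length vector makes the denominator zero and the result NaN.
    public static double CosineDistance(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same dimension.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += (double)a[i] * b[i];
            normA += (double)a[i] * a[i];
            normB += (double)b[i] * b[i];
        }
        return 1.0 - dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Score as documented above: 1 - cosine_distance, so higher means closer.
    public static double Score(float[] a, float[] b) => 1.0 - CosineDistance(a, b);
}
```

Under this definition, vectors pointing in the same direction score 1, orthogonal vectors score 0, and opposite vectors score -1, regardless of their magnitudes.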

The store is thread-safe: every operation borrows a pooled connection from an Npgsql.NpgsqlDataSource, so concurrent calls (for example, parallel upserts) are supported.

Constructors

PgVectorEmbeddingStore(NpgsqlDataSource, bool, string)

Initializes a new instance of the PgVectorEmbeddingStore class using the specified Npgsql.NpgsqlDataSource.

PgVectorEmbeddingStore(string, string)

Initializes a new instance of the PgVectorEmbeddingStore class from a PostgreSQL connection string. The store builds and owns an internal pooled Npgsql.NpgsqlDataSource.

Methods

CollectionExistsAsync(string, CancellationToken)

Asynchronously checks if a collection with the specified identifier exists in the storage system.

CreateCollectionAsync(string, uint, IEnumerable<string>, CancellationToken)

Asynchronously creates a new collection for storing vectors of a fixed size.

DeleteCollectionAsync(string, CancellationToken)

Asynchronously deletes the specified collection from the storage system.

DeleteFromMetadataAsync(string, MetadataCollection, CancellationToken)

Asynchronously deletes all vector entries from the specified collection that match the provided metadata filter.

Dispose()

Releases all resources used by the PgVectorEmbeddingStore.

EnsureDatabaseExistsAsync(string, string, CancellationToken)

Ensures that the PostgreSQL database named in the supplied connection string exists, creating it if it does not. Because a connection cannot be opened to a database that does not yet exist, this method connects to an existing maintenance database (maintenanceDatabase, postgres by default) using the same host and credentials, and issues CREATE DATABASE there.

GetMetadataAsync(string, string, CancellationToken)

Asynchronously retrieves the metadata associated with a specific vector entry in a collection.

ListCollectionsAsync(CancellationToken)

Asynchronously lists all collection identifiers available in the storage system.

RetrieveFromMetadataAsync(string, MetadataCollection, VectorRetrievalOptions, uint, CancellationToken)

Asynchronously retrieves vector entries from the specified collection that match the given metadata criteria.

SearchSimilarVectorsAsync(string, float[], uint, VectorRetrievalOptions, MetadataCollection, CancellationToken)

Asynchronously searches for vectors within the specified collection that are similar to a given query vector.

UpdateMetadataAsync(string, string, MetadataCollection, MetadataUpdateMode, CancellationToken)

Asynchronously updates the metadata associated with an existing vector entry in the specified collection.

UpsertAsync(string, string, float[], MetadataCollection, CancellationToken)

Asynchronously inserts a new vector entry or updates an existing one in the specified collection, along with its associated metadata.

UpsertBatchAsync(string, IEnumerable<(string Id, float[] Vectors, MetadataCollection Metadata)>, CancellationToken)

Upserts multiple vectors with their associated metadata into the specified collection in a single batched transaction. The entries are written in chunks; either every chunk succeeds or the whole transaction is rolled back.
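The all-or-nothing behavior described for UpsertBatchAsync can be pictured with an in-memory analogy. The sketch below is not the store's implementation: the names are hypothetical, a dictionary stands in for the PostgreSQL table, a staged copy stands in for the database transaction, and the chunk size is an arbitrary parameter rather than the value the store uses.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory analogy, not part of LM-Kit.
public static class ChunkedUpsertSketch
{
    // Splits a sequence into consecutive chunks of at most chunkSize items.
    public static IEnumerable<List<T>> SplitIntoChunks<T>(IEnumerable<T> items, int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        var chunk = new List<T>(chunkSize);
        foreach (var item in items)
        {
            chunk.Add(item);
            if (chunk.Count == chunkSize)
            {
                yield return chunk;
                chunk = new List<T>(chunkSize);
            }
        }
        if (chunk.Count > 0) yield return chunk; // trailing partial chunk
    }

    // Writes every chunk to a staged copy and publishes it only if all chunks succeed.
    // Returns false and leaves the table untouched when any entry is invalid.
    public static bool TryUpsertAll(
        Dictionary<string, float[]> table,
        IEnumerable<(string Id, float[] Vector)> entries,
        int chunkSize)
    {
        var staged = new Dictionary<string, float[]>(table); // stand-in for BEGIN
        try
        {
            foreach (var chunk in SplitIntoChunks(entries, chunkSize))
            {
                foreach (var (id, vector) in chunk)
                {
                    if (vector == null)
                        throw new ArgumentException($"Entry '{id}' has no vector.");
                    staged[id] = vector; // insert or overwrite, like an upsert
                }
            }
        }
        catch (ArgumentException)
        {
            return false; // stand-in for ROLLBACK: the staged copy is discarded
        }

        table.Clear(); // stand-in for COMMIT: publish the staged state
        foreach (var pair in staged) table[pair.Key] = pair.Value;
        return true;
    }
}
```

The point of the analogy is that a failure in a later chunk also discards the writes made by earlier chunks, so callers never observe a partially applied batch.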
