Class Embedder

Namespace
LMKit.Embeddings
Assembly
LM-Kit.NET.dll

Generates embeddings from text and images. Embeddings support tasks involving natural language and code, such as semantic search, clustering, topic modeling, and classification. The class supports embeddings from:

  • Plain text strings
  • Tokenized text
  • File attachments: images (e.g. PNG, JPEG, TIFF) when the underlying model provides image-embedding capabilities, and text-based documents (e.g. TXT, HTML).
public sealed class Embedder
Inheritance
Embedder

Examples

Example: Generate text embeddings

using LMKit.Model;
using LMKit.Embeddings;
using System;
using System.Linq; // required for Take() and Select() below

// Load an embedding model
LM model = LM.LoadFromModelID("embeddinggemma-300m");

// Create the embedder
Embedder embedder = new Embedder(model);

// Generate embedding for a single text
float[] embedding = embedder.GetEmbeddings("Machine learning is fascinating.");

Console.WriteLine($"Embedding dimension: {embedding.Length}");
Console.WriteLine($"First 5 values: [{string.Join(", ", embedding.Take(5).Select(v => v.ToString("F4")))}...]");

Example: Batch embeddings for similarity comparison

using LMKit.Model;
using LMKit.Embeddings;
using System;
using System.Collections.Generic;

LM model = LM.LoadFromModelID("embeddinggemma-300m");
Embedder embedder = new Embedder(model);

// Generate embeddings for multiple texts
var texts = new List<string>
{
    "The cat sat on the mat.",
    "A feline rested on the rug.",
    "The stock market closed higher today."
};

float[][] embeddings = embedder.GetEmbeddings(texts);

// Calculate cosine similarity between first two texts
float similarity = VectorOperations.CosineSimilarity(embeddings[0], embeddings[1]);
Console.WriteLine($"Similarity between text 1 and 2: {similarity:F4}");

// Compare with unrelated text (expected to score lower than the pair above)
float unrelatedSimilarity = VectorOperations.CosineSimilarity(embeddings[0], embeddings[2]);
Console.WriteLine($"Similarity between text 1 and 3: {unrelatedSimilarity:F4}");

Example: Image embeddings (multimodal model)

using LMKit.Model;
using LMKit.Embeddings;
using LMKit.Media.Image;
using System;

// Load a vision-enabled embedding model
LM model = LM.LoadFromModelID("nomic-embed-vision");
Embedder embedder = new Embedder(model);

// Generate embedding from an image
ImageBuffer image = ImageBuffer.LoadAsRGB("photo.jpg");
float[] imageEmbedding = embedder.GetEmbeddings(image);

Console.WriteLine($"Image embedding dimension: {imageEmbedding.Length}");

Remarks

Key Features

Common Use Cases

  • Semantic search: Find similar documents by comparing embedding vectors
  • Clustering: Group similar texts or images together
  • Classification: Use embeddings as features for ML classifiers
  • RAG systems: Generate embeddings for retrieval-augmented generation

The embedding dimension is determined by the model and can be queried via EmbeddingSize.
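
The semantic search use case listed above comes down to ranking stored vectors by their similarity to a query vector. The following is a minimal sketch of that ranking step only. It uses toy 3-dimensional vectors in place of real embeddings, and the helper names `Cosine` and `RankBySimilarity` are hypothetical, written for this illustration rather than taken from LM-Kit. In practice the vectors would come from one of the GetEmbeddings overloads.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cosine similarity helper written for this sketch (not an LM-Kit call).
double Cosine(float[] a, float[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Returns document indices ordered from most to least similar to the query.
List<int> RankBySimilarity(float[] query, IReadOnlyList<float[]> documents)
{
    return Enumerable.Range(0, documents.Count)
        .OrderByDescending(i => Cosine(query, documents[i]))
        .ToList();
}

// Toy vectors standing in for real embeddings (assumption for illustration).
var docs = new List<float[]>
{
    new float[] { 0.9f, 0.1f, 0.0f },
    new float[] { 0.0f, 0.2f, 0.9f },
    new float[] { 0.7f, 0.6f, 0.1f }
};
float[] queryVector = { 1.0f, 0.0f, 0.0f };

List<int> ranking = RankBySimilarity(queryVector, docs);
Console.WriteLine(string.Join(", ", ranking)); // prints "0, 2, 1"
```

For large collections a linear scan like this becomes slow, and a vector index is usually a better fit; the sketch only shows the comparison logic.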

Constructors

Embedder(LM)

Initializes a new instance of the Embedder class.

Properties

Model

Gets the LM instance associated with this object.

Methods

GetCosineSimilarity(IList<float>, IList<float>)

Calculates the cosine similarity between two embedding vectors, representing the cosine of the angle between them in a multidimensional space.
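
To make the quantity concrete, the sketch below computes cosine similarity by hand: the dot product of the two vectors divided by the product of their Euclidean norms. It is an illustrative re-implementation, not the library's own code. The local function name `CosineSimilarity`, the length check, and the choice to return 0 for a zero vector are assumptions of this sketch.

```csharp
using System;
using System.Collections.Generic;

// Illustrative re-implementation of cosine similarity (not LM-Kit's code).
float CosineSimilarity(IList<float> a, IList<float> b)
{
    if (a.Count != b.Count)
        throw new ArgumentException("Vectors must have the same length.");

    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Count; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }

    // Convention chosen for this sketch: a zero vector has similarity 0.
    if (normA == 0 || normB == 0)
        return 0f;

    return (float)(dot / (Math.Sqrt(normA) * Math.Sqrt(normB)));
}

// Same direction: prints 1
Console.WriteLine(CosineSimilarity(new float[] { 1, 0 }, new float[] { 1, 0 }));
// Orthogonal vectors: prints 0
Console.WriteLine(CosineSimilarity(new float[] { 1, 0 }, new float[] { 0, 1 }));
```

The result ranges from -1 (opposite directions) to 1 (same direction) and does not depend on vector magnitude, which is why it is a common choice for comparing embeddings.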

GetEmbeddings(Attachment, CancellationToken)

Generates the embedding vector for a given file attachment. If the associated LM supports image embeddings, image attachments (for example, PNG, JPEG, TIFF) will be processed into embeddings; otherwise, text‑based attachments (for example, TXT, HTML) will be used.

GetEmbeddings(ImageBuffer, CancellationToken)

Generates the embedding vector for a given image.

GetEmbeddings(IEnumerable<ImageBuffer>, CancellationToken)

Generates embedding vectors for a collection of images.

GetEmbeddings(IEnumerable<IList<int>>, CancellationToken)

Generates embedding vectors for a collection of tokenized texts. Each vector represents a text in a high-dimensional space, enabling various natural language processing tasks by capturing semantic meaning.

GetEmbeddings(IEnumerable<string>, CancellationToken)

Generates embedding vectors for a collection of text strings. Each vector represents a text in a high-dimensional space, enabling various natural language processing tasks by capturing semantic meaning.

GetEmbeddings(IList<int>, CancellationToken)

Generates the embedding vector for a given tokenized text. This vector represents the text in a high-dimensional space, enabling various natural language processing tasks by capturing semantic meaning.

GetEmbeddings(string, CancellationToken)

Generates the embedding vector for a given text string. This vector represents the text in a high-dimensional space, enabling various natural language processing tasks by capturing semantic meaning.

GetEmbeddingsAsync(Attachment, CancellationToken)

Asynchronously generates the embedding vector for a given file attachment. If the associated LM supports image embeddings, image attachments (for example, PNG, JPEG, TIFF) will be processed into embeddings; otherwise, text‑based attachments (for example, TXT, HTML) will be used.

GetEmbeddingsAsync(ImageBuffer, CancellationToken)

Asynchronously generates the embedding vector for a given image.

GetEmbeddingsAsync(IEnumerable<ImageBuffer>, CancellationToken)

Asynchronously generates embedding vectors for a collection of images.

GetEmbeddingsAsync(IEnumerable<IList<int>>, CancellationToken)

Asynchronously generates embedding vectors for a collection of tokenized texts. Each vector represents a text in a high-dimensional space, enabling various natural language processing tasks by capturing semantic meaning.

GetEmbeddingsAsync(IEnumerable<string>, CancellationToken)

Asynchronously generates embedding vectors for a collection of text strings. Each vector represents a text in a high-dimensional space, enabling various natural language processing tasks by capturing semantic meaning.

GetEmbeddingsAsync(IList<int>, CancellationToken)

Asynchronously generates the embedding vector for a given tokenized text. This vector represents the text in a high-dimensional space, enabling various natural language processing tasks by capturing semantic meaning.

GetEmbeddingsAsync(string, CancellationToken)

Asynchronously generates the embedding vector for a given text string. This vector represents the text in a high-dimensional space, enabling various natural language processing tasks by capturing semantic meaning.
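
Every asynchronous overload accepts a CancellationToken, so a caller can bound how long an embedding request may run. The sketch below shows one way to apply a timeout. It makes several assumptions: the delegate parameter stands in for a call such as `(text, ct) => embedder.GetEmbeddingsAsync(text, ct)`, the return type `Task<float[]>` is inferred from the synchronous examples rather than confirmed here, and cancellation is assumed to surface as an OperationCanceledException. `EmbedWithTimeoutAsync` and `FakeEmbedAsync` are hypothetical helpers that let the sketch run without a model.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Runs an embedding call with a timeout and returns an empty array if it is cancelled.
// The delegate stands in for embedder.GetEmbeddingsAsync (assumed signature).
async Task<float[]> EmbedWithTimeoutAsync(
    Func<string, CancellationToken, Task<float[]>> embedAsync,
    string text,
    TimeSpan timeout)
{
    // The token source cancels itself once the timeout elapses.
    using var cts = new CancellationTokenSource(timeout);
    try
    {
        return await embedAsync(text, cts.Token);
    }
    catch (OperationCanceledException)
    {
        // The token fired before the call completed.
        return Array.Empty<float>();
    }
}

// Fake embedder used only to make the sketch runnable without a model.
async Task<float[]> FakeEmbedAsync(string text, CancellationToken ct)
{
    await Task.Delay(100, ct); // simulate work that honors cancellation
    return new float[] { text.Length, 0f, 0f };
}

float[] result = await EmbedWithTimeoutAsync(FakeEmbedAsync, "hello", TimeSpan.FromSeconds(5));
Console.WriteLine(result.Length == 0 ? "cancelled" : $"dimension: {result.Length}");
```

Returning an empty array on cancellation is a design choice of this sketch; rethrowing the exception is equally reasonable when the caller needs to distinguish a timeout from an empty result.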