Class DocumentRag

Namespace
LMKit.Retrieval
Assembly
LM-Kit.NET.dll

Provides document-centric Retrieval-Augmented Generation (RAG) capabilities with built-in support for multi-page document processing, OCR, and vision-based document understanding.

public sealed class DocumentRag : RagEngine
Inheritance
RagEngine
DocumentRag
Examples

// Basic document RAG setup
LM embeddingModel = LM.LoadFromModelID("embeddinggemma-300m");
DocumentRag docRag = new DocumentRag(embeddingModel);

// Optional: Configure OCR for scanned documents
docRag.OcrEngine = new OcrEngine();

// Import a PDF document
var attachment = Attachment.FromFile("report.pdf");
var dataSource = await docRag.ImportDocumentAsync(attachment, "reports");

// Search for relevant content
var matches = await docRag.FindMatchingPartitionsAsync("quarterly revenue", topK: 5);

// Generate a response with source references
LM chatModel = LM.LoadFromModelID("llama-3.1-8b-instruct");
var conversation = new SingleTurnConversation(chatModel);
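// Pass an explicit CancellationToken so the call cannot be confused with the overload that takes a bool
var result = await docRag.QueryPartitionsAsync("What was the quarterly revenue?", matches, conversation, CancellationToken.None);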

Console.WriteLine(result.Response.Text);
foreach (var reference in result.SourceReferences)
{
    Console.WriteLine($"Source: {reference.DocumentName}, Page {reference.PageNumber}");
}

Remarks

DocumentRag extends RagEngine to simplify working with document attachments such as PDFs, images, and other multi-page formats. It automatically handles page-by-page extraction, text chunking, and embedding generation.

The class supports three processing modes:

  • Auto (default): Automatically selects the best processing strategy per page based on content type and available engines.
  • TextExtraction: Uses traditional text extraction with optional OCR for image-based pages.
  • DocumentUnderstanding: Uses vision language models (VLM) for advanced document parsing, preserving layout and structure as markdown.

For OCR-based text extraction, configure the OcrEngine property. For vision-based document understanding, configure the VisionParser property.
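
To make the idea of per-page strategy selection concrete, here is a minimal, self-contained sketch of one plausible decision policy. It is not LM-Kit code and does not describe the library's actual rules for Auto mode. The `PageStrategy` enum, the `AutoModeSketch` class, and the three boolean inputs are hypothetical names introduced only for illustration.

```csharp
using System;

// Hypothetical types: none of these names exist in LM-Kit.
public enum PageStrategy { NativeText, Ocr, VisionModel, Skip }

public static class AutoModeSketch
{
    // Picks a strategy for a single page from what the page contains
    // and which engines have been configured.
    public static PageStrategy Choose(
        bool pageHasTextLayer,
        bool ocrConfigured,
        bool visionParserConfigured)
    {
        // A page with an embedded text layer can be read directly.
        if (pageHasTextLayer) return PageStrategy.NativeText;

        // Image-only page: this sketch prefers a vision parser when one is configured.
        if (visionParserConfigured) return PageStrategy.VisionModel;

        // Otherwise fall back to OCR if an engine is configured.
        if (ocrConfigured) return PageStrategy.Ocr;

        // No engine can read this page.
        return PageStrategy.Skip;
    }
}
```

The ordering here (native text first, then vision, then OCR) is an assumption made for the sketch. The practical point is that an image-only page can only be indexed when at least one of OcrEngine or VisionParser is configured.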

Constructors

DocumentRag(LM, IVectorStore)

Initializes a new instance of the DocumentRag class with the specified embedding model and vector store.

Properties

MaxChunkSize

Gets or sets the maximum size in characters for text chunks during document import.
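
As an illustration of what a character-based chunk limit means in practice, the sketch below splits text into chunks no longer than a given number of characters, breaking at a space where possible. It is a generic example and not the chunking algorithm LM-Kit uses. `ChunkingSketch` and `SplitIntoChunks` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: not part of LM-Kit.
public static class ChunkingSketch
{
    // Splits text into chunks of at most maxChunkSize characters,
    // preferring to break at a space rather than in the middle of a word.
    public static List<string> SplitIntoChunks(string text, int maxChunkSize)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (maxChunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(maxChunkSize));

        var chunks = new List<string>();
        int start = 0;
        while (start < text.Length)
        {
            // Skip whitespace so it does not count toward the next chunk.
            if (char.IsWhiteSpace(text[start])) { start++; continue; }

            int length = Math.Min(maxChunkSize, text.Length - start);
            int end = start + length;

            // If the cut would land inside a word, back up to the last space in the window.
            if (end < text.Length && !char.IsWhiteSpace(text[end]))
            {
                int lastSpace = text.LastIndexOf(' ', end - 1, length);
                if (lastSpace > start) end = lastSpace;
            }

            chunks.Add(text.Substring(start, end - start).TrimEnd());
            start = end;
        }
        return chunks;
    }
}
```

For example, `ChunkingSketch.SplitIntoChunks("alpha beta gamma delta", 11)` yields two chunks, "alpha beta" and "gamma delta". A word longer than the limit is cut at the limit. Smaller chunks give more precise matches but less context per partition, so the right value depends on the documents and on the embedding model.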

OcrEngine

Gets or sets the OCR engine used for extracting text from image-based document pages.

ProcessingMode

Gets or sets the page processing mode that determines how document pages are analyzed and text is extracted.

PromptTemplate

Gets or sets the prompt template used when querying partitions.

VisionParser

Gets or sets the vision language model (VLM) parser used for advanced document understanding.

Methods

ImportDocument(Attachment, string, string, DocumentMetadata, CancellationToken)

Imports a document into a DataSource, extracting text from each page and generating embeddings for retrieval.

ImportDocumentAsync(Attachment, string, string, DocumentMetadata, CancellationToken)

Asynchronously imports a document into a DataSource, extracting text from each page and generating embeddings for retrieval.

QueryPartitions(string, IEnumerable<PartitionSimilarity>, IConversation, bool, CancellationToken)

Generates a response by querying the specified partitions, optionally including page renderings for visual context, and returns the result with source document references.

QueryPartitions(string, IEnumerable<PartitionSimilarity>, IConversation, CancellationToken)

Generates a response by querying the specified partitions and returns the result with source document references.

QueryPartitionsAsync(string, IEnumerable<PartitionSimilarity>, IConversation, bool, CancellationToken)

Asynchronously generates a response by querying the specified partitions, optionally including page renderings for visual context, and returns the result with source document references.

QueryPartitionsAsync(string, IEnumerable<PartitionSimilarity>, IConversation, CancellationToken)

Asynchronously generates a response by querying the specified partitions and returns the result with source document references.

Events

Progress

Occurs when document import progress changes, providing status updates for each processing phase.