Namespace LMKit.Retrieval
Classes
- DocumentIndexingResult
Describes the outcome of loading a document into PdfChat, including how the document was processed and its resource consumption.
- DocumentQueryResult
Contains the result of a document query, including the generated response and references to source passages.
- DocumentRag
Provides document-centric Retrieval-Augmented Generation (RAG) capabilities with built-in support for multi-page document processing, OCR, and vision-based document understanding.
- DocumentRag.DocumentMetadata
Represents metadata associated with a document during import into a DocumentRag instance.
- DocumentReference
Represents a reference to a specific location within a document, retrieved during a retrieval operation.
- MarkdownChunking
Provides Markdown-aware chunking configuration for retrieval workflows. The underlying splitter favors Markdown structural boundaries (such as headings and paragraph breaks) to produce chunks that preserve semantic coherence.
- PartitionSimilarity
Represents the result of a retrieval operation, capturing the similarity between a partition (or vector entry) and a target item.
- PdfChat
Provides conversational question-answering over PDF documents by combining intelligent document understanding with grounded response generation.
- RagEngine
Provides core functionality for Retrieval-Augmented Generation (RAG) within a data processing system.
- RagEngine.RagReranker
Encapsulates a reranking model and blending factor for adjusting raw similarity scores in RAG workflows.
- TextChunking
Implements a recursive chunking strategy that partitions text into smaller segments ("chunks") to support retrieval-augmented generation tasks.
This approach is particularly effective for long texts, which it breaks down systematically into segments that are easier to process.
Unlike linear chunking methods, which divide text sequentially, the recursive strategy adjusts segmentation to the complexity and structure of the text.
This allows more nuanced and efficient handling of text, especially for nested or hierarchical content.
- VectorSearch
Provides methods for searching partitions across one or more data sources by comparing vector embeddings for similarity.
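To make the ideas behind VectorSearch, PartitionSimilarity, and RagEngine.RagReranker more concrete, here is a minimal conceptual sketch in Python. It is not LM-Kit code and does not use the LM-Kit API: the names `cosine_similarity`, `search`, and `blend`, the toy two-dimensional vectors, and the linear blending formula are all assumptions chosen for illustration. The library's actual similarity metric and blending behavior may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec, partitions, top_k=2):
    """Rank (text, vector) partitions by similarity to the query, best first."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in partitions]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def blend(raw_score, rerank_score, alpha):
    """Hypothetical linear blend: alpha=0 keeps the raw score, alpha=1 uses only the reranker score."""
    return (1.0 - alpha) * raw_score + alpha * rerank_score


# Toy data: each partition is a text label plus a made-up embedding.
partitions = [
    ("invoice totals", [1.0, 0.0]),
    ("shipping terms", [0.0, 1.0]),
    ("invoice dates", [0.8, 0.6]),
]
hits = search([1.0, 0.0], partitions, top_k=2)
```

Each entry in `hits` pairs a partition with its similarity to the query, which is roughly the kind of information a similarity result carries. A reranker score, if available, could then be mixed in with `blend` before the final ordering.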
Interfaces
- IChunking
Defines configurable settings for text chunking. Implementations control how input text is partitioned into chunks suitable for retrieval and embedding workflows.
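The recursive strategy described for TextChunking can be illustrated with a small generic sketch in Python. This is not the LM-Kit implementation: the function name `recursive_chunk`, the `max_chars` limit, and the separator order (blank line, newline, space) are assumptions made for the example. The idea is to split on the coarsest boundary first and recurse with finer separators only for pieces that are still too long.

```python
def recursive_chunk(text, max_chars, separators=("\n\n", "\n", " ")):
    """Split text into chunks of at most max_chars, trying coarse separators first."""
    text = text.strip()
    if len(text) <= max_chars:
        return [text] if text else []
    if not separators:
        # No separators left: fall back to fixed-width slices.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        piece = piece.strip()
        if not piece:
            continue
        # Greedily merge neighbouring pieces while they fit in one chunk.
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= max_chars:
            current = piece
        else:
            # The piece alone is too long: recurse with finer separators.
            chunks.extend(recursive_chunk(piece, max_chars, finer))
    if current:
        chunks.append(current)
    return chunks


sample = (
    "Heading one\n\n"
    "First paragraph is short.\n\n"
    "Second paragraph is quite a bit longer than the limit allows."
)
chunks = recursive_chunk(sample, max_chars=40)
```

In this sketch the heading and the short paragraph stay together because they fit within the limit, while the long paragraph is split at word boundaries. A Markdown-aware variant would simply put heading markers ahead of the blank-line separator in the list.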
Enums
- DocumentIndexingResult.DocumentIndexingMode
Specifies how a document was processed for retrieval.
- PageProcessingMode
Specifies how document pages are interpreted during document preparation.