Namespace LMKit.Extraction

Namespaces

LMKit.Extraction.Ocr
LMKit.Extraction.Taxonomy
LMKit.Extraction.Training

Classes

DocumentSegment

Represents a single logical document detected within a multi-page file, defined by its page range.

DocumentSplitting

Provides functionality to detect logical document boundaries within a multi-page file using a vision language model (VLM).

DocumentSplittingResult

Represents the result of a document splitting operation, containing the detected logical document segments and their page ranges.

EntityValidationResult

Contains the outcome of automatic entity detection and validation for a single extracted element.

ExtractionProgressEventArgs

Provides data for the Progress event, reporting the current phase and pass information of an extraction operation.

TextExtraction

Provides functionality to extract structured data from unstructured content using a language model.

TextExtractionElement

Represents an element to extract, encapsulating metadata such as its name, type, and description, plus optional nested elements for complex data structures.

TextExtractionElementFormat

Defines user-configurable constraints and normalization hints for extracted values.

TextExtractionResult

Represents the result of a text extraction process, encapsulating the extracted elements and their JSON representation.

TextExtractionResultElement

Represents a single extracted element produced by TextExtraction, holding the extracted value together with confidence, validation, and source location metadata.

Enums

EntityValidationStatus

Describes the outcome of entity validation on an extracted value.

ExtractionPhase

Defines the phases of a text extraction operation.

TextExtractionElementFormat.PredefinedStringFormat

Enumerates the standard string formats supported by the extractor.

TextExtractionElementFormat.TextCaseMode

Enumerates the options for converting the case of the extracted text.