Constructor ExtractionTrainingDataset

Namespace
LMKit.Extraction.Training
Assembly
LM-Kit.NET.dll

ExtractionTrainingDataset(TextExtraction)

Initializes an extraction-focused training dataset bound to a specific TextExtraction configuration.

public ExtractionTrainingDataset(TextExtraction textExtraction)

Parameters

textExtraction TextExtraction

The configured text extraction engine whose elements, prompts, model, and preferred modality will be used to generate training samples.

Examples

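using LMKit.Extraction;          // TextExtraction
using LMKit.Extraction.Training; // ExtractionTrainingDataset

// The extraction engine must be configured before the dataset is created.
var extraction = new TextExtraction(/* configured elsewhere */);
var dataset = new ExtractionTrainingDataset(extraction)
{
    EnableModalityAugmentation = true
};
// Add one sample: the input text paired with its expected extraction result as JSON.
dataset.AddSample("Order #123, total: €42.50", "{\"order_id\":\"123\",\"total\":42.50}");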

Remarks

The constructor captures the current state of textExtraction (e.g., elements, titles/descriptions, and prompt templates). Subsequent calls to AddSample(Attachment, string) and its overloads validate elements and synthesize chat histories that are consistent with this configuration.