Method AddSample

Namespace
LMKit.Extraction.Training
Assembly
LM-Kit.NET.dll

AddSample(string, string)

Adds a training sample from raw text content using the engine’s preferred modality.

public void AddSample(string content, string jsonGroundTruth)

Parameters

content string

The textual content to extract from.

jsonGroundTruth string

The labeled ground-truth JSON matching the extraction schema (elements) configured on the underlying TextExtraction instance.

Examples

dataset.AddSample(
    "Invoice ACME-2024-09: amount 120.00 USD.",
    "{\"invoice_id\":\"ACME-2024-09\",\"amount\":120.00,\"currency\":\"USD\"}");
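Hand-escaping quotes in the ground-truth string is error-prone. The sketch below builds the same label with System.Text.Json instead. The anonymous type and its field names simply mirror the example above and are assumptions about the schema; the final call is left as a comment because dataset is not defined on this page.

```csharp
using System;
using System.Text.Json;

// Hypothetical label shape mirroring the three fields in the example above.
var label = new
{
    invoice_id = "ACME-2024-09",
    amount = 120.00m,
    currency = "USD"
};

// Serialize rather than escape quotes by hand.
string jsonGroundTruth = JsonSerializer.Serialize(label);
Console.WriteLine(jsonGroundTruth);

// The resulting string can then be passed as the second argument:
// dataset.AddSample("Invoice ACME-2024-09: amount 120.00 USD.", jsonGroundTruth);
```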

Remarks

The content is wrapped as a text Attachment and forwarded to AddSample(InferenceModality, Attachment, string).

AddSample(Attachment, string)

Adds a training sample from an Attachment using the engine’s preferred modality.

public void AddSample(Attachment content, string groundTruth)

Parameters

content Attachment

The input attachment (e.g., text, image, or multimodal source) to extract from.

groundTruth string

The labeled ground-truth JSON matching the configured extraction elements.

Remarks

For text inputs, create the attachment with Attachment.CreateFromText(string, string). This overload delegates to AddSample(InferenceModality, Attachment, string).

AddSample(InferenceModality, Attachment, string)

Adds a training sample with an explicit InferenceModality.

public void AddSample(InferenceModality modality, Attachment content, string jsonGroundTruth)

Parameters

modality InferenceModality

The inference modality to use for generating prompts and responses.

content Attachment

The content attachment to extract from.

jsonGroundTruth string

The labeled ground-truth JSON. It must be compatible with the configured Elements.

Examples

// invoiceText holds the raw document text; jsonLabel holds its ground-truth JSON.
var attachment = Attachment.CreateFromText(invoiceText, "text");
dataset.AddSample(InferenceModality.Text, attachment, jsonLabel);
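Because the ground truth must be compatible with the configured elements, a caller-side sanity check before adding the sample can catch mislabeled data early. The sketch below is not part of the library. It assumes the label is a flat JSON object whose top-level keys are the element names, as in the invoice example on this page. MissingKeys and the list of expected names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical helper: returns the expected element names that are absent
// from the top-level object of the ground-truth JSON.
static List<string> MissingKeys(string json, IEnumerable<string> expectedNames)
{
    // JsonDocument.Parse throws a JsonException on malformed JSON.
    using JsonDocument doc = JsonDocument.Parse(json);
    if (doc.RootElement.ValueKind != JsonValueKind.Object)
        throw new ArgumentException("Ground truth must be a JSON object.", nameof(json));

    HashSet<string> present = doc.RootElement
        .EnumerateObject()
        .Select(p => p.Name)
        .ToHashSet();

    return expectedNames.Where(n => !present.Contains(n)).ToList();
}

// Assumed element names; in practice these come from the extraction schema.
string jsonLabel = "{\"invoice_id\":\"ACME-2024-09\",\"amount\":120.00}";
List<string> missing = MissingKeys(jsonLabel, new[] { "invoice_id", "amount", "currency" });
Console.WriteLine(string.Join(", ", missing)); // prints "currency"
```

What the library itself treats as compatible may be stricter or looser than a key comparison, so a check like this complements its validation rather than replacing it.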

Remarks

This method:

  1. Validates that extraction elements are defined on the bound TextExtraction.
  2. Builds an LMKit.Extraction.ExtractionInput and runs engine.ExtractElements with disableInference=true to materialize the final system and user prompts without invoking the model.
  3. Converts jsonGroundTruth into the canonical JSON completion using AsJson(bool, bool), with formatted element names and empty-value normalization.
  4. Assembles a ChatHistory containing the last system prompt (if any), the single chat prompt from the engine, and the assistant response composed as engine.ResponsePrefix + completion + engine.ResponseSuffix.
  5. Creates and appends a ChatTrainingSample tagged with engine.LastInferenceModality.

If EnableModalityAugmentation is true and the last inference modality is Multimodal, two additional samples are added automatically: one for Text and one for Vision.
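The history assembly in step 4 and the augmentation rule can be illustrated with a small stand-alone model. This is a sketch of the described behavior, not the library's implementation: the Modality enum, the role/text tuples, BuildHistory, and ModalitiesToAdd are hypothetical stand-ins for InferenceModality, ChatHistory, and the internal logic.

```csharp
using System;
using System.Collections.Generic;

// Step 4, modeled with plain tuples: optional system prompt, the single
// user prompt, then the assistant turn as prefix + completion + suffix.
static List<(string Role, string Text)> BuildHistory(
    string? systemPrompt, string userPrompt,
    string prefix, string completion, string suffix)
{
    var history = new List<(string Role, string Text)>();
    if (!string.IsNullOrEmpty(systemPrompt))
        history.Add(("system", systemPrompt));
    history.Add(("user", userPrompt));
    history.Add(("assistant", prefix + completion + suffix));
    return history;
}

// Augmentation rule: a Multimodal sample also yields Text and Vision
// samples when augmentation is enabled; otherwise only the original.
static List<Modality> ModalitiesToAdd(Modality last, bool enableAugmentation)
{
    var result = new List<Modality> { last };
    if (enableAugmentation && last == Modality.Multimodal)
    {
        result.Add(Modality.Text);
        result.Add(Modality.Vision);
    }
    return result;
}

var history = BuildHistory(null, "Extract the invoice fields.", "", "{\"amount\":120.00}", "");
Console.WriteLine(history.Count); // prints "2" (no system prompt was supplied)

Console.WriteLine(string.Join(", ", ModalitiesToAdd(Modality.Multimodal, true)));
// prints "Multimodal, Text, Vision"

// Hypothetical stand-in for InferenceModality.
enum Modality { Text, Vision, Multimodal }
```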

Exceptions

NotImplementedException

Thrown when the underlying engine provides more than one chat prompt (engine.ChatPrompts.Count != 1), which is not supported by this builder.
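When building a dataset from many inputs, it may be preferable to skip an unsupported sample rather than abort the whole run. The sketch below shows one way to do that. TryAdd is a hypothetical helper, and the lambda that throws is only a stand-in for a real call such as dataset.AddSample(modality, attachment, jsonLabel).

```csharp
using System;

// Hypothetical wrapper: runs the add operation and reports whether it
// succeeded, capturing the message when NotImplementedException is thrown.
static bool TryAdd(Action addSample, out string? error)
{
    try
    {
        addSample();
        error = null;
        return true;
    }
    catch (NotImplementedException ex)
    {
        error = ex.Message;
        return false;
    }
}

// Stand-in for a call that hits the multi-prompt case described above.
bool ok = TryAdd(
    () => throw new NotImplementedException("more than one chat prompt"),
    out string? error);

Console.WriteLine(ok ? "added" : $"skipped: {error}");
// prints "skipped: more than one chat prompt"
```

Only NotImplementedException is caught here, so any other failure still propagates to the caller.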