Method AddSample
- Namespace: LMKit.Extraction.Training
- Assembly: LM-Kit.NET.dll
AddSample(string, string)
Adds a training sample from raw text content using the engine’s preferred modality.
public void AddSample(string content, string jsonGroundTruth)
Parameters
content
string: The textual content to extract from.
jsonGroundTruth
string: The labeled ground-truth JSON matching the extraction schema (elements) configured on the underlying TextExtraction instance.
Examples
dataset.AddSample(
"Invoice ACME-2024-09: amount 120.00 USD.",
"{\"invoice_id\":\"ACME-2024-09\",\"amount\":120.00,\"currency\":\"USD\"}");
Remarks
The content is wrapped as a text Attachment and forwarded to AddSample(InferenceModality, Attachment, string).
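Hand-escaped JSON literals like the one in the example above are easy to get wrong. As a minimal sketch, the label can be built with System.Text.Json instead. The GroundTruthJson helper and its ForInvoice method are hypothetical names introduced here, and the field names are assumed to match the extraction elements configured on the TextExtraction instance.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper, not part of LM-Kit: builds the ground-truth JSON label.
public static class GroundTruthJson
{
    // Serializes the label fields into a compact JSON string.
    // The keys mirror the example above and are assumed to match
    // the element names configured on the extraction engine.
    public static string ForInvoice(string invoiceId, decimal amount, string currency)
    {
        var label = new Dictionary<string, object>
        {
            ["invoice_id"] = invoiceId,
            ["amount"] = amount,
            ["currency"] = currency,
        };
        return JsonSerializer.Serialize(label);
    }
}

// Intended use, with the dataset variable from the example above:
// dataset.AddSample(
//     "Invoice ACME-2024-09: amount 120.00 USD.",
//     GroundTruthJson.ForInvoice("ACME-2024-09", 120.00m, "USD"));
```

Serializing from a dictionary leaves quoting and escaping to the serializer, so the label stays valid JSON even when field values contain quotes or backslashes.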
AddSample(Attachment, string)
Adds a training sample from an Attachment using the engine’s preferred modality.
public void AddSample(Attachment content, string groundTruth)
Parameters
content
Attachment: The input attachment (e.g., text, image, or multimodal source) to extract from.
groundTruth
string: The labeled ground-truth JSON matching the configured extraction elements.
Remarks
For text inputs, prefer creating the attachment with Attachment.CreateFromText(string, string). This overload delegates to AddSample(InferenceModality, Attachment, string).
AddSample(InferenceModality, Attachment, string)
Adds a training sample with an explicit InferenceModality.
public void AddSample(InferenceModality modality, Attachment content, string jsonGroundTruth)
Parameters
modality
InferenceModality: The inference modality to use for generating prompts and responses.
content
Attachment: The content attachment to extract from.
jsonGroundTruth
string: The labeled ground-truth JSON. It must be compatible with the configured Elements.
Examples
var attachment = Attachment.CreateFromText(invoiceText, "text");
dataset.AddSample(InferenceModality.Text, attachment, jsonLabel);
Remarks
This method:
- Validates that extraction elements are defined on the bound TextExtraction.
- Builds an LMKit.Extraction.ExtractionInput and runs engine.ExtractElements with disableInference=true to materialize the final system and user prompts without invoking the model.
- Converts jsonGroundTruth into the canonical JSON completion using AsJson(bool, bool) with formatted element names and empty value normalization.
- Assembles a ChatHistory: includes the last system prompt (if any), the single chat prompt from the engine, and the assistant response composed as engine.ResponsePrefix + completion + engine.ResponseSuffix.
- Creates and appends a ChatTrainingSample tagged with engine.LastInferenceModality.
If EnableModalityAugmentation is true and the last inference modality is Multimodal, two additional samples are added automatically for Text and Vision.
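The augmentation rule and the assistant-response composition described above can be modeled in a few lines. This is an illustrative sketch only, not the library's implementation: the Modality enum is a local stand-in for InferenceModality, AugmentationModel is a hypothetical name, and the order in which the extra samples are appended is an assumption.

```csharp
using System.Collections.Generic;

// Local stand-in for the library's InferenceModality (not the real type).
public enum Modality { Text, Vision, Multimodal }

// Hypothetical model of the behavior described in the remarks.
public static class AugmentationModel
{
    // Returns the modalities for which samples would be recorded by one call.
    public static List<Modality> RecordedModalities(Modality last, bool enableAugmentation)
    {
        // The primary sample is always tagged with the last inference modality.
        var recorded = new List<Modality> { last };

        // Augmentation applies only to multimodal samples.
        // The Text-then-Vision order is assumed for illustration.
        if (enableAugmentation && last == Modality.Multimodal)
        {
            recorded.Add(Modality.Text);
            recorded.Add(Modality.Vision);
        }
        return recorded;
    }

    // Mirrors "ResponsePrefix + completion + ResponseSuffix" from the remarks.
    public static string AssistantResponse(string prefix, string completion, string suffix)
        => prefix + completion + suffix;
}
```

Under this model, a single multimodal call with augmentation enabled yields three samples, while text-only or vision-only calls always yield one.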
Exceptions
- NotImplementedException
Thrown when the underlying engine provides more than one chat prompt (engine.ChatPrompts.Count != 1), which is not supported by this builder.
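When building a dataset in a loop, it may be useful to skip a sample that triggers this exception rather than abort the whole run. A minimal sketch follows, assuming the caller passes the AddSample call as a delegate. SampleGuard and TryAdd are hypothetical names and not part of LM-Kit.

```csharp
using System;

// Hypothetical wrapper, not part of LM-Kit.
public static class SampleGuard
{
    // Runs the supplied action and reports whether it completed.
    // Only NotImplementedException is caught; other exceptions propagate.
    public static bool TryAdd(Action addSample, out string reason)
    {
        try
        {
            addSample();
            reason = string.Empty;
            return true;
        }
        catch (NotImplementedException ex)
        {
            reason = ex.Message;
            return false;
        }
    }
}

// Intended use, with the variables from the example above:
// if (!SampleGuard.TryAdd(
//         () => dataset.AddSample(InferenceModality.Text, attachment, jsonLabel),
//         out var reason))
// {
//     Console.Error.WriteLine($"Sample skipped: {reason}");
// }
```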