Method AddSample

Namespace: LMKit.TextAnalysis.Training

Assembly: LM-Kit.NET.dll

AddSample(string, IEnumerable<EntityAnnotation>)

Adds a training sample from raw text content using the engine's preferred modality.

public void AddSample(string content, IEnumerable<EntityAnnotation> annotations)

Parameters

content string: The textual content to analyze for named entities.
annotations IEnumerable<EntityAnnotation>: Ground-truth entity annotations (label + representative text) expected in content.

Examples

using var model = new LM("path/to/model.gguf");
var ner = new NamedEntityRecognition(model);
var dataset = new NamedEntityRecognitionTrainingDataset(ner);

// Add samples with various entity types
dataset.AddSample(
    "Amazon reported $134 billion in revenue for Q3 2023.",
    new[]
    {
        new EntityAnnotation("Organization", "Amazon"),
        new EntityAnnotation("Money", "$134 billion"),
        new EntityAnnotation("Date", "Q3 2023")
    });

dataset.AddSample(
    "The Treaty of Versailles was signed on June 28, 1919.",
    new[]
    {
        new EntityAnnotation("Event", "Treaty of Versailles"),
        new EntityAnnotation("Date", "June 28, 1919")
    });

// Negative sample: no entities present
dataset.AddSample(
    "The weather is nice today.",
    Array.Empty<EntityAnnotation>());

dataset.ExportAsSharegpt("ner_dataset.json", overwrite: true);
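
Because annotations is typed as IEnumerable<EntityAnnotation>, the ground truth does not have to be a hand-written array. The following is a minimal sketch, not an official sample, that projects label/text pairs into annotations with LINQ. The `labeled` tuples are hypothetical data, and the `LMKit.Model` and `LMKit.TextAnalysis` namespaces are assumptions; only `LMKit.TextAnalysis.Training` is stated on this page.

```csharp
using System.Linq;
using LMKit.Model;                  // assumed namespace for LM
using LMKit.TextAnalysis;           // assumed namespace for NamedEntityRecognition
using LMKit.TextAnalysis.Training;  // namespace stated on this page

using var model = new LM("path/to/model.gguf");
var ner = new NamedEntityRecognition(model);
var dataset = new NamedEntityRecognitionTrainingDataset(ner);

// Hypothetical labeled data, e.g. exported from your own annotation tool.
var labeled = new[]
{
    (Label: "Organization", Text: "Amazon"),
    (Label: "Money", Text: "$134 billion"),
    (Label: "Date", Text: "Q3 2023"),
};

// Project each (label, text) pair into an EntityAnnotation.
// ToList() materializes the sequence so the projection runs only once,
// however many times the sequence is enumerated afterwards.
var annotations = labeled
    .Select(e => new EntityAnnotation(e.Label, e.Text))
    .ToList();

dataset.AddSample(
    "Amazon reported $134 billion in revenue for Q3 2023.",
    annotations);
```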

Remarks

The content is wrapped as a text Attachment and forwarded to AddSample(InferenceModality, Attachment, IEnumerable<EntityAnnotation>).
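
As a rough illustration of that forwarding, the sketch below shows the string overload next to a hand-wrapped attachment. It assumes the second argument of Attachment.CreateFromText is an attachment name (as the other examples on this page suggest) and that the `LMKit.Model`, `LMKit.TextAnalysis`, and `LMKit.Data` namespaces are correct; the sample text and the name "sample" are placeholders.

```csharp
using LMKit.Data;                   // assumed namespace for Attachment
using LMKit.Model;                  // assumed namespace for LM
using LMKit.TextAnalysis;           // assumed namespace for NamedEntityRecognition
using LMKit.TextAnalysis.Training;  // namespace stated on this page

using var model = new LM("path/to/model.gguf");
var ner = new NamedEntityRecognition(model);
var dataset = new NamedEntityRecognitionTrainingDataset(ner);

const string text = "The conference takes place in Berlin.";
var annotations = new[] { new EntityAnnotation("Location", "Berlin") };

// Form 1: pass the raw string and let the dataset wrap it.
dataset.AddSample(text, annotations);

// Form 2: wrap the text yourself. Per the Remarks, this should follow the
// same path as Form 1. It is commented out here so the same sample is not
// added to the dataset twice.
// dataset.AddSample(Attachment.CreateFromText(text, "sample"), annotations);
```

Wrapping the text manually is mainly useful when the attachment is reused elsewhere; otherwise the string overload is the shorter call.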

AddSample(Attachment, IEnumerable<EntityAnnotation>)

Adds a training sample from an Attachment using the engine's preferred modality.

public void AddSample(Attachment content, IEnumerable<EntityAnnotation> annotations)

Parameters

content Attachment: The input attachment (e.g., text, image, or multimodal source) to analyze.
annotations IEnumerable<EntityAnnotation>: Ground-truth entity annotations (label + representative text) expected in content.

Examples

using var model = new LM("path/to/model.gguf");
var ner = new NamedEntityRecognition(model);
var dataset = new NamedEntityRecognitionTrainingDataset(ner);

// Create attachment from text
var textAttachment = Attachment.CreateFromText(
    "President Biden met with Chancellor Scholz in Berlin on February 15, 2024.",
    "news");

dataset.AddSample(
    textAttachment,
    new[]
    {
        new EntityAnnotation("Person", "President Biden"),
        new EntityAnnotation("Person", "Chancellor Scholz"),
        new EntityAnnotation("Location", "Berlin"),
        new EntityAnnotation("Date", "February 15, 2024")
    });

// Create attachment from image file (e.g., scanned news article)
var imageAttachment = Attachment.CreateFromFile("news_clipping.png");

dataset.AddSample(
    imageAttachment,
    new[]
    {
        new EntityAnnotation("Organization", "United Nations"),
        new EntityAnnotation("Location", "Geneva"),
        new EntityAnnotation("Date", "March 2024")
    });

dataset.ExportAsSharegpt("multimodal_ner_dataset.json", overwrite: true);

Remarks

For text inputs, create the attachment with Attachment.CreateFromText(string, string). This overload delegates to AddSample(InferenceModality, Attachment, IEnumerable<EntityAnnotation>).

AddSample(InferenceModality, Attachment, IEnumerable<EntityAnnotation>)

Adds a training sample with an explicit InferenceModality.

public void AddSample(InferenceModality modality, Attachment content, IEnumerable<EntityAnnotation> annotations)

Parameters

modality InferenceModality: The inference modality to use for generating prompts and responses.
content Attachment: The content attachment to analyze.
annotations IEnumerable<EntityAnnotation>: Ground-truth entity annotations (label + representative text) expected in content.

Examples

using var model = new LM("path/to/model.gguf");
var ner = new NamedEntityRecognition(model);
var dataset = new NamedEntityRecognitionTrainingDataset(ner);

var attachment = Attachment.CreateFromText(
    "SpaceX launched Falcon 9 from Cape Canaveral carrying Starlink satellites.",
    "text");

// Explicitly specify Text modality
dataset.AddSample(
    InferenceModality.Text,
    attachment,
    new[]
    {
        new EntityAnnotation("Organization", "SpaceX"),
        new EntityAnnotation("Product", "Falcon 9"),
        new EntityAnnotation("Location", "Cape Canaveral"),
        new EntityAnnotation("Product", "Starlink")
    });

// Add a Vision-only sample from an image
var imageAttachment = Attachment.CreateFromFile("business_card.png");

dataset.AddSample(
    InferenceModality.Vision,
    imageAttachment,
    new[]
    {
        new EntityAnnotation("Person", "John Smith"),
        new EntityAnnotation("Organization", "Acme Corporation"),
        new EntityAnnotation("Phone", "+1 555-123-4567"),
        new EntityAnnotation("Email", "john.smith@acme.com")
    });

dataset.ExportAsSharegpt("modality_specific_ner_dataset.json", overwrite: true);

Remarks

This method assembles a ShareGPT-style conversation from the configured prompts and appends a ChatTrainingSample whose assistant response reflects the provided annotations. When EnableModalityAugmentation is true and modality is Multimodal, additional samples are also appended for the Text and Vision modalities.
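
The augmentation behavior can be sketched as follows. This is an illustration under stated assumptions, not a confirmed usage pattern: EnableModalityAugmentation is assumed to be a settable property on the dataset, InferenceModality.Multimodal is inferred from the wording above, the namespaces other than `LMKit.TextAnalysis.Training` are assumed, and the file name and annotations are placeholders.

```csharp
using LMKit.Data;                   // assumed namespace for Attachment
using LMKit.Inference;              // assumed namespace for InferenceModality
using LMKit.Model;                  // assumed namespace for LM
using LMKit.TextAnalysis;           // assumed namespace for NamedEntityRecognition
using LMKit.TextAnalysis.Training;  // namespace stated on this page

using var model = new LM("path/to/model.gguf");
var ner = new NamedEntityRecognition(model);

var dataset = new NamedEntityRecognitionTrainingDataset(ner)
{
    // Assumption: the flag lives on the dataset and can be set at construction.
    EnableModalityAugmentation = true
};

// Placeholder image; replace with a real scanned document.
var attachment = Attachment.CreateFromFile("invoice.png");

// With augmentation enabled, a Multimodal sample is expected to yield
// extra Text and Vision samples in addition to the Multimodal one.
dataset.AddSample(
    InferenceModality.Multimodal,
    attachment,
    new[]
    {
        new EntityAnnotation("Organization", "Acme Corporation"), // placeholder
        new EntityAnnotation("Date", "March 2024")                // placeholder
    });

dataset.ExportAsSharegpt("augmented_ner_dataset.json", overwrite: true);
```

If the exported file is larger than expected, the extra per-modality samples are the likely cause; passing InferenceModality.Text or InferenceModality.Vision explicitly avoids them.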
