Method AddSample

Namespace: LMKit.TextAnalysis.Training

Assembly: LM-Kit.NET.dll

AddSample(string, IEnumerable<EntityAnnotation>)

Adds a training sample from raw text content using the engine's preferred modality.

public void AddSample(string content, IEnumerable<EntityAnnotation> annotations)

Parameters

content string: The textual content to analyze for PII/entities.
annotations IEnumerable<EntityAnnotation>: Ground-truth entity annotations (label + representative text) expected in content.

Examples

using var model = new LM("path/to/model.gguf");
using var pii = new PiiExtraction(model);
var dataset = new PiiExtractionTrainingDataset(pii);

// Add samples with various PII types
dataset.AddSample(
    "SSN: 123-45-6789",
    new[] { new EntityAnnotation("US_SSN", "123-45-6789") });

dataset.AddSample(
    "My credit card is 4111-1111-1111-1111, expires 12/25.",
    new[]
    {
        new EntityAnnotation("CreditCard", "4111-1111-1111-1111"),
        new EntityAnnotation("ExpirationDate", "12/25")
    });

dataset.AddSample(
    "No sensitive information in this text.",
    Array.Empty<EntityAnnotation>()); // Negative sample

dataset.ExportAsSharegpt("pii_dataset.json", overwrite: true);

Remarks

The content is wrapped as a text Attachment and forwarded to AddSample(InferenceModality, Attachment, IEnumerable<EntityAnnotation>).
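The forwarding described above can be sketched as follows. This is an illustration under stated assumptions, not the library's internal implementation: the attachment name "sample" and the "PhoneNumber" label are placeholders chosen for this example, and using directives are omitted as in the other examples on this page.

```csharp
using var model = new LM("path/to/model.gguf");
using var pii = new PiiExtraction(model);
var dataset = new PiiExtractionTrainingDataset(pii);

const string text = "Reach me at 555-0100.";
var annotations = new[] { new EntityAnnotation("PhoneNumber", "555-0100") };

// String overload: the text is wrapped as a text attachment internally.
dataset.AddSample(text, annotations);

// Roughly equivalent explicit form. "sample" is a hypothetical attachment
// name; the name the library assigns internally is not documented here.
dataset.AddSample(Attachment.CreateFromText(text, "sample"), annotations);
```

In a real dataset, use one form or the other; calling both as shown adds the same sample twice.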

AddSample(Attachment, IEnumerable<EntityAnnotation>)

Adds a training sample from an Attachment using the engine's preferred modality.

public void AddSample(Attachment content, IEnumerable<EntityAnnotation> annotations)

Parameters

content Attachment: The input attachment (e.g., text, image, or multimodal source) to analyze.
annotations IEnumerable<EntityAnnotation>: Ground-truth entity annotations (label + representative text) expected in content.

Examples

using var model = new LM("path/to/model.gguf");
using var pii = new PiiExtraction(model);
var dataset = new PiiExtractionTrainingDataset(pii);

// Create attachment from text
var textAttachment = Attachment.CreateFromText(
    "Invoice to: Jane Smith, 123 Main St, New York, NY 10001",
    "invoice");

dataset.AddSample(
    textAttachment,
    new[]
    {
        new EntityAnnotation("Person", "Jane Smith"),
        new EntityAnnotation("Address", "123 Main St, New York, NY 10001")
    });

// Create attachment from image file
var imageAttachment = Attachment.CreateFromFile("scanned_document.png");

dataset.AddSample(
    imageAttachment,
    new[]
    {
        new EntityAnnotation("Person", "John Doe"),
        new EntityAnnotation("DateOfBirth", "1985-03-15")
    });

dataset.ExportAsSharegpt("multimodal_dataset.json", overwrite: true);

Remarks

For text inputs, prefer creating the attachment with Attachment.CreateFromText(string, string). This overload delegates to AddSample(InferenceModality, Attachment, IEnumerable<EntityAnnotation>), passing the engine's preferred modality.

AddSample(InferenceModality, Attachment, IEnumerable<EntityAnnotation>)

Adds a training sample with an explicit InferenceModality.

public void AddSample(InferenceModality modality, Attachment content, IEnumerable<EntityAnnotation> annotations)

Parameters

modality InferenceModality: The inference modality to use for generating prompts and responses.
content Attachment: The content attachment to analyze.
annotations IEnumerable<EntityAnnotation>: Ground-truth entity annotations (label + representative text) expected in content.

Examples

using var model = new LM("path/to/model.gguf");
using var pii = new PiiExtraction(model);
var dataset = new PiiExtractionTrainingDataset(pii);

var attachment = Attachment.CreateFromText(
    "Contact: alice@example.com, bob@example.org",
    "text");

// Explicitly specify Text modality
dataset.AddSample(
    InferenceModality.Text,
    attachment,
    new[]
    {
        new EntityAnnotation("EmailAddress", "alice@example.com"),
        new EntityAnnotation("EmailAddress", "bob@example.org")
    });

// Add a Vision-only sample from an image
var imageAttachment = Attachment.CreateFromFile("id_card_scan.png");

dataset.AddSample(
    InferenceModality.Vision,
    imageAttachment,
    new[]
    {
        new EntityAnnotation("Person", "Marie Dupont"),
        new EntityAnnotation("IDNumber", "FR-123456789")
    });

dataset.ExportAsSharegpt("modality_specific_dataset.json", overwrite: true);

Remarks

This method assembles a ShareGPT-style conversation from the configured prompts and appends a ChatTrainingSample whose assistant response reflects the provided annotations. When EnableModalityAugmentation is true and modality is Multimodal, additional samples are also appended for the Text and Vision modalities.
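The augmentation behavior described above might be exercised as in the sketch below. It assumes that EnableModalityAugmentation is a settable property on the dataset, which this page does not confirm; the file names are placeholders, and using directives are omitted as in the other examples on this page.

```csharp
using var model = new LM("path/to/model.gguf");
using var pii = new PiiExtraction(model);
var dataset = new PiiExtractionTrainingDataset(pii);

// Assumption: EnableModalityAugmentation is exposed as a settable
// property on the dataset.
dataset.EnableModalityAugmentation = true;

// Hypothetical input file used only for illustration.
var scan = Attachment.CreateFromFile("passport_scan.png");

// With augmentation enabled, a Multimodal sample is expected to also
// produce Text and Vision variants of the same annotated content.
dataset.AddSample(
    InferenceModality.Multimodal,
    scan,
    new[] { new EntityAnnotation("Person", "Marie Dupont") });

dataset.ExportAsSharegpt("augmented_dataset.json", overwrite: true);
```

If augmentation is left disabled, or a modality other than Multimodal is passed, only the single requested sample should be appended.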
