Method AddSample
- Namespace: LMKit.TextAnalysis.Training
- Assembly: LM-Kit.NET.dll
AddSample(string, IEnumerable<EntityAnnotation>)
Adds a training sample from raw text content using the engine's preferred modality.
public void AddSample(string content, IEnumerable<EntityAnnotation> annotations)
Parameters
- content (string): The textual content to analyze for named entities.
- annotations (IEnumerable<EntityAnnotation>): Ground-truth entity annotations (label + representative text) expected in content.
Examples
using var model = new LM("path/to/model.gguf");
var ner = new NamedEntityRecognition(model);
var dataset = new NamedEntityRecognitionTrainingDataset(ner);

// Add samples with various entity types
dataset.AddSample(
    "Amazon reported $134 billion in revenue for Q3 2023.",
    new[]
    {
        new EntityAnnotation("Organization", "Amazon"),
        new EntityAnnotation("Money", "$134 billion"),
        new EntityAnnotation("Date", "Q3 2023")
    });

dataset.AddSample(
    "The Treaty of Versailles was signed on June 28, 1919.",
    new[]
    {
        new EntityAnnotation("Event", "Treaty of Versailles"),
        new EntityAnnotation("Date", "June 28, 1919")
    });

// Negative sample: no entities present
dataset.AddSample(
    "The weather is nice today.",
    Array.Empty<EntityAnnotation>());

dataset.ExportAsSharegpt("ner_dataset.json", overwrite: true);
Remarks
The content is wrapped as a text Attachment and
forwarded to AddSample(InferenceModality, Attachment, IEnumerable<EntityAnnotation>).
AddSample(Attachment, IEnumerable<EntityAnnotation>)
Adds a training sample from an Attachment using the engine's preferred modality.
public void AddSample(Attachment content, IEnumerable<EntityAnnotation> annotations)
Parameters
- content (Attachment): The input attachment (e.g., text, image, or multimodal source) to analyze.
- annotations (IEnumerable<EntityAnnotation>): Ground-truth entity annotations (label + representative text) expected in content.
Examples
using var model = new LM("path/to/model.gguf");
var ner = new NamedEntityRecognition(model);
var dataset = new NamedEntityRecognitionTrainingDataset(ner);

// Create attachment from text
var textAttachment = Attachment.CreateFromText(
    "President Biden met with Chancellor Scholz in Berlin on February 15, 2024.",
    "news");

dataset.AddSample(
    textAttachment,
    new[]
    {
        new EntityAnnotation("Person", "President Biden"),
        new EntityAnnotation("Person", "Chancellor Scholz"),
        new EntityAnnotation("Location", "Berlin"),
        new EntityAnnotation("Date", "February 15, 2024")
    });

// Create attachment from image file (e.g., scanned news article)
var imageAttachment = Attachment.CreateFromFile("news_clipping.png");

dataset.AddSample(
    imageAttachment,
    new[]
    {
        new EntityAnnotation("Organization", "United Nations"),
        new EntityAnnotation("Location", "Geneva"),
        new EntityAnnotation("Date", "March 2024")
    });

dataset.ExportAsSharegpt("multimodal_ner_dataset.json", overwrite: true);
Remarks
For text inputs, prefer CreateFromText(string, string). This overload delegates to AddSample(InferenceModality, Attachment, IEnumerable<EntityAnnotation>).
AddSample(InferenceModality, Attachment, IEnumerable<EntityAnnotation>)
Adds a training sample with an explicit InferenceModality.
public void AddSample(InferenceModality modality, Attachment content, IEnumerable<EntityAnnotation> annotations)
Parameters
- modality (InferenceModality): The inference modality to use for generating prompts and responses.
- content (Attachment): The content attachment to analyze.
- annotations (IEnumerable<EntityAnnotation>): Ground-truth entity annotations (label + representative text) expected in content.
Examples
using var model = new LM("path/to/model.gguf");
var ner = new NamedEntityRecognition(model);
var dataset = new NamedEntityRecognitionTrainingDataset(ner);

var attachment = Attachment.CreateFromText(
    "SpaceX launched Falcon 9 from Cape Canaveral carrying Starlink satellites.",
    "text");

// Explicitly specify Text modality
dataset.AddSample(
    InferenceModality.Text,
    attachment,
    new[]
    {
        new EntityAnnotation("Organization", "SpaceX"),
        new EntityAnnotation("Product", "Falcon 9"),
        new EntityAnnotation("Location", "Cape Canaveral"),
        new EntityAnnotation("Product", "Starlink")
    });

// Add a Vision-only sample from an image
var imageAttachment = Attachment.CreateFromFile("business_card.png");

dataset.AddSample(
    InferenceModality.Vision,
    imageAttachment,
    new[]
    {
        new EntityAnnotation("Person", "John Smith"),
        new EntityAnnotation("Organization", "Acme Corporation"),
        new EntityAnnotation("Phone", "+1 555-123-4567"),
        new EntityAnnotation("Email", "john.smith@acme.com")
    });

dataset.ExportAsSharegpt("modality_specific_ner_dataset.json", overwrite: true);
Remarks
This method assembles a ShareGPT-style conversation from the configured prompts and appends a ChatTrainingSample whose assistant response reflects the provided annotations.
When EnableModalityAugmentation is true and modality is Multimodal, additional samples are appended for Text and Vision.
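The augmentation rule in these remarks can be sketched as a small standalone model. This is not LM-Kit's implementation: the Modality enum, the AugmentationSketch class, and the ExpandModalities method are hypothetical stand-ins introduced only for illustration. The order in which the extra samples are appended is also an assumption, since the remarks do not state it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for InferenceModality; only the three values
// named in the remarks are modeled here.
enum Modality { Text, Vision, Multimodal }

static class AugmentationSketch
{
    // Returns the modalities for which a single AddSample call would append
    // a training sample, following only the rule stated in the remarks.
    public static IReadOnlyList<Modality> ExpandModalities(
        Modality requested, bool enableModalityAugmentation)
    {
        // The requested modality always yields one sample.
        var result = new List<Modality> { requested };

        // Augmentation applies only to Multimodal requests.
        // Assumed order: Text first, then Vision.
        if (enableModalityAugmentation && requested == Modality.Multimodal)
        {
            result.Add(Modality.Text);
            result.Add(Modality.Vision);
        }

        return result;
    }

    static void Main()
    {
        // Multimodal with augmentation on: three samples.
        Console.WriteLine(string.Join(", ",
            ExpandModalities(Modality.Multimodal, true)));
        // prints "Multimodal, Text, Vision"

        // Any other combination: one sample.
        Console.WriteLine(string.Join(", ",
            ExpandModalities(Modality.Text, true)));
        // prints "Text"
    }
}
```

Under this reading, a dataset built from N Multimodal samples with augmentation enabled would contain 3N training samples, which is worth keeping in mind when sizing an export.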