Class PiiExtractionTrainingDataset

Namespace
LMKit.TextAnalysis.Training
Assembly
LM-Kit.NET.dll

Training dataset builder specialized for the PII/Entity Extraction engine. Converts labeled entity annotations into ChatTrainingSample items usable for supervised fine-tuning.

public sealed class PiiExtractionTrainingDataset : TrainingDataset
Inheritance
PiiExtractionTrainingDataset
Inherited Members

Remarks

This dataset uses the current PiiExtraction configuration (entity types, prompts, model, and preferred modality) to synthesize ShareGPT-style chat conversations where the assistant response reflects the ground-truth labels provided via EntityAnnotation instances.

Constructors

PiiExtractionTrainingDataset(PiiExtraction)

Initializes a training dataset for PII/entity extraction, bound to a specific PiiExtraction configuration.

Properties

EnableModalityAugmentation

Gets or sets whether modality-augmented samples are added when the engine runs in Multimodal mode.

Methods

AddSample(Attachment, IEnumerable<EntityAnnotation>)

Adds a training sample from an Attachment using the engine’s preferred modality.

AddSample(InferenceModality, Attachment, IEnumerable<EntityAnnotation>)

Adds a training sample with an explicit InferenceModality.

AddSample(string, IEnumerable<EntityAnnotation>)

Adds a training sample from raw text content using the engine’s preferred modality.
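
The members listed above could be combined roughly as follows. This is a minimal sketch, not an official sample. It uses only the constructor, property, and AddSample(string, IEnumerable<EntityAnnotation>) overload documented on this page. The PiiDatasetSketch class, its Build method, and the tuple shape of labeledTexts are hypothetical names introduced for illustration. The LMKit.TextAnalysis namespace for PiiExtraction, the namespace of EntityAnnotation, and the bool type of EnableModalityAugmentation are assumptions. Because this page does not document how PiiExtraction or EntityAnnotation instances are created, both are passed in by the caller.

```csharp
using System.Collections.Generic;
using LMKit.TextAnalysis;           // assumption: namespace that declares PiiExtraction
using LMKit.TextAnalysis.Training;  // namespace documented on this page

// Hypothetical helper class, not part of LM-Kit.NET.
public static class PiiDatasetSketch
{
    // Builds a dataset from already-labeled raw text.
    // The engine and the annotations are supplied by the caller because
    // their construction is not covered by this page.
    public static PiiExtractionTrainingDataset Build(
        PiiExtraction engine,
        IEnumerable<(string Text, IEnumerable<EntityAnnotation> Labels)> labeledTexts)
    {
        // Bind the dataset to the engine's current configuration.
        var dataset = new PiiExtractionTrainingDataset(engine);

        // Assumption: the property is a bool, inferred from the
        // "Gets or sets whether" wording of its description.
        dataset.EnableModalityAugmentation = true;

        foreach (var (text, labels) in labeledTexts)
        {
            // Raw-text overload: the engine's preferred modality is used.
            dataset.AddSample(text, labels);
        }

        return dataset;
    }
}
```

When the input is a document or image rather than raw text, the Attachment overloads listed above take the place of the string overload. The overload that accepts an InferenceModality is the one to use when the engine's preferred modality should be overridden for a particular sample.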

See Also