Method GetTrainingData
Namespace: LMKit.TextAnalysis
Assembly: LM-Kit.NET.dll
GetTrainingData(TrainingDataset, int, bool, int?)
Retrieves training data for fine-tuning a sarcasm detection model from a specified dataset.
public static List<(string, bool)> GetTrainingData(SarcasmDetection.TrainingDataset dataset, int maxSamples = 1000, bool shuffle = true, int? seed = null)
Parameters
dataset
SarcasmDetection.TrainingDataset
The dataset to retrieve training data from.
maxSamples
int
The maximum number of samples to retrieve from the dataset. Default is 1000.
shuffle
bool
Indicates whether to shuffle the dataset before selecting samples. Default is true.
seed
int?
An optional seed for the random number generator used when shuffling. If null, the shuffle operation will not be seeded.
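The way maxSamples, shuffle, and seed interact can be pictured with the sketch below. It is not the library's implementation: SampleSelection and Select are hypothetical names, and the Fisher-Yates shuffle driven by System.Random is only an assumption about what "shuffle before selecting" means. It illustrates that a fixed seed reproduces the same selection on each call, while a null seed generally does not.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of LM-Kit: mimics "shuffle, then take maxSamples".
public static class SampleSelection
{
    public static List<(string, bool)> Select(
        IReadOnlyList<(string, bool)> source,
        int maxSamples = 1000,
        bool shuffle = true,
        int? seed = null)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (maxSamples < 0) throw new ArgumentOutOfRangeException(nameof(maxSamples));

        // Copy so the caller's collection is never reordered.
        var items = source.ToList();

        if (shuffle)
        {
            // A seeded Random yields a repeatable order; an unseeded one does not.
            var rng = seed.HasValue ? new Random(seed.Value) : new Random();

            // Fisher-Yates shuffle: swap each position with a random position at or before it.
            for (int i = items.Count - 1; i > 0; i--)
            {
                int j = rng.Next(i + 1);
                (items[i], items[j]) = (items[j], items[i]);
            }
        }

        // Truncate after shuffling so the result is a random subset, not a prefix.
        return items.Take(maxSamples).ToList();
    }
}
```

Under these assumptions, passing shuffle: false returns the first maxSamples entries in their stored order, so the seed has no effect in that case.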
Returns
- List<(string, bool)>
A list of tuples, each pairing a text sample (string) with a label (bool) that indicates whether the text is sarcastic.
Examples
using System;
using LMKit.TextAnalysis;

// Retrieve up to 500 samples, shuffled with a fixed seed
var trainingData = SarcasmDetection.GetTrainingData(
    SarcasmDetection.TrainingDataset.OpenAIEvalsSarcasm,
    maxSamples: 500,
    shuffle: true,
    seed: 42);

// Each element is a (text, label) tuple; deconstruct it for readability
foreach (var (text, isSarcastic) in trainingData)
{
    Console.WriteLine($"Text: {text}, Is Sarcastic: {isSarcastic}");
}
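A common next step is to hold out part of the returned list for validation. The following is a minimal sketch that operates on any List<(string, bool)> of the shape returned above. TrainingDataSplit, Split, and the 0.8 default are hypothetical choices for illustration, not part of LM-Kit.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of LM-Kit: splits labeled samples into two lists.
public static class TrainingDataSplit
{
    public static (List<(string Text, bool IsSarcastic)> Train,
                   List<(string Text, bool IsSarcastic)> Validation)
        Split(IReadOnlyList<(string Text, bool IsSarcastic)> samples,
              double trainFraction = 0.8)
    {
        if (samples == null) throw new ArgumentNullException(nameof(samples));
        if (trainFraction < 0.0 || trainFraction > 1.0)
            throw new ArgumentOutOfRangeException(nameof(trainFraction));

        // The leading portion becomes the training set; the remainder is held out.
        int trainCount = (int)(samples.Count * trainFraction);

        var train = samples.Take(trainCount).ToList();
        var validation = samples.Skip(trainCount).ToList();
        return (train, validation);
    }

    // Counts positive labels, which helps spot a badly imbalanced split.
    public static int CountSarcastic(IEnumerable<(string Text, bool IsSarcastic)> samples)
        => samples.Count(s => s.IsSarcastic);
}
```

Because this split is positional, it is only representative when the input order is already random, for example when the data was retrieved with shuffle: true.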
Remarks
This method provides predefined datasets that can be used for training or fine-tuning the sarcasm detection model.
Exceptions
- ArgumentException
Thrown if the dataset is not recognized.