Method GetTrainingData
- Namespace: LMKit.TextAnalysis
- Assembly: LM-Kit.NET.dll
GetTrainingData(TrainingDataset, int, bool, int?, bool)
Retrieves training data for fine-tuning a sentiment analysis model from the specified dataset.
public static List<(string, SentimentAnalysis.SentimentCategory)> GetTrainingData(SentimentAnalysis.TrainingDataset dataset, int maxSamples = 1000, bool shuffle = true, int? seed = null, bool neutralSupport = true)
Parameters
- dataset (SentimentAnalysis.TrainingDataset)
The dataset from which to retrieve the training data.
- maxSamples (int)
The maximum number of samples to retrieve from the dataset. The default is 1000.
- shuffle (bool)
Indicates whether to shuffle the dataset before selecting samples. The default is true.
- seed (int?)
An optional seed for the random number generator used when shuffling. If null, the shuffle operation will not be seeded.
- neutralSupport (bool)
Specifies whether support for neutral samples should be included. The default is true.
Returns
- List<(string, SentimentAnalysis.SentimentCategory)>
A list of tuples, where each tuple contains a string (the text) and a SentimentAnalysis.SentimentCategory (the sentiment label).
Examples
// Retrieve training data
var trainingData = SentimentAnalysis.GetTrainingData(
    SentimentAnalysis.TrainingDataset.LMKit2024_09_INT,
    maxSamples: 500,
    shuffle: true,
    seed: 42,
    neutralSupport: true);

// Use the training data as needed
foreach (var sample in trainingData)
{
    Console.WriteLine($"Text: {sample.Item1}, Sentiment: {sample.Item2}");
}
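Before fine-tuning, it can be useful to check how the labels in the returned list are distributed. The following is a minimal sketch, not part of LM-Kit: `LabelStats` and `CountByLabel` are hypothetical names introduced here for illustration, and the helper is generic so that it works on any list of (text, label) tuples.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LabelStats
{
    // Hypothetical helper (not an LM-Kit API): counts how many samples
    // carry each label in a sequence of (text, label) tuples.
    public static Dictionary<TLabel, int> CountByLabel<TLabel>(
        IEnumerable<(string Text, TLabel Label)> samples) where TLabel : notnull
    {
        return samples
            .GroupBy(s => s.Label)                      // one group per distinct label
            .ToDictionary(g => g.Key, g => g.Count());  // label -> number of samples
    }
}
```

Assuming the tuple shape documented under Returns, the list from GetTrainingData could be passed directly, for example `LabelStats.CountByLabel(trainingData)`, and the resulting dictionary printed with a `foreach` loop.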
Remarks
This method returns samples from predefined datasets that can be used for training or fine-tuning the sentiment analysis model.
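The way shuffle, seed, and maxSamples interact, as described under Parameters, can be pictured with the sketch below. It is an illustration under stated assumptions, not the library's implementation: `SamplingSketch` and `SelectSamples` are hypothetical names, and the Fisher-Yates shuffle is simply one common way to realize a seeded shuffle followed by truncation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SamplingSketch
{
    // Hypothetical helper (not an LM-Kit API): mirrors the documented
    // parameter semantics on an arbitrary in-memory list.
    public static List<T> SelectSamples<T>(
        IReadOnlyList<T> source, int maxSamples = 1000, bool shuffle = true, int? seed = null)
    {
        var items = new List<T>(source);

        if (shuffle)
        {
            // A fixed seed yields a repeatable order; null uses an unseeded generator.
            var rng = seed.HasValue ? new Random(seed.Value) : new Random();

            // Fisher-Yates shuffle: swap each position with a random earlier-or-equal one.
            for (int i = items.Count - 1; i > 0; i--)
            {
                int j = rng.Next(i + 1);
                (items[i], items[j]) = (items[j], items[i]);
            }
        }

        // Keep at most maxSamples items from the (possibly shuffled) list.
        return items.Take(maxSamples).ToList();
    }
}
```

In this sketch, two calls with the same seed select the same samples in the same order, while `shuffle: false` simply takes the first maxSamples items in their original order.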
Exceptions
- ArgumentException
Thrown if the dataset is not recognized.