Class Categorization
- Namespace
- LMKit.TextAnalysis
- Assembly
- LM-Kit.NET.dll
Provides functionality to classify content into predefined categories using a language model.
public sealed class Categorization
- Inheritance
- object
Categorization
Examples
Example: Basic text classification
using LMKit.Model;
using LMKit.TextAnalysis;
using System;
using System.Collections.Generic;
// Load the language model
LM model = LM.LoadFromModelID("lmkit-tasks:4b-preview");
// Create the categorization engine
Categorization categorizer = new Categorization(model);
// Define categories
var categories = new List<string> { "sports", "technology", "politics", "entertainment" };
// Classify text
string article = "The team won the championship in overtime.";
int bestIndex = categorizer.GetBestCategory(categories, article);
Console.WriteLine($"Category: {categories[bestIndex]}");
Console.WriteLine($"Confidence: {categorizer.Confidence:P1}");
Example: Classification with category descriptions
using LMKit.Model;
using LMKit.TextAnalysis;
using System;
using System.Collections.Generic;
LM model = LM.LoadFromModelID("lmkit-tasks:4b-preview");
Categorization categorizer = new Categorization(model);
// Categories with descriptions for better accuracy
var categories = new List<string> { "urgent", "normal", "low" };
var descriptions = new List<string>
{
"Critical issues requiring immediate attention",
"Standard requests with normal priority",
"Minor issues that can wait"
};
string ticket = "Server is down, customers cannot access the website!";
int priority = categorizer.GetBestCategory(categories, descriptions, ticket);
Console.WriteLine($"Priority: {categories[priority]}");
Remarks
This engine supports categorization tasks for various content types, including plain text, images, PDF documents, HTML files, and Microsoft Office formats (DOCX, XLSX, PPTX).
For a complete list of supported file formats, see the Attachment class documentation.
Key Features
- Single-category classification with GetBestCategory(IList<string>, string, bool, CancellationToken)
- Multi-category classification with GetTopCategories(IList<string>, string, int, bool, CancellationToken)
- Support for category descriptions to improve accuracy
- Confidence scoring via Confidence property
- Optional embedding-based classification for performance
- Multimodal support for text, images, and documents
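The multi-category methods listed above can be consumed with a small helper like the one below. This is a minimal sketch, not part of LM-Kit: it assumes GetTopCategories returns a collection of integer indices into the category list, as GetBestCategory does in the examples above. The names CategoryIndexMapper and ToLabels are hypothetical.

```csharp
using System.Collections.Generic;

// Hypothetical helper, not part of LM-Kit.
public static class CategoryIndexMapper
{
    // Translates category indices into their labels, skipping any index
    // that falls outside the category list. An empty index sequence
    // yields an empty label list.
    public static List<string> ToLabels(IList<string> categories, IEnumerable<int> indices)
    {
        var labels = new List<string>();
        foreach (int index in indices)
        {
            if (index >= 0 && index < categories.Count)
            {
                labels.Add(categories[index]);
            }
        }
        return labels;
    }
}
```

Under that assumption, the result of a call such as categorizer.GetTopCategories(categories, article, 2) could be passed as the indices argument to obtain up to two labels.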
Constructors
- Categorization(LM)
Initializes a new instance of the Categorization class using the specified language model.
Properties
- AllowUnknownCategory
Gets or sets a value indicating whether the categorization engine can identify when content does not match any of the predefined categories. If set to true, the engine returns -1 (for single-category methods) or an empty list (for multi-category methods) when no suitable category is identified. If set to false, the engine always assigns the content to the nearest matching category, even if confidence is low.
- Confidence
Gets the confidence level of the most recent categorization operation.
- Guidance
Gets or sets optional guidance text that can influence the categorization process. This can be used to steer the model toward certain themes or constraints.
- MaxInputTokens
Gets or sets the maximum number of input tokens that the categorization engine will analyze.
- Model
Gets the underlying language model instance associated with this categorization object.
- PreferredInferenceModality
Gets or sets the preferred modality for inference. This determines whether text, image, or both modalities are used when processing input. Defaults to Multimodal.
- UseEmbeddingClassifier
Gets or sets a value indicating whether the classifier should use an embedding-based classification strategy. When set to true, the engine is forced to classify using embeddings instead of a completion-based approach.
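When AllowUnknownCategory is true, a single-category call may return -1, so indexing the category list directly with the result (as the examples above do) would throw. The helper below is a minimal sketch of a guard for that case. CategoryResult, Describe, and the "unknown" fallback label are hypothetical names introduced for illustration and are not part of LM-Kit.

```csharp
using System.Collections.Generic;

// Hypothetical helper, not part of LM-Kit.
public static class CategoryResult
{
    // Returns the label at the given index, or the fallback label when the
    // index is -1 (no suitable category) or otherwise out of range.
    public static string Describe(IList<string> categories, int index, string fallback = "unknown")
    {
        return index >= 0 && index < categories.Count ? categories[index] : fallback;
    }
}
```

With categorizer.AllowUnknownCategory set to true, the value returned by GetBestCategory can be passed as index in place of the direct categories[bestIndex] lookup.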
Methods
- GetBestCategory(IList<string>, Attachment, bool, CancellationToken)
Classifies the content provided as an attachment into one of the predefined categories.
- GetBestCategory(IList<string>, ImageBuffer, bool, CancellationToken)
Classifies the content provided in an image into one of the predefined categories. The underlying language model must support vision input.
- GetBestCategory(IList<string>, IList<string>, Attachment, bool, CancellationToken)
Classifies the content provided as an attachment into one of the predefined categories, using the supplied category descriptions.
- GetBestCategory(IList<string>, IList<string>, string, bool, CancellationToken)
Classifies the specified plain text into one of the predefined categories, using the supplied category descriptions.
- GetBestCategory(IList<string>, string, bool, CancellationToken)
Classifies the specified plain text into one of the predefined categories.
- GetBestCategoryAsync(IList<string>, Attachment, bool, CancellationToken)
Asynchronously classifies the content provided as an attachment into one of the predefined categories.
- GetBestCategoryAsync(IList<string>, IList<string>, Attachment, bool, CancellationToken)
Asynchronously classifies the content provided as an attachment into one of the predefined categories, using the supplied category descriptions.
- GetBestCategoryAsync(IList<string>, IList<string>, ImageBuffer, bool, CancellationToken)
Asynchronously classifies the given image into one of the predefined categories, using the supplied category descriptions. The underlying language model must support vision input.
- GetBestCategoryAsync(IList<string>, IList<string>, string, bool, CancellationToken)
Asynchronously classifies the specified plain text into one of the predefined categories, using the supplied category descriptions.
- GetBestCategoryAsync(IList<string>, string, bool, CancellationToken)
Asynchronously classifies the specified plain text into one of the predefined categories.
- GetTopCategories(IList<string>, Attachment, int, bool, CancellationToken)
Classifies the content provided as an attachment into up to a specified maximum number of categories.
- GetTopCategories(IList<string>, ImageBuffer, int, bool, CancellationToken)
Classifies the given image into up to a specified maximum number of categories. The underlying language model must support vision input.
- GetTopCategories(IList<string>, IList<string>, Attachment, int, bool, CancellationToken)
Classifies the content provided as an attachment into up to a specified maximum number of categories, using the supplied category descriptions.
- GetTopCategories(IList<string>, IList<string>, string, int, bool, CancellationToken)
Classifies the specified plain text into up to a specified maximum number of categories, using the supplied category descriptions.
- GetTopCategories(IList<string>, string, int, bool, CancellationToken)
Classifies the specified plain text into up to a specified maximum number of categories.
- GetTopCategoriesAsync(IList<string>, Attachment, int, bool, CancellationToken)
Asynchronously classifies the content provided as an attachment into up to a specified maximum number of categories.
- GetTopCategoriesAsync(IList<string>, ImageBuffer, int, bool, CancellationToken)
Asynchronously classifies the given image into up to a specified maximum number of categories. The underlying language model must support vision input.
- GetTopCategoriesAsync(IList<string>, IList<string>, Attachment, int, bool, CancellationToken)
Asynchronously classifies the content provided as an attachment into up to a specified maximum number of categories, using the supplied category descriptions.
- GetTopCategoriesAsync(IList<string>, IList<string>, string, int, bool, CancellationToken)
Asynchronously classifies the specified plain text into up to a specified maximum number of categories, using the supplied category descriptions.
- GetTopCategoriesAsync(IList<string>, string, int, bool, CancellationToken)
Asynchronously classifies the specified plain text into up to a specified maximum number of categories.