Method Tokenize
- Namespace: LMKit.Tokenization
- Assembly: LM-Kit.NET.dll
Tokenize(string)
Tokenizes the given text into an array of token identifiers. Special tokens are added and parsed based on the configuration.
public int[] Tokenize(string text)
Parameters
- text (string)
The text to tokenize.
Returns
- int[]
An array of integers where each entry represents a token identifier.
Examples
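using LMKit.Model;
using System;

// Load the model by its identifier.
LM model = LM.LoadFromModelID("llama-3.2-1b");

// Convert the text into an array of token identifiers.
int[] tokens = model.Vocabulary.Tokenize("Hello, world!");

// Print the number of tokens, then the identifiers themselves.
Console.WriteLine($"Token count: {tokens.Length}");
Console.WriteLine($"Tokens: [{string.Join(", ", tokens)}]");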