Class MirostatSampling
- Namespace: LMKit.TextGeneration.Sampling
- Assembly: LM-Kit.NET.dll
Specifies the Mirostat sampling strategy, a neural text decoding algorithm that directly controls perplexity.
Mirostat keeps the perplexity of the generated text within a predefined range throughout the generation process.
It balances coherence and diversity, avoiding the poor output that results from either excessive repetition ("boredom traps") or lapses in coherence ("confusion traps").
The algorithm is described in the paper "Mirostat: A Neural Text Decoding Algorithm that Directly Controls Perplexity": https://arxiv.org/abs/2007.14966
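To make the feedback loop concrete, here is a minimal, library-independent sketch of one sampling step. It follows the simplified variant often called Mirostat 2.0 rather than the exact procedure in the paper, and it is not taken from LM-Kit: the function name `mirostat_step` and its parameters are hypothetical, chosen only to mirror the `TargetEntropy`, `LearningRate`, and `Seed` properties documented on this page. Surprise is measured in bits, and the threshold `mu` is assumed to start at twice the target, as is commonly done.

```python
import math
import random


def mirostat_step(probs, mu, target_entropy, learning_rate, rng):
    """One step of a simplified Mirostat-style sampler (illustrative only).

    probs          -- token probabilities from the model (sum to 1)
    mu             -- current surprise threshold, in bits
    target_entropy -- desired average surprise per token, in bits
    learning_rate  -- how strongly mu reacts to the observed error
    rng            -- random.Random instance (seeded for reproducibility)
    Returns (token_index, updated_mu).
    """
    # Keep only tokens whose surprise -log2(p) does not exceed mu.
    kept = [(i, p) for i, p in enumerate(probs)
            if p > 0 and -math.log2(p) <= mu]
    if not kept:
        # Always keep at least the most likely token.
        best = max(range(len(probs)), key=lambda i: probs[i])
        kept = [(best, probs[best])]

    # Renormalize the surviving tokens and sample one of them.
    total = sum(p for _, p in kept)
    indices = [i for i, _ in kept]
    weights = [p / total for _, p in kept]
    pos = rng.choices(range(len(kept)), weights=weights, k=1)[0]

    # Observed surprise of the sampled token under the renormalized
    # distribution (an assumption of this sketch).
    observed = -math.log2(weights[pos])

    # Feedback: move mu against the error between observed and target.
    error = observed - target_entropy
    return indices[pos], mu - learning_rate * error


# Usage: mu is assumed to start at twice the target entropy.
target = 3.0
rng = random.Random(1234)
token, mu = mirostat_step([0.5, 0.25, 0.125, 0.125], 2 * target, target, 0.1, rng)
```

When sampled tokens are more surprising than the target, `mu` shrinks and the next step truncates more aggressively; when they are less surprising, `mu` grows and more candidates are admitted.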
public class MirostatSampling : TokenSampling
- Inheritance: TokenSampling → MirostatSampling
Properties
- LearningRate
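Specifies the learning rate, which controls how quickly the algorithm adjusts in response to the surprise of each generated token.
Use a floating-point value within the range [0, 1]; lower values adjust more slowly, higher values more quickly.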
- Seed
Specifies the seed used for random number generation.
If set, the seed ensures reproducibility of the sampling process by controlling the randomness in token generation.
When not set (null), the model's behavior is non-deterministic as it relies on a system-generated random seed.
Use an unsigned integer (uint) value to define the seed for reproducibility, or leave it null for standard random behavior.
- TargetEntropy
Specifies the desired target cross-entropy (or surprise) value to be attained for the generated text.
A higher value corresponds to more surprising or less predictable text, while a lower value corresponds to less surprising or more predictable text.
Use a floating-point value within the range [0, 10].
- Temperature
Specifies the randomness level of the output.
Lowering the temperature leads to less random completions.
As the temperature approaches zero, the model becomes more deterministic and repetitive.
Use a floating-point value within the range [0 (more deterministic), 1 (more random)].
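
The effect of `Temperature` and `Seed` described above can be illustrated with a small, library-independent sketch. Nothing here is LM-Kit code: `apply_temperature`, `entropy_bits`, and `sample_tokens` are hypothetical helpers, and the logits are made-up values. The sketch assumes temperature is applied by dividing the logits before the softmax, which is the usual convention.

```python
import math
import random


def apply_temperature(logits, temperature):
    """Softmax over logits divided by the temperature (must be > 0)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    norm = sum(exps)
    return [e / norm for e in exps]


def entropy_bits(probs):
    """Shannon entropy of a distribution, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def sample_tokens(probs, count, seed=None):
    """Draw `count` token indices; a fixed seed makes the draw repeatable."""
    rng = random.Random(seed)  # seed=None falls back to system entropy
    return rng.choices(range(len(probs)), weights=probs, k=count)


logits = [2.0, 1.0, 0.5, 0.1]          # illustrative values only
sharp = apply_temperature(logits, 0.2)  # low temperature: peaked
flat = apply_temperature(logits, 1.0)   # high temperature: more spread out

first_run = sample_tokens(flat, 20, seed=42)
second_run = sample_tokens(flat, 20, seed=42)  # same seed, same sequence
```

The low-temperature distribution concentrates probability on the top token and has lower entropy, which is why output becomes more deterministic and repetitive as the temperature approaches zero. Reusing the same seed reproduces the same sequence of draws, while omitting it gives a different sequence on each run.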