Class MirostatSampling

Namespace
LMKit.TextGeneration.Sampling
Assembly
LM-Kit.NET.dll

Specifies the Mirostat sampling strategy, a neural text decoding algorithm that directly controls perplexity.
Mirostat keeps the quality of generated text within a predefined range throughout the generation process.
It balances coherence and diversity, avoiding the low-quality output caused either by excessive repetition ("boredom traps") or by loss of coherence ("confusion traps").
The algorithm is described in the paper "Mirostat: A Neural Text Decoding Algorithm that Directly Controls Perplexity": https://arxiv.org/abs/2007.14966

public class MirostatSampling : TokenSampling
Inheritance
MirostatSampling

Properties

LearningRate

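Specifies the learning rate, which controls how quickly the algorithm corrects the gap between the observed surprise of generated tokens and the target entropy.
Use a floating-point value within the range [0, 1]. Lower values adjust more slowly; higher values react more strongly to each generated token.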

Seed

Specifies the seed used for random number generation.
If set, the seed makes the sampling process reproducible by fixing the random choices made during token generation.
When not set (null), a system-generated random seed is used and the output is non-deterministic.
Use an unsigned integer (uint) value for reproducibility, or leave it null for standard random behavior.
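
The effect of a fixed versus a null seed can be illustrated with a minimal, self-contained sketch. It uses System.Random as a stand-in for the library's internal generator, which is an assumption; SeedSketch and Draw are hypothetical names and are not part of LM-Kit.

```csharp
using System;

public static class SeedSketch
{
    // Hypothetical helper: draws `count` pseudo-random integers.
    // A non-null seed fixes the sequence; null falls back to a system-chosen seed.
    public static int[] Draw(uint? seed, int count)
    {
        Random rng = seed.HasValue
            ? new Random(unchecked((int)seed.Value)) // same seed, same sequence
            : new Random();                          // system-generated seed

        var draws = new int[count];
        for (int i = 0; i < count; i++)
        {
            draws[i] = rng.Next(1000);
        }
        return draws;
    }
}
```

Calling SeedSketch.Draw(42u, 5) twice yields the same five numbers, while SeedSketch.Draw(null, 5) generally differs between calls.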

TargetEntropy

Specifies the desired target cross-entropy (or surprise) value to be attained for the generated text.
A higher value corresponds to more surprising or less predictable text, while a lower value corresponds to less surprising or more predictable text.
Use a floating-point value within the range [0, 10].
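
The way the target entropy and the learning rate interact can be sketched as a feedback loop. The code below is a simplified illustration in the style of the Mirostat 2.0 variant from the paper, not LM-Kit's actual implementation: MirostatSketch, Step, and all parameter names are hypothetical. It assumes surprise is measured in bits and that the running cap mu starts at twice the target entropy, as the paper suggests.

```csharp
using System;
using System.Linq;

public static class MirostatSketch
{
    // One sampling step over a probability distribution (probs sums to 1).
    // mu is the current surprise cap in bits; it is updated in place.
    // Returns the index of the sampled token.
    public static int Step(double[] probs, ref double mu,
                           double targetEntropy, double learningRate, Random rng)
    {
        double cap = mu; // ref parameters cannot be captured by lambdas

        // Keep tokens whose surprise -log2(p) does not exceed the cap.
        // The most probable token is always kept so the set is never empty.
        int best = Array.IndexOf(probs, probs.Max());
        int[] kept = Enumerable.Range(0, probs.Length)
            .Where(i => i == best || (probs[i] > 0 && -Math.Log(probs[i], 2) <= cap))
            .ToArray();

        // Sample from the kept tokens in proportion to their probability.
        double total = kept.Sum(i => probs[i]);
        double r = rng.NextDouble() * total;
        int chosen = kept[kept.Length - 1];
        double acc = 0;
        foreach (int i in kept)
        {
            acc += probs[i];
            if (r < acc) { chosen = i; break; }
        }

        // Feedback: move the cap against the error between the observed
        // surprise (under the renormalized distribution) and the target.
        double surprise = -Math.Log(probs[chosen] / total, 2);
        mu -= learningRate * (surprise - targetEntropy);
        return chosen;
    }
}
```

A caller would initialize mu to 2 * targetEntropy and call Step once per generated token. When the sampled tokens are less surprising than the target, mu rises and admits more candidates; when they are more surprising, mu falls and the candidate set tightens.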

Temperature

Specifies the randomness level of the output.
Lowering the temperature leads to less random completions.
As the temperature approaches zero, the model becomes more deterministic and repetitive.
Use a floating-point value within the range [0 (more deterministic), 1 (more random)].
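
The effect of temperature is commonly explained as a rescaling of the model's logits before they are turned into probabilities. The sketch below shows that standard technique; whether LM-Kit applies temperature in exactly this way is an assumption, and TemperatureSketch and Softmax are hypothetical names. The sketch requires a temperature strictly greater than zero.

```csharp
using System;
using System.Linq;

public static class TemperatureSketch
{
    // Converts logits to probabilities after dividing them by the temperature.
    // Lower temperatures concentrate probability on the highest logit.
    public static double[] Softmax(double[] logits, double temperature)
    {
        if (temperature <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(temperature), "Must be greater than zero.");
        }

        // Subtract the maximum logit for numerical stability.
        double max = logits.Max();
        double[] exps = logits.Select(l => Math.Exp((l - max) / temperature)).ToArray();
        double sum = exps.Sum();
        return exps.Select(e => e / sum).ToArray();
    }
}
```

For the same logits, the top token receives a larger share of the probability at temperature 0.5 than at temperature 1, which is why lower temperatures produce more deterministic and repetitive output.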