Namespace LMKit.TextGeneration.Sampling
Classes
- Grammar
Represents a grammar that constrains the output of a text generation model. The Grammar class enforces the syntax and structure the grammar allows during generation, so developers can ensure that the model's output adheres to a predefined format, such as JSON, arithmetic expressions, or a custom-defined grammar.
Benefits of using the Grammar class include:
- Enforcing syntactic correctness in the generated output.
- Restricting the output to a specific format or language.
- Reducing the likelihood of invalid or nonsensical outputs.
- Facilitating the extraction and parsing of generated data.
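To illustrate the idea of constrained generation, here is a minimal conceptual sketch in Python. It is not the Grammar class or any part of the library's API: the vocabulary, the toy grammar (`NUMBER ('+' NUMBER)*`), and the function names are all hypothetical. It only shows the general technique of masking out tokens the grammar does not permit before picking the next token.

```python
import math

# Hypothetical toy vocabulary; not tied to any real model or tokenizer.
VOCAB = ["1", "2", "+", "hello", "<end>"]


def allowed_tokens(generated):
    """Toy grammar: NUMBER ('+' NUMBER)* followed by <end>."""
    if not generated or generated[-1] == "+":
        return {"1", "2"}      # a number must come next
    return {"+", "<end>"}      # after a number: an operator or stop


def constrained_greedy_step(logits, generated):
    """Pick the highest-scoring token among those the grammar permits."""
    allowed = allowed_tokens(generated)
    # Tokens outside the grammar get a score of -inf so they can never win.
    masked = [score if tok in allowed else -math.inf
              for tok, score in zip(VOCAB, logits)]
    best = max(range(len(VOCAB)), key=masked.__getitem__)
    return VOCAB[best]


def generate(logits_per_step):
    """Run the constrained step over a list of per-step score vectors."""
    out = []
    for logits in logits_per_step:
        tok = constrained_greedy_step(logits, out)
        if tok == "<end>":
            break
        out.append(tok)
    return out
```

Even when the unconstrained scores strongly favor an off-grammar token such as `"hello"`, the masked step can only return tokens that keep the output a valid expression.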
- GreedyDecoding
Handles the greedy decoding strategy.
This algorithm always selects the token with the highest probability, which makes generation fully deterministic.
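As a conceptual illustration only (plain Python, not the library's API; the function name is hypothetical), greedy selection amounts to taking the index of the largest score:

```python
def greedy_pick(logits):
    """Return the index of the highest-scoring token.

    Ties resolve to the lowest index, so the same input always
    yields the same output.
    """
    return max(range(len(logits)), key=lambda i: logits[i])
```

Because no randomness is involved, repeated calls with the same scores always return the same token index.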
- LogitBias
Handles the rules for applying logit bias during token sampling.
Logit bias can prevent specific text chunks from being generated, or raise or lower the likelihood of a word appearing.
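The general mechanism can be sketched in a few lines of Python. This is a conceptual example under assumptions, not the LogitBias class: the function names and the convention that a bias of negative infinity bans a token are illustrative choices.

```python
import math


def apply_logit_bias(logits, bias):
    """Return a copy of `logits` with per-token biases added.

    `bias` maps a token index to an additive offset. A positive value
    makes the token more likely, a negative value less likely, and
    -inf removes it from consideration entirely (an assumption of
    this sketch).
    """
    out = list(logits)
    for token_id, offset in bias.items():
        out[token_id] += offset
    return out


def softmax(logits):
    """Convert scores to probabilities (numerically stable form)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

For instance, starting from three equally scored tokens, biasing one by `-math.inf` drives its probability to zero, while a positive bias on another raises its share relative to the untouched token.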
- Mirostat2Sampling
Specifies version 2 of the Mirostat sampling strategy, a neural text decoding algorithm that directly controls perplexity.
Mirostat adjusts sampling throughout generation to keep the quality of the generated text within a predefined range.
It balances coherence against diversity, avoiding both excessive repetition ("boredom traps") and lapses in coherence ("confusion traps").
The Mirostat algorithm is described in the paper https://arxiv.org/abs/2007.14966
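The feedback loop behind Mirostat version 2 can be sketched as follows. This is a simplified Python illustration, not the Mirostat2Sampling class: the function name, parameter names (`tau` for the target surprise, `eta` for the learning rate, `mu` for the running threshold), and default values are assumptions, and measuring the observed surprise on the renormalized distribution is one choice among those made by different implementations.

```python
import math
import random


def softmax(logits):
    """Convert scores to probabilities (numerically stable form)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mirostat_v2_step(logits, mu, tau=5.0, eta=0.1, rng=random):
    """Sample one token and return (token_index, updated_mu)."""
    probs = softmax(logits)
    # Keep only tokens whose surprise, -log2(p), does not exceed mu.
    kept = [i for i, p in enumerate(probs)
            if p > 0 and -math.log2(p) <= mu]
    if not kept:
        # Always keep at least the most probable token.
        kept = [max(range(len(probs)), key=probs.__getitem__)]
    # Renormalize over the surviving tokens and sample from them.
    total = sum(probs[i] for i in kept)
    weights = [probs[i] / total for i in kept]
    token = rng.choices(kept, weights=weights, k=1)[0]
    # Move mu toward the target: shrink it after a surprising token,
    # grow it after a predictable one.
    observed = -math.log2(weights[kept.index(token)])
    mu = mu - eta * (observed - tau)
    return token, mu
```

A caller would carry `mu` from one step to the next; starting it at `2 * tau` follows the initialization given in the paper.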
- MirostatSampling
Specifies the Mirostat sampling strategy, a neural text decoding algorithm that directly controls perplexity.
Mirostat adjusts sampling throughout generation to keep the quality of the generated text within a predefined range.
It balances coherence against diversity, avoiding both excessive repetition ("boredom traps") and lapses in coherence ("confusion traps").
The Mirostat algorithm is described in the paper https://arxiv.org/abs/2007.14966
- RandomSampling
Handles the random sampling strategy (also known as temperature-based sampling).
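As a conceptual illustration of temperature-based sampling (plain Python, not the RandomSampling class; function and parameter names are hypothetical), scores are divided by a temperature before being turned into probabilities and sampled:

```python
import math
import random


def temperature_probs(logits, temperature=1.0):
    """Scale scores by 1/temperature, then apply a stable softmax."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_with_temperature(logits, temperature=1.0, rng=random):
    """Draw one token index from the temperature-scaled distribution."""
    probs = temperature_probs(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Low temperatures concentrate probability on the top-scoring token, so sampling behaves almost like greedy decoding; high temperatures flatten the distribution and produce more varied output.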
- RepetitionPenalty
Handles the rules for repetition penalties applied during text completion.
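One widely used formulation of a repetition penalty can be sketched in Python as below. This is an assumption for illustration, not a description of how the RepetitionPenalty class computes its penalties: the function name, the `penalty` parameter, and the divide-or-multiply rule are choices of this sketch.

```python
def apply_repetition_penalty(logits, previous_tokens, penalty=1.1):
    """Return a copy of `logits` with already-seen tokens made less likely.

    For each token index that appears in `previous_tokens`, a positive
    score is divided by `penalty` and a non-positive score is multiplied
    by it, so the score moves downward in both cases when penalty > 1.
    """
    out = list(logits)
    for token_id in set(previous_tokens):
        if out[token_id] > 0:
            out[token_id] /= penalty
        else:
            out[token_id] *= penalty
    return out
```

Each distinct token is penalized once regardless of how many times it appeared, and a `penalty` of 1.0 leaves the scores unchanged.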
- TokenSampling
Handles the sampling strategy used during text completion.
Enums
- Grammar.PredefinedGrammar
Defines the types of predefined grammar rules available for use in text generation.
- LogitBiasSetMode
Defines the modes for updating bias values in a bias configuration.
- RandomSampling.RandomSamplers
Enumerates the samplers available within the RandomSampling strategy, each with its own selection mechanism.