Class Configuration
A static class providing global configuration settings for the LM-Kit runtime.
public static class Configuration
- Inheritance: object → Configuration
Fields
- DownloadChunkSize
Gets or sets the size of the download chunks. This value determines the size of each chunk when downloading data.
- EnableCompletionHealing
Gets or sets a value indicating whether completion healing is enabled. Completion healing attempts to correct errors during the completion process.
- EnableContextRecycling
Gets or sets a value indicating whether context recycling is enabled. Context recycling improves performance by reusing previously computed context data.
- EnableContextTokenHealing
Gets or sets a value indicating whether context token healing is enabled. This setting helps correct errors introduced during context tokenization.
- EnableDynamicSampling
Gets or sets a value indicating whether the dynamic sampling strategy is enabled.
When enabled, the dynamic sampling strategy is applied during inference time to optimize model performance and the quality of generated outputs by utilizing multiple token selection methods.
Key features of dynamic sampling include:
- Dynamic Constrained Generation: Restricts the token space at each decoding step based on real-time conditions, ensuring relevance and adherence to specific constraints.
- Perplexity-Based Token Selection: Selects tokens that minimize perplexity, enhancing the coherence and contextual consistency of the generated output.
- Context-Aware Sampling: Leverages predefined contextual data to guide token choices, resulting in more fluent and contextually appropriate completions.
- Speculative Sampling: Incorporates speculative sampling techniques based on real-time natural language processing (NLP) analysis during the decoding process.
- Adaptive Model Compatibility: Eliminates the need for model fine-tuning to achieve high accuracy. The strategy adapts to the model's stylistic preferences during inference while maintaining low perplexity for future token selections.
Dynamic sampling acts as a real-time "voting" mechanism, blending constrained sampling with speculative sampling based on the current decoding state during inference.
This strategy is particularly effective at reducing inference times while improving accuracy and quality. It excels in tasks such as function calling, classification, and information extraction.
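As a usage sketch, dynamic sampling is a process-wide switch that can be toggled before inference. The namespace below (LMKit.Global) is an assumption, not confirmed by this page; check your LM-Kit installation for the actual location of Configuration.

```csharp
// Hypothetical usage sketch; the namespace is assumed.
using LMKit.Global;

class Program
{
    static void Main()
    {
        // Enable the dynamic sampling strategy before any inference runs, so
        // deterministic tasks such as function calling, classification, and
        // information extraction benefit from constrained, perplexity-aware
        // token selection.
        Configuration.EnableDynamicSampling = true;
    }
}
```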
- EnableKVCacheRecycling
Gets or sets a value indicating whether KV (key-value) cache recycling is enabled. KV cache recycling can help reduce memory footprint and improve performance.
- EnableModelCache
Gets or sets a value indicating whether model caching is enabled. Model caching improves performance by reusing loaded models instead of reloading them.
- EnableTokenHealing
Gets or sets a value indicating whether token healing is enabled. Token healing attempts to correct errors in tokenization.
- EnableTokenizationCache
Gets or sets a value indicating whether tokenization caching is enabled. Tokenization caching improves performance by storing the results of tokenization operations.
- MaxCachedContextLength
Gets or sets the maximum length of cached context. This value determines the maximum length of context data that can be cached.
- TokenizerCacheLength
Gets or sets the length of the tokenizer cache. This value determines how many tokens are stored in the cache.
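The caching fields above can be tuned together. The following sketch uses the field names from this page, but the namespace and the concrete values are assumptions; the page does not state the units for the two length limits, so confirm them against the library before relying on specific numbers.

```csharp
// Hypothetical configuration sketch; namespace and values are assumed.
using LMKit.Global;

class CacheTuning
{
    static void Main()
    {
        // Store tokenization results for reuse across repeated prompts.
        Configuration.EnableTokenizationCache = true;

        // Illustrative limits only — units are not documented on this page.
        Configuration.TokenizerCacheLength = 4096;
        Configuration.MaxCachedContextLength = 8192;
    }
}
```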
Properties
- MinContextSize
Gets or sets the minimum context size. This value determines the smallest context size allowed.
- ThreadCount
Gets or sets the maximum number of threads used for processing. This value defaults to the number of physical cores available on the machine.
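Because ThreadCount defaults to the physical core count, lowering it is mainly useful on machines shared with other workloads. A minimal sketch, again assuming the LMKit.Global namespace and treating DownloadChunkSize as a byte count (neither is confirmed by this page):

```csharp
// Hypothetical configuration sketch; namespace and units are assumed.
using System;
using LMKit.Global;

class Setup
{
    static void Main()
    {
        // Cap worker threads at half the logical core count, e.g. when the
        // process shares the machine with other services.
        Configuration.ThreadCount = Math.Max(1, Environment.ProcessorCount / 2);

        // Larger download chunks reduce per-request overhead when fetching
        // models over a fast connection (8 MiB here is illustrative).
        Configuration.DownloadChunkSize = 8 * 1024 * 1024;
    }
}
```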