Class Configuration

Namespace
LMKit.Global
Assembly
LM-Kit.NET.dll

A static class providing global configuration settings for the LM-Kit runtime.

public static class Configuration
Inheritance
Configuration

Fields

DownloadChunkSize

Gets or sets the size of each chunk used when downloading data.

EnableCompletionHealing

Gets or sets a value indicating whether completion healing is enabled. Completion healing attempts to correct errors during the completion process.

EnableContextRecycling

Gets or sets a value indicating whether context recycling is enabled. Context recycling reuses context data to improve performance.

EnableContextTokenHealing

Gets or sets a value indicating whether context token healing is enabled. Context token healing attempts to correct errors in context tokenization.

EnableDynamicSampling

Gets or sets a value indicating whether the dynamic sampling strategy is enabled.

When enabled, the dynamic sampling strategy is applied at inference time to improve model performance and the quality of generated outputs by combining multiple token selection methods.

Key features of dynamic sampling include:

  • Dynamic Constrained Generation: Restricts the token space at each decoding step based on real-time conditions, ensuring relevance and adherence to specific constraints.
  • Perplexity-Based Token Selection: Selects tokens that minimize perplexity, enhancing the coherence and contextual consistency of the generated output.
  • Context-Aware Sampling: Leverages predefined contextual data to guide token choices, resulting in more fluent and contextually appropriate completions.
  • Speculative Sampling: Incorporates speculative sampling techniques based on real-time natural language processing (NLP) analysis during the decoding process.
  • Adaptive Model Compatibility: Eliminates the need for model fine-tuning to achieve high accuracy. The strategy adapts to the model's stylistic preferences during inference while maintaining low perplexity for future token selections.

Dynamic sampling acts as a real-time "voting" mechanism, blending constrained sampling with speculative sampling based on the current decoding state during inference.

This strategy is particularly effective at reducing inference times while improving accuracy and quality. It excels in tasks such as function calling, classification, and information extraction.
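As a minimal sketch, dynamic sampling can be toggled globally around an inference run. The snippet assumes that EnableDynamicSampling is a static Boolean field (the description suggests this, but the type is not stated here) and that the LM-Kit.NET assembly is referenced. The comparison scenario is hypothetical.

```csharp
using LMKit.Global;

// Assumption: EnableDynamicSampling is a static bool field on Configuration.
// Turn dynamic sampling off, for example to compare output against plain sampling.
Configuration.EnableDynamicSampling = false;

// ... run inference here ...

// Turn it back on for tasks such as function calling, classification,
// or information extraction.
Configuration.EnableDynamicSampling = true;
```

Because the setting is global, changing it affects every subsequent inference call in the process, not just one model instance.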

EnableKVCacheRecycling

Gets or sets a value indicating whether KV (key-value) cache recycling is enabled. KV cache recycling can help reduce memory footprint and improve performance.

EnableModelCache

Gets or sets a value indicating whether model caching is enabled. Model caching improves performance by reusing models.

EnableModelChecksumValidation

Gets or sets a value indicating whether model file integrity must be validated during loading.

EnableTokenHealing

Gets or sets a value indicating whether token healing is enabled. Token healing attempts to correct errors in tokenization.

EnableTokenizationCache

Gets or sets a value indicating whether tokenization caching is enabled. Tokenization caching improves performance by storing the results of tokenization operations.

MaxCachedContextLength

Gets or sets the maximum length of context data that can be cached.

TokenizerCacheLength

Gets or sets the length of the tokenizer cache, which determines how many tokens are stored in the cache.

UseAsyncModelAttributesLoading

Gets or sets a value indicating whether asynchronous loading of model attributes is enabled. Loading model attributes asynchronously can improve performance.

Properties

MinContextSize

Gets or sets the minimum context size, that is, the smallest context size allowed.

ModelStorageDirectory

Gets or sets the default storage path for model files. If the path does not exist when the property is set, it is created automatically.
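The following sketch shows one way to point model storage at a custom folder. It assumes ModelStorageDirectory is a static string property. The folder name and location are hypothetical choices for illustration.

```csharp
using System;
using System.IO;
using LMKit.Global;

// Hypothetical location: a "models" folder under the user's local application data.
string modelDir = Path.Combine(
    Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData),
    "models");

// Assumption: ModelStorageDirectory is a static string property.
// Per the description, the directory is created if it does not already exist.
Configuration.ModelStorageDirectory = modelDir;

Console.WriteLine(Configuration.ModelStorageDirectory);
```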

ThreadCount

Gets or sets the maximum number of threads to be used for processing. This value defaults to the number of physical cores available on the machine.
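As a rough illustration, an application that shares the machine with other workloads might lower the thread limit at startup. The sketch assumes ThreadCount is a static integer property. The "half of the logical processors" cap is an arbitrary example, not a recommendation from the library.

```csharp
using System;
using LMKit.Global;

// Assumption: ThreadCount is a static int property.
// Environment.ProcessorCount reports logical processors, which can exceed
// the physical core count that ThreadCount defaults to.
int cap = Math.Max(1, Environment.ProcessorCount / 2);

// Only lower the value; never raise it above its current setting.
Configuration.ThreadCount = Math.Min(Configuration.ThreadCount, cap);
```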