Class LLM
A class designed to manage instances of Large Language Models (LLMs) in GGUF format.
public sealed class LLM : IDisposable
- Inheritance
- object → LLM
- Implements
- IDisposable
Constructors
- LLM(string, DeviceConfiguration, LoadingOptions, ModelLoadingProgressCallback)
Creates an instance of the LLM class from a model file.
- LLM(Uri, string, DeviceConfiguration, LoadingOptions, ModelDownloadingProgressCallback, ModelLoadingProgressCallback)
Creates an instance of the LLM class from a System.Uri object.
Properties
- Architecture
Gets the architecture type of the model, for example 'llama', 'bert', or 'phi'.
- ChatTemplateFormat
Gets or sets the format of the model's chat template, as detected by LMKit.
- ContextLength
Gets the context size the model was trained on.
- Description
Specifies the model description.
- EmbeddingSize
Gets the dimension of embedding vectors produced by the model.
- GpuLayerCount
Gets the number of layers that have been loaded into GPU memory (VRAM).
- IsEmbeddingModel
Indicates whether the model primarily functions as an embedding model.
- LayerCount
Specifies the number of input layers in the model.
- MainGpu
Gets the GPU used for scratch and small tensors.
- ModelMetadata
Specifies metadata keys in the model.
- ModelPath
Gets the path of the model's file.
- ModelType
Gets the precision of the model input tensors.
- Name
Specifies the model name.
- ParameterCount
Gets the number of parameters of the model.
- RopeAlgorithm
Gets the type of RoPE (rotary positional embedding) algorithm used for positional encoding in the model.
- RopeFreqScaleTrain
Gets the RoPE frequency scaling factor that was used when the model was trained.
- Size
Gets the model size, in bytes.
- Vocabulary
Gets the model's vocabulary handler, offering tokenization features.
Methods
- ApplyLoraAdapter(LoraAdapterSource, int)
Applies a Low-Rank Adaptation (LoRA) transformation to the model weights, adjusting them dynamically using the adaptation parameters provided by a LoraAdapterSource instance.
- ApplyLoraAdapter(string, float, int)
Applies a Low-Rank Adaptation (LoRA) transformation to the model weights, adjusting them dynamically using the adaptation parameters stored in the specified file.
- ClearCache()
Removes all cached resources linked to this model instance from memory.
- Dispose()
Releases this instance and all associated unmanaged resources.
- GetDefaultModelStoragePath()
Retrieves the default storage path for model files within the application's data directory.
- ValidateFormat(string, bool)
Validates the model file to ensure it conforms to the GGUF format, including its header and overall structure.
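To illustrate what a GGUF header check involves, the following is a minimal, self-contained sketch. It is not LMKit's implementation of ValidateFormat, and the names GgufHeaderSketch and TryReadHeader are hypothetical. It relies only on the publicly documented GGUF layout, in which a file begins with the 4-byte ASCII magic "GGUF" followed by a little-endian 32-bit format version. It does not inspect the rest of the file structure.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration only; not part of LMKit.
public static class GgufHeaderSketch
{
    // Reads the fixed-size prefix of a GGUF file: the 4-byte magic "GGUF"
    // followed by a little-endian uint32 format version.
    // Returns false when the file is too short or the magic does not match.
    public static bool TryReadHeader(string path, out uint version)
    {
        version = 0;

        using var stream = File.OpenRead(path);
        using var reader = new BinaryReader(stream); // BinaryReader reads little-endian values

        // ReadBytes returns fewer than 4 bytes if the file ends early.
        byte[] magic = reader.ReadBytes(4);
        if (magic.Length < 4 || Encoding.ASCII.GetString(magic) != "GGUF")
        {
            return false;
        }

        // Make sure the version field is actually present before reading it.
        if (stream.Length - stream.Position < sizeof(uint))
        {
            return false;
        }

        version = reader.ReadUInt32();
        return true;
    }
}
```

A caller might run GgufHeaderSketch.TryReadHeader(path, out uint version) as a cheap pre-check before constructing an LLM instance. Because it covers only the first eight bytes, it is a weaker check than the header-and-structure validation that ValidateFormat is described as performing.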