Class LLM

Namespace
LMKit.Model
Assembly
LM-Kit.NET.dll

A class designed to manage instances of Large Language Models (LLMs) in GGUF format.

public sealed class LLM : IDisposable
Inheritance
object
LLM
Implements
IDisposable

Constructors

LLM(string, DeviceConfiguration, LoadingOptions, ModelLoadingProgressCallback)

Creates an instance of the LLM class from a file.

LLM(Uri, string, DeviceConfiguration, LoadingOptions, ModelDownloadingProgressCallback, ModelLoadingProgressCallback)

Creates an instance of the LLM class from a System.Uri object.

Properties

Architecture

Gets the architecture type of the model, e.g. 'llama', 'bert', 'phi'.

ChatTemplateFormat

Gets or sets the chat template format of the model, as detected by LM-Kit.

ContextLength

Gets the context size the model was trained on.

Description

Specifies the model description.

EmbeddingSize

Gets the dimension of embedding vectors produced by the model.

GpuLayerCount

Gets the number of layers that have been loaded into GPU memory (VRAM).

IsEmbeddingModel

Indicates whether the model primarily functions as an embedding model.

LayerCount

Specifies the number of input layers in the model.

MainGpu

Gets the GPU used for scratch and small tensors.

ModelMetadata

Specifies metadata keys in the model.

ModelPath

Gets the path of the model's file.

ModelType

Gets the precision of the model input tensors.

Name

Specifies the model name.

ParameterCount

Gets the number of parameters of the model.

RopeAlgorithm

Gets the type of RoPE (rotary positional embedding) algorithm used for positional encoding in the model.

RopeFreqScaleTrain

Gets the RoPE frequency scaling factor that was used when the model was trained.
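
To give a sense of what a RoPE frequency scaling factor does, the sketch below computes the rotation angle for each dimension pair under the common linear-scaling convention, angle = freqScale * position * freqBase^(-2i/d). This is an illustration only and is not taken from LM-Kit: the RopeSketch class, the RotationAngles method, and all parameter names are hypothetical, and whether the library applies the factor in exactly this way is an assumption.

```csharp
using System;

// Hypothetical helper, not part of LM-Kit.
public static class RopeSketch
{
    // Returns one rotation angle (in radians) per dimension pair.
    // Assumes the common convention where the scale multiplies the position.
    public static double[] RotationAngles(int position, int headDim, double freqBase, double freqScale)
    {
        if (headDim <= 0 || headDim % 2 != 0)
            throw new ArgumentException("headDim must be a positive even number.", nameof(headDim));

        var angles = new double[headDim / 2];
        for (int i = 0; i < angles.Length; i++)
        {
            // Each pair rotates at a lower frequency than the previous one.
            double inverseFrequency = Math.Pow(freqBase, -2.0 * i / headDim);
            angles[i] = freqScale * position * inverseFrequency;
        }
        return angles;
    }
}
```

Under this convention, a scale below 1 makes a given position rotate as if it were closer to the start of the sequence, which is how linear scaling stretches the usable context.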

Size

Gets the model size, in bytes.

Vocabulary

Gets the model's vocabulary handler, offering tokenization features.

Methods

ApplyLoraAdapter(LoraAdapterSource, int)

Applies a Low-Rank Adaptation (LoRA) transformation to the model weights using parameters from a LoraAdapterSource instance. This method adjusts the model weights dynamically based on the adaptation parameters provided in the LoraAdapterSource.

ApplyLoraAdapter(string, float, int)

Applies a Low-Rank Adaptation (LoRA) transformation to the model weights using parameters from a specified file. This method adjusts the model weights dynamically based on the adaptation parameters provided in the file.
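
As a rough numerical illustration of a LoRA update, the sketch below applies W' = W + scale * (B x A) to a small dense matrix, where B and A are the low-rank adapter factors. It does not reflect LM-Kit's internals: the LoraSketch class and its Apply method are hypothetical, and reading the float argument of ApplyLoraAdapter as a scale factor of this kind is an assumption.

```csharp
using System;

// Hypothetical helper, not part of LM-Kit.
public static class LoraSketch
{
    // Returns W + scale * (B x A), where W is (rows x cols),
    // B is (rows x rank) and A is (rank x cols).
    public static float[,] Apply(float[,] w, float[,] b, float[,] a, float scale)
    {
        int rows = w.GetLength(0);
        int cols = w.GetLength(1);
        int rank = b.GetLength(1);

        if (b.GetLength(0) != rows || a.GetLength(0) != rank || a.GetLength(1) != cols)
            throw new ArgumentException("Adapter shapes do not match the weight matrix.");

        var result = new float[rows, cols];
        for (int i = 0; i < rows; i++)
        {
            for (int j = 0; j < cols; j++)
            {
                // Accumulate the (i, j) entry of the low-rank product B x A.
                float delta = 0f;
                for (int k = 0; k < rank; k++)
                    delta += b[i, k] * a[k, j];

                result[i, j] = w[i, j] + scale * delta;
            }
        }
        return result;
    }
}
```

Because the rank is small compared with the matrix dimensions, the adapter stores far fewer values than the weight matrix it modifies.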

ClearCache()

Removes all cached resources linked to this model instance from memory.

Dispose()

Releases this instance and all associated unmanaged resources.

GetDefaultModelStoragePath()

Retrieves the default storage path for model files within the application's data directory.

ValidateFormat(string, bool)

Validates the model file to ensure it conforms to the GGUF format, including its header and overall structure.
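
For orientation, a GGUF file begins with the four ASCII bytes "GGUF" followed by a little-endian 32-bit version number. The sketch below checks only that prefix; it is not the logic behind ValidateFormat, which according to the description above also examines the overall structure. The GgufHeaderSketch class and the TryReadVersion method are hypothetical names used for illustration.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of LM-Kit.
public static class GgufHeaderSketch
{
    // Reads the first 8 bytes: the 4-byte magic "GGUF" followed by a
    // little-endian uint32 version. Returns false on short or mismatched input.
    public static bool TryReadVersion(Stream stream, out uint version)
    {
        version = 0;
        var header = new byte[8];
        int read = 0;
        while (read < header.Length)
        {
            int n = stream.Read(header, read, header.Length - read);
            if (n == 0)
                return false; // Stream ended before the header was complete.
            read += n;
        }

        if (header[0] != (byte)'G' || header[1] != (byte)'G' ||
            header[2] != (byte)'U' || header[3] != (byte)'F')
            return false;

        // Assemble the version explicitly so the result does not depend on host endianness.
        version = (uint)(header[4] | (header[5] << 8) | (header[6] << 16) | (header[7] << 24));
        return true;
    }
}
```

A caller could open the file with File.OpenRead and pass the resulting stream to TryReadVersion as a cheap pre-check before attempting a full load.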