Class MemoryEstimation
Provides methods for estimating memory requirements and fitting model and context parameters to available device memory using llama.cpp's native memory estimation.
public static class MemoryEstimation
- Inheritance: object → MemoryEstimation
Remarks
Unlike the heuristic-based approach in DeviceConfiguration, this class uses
llama.cpp's built-in llama_params_fit function to accurately probe available memory
across all devices (CPU + GPUs), accounting for KV cache, compute buffers, and tensor placement.
The estimation runs without loading the full model weights, making it suitable for pre-flight checks before allocating resources.
Methods
- FitParameters(LM, uint, uint)
Fits model and context parameters to available device memory using an already-loaded model instance.
- FitParameters(string, uint, uint, DeviceConfiguration)
Fits model and context parameters to available device memory from a model file path, determining the largest context size and GPU layer count that can be allocated without running out of memory.
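A minimal usage sketch of the path-based overload. The return type and its member names (`FittedContextSize`, `FittedGpuLayers`), the meaning of the two `uint` arguments, and the `DeviceConfiguration` construction are assumptions for illustration only; they are not documented on this page:

```csharp
// Hypothetical sketch: argument meanings and result shape are assumed,
// not confirmed by the API summary above.
var config = new DeviceConfiguration();        // assumed default-constructible

var fit = MemoryEstimation.FitParameters(
    "models/model.gguf",                       // path to the GGUF model file
    32768,                                     // requested context size, tokens (assumed)
    999,                                       // requested GPU layer count (assumed)
    config);

// Probing runs without loading full model weights, so this is cheap
// enough to use as a pre-flight check before allocating resources.
Console.WriteLine($"context: {fit.FittedContextSize}, GPU layers: {fit.FittedGpuLayers}");
```

Because the estimation accounts for KV cache, compute buffers, and tensor placement across CPU and GPUs, the fitted values can be passed directly to the subsequent model/context creation rather than guessed heuristically.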