Method GetOptimalContextSize

Namespace
LMKit.Graphics
Assembly
LM-Kit.NET.dll

GetOptimalContextSize()

Determines the optimal GPU context size based on the currently available GPU device's free memory.

public static int GetOptimalContextSize()

Returns

int

An int that represents an optimal context size for the detected GPU device. Typical values range from 2,048 to 32,768, depending on the available GPU memory.

Remarks

The method identifies the GPU device with the best performance characteristics and then queries its free memory. Based on predefined memory thresholds, it returns an integer representing an optimal context size for GPU-accelerated operations. The context size scales with available memory, allowing larger or more complex workloads to be processed efficiently.

If no suitable device is found or if the device's free memory does not meet any threshold, a default value is returned.
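
The threshold values themselves are not documented on this page; the following sketch only illustrates the shape of the mapping, with invented cut-off values that may differ from the library's actual ones:

// Illustrative only: invented thresholds showing how free GPU memory
// could map to a context size; the library's actual cut-offs may differ.
static int MapFreeMemoryToContextSize(long freeBytes)
{
    const long GiB = 1024L * 1024 * 1024;

    if (freeBytes >= 24 * GiB) return 32768;
    if (freeBytes >= 12 * GiB) return 16384;
    if (freeBytes >= 8 * GiB) return 8192;
    if (freeBytes >= 4 * GiB) return 4096;

    return 2048; // default when no threshold is met
}

Example

A minimal sketch of calling the parameterless overload. The declaring type is not named on this page, so GpuUtils below is a placeholder; substitute the actual static class that exposes GetOptimalContextSize.

using System;
using LMKit.Graphics;

class Demo
{
    static void Main()
    {
        // "GpuUtils" is a placeholder for the method's actual declaring type.
        int contextSize = GpuUtils.GetOptimalContextSize();
        Console.WriteLine($"Recommended GPU context size: {contextSize}");
    }
}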

GetOptimalContextSize(LM)

Determines the optimal GPU context size based on the currently available GPU device's free memory, with an option to cap this size according to the specified language model.

public static int GetOptimalContextSize(LM model)

Parameters

model LM

The LM instance whose maximum allowable context length is used to cap the computed size.

Returns

int

An int representing the final optimal context size. Typical values range from 2,048 to 32,768, depending on the available GPU memory and the model's ContextLength.

Remarks

This method first identifies the GPU device with the best performance using LMKit.Graphics.Gpu.GpuDeviceInfo.GetBestGpuDevice(). It then queries the device's free memory and selects an appropriate context size based on predefined thresholds. If a model is provided, the final context size is further constrained by the model's maximum context length.
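
Schematically, the capping step amounts to clamping the memory-derived size to the model's maximum; a minimal sketch, assuming ContextLength exposes that maximum as an int:

// Sketch of the cap described above: "computedSize" is the value derived
// from the free-memory thresholds, and ContextLength is assumed to expose
// the model's maximum context length as an int.
static int CapToModel(int computedSize, LM model)
{
    return Math.Min(computedSize, model.ContextLength);
}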

If no suitable GPU device is found or the device's free memory does not meet any threshold, a default size is returned. The final recommended value ensures efficient GPU-accelerated operations while respecting the model's capabilities.
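
Example

A minimal sketch of the capped overload. GpuUtils is again a placeholder for the actual declaring type, and the LMKit.Model namespace and LM constructor shown here are assumptions; consult the LM class documentation for the exact overloads available.

using System;
using LMKit.Graphics;
using LMKit.Model;

class Demo
{
    static void Main()
    {
        // Assumed constructor: loading a model from a local GGUF file.
        var model = new LM("path/to/model.gguf");

        // The result is capped by the model's maximum context length.
        int contextSize = GpuUtils.GetOptimalContextSize(model);
        Console.WriteLine($"Recommended context size for this model: {contextSize}");
    }
}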