Property KVCacheQuantization

Namespace
LMKit.Inference
Assembly
LM-Kit.NET.dll

KVCacheQuantization

Gets the data type in which the context's KV-cache is stored, that is, its quantization level. F16 is the unquantized default; lower-precision types such as Q8_0 reduce the memory used per cached token at some cost in accuracy.

public KVCacheType KVCacheQuantization { get; }

Property Value

KVCacheType
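
To make the per-token footprint trade-off concrete, the following is a rough sketch of the arithmetic, not part of LM-Kit.NET. The `KvCacheFootprint` class and its methods are hypothetical, the type names are passed as plain strings rather than `KVCacheType` values, and the Q8_0 cost assumes the ggml block layout (32 one-byte values plus one 2-byte scale per block), which this page does not confirm.

```csharp
using System;

// Hypothetical helper, not an LM-Kit.NET API: estimates KV-cache bytes per cached token.
public static class KvCacheFootprint
{
    // Storage cost in bytes per cached element.
    // F16: 2 bytes per element.
    // Q8_0 (assumed ggml layout): 34 bytes per block of 32 elements.
    public static double BytesPerElement(string cacheType) => cacheType switch
    {
        "F16" => 2.0,
        "Q8_0" => 34.0 / 32.0,
        _ => throw new ArgumentException($"Unknown cache type: {cacheType}", nameof(cacheType))
    };

    // Both keys and values are cached for every layer, hence the factor of 2.
    public static double BytesPerToken(string cacheType, int layers, int kvHeads, int headDim)
        => 2.0 * layers * kvHeads * headDim * BytesPerElement(cacheType);
}
```

For an illustrative model with 32 layers, 8 KV heads, and a head dimension of 128, `BytesPerToken("F16", 32, 8, 128)` gives 131072 bytes (128 KiB) per token, while `BytesPerToken("Q8_0", 32, 8, 128)` gives 69632 bytes (68 KiB), roughly 53% of the F16 size under the assumed layout.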