Property KVCacheQuantization

Namespace: LMKit.Inference

Assembly: LM-Kit.NET.dll

KVCacheQuantization

Gets the data type in which the context's KV-cache is stored, that is, its quantization level. F16 is the unquantized default; lower-precision types such as Q8_0 shrink the per-token memory footprint at some cost in accuracy.

public KVCacheType KVCacheQuantization { get; }

Property Value

KVCacheType
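
Example

The trade-off described above can be made concrete with a rough per-token estimate. The sketch below is not part of LM-Kit.NET: the helper names, the string labels standing in for KVCacheType values, and the model shape (32 layers, 8 KV heads, head dimension 128) are all hypothetical. It also assumes that Q8_0 follows the common ggml block layout of 32 one-byte values plus one 2-byte scale, which this page does not confirm.

```csharp
using System;

// Bytes needed per cached element for a given cache type label.
// The labels mirror the KVCacheType names mentioned above; real code would
// switch on the value returned by KVCacheQuantization instead of a string.
static double BytesPerElement(string cacheType) => cacheType switch
{
    "F16" => 2.0,           // 16-bit float, 2 bytes per element
    "Q8_0" => 34.0 / 32.0,  // assumption: 32 int8 values + one 2-byte scale per block
    _ => throw new ArgumentException($"Unknown cache type: {cacheType}")
};

// Estimated KV-cache bytes per token: one key and one value vector
// (hence the factor 2) per layer, each of size kvHeads * headDim.
static long KvCacheBytesPerToken(int layers, int kvHeads, int headDim, string cacheType) =>
    (long)Math.Ceiling(2.0 * layers * kvHeads * headDim * BytesPerElement(cacheType));

// Hypothetical model shape, chosen only for illustration.
const int Layers = 32, KvHeads = 8, HeadDim = 128;

foreach (var type in new[] { "F16", "Q8_0" })
{
    long bytes = KvCacheBytesPerToken(Layers, KvHeads, HeadDim, type);
    Console.WriteLine($"{type}: {bytes} bytes per token");
}
```

Under these assumptions the estimate is 131072 bytes per token for F16 and 69632 bytes for Q8_0, a little over half. Actual figures depend on the loaded model and on how the runtime lays out the cache.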