Property KVCacheQuantization
KVCacheQuantization
Gets the data type in which the context's KV-cache is stored, that is, its quantization level. F16 is the unquantized default; lower-precision types such as Q8_0 trade some accuracy for a smaller per-token memory footprint.
public KVCacheType KVCacheQuantization { get; }
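To make the per-token footprint trade-off concrete, the following is a minimal sketch of the arithmetic and not part of this API. It assumes the type names follow the ggml convention, where F16 stores 2 bytes per value and Q8_0 stores blocks of 32 one-byte values plus a 2-byte scale (34 bytes per block). `CacheType`, `KvCacheFootprint`, and the model shape (32 layers, 8 KV heads, head dimension 128) are hypothetical and chosen only for illustration; `CacheType` is a stand-in for `KVCacheType`, whose full member list is not shown here.

```csharp
using System;

// Hypothetical stand-in for the library's KVCacheType; only the two
// members named in the description are modelled.
enum CacheType { F16, Q8_0 }

static class KvCacheFootprint
{
    // Bytes needed to store `elements` values in the given type.
    // Assumption: ggml-style layouts (F16 = 2 bytes per value,
    // Q8_0 = 34 bytes per block of 32 values). Partial blocks are
    // rounded up, which is a simplification.
    public static long BytesFor(CacheType type, long elements) => type switch
    {
        CacheType.F16 => elements * 2,
        CacheType.Q8_0 => (elements + 31) / 32 * 34,
        _ => throw new ArgumentOutOfRangeException(nameof(type)),
    };

    // Per-token footprint: one K row and one V row per layer,
    // each holding kvHeads * headDim values.
    public static long BytesPerToken(CacheType type, int layers, int kvHeads, int headDim)
    {
        long valuesPerRow = (long)kvHeads * headDim;
        return 2L * layers * BytesFor(type, valuesPerRow);
    }

    static void Main()
    {
        // Hypothetical model shape, for illustration only.
        const int layers = 32, kvHeads = 8, headDim = 128;

        foreach (CacheType type in Enum.GetValues<CacheType>())
        {
            long bytes = BytesPerToken(type, layers, kvHeads, headDim);
            Console.WriteLine($"{type}: {bytes} bytes per token");
        }
    }
}
```

Under these assumptions the sketch gives 131072 bytes per token for F16 and 69632 for Q8_0, so Q8_0 needs a little over half the memory. Real figures depend on the model and on how the library lays out the cache.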