Method Quantize
Namespace: LMKit.Quantization
Assembly: LM-Kit.NET.dll
Quantize(string, Precision, bool, MetadataCollection)
Quantizes the model specified in the Quantizer(string) constructor and saves the quantized model to the specified destination file.
public void Quantize(string dstFileName, LLM.Precision modelPrecision = Precision.MOSTLY_Q4_K_M, bool quantizeOutputTensor = true, LLM.MetadataCollection metadataOverrides = null)
Parameters
dstFileName
string
The file path where the quantized model will be saved.
modelPrecision
LLM.Precision
The desired precision mode for the quantized model.
Only the following enumeration members are accepted:
- MOSTLY_Q4_0
- MOSTLY_Q4_1
- MOSTLY_Q5_0
- MOSTLY_Q5_1
- MOSTLY_Q8_0
- MOSTLY_F16
- ALL_F32
- MOSTLY_Q2_K
- MOSTLY_Q3_K_S
- MOSTLY_Q3_K_M
- MOSTLY_Q3_K_L
- MOSTLY_Q4_K_S
- MOSTLY_Q4_K_M
- MOSTLY_Q5_K_S
- MOSTLY_Q5_K_M
- MOSTLY_Q6_K
quantizeOutputTensor
bool
Indicates whether the output tensor should be quantized. Defaults to true.
metadataOverrides
LLM.MetadataCollection
A collection of metadata overrides to apply during the quantization process. Defaults to null.
Exceptions
- QuantizationException
Thrown when the quantization process fails.
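Example
The following is a minimal usage sketch, not a definitive implementation. It assumes the Quantizer(string) constructor referenced above, that LLM.Precision is exposed through the LMKit.Model namespace, and uses placeholder file paths.

```csharp
using LMKit.Model;         // assumed namespace for LLM.Precision
using LMKit.Quantization;

class Program
{
    static void Main()
    {
        // Load the source model; the path is a placeholder.
        var quantizer = new Quantizer("model-f16.gguf");

        // Quantize to 5-bit K-quant (medium) and write the result to disk.
        // The output tensor is quantized by default; no metadata overrides are applied.
        // A QuantizationException is thrown if the process fails.
        quantizer.Quantize(
            "model-q5_k_m.gguf",
            LLM.Precision.MOSTLY_Q5_K_M);
    }
}
```

Omitting modelPrecision falls back to the default MOSTLY_Q4_K_M shown in the signature above.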