Method Quantize

Namespace: LMKit.Quantization

Assembly: LM-Kit.NET.dll

Quantize(string, Precision, bool, MetadataCollection)

Quantizes the model that was passed to the Quantizer(string) constructor and saves the result to the specified destination file.

public void Quantize(string dstFileName, LM.Precision modelPrecision = Precision.MOSTLY_Q4_K_M, bool quantizeOutputTensor = true, LM.MetadataCollection metadataOverrides = null)

Parameters

dstFileName string

The file path where the quantized model will be saved.

modelPrecision LM.Precision

The desired precision mode for the quantized model.
Only the following enumeration members are accepted:

MOSTLY_Q4_0
MOSTLY_Q4_1
MOSTLY_Q5_0
MOSTLY_Q5_1
MOSTLY_Q8_0
MOSTLY_F16
ALL_F32
MOSTLY_Q2_K
MOSTLY_Q3_K_S
MOSTLY_Q3_K_M
MOSTLY_Q3_K_L
MOSTLY_Q4_K_S
MOSTLY_Q4_K_M
MOSTLY_Q5_K_S
MOSTLY_Q5_K_M
MOSTLY_Q6_K

Defaults to MOSTLY_Q4_K_M.

quantizeOutputTensor bool

Indicates whether the output tensor should be quantized. Defaults to true.

metadataOverrides LM.MetadataCollection

A collection of metadata overrides to apply during the quantization process. Defaults to null.
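The parameters above can be exercised with a short sketch like the one below. It is only an illustration under stated assumptions: the file paths are hypothetical, the `LMKit.Model` namespace for `LM` is assumed rather than taken from this page, and `Quantizer(string)` is assumed to take the path of the model to quantize.

```csharp
using LMKit.Model;        // assumed namespace of the LM class that nests Precision
using LMKit.Quantization; // namespace given at the top of this page

internal static class QuantizeExample
{
    private static void Main()
    {
        // Hypothetical source path; Quantizer(string) is assumed to take the model to quantize.
        var quantizer = new Quantizer("models/source-f16.gguf");

        // Optional parameters left at their documented defaults:
        // MOSTLY_Q4_K_M precision, output tensor quantized, no metadata overrides.
        quantizer.Quantize("models/output-q4_k_m.gguf");

        // Named arguments select a different precision and keep the output tensor unquantized.
        quantizer.Quantize(
            "models/output-q5_k_m.gguf",
            modelPrecision: LM.Precision.MOSTLY_Q5_K_M,
            quantizeOutputTensor: false);
    }
}
```

Named arguments keep the call readable when only some of the optional parameters differ from their defaults.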

Exceptions

QuantizationException

Thrown when the quantization process fails.
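A minimal sketch of guarding the call against failure might look like this. This page does not show the namespace of QuantizationException, so the sketch catches the base `Exception` type instead of guessing it; the file paths are hypothetical, and `Quantizer(string)` is assumed to take the source model path.

```csharp
using System;
using LMKit.Quantization; // namespace given at the top of this page

internal static class QuantizeWithErrorHandling
{
    private static int Main()
    {
        // Hypothetical source path (assumption: the constructor takes the model to quantize).
        var quantizer = new Quantizer("models/source-f16.gguf");

        try
        {
            // Uses the documented defaults for all optional parameters.
            quantizer.Quantize("models/output-q4_k_m.gguf");
            Console.WriteLine("Quantization finished.");
            return 0;
        }
        catch (Exception ex)
        {
            // The documented failure type is QuantizationException; the base type is
            // caught here only because its namespace is not stated on this page.
            Console.Error.WriteLine($"Quantization failed: {ex.Message}");
            return 1;
        }
    }
}
```

In real code, catching QuantizationException directly would be more precise once its namespace is confirmed.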
