Method Quantize

Namespace
LMKit.Quantization
Assembly
LM-Kit.NET.dll

Quantize(string, Precision, bool, MetadataCollection)

Quantizes the model that was passed to the Quantizer(string) constructor and saves the quantized model to the specified destination file.

public void Quantize(string dstFileName, LM.Precision modelPrecision = Precision.MOSTLY_Q4_K_M, bool quantizeOutputTensor = true, LM.MetadataCollection metadataOverrides = null)

Parameters

dstFileName string

The file path where the quantized model will be saved.

modelPrecision LM.Precision

The desired precision mode for the quantized model.
Only the following enumeration members are accepted:

LM.Precision.MOSTLY_Q4_0
LM.Precision.MOSTLY_Q4_1
LM.Precision.MOSTLY_Q5_0
LM.Precision.MOSTLY_Q5_1
LM.Precision.MOSTLY_Q8_0
LM.Precision.MOSTLY_F16
LM.Precision.ALL_F32
LM.Precision.MOSTLY_Q2_K
LM.Precision.MOSTLY_Q3_K_S
LM.Precision.MOSTLY_Q3_K_M
LM.Precision.MOSTLY_Q3_K_L
LM.Precision.MOSTLY_Q4_K_S
LM.Precision.MOSTLY_Q4_K_M
LM.Precision.MOSTLY_Q5_K_S
LM.Precision.MOSTLY_Q5_K_M
LM.Precision.MOSTLY_Q6_K

Defaults to LM.Precision.MOSTLY_Q4_K_M.

quantizeOutputTensor bool

Indicates whether the output tensor should be quantized. Defaults to true.

metadataOverrides LM.MetadataCollection

A collection of metadata overrides to apply during the quantization process. Defaults to null.

Exceptions

QuantizationException

Thrown when the quantization process fails.
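
Example

The following is a minimal sketch of a typical call, not an official sample. It assumes that the Quantizer class lives in LMKit.Quantization and is constructed with the source model path, as the Quantizer(string) reference above suggests, and that LM is found in LMKit.Model, as the cross-references on this page indicate. The file names are hypothetical. Because this page does not state the namespace of QuantizationException, the sketch catches the base Exception type instead.

```csharp
using System;
using LMKit.Model;          // LM.Precision (namespace taken from this page's cross-references)
using LMKit.Quantization;   // Quantizer

class QuantizeExample
{
    static void Main()
    {
        // Hypothetical file names, for illustration only.
        string srcFileName = "model-f16.gguf";
        string dstFileName = "model-q4_k_m.gguf";

        // The source model is supplied through the Quantizer(string) constructor.
        var quantizer = new Quantizer(srcFileName);

        try
        {
            // All optional arguments are spelled out here with their documented defaults.
            quantizer.Quantize(
                dstFileName,
                modelPrecision: LM.Precision.MOSTLY_Q4_K_M,
                quantizeOutputTensor: true,
                metadataOverrides: null);

            Console.WriteLine($"Quantized model written to {dstFileName}");
        }
        catch (Exception ex)
        {
            // The documented failure type is QuantizationException; its namespace is
            // not shown on this page, so the base type is caught in this sketch.
            Console.Error.WriteLine($"Quantization failed: {ex.Message}");
        }
    }
}
```

Since every parameter after dstFileName has a default, calling quantizer.Quantize(dstFileName) alone should be equivalent to the call shown.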
