Enum LLM.Precision

Namespace
LMKit.Model
Assembly
LM-Kit.NET.dll

Represents the different precision and quantization types available for LLM model weights.

public enum LLM.Precision

Fields

ALL_F32 = 0

Full precision using 32-bit floating point (FP32).

MOSTLY_F16 = 1

Mixed precision using mostly 16-bit floating point (FP16), except for 1D tensors.

MOSTLY_Q4_0 = 2

Quantized precision using mostly 4-bit integers (Q4_0), except for 1D tensors.

MOSTLY_Q4_1 = 3

Quantized precision using mostly 4-bit integers (Q4_1), except for 1D tensors.

MOSTLY_Q8_0 = 7

Quantized precision using mostly 8-bit integers (Q8_0), except for 1D tensors.

MOSTLY_Q5_0 = 8

Quantized precision using mostly 5-bit integers (Q5_0), except for 1D tensors.

MOSTLY_Q5_1 = 9

Quantized precision using mostly 5-bit integers (Q5_1), except for 1D tensors.

MOSTLY_Q2_K = 10

Quantized precision using mostly 2-bit integers with K-means clustering (Q2_K), except for 1D tensors.

MOSTLY_Q3_K_S = 11

Quantized precision using mostly 3-bit integers with K-means clustering, small size (Q3_K_S), except for 1D tensors.

MOSTLY_Q3_K_M = 12

Quantized precision using mostly 3-bit integers with K-means clustering, medium size (Q3_K_M), except for 1D tensors.

MOSTLY_Q3_K_L = 13

Quantized precision using mostly 3-bit integers with K-means clustering, large size (Q3_K_L), except for 1D tensors.

MOSTLY_Q4_K_S = 14

Quantized precision using mostly 4-bit integers with K-means clustering, small size (Q4_K_S), except for 1D tensors.

MOSTLY_Q4_K_M = 15

Quantized precision using mostly 4-bit integers with K-means clustering, medium size (Q4_K_M), except for 1D tensors.

MOSTLY_Q5_K_S = 16

Quantized precision using mostly 5-bit integers with K-means clustering, small size (Q5_K_S), except for 1D tensors.

MOSTLY_Q5_K_M = 17

Quantized precision using mostly 5-bit integers with K-means clustering, medium size (Q5_K_M), except for 1D tensors.

MOSTLY_Q6_K = 18

Quantized precision using mostly 6-bit integers with K-means clustering (Q6_K), except for 1D tensors.

MOSTLY_IQ2_XXS = 19

Quantized precision using mostly 2-bit integers, extra extra small size (IQ2_XXS), except for 1D tensors.

MOSTLY_IQ2_XS = 20

Quantized precision using mostly 2-bit integers, extra small size (IQ2_XS), except for 1D tensors.

MOSTLY_Q2_K_S = 21

Quantized precision using mostly 2-bit integers with K-means clustering, small size (Q2_K_S), except for 1D tensors.

MOSTLY_IQ3_XS = 22

Quantized precision using mostly 3-bit integers, extra small size (IQ3_XS), except for 1D tensors.

MOSTLY_IQ3_XXS = 23

Quantized precision using mostly 3-bit integers, extra extra small size (IQ3_XXS), except for 1D tensors.

MOSTLY_IQ1_S = 24

Quantized precision using mostly 1-bit integers, small size (IQ1_S), except for 1D tensors.

MOSTLY_IQ4_NL = 25

Quantized precision using mostly 4-bit integers with a non-linear quantization grid (IQ4_NL), except for 1D tensors.

MOSTLY_IQ3_S = 26

Quantized precision using mostly 3-bit integers, small size (IQ3_S), except for 1D tensors.

MOSTLY_IQ3_M = 27

Quantized precision using mostly 3-bit integers, medium size (IQ3_M), except for 1D tensors.

MOSTLY_IQ2_S = 28

Quantized precision using mostly 2-bit integers, small size (IQ2_S), except for 1D tensors.

MOSTLY_IQ2_M = 29

Quantized precision using mostly 2-bit integers, medium size (IQ2_M), except for 1D tensors.

MOSTLY_IQ4_XS = 30

Quantized precision using mostly 4-bit integers, extra small size (IQ4_XS), except for 1D tensors.

MOSTLY_IQ1_M = 31

Quantized precision using mostly 1-bit integers, medium size (IQ1_M), except for 1D tensors.

MOSTLY_BF16 = 32

Mixed precision using mostly 16-bit brain floating point (BF16), except for 1D tensors.

MOSTLY_Q4_0_4_4 = 33

Quantized precision using mostly 4-bit integers (Q4_0), repacked in interleaved 4x4 blocks for optimized CPU inference (Q4_0_4_4).

MOSTLY_Q4_0_4_8 = 34

Quantized precision using mostly 4-bit integers (Q4_0), repacked in interleaved 4x8 blocks for optimized CPU inference (Q4_0_4_8).

MOSTLY_Q4_0_8_8 = 35

Quantized precision using mostly 4-bit integers (Q4_0), repacked in interleaved 8x8 blocks for optimized CPU inference (Q4_0_8_8).

MOSTLY_TQ1_0 = 36

Ternary quantized precision at approximately 1.69 bits per weight (TQ1_0), except for 1D tensors.

MOSTLY_TQ2_0 = 37

Ternary quantized precision at approximately 2.06 bits per weight (TQ2_0), except for 1D tensors.

GUESSED = 1024

The precision type is guessed because it is not specified in the model file.
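The precision value mainly matters for its memory/quality trade-off: lower bits per weight shrink the model's footprint at some cost in output quality. The sketch below illustrates this with a few of the enum values documented above. It is self-contained: the local `Precision` enum mirrors a subset of the documented `LMKit.Model.LLM.Precision` values so the example compiles without LM-Kit.NET.dll, and the bits-per-weight figures for the quantized formats are approximations (block scales add overhead), not exact values.

```csharp
using System;

// Mirrors a few documented LLM.Precision values locally so this example
// compiles without LM-Kit.NET.dll; real code should use
// LMKit.Model.LLM.Precision instead.
enum Precision
{
    ALL_F32 = 0,
    MOSTLY_F16 = 1,
    MOSTLY_Q8_0 = 7,
    MOSTLY_Q4_K_M = 15,
    GUESSED = 1024,
}

static class PrecisionDemo
{
    // Approximate bits per weight for a few common choices. Quantized
    // figures are rough estimates because per-block scales add overhead.
    public static double ApproxBitsPerWeight(Precision p) => p switch
    {
        Precision.ALL_F32 => 32.0,
        Precision.MOSTLY_F16 => 16.0,
        Precision.MOSTLY_Q8_0 => 8.5,   // 8-bit values plus per-block scales
        Precision.MOSTLY_Q4_K_M => 4.8, // k-quant super-blocks, approximate
        _ => double.NaN,                // unknown or guessed precision
    };

    static void Main()
    {
        var candidates = new[]
        {
            Precision.ALL_F32, Precision.MOSTLY_F16,
            Precision.MOSTLY_Q8_0, Precision.MOSTLY_Q4_K_M,
        };

        foreach (var p in candidates)
        {
            // Estimated weight footprint of a 7-billion-parameter model, in GiB.
            double gib = 7e9 * ApproxBitsPerWeight(p) / 8.0 / (1L << 30);
            Console.WriteLine($"{p} = {(int)p}: ~{gib:F1} GiB for 7B weights");
        }
    }
}
```

For a 7B-parameter model this puts FP32 around 26 GiB of weights versus roughly 4 GiB at Q4_K_M, which is why the 4- and 5-bit K-quant variants are common defaults for consumer hardware.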