Enum LM.Precision
Represents the different precision types for LM models.
public enum LM.Precision
Fields
ALL_F32 = 0
Full precision using 32-bit floating point (FP32).
MOSTLY_F16 = 1
Mixed precision using mostly 16-bit floating point (FP16), except for 1D tensors.
MOSTLY_Q4_0 = 2
Quantized precision using mostly 4-bit integers (Q4_0), except for 1D tensors.
MOSTLY_Q4_1 = 3
Quantized precision using mostly 4-bit integers (Q4_1), except for 1D tensors.
MOSTLY_Q8_0 = 7
Quantized precision using mostly 8-bit integers (Q8_0), except for 1D tensors.
MOSTLY_Q5_0 = 8
Quantized precision using mostly 5-bit integers (Q5_0), except for 1D tensors.
MOSTLY_Q5_1 = 9
Quantized precision using mostly 5-bit integers (Q5_1), except for 1D tensors.
MOSTLY_Q2_K = 10
Quantized precision using mostly 2-bit k-quants (Q2_K), except for 1D tensors.
MOSTLY_Q3_K_S = 11
Quantized precision using mostly 3-bit k-quants, small size (Q3_K_S), except for 1D tensors.
MOSTLY_Q3_K_M = 12
Quantized precision using mostly 3-bit k-quants, medium size (Q3_K_M), except for 1D tensors.
MOSTLY_Q3_K_L = 13
Quantized precision using mostly 3-bit k-quants, large size (Q3_K_L), except for 1D tensors.
MOSTLY_Q4_K_S = 14
Quantized precision using mostly 4-bit k-quants, small size (Q4_K_S), except for 1D tensors.
MOSTLY_Q4_K_M = 15
Quantized precision using mostly 4-bit k-quants, medium size (Q4_K_M), except for 1D tensors.
MOSTLY_Q5_K_S = 16
Quantized precision using mostly 5-bit k-quants, small size (Q5_K_S), except for 1D tensors.
MOSTLY_Q5_K_M = 17
Quantized precision using mostly 5-bit k-quants, medium size (Q5_K_M), except for 1D tensors.
MOSTLY_Q6_K = 18
Quantized precision using mostly 6-bit k-quants (Q6_K), except for 1D tensors.
MOSTLY_IQ2_XXS = 19
Quantized precision using mostly 2-bit integers, extra extra small size (IQ2_XXS), except for 1D tensors.
MOSTLY_IQ2_XS = 20
Quantized precision using mostly 2-bit integers, extra small size (IQ2_XS), except for 1D tensors.
MOSTLY_Q2_K_S = 21
Quantized precision using mostly 2-bit k-quants, small size (Q2_K_S), except for 1D tensors.
MOSTLY_IQ3_XS = 22
Quantized precision using mostly 3-bit integers, extra small size (IQ3_XS), except for 1D tensors.
MOSTLY_IQ3_XXS = 23
Quantized precision using mostly 3-bit integers, extra extra small size (IQ3_XXS), except for 1D tensors.
MOSTLY_IQ1_S = 24
Quantized precision using mostly 1-bit integers, small size (IQ1_S), except for 1D tensors.
MOSTLY_IQ4_NL = 25
Quantized precision using mostly 4-bit integers, non-linear (IQ4_NL), except for 1D tensors.
MOSTLY_IQ3_S = 26
Quantized precision using mostly 3-bit integers, small size (IQ3_S), except for 1D tensors.
MOSTLY_IQ3_M = 27
Quantized precision using mostly 3-bit integers, medium size (IQ3_M), except for 1D tensors.
MOSTLY_IQ2_S = 28
Quantized precision using mostly 2-bit integers, small size (IQ2_S), except for 1D tensors.
MOSTLY_IQ2_M = 29
Quantized precision using mostly 2-bit integers, medium size (IQ2_M), except for 1D tensors.
MOSTLY_IQ4_XS = 30
Quantized precision using mostly 4-bit integers, extra small size (IQ4_XS), except for 1D tensors.
MOSTLY_IQ1_M = 31
Quantized precision using mostly 1-bit integers, medium size (IQ1_M), except for 1D tensors.
MOSTLY_BF16 = 32
Mixed precision using mostly 16-bit brain floating point (BF16), except for 1D tensors.
MOSTLY_TQ1_0 = 36
MOSTLY_TQ2_0 = 37
GUESSED = 1024
Precision type is guessed because it is not specified in the model file.
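The numeric values of the fields above are not contiguous (for example, 4 to 6 and 33 to 35 have no named field), and GUESSED is a sentinel rather than a real format. The following is a minimal sketch of how calling code might handle both cases when it receives a raw integer. It uses a local stand-in enum named Precision that copies only a few values from this page; the helper names TryParsePrecision and Describe are hypothetical and are not part of the documented API.

```csharp
using System;

// Local stand-in that mirrors a subset of the values listed on this page.
// Assumption: real code would reference LM.Precision from the library instead.
public enum Precision
{
    ALL_F32 = 0,
    MOSTLY_F16 = 1,
    MOSTLY_Q4_0 = 2,
    MOSTLY_Q4_1 = 3,
    MOSTLY_Q8_0 = 7,
    MOSTLY_Q4_K_M = 15,
    MOSTLY_BF16 = 32,
    GUESSED = 1024
}

public static class PrecisionExample
{
    // Hypothetical helper: casts a raw integer to Precision and reports
    // whether that number corresponds to a named field of the enum.
    public static bool TryParsePrecision(int raw, out Precision precision)
    {
        precision = (Precision)raw;
        return Enum.IsDefined(typeof(Precision), raw);
    }

    // Hypothetical helper: builds a short human-readable label for a raw value.
    public static string Describe(int raw)
    {
        if (!TryParsePrecision(raw, out Precision precision))
        {
            return $"unknown precision value {raw}";
        }

        if (precision == Precision.GUESSED)
        {
            return "GUESSED (precision not specified in the model file)";
        }

        return $"{precision} ({raw})";
    }

    public static void Main()
    {
        Console.WriteLine(Describe(15));   // MOSTLY_Q4_K_M (15)
        Console.WriteLine(Describe(5));    // unknown precision value 5
        Console.WriteLine(Describe(1024)); // GUESSED (precision not specified in the model file)
    }
}
```

Checking Enum.IsDefined before trusting a cast is a common C# precaution here, because a plain cast from int to an enum type succeeds even when no field has that value.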