Class LoraTrainingParameters
- Namespace: LMKit.Finetuning
- Assembly: LM-Kit.NET.dll
Represents the parameters for training and fine-tuning AI models using the LoRA (Low-Rank Adaptation) approach.
public sealed class LoraTrainingParameters
- Inheritance
object → LoraTrainingParameters
Properties
- AdamAlpha
Gets or sets the learning rate (alpha) for the Adam optimizer, which controls how strongly the model weights are adjusted with respect to the loss gradient. See the optimizer sketch after this list.
- AdamBeta1
Gets or sets the exponential decay rate for the first moment estimates in the Adam optimizer. This rate controls how quickly the contributions from previous gradients diminish.
- AdamBeta2
Gets or sets the exponential decay rate for the second moment estimates in the Adam optimizer. This parameter helps stabilize the update steps across all parameters by normalizing gradient variances.
- AdamDecay
Gets or sets the AdamW optimizer's weight decay factor. This parameter controls the rate at which weights decrease during training.
- AdamDecayMinNDim
Gets or sets the minimum number of dimensions a tensor must have for AdamW weight decay to be applied. This ensures that weight decay is only applied to tensors meeting this dimensionality threshold.
- AdamGradientClipping
Gets or sets the AdamW optimizer's gradient clipping parameter. Gradient clipping is used to constrain the gradients to a specific range during backpropagation to prevent the exploding gradient problem.
- CosineDecayMin
Gets or sets the minimum value of the cosine decay schedule when using the Adam optimizer. This is the floor to which the learning-rate multiplier decays over time.
- CosineDecayRestart
Gets or sets the multiplier that increases the number of steps for cosine decay upon each restart, applicable only when using the Adam optimizer with a cyclical learning rate policy.
- CosineDecaySteps
Gets or sets the number of steps over which the cosine decay is applied when using the Adam optimizer. This determines the total duration of the cosine decay cycle for adjusting the learning rate.
- GradientAccumulation
Gets or sets the number of gradient-accumulation steps performed before the model weights are updated. Accumulating gradients over several batches simulates a larger effective batch size.
- LoraAlpha
Gets or sets LoRA alpha, which, in conjunction with LoraRank, determines the scaling factor for the LoRA adaptation. See the rank and alpha sketch after this list.
- LoraRank
Gets or sets LoRA's rank ('r'), the default rank used for the low-rank adaptations. Together with LoraAlpha, it determines the LoRA scaling factor (conventionally alpha / r).
- MaxNoImprovement
Gets or sets the maximum number of optimization iterations with no improvement before considering convergence.
- NormRMS
Gets or sets the RMS-Norm epsilon value, a small constant added to the normalization denominator to keep training numerically stable.
- RankAttentionNorm
Gets or sets the rank used for the attention norm tensor, overriding the default rank. Norm tensors typically use a rank of 1 to preserve normalization constraints. See the per-tensor rank sketch after this list.
- RankDownFeedForwardNorm
Gets or sets the rank for the LoRA adaptation of the down-projection tensor in the feed-forward network, which projects the expanded intermediate features back down to the model dimension.
- RankFeedForwardNorm
Gets or sets the rank for the LoRA adaptation of the feed-forward norm tensor. This typically uses rank 1 to enforce standard normalization.
- RankGateFeedForwardNorm
Gets or sets the rank for the LoRA adaptation of the gate tensor in the feed-forward network, allowing the model's gating mechanism to be customized.
- RankOutput
Gets or sets the rank for the LoRA adaptation of the model's output tensor. Overriding this rank gives finer control over the output transformation.
- RankOutputNorm
Gets or sets the rank for the LoRA adaptation of the output normalization tensor, overriding the default rank. This is typically set to 1 to preserve standard normalization of the model's outputs.
- RankTokenEmbeddings
Gets or sets the rank for the LoRA adaptation of the token embeddings tensor, customizing how the model's token embeddings are adapted.
- RankUpFeedForwardNorm
Gets or sets the rank for the LoRA adaptation of the up-projection tensor in the feed-forward network, which expands features to the larger intermediate dimension.
- RankWK
Gets or sets the rank for the LoRA adaptation of the key weight (WK) tensor, tailoring the key transformations in the attention mechanism.
- RankWO
Gets or sets the rank for the LoRA adaptation of the attention output weight (WO) tensor.
- RankWQ
Gets or sets the rank for the LoRA adaptation of the query weight (WQ) tensor, adapting the model's query transformations beyond the default rank.
- RankWV
Gets or sets the rank for the LoRA adaptation of the value weight (WV) tensor, refining how input values are transformed in the attention mechanism.
- RopeFreqBase
Gets or sets the frequency base for RoPE (Rotary Positional Embeddings). See the RoPE sketch after this list.
- RopeFreqScale
Gets or sets the frequency scale for RoPE.
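Examples

The following sketch shows how the AdamW-related properties above might be configured. It assumes LoraTrainingParameters has a public parameterless constructor, that the hyperparameters are float-valued, and that the step counts are integers; verify the actual property types in your LM-Kit.NET version. The values themselves are illustrative, not the library's defaults.

```csharp
using LMKit.Finetuning;

var trainingParameters = new LoraTrainingParameters
{
    AdamAlpha = 0.001f,          // learning rate: how strongly weights follow the loss gradient
    AdamBeta1 = 0.9f,            // decay rate for the first moment estimates
    AdamBeta2 = 0.999f,          // decay rate for the second moment estimates
    AdamDecay = 0.01f,           // AdamW weight decay factor
    AdamDecayMinNDim = 2,        // apply weight decay only to tensors with at least 2 dimensions
    AdamGradientClipping = 1.0f, // clip gradients to curb the exploding gradient problem
    CosineDecaySteps = 100,      // length of one cosine decay cycle, in steps
    CosineDecayMin = 0.1f,       // floor of the cosine decay schedule
    CosineDecayRestart = 1.5f    // multiplies the cycle length on each restart
};
```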
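A minimal sketch of the core LoRA settings. In the standard LoRA formulation the low-rank update is scaled by alpha / r, so the values below give a scaling factor of 2; whether LM-Kit.NET applies exactly this convention should be verified against its documentation.

```csharp
using LMKit.Finetuning;

var trainingParameters = new LoraTrainingParameters
{
    LoraRank = 16,            // default rank 'r' of the low-rank update matrices
    LoraAlpha = 32,           // together with LoraRank, sets the LoRA scaling factor
    GradientAccumulation = 4, // accumulate gradients over 4 batches per weight update
    MaxNoImprovement = 10     // treat 10 stagnant iterations as convergence
};

// Standard LoRA applies W' = W + (alpha / r) * B * A,
// so alpha = 32 with r = 16 scales the adaptation by a factor of 2.
```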
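The per-tensor Rank* properties override the default LoraRank for individual tensor families. The sketch below raises the rank on the attention projections while keeping norm tensors at rank 1; how unset overrides fall back to the default is an assumption to verify.

```csharp
using LMKit.Finetuning;

var trainingParameters = new LoraTrainingParameters
{
    LoraRank = 16,           // default rank for all adapted tensors
    RankWQ = 32,             // higher rank for query projections
    RankWK = 32,             // higher rank for key projections
    RankWV = 16,             // value projections at the default rank
    RankWO = 16,             // attention output projections at the default rank
    RankAttentionNorm = 1,   // norm tensors keep rank 1 to preserve normalization
    RankOutputNorm = 1,
    RankTokenEmbeddings = 8  // lighter adaptation of the token embeddings
};
```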
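The RoPE properties control the rotary positional embedding frequencies used during fine-tuning. The values below are the common llama-style settings (base 10000, no scaling); they are illustrative assumptions, not the library's documented defaults.

```csharp
using LMKit.Finetuning;

var trainingParameters = new LoraTrainingParameters
{
    RopeFreqBase = 10000.0f, // common base frequency for rotary embeddings
    RopeFreqScale = 1.0f     // 1.0 leaves the positional frequencies unscaled
};
```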