Class LoraTrainingParameters
- Namespace: LMKit.Finetuning
- Assembly: LM-Kit.NET.dll
Represents the parameters for training and fine-tuning AI models using the LoRA (Low-Rank Adaptation) approach.
public sealed class LoraTrainingParameters
- Inheritance
object → LoraTrainingParameters
Properties
- AdamAlpha
Gets or sets the learning rate (alpha) for the Adam optimizer, which controls how strongly the model weights are adjusted with respect to the loss gradient. See the optimizer sketch after this list.
- AdamBeta1
Gets or sets the exponential decay rate for the first moment estimates in the Adam optimizer. This rate controls how quickly the contributions from previous gradients diminish.
- AdamBeta2
Gets or sets the exponential decay rate for the second moment estimates in the Adam optimizer. This parameter helps stabilize the update steps across all parameters by normalizing gradient variances.
- AdamDecay
Gets or sets the AdamW optimizer's weight decay factor. This parameter controls the rate at which weights decrease during training.
- AdamDecayMinNDim
Gets or sets the minimum number of dimensions a tensor must have for AdamW weight decay to be applied. This ensures that weight decay is only applied to tensors meeting this dimensionality threshold.
- AdamGradientClipping
Gets or sets the AdamW optimizer's gradient clipping parameter. Gradient clipping is used to constrain the gradients to a specific range during backpropagation to prevent the exploding gradient problem.
- CosineDecayMin
Gets or sets the minimum value of the cosine decay schedule when using the Adam optimizer. This is the floor to which the learning-rate multiplier decays over time.
- CosineDecayRestart
Gets or sets the multiplier that increases the number of steps for cosine decay upon each restart, applicable only when using the Adam optimizer with a cyclical learning rate policy.
- CosineDecaySteps
Gets or sets the number of steps over which the cosine decay is applied when using the Adam optimizer. This determines the total duration of the cosine decay cycle for adjusting the learning rate.
- GradientAccumulation
Gets or sets the number of gradient-accumulation steps performed before the model weights are updated. Accumulating gradients over several batches simulates a larger effective batch size.
- LoraAlpha
Gets or sets LoRA alpha, which, in conjunction with LoraRank, determines the scaling factor for the LoRA adaptation. See the rank and alpha sketch after this list.
- LoraRank
Gets or sets LoRA's rank ('r'), the default rank used for the low-rank adaptations. Together with LoraAlpha, it determines the LoRA scaling factor (conventionally alpha / r).
- MaxNoImprovement
Gets or sets the maximum number of optimization iterations with no improvement before considering convergence.
- NormRMS
Gets or sets the RMS-Norm epsilon value, a small constant added to the normalization denominator to keep training numerically stable.
- RankAttentionNorm
Gets or sets the rank used for the attention norm tensor, overriding the default rank. Norm tensors typically use a rank of 1 to preserve normalization constraints. See the per-tensor rank sketch after this list.
- RankDownFeedForwardNorm
Gets or sets the rank for the LoRA adaptation of the down-projection tensor in the feed-forward network, which projects the expanded intermediate features back down to the model dimension.
- RankFeedForwardNorm
Gets or sets the rank for the LoRA adaptation of the feed-forward norm tensor. This typically uses rank 1 to enforce standard normalization.
- RankGateFeedForwardNorm
Gets or sets the rank for the LoRA adaptation of the gate tensor in the feed-forward network, allowing the model's gating mechanism to be customized.
- RankOutput
Gets or sets the rank for the LoRA adaptation of the model's output tensor. Overriding this rank gives finer control over the output transformation.
- RankOutputNorm
Gets or sets the rank for the LoRA adaptation of the output normalization tensor, overriding the default rank. This is typically set to 1 to preserve standard normalization of the model's outputs.
- RankTokenEmbeddings
Gets or sets the rank for the LoRA adaptation of the token embeddings tensor, customizing how the model's token embeddings are adapted.
- RankUpFeedForwardNorm
Gets or sets the rank for the LoRA adaptation of the up-projection tensor in the feed-forward network, which expands features to the larger intermediate dimension.
- RankWK
Gets or sets the rank for the LoRA adaptation of the key weight (WK) tensor, tailoring the key transformations in the attention mechanism.
- RankWO
Gets or sets the rank for the LoRA adaptation of the attention output weight (WO) tensor.
- RankWQ
Gets or sets the rank for the LoRA adaptation of the query weight (WQ) tensor, adapting the model's query transformations beyond the default rank.
- RankWV
Gets or sets the rank for the LoRA adaptation of the value weight (WV) tensor, refining how input values are transformed in the attention mechanism.
- RopeFreqBase
Gets or sets the frequency base for RoPE (Rotary Positional Embeddings). See the RoPE sketch after this list.
- RopeFreqScale
Gets or sets the frequency scale for RoPE.
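Examples

The following sketch shows how the AdamW-related properties above might be configured. It assumes LoraTrainingParameters has a public parameterless constructor, that the hyperparameters are float-valued, and that the step counts are integers; verify the actual property types in your LM-Kit.NET version. The values themselves are illustrative, not the library's defaults.

```csharp
using LMKit.Finetuning;

var trainingParameters = new LoraTrainingParameters
{
    AdamAlpha = 0.001f,          // learning rate: how strongly weights follow the loss gradient
    AdamBeta1 = 0.9f,            // decay rate for the first moment estimates
    AdamBeta2 = 0.999f,          // decay rate for the second moment estimates
    AdamDecay = 0.01f,           // AdamW weight decay factor
    AdamDecayMinNDim = 2,        // apply weight decay only to tensors with at least 2 dimensions
    AdamGradientClipping = 1.0f, // clip gradients to curb the exploding gradient problem
    CosineDecaySteps = 100,      // length of one cosine decay cycle, in steps
    CosineDecayMin = 0.1f,       // floor of the cosine decay schedule
    CosineDecayRestart = 1.5f    // multiplies the cycle length on each restart
};
```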
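A minimal sketch of the core LoRA settings. In the standard LoRA formulation the low-rank update is scaled by alpha / r, so the values below give a scaling factor of 2; whether LM-Kit.NET applies exactly this convention should be verified against its documentation.

```csharp
using LMKit.Finetuning;

var trainingParameters = new LoraTrainingParameters
{
    LoraRank = 16,            // default rank 'r' of the low-rank update matrices
    LoraAlpha = 32,           // together with LoraRank, sets the LoRA scaling factor
    GradientAccumulation = 4, // accumulate gradients over 4 batches per weight update
    MaxNoImprovement = 10     // treat 10 stagnant iterations as convergence
};

// Standard LoRA applies W' = W + (alpha / r) * B * A,
// so alpha = 32 with r = 16 scales the adaptation by a factor of 2.
```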
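The per-tensor Rank* properties override the default LoraRank for individual tensor families. The sketch below raises the rank on the attention projections while keeping norm tensors at rank 1; how unset overrides fall back to the default is an assumption to verify.

```csharp
using LMKit.Finetuning;

var trainingParameters = new LoraTrainingParameters
{
    LoraRank = 16,           // default rank for all adapted tensors
    RankWQ = 32,             // higher rank for query projections
    RankWK = 32,             // higher rank for key projections
    RankWV = 16,             // value projections at the default rank
    RankWO = 16,             // attention output projections at the default rank
    RankAttentionNorm = 1,   // norm tensors keep rank 1 to preserve normalization
    RankOutputNorm = 1,
    RankTokenEmbeddings = 8  // lighter adaptation of the token embeddings
};
```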
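The RoPE properties control the rotary positional embedding frequencies used during fine-tuning. The values below are the common llama-style settings (base 10000, no scaling); they are illustrative assumptions, not the library's documented defaults.

```csharp
using LMKit.Finetuning;

var trainingParameters = new LoraTrainingParameters
{
    RopeFreqBase = 10000.0f, // common base frequency for rotary embeddings
    RopeFreqScale = 1.0f     // 1.0 leaves the positional frequencies unscaled
};
```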