Class LoraTrainingParameters

Namespace: LMKit.Finetuning
Assembly: LM-Kit.NET.dll

Represents the parameters for training and fine-tuning AI models using the LoRA (Low-Rank Adaptation) approach.

public sealed class LoraTrainingParameters
Inheritance
object → LoraTrainingParameters
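The following minimal sketch shows how an instance might be configured. Only LoraTrainingParameters and the property names listed below are taken from this page; the numeric values and literal types are illustrative assumptions, not documented defaults, and any surrounding fine-tuning workflow is omitted.

using LMKit.Finetuning;

class LoraSetupSketch
{
    static void Main()
    {
        // Illustrative values only; LM-Kit's actual defaults are not shown on this page.
        var parameters = new LoraTrainingParameters
        {
            LoraRank = 16,     // low-rank dimension 'r'
            LoraAlpha = 32,    // scaling numerator; effective scale is LoraAlpha / LoraRank
            AdamAlpha = 1e-4f  // Adam learning rate
        };

        System.Console.WriteLine(parameters.LoraRank);
    }
}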

Properties

AdamAlpha

Gets or sets the learning rate (alpha) for the Adam optimizer. This parameter controls how much the model weights are adjusted with respect to the loss gradient.

AdamBeta1

Gets or sets the exponential decay rate for the first moment estimates in the Adam optimizer. This rate controls how quickly the contributions from previous gradients diminish.

AdamBeta2

Gets or sets the exponential decay rate for the second moment estimates in the Adam optimizer. This parameter helps stabilize the update steps across all parameters by normalizing gradient variances.

AdamDecay

Gets or sets the AdamW optimizer's weight decay factor. This parameter controls the rate at which weights decrease during training.

AdamDecayMinNDim

Gets or sets the minimum number of dimensions a tensor must have for AdamW weight decay to be applied. This ensures that weight decay is only applied to tensors meeting this dimensionality threshold.

AdamGradientClipping

Gets or sets the AdamW optimizer's gradient clipping parameter. Gradient clipping is used to constrain the gradients to a specific range during backpropagation to prevent the exploding gradient problem.
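Taken together, these Adam-related properties map onto the conventional AdamW hyperparameters. The sketch below, meant to run inside a method body with using LMKit.Finetuning in scope, uses textbook values for orientation; they are assumptions, not LM-Kit defaults.

var p = new LoraTrainingParameters();

// Conventional AdamW settings (illustrative, not documented defaults):
p.AdamAlpha = 1e-4f;           // learning rate
p.AdamBeta1 = 0.9f;            // decay rate for first-moment estimates
p.AdamBeta2 = 0.999f;          // decay rate for second-moment estimates
p.AdamDecay = 0.1f;            // weight-decay factor
p.AdamDecayMinNDim = 2;        // decay only tensors with at least 2 dimensions
p.AdamGradientClipping = 1.0f; // bound gradients to curb the exploding gradient problem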

CosineDecayMin

Gets or sets the minimum value of the cosine decay schedule when using the Adam optimizer. This defines the lower bound that the decayed learning rate approaches over the course of a cycle.

CosineDecayRestart

Gets or sets the multiplier that increases the number of steps for cosine decay upon each restart, applicable only when using the Adam optimizer with a cyclical learning rate policy.

CosineDecaySteps

Gets or sets the number of steps over which the cosine decay is applied when using the Adam optimizer. This determines the total duration of the cosine decay cycle for adjusting the learning rate.
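Assuming the conventional cosine-annealing schedule (the exact formula LM-Kit implements is not shown on this page), the learning rate at step t would follow:

\eta_t = \eta_{\min} + \frac{1}{2}\left(\eta_0 - \eta_{\min}\right)\left(1 + \cos\frac{\pi t}{T}\right)

where \eta_0 is the base learning rate (AdamAlpha), \eta_{\min} corresponds to CosineDecayMin, and T is CosineDecaySteps. On each restart, T is multiplied by CosineDecayRestart.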

GradientAccumulation

Gets or sets the number of gradient accumulation steps performed before the model weights are updated. The effective batch size is the per-step batch size multiplied by this value; for example, a per-step batch of 4 with 8 accumulation steps yields an effective batch size of 32.

LoraAlpha

Gets or sets LoRA alpha, which, in conjunction with LoraRank, determines the scaling factor for the LoRA adaptation.

LoraRank

Gets or sets LoRA's rank ('r'), which is the default rank used for low-rank adaptations.
This parameter also defines the scaling factor together with LoraAlpha for the LoRA approach.
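In the standard LoRA formulation, which these two properties are assumed to follow, a weight matrix W is adapted as:

W' = W + \frac{\alpha}{r}\, B A, \qquad B \in \mathbb{R}^{d \times r},\ A \in \mathbb{R}^{r \times k}

where r is LoraRank and \alpha is LoraAlpha, so the low-rank update is scaled by \alpha / r.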

MaxNoImprovement

Gets or sets the maximum number of optimization iterations with no improvement before considering convergence.

NormRMS

Gets or sets the RMS-Norm epsilon value, a small constant added inside the normalization denominator to avoid division by zero and stabilize training.

RankAttentionNorm

Represents the rank used for the attention norm tensor in the LoRA adaptation process, overriding the default LoraRank for this tensor. Norm tensors typically use a rank of 1 to preserve their normalization constraints.

RankDownFeedForwardNorm

Represents the rank for the LoRA adaptation of the down projection tensor in the feed-forward network, which maps features from the intermediate dimension back down to the model dimension.

RankFeedForwardNorm

Represents the rank for the LoRA adaptation of the feed-forward norm tensor. This typically uses rank 1 to enforce standard normalization.

RankGateFeedForwardNorm

Represents the rank for the LoRA adaptation of the gate tensor in the feed-forward network. Allows for specific customization of gating mechanisms within the model.

RankOutput

Represents the rank for the LoRA adaptation of the output tensor within the model's architecture. Overriding this rank allows for finer control over the model's output transformation processes.

RankOutputNorm

Represents the rank for the LoRA adaptation of the output normalization tensor. This property allows overriding the default rank, typically set to 1 to ensure standard normalization across the model's outputs.

RankTokenEmbeddings

Represents the rank for the LoRA adaptation of the token embeddings tensor. Can be customized to alter the model’s handling of token embeddings.

RankUpFeedForwardNorm

Represents the rank for the LoRA adaptation of the up projection tensor in the feed-forward network, which expands features from the model dimension to the higher-dimensional intermediate space.

RankWK

Represents the rank for the LoRA adaptation of the key weight (WK) tensor. Customizing this rank allows for tailored key transformations within the model's architecture.

RankWO

Represents the rank for the LoRA adaptation of the output weight (WO) tensor. This parameter allows the customization of the model’s output weight transformations.

RankWQ

Represents the rank for the LoRA adaptation of the query weight (WQ) tensor. This setting allows customization beyond the default rank, adapting the model's query transformations.

RankWV

Represents the rank for the LoRA adaptation of the value weight (WV) tensor. Adjusting this rank can refine the model's handling of input values during transformations.
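The per-tensor rank properties above override the default LoraRank for individual tensors. A hedged configuration sketch follows (values are illustrative assumptions; it belongs inside a method body with using LMKit.Finetuning in scope):

var p = new LoraTrainingParameters();

p.LoraRank = 8;          // default rank applied to all adapted tensors
p.RankWQ = 16;           // give query projections more capacity
p.RankWK = 16;           // and key projections
p.RankWV = 16;           // and value projections
p.RankAttentionNorm = 1; // norm tensors conventionally remain at rank 1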

RopeFreqBase

Gets or sets the frequency base for RoPE (Rotary Positional Embeddings).

RopeFreqScale

Gets or sets the frequency scale for RoPE.
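In the usual RoPE parameterization (an assumption about how these two properties enter the computation), the rotation frequency for dimension pair i of a head of size d is:

\theta_i = \text{RopeFreqBase}^{-2i/d}

and position indices are multiplied by RopeFreqScale before rotation, so values below 1 compress positions and can extend the usable context window.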