Property GradientAccumulation

Namespace
LMKit.Finetuning
Assembly
LM-Kit.NET.dll

GradientAccumulation

Gets or sets the number of gradient accumulation steps performed before the model weights are updated.

public int GradientAccumulation { get; set; }

Property Value

int

The number of gradient accumulation steps. The value must be a positive integer. The default value is 1.

Remarks

Gradient accumulation is a strategy for training with large mini-batches when the available memory is insufficient to process them in a single pass. A large batch is divided into smaller sub-batches that are processed one at a time, and the gradients from each sub-batch are accumulated. Once all sub-batches have been processed, the accumulated gradients are used to update the model weights, which effectively simulates training with a larger batch size.
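The equivalence described above can be written out as a short derivation. This is a sketch under stated assumptions, not a description of LM-Kit's internals: it assumes the loss is a mean over samples, that the batch $B$ is split into $N$ sub-batches $B_1, \dots, B_N$ of equal size $m$, and that accumulated gradients are averaged (the source does not state whether the library averages or sums). The symbols $\theta$, $\ell_i$, $N$ and $m$ are introduced here for illustration only.

```latex
\nabla_\theta L_B(\theta)
  = \frac{1}{N m} \sum_{i \in B} \nabla_\theta \ell_i(\theta)
  = \frac{1}{N} \sum_{k=1}^{N}
    \left( \frac{1}{m} \sum_{i \in B_k} \nabla_\theta \ell_i(\theta) \right)
```

Under these assumptions, the average of the $N$ sub-batch gradients equals the gradient of the full batch of $N \cdot m$ samples, so a single weight update after $N$ accumulation steps matches one update on the larger batch.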

Using gradient accumulation is comparable to increasing the batch size without increasing peak memory usage. It provides the training-quality benefits of larger batches while keeping memory usage low. Because the sub-batches are processed sequentially, training may be slower, but the technique is particularly beneficial when hardware resources are limited.
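The trade-off described in these remarks can be illustrated with a minimal, self-contained sketch. It does not use the LM-Kit API: the class `GradientAccumulationSketch`, its methods, and the toy model `y = w * x` are all hypothetical and exist only for illustration. The sketch assumes equally sized sub-batches and averaged gradients. Only one sub-batch is processed at a time, and the weight is updated once after all accumulation steps.

```csharp
using System;

// Hypothetical illustration only; none of these names belong to LM-Kit.NET.
public static class GradientAccumulationSketch
{
    // Gradient of the mean squared error of the toy model y = w * x with
    // respect to w, averaged over the samples in [start, start + count).
    public static double MeanGradient(double w, double[] xs, double[] ys, int start, int count)
    {
        double sum = 0.0;
        for (int i = start; i < start + count; i++)
        {
            sum += 2.0 * (w * xs[i] - ys[i]) * xs[i];
        }
        return sum / count;
    }

    // Performs one weight update after accumulating gradients over
    // `accumulationSteps` equally sized sub-batches, processed sequentially.
    public static double UpdateWithAccumulation(
        double w, double[] xs, double[] ys, int accumulationSteps, double learningRate)
    {
        if (accumulationSteps <= 0)
            throw new ArgumentOutOfRangeException(nameof(accumulationSteps), "Must be a positive integer.");
        if (xs.Length != ys.Length || xs.Length % accumulationSteps != 0)
            throw new ArgumentException("Inputs must have equal length, divisible by accumulationSteps.");

        int subBatchSize = xs.Length / accumulationSteps;
        double accumulated = 0.0;
        for (int step = 0; step < accumulationSteps; step++)
        {
            // Only one sub-batch is touched per step, which is what keeps memory low.
            accumulated += MeanGradient(w, xs, ys, step * subBatchSize, subBatchSize);
        }

        // Single weight update, using the average of the accumulated gradients.
        return w - learningRate * (accumulated / accumulationSteps);
    }

    public static void Main()
    {
        double[] xs = { 1, 2, 3, 4, 5, 6, 7, 8 };
        double[] ys = { 2, 4, 6, 8, 10, 12, 14, 16 };

        // One step over the full batch versus four accumulation steps of two samples each.
        double fullBatch = UpdateWithAccumulation(0.5, xs, ys, 1, 0.01);
        double accumulated = UpdateWithAccumulation(0.5, xs, ys, 4, 0.01);

        Console.WriteLine($"Full batch:  {fullBatch}");
        Console.WriteLine($"Accumulated: {accumulated}");
        Console.WriteLine($"Match: {Math.Abs(fullBatch - accumulated) < 1e-12}");
    }
}
```

In this sketch, an accumulation step count of 1 reduces to an ordinary update on the whole batch, while larger values perform more sequential passes per weight update, which is the source of the potential slowdown.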