Property GradientAccumulation
Namespace: LMKit.Finetuning
Assembly: LM-Kit.NET.dll
GradientAccumulation
Gets or sets the number of gradient accumulation steps performed before the model weights are updated.
public int GradientAccumulation { get; set; }
Property Value
- int
The number of gradient accumulation steps. Must be a positive integer. The default value is 1.
Remarks
Gradient accumulation is a strategy for training with a large effective batch size when the available memory cannot hold the whole batch at once. The large batch is divided into smaller sub-batches that are processed one after another, and the gradients from each sub-batch are accumulated. The accumulated gradients are then used to update the model weights, which simulates training with the larger batch size.
In effect, gradient accumulation artificially increases the batch size. It provides the quality benefits of larger batches while keeping memory usage lower. Because the sub-batches are processed sequentially, this approach may slow down training, but it is particularly useful when hardware resources are limited.
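The idea described in these remarks can be illustrated with a small, self-contained sketch. It does not use LM-Kit.NET and says nothing about how the library implements accumulation internally. The class `GradientAccumulationSketch`, its methods, and the one-parameter model `y ≈ w * x` with a mean squared error loss are all hypothetical, chosen only to show that averaging the gradients of equally sized sub-batches gives the same gradient as the full batch.

```csharp
using System;

// Illustrative only: hypothetical names, not part of LM-Kit.NET.
public static class GradientAccumulationSketch
{
    // Mean gradient of the squared error (w * x - y)^2 with respect to w,
    // computed over the samples in [start, start + count).
    public static double MeanGradient(double[] x, double[] y, double w, int start, int count)
    {
        double sum = 0.0;
        for (int i = start; i < start + count; i++)
        {
            sum += 2.0 * (w * x[i] - y[i]) * x[i];
        }
        return sum / count;
    }

    // Splits the batch into `accumulationSteps` equally sized sub-batches,
    // accumulates the gradient of each one, and returns their average.
    // Only one sub-batch is processed at a time.
    public static double AccumulatedGradient(double[] x, double[] y, double w, int accumulationSteps)
    {
        if (accumulationSteps <= 0)
            throw new ArgumentOutOfRangeException(nameof(accumulationSteps));
        if (x.Length != y.Length)
            throw new ArgumentException("x and y must have the same length.");
        if (x.Length % accumulationSteps != 0)
            throw new ArgumentException("Batch size must be divisible by the number of steps.");

        int subBatchSize = x.Length / accumulationSteps;
        double accumulated = 0.0;
        for (int step = 0; step < accumulationSteps; step++)
        {
            accumulated += MeanGradient(x, y, w, step * subBatchSize, subBatchSize);
        }
        return accumulated / accumulationSteps;
    }

    // A single weight update applied after all sub-batches have been accumulated.
    public static double UpdateWeight(double w, double gradient, double learningRate)
    {
        return w - learningRate * gradient;
    }
}
```

In this sketch, calling `AccumulatedGradient` with 4 steps on a batch of 8 samples yields the same value as `MeanGradient` over all 8 samples, so the subsequent `UpdateWeight` call is identical in both cases. The cost is that the four sub-batches are evaluated one after another rather than together.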