Property AutoFitToVram
Gets or sets a value indicating whether, when a load fails in a way that looks like an out-of-memory condition, the loader should retry with progressively fewer layers offloaded to the GPU until either the load succeeds or the layer count reaches 0 (i.e. CPU-only).
public bool AutoFitToVram { get; set; }
Property Value
- bool
The default value is true.
Remarks
This is the difference between "the model failed to load" and "the model loaded with as many layers as fit on the GPU and the rest on CPU". To force CPU-only loading explicitly, set GpuLayerCount to 0.
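The retry behavior described above can be pictured with a minimal sketch. This is not the library's implementation: the `AutoFitSketch` class, the `LoadWithAutoFit` method, the `tryLoad` callback, and the `step` size are all hypothetical names and values chosen for illustration, and the real loader may reduce the layer count by a different schedule.

```csharp
using System;

// Hypothetical illustration only; none of these names belong to the real loader.
public static class AutoFitSketch
{
    // Calls tryLoad with the requested GPU layer count. tryLoad returns false
    // when the load fails in a way that looks like an out-of-memory condition.
    // Returns the layer count at which the load succeeded.
    public static int LoadWithAutoFit(
        int requestedGpuLayers,
        bool autoFitToVram,
        Func<int, bool> tryLoad,
        int step = 4) // assumed decrement; the source does not specify one
    {
        if (step <= 0)
            throw new ArgumentOutOfRangeException(nameof(step));

        int layers = Math.Max(requestedGpuLayers, 0);
        while (true)
        {
            if (tryLoad(layers))
                return layers;

            // Without auto-fit, or once CPU-only has also failed, give up.
            if (!autoFitToVram || layers == 0)
                throw new InvalidOperationException(
                    $"Model failed to load with {layers} GPU layers.");

            // Retry with fewer layers on the GPU, never going below 0.
            layers = Math.Max(layers - step, 0);
        }
    }
}
```

For example, with a simulated device that can hold at most 20 layers, `AutoFitSketch.LoadWithAutoFit(32, true, n => n <= 20)` tries 32, 28, and 24 layers before succeeding at 20, whereas passing `false` for `autoFitToVram` throws after the first failed attempt.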