Property MaxCachedContextMemoryRatio
Gets or sets the maximum fraction of a device's total memory that may be
used for cached inference contexts on that device.
This ratio is applied per device: each GPU's budget is
TotalVRAM * MaxCachedContextMemoryRatio, and the CPU budget is
TotalSystemRAM * MaxCachedContextMemoryRatio.
Set to 0 to disable context caching entirely.
Values are clamped to the range [0.0, 1.0].
public static double MaxCachedContextMemoryRatio { get; set; }
Property Value
- double
The default value is 0.15 (15% of total device memory).
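The per-device budget described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `ContextCacheBudget`, `ClampRatio`, and `BudgetBytes` are hypothetical names introduced here, and truncating the result to whole bytes is an assumption.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the documented API.
public static class ContextCacheBudget
{
    // Clamp a requested ratio to [0.0, 1.0], mirroring the documented clamping.
    public static double ClampRatio(double ratio) => Math.Clamp(ratio, 0.0, 1.0);

    // Budget in bytes for one device: total device memory * clamped ratio.
    // Truncation to whole bytes is an assumption of this sketch.
    public static long BudgetBytes(long totalDeviceBytes, double ratio) =>
        (long)(totalDeviceBytes * ClampRatio(ratio));
}
```

Under these assumptions, a GPU with 16 GiB of VRAM and the default ratio of 0.15 would get a cache budget of roughly 2.4 GiB, a ratio of 0 would yield a budget of 0 bytes (caching disabled), and a ratio of 2.0 would be treated as 1.0.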