Property TensorOverrides
Gets or sets a list of tensor buffer type overrides that control which device (CPU or GPU) is used for tensors matching specific regex patterns.
public List<LM.TensorOverride> TensorOverrides { get; set; }
Property Value
List<LM.TensorOverride>
Examples
Example 1: Offload MoE expert weights to CPU
var config = new LM.DeviceConfiguration
{
    GpuLayerCount = int.MaxValue,
    TensorOverrides = new List<LM.TensorOverride>
    {
        // Place tensors whose names match the expert-weight pattern on the CPU.
        LM.TensorOverride.Cpu(@"\.ffn_.*_exps\.weight")
    }
};
LM model = new LM(modelUri, deviceConfiguration: config);
Example 2: Multi-GPU with expert offloading
var config = new LM.DeviceConfiguration
{
    GpuLayerCount = int.MaxValue,
    TensorOverrides = new List<LM.TensorOverride>
    {
        // Pin the attention tensors of blocks 0, 1, and 2 to GPU 0.
        LM.TensorOverride.Gpu(@"blk\.(0|1|2)\.attn", gpuIndex: 0),
        // Place tensors whose names match the expert-weight pattern on the CPU.
        LM.TensorOverride.Cpu(@"\.ffn_.*_exps\.weight"),
    }
};
LM model = new LM(modelUri, deviceConfiguration: config);
Remarks
This property enables fine-grained control over tensor placement. It is particularly useful for Mixture of Experts (MoE) models, where offloading the expert weights to CPU can significantly reduce GPU memory usage while maintaining good performance.
Overrides are applied in order; the first matching pattern wins for each tensor. When two patterns can match the same tensor name, list the one that should take effect first.
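The first-match-wins rule can be illustrated with a small standalone sketch. It does not call the library: the Resolve helper, the device labels ("GPU0", "CPU", "default"), and the tensor names are hypothetical, and it assumes patterns are matched anywhere in the tensor name (search semantics), which this page does not state.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class OverrideOrderDemo
{
    // Hypothetical helper: returns the device of the first rule whose pattern
    // matches the tensor name, or defaultDevice when no rule matches.
    // Assumes search semantics (the pattern may match anywhere in the name).
    public static string Resolve(
        string tensorName,
        IReadOnlyList<(string Pattern, string Device)> rules,
        string defaultDevice = "default")
    {
        foreach (var (pattern, device) in rules)
        {
            if (Regex.IsMatch(tensorName, pattern))
            {
                return device;
            }
        }
        return defaultDevice;
    }

    public static void Main()
    {
        // Both patterns match the illustrative tensor name below,
        // so the order of the list decides the outcome.
        var blockRuleFirst = new List<(string Pattern, string Device)>
        {
            (@"blk\.0\.", "GPU0"),
            (@"\.ffn_.*_exps\.weight", "CPU"),
        };
        var expertRuleFirst = new List<(string Pattern, string Device)>
        {
            (@"\.ffn_.*_exps\.weight", "CPU"),
            (@"blk\.0\.", "GPU0"),
        };

        const string tensor = "blk.0.ffn_gate_exps.weight";
        Console.WriteLine(Resolve(tensor, blockRuleFirst));               // GPU0
        Console.WriteLine(Resolve(tensor, expertRuleFirst));              // CPU
        Console.WriteLine(Resolve("output_norm.weight", blockRuleFirst)); // default
    }
}
```

In this sketch the same two rules yield different placements for the same tensor depending only on their order, which is why the more specific intent should come first in TensorOverrides.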