Class LM.TensorOverride

Namespace: LMKit.Model
Assembly: LM-Kit.NET.dll

Specifies a tensor buffer type override that controls which device (CPU or GPU) holds the tensors whose names match a regex pattern. This gives fine-grained control over tensor placement, which is particularly useful for offloading MoE (Mixture of Experts) expert weights to the CPU while keeping attention layers on the GPU.

public sealed class LM.TensorOverride
Inheritance
LM.TensorOverride

Examples

Example 1: Offload all MoE expert weights to CPU

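using System.Collections.Generic;
using LMKit.Model;

var config = new LM.DeviceConfiguration
{
    // Ask for as many layers as possible on the GPU, then carve out exceptions.
    GpuLayerCount = int.MaxValue,
    TensorOverrides = new List<LM.TensorOverride>
    {
        // Keep every routed expert FFN weight on the CPU.
        LM.TensorOverride.Cpu(@"\.ffn_.*_exps\.weight")
    }
};

// modelUri is assumed to be defined elsewhere.
LM model = new LM(modelUri, deviceConfiguration: config);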

Example 2: Multi-GPU with expert offloading

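var config = new LM.DeviceConfiguration
{
    GpuLayerCount = int.MaxValue,
    TensorOverrides = new List<LM.TensorOverride>
    {
        // Pin the attention tensors of layers 0, 1, and 2 to GPU 0.
        LM.TensorOverride.Gpu(@"blk\.(0|1|2)\.attn", gpuIndex: 0),
        // Keep every routed expert FFN weight on the CPU.
        LM.TensorOverride.Cpu(@"\.ffn_.*_exps\.weight"),
    }
};

// modelUri is assumed to be defined elsewhere.
LM model = new LM(modelUri, deviceConfiguration: config);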

Remarks

The Pattern property accepts a C++ std::regex pattern. It is matched against each tensor name with a substring search, so a match anywhere in the name counts and the pattern does not need to cover the whole name. When several overrides match the same tensor, the first matching override wins.

Common patterns for MoE expert offloading:

  • \.ffn_.*_exps\.weight matches all routed expert FFN weights
  • blk\.(0|1|2)\.ffn_.*_exps matches routed expert FFN tensors in layers 0, 1, and 2 only
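
The matching rules above (unanchored search, first match wins) can be tried out without loading a model. The sketch below is not LM-Kit's implementation. It uses .NET's System.Text.RegularExpressions instead of C++ std::regex; the two dialects differ in places but should agree for simple patterns like these. The Resolve helper and the tensor names are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class OverrideSketch
{
    // Hypothetical helper (not part of LM-Kit): returns the device index of the
    // first rule whose pattern matches anywhere in the tensor name, or null
    // when no rule matches. -1 stands for CPU, mirroring DeviceIndex.
    public static int? Resolve(
        IReadOnlyList<(string Pattern, int DeviceIndex)> rules, string tensorName)
    {
        foreach (var rule in rules)
        {
            // Regex.IsMatch is unanchored: a match anywhere in the input succeeds.
            if (Regex.IsMatch(tensorName, rule.Pattern))
                return rule.DeviceIndex;
        }
        return null;
    }

    public static void Main()
    {
        var rules = new List<(string Pattern, int DeviceIndex)>
        {
            (@"blk\.(0|1|2)\.ffn_.*_exps", 0), // experts of layers 0-2 -> GPU 0
            (@"\.ffn_.*_exps\.weight", -1),    // every other expert   -> CPU
        };

        // Illustrative tensor names; real names depend on the model file.
        string[] names =
        {
            "blk.1.ffn_up_exps.weight", // both rules match, the first wins: 0
            "blk.7.ffn_up_exps.weight", // only the second rule matches: -1
            "blk.7.attn_q.weight",      // no rule matches
        };

        foreach (var name in names)
        {
            int? device = Resolve(rules, name);
            string label = device.HasValue ? device.Value.ToString() : "no override";
            Console.WriteLine(name + " -> " + label);
        }
    }
}
```

Swapping the two rules would send the layer 0-2 experts to the CPU as well, because the broader pattern would then match first. Listing narrow patterns before broad ones avoids that.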

Constructors

TensorOverride(string, int)

Creates a tensor override that places matching tensors on a specific device.

Properties

DeviceIndex

The GPU device index to place matching tensors on, or -1 for CPU.

Pattern

The regex pattern to match against tensor names. Uses C++ std::regex syntax with substring matching (not anchored).

Methods

Cpu(string)

Creates a tensor override that places matching tensors on CPU.

Gpu(string, int)

Creates a tensor override that places matching tensors on a specific GPU.
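
As a rough illustration of how the members on this page relate, the following sketch builds overrides through both factory methods and through the constructor, then prints each one's Pattern and DeviceIndex. It assumes the LM-Kit.NET package is referenced. It also assumes that the constructor takes the pattern followed by the device index, in the same order as Gpu(string, int). This page lists only the parameter types, so treat that ordering as an assumption.

```csharp
using System;
using LMKit.Model;

var overrides = new[]
{
    LM.TensorOverride.Cpu(@"\.ffn_.*_exps\.weight"),
    LM.TensorOverride.Gpu(@"blk\.(0|1|2)\.attn", gpuIndex: 0),
    // Assumed argument order: (pattern, deviceIndex); -1 is documented as CPU.
    new LM.TensorOverride(@"\.ffn_.*_exps\.weight", -1),
};

foreach (var o in overrides)
{
    Console.WriteLine($"{o.Pattern} -> device {o.DeviceIndex}");
}
```

The factory methods are likely the more readable choice in configuration code, since the method name states the target device and no sentinel value is needed for the CPU.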
