Property EnableMultiTokenPrediction

Namespace: LMKit.Model

Assembly: LM-Kit.NET.dll

EnableMultiTokenPrediction

Gets or sets a value indicating whether Multi-Token Prediction (MTP) is enabled for this LM instance.

public bool EnableMultiTokenPrediction { get; set; }

Property Value

bool: true if MTP is enabled for this instance; otherwise, false. The default value is true.

Remarks

When true (the default), the MTP head tensors are loaded into VRAM if the GGUF file declares them, and MTP self-speculative decoding is used at inference time on supported architectures.

When false, the head tensors are skipped at load time, saving from a few hundred MiB to about 1 GiB of VRAM depending on the model, and MTP is unavailable for this instance. HasMultiTokenPrediction returns false even if the file itself declares the heads. The setting is fixed for the lifetime of the LM object; to toggle MTP, load the model again with a different value.
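The load-time semantics described in these remarks can be modeled with a small, self-contained sketch. The SketchModel type and its fileDeclaresMtpHeads parameter are hypothetical stand-ins invented for illustration; they are not LM-Kit.NET types, and the sketch does not show how the real property is passed when a model is loaded. It only mirrors the documented rules: the flag is captured once at load, and HasMultiTokenPrediction is true only when the file declares the heads and they were loaded.

```csharp
using System;

// Hypothetical stand-in for illustration only; this is NOT an LM-Kit.NET type.
sealed class SketchModel
{
    // Whether the (simulated) GGUF file declares MTP head tensors.
    public bool FileDeclaresMtpHeads { get; }

    // Captured once at load time; read-only afterwards in this sketch.
    public bool EnableMultiTokenPrediction { get; }

    // True only when the heads exist in the file AND were loaded.
    public bool HasMultiTokenPrediction =>
        FileDeclaresMtpHeads && EnableMultiTokenPrediction;

    public SketchModel(bool fileDeclaresMtpHeads, bool enableMultiTokenPrediction = true)
    {
        FileDeclaresMtpHeads = fileDeclaresMtpHeads;
        EnableMultiTokenPrediction = enableMultiTokenPrediction;
    }
}

static class Program
{
    static void Main()
    {
        // Default: MTP enabled, heads loaded because the file declares them.
        var withMtp = new SketchModel(fileDeclaresMtpHeads: true);

        // Same file, MTP disabled: heads are skipped at load time.
        var withoutMtp = new SketchModel(fileDeclaresMtpHeads: true,
                                         enableMultiTokenPrediction: false);

        // File without heads: enabling MTP has nothing to load.
        var noHeads = new SketchModel(fileDeclaresMtpHeads: false);

        Console.WriteLine(withMtp.HasMultiTokenPrediction);    // True
        Console.WriteLine(withoutMtp.HasMultiTokenPrediction); // False
        Console.WriteLine(noHeads.HasMultiTokenPrediction);    // False
    }
}
```

In this sketch, switching MTP on or off means constructing a new object, which corresponds to the documented requirement of loading the model again with a different value.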
