Property EnableSpeculativeDecodingDrafts

Namespace
LMKit.Model
Assembly
LM-Kit.NET.dll

EnableSpeculativeDecodingDrafts

Gets or sets a value indicating whether the speculative decoding draft assets packaged with the model are loaded for this LM instance.

public bool EnableSpeculativeDecodingDrafts { get; set; }

Property Value

bool

The default value is true.

Remarks

"Draft assets" covers every source of candidate tokens for draft-and-verify decoding that ships inside the model envelope: Multi-Token Prediction (MTP) head tensors declared by the GGUF, and a draft model shipped inside the LMK archive. When true (the default), any such assets present are loaded into VRAM, and the runtime uses them for speculative decoding on supported architectures.

When false, the draft assets are skipped at load time (saving a few hundred MiB to ~1 GiB of VRAM depending on the model) and speculative decoding from packaged drafts is unavailable for this instance. HasSpeculativeDecodingDrafts then returns false even if the file declares the assets. The decision is permanent for the lifetime of the LM object; load the model again with a different value to change it.

This property does not affect a draft model assigned explicitly through DraftModel, which is configured independently of the model envelope.
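
The load-time semantics described in these remarks can be illustrated with a small, self-contained sketch. It does not call LM-Kit.NET: PackagedModelFile and LoadedModel are hypothetical stand-in types invented for this illustration, and the real loading API may look different. The sketch only models the documented rule that the flag is read once at load time and that HasSpeculativeDecodingDrafts reports false when the flag was off, even if the file declares draft assets.

```csharp
using System;

// Hypothetical stand-in for a model file; NOT an LM-Kit.NET type.
sealed class PackagedModelFile
{
    // True when the file declares MTP head tensors or a packaged draft model.
    public bool DeclaresDraftAssets { get; init; }
}

// Hypothetical stand-in for a loaded model instance; NOT the LM-Kit.NET LM class.
sealed class LoadedModel
{
    // Fixed at construction: mirrors the documented rule that the decision
    // is permanent for the lifetime of the loaded object.
    public bool HasSpeculativeDecodingDrafts { get; }

    public LoadedModel(PackagedModelFile file, bool enableSpeculativeDecodingDrafts)
    {
        // Drafts are available only if they were enabled at load time
        // AND the file actually declares them.
        HasSpeculativeDecodingDrafts =
            enableSpeculativeDecodingDrafts && file.DeclaresDraftAssets;
    }
}

static class Program
{
    static void Main()
    {
        var file = new PackagedModelFile { DeclaresDraftAssets = true };

        // Default behavior: packaged drafts are loaded when present.
        var withDrafts = new LoadedModel(file, enableSpeculativeDecodingDrafts: true);

        // Drafts skipped at load time to save VRAM.
        var withoutDrafts = new LoadedModel(file, enableSpeculativeDecodingDrafts: false);

        Console.WriteLine(withDrafts.HasSpeculativeDecodingDrafts);    // True
        Console.WriteLine(withoutDrafts.HasSpeculativeDecodingDrafts); // False
    }
}
```

In this sketch, changing the outcome requires constructing a new LoadedModel, which corresponds to loading the model again with a different value.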
