Property DraftModelSizeBytes
DraftModelSizeBytes
Gets the weight size, in bytes, of the attached speculative-decoding draft model,
or 0 when no separate draft model is attached. The draft model's weights are
not counted in this model's Size, so they are reported separately to make the
draft's contribution to the total memory footprint visible. With in-model
Multi-Token Prediction (self-speculation) there are no separate weights, so this
property returns 0 even though speculative decoding is still enabled (see
HasSpeculativeDecodingDrafts).
public long DraftModelSizeBytes { get; }
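The following is a minimal usage sketch, not part of the library. It assumes a hypothetical IModelInfo interface that stands in for the documented type, that Size is a long measured in bytes, and that HasSpeculativeDecodingDrafts is a bool. Only DraftModelSizeBytes and its type are taken from this page. It shows how the draft weights might be added to the main model's size, and how a zero value can be told apart from "no speculative decoding at all".

```csharp
using System;

// Hypothetical stand-in for the documented model type. Only the three members
// named on this page are modelled; the types of Size and
// HasSpeculativeDecodingDrafts are assumptions.
public interface IModelInfo
{
    long Size { get; }                          // assumed: main model weights, in bytes
    long DraftModelSizeBytes { get; }           // documented: 0 when no separate draft model
    bool HasSpeculativeDecodingDrafts { get; }  // assumed: bool
}

public static class FootprintReport
{
    // Sums the main model's weights and the separately reported draft weights.
    // checked() raises OverflowException instead of silently wrapping.
    public static long TotalWeightBytes(IModelInfo model) =>
        checked(model.Size + model.DraftModelSizeBytes);

    // A zero draft size does not by itself mean speculation is off:
    // in-model Multi-Token Prediction also reports 0.
    public static string DescribeDraft(IModelInfo model)
    {
        if (model.DraftModelSizeBytes > 0)
            return "separate draft model";

        return model.HasSpeculativeDecodingDrafts
            ? "in-model speculation (no separate weights)"
            : "no speculative decoding";
    }
}
```

Checking HasSpeculativeDecodingDrafts alongside the size avoids misreading a self-speculating model as one without speculative decoding.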