Property DraftMemorySize
DraftMemorySize
Gets the size, in bytes, of the speculative-decoding draft context (Multi-Token
Prediction or attached draft model) bound to this session as a sibling of the main
context. The value covers the draft's own compute buffers, plus its own KV-cache
when it keeps one. It is reported separately from MemorySize so that the draft's
footprint is visible on its own. When the draft shares the main context's KV-cache
(an attached assistant draft linked through the target), that shared cache belongs
to the main context and is counted in MemorySize, not here, so the two values never
overlap. Returns 0 when the session has no draft context or when the context is
hibernated.
public long DraftMemorySize { get; }
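
Because MemorySize and DraftMemorySize never count the same bytes, adding them gives a combined footprint without double counting. The following is a minimal sketch of that arithmetic, not the library's own code: ISessionMemory, FakeSession, and MemoryReport are hypothetical names introduced here for illustration, and only the two property names come from this page.

```csharp
using System;

// Hypothetical stand-in for the session type (an assumption for this sketch);
// only the property names MemorySize and DraftMemorySize come from this page.
public interface ISessionMemory
{
    long MemorySize { get; }
    long DraftMemorySize { get; }
}

// Hypothetical fake session with fixed values, so the sketch needs no model.
public sealed class FakeSession : ISessionMemory
{
    public FakeSession(long memorySize, long draftMemorySize)
    {
        MemorySize = memorySize;
        DraftMemorySize = draftMemorySize;
    }

    public long MemorySize { get; }
    public long DraftMemorySize { get; }
}

public static class MemoryReport
{
    // The two properties are documented as non-overlapping, so a plain sum
    // does not count a shared KV-cache twice.
    public static long TotalBytes(ISessionMemory session) =>
        session.MemorySize + session.DraftMemorySize;

    // A zero draft size means no draft context, or a hibernated context.
    public static string Describe(ISessionMemory session)
    {
        string draft = session.DraftMemorySize == 0
            ? "no draft context (or hibernated)"
            : $"draft {session.DraftMemorySize} B";
        return $"main {session.MemorySize} B, {draft}, total {TotalBytes(session)} B";
    }
}
```

With a real session, the same two property reads would replace FakeSession. The zero check matters because 0 is the documented result both when no draft context exists and when the context is hibernated, so the value alone does not distinguish the two cases.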