Enum HibernationMode
Controls when the runtime hibernates an inference context at the
end of a conversation turn. The owning conversation
(e.g. MultiTurnConversation, PdfChat) sets the mode through the
HibernationMode property, and the inference backend consults it
every time a usage lock is released.
public enum HibernationMode
Fields
Auto = 0
Default. At the end of a turn the runtime hibernates the context only when the device is over its cached-context memory budget (MaxCachedContextMemoryRatio); otherwise the context stays resident and is marked eligible for the runtime to reclaim later under memory pressure. This keeps hot sessions fast while letting the runtime free memory automatically when it runs short.
None = 1
No explicit hibernation. The runtime keeps the context resident in memory between turns; callers may still trigger hibernation manually via HibernateAsync(string).
Forced = 2
At the end of every conversation turn, the runtime schedules HibernateAsync(string) as a fire-and-forget background task: the caller does not wait for the KV-cache to land on disk, and the next turn rehydrates the context transparently from the on-disk image.
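Remarks
The three modes can be read as a small decision taken each time a usage lock is released. The sketch below is illustrative only and is not the library's implementation: EndOfTurnAction, EndOfTurnPolicy, and Decide are hypothetical names, and the comparison of a current memory ratio against a maximum ratio is an assumption about how the MaxCachedContextMemoryRatio budget might be checked. The enum is redeclared locally so the example compiles on its own.

```csharp
using System;

// Redeclared here with the documented values so the sketch is self-contained.
public enum HibernationMode { Auto = 0, None = 1, Forced = 2 }

// Hypothetical: what a backend could do with a context at the end of a turn.
public enum EndOfTurnAction
{
    KeepResident,            // stay in memory, no reclaim hint
    KeepResidentReclaimable, // stay in memory, eligible for reclaim under pressure
    Hibernate                // write the context to disk
}

// Hypothetical helper, not part of the documented API.
public static class EndOfTurnPolicy
{
    // currentRatio: assumed share of memory currently used by cached contexts.
    // maxRatio: assumed to correspond to MaxCachedContextMemoryRatio.
    public static EndOfTurnAction Decide(HibernationMode mode, double currentRatio, double maxRatio)
    {
        switch (mode)
        {
            case HibernationMode.Forced:
                // Hibernate after every turn (scheduled without waiting, per the docs).
                return EndOfTurnAction.Hibernate;

            case HibernationMode.None:
                // Never hibernate automatically; manual hibernation stays possible.
                return EndOfTurnAction.KeepResident;

            case HibernationMode.Auto:
                // Hibernate only when over budget; otherwise keep the context hot
                // but mark it reclaimable.
                return currentRatio > maxRatio
                    ? EndOfTurnAction.Hibernate
                    : EndOfTurnAction.KeepResidentReclaimable;

            default:
                throw new ArgumentOutOfRangeException(nameof(mode));
        }
    }
}
```

Under these assumptions, Decide(HibernationMode.Auto, 0.9, 0.5) yields Hibernate because the budget is exceeded, while Decide(HibernationMode.Auto, 0.2, 0.5) yields KeepResidentReclaimable. Forced and None ignore the ratios entirely, which matches the descriptions above: Forced trades rehydration cost on the next turn for a predictable memory footprint, and None keeps turns fast at the cost of holding memory.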