Method ReleaseIdlePooledContexts
ReleaseIdlePooledContexts(out long)
Disposes every inference context sitting idle in the process-wide recycle pool, returns the number released, and reports the resident KV-cache bytes they held through reclaimedBytes. Contexts currently in use by a live session are never touched, so this is safe to call on a busy process: the only cost is that the next request for a recycled context allocates a fresh one. Use it to reclaim KV-cache memory on demand.
public static int ReleaseIdlePooledContexts(out long reclaimedBytes)
Parameters
- reclaimedBytes long
Receives the resident KV-cache bytes the released contexts accounted for.
Returns
- int
The number of idle contexts released.
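Example
The call pattern described above can be sketched as follows. This entry does not name the declaring type, so the sketch uses a hypothetical stand-in class, ContextPoolStub, with a simulated idle pool and a hypothetical AddIdle helper. Only the method name and signature come from this page; the stub body illustrates the documented contract and is not the library's implementation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the type that declares ReleaseIdlePooledContexts.
// The real declaring type is not named in this reference entry.
public static class ContextPoolStub
{
    // Simulated idle pool: each entry is the resident KV-cache size, in bytes,
    // of one idle context. In-use contexts are simply never stored here.
    private static readonly List<long> IdleContextBytes = new List<long>();

    // Hypothetical helper used only to populate the simulated pool.
    public static void AddIdle(long kvCacheBytes) => IdleContextBytes.Add(kvCacheBytes);

    // Same signature as the documented method.
    public static int ReleaseIdlePooledContexts(out long reclaimedBytes)
    {
        reclaimedBytes = 0;
        foreach (long bytes in IdleContextBytes)
        {
            reclaimedBytes += bytes;
        }

        int released = IdleContextBytes.Count;
        IdleContextBytes.Clear();
        return released;
    }
}

public static class Program
{
    public static void Main()
    {
        // Pretend two contexts were returned to the pool earlier.
        ContextPoolStub.AddIdle(64L * 1024 * 1024);
        ContextPoolStub.AddIdle(128L * 1024 * 1024);

        // The return value is the context count; the out parameter carries the bytes.
        int released = ContextPoolStub.ReleaseIdlePooledContexts(out long reclaimedBytes);
        Console.WriteLine($"Released {released} idle contexts ({reclaimedBytes / (1024 * 1024)} MiB).");

        // In this stub, a second call finds the pool empty and reports zero for both values.
        released = ContextPoolStub.ReleaseIdlePooledContexts(out reclaimedBytes);
        Console.WriteLine($"Released {released} idle contexts ({reclaimedBytes} bytes).");
    }
}
```

Logging both values, as done here, gives a simple way to see how much memory each on-demand release actually recovered.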