
Class MultiTurnConversation

Namespace: LMKit.TextGeneration
Assembly: LM-Kit.NET.dll

A class specifically designed to handle multi-turn question-answering scenarios. This class maintains a conversation session, allowing for multi-turn interactions with a language model. Typical usage includes creating an instance with a language model, optionally specifying a context size, then repeatedly calling Submit(Prompt, CancellationToken) or SubmitAsync(Prompt, CancellationToken) with user prompts. The ChatHistory retains the conversation's state across multiple turns.

// Example usage:
var model = new LM("path/to/model.bin", 
                   new LM.LoadingOptions { LoadTensors = true });
using var conversation = new MultiTurnConversation(model);

// Optionally override the system prompt before the first user message
conversation.SystemPrompt = "You are a helpful assistant.";

// Submit user messages
var result = conversation.Submit("Hello! How can I use this library?");
Console.WriteLine(result.Content);

// Regenerate the response if needed
var regeneratedResult = conversation.RegenerateResponse();
Console.WriteLine(regeneratedResult.Content);

public sealed class MultiTurnConversation : IConversation, ITextGenerationSettings, IDisposable

Inheritance
object → MultiTurnConversation

Implements
IConversation, ITextGenerationSettings, IDisposable

Constructors

MultiTurnConversation(LM, ChatHistory, int, ITextGenerationSettings)

Creates a new conversation with a specified model and an existing ChatHistory.

MultiTurnConversation(LM, byte[])

Restores a previous conversation session from a byte array.

MultiTurnConversation(LM, int)

Creates a new conversation instance, optionally specifying a contextSize.

MultiTurnConversation(LM, string)

Restores a previous conversation session from a file.

Properties

ChatHistory

Gets the complete history of the chat session.

ContextRemainingSpace

Gets the current number of tokens that can still fit into the model's context before it reaches its maximum capacity.

ContextSize

Specifies the total size of the model's context (in tokens) associated with this instance.

Grammar

Gets or sets the Grammar object used to enforce grammatical rules during generation. When Grammar is set to a non-null value, repetition penalties are disabled by default. This prevents conflicts between grammar enforcement and repetition control. If needed, you can manually re-enable repetition penalties after setting the Grammar property.

InferencePolicies

Governs various inference operations such as handling input length overflow.

LogitBias

A LogitBias object for adjusting the likelihood of specific tokens during text generation.

MaximumCompletionTokens

Defines the maximum number of tokens permitted for text completion or generation.

By default, this value is 512 tokens. Setting it to -1 disables the token limit.

MaximumRecallTokens

Gets or sets the maximum number of tokens that can be recalled from memory when Memory is set.

This property is only applicable if an AgentMemory instance is assigned. The value specifies the maximum token length of content retrieved from memory to enhance conversation context. It is automatically capped at half of the overall context size.

The default value is ContextSize / 4.

Memory

Gets or sets the AgentMemory instance used to persist and recall additional context.

When assigned, this memory instance is used to search for relevant text partitions that can be injected into the conversation to enhance the model's responses. In particular, during the generation process, if memory is available and not empty, the conversation attempts to retrieve and integrate context from memory.
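
A minimal sketch of attaching a memory store. How an AgentMemory instance is built is not covered on this page, so it is assumed to exist already; only the Memory and MaximumRecallTokens assignments below are documented members:

```csharp
// Attach a previously constructed AgentMemory so relevant text partitions
// can be recalled into the context during generation.
conversation.Memory = agentMemory;

// Optional: tune how much recalled content may be injected.
// Defaults to ContextSize / 4 and is capped at ContextSize / 2.
conversation.MaximumRecallTokens = conversation.ContextSize / 4;

var answer = conversation.Submit("What did we decide about the release date?");
```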

Model

Gets the LM model instance associated with this conversation.

RepetitionPenalty

A RepetitionPenalty object specifying repetition penalty rules during text completion.

SamplingMode

A TokenSampling object specifying the sampling strategy for text completion.

StopSequences

Specifies a set of sequences for which generation should stop immediately. Any tokens that match these sequences are not included in the final output.
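
A brief sketch of configuring generation limits and stop sequences. Whether StopSequences is a mutable collection or an assignable list is not specified on this page, so the Add call below is an assumption:

```csharp
// Cap each completion and stop early on a custom marker.
conversation.MaximumCompletionTokens = 1024; // default is 512; -1 disables the limit
conversation.StopSequences.Add("\nUser:");   // assumption: StopSequences is a mutable collection
```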

SystemPrompt

Specifies the system prompt applied to the model before forwarding the user's requests.

The default value is "You are a chatbot that always responds promptly and helpfully to user requests."

After the initial user interaction, this property becomes immutable and cannot be altered within the same chat session. If you need a different system prompt, create a new MultiTurnConversation instance.

Methods

ClearHistory()

Resets the conversation history and clears the context state, effectively starting a new session.

ContinueLastAssistantResponse(CancellationToken)

Continues generating additional tokens for the last assistant response without any new user input.

ContinueLastAssistantResponseAsync(CancellationToken)

Asynchronously continues generating additional tokens for the last assistant's response.
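
As a sketch, ContinueLastAssistantResponse(CancellationToken) pairs naturally with a low MaximumCompletionTokens value; the Content property on the returned result follows the example at the top of this page:

```csharp
// If a completion was truncated by MaximumCompletionTokens, resume it
// without submitting any new user input.
conversation.MaximumCompletionTokens = 64;
var first = conversation.Submit("Explain attention mechanisms in detail.");
var continuation = conversation.ContinueLastAssistantResponse(CancellationToken.None);
Console.WriteLine(first.Content + continuation.Content);
```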

Dispose()

Disposes this conversation instance, releasing both managed and unmanaged resources.

~MultiTurnConversation()

Finalizer to ensure unmanaged resources are released if Dispose() is not called.

RegenerateResponse(CancellationToken)

Regenerates a response to the most recent user inquiry. The rest of the chat history is left intact; only the previous answer is replaced with the new one.

RegenerateResponseAsync(CancellationToken)

Asynchronously regenerates a response to the most recent user inquiry. This does not remove the previous answer from the history; it simply replaces it with a new one.

SaveSession()

Saves the current chat session state and returns it as a byte array.

SaveSession(string)

Saves the current chat session state to a specified file.
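
The two SaveSession overloads mirror the two restore constructors listed above. A sketch of a save/restore round trip:

```csharp
// Persist the session to disk, then restore it later with the
// MultiTurnConversation(LM, string) constructor.
conversation.SaveSession("session.bin");
using var restored = new MultiTurnConversation(model, "session.bin");

// Or round-trip through a byte array instead of a file.
byte[] state = conversation.SaveSession();
using var restoredFromBytes = new MultiTurnConversation(model, state);
```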

Submit(Prompt, CancellationToken)

Submits a Prompt object (containing text and/or attachments) to the model, synchronously.

Submit(string, CancellationToken)

Submits a user prompt (string) to the model for text generation, synchronously.

SubmitAsync(Prompt, CancellationToken)

Submits a Prompt object (containing text and/or attachments) to the model, asynchronously.

SubmitAsync(string, CancellationToken)

Submits a user prompt (string) to the model for text generation, asynchronously.
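
A sketch of asynchronous submission with a timeout. That cancellation surfaces as an OperationCanceledException is an assumption based on standard .NET conventions, not something this page states:

```csharp
// Cancel generation automatically after 30 seconds.
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));
try
{
    var result = await conversation.SubmitAsync("Summarize our discussion.", cts.Token);
    Console.WriteLine(result.Content);
}
catch (OperationCanceledException)
{
    Console.WriteLine("Generation timed out.");
}
```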

Events

AfterTextCompletion

Triggered right after the completion of a text generation operation.

AfterTokenSampling

Triggered just after the generation of a token. Allows modifications to the token selection process via AfterTokenSamplingEventArgs.

BeforeTokenSampling

Triggered just before the generation of a token. Allows precise adjustments to the token sampling process via BeforeTokenSamplingEventArgs.
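
As an illustrative sketch, a handler can be attached to AfterTextCompletion; the member names on its event args (e.Text below) are assumptions, since this page does not list them:

```csharp
// Log each finished generation. 'e.Text' is an assumed property name.
conversation.AfterTextCompletion += (sender, e) =>
{
    Console.WriteLine($"Completed generation: {e.Text}");
};
```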

MemoryRecall

Occurs when a memory partition is recalled to augment the conversation.

This event is raised during the processing of a chat message when memory partitions are retrieved from the AgentMemory. Subscribers can inspect details such as the memory collection, the text content, and the memory identifier. They may also cancel the memory injection by setting Cancel to true.
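
A sketch of inspecting and optionally vetoing a recall. The property names below (Collection, Text, Cancel) are inferred from the description above and should be checked against the actual event args type:

```csharp
// Observe memory injections and skip unwanted partitions.
conversation.MemoryRecall += (sender, e) =>
{
    Console.WriteLine($"Recalled partition from collection: {e.Collection}");
    if (e.Text.Contains("confidential"))
    {
        e.Cancel = true; // veto injecting this partition
    }
};
```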