Namespace LMKit.TextGeneration

Namespaces

LMKit.TextGeneration.Chat
LMKit.TextGeneration.Events
LMKit.TextGeneration.Sampling

Classes

MultiTurnConversation

High-level, production-ready conversation runtime for multi-turn chat.

MultiTurnConversation wraps a local language model and maintains the running conversation state (messages, system prompt, tool-calls, memory recall, etc.). It exposes a compact API to:

  • submit user prompts (sync/async),
  • regenerate or continue the last assistant answer,
  • register model-callable tools and control per-turn tool policy,
  • inject long-term AgentMemory and cap recall tokens,
  • configure sampling (temperature/top-p/etc.) and repetition penalties (a sampling and async sketch follows the usage example below),
  • enforce structure with Grammar (mutually exclusive with tools),
  • and persist/restore full chat sessions.

Threading model: generation operations are serialized internally so only one call runs at a time. Create one instance per independent conversation. Share the underlying LM across conversations if desired.

Typical usage:

// Load a model (ensure tensors/weights are loaded).
var lm = new LM("path/to/model.gguf", new LM.LoadingOptions { LoadTensors = true });

// Create a conversation with default settings.
using var chat = new MultiTurnConversation(lm);

// (Optional) set a system prompt before the first user message.
chat.SystemPrompt = "You are a concise technical assistant.";

// (Optional) register tools before first turn if your model supports tool-calls.
if (chat.Model.HasToolCalls)
{
    chat.Tools.Register(new WebSearchTool());
    chat.ToolPolicy.Choice = ToolChoice.Auto; // let the model decide
}

// Submit a user message
var result = chat.Submit("How do I stream tokens from this API?");
Console.WriteLine(result.Content);

// Regenerate a different answer for the same user turn
var alt = chat.RegenerateResponse();
Console.WriteLine(alt.Content);

// Save the entire session for later
chat.SaveSession("session.bin");
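
Sampling and asynchronous submission, as referenced in the bullet list above, sketched under assumptions: the SamplingMode property, the RandomSampling type (Temperature/TopP) from LMKit.TextGeneration.Sampling, and a SubmitAsync(string, CancellationToken) overload are plausible member names, not verified signatures — check the class reference before relying on them.

// Configure sampling (member names assumed; see the note above).
chat.SamplingMode = new RandomSampling
{
    Temperature = 0.7f,  // lower values give more deterministic answers
    TopP = 0.9f          // nucleus sampling cutoff
};

// Submit asynchronously with a timeout; cancelling the token stops generation early.
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(60));
var asyncResult = await chat.SubmitAsync("How do I stream tokens from this API?", cts.Token);
Console.WriteLine(asyncResult.Content);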

SingleTurnConversation

A class designed for single-turn question answering.
Unlike MultiTurnConversation, it does not preserve context between questions and answers.
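
A minimal sketch, assuming the constructor accepts the same loaded LM instance and that Submit returns a TextGenerationResult exposing Content, mirroring the MultiTurnConversation example above:

// Each Submit call stands alone; no context is carried from one question to the next.
var qa = new SingleTurnConversation(lm);

var answer = qa.Submit("What does top-p sampling control?");
Console.WriteLine(answer.Content);

// Answered without any memory of the previous question.
var another = qa.Submit("And when is greedy decoding preferable?");
Console.WriteLine(another.Content);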

Summarizer

Generates a summary (title and/or content) from input text or an image using a language model. Use Summarize(string, CancellationToken) or SummarizeAsync(string, CancellationToken) for text input, and Summarize(Attachment, CancellationToken) or SummarizeAsync(Attachment, CancellationToken) for image input.
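
A minimal text-input sketch using the Summarize(string, CancellationToken) overload listed above; the constructor argument, the return type, and the Title/Content property names are assumptions (see Summarizer.SummarizerResult below):

// Summarize a plain-text document; inputText is a placeholder for any string you provide.
var summarizer = new Summarizer(lm); // constructor taking the loaded model is assumed

using var cts = new CancellationTokenSource(TimeSpan.FromMinutes(2));
var summary = summarizer.Summarize(inputText, cts.Token);

// The result carries both a title and the summarized content (property names assumed).
Console.WriteLine(summary.Title);
Console.WriteLine(summary.Content);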

Summarizer.SummarizerResult

Represents the result of a summarization operation, including both a title and summarized content.

TextGenerationResult

Holds the result of a text completion operation.

Interfaces

IConversation

Represents a conversation interface for interacting with a text generation model. Provides methods for submitting prompts synchronously and asynchronously, and supports event handling before and after token sampling, as well as after text completion.
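
For example, a streaming sketch that assumes MultiTurnConversation implements IConversation, that the asynchronous submit method is named SubmitAsync, and that the after-completion event (here AfterTextCompletion, with an e.Text payload) comes from LMKit.TextGeneration.Events — all names to verify against the Events namespace:

// Work against the interface; the implementing type is assumed.
using var streamingChat = new MultiTurnConversation(lm);
IConversation conversation = streamingChat;

// Print text chunks as they are produced (event name and argument property assumed).
conversation.AfterTextCompletion += (sender, e) => Console.Write(e.Text);

await conversation.SubmitAsync("Explain what this event reports.", CancellationToken.None);
Console.WriteLine(); // end the streamed line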

ITextGenerationSettings

Represents the settings used to control text generation behavior. This includes specifying the sampling strategy, repetition penalties, stop sequences, and optional grammar enforcement for structured and controlled output.
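
Continuing the MultiTurnConversation example above, an illustrative sketch of that settings surface; every member name below (MaximumCompletionTokens, StopSequences, RepetitionPenalty, Grammar) is an assumption about how these settings are exposed, not a verified signature:

// Illustrative only: member names are assumptions.
chat.MaximumCompletionTokens = 512;           // bound the completion length
chat.StopSequences.Add("<|end|>");            // stop generation when a custom sequence appears
chat.RepetitionPenalty.RepeatPenalty = 1.1f;  // penalize verbatim repetition
// chat.Grammar = myJsonGrammar;              // hypothetical Grammar instance; mutually exclusive with tools (see MultiTurnConversation)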

Enums

Language

Defines supported languages.

Summarizer.OverflowResolutionStrategy

Specifies the strategies available for handling scenarios where the combined length of the input text and the anticipated completion tokens exceeds the configured MaximumContextLength.

Summarizer.SummarizationIntent

Defines the type of summarization intent to apply when processing a given input.

TextGenerationResult.StopReason

Enumerates the various reasons that can lead to the termination of a text completion task.