Namespace LMKit.TextGeneration

Namespaces

LMKit.TextGeneration.Chat
LMKit.TextGeneration.Events
LMKit.TextGeneration.Sampling

Classes

MultiTurnConversation

High-level, production-ready conversation runtime for multi-turn chat.

MultiTurnConversation wraps a local language model and maintains the running conversation state (messages, system prompt, tool-calls, memory recall, etc.). It exposes a compact API to:

  • submit user prompts (sync/async),
  • regenerate or continue the last assistant answer,
  • register model-callable tools and control per-turn tool policy,
  • inject long-term AgentMemory and cap recall tokens,
  • configure sampling (temperature/top-p/etc.) and repetition penalties (a sampling and async sketch follows the usage example below),
  • enforce structure with Grammar (mutually exclusive with tools),
  • and persist/restore full chat sessions.

Threading model: generation operations are serialized internally so only one call runs at a time. Create one instance per independent conversation. Share the underlying LM across conversations if desired.

Typical usage:

// Load a model (ensure tensors/weights are loaded).
var lm = new LM("path/to/model.gguf", new LM.LoadingOptions { LoadTensors = true });

// Create a conversation with default settings.
using var chat = new MultiTurnConversation(lm);

// (Optional) set a system prompt before the first user message.
chat.SystemPrompt = "You are a concise technical assistant.";

// (Optional) register tools before first turn if your model supports tool-calls.
if (chat.Model.HasToolCalls)
{
    chat.Tools.Register(new WebSearchTool());
    chat.ToolPolicy.Choice = ToolChoice.Auto; // let the model decide
}

// Submit a user message
var result = chat.Submit("How do I stream tokens from this API?");
Console.WriteLine(result.Content);

// Regenerate a different answer for the same user turn
var alt = chat.RegenerateResponse();
Console.WriteLine(alt.Content);

// Save the entire session for later
chat.SaveSession("session.bin");
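
Sampling and asynchronous submission, as referenced in the bullet list above, sketched under assumptions: the SamplingMode property, the RandomSampling type (Temperature/TopP) from LMKit.TextGeneration.Sampling, and a SubmitAsync(string, CancellationToken) overload are plausible member names, not verified signatures — check the class reference before relying on them.

// Configure sampling (member names assumed; see the note above).
chat.SamplingMode = new RandomSampling
{
    Temperature = 0.7f,  // lower values give more deterministic answers
    TopP = 0.9f          // nucleus sampling cutoff
};

// Submit asynchronously with a timeout; cancelling the token stops generation early.
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(60));
var asyncResult = await chat.SubmitAsync("How do I stream tokens from this API?", cts.Token);
Console.WriteLine(asyncResult.Content);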

SingleTurnConversation

A class designed for single-turn question answering.
Unlike MultiTurnConversation, it does not preserve context between questions and answers.
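
A minimal sketch, assuming the constructor accepts the same loaded LM instance and that Submit returns a TextGenerationResult exposing Content, mirroring the MultiTurnConversation example above:

// Each Submit call stands alone; no context is carried from one question to the next.
var qa = new SingleTurnConversation(lm);

var answer = qa.Submit("What does top-p sampling control?");
Console.WriteLine(answer.Content);

// Answered without any memory of the previous question.
var another = qa.Submit("And when is greedy decoding preferable?");
Console.WriteLine(another.Content);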

Summarizer

Generates a summary (title and/or content) from input text or an image using a language model. Use Summarize(string, CancellationToken) or SummarizeAsync(string, CancellationToken) for text input, and Summarize(Attachment, CancellationToken) or SummarizeAsync(Attachment, CancellationToken) for image input.
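
A minimal text-input sketch using the Summarize(string, CancellationToken) overload listed above; the constructor argument, the return type, and the Title/Content property names are assumptions (see Summarizer.SummarizerResult below):

// Summarize a plain-text document; inputText is a placeholder for any string you provide.
var summarizer = new Summarizer(lm); // constructor taking the loaded model is assumed

using var cts = new CancellationTokenSource(TimeSpan.FromMinutes(2));
var summary = summarizer.Summarize(inputText, cts.Token);

// The result carries both a title and the summarized content (property names assumed).
Console.WriteLine(summary.Title);
Console.WriteLine(summary.Content);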

Summarizer.SummarizerResult

Represents the result of a summarization operation, including both a title and summarized content.

TextGenerationResult

Holds the result of a text completion operation.

Interfaces

IConversation

Represents a conversation interface for interacting with a text generation model. Provides methods for submitting prompts synchronously and asynchronously, and supports event handling before and after token sampling, as well as after text completion.
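
For example, a streaming sketch that assumes MultiTurnConversation implements IConversation, that the asynchronous submit method is named SubmitAsync, and that the after-completion event (here AfterTextCompletion, with an e.Text payload) comes from LMKit.TextGeneration.Events — all names to verify against the Events namespace:

// Work against the interface; the implementing type is assumed.
using var streamingChat = new MultiTurnConversation(lm);
IConversation conversation = streamingChat;

// Print text chunks as they are produced (event name and argument property assumed).
conversation.AfterTextCompletion += (sender, e) => Console.Write(e.Text);

await conversation.SubmitAsync("Explain what this event reports.", CancellationToken.None);
Console.WriteLine(); // end the streamed line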

ITextGenerationSettings

Represents the settings used to control text generation behavior. This includes specifying the sampling strategy, repetition penalties, stop sequences, and optional grammar enforcement for structured and controlled output.
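
Continuing the MultiTurnConversation example above, an illustrative sketch of that settings surface; every member name below (MaximumCompletionTokens, StopSequences, RepetitionPenalty, Grammar) is an assumption about how these settings are exposed, not a verified signature:

// Illustrative only: member names are assumptions.
chat.MaximumCompletionTokens = 512;           // bound the completion length
chat.StopSequences.Add("<|end|>");            // stop generation when a custom sequence appears
chat.RepetitionPenalty.RepeatPenalty = 1.1f;  // penalize verbatim repetition
// chat.Grammar = myJsonGrammar;              // hypothetical Grammar instance; mutually exclusive with tools (see MultiTurnConversation)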

Enums

Language

Defines supported languages.

Summarizer.OverflowResolutionStrategy

Specifies the strategies available for handling scenarios where the combined length of the input text and the anticipated completion tokens exceeds the configured MaximumContextLength.

Summarizer.SummarizationIntent

Defines the type of summarization intent to apply when processing a given input.

TextGenerationResult.StopReason

Enumerates the various reasons that can lead to the termination of a text completion task.