Build a Conversational Assistant with Memory

A useful assistant remembers what you told it: your preferences, project details, and past decisions. LM-Kit.NET's MultiTurnConversation class maintains chat history across turns, and AgentMemory persists knowledge across sessions. This tutorial builds a conversational assistant that streams responses, saves sessions, and recalls information from previous conversations.


Why Local Conversational Assistants Matter

Two enterprise problems that on-device assistants solve:

  1. Conversation history stays on-premises. Multi-turn assistants accumulate detailed context about users, projects, and business processes. With a local model, that context never leaves your infrastructure, making it safe for internal tools that handle sensitive projects.
  2. Predictable latency and availability. A local assistant responds in consistent time regardless of API traffic, outages, or rate limits. This is critical for real-time tools where users wait for each response.

Prerequisites

Requirement | Minimum
.NET SDK    | 8.0+
VRAM        | 4+ GB
Disk        | ~3 GB free for model download
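
Before downloading the model, a small preflight check can catch an undersized disk early. The sketch below uses only the .NET base class library. `HasEnoughDisk` is a hypothetical helper, and it assumes the model is downloaded to the drive that holds the current directory, which may not match where the model is actually cached. It does not check VRAM, and `Environment.Version` reports the running runtime rather than the installed SDK.

```csharp
using System;
using System.IO;

// Hypothetical helper: pure threshold check, kept separate so it is easy to test.
static bool HasEnoughDisk(long availableBytes, double requiredGb) =>
    availableBytes >= requiredGb * 1024 * 1024 * 1024;

// Environment.Version is the running .NET runtime version, not the SDK version.
Console.WriteLine($".NET runtime: {Environment.Version}");

// Free space on the drive that holds the current directory.
// Assumption: the model download lands on this drive.
string root = Path.GetPathRoot(Path.GetFullPath("."))!;
long freeBytes = new DriveInfo(root).AvailableFreeSpace;
Console.WriteLine(HasEnoughDisk(freeBytes, 3) ? "Disk: OK" : "Disk: less than ~3 GB free");
```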

Step 1: Create the Project

dotnet new console -n AssistantQuickstart
cd AssistantQuickstart
dotnet add package LM-Kit.NET

Step 2: Basic Multi-Turn Chat

using System.Text;
using LMKit.Model;
using LMKit.TextGeneration;

LMKit.Licensing.LicenseManager.SetLicenseKey("");

Console.InputEncoding = Encoding.UTF8;
Console.OutputEncoding = Encoding.UTF8;

// ──────────────────────────────────────
// 1. Load model
// ──────────────────────────────────────
Console.WriteLine("Loading model...");
using LM model = LM.LoadFromModelID("gemma3:4b",
    downloadingProgress: (_, len, read) =>
    {
        if (len.HasValue) Console.Write($"\r  Downloading: {(double)read / len.Value * 100:F1}%   ");
        return true;
    },
    loadingProgress: p => { Console.Write($"\r  Loading: {p * 100:F0}%   "); return true; });
Console.WriteLine("\n");

// ──────────────────────────────────────
// 2. Create conversation with streaming
// ──────────────────────────────────────
var chat = new MultiTurnConversation(model)
{
    SystemPrompt = "You are a helpful coding assistant. " +
        "Give concise, practical answers with code examples when relevant.",
    MaximumCompletionTokens = 1024
};

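// Stream each generated segment to the console as it arrives,
// printing only the segments marked as user-visible.
chat.AfterTextCompletion += (sender, e) =>
{
    if (e.SegmentType == TextSegmentType.UserVisible)
        Console.Write(e.Text);
};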

// ──────────────────────────────────────
// 3. Chat loop
// ──────────────────────────────────────
Console.WriteLine("Chat with the assistant (type 'quit' to exit):\n");

while (true)
{
    Console.ForegroundColor = ConsoleColor.Green;
    Console.Write("You: ");
    Console.ResetColor();

    string? input = Console.ReadLine();
    if (string.IsNullOrWhiteSpace(input) || input.Equals("quit", StringComparison.OrdinalIgnoreCase))
        break;

    Console.ForegroundColor = ConsoleColor.Cyan;
    Console.Write("Assistant: ");
    Console.ResetColor();

    TextGenerationResult result = chat.Submit(input);
    Console.WriteLine($"\n  [{result.GeneratedTokenCount} tokens, {result.TokenGenerationRate:F1} tok/s]\n");
}

Step 3: Save and Restore Sessions

Persist conversation state to disk so users can resume later:

string sessionFile = "chat_session.bin";

// ──────────────────────────────────────
// Restore previous session if available
// ──────────────────────────────────────
MultiTurnConversation chat;

if (File.Exists(sessionFile))
{
    Console.WriteLine("Restoring previous session...\n");
    chat = new MultiTurnConversation(model, sessionFile);
}
else
{
    chat = new MultiTurnConversation(model)
    {
        SystemPrompt = "You are a helpful assistant that remembers everything from our conversation."
    };
}

chat.AfterTextCompletion += (sender, e) =>
{
    if (e.SegmentType == TextSegmentType.UserVisible)
        Console.Write(e.Text);
};

// ──────────────────────────────────────
// Chat loop with session save
// ──────────────────────────────────────
while (true)
{
    Console.ForegroundColor = ConsoleColor.Green;
    Console.Write("You: ");
    Console.ResetColor();

    string? input = Console.ReadLine();
    if (string.IsNullOrWhiteSpace(input) || input.Equals("quit", StringComparison.OrdinalIgnoreCase))
        break;

    Console.ForegroundColor = ConsoleColor.Cyan;
    Console.Write("Assistant: ");
    Console.ResetColor();

    chat.Submit(input);
    Console.WriteLine("\n");
}

// Save session on exit
chat.SaveSession(sessionFile);
Console.WriteLine($"Session saved to {sessionFile}");

chat.Dispose();
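
If the process is killed while the session file is being written, the next run may find a truncated file. One common way to reduce that risk is to write to a temporary file first and then move it over the target. The sketch below assumes only that the save step can be expressed as a callback that takes a path, for example `path => chat.SaveSession(path)`. `SaveAtomically` is a hypothetical helper, not part of LM-Kit.NET.

```csharp
using System;
using System.IO;

// Hypothetical helper (not an LM-Kit.NET API): runs `save` against a temporary
// file, then moves the result over `targetPath`, so a crash in the middle of
// the write does not leave a truncated session file behind.
static void SaveAtomically(string targetPath, Action<string> save)
{
    string tempPath = targetPath + ".tmp";
    try
    {
        save(tempPath);                                   // write the full payload first
        File.Move(tempPath, targetPath, overwrite: true); // then swap it into place
    }
    finally
    {
        // Clean up if `save` threw before the move happened.
        if (File.Exists(tempPath)) File.Delete(tempPath);
    }
}

// Usage with a stand-in writer. In the tutorial code the callback would be
// path => chat.SaveSession(path).
string demoFile = Path.Combine(Path.GetTempPath(), "chat_session_demo.bin");
SaveAtomically(demoFile, path => File.WriteAllText(path, "session-bytes"));
Console.WriteLine(File.ReadAllText(demoFile));
```

The move replaces the old file only after the new one is fully written, so a failed save leaves the previous session intact.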

Step 4: Chat History Management

Inspect the conversation history and the remaining context space:

var chat = new MultiTurnConversation(model)
{
    SystemPrompt = "You are a helpful assistant."
};

chat.AfterTextCompletion += (sender, e) =>
{
    if (e.SegmentType == TextSegmentType.UserVisible)
        Console.Write(e.Text);
};

// Have a conversation
Console.Write("Assistant: ");
chat.Submit("My name is Alice and I'm working on Project Atlas.");
Console.WriteLine("\n");

Console.Write("Assistant: ");
chat.Submit("We're building a recommendation engine using collaborative filtering.");
Console.WriteLine("\n");

// Inspect chat history
Console.WriteLine($"History: {chat.ChatHistory.Messages.Count} messages");
Console.WriteLine($"Context: {chat.ContextRemainingSpace} tokens remaining\n");

foreach (var msg in chat.ChatHistory.Messages)
{
    string role = msg.AuthorRole.ToString();
    string preview = msg.Content.Length > 60 ? msg.Content.Substring(0, 60) + "..." : msg.Content;
    Console.WriteLine($"  [{role}] {preview}");
}

// The assistant now knows about Alice and Project Atlas
Console.Write("\nAssistant: ");
chat.Submit("What project am I working on and what approach are we using?");
Console.WriteLine("\n");
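
When the remaining context space runs low, one option is to keep only the most recent turns. The sketch below illustrates that sliding-window idea on a plain list of (role, content) tuples. It does not use LM-Kit.NET types, `KeepRecent` is a hypothetical helper, and the word count is only a crude stand-in for tokens; real code would rely on the library's own accounting, such as `ContextRemainingSpace`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in transcript type: (role, content) tuples, not LM-Kit.NET messages.
// Keeps every system message plus the newest turns that fit a rough word budget.
static List<(string Role, string Content)> KeepRecent(
    IReadOnlyList<(string Role, string Content)> messages, int maxWords)
{
    var kept = new List<(string Role, string Content)>();
    int used = 0;

    // Walk backwards so the newest messages are considered first.
    for (int i = messages.Count - 1; i >= 0; i--)
    {
        if (messages[i].Role == "system") continue;   // added back below
        int words = messages[i].Content
            .Split(' ', StringSplitOptions.RemoveEmptyEntries).Length;
        if (used + words > maxWords) break;            // budget exhausted
        used += words;
        kept.Insert(0, messages[i]);                   // preserve chronological order
    }

    // System messages always stay at the front.
    kept.InsertRange(0, messages.Where(m => m.Role == "system"));
    return kept;
}

var transcript = new List<(string Role, string Content)>
{
    ("system", "You are a helpful assistant."),
    ("user", "My name is Alice and I'm working on Project Atlas."),
    ("assistant", "Nice to meet you, Alice."),
    ("user", "What project am I working on?")
};

var trimmed = KeepRecent(transcript, maxWords: 12);
foreach (var (role, content) in trimmed)
    Console.WriteLine($"[{role}] {content}");
```

With a 12-word budget the first user message is dropped, which is exactly the kind of forgotten fact that long-term memory is meant to cover.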

Step 5: Long-Term Memory with AgentMemory

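AgentMemory stores knowledge across sessions using retrieval-augmented generation (RAG): stored facts are retrieved by similarity to the current input and supplied to the model as context. This lets the assistant recall facts from previous conversations: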

using LMKit.Agents;
using LMKit.Embeddings;

// Load an embedding model for memory storage
Console.WriteLine("Loading embedding model...");
using LM embeddingModel = LM.LoadFromModelID("embeddinggemma-300m",
    downloadingProgress: (_, len, read) =>
    {
        if (len.HasValue) Console.Write($"\r  Downloading: {(double)read / len.Value * 100:F1}%   ");
        return true;
    },
    loadingProgress: p => { Console.Write($"\r  Loading: {p * 100:F0}%   "); return true; });
Console.WriteLine();

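// Create the memory store.
// Note: memoryFile is declared but never used in this snippet, so nothing is
// written to disk here; saving and reloading the store is not shown.
string memoryFile = "assistant_memory.dat";
var memory = new AgentMemory(new Embedder(embeddingModel));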

var chat = new MultiTurnConversation(model)
{
    SystemPrompt = "You are a helpful assistant with long-term memory. " +
        "You can recall information from previous conversations.",
    Memory = memory,
    MaximumRecallTokens = 512
};

chat.AfterTextCompletion += (sender, e) =>
{
    if (e.SegmentType == TextSegmentType.UserVisible)
        Console.Write(e.Text);
};

// Chat loop
Console.WriteLine("Chat with memory-enabled assistant:\n");

while (true)
{
    Console.ForegroundColor = ConsoleColor.Green;
    Console.Write("You: ");
    Console.ResetColor();

    string? input = Console.ReadLine();
    if (string.IsNullOrWhiteSpace(input) || input.Equals("quit", StringComparison.OrdinalIgnoreCase))
        break;

    Console.ForegroundColor = ConsoleColor.Cyan;
    Console.Write("Assistant: ");
    Console.ResetColor();

    chat.Submit(input);
    Console.WriteLine("\n");
}

chat.Dispose();
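
To make the retrieval step less abstract, the sketch below shows the general idea behind embedding-based recall: each stored fact has a vector, the query is turned into a vector, and the facts closest to the query by cosine similarity are returned. This is a conceptual illustration only, not a description of how AgentMemory is implemented. The three-dimensional vectors are made up, and `Cosine` and `Recall` are hypothetical helpers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cosine similarity between two equal-length vectors.
static double Cosine(double[] a, double[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Returns the topK stored facts ranked by similarity to the query vector.
static List<string> Recall(
    IEnumerable<(string Fact, double[] Vector)> entries, double[] query, int topK)
{
    return entries
        .OrderByDescending(entry => Cosine(entry.Vector, query))
        .Take(topK)
        .Select(entry => entry.Fact)
        .ToList();
}

// Toy "embeddings", invented for illustration; a real embedding model
// produces vectors with hundreds of dimensions.
var memoryStore = new List<(string Fact, double[] Vector)>
{
    ("Alice works on Project Atlas", new[] { 0.9, 0.1, 0.0 }),
    ("The team prefers PostgreSQL",  new[] { 0.0, 0.8, 0.6 }),
    ("Deploys happen on Fridays",    new[] { 0.1, 0.2, 0.9 })
};

// Pretend embedding of the question "What is Alice's project?"
double[] queryVector = { 0.8, 0.2, 0.1 };
foreach (string fact in Recall(memoryStore, queryVector, topK: 1))
    Console.WriteLine(fact);
```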

Step 6: Specialized Assistant Personas

Create different assistants for different tasks by changing the system prompt:

// Code review assistant
var codeReviewer = new MultiTurnConversation(model)
{
    SystemPrompt = "You are a senior software engineer performing code reviews. " +
        "Focus on bugs, security issues, performance problems, and maintainability. " +
        "Be direct and specific. Reference line numbers when possible.",
    MaximumCompletionTokens = 2048
};

// Technical writer
var techWriter = new MultiTurnConversation(model)
{
    SystemPrompt = "You are a technical writer. Rewrite explanations to be clear, " +
        "well-structured, and accessible to developers of all levels. " +
        "Use examples and avoid jargon.",
    MaximumCompletionTokens = 1024
};

// SQL assistant
var sqlHelper = new MultiTurnConversation(model)
{
    SystemPrompt = "You are a database expert. Help write and optimize SQL queries. " +
        "Always explain the query logic. Warn about potential performance issues " +
        "with large tables.",
    MaximumCompletionTokens = 1024
};
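
If the persona is chosen at run time, for example from a command-line argument, the prompts and token limits can live in a lookup table instead of separate variables. The sketch below is one possible arrangement using only standard collections. The persona keys, the abbreviated prompts, and the `Resolve` helper with its fallback are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Persona name -> (system prompt, completion token limit).
// Prompts are abbreviated versions of the ones in the snippets above.
var personas = new Dictionary<string, (string SystemPrompt, int MaxTokens)>(
    StringComparer.OrdinalIgnoreCase)
{
    ["review"] = ("You are a senior software engineer performing code reviews.", 2048),
    ["writer"] = ("You are a technical writer.", 1024),
    ["sql"]    = ("You are a database expert.", 1024)
};

// Hypothetical helper: falls back to a generic persona for unknown names.
static (string SystemPrompt, int MaxTokens) Resolve(
    IReadOnlyDictionary<string, (string SystemPrompt, int MaxTokens)> table, string name)
{
    return table.TryGetValue(name, out var persona)
        ? persona
        : ("You are a helpful assistant.", 1024);
}

var selected = Resolve(personas, "SQL");
Console.WriteLine($"{selected.MaxTokens}: {selected.SystemPrompt}");

// The tuple can then feed the same object initializer used above:
// new MultiTurnConversation(model)
// {
//     SystemPrompt = selected.SystemPrompt,
//     MaximumCompletionTokens = selected.MaxTokens
// };
```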

Common Issues

Problem | Cause | Fix
Assistant forgets earlier context | Context window full | Increase ContextSize; or use AgentMemory for long-term recall
Responses cut off mid-sentence | MaximumCompletionTokens too low | Increase to 1024 or 2048
Slow first response | Model not cached in VRAM | First inference is slower; subsequent ones are faster
Session file too large | Long conversation history | Call chat.ClearHistory() periodically; rely on AgentMemory for recall
System prompt ignored | Prompt too long or vague | Keep system prompts under 200 words; be specific about behavior
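
The 200-word guideline for system prompts is easy to check mechanically. The sketch below counts whitespace-separated words, which is a rough measure and not a token count. `CountWords` is a hypothetical helper.

```csharp
using System;

// Counts whitespace-separated words; a rough measure, not a token count.
static int CountWords(string text) =>
    text.Split(new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries).Length;

const int WordLimit = 200; // guideline from the table above

string prompt = "You are a database expert. Help write and optimize SQL queries.";
int wordCount = CountWords(prompt);
Console.WriteLine(wordCount <= WordLimit
    ? $"OK ({wordCount} words)"
    : $"Too long ({wordCount} words)");
```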

Next Steps