Build a Conversational Assistant with Memory
A useful assistant remembers what you told it: your preferences, project details, and past decisions. LM-Kit.NET's MultiTurnConversation class maintains chat history across turns, and AgentMemory persists knowledge across sessions. This tutorial builds a conversational assistant that streams responses, saves sessions, and recalls information from previous conversations.
Why Local Conversational Assistants Matter
Two enterprise problems that on-device assistants solve:
- Conversation history stays on-premises. Multi-turn assistants accumulate detailed context about users, projects, and business processes. With a local model, that context never leaves your infrastructure, making it safe for internal tools that handle sensitive projects.
- Predictable latency and availability. A local assistant responds in consistent time regardless of API traffic, outages, or rate limits. That consistency matters for interactive tools where users wait on every response.
Prerequisites
| Requirement | Minimum |
|---|---|
| .NET SDK | 8.0+ |
| VRAM | 4+ GB |
| Disk | ~3 GB free for model download |
Step 1: Create the Project
dotnet new console -n AssistantQuickstart
cd AssistantQuickstart
dotnet add package LM-Kit.NET
Step 2: Basic Multi-Turn Chat
using System.Text;
using LMKit.Model;
using LMKit.TextGeneration;
LMKit.Licensing.LicenseManager.SetLicenseKey(""); // set your LM-Kit license key here; left empty for this demo
Console.InputEncoding = Encoding.UTF8;
Console.OutputEncoding = Encoding.UTF8;
// ──────────────────────────────────────
// 1. Load model
// ──────────────────────────────────────
Console.WriteLine("Loading model...");
using LM model = LM.LoadFromModelID("gemma3:4b",
downloadingProgress: (_, len, read) =>
{
if (len.HasValue) Console.Write($"\r Downloading: {(double)read / len.Value * 100:F1}% ");
return true;
},
loadingProgress: p => { Console.Write($"\r Loading: {p * 100:F0}% "); return true; });
Console.WriteLine("\n");
// ──────────────────────────────────────
// 2. Create conversation with streaming
// ──────────────────────────────────────
var chat = new MultiTurnConversation(model)
{
SystemPrompt = "You are a helpful coding assistant. " +
"Give concise, practical answers with code examples when relevant.",
MaximumCompletionTokens = 1024
};
chat.AfterTextCompletion += (sender, e) =>
{
if (e.SegmentType == TextSegmentType.UserVisible)
Console.Write(e.Text);
};
// ──────────────────────────────────────
// 3. Chat loop
// ──────────────────────────────────────
Console.WriteLine("Chat with the assistant (type 'quit' to exit):\n");
while (true)
{
Console.ForegroundColor = ConsoleColor.Green;
Console.Write("You: ");
Console.ResetColor();
string? input = Console.ReadLine();
if (string.IsNullOrWhiteSpace(input) || input.Equals("quit", StringComparison.OrdinalIgnoreCase))
break;
Console.ForegroundColor = ConsoleColor.Cyan;
Console.Write("Assistant: ");
Console.ResetColor();
TextGenerationResult result = chat.Submit(input);
Console.WriteLine($"\n [{result.GeneratedTokenCount} tokens, {result.TokenGenerationRate:F1} tok/s]\n");
}
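The loop above exits only when the user types quit and never disposes the conversation. A minimal sketch of a cleaner shutdown, using only standard .NET plus the chat object already in scope (the exitRequested flag name is illustrative, not part of the LM-Kit.NET API):
// Hedged sketch: register this before entering the Step 2 chat loop.
bool exitRequested = false;
Console.CancelKeyPress += (_, e) =>
{
    e.Cancel = true;        // keep the process alive instead of dying mid-generation
    exitRequested = true;   // the loop can check this flag between turns
};
// ... run the Step 2 loop with `while (!exitRequested)` instead of `while (true)` ...
// MultiTurnConversation holds native resources (see later steps); dispose it when the loop ends.
chat.Dispose();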
Step 3: Save and Restore Sessions
Persist the conversation state to disk so users can resume where they left off. This snippet reuses the model loaded in Step 2 and the same using directives:
string sessionFile = "chat_session.bin";
// ──────────────────────────────────────
// Restore previous session if available
// ──────────────────────────────────────
MultiTurnConversation chat;
if (File.Exists(sessionFile))
{
Console.WriteLine("Restoring previous session...\n");
chat = new MultiTurnConversation(model, sessionFile);
}
else
{
chat = new MultiTurnConversation(model)
{
SystemPrompt = "You are a helpful assistant that remembers everything from our conversation."
};
}
chat.AfterTextCompletion += (sender, e) =>
{
if (e.SegmentType == TextSegmentType.UserVisible)
Console.Write(e.Text);
};
// ──────────────────────────────────────
// Chat loop with session save
// ──────────────────────────────────────
while (true)
{
Console.ForegroundColor = ConsoleColor.Green;
Console.Write("You: ");
Console.ResetColor();
string? input = Console.ReadLine();
if (string.IsNullOrWhiteSpace(input) || input.Equals("quit", StringComparison.OrdinalIgnoreCase))
break;
Console.ForegroundColor = ConsoleColor.Cyan;
Console.Write("Assistant: ");
Console.ResetColor();
chat.Submit(input);
Console.WriteLine("\n");
}
// Save session on exit
chat.SaveSession(sessionFile);
Console.WriteLine($"Session saved to {sessionFile}");
chat.Dispose();
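Saving only at exit loses the whole session if the process is killed mid-conversation. A small variation, sketched here with the same SaveSession call, persists the state after every completed turn instead:
// Hedged sketch: inside the chat loop above, save right after each completed turn
// so an abrupt shutdown loses at most the response that was in progress.
chat.Submit(input);
Console.WriteLine("\n");
chat.SaveSession(sessionFile);   // same call as the exit path, just once per turn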
Step 4: Chat History Management
Inspect the conversation history and check how much context space remains:
var chat = new MultiTurnConversation(model)
{
SystemPrompt = "You are a helpful assistant."
};
chat.AfterTextCompletion += (sender, e) =>
{
if (e.SegmentType == TextSegmentType.UserVisible)
Console.Write(e.Text);
};
// Have a conversation
Console.Write("Assistant: ");
chat.Submit("My name is Alice and I'm working on Project Atlas.");
Console.WriteLine("\n");
Console.Write("Assistant: ");
chat.Submit("We're building a recommendation engine using collaborative filtering.");
Console.WriteLine("\n");
// Inspect chat history
Console.WriteLine($"History: {chat.ChatHistory.Messages.Count} messages");
Console.WriteLine($"Context: {chat.ContextRemainingSpace} tokens remaining\n");
foreach (var msg in chat.ChatHistory.Messages)
{
string role = msg.AuthorRole.ToString();
string preview = msg.Content.Length > 60 ? msg.Content.Substring(0, 60) + "..." : msg.Content;
Console.WriteLine($" [{role}] {preview}");
}
// The assistant now knows about Alice and Project Atlas
Console.Write("\nAssistant: ");
chat.Submit("What project am I working on and what approach are we using?");
Console.WriteLine("\n");
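Because ChatHistory.Messages is an ordinary collection, you can also export it. A short sketch that writes the history above to a plain-text transcript (the file name is just an example; System.Linq and System.IO come from the default implicit usings):
// Hedged sketch: dump the conversation shown above to a transcript file.
var transcript = chat.ChatHistory.Messages
    .Select(m => $"[{m.AuthorRole}] {m.Content}");
File.WriteAllLines("transcript.txt", transcript);
Console.WriteLine($"Wrote {chat.ChatHistory.Messages.Count} messages to transcript.txt");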
Step 5: Long-Term Memory with AgentMemory
AgentMemory gives the assistant long-term recall across sessions using retrieval-augmented generation (RAG): facts are embedded and stored, then retrieved back into the prompt when a related topic comes up. It requires a separate embedding model alongside the chat model:
using LMKit.Agents;
using LMKit.Embeddings;
// Load an embedding model for memory storage
Console.WriteLine("Loading embedding model...");
using LM embeddingModel = LM.LoadFromModelID("embeddinggemma-300m",
downloadingProgress: (_, len, read) =>
{
if (len.HasValue) Console.Write($"\r Downloading: {(double)read / len.Value * 100:F1}% ");
return true;
},
loadingProgress: p => { Console.Write($"\r Loading: {p * 100:F0}% "); return true; });
Console.WriteLine();
// Create persistent memory
// NOTE: memoryFile marks where you would persist the store between runs;
// consult the AgentMemory API in your LM-Kit.NET version for its save/load methods.
string memoryFile = "assistant_memory.dat";
var memory = new AgentMemory(new Embedder(embeddingModel));
var chat = new MultiTurnConversation(model)
{
SystemPrompt = "You are a helpful assistant with long-term memory. " +
"You can recall information from previous conversations.",
Memory = memory,
MaximumRecallTokens = 512
};
chat.AfterTextCompletion += (sender, e) =>
{
if (e.SegmentType == TextSegmentType.UserVisible)
Console.Write(e.Text);
};
// Chat loop
Console.WriteLine("Chat with memory-enabled assistant:\n");
while (true)
{
Console.ForegroundColor = ConsoleColor.Green;
Console.Write("You: ");
Console.ResetColor();
string? input = Console.ReadLine();
if (string.IsNullOrWhiteSpace(input) || input.Equals("quit", StringComparison.OrdinalIgnoreCase))
break;
Console.ForegroundColor = ConsoleColor.Cyan;
Console.Write("Assistant: ");
Console.ResetColor();
chat.Submit(input);
Console.WriteLine("\n");
}
chat.Dispose();
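The same AgentMemory instance can back more than one conversation, which is what makes recall across sessions possible. A minimal sketch, reusing only the memory object and the properties shown above, and assuming the first conversation recorded facts into the attached memory as described:
// Hedged sketch: a second assistant backed by the same memory store.
using var followUp = new MultiTurnConversation(model)
{
    SystemPrompt = "You are a helpful assistant with long-term memory.",
    Memory = memory,
    MaximumRecallTokens = 512
};
followUp.AfterTextCompletion += (_, e) =>
{
    if (e.SegmentType == TextSegmentType.UserVisible)
        Console.Write(e.Text);
};
Console.Write("Assistant: ");
followUp.Submit("Remind me what we discussed about my project earlier.");
Console.WriteLine();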
Step 6: Specialized Assistant Personas
Create different assistants for different tasks by changing the system prompt:
// Code review assistant
var codeReviewer = new MultiTurnConversation(model)
{
SystemPrompt = "You are a senior software engineer performing code reviews. " +
"Focus on bugs, security issues, performance problems, and maintainability. " +
"Be direct and specific. Reference line numbers when possible.",
MaximumCompletionTokens = 2048
};
// Technical writer
var techWriter = new MultiTurnConversation(model)
{
SystemPrompt = "You are a technical writer. Rewrite explanations to be clear, " +
"well-structured, and accessible to developers of all levels. " +
"Use examples and avoid jargon.",
MaximumCompletionTokens = 1024
};
// SQL assistant
var sqlHelper = new MultiTurnConversation(model)
{
SystemPrompt = "You are a database expert. Help write and optimize SQL queries. " +
"Always explain the query logic. Warn about potential performance issues " +
"with large tables.",
MaximumCompletionTokens = 1024
};
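Each persona is an ordinary MultiTurnConversation, so you drive it with the same Submit call as before. An illustrative usage of the code-review persona (the snippet under review is made up for the example):
// Example usage: send a small piece of code to the code-review persona.
codeReviewer.AfterTextCompletion += (_, e) =>
{
    if (e.SegmentType == TextSegmentType.UserVisible)
        Console.Write(e.Text);
};
string snippet = @"
public int Divide(int a, int b)
{
    return a / b;   // no guard against b == 0
}";
Console.Write("Review: ");
codeReviewer.Submit($"Please review this C# method:\n{snippet}");
Console.WriteLine();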
Common Issues
| Problem | Cause | Fix |
|---|---|---|
| Assistant forgets earlier context | Context window full | Increase ContextSize; or use AgentMemory for long-term recall |
| Responses cut off mid-sentence | MaximumCompletionTokens too low | Increase to 1024 or 2048 |
| Slow first response | Model not cached in VRAM | First inference is slower; subsequent ones are faster |
| Session file too large | Long conversation history | Call chat.ClearHistory() periodically; rely on AgentMemory for recall |
| System prompt ignored | Prompt too long or vague | Keep system prompts under 200 words; be specific about behavior |
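Two of the rows above come down to the context budget. A small hedged sketch that watches the remaining space (as in Step 4) and clears the in-context history when it runs low; the 256-token threshold is an arbitrary example:
// Hedged sketch: run this between turns to keep long chats from filling the context.
if (chat.ContextRemainingSpace < 256)
{
    Console.WriteLine("[context nearly full - clearing in-context history]");
    chat.ClearHistory();   // long-term facts can still be recalled via AgentMemory
}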
Next Steps
- Build a Multi-Agent Workflow: coordinate multiple assistants in orchestrated workflows.
- Create an AI Agent with Tools: give your assistant tool-calling capabilities.
- Samples: Multi-Turn Chat: multi-turn chat demo.
- Samples: Persistent Memory Assistant: memory assistant demo.