Create a Multi-Turn Chatbot with Memory in .NET Applications

🎯 Purpose of the Demo

This demo shows how to build a multi-turn chatbot that uses persistent memory to recall context and integrate it into its responses. It demonstrates that a small language model (SLM) can give more accurate, context-aware answers during extended conversations, even on low-cost CPU devices.

👥 Who Should Use This Demo

This demo is ideal for developers and businesses who need to:

- Run chatbots on resource-constrained hardware (e.g., low-cost CPUs).
- Enhance chatbot accuracy by integrating persistent memory.
- Build intelligent virtual assistants for customer support, information services, and more.

🚀 What Problem It Solves

Traditional chatbots often struggle to maintain accurate context over long conversations, especially when running on limited hardware. This demo addresses that challenge by:

- Using persistent memory (via the AgentMemory class) to store key facts and context.
- Enriching the conversation with recalled information, resulting in correct and contextually relevant answers.
- Demonstrating the stark difference in response quality between a chatbot without memory and one with memory.

💻 Demo Application Overview

The demo is a console application that sets up two versions of a chatbot:

- Basic Chatbot: Runs without memory, so it can give incorrect (hallucinated) answers.
- Memory-Enhanced Chatbot: Uses persistent memory to recall facts and provide correct answers.

The demo uses a tiny SLM (Alibaba Qwen 2.5 Instruct 0.5B) that is optimized to run on cheap CPU devices, making it accessible for scenarios where high-end hardware is not available.

✨ Key Features

- Persistent Memory Storage: The chatbot stores and recalls facts to improve answer accuracy.
- Dual Chat Modes: Compare responses from a standard chatbot and one with memory integration.
- Low-Cost CPU Friendly: Demonstrates the capabilities of a tiny SLM that runs efficiently on affordable hardware.
- Interactive Console Interface: Engage in a multi-turn conversation to see memory integration in action.

Example Output

Below is an example of the output you can expect when interacting with the demo:

Assistant:
I am BeeBop, an AI designed to provide information on the ideal customer profile for ACMEE Company. If you have any questions or need assistance with specific topics related to this company, feel free to ask!
(gen. tokens: 44 | stop reason: EndOfGeneration | quality score: 0.64 | speed: 125.13 tok/s | ctx usage: 96/4096)

Assistant with Memory:
I am BeeBop, an AI designed to provide information on the ideal customer profile for ACMEE Company. If you have any questions or need assistance with specific topics related to this company, feel free to ask!
(gen. tokens: 44 | stop reason: EndOfGeneration | quality score: 0.64 | speed: 137.32 tok/s | ctx usage: 96/4096)

User:
How many employees do our customers typically have?

Assistant:
Our customers typically range from 10 to 50 employees, depending on the size and complexity of their operations. This number can vary significantly based on the specific needs and requirements of each organization.
(gen. tokens: 41 | stop reason: EndOfGeneration | quality score: 0.52 | speed: 82.63 tok/s | ctx usage: 155/4096)

Note: This response contains hallucinations.

Assistant with Memory:
Based on the information provided, our ideal customer profile suggests that they typically range from 200 to 500 employees.
(gen. tokens: 28 | stop reason: EndOfGeneration | quality score: 0.6 | speed: 158.92 tok/s | ctx usage: 271/4096)

Note: Correct answer obtained from a memory recall.

User:
In which industries are they working?

Assistant:
Our customers work in a wide variety of industries, including but not limited to:

Manufacturing, Technology, Healthcare, Finance, Retail, [remaining content removed because it was clearly hallucinated]

Note: This response contains hallucinations.

Assistant with Memory:
Acmee's ideal customers work in high-tech sectors such as software development, IT services, and digital media.
(gen. tokens: 24 | stop reason: EndOfGeneration | quality score: 0.72 | speed: 170.91 tok/s | ctx usage: 444/4096)

Note: Correct answer obtained from a memory recall.

In the example above, the answers from the basic assistant (without memory) are incorrect, while the memory-enhanced assistant provides the correct information.

🛠️ Getting Started

Prerequisites

- .NET 6.0 (or later)
- A working installation of the .NET SDK
- An internet connection for model download (if not already cached)
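
One way to confirm the first two prerequisites before cloning is a small shell check. This is only a sketch and not part of the demo: `REQUIRED_MAJOR`, `sdk_major`, and `meets_requirement` are illustrative names, and it assumes a POSIX shell with `dotnet` on the `PATH`.

```shell
#!/bin/sh
# Minimum SDK major version, taken from the prerequisites above.
REQUIRED_MAJOR=6

# Extract the major version from a string such as "8.0.100".
sdk_major() {
    printf '%s\n' "$1" | cut -d. -f1
}

# Succeed when the given version string meets the required major version.
meets_requirement() {
    major=$(sdk_major "$1")
    case "$major" in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric: treat as unsupported
    esac
    [ "$major" -ge "$REQUIRED_MAJOR" ]
}

if command -v dotnet >/dev/null 2>&1; then
    # "dotnet --version" prints the version of the SDK in use.
    installed=$(dotnet --version 2>/dev/null) || installed=""
    if meets_requirement "$installed"; then
        echo "Found .NET SDK $installed"
    else
        echo "Found .NET SDK '$installed', but version $REQUIRED_MAJOR.0 or later is required"
    fi
else
    echo "The dotnet command was not found; install the .NET SDK first"
fi
```

The check compares only the major version, which is enough for a "6.0 or later" requirement. It does not verify that the model download will succeed.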

Downloading the Project

Clone the Repository:
```
git clone https://github.com/LM-Kit/lm-kit-net-samples.git
```
Navigate to the Demo Directory:
```
cd lm-kit-net-samples/console_net/multi_turn_chat_with_agent_memory
```

Building and Running the Application

Build the Application:
```
dotnet build
```
Run the Application:
```
dotnet run
```
Interact with the Chatbot:
- Follow the on-screen prompts.
- Observe the differences between the basic assistant and the memory-enhanced assistant.
- Type your questions to see how persistent memory improves response accuracy.

🔗 Additional Resources

LM-Kit.NET API Documentation
- AgentMemory Class
- MemoryRecallEventArgs Class
GitHub Repository
- Console .NET Sample
