
Bridging Microsoft Semantic Kernel and LM-Kit.NET for Memory-Enhanced AI


🎯 Purpose of the Sample

This Semantic Kernel Integration Memory Demo illustrates how to use the LM-Kit.NET.SemanticKernel NuGet package to seamlessly integrate Microsoft Semantic Kernel with LM-Kit.NET. By combining the language generation capabilities of LM-Kit.NET with the semantic memory features of Microsoft Semantic Kernel, this demo enables memory-enhanced AI responses. The sample demonstrates how to store, recall, and utilize contextual information (in this case, detective facts) to provide more informed and comprehensive answers to user queries.


👥 Industry Target Audience

This demo is designed for developers and teams working in domains where context-aware responses and enriched user experiences are crucial:

  • 🤖 AI & Chatbot Development: Enhance chatbot responses by integrating dynamic memory recall for more contextually relevant answers.
  • 📚 Knowledge Management: Implement systems that leverage stored facts and semantic memory to deliver accurate and informed responses.
  • 📰 Media & Entertainment: Build interactive applications that recall historical data or background facts to create engaging user interactions.
  • 🏦 Finance & Research: Use semantic memory to store and recall key information, ensuring comprehensive analysis and report generation.
  • 💼 Business & Enterprise: Integrate memory-enhanced workflows into customer support, virtual assistants, or decision support systems.

🚀 Problem Solved

Without memory, AI responses are generated solely based on the current input, often lacking context from previous interactions or stored information. This demo shows how to enrich responses by:

  • Storing Relevant Facts: Persisting information (e.g., detective facts) that can be recalled during a conversation.
  • Enhancing Response Quality: Merging live language model completions with retrieved memory data for comprehensive answers.
  • Bridging Two Technologies: Seamlessly integrating LM-Kit.NET's powerful language models with Microsoft Semantic Kernel's semantic memory capabilities.

This combination enables applications to deliver nuanced and context-aware responses, significantly improving user engagement and effectiveness.


💻 Sample Application Description

The Semantic Kernel Integration Memory Demo is a console application that demonstrates two approaches for answering a detective-related query:

  1. Direct Query: The language model is invoked directly to generate a response.
  2. Memory-Enhanced Query: Stored detective facts are retrieved from semantic memory and used to enrich the generated answer.

The demo leverages LM-Kit.NET for both chat completions and text embeddings, while using Microsoft Semantic Kernel's memory plugins to store and recall contextual data. This approach empowers developers to create AI solutions that remember and use past interactions or stored knowledge to produce better, more informed responses.
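
Conceptually, the bridge amounts to loading the on-device models with LM-Kit.NET and registering LM-Kit-backed implementations of Semantic Kernel's chat-completion and embedding service interfaces on the kernel builder. The sketch below is for orientation only: the adapter type names (LMKitChatCompletion, LMKitTextEmbeddingGeneration) and the LM constructor are assumptions, so consult the LM-Kit.NET.SemanticKernel documentation for the exact API.

#pragma warning disable SKEXP0001 // ITextEmbeddingGenerationService is experimental in recent Semantic Kernel versions

using LMKit.Model;                               // assumed namespace for LM-Kit's model class
using Microsoft.Extensions.DependencyInjection;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Embeddings;

// Load the on-device GGUF models (file paths are placeholders).
var chatModel = new LM("Phi-3.1-mini-4k-Instruct-Q4_K_M.gguf");
var embeddingModel = new LM("bge-m3-Q5_K_M.gguf");

// Register LM-Kit-backed implementations of the Semantic Kernel service
// interfaces. LMKitChatCompletion and LMKitTextEmbeddingGeneration are assumed
// adapter names; substitute the actual types exposed by LM-Kit.NET.SemanticKernel.
var builder = Kernel.CreateBuilder();
builder.Services.AddSingleton<IChatCompletionService>(new LMKitChatCompletion(chatModel));
builder.Services.AddSingleton<ITextEmbeddingGenerationService>(new LMKitTextEmbeddingGeneration(embeddingModel));
var kernel = builder.Build();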

✨ Key Features

  • 🔗 Seamless Integration: Bridge LM-Kit.NET's language generation with Microsoft Semantic Kernel's memory capabilities.
  • 🧠 Memory-Enhanced Responses: Store, recall, and incorporate semantic memory into AI responses.
  • 💬 Two Approaches to Querying: Compare direct model queries with memory-augmented responses.
  • 📦 Plugin-Based Architecture: Easily extend the functionality with memory plugins and custom prompt templates.
  • ⚡ Fast and Efficient: Leverage on-device processing for both language completions and embeddings without external service dependencies.

🧠 Supported Models & Technologies

The demo uses LM-Kit.NET to run both the chat and embedding models on-device. Examples include:

  • Chat Model: A lightweight, instruction-following model (e.g., Phi-3.1-mini-4k-Instruct-Q4_K_M.gguf) for generating responses.
  • Embedding Model: A model (e.g., bge-m3-Q5_K_M.gguf) used for creating text embeddings, enabling semantic memory lookup.

Additionally, Microsoft Semantic Kernel is employed to manage and recall semantic memory, providing a robust bridge between stored knowledge and real-time query responses.



🛠️ Getting Started

📋 Prerequisites

  • .NET 6.0 SDK or later: Building and running the sample requires the .NET SDK (the runtime alone is not enough for dotnet build).
  • NuGet Packages (install commands shown below):
    • LM-Kit.NET
    • LM-Kit.NET.SemanticKernel
    • Microsoft.SemanticKernel
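
If you are adding the packages to your own project rather than running the sample as-is, they can be installed with the dotnet CLI:

    dotnet add package LM-Kit.NET
    dotnet add package LM-Kit.NET.SemanticKernel
    dotnet add package Microsoft.SemanticKernel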

▶️ Running the Application

  1. 📂 Clone the repository:

    git clone https://github.com/LM-Kit/lm-kit-net-samples.git
    
  2. 📁 Navigate to the project directory:

    cd lm-kit-net-samples/console_net6/semantic_kernel_integration_memory
    
  3. 🔨 Build and run the application:

    dotnet build
    dotnet run
    

💡 Example Usage

When you run the application, it walks through two distinct approaches:

  1. Direct Query (Without Memory):

    • The application displays the query, e.g., "Who is Elodie's favourite detective?", and directly invokes the language model.
    • The model generates a response based solely on its pre-trained knowledge.
  2. Memory-Enhanced Query:

    • After a prompt to continue, the demo uses semantic memory to retrieve stored detective facts.
    • A custom prompt template incorporates these recalled facts to generate a more comprehensive answer.
    • The output includes detailed context, merging the direct model output with memory recall.

The code snippet below illustrates the key components of the demo:

// Namespaces used below. SemanticTextMemory, VolatileMemoryStore, and
// TextMemoryPlugin are marked experimental in recent Semantic Kernel releases,
// hence the SKEXP suppressions.
#pragma warning disable SKEXP0001, SKEXP0050

using Microsoft.Extensions.DependencyInjection;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Embeddings;
using Microsoft.SemanticKernel.Memory;
using Microsoft.SemanticKernel.Plugins.Memory;

// Define the question
var question = "Who is Elodie's favourite detective?";
Console.WriteLine("=== Detective Query Demo ===");
Console.WriteLine($"Question: {question}");

// --- Approach 1: Direct Model Query ---
Console.WriteLine("Approach 1: Querying the model directly (without memory).\n");
// (Initialization and direct invocation of the LM-Kit.NET chat model; 'kernel'
// is assumed to have been built beforehand with the LM-Kit chat and embedding
// services registered.)

// --- Approach 2: Memory-Enhanced Query ---
// Create semantic memory backed by a volatile (in-memory) store and the
// embedding service registered on the kernel.
var embeddingService = kernel.Services.GetRequiredService<ITextEmbeddingGenerationService>();
var memory = new SemanticTextMemory(new VolatileMemoryStore(), embeddingService);

// Save detective facts to memory
const string detectiveMemoryCollection = "detectiveFacts";
await memory.SaveInformationAsync(detectiveMemoryCollection, id: "fact1", text: "Elodie has always admired Minouch the cat for his brilliant deductive reasoning.");
await memory.SaveInformationAsync(detectiveMemoryCollection, id: "fact2", text: "Miss Marple is renowned as one of the greatest detectives in literature.");
await memory.SaveInformationAsync(detectiveMemoryCollection, id: "fact3", text: "Elodie mentioned that her favourite detective series features Sherlock Holmes solving intricate cases.");
await memory.SaveInformationAsync(detectiveMemoryCollection, id: "fact4", text: "Other detectives like Hercule Poirot are popular among her friends, but not her top choice.");
await memory.SaveInformationAsync(detectiveMemoryCollection, id: "fact5", text: "Detective stories inspired Elodie during her childhood, especially those of Hercule Poirot and Miss Marple.");

// Import the memory plugin so its functions are callable from prompt templates
var memoryPlugin = new TextMemoryPlugin(memory);
kernel.ImportPluginFromObject(memoryPlugin);

// {{Recall}} invokes TextMemoryPlugin.Recall, which reads the "input" and
// "collection" values from the kernel arguments and returns the most relevant
// stored facts.
var promptTemplate = @"
Question: {{$input}}
Using the following memory facts: {{Recall}},
provide a comprehensive answer to the question.
";

// Define arguments for the memory-enhanced prompt: "input" feeds both
// {{$input}} and Recall's query; "collection" tells Recall which memory
// collection to search.
var promptArgs = new KernelArguments()
{
    { "input", question },
    { "collection", detectiveMemoryCollection }
};
// Stream the memory-enhanced answer to the console as it is generated.
var enhancedResponse = kernel.InvokePromptStreamingAsync(promptTemplate, promptArgs);
await foreach (var output in enhancedResponse)
{
    Console.Write(output);
}
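
If streaming output is not required, the same prompt can be invoked in a single call with Semantic Kernel's InvokePromptAsync, which returns the completed result:

// Non-streaming alternative: wait for the full answer, then print it at once.
var result = await kernel.InvokePromptAsync(promptTemplate, promptArgs);
Console.WriteLine(result);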

Developers can modify the stored memory facts, prompt templates, or model configurations to tailor the demo to different scenarios and use cases.
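
For example, recall behaviour can be tuned through extra arguments. The sketch below assumes the standard relevance and limit parameters of TextMemoryPlugin's Recall function; verify the parameter names against your Semantic Kernel version.

// Tuned recall: raise the similarity threshold and cap the number of facts.
var tunedArgs = new KernelArguments()
{
    { "input", question },
    { "collection", detectiveMemoryCollection },
    { "relevance", 0.75 }, // minimum similarity score for a fact to be recalled (assumed parameter)
    { "limit", 2 }         // recall at most two facts (assumed parameter)
};
await foreach (var output in kernel.InvokePromptStreamingAsync(promptTemplate, tunedArgs))
{
    Console.Write(output);
}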


📚 Summary

The Semantic Kernel Integration Memory Demo provides a robust foundation for building AI applications that combine dynamic memory recall with powerful language generation. By leveraging the LM-Kit.NET.SemanticKernel package, developers can create contextually enriched responses that significantly enhance user engagement and application performance.

For more details, check out the demo repository.