πŸ€– Building a Custom AI Chatbot with RAG and Qdrant using C#


🎯 Purpose of the Sample

The Custom Chatbot with RAG and Qdrant Demo illustrates how to integrate LM-Kit.NET with a Qdrant vector store to build a chatbot that uses Retrieval-Augmented Generation (RAG). This solution demonstrates how semantic search and large language models (LLMs) can be combined in a .NET application to provide accurate, context-aware responses from stored vector embeddings.


πŸ‘₯ Industry Target Audience

This sample benefits developers and organizations in areas such as:

  • πŸ›ŽοΈ Customer Support: Query large knowledge bases with fast, relevant results.
  • πŸ“š Education: Build tutoring systems based on textbooks and training content.
  • πŸ₯ Healthcare: Provide real-time answers from medical guidelines or literature.
  • πŸ“¦ Logistics & Industry: Search across technical documentation and manuals.
  • πŸ›οΈ E-commerce: Enable chat-based access to product catalogs and FAQs.

πŸš€ Problem Solved

Managing a large corpus of documents and generating precise answers from it is a complex task. This demo solves the problem by:

  • Persisting embeddings in a vector database (Qdrant) for faster future access.
  • Combining semantic search with generative AI for accurate, context-aware responses.
  • Allowing reuse of processed data sources across multiple sessions.
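
The reuse behavior can be sketched with the standalone Qdrant.Client NuGet package (the demo itself goes through LM-Kit's Qdrant connector, so treat this as an illustration; the "ebooks" collection name is made up):

using Qdrant.Client;

// Sketch: skip re-embedding when the collection already exists in Qdrant.
// Uses the standalone Qdrant.Client package, not LM-Kit's connector.
var client = new QdrantClient("localhost", 6334); // gRPC port
var existing = await client.ListCollectionsAsync();
bool alreadyIndexed = existing.Contains("ebooks"); // collection name is illustrative
Console.WriteLine(alreadyIndexed ? "Reusing stored vectors." : "Embedding from scratch.");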

πŸ’» Sample Application Description

This is a console-based chatbot demo that loads a series of eBooks, stores their embeddings in a local Qdrant instance, and uses RAG to answer queries.
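
In rough outline, such a console loop has the following shape (hypothetical code, not the sample's actual source; AnswerWithRagAsync is a stand-in for the retrieve-then-generate step covered below):

// Hypothetical console loop; AnswerWithRagAsync is a placeholder, not an LM-Kit API.
while (true)
{
    Console.Write("> ");
    string? question = Console.ReadLine();
    if (string.IsNullOrWhiteSpace(question)) break;
    Console.WriteLine(await AnswerWithRagAsync(question));
}

static Task<string> AnswerWithRagAsync(string question) =>
    Task.FromResult($"(retrieve context for \"{question}\" from Qdrant, then ask the chat model)");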

✨ Key Features

  • πŸ”Œ Qdrant Integration: Uses LM-Kit.NET.Data.Connectors.Qdrant to persist embeddings locally.
  • πŸ“¦ Model Loading: Load chat and embedding models from predefined URLs or custom paths.
  • 🧠 Persistent Vector Store: Avoid redundant embedding computation across sessions.
  • πŸ“š Semantic Search: Retrieve relevant content chunks before generation.
  • πŸ“ˆ Performance Tracking: Measure data source loading time.
  • ♻️ Cached Data Handling: Reuse indexed vectors when available.

🧠 Supported Models

  • Chat Model: Gemma 3 4B Instruct (quantized variant)
  • Embedding Model: BGE-small-en-v1.5
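
BGE-small-en-v1.5 produces 384-dimensional vectors, so the Qdrant collection must be sized to match. A sketch with the standalone Qdrant.Client package (the demo's connector handles this for you; the collection name is illustrative, and cosine distance is the usual pairing for BGE-family models):

using Qdrant.Client;
using Qdrant.Client.Grpc;

// Create a collection sized for BGE-small-en-v1.5's 384-dim embeddings.
var client = new QdrantClient("localhost", 6334);
await client.CreateCollectionAsync("ebooks",
    new VectorParams { Size = 384, Distance = Distance.Cosine });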

🧾 Data Sources

Three classic eBooks are loaded as semantic knowledge bases:

  • Romeo and Juliet by William Shakespeare
  • Moby Dick by Herman Melville
  • Pride and Prejudice by Jane Austen
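
All three are public-domain texts hosted by Project Gutenberg. If you want to fetch one yourself rather than rely on the demo's bundled loading, something like this works (the URL is illustrative, not necessarily what the demo uses):

// Illustrative fetch of one source text; requires .NET 6+ implicit usings.
using var http = new HttpClient();
string romeoAndJuliet = await http.GetStringAsync(
    "https://www.gutenberg.org/cache/epub/1513/pg1513.txt"); // Romeo and Juliet
Console.WriteLine($"Fetched {romeoAndJuliet.Length:N0} characters.");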

πŸ› οΈ Getting Started

πŸ“‹ Prerequisites

🧰 Qdrant Setup

Run Qdrant locally using Docker:

docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant
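
Port 6333 serves Qdrant's REST API and web dashboard; 6334 serves gRPC, which the .NET client uses. Once the container is up, you can verify it with a quick health check:

curl http://localhost:6333/healthz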

πŸ“₯ Download the Project

  1. πŸ“‚ Clone the repository:
git clone https://github.com/LM-Kit/lm-kit-net-samples.git
  2. πŸ“ Navigate to the demo directory:
cd lm-kit-net-samples/console_net/custom_chatbot_with_rag_qdrant_vector_store

▢️ Run the Application

  1. πŸ”¨ Build and run:
dotnet build
dotnet run

πŸ’¬ Example Usage

  1. Load both models (chat and embedding) from URLs or specify local paths.
  2. The demo loads the eBooks, embeds them with LM-Kit, and stores the resulting vectors in Qdrant.
  3. Enter your query, e.g.:
    • "Who are the main characters in Romeo and Juliet?"
    • "What is Captain Ahab’s goal in Moby Dick?"
  4. The chatbot retrieves matching content and generates a grounded response (retrieval sketched after this list).
  5. The app skips re-embedding if the data is already present in Qdrant.
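
The retrieval step (4) looks roughly like this with the standalone Qdrant.Client package (a sketch only: EmbedAsync is a stand-in for LM-Kit's embedding call, and the payload key "text" is an assumption about how chunks are stored):

using Qdrant.Client;

// Sketch of retrieval; the demo itself goes through LM-Kit's Qdrant connector.
var client = new QdrantClient("localhost", 6334);
float[] queryVector = await EmbedAsync("Who are the main characters in Romeo and Juliet?");
var hits = await client.SearchAsync("ebooks", queryVector, limit: 5); // top-5 chunks
// The concatenated chunks become the grounding context prepended to the user's
// question before it is handed to the chat model.
string context = string.Join("\n---\n", hits.Select(h => h.Payload["text"].StringValue));

static Task<float[]> EmbedAsync(string text) =>
    Task.FromResult(new float[384]); // stand-in: all zeros; use a real 384-dim embedding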

πŸ› οΈ Special Behavior

  • βœ… Reuse of stored collections: Embeddings are reused if already indexed.
  • πŸ”„ Fresh vector indexing: If a collection is not present, it’s created and persisted.
  • πŸ“‰ Dynamic top-k: on GPU-accelerated systems, more partitions are retrieved to improve answer accuracy.
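
A sketch of the dynamic top-k idea; the detection method and values here are hypothetical, not the demo's actual logic:

// Hypothetical illustration of dynamic top-k; the demo's detection and values differ.
bool gpuAvailable = Environment.GetEnvironmentVariable("CUDA_VISIBLE_DEVICES") is not null;
int topK = gpuAvailable ? 8 : 3; // retrieve more partitions when compute headroom allows
Console.WriteLine($"Retrieving top-{topK} partitions.");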

πŸ”“ Licensing

To run the application, you can use a free community license available at:

πŸ‘‰ https://lm-kit.com/products/community-edition/


By combining the power of semantic search and LLMs, this demo offers a robust foundation for building production-ready, intelligent chatbots in C# with persistent, scalable vector storage.