Build a Persistent Document Knowledge Base with Vector Storage

RAG pipelines lose their indexed data when the application shuts down unless embeddings are persisted to disk. For enterprise knowledge bases that grow incrementally over months, re-indexing thousands of documents on every restart is impractical. LM-Kit.NET's IVectorStore interface and its FileSystemVectorStore implementation save embeddings to disk automatically: new documents are indexed incrementally, cache hits skip re-embedding, and the knowledge base survives restarts. This tutorial builds a persistent, incrementally growing document knowledge base.


Why Persistent Vector Storage Matters

Two enterprise problems that persistent embedding storage solves:

  1. Application restarts without re-indexing. A customer support chatbot with 5,000 indexed knowledge articles takes 30 minutes to re-embed on startup. With persistent storage, the application starts in seconds by loading pre-computed embeddings from disk. Only new or updated articles need embedding.
  2. Incremental knowledge growth. An engineering team adds new design documents, test reports, and specifications weekly. A persistent knowledge base absorbs new documents without re-processing the existing collection, keeping indexing time proportional to new content, not total content.

Prerequisites

Requirement  Minimum
.NET SDK     8.0+
VRAM         2+ GB (for embedding model)
Disk         ~1 GB free for model + storage for embeddings

Step 1: Create the Project

dotnet new console -n PersistentKnowledgeBase
cd PersistentKnowledgeBase
dotnet add package LM-Kit.NET

Step 2: Understand the Architecture

                     ┌─────────────────────────────────────────────┐
                     │          Persistent Knowledge Base          │
                     │                                             │
  New documents ───► │  DocumentRag                                │
                     │      │                                      │
                     │      ▼                                      │
                     │  Chunk ► Embed ► Store                      │
                     │                    │                        │
                     │                    ▼                        │
                     │           FileSystemVectorStore             │
                     │           (disk-backed)                     │
                     │                    │                        │
                     │       ┌────────────┴────────────┐           │
                     │       ▼                         ▼           │
                     │   embeddings/              On restart:      │
                     │   ├── doc1.bin             Load from disk   │
                     │   ├── doc2.bin             (no re-embedding)│
                     │   └── doc3.bin                              │
                     │                                             │
  Query ───────────► │  Similarity Search ► Top-K results          │
                     └─────────────────────────────────────────────┘
Component              Purpose
DocumentRag            Ingests, chunks, embeds, and queries documents
FileSystemVectorStore  Persists embeddings to disk as binary files
IVectorStore           Interface for custom backends (database, cloud)
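
The IVectorStore seam is what makes the storage layer swappable. Below is a minimal wiring sketch; MyDatabaseVectorStore, useDatabaseBackend, and connectionString are hypothetical placeholders for illustration, not LM-Kit types.

// Choose a storage backend at startup. FileSystemVectorStore is the built-in
// disk-backed store; any IVectorStore implementation can take its place.
IVectorStore store = useDatabaseBackend
    ? new MyDatabaseVectorStore(connectionString)      // hypothetical custom backend
    : new FileSystemVectorStore("knowledge_base_store");

// DocumentRag depends only on the interface, so the rest of the pipeline
// is unchanged regardless of where the vectors live.
var rag = new DocumentRag(embeddingModel, store);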

Step 3: Create a Persistent Knowledge Base

using System.Text;
using LMKit.Data;
using LMKit.Data.Storage;
using LMKit.Model;
using LMKit.Retrieval;
using LMKit.TextGeneration;

LMKit.Licensing.LicenseManager.SetLicenseKey("");

Console.InputEncoding = Encoding.UTF8;
Console.OutputEncoding = Encoding.UTF8;

// ──────────────────────────────────────
// 1. Load the embedding model
// ──────────────────────────────────────
Console.WriteLine("Loading embedding model...");
using LM embeddingModel = LM.LoadFromModelID("embeddinggemma-300m",
    downloadingProgress: (_, len, read) =>
    {
        if (len.HasValue) Console.Write($"\r  Downloading: {(double)read / len.Value * 100:F1}%   ");
        return true;
    },
    loadingProgress: p => { Console.Write($"\r  Loading: {p * 100:F0}%   "); return true; });
Console.WriteLine("\n");

// ──────────────────────────────────────
// 2. Configure persistent vector store
// ──────────────────────────────────────
string storageDir = "knowledge_base_store";
Directory.CreateDirectory(storageDir);

var vectorStore = new FileSystemVectorStore(storageDir);

Console.WriteLine($"Vector store directory: {Path.GetFullPath(storageDir)}");

// ──────────────────────────────────────
// 3. Create DocumentRag with persistent storage
// ──────────────────────────────────────
var rag = new DocumentRag(embeddingModel, vectorStore)
{
    MaxChunkSize = 512
};
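
MaxChunkSize caps how much text goes into each embedded chunk. Smaller chunks produce more precise matches; larger chunks preserve more surrounding context per hit. 512 is a reasonable middle ground for mixed document types.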

Step 4: Ingest Documents Incrementally

// Setup identical to Step 3: usings, license key, console encoding,
// embedding model load, FileSystemVectorStore, and DocumentRag creation.

// ──────────────────────────────────────
// 4. Ingest documents (only new ones are embedded)
// ──────────────────────────────────────
string docsFolder = "documents";
if (!Directory.Exists(docsFolder))
{
    Console.WriteLine($"Create a '{docsFolder}' folder with documents, then run again.");
    return;
}

string[] supportedExtensions = { ".pdf", ".docx", ".txt", ".md", ".html" };

string[] files = Directory.GetFiles(docsFolder)
    .Where(f => supportedExtensions.Contains(Path.GetExtension(f).ToLowerInvariant()))
    .ToArray();

Console.WriteLine($"Found {files.Length} document(s). Ingesting (skipping cached)...\n");

string dataSourceId = "knowledge-base";

foreach (string filePath in files)
{
    string fileName = Path.GetFileName(filePath);
    string docId = Path.GetFileNameWithoutExtension(fileName);

    // Check if already indexed
    if (rag.DataSources.Any(ds => ds.HasSection(docId)))
    {
        Console.ForegroundColor = ConsoleColor.DarkGray;
        Console.WriteLine($"  {fileName}: already indexed (cached)");
        Console.ResetColor();
        continue;
    }

    Console.Write($"  {fileName}: indexing... ");

    try
    {
        using var attachment = new Attachment(filePath);
        var metadata = new DocumentRag.DocumentMetadata(
            attachment: attachment,
            id: docId,
            sourceUri: Path.GetFullPath(filePath));

        await rag.ImportDocumentAsync(attachment, metadata, dataSourceId);

        Console.ForegroundColor = ConsoleColor.Green;
        Console.WriteLine("done");
        Console.ResetColor();
    }
    catch (Exception ex)
    {
        Console.ForegroundColor = ConsoleColor.Red;
        Console.WriteLine($"failed: {ex.Message}");
        Console.ResetColor();
    }
}

Console.WriteLine($"\nIngestion complete. Data sources: {rag.DataSources.Count}");

Step 5: Query the Knowledge Base

// Setup identical to Step 3, plus one additional using directive:
using LMKit.TextGeneration.Chat;

// ──────────────────────────────────────
// 4. Load a chat model and query
// ──────────────────────────────────────
Console.WriteLine("\nLoading chat model...");
using LM chatModel = LM.LoadFromModelID("gemma3:4b",
    downloadingProgress: (_, len, read) =>
    {
        if (len.HasValue) Console.Write($"\r  Downloading: {(double)read / len.Value * 100:F1}%   ");
        return true;
    },
    loadingProgress: p => { Console.Write($"\r  Loading: {p * 100:F0}%   "); return true; });
Console.WriteLine("\n");

var chat = new SingleTurnConversation(chatModel)
{
    SystemPrompt = "Answer the question using only the provided context. " +
                   "If the context does not contain the answer, say so.",
    MaximumCompletionTokens = 512
};

chat.AfterTextCompletion += (_, e) =>
{
    if (e.SegmentType == TextSegmentType.UserVisible)
        Console.Write(e.Text);
};

Console.WriteLine("Ask questions about your documents (or 'quit' to exit):\n");

while (true)
{
    Console.ForegroundColor = ConsoleColor.Green;
    Console.Write("Question: ");
    Console.ResetColor();

    string? question = Console.ReadLine();
    if (string.IsNullOrWhiteSpace(question) || question.Equals("quit", StringComparison.OrdinalIgnoreCase))
        break;

    var matches = rag.FindMatchingPartitions(question, topK: 5, minScore: 0.25f);

    if (matches.Count == 0)
    {
        Console.WriteLine("No relevant passages found.\n");
        continue;
    }

    Console.ForegroundColor = ConsoleColor.DarkGray;
    foreach (var m in matches)
        Console.WriteLine($"  [{m.SectionIdentifier}] score={m.Similarity:F3}");
    Console.ResetColor();

    Console.ForegroundColor = ConsoleColor.Cyan;
    Console.Write("\nAnswer: ");
    Console.ResetColor();

    var answer = rag.QueryPartitions(question, matches, chat);
    Console.WriteLine($"\n  [{answer.Response.GeneratedTokenCount} tokens]\n");
}
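
Both retrieval knobs are worth tuning: raising minScore suppresses weak matches at the risk of dropping loosely worded but relevant passages, while raising topK gives the chat model more context at the cost of prompt tokens and latency.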

Step 6: Update and Delete Documents

Keep the knowledge base current by updating and removing documents:

// Setup identical to Step 3: usings, license key, console encoding,
// embedding model load, FileSystemVectorStore, and DocumentRag creation.

string dataSourceId = "knowledge-base";

// Delete an outdated document
string outdatedDocId = "old-policy-v1";
bool deleted = await rag.DeleteDocumentAsync(outdatedDocId, dataSourceId);
if (deleted)
    Console.WriteLine($"Removed '{outdatedDocId}' from knowledge base.");

// Update a document (delete + re-index)
string updatedDocId = "employee-handbook";
await rag.DeleteDocumentAsync(updatedDocId, dataSourceId);

using var updatedDoc = new Attachment("documents/employee-handbook-v2.pdf");
var updatedMeta = new DocumentRag.DocumentMetadata(
    attachment: updatedDoc,
    id: updatedDocId,
    sourceUri: "documents/employee-handbook-v2.pdf");

await rag.ImportDocumentAsync(updatedDoc, updatedMeta, dataSourceId);
Console.WriteLine($"Updated '{updatedDocId}' in knowledge base.");

Step 7: Adding Metadata for Filtered Queries

Tag documents with metadata for filtered retrieval:

// Setup identical to Step 3: usings, license key, console encoding,
// embedding model load, FileSystemVectorStore, and DocumentRag creation.

// Custom metadata for filtering
var customMeta = new MetadataCollection
{
    { "department", "finance" },
    { "year", "2024" },
    { "quarter", "Q4" },
    { "confidentiality", "internal" }
};

var taggedMetadata = new DocumentRag.DocumentMetadata(
    name: "Q4 2024 Earnings Report",
    id: "earnings-q4-2024",
    sourceUri: "reports/earnings-q4-2024.pdf",
    customMetadata: customMeta);

using var attachment = new Attachment("reports/earnings-q4-2024.pdf");
await rag.ImportDocumentAsync(attachment, taggedMetadata, "financial-reports");
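
This step does not show a server-side metadata filter, so here is a client-side sketch: over-fetch matches, then keep only those whose document id carries the desired tag. The departmentById map is an illustration that mirrors the custom metadata above, and the sketch assumes SectionIdentifier carries the document id, as the ingestion check in Step 4 suggests.

// Local map from document id to the tags attached at import time.
var departmentById = new Dictionary<string, string>
{
    ["earnings-q4-2024"] = "finance"
};

// Over-fetch, then filter down to finance documents.
var matches = rag.FindMatchingPartitions("What was Q4 revenue?", topK: 10, minScore: 0.25f)
    .Where(m => departmentById.TryGetValue(m.SectionIdentifier, out var dept)
                && dept == "finance")
    .ToList();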

Model Selection

Embedding Models

Model ID              Size     Dimensions  Best For
embeddinggemma-300m   ~300 MB  256         General-purpose, fast, low memory (recommended)
qwen3-embedding:0.6b  ~600 MB  1024        Higher dimension, better recall
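
Embeddings are model-specific: vectors produced by different models (or at different dimensions) are not comparable, so switching embedding models means deleting the existing store and re-indexing.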

Chat Models

Model ID   VRAM     Best For
gemma3:4b  ~3.5 GB  Good quality, fast responses
qwen3:8b   ~6 GB    Best balance for knowledge base Q&A

Common Issues

Problem: Slow first run, fast subsequent runs
  Cause: The first run embeds every document; subsequent runs load from cache.
  Fix:   None needed; this is the expected behavior.

Problem: "Already indexed" reported for every file
  Cause: The documents were indexed on a previous run.
  Fix:   Delete the knowledge_base_store/ folder to force a full re-index.

Problem: Large disk usage
  Cause: Many documents with high-dimensional embeddings.
  Fix:   Use embeddinggemma-300m (256 dimensions) for smaller storage.

Problem: Stale results after a document update
  Cause: The old embeddings are still in the store.
  Fix:   Delete the document before re-indexing it (see Step 6).

Problem: Out of memory during large batch ingestion
  Cause: Too many documents are processed at once.
  Fix:   Process in batches and dispose each attachment after import (see the sketch below).
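
For the last fix, a minimal batching sketch over the Step 4 ingestion loop; the batch size is arbitrary, and files, rag, and dataSourceId are the variables from Step 4.

// Process files in fixed-size batches; each attachment is disposed right
// after its import, so native buffers never accumulate across the run.
const int batchSize = 25;
int processed = 0;

foreach (string[] batch in files.Chunk(batchSize))
{
    foreach (string filePath in batch)
    {
        using var attachment = new Attachment(filePath);   // disposed per file
        var metadata = new DocumentRag.DocumentMetadata(
            attachment: attachment,
            id: Path.GetFileNameWithoutExtension(filePath),
            sourceUri: Path.GetFullPath(filePath));
        await rag.ImportDocumentAsync(attachment, metadata, dataSourceId);
    }

    processed += batch.Length;
    Console.WriteLine($"  Checkpoint: {processed}/{files.Length} documents indexed.");
}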

Next Steps