Build a Persistent Document Knowledge Base with Vector Storage

RAG pipelines lose their indexed data when the application shuts down unless embeddings are persisted to disk. For enterprise knowledge bases that grow incrementally over months, re-indexing thousands of documents on every restart is impractical. LM-Kit.NET's IVectorStore interface and its FileSystemVectorStore implementation save embeddings to disk automatically: new documents are indexed incrementally, cache hits skip re-embedding, and the knowledge base survives restarts. This tutorial builds a persistent, incrementally growing document knowledge base.


Why Persistent Vector Storage Matters

Two enterprise problems that persistent embedding storage solves:

  1. Application restarts without re-indexing. A customer support chatbot with 5,000 indexed knowledge articles takes 30 minutes to re-embed on startup. With persistent storage, the application starts in seconds by loading pre-computed embeddings from disk. Only new or updated articles need embedding.
  2. Incremental knowledge growth. An engineering team adds new design documents, test reports, and specifications weekly. A persistent knowledge base absorbs new documents without re-processing the existing collection, keeping indexing time proportional to new content, not total content.

Prerequisites

Requirement  Minimum
.NET SDK     8.0+
VRAM         2+ GB (for embedding model)
Disk         ~1 GB free for model + storage for embeddings

Step 1: Create the Project

dotnet new console -n PersistentKnowledgeBase
cd PersistentKnowledgeBase
dotnet add package LM-Kit.NET

Step 2: Understand the Architecture

                     ┌─────────────────────────────────────────────┐
                     │          Persistent Knowledge Base          │
                     │                                             │
  New documents ───► │  DocumentRag                                │
                     │      │                                      │
                     │      ▼                                      │
                     │  Chunk ► Embed ► Store                      │
                     │                    │                        │
                     │                    ▼                        │
                     │           FileSystemVectorStore             │
                     │           (disk-backed)                     │
                     │                    │                        │
                     │       ┌────────────┴────────────┐           │
                     │       ▼                         ▼           │
                     │   embeddings/              On restart:      │
                     │   ├── doc1.bin             Load from disk   │
                     │   ├── doc2.bin             (no re-embedding)│
                     │   └── doc3.bin                              │
                     │                                             │
  Query ───────────► │  Similarity Search ► Top-K results          │
                     └─────────────────────────────────────────────┘
Component              Purpose
DocumentRag            Ingests, chunks, embeds, and queries documents
FileSystemVectorStore  Persists embeddings to disk as binary files
IVectorStore           Interface for custom backends (database, cloud)
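
The IVectorStore seam is what makes the storage layer swappable. Below is a minimal wiring sketch; MyDatabaseVectorStore, useDatabaseBackend, and connectionString are hypothetical placeholders for illustration, not LM-Kit types.

// Choose a storage backend at startup. FileSystemVectorStore is the built-in
// disk-backed store; any IVectorStore implementation can take its place.
IVectorStore store = useDatabaseBackend
    ? new MyDatabaseVectorStore(connectionString)      // hypothetical custom backend
    : new FileSystemVectorStore("knowledge_base_store");

// DocumentRag depends only on the interface, so the rest of the pipeline
// is unchanged regardless of where the vectors live.
var rag = new DocumentRag(embeddingModel, store);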

Step 3: Create a Persistent Knowledge Base

using System.Text;
using LMKit.Data;
using LMKit.Data.Storage;
using LMKit.Model;
using LMKit.Retrieval;
using LMKit.TextGeneration;

LMKit.Licensing.LicenseManager.SetLicenseKey("");

Console.InputEncoding = Encoding.UTF8;
Console.OutputEncoding = Encoding.UTF8;

// ──────────────────────────────────────
// 1. Load the embedding model
// ──────────────────────────────────────
Console.WriteLine("Loading embedding model...");
using LM embeddingModel = LM.LoadFromModelID("embeddinggemma-300m",
    downloadingProgress: (_, len, read) =>
    {
        if (len.HasValue) Console.Write($"\r  Downloading: {(double)read / len.Value * 100:F1}%   ");
        return true;
    },
    loadingProgress: p => { Console.Write($"\r  Loading: {p * 100:F0}%   "); return true; });
Console.WriteLine("\n");

// ──────────────────────────────────────
// 2. Configure persistent vector store
// ──────────────────────────────────────
string storageDir = "knowledge_base_store";
Directory.CreateDirectory(storageDir);

var vectorStore = new FileSystemVectorStore(storageDir);

Console.WriteLine($"Vector store directory: {Path.GetFullPath(storageDir)}");

// ──────────────────────────────────────
// 3. Create DocumentRag with persistent storage
// ──────────────────────────────────────
var rag = new DocumentRag(embeddingModel, vectorStore)
{
    MaxChunkSize = 512
};
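
MaxChunkSize caps how much text goes into each embedded chunk. Smaller chunks produce more precise matches; larger chunks preserve more surrounding context per hit. 512 is a reasonable middle ground for mixed document types.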

Step 4: Ingest Documents Incrementally

// Setup identical to Step 3: usings, license key, console encoding,
// embedding model load, FileSystemVectorStore, and DocumentRag creation.

// ──────────────────────────────────────
// 4. Ingest documents (only new ones are embedded)
// ──────────────────────────────────────
string docsFolder = "documents";
if (!Directory.Exists(docsFolder))
{
    Console.WriteLine($"Create a '{docsFolder}' folder with documents, then run again.");
    return;
}

string[] supportedExtensions = { ".pdf", ".docx", ".txt", ".md", ".html" };

string[] files = Directory.GetFiles(docsFolder)
    .Where(f => supportedExtensions.Contains(Path.GetExtension(f).ToLowerInvariant()))
    .ToArray();

Console.WriteLine($"Found {files.Length} document(s). Ingesting (skipping cached)...\n");

string dataSourceId = "knowledge-base";

foreach (string filePath in files)
{
    string fileName = Path.GetFileName(filePath);
    string docId = Path.GetFileNameWithoutExtension(fileName);

    // Check if already indexed
    if (rag.DataSources.Any(ds => ds.HasSection(docId)))
    {
        Console.ForegroundColor = ConsoleColor.DarkGray;
        Console.WriteLine($"  {fileName}: already indexed (cached)");
        Console.ResetColor();
        continue;
    }

    Console.Write($"  {fileName}: indexing... ");

    try
    {
        using var attachment = new Attachment(filePath);
        var metadata = new DocumentRag.DocumentMetadata(
            attachment: attachment,
            id: docId,
            sourceUri: Path.GetFullPath(filePath));

        await rag.ImportDocumentAsync(attachment, metadata, dataSourceId);

        Console.ForegroundColor = ConsoleColor.Green;
        Console.WriteLine("done");
        Console.ResetColor();
    }
    catch (Exception ex)
    {
        Console.ForegroundColor = ConsoleColor.Red;
        Console.WriteLine($"failed: {ex.Message}");
        Console.ResetColor();
    }
}

Console.WriteLine($"\nIngestion complete. Data sources: {rag.DataSources.Count}");

Step 5: Query the Knowledge Base

// Setup identical to Step 3, plus one additional using directive:
using LMKit.TextGeneration.Chat;

// ──────────────────────────────────────
// 4. Load a chat model and query
// ──────────────────────────────────────
Console.WriteLine("\nLoading chat model...");
using LM chatModel = LM.LoadFromModelID("gemma3:4b",
    downloadingProgress: (_, len, read) =>
    {
        if (len.HasValue) Console.Write($"\r  Downloading: {(double)read / len.Value * 100:F1}%   ");
        return true;
    },
    loadingProgress: p => { Console.Write($"\r  Loading: {p * 100:F0}%   "); return true; });
Console.WriteLine("\n");

var chat = new SingleTurnConversation(chatModel)
{
    SystemPrompt = "Answer the question using only the provided context. " +
                   "If the context does not contain the answer, say so.",
    MaximumCompletionTokens = 512
};

chat.AfterTextCompletion += (_, e) =>
{
    if (e.SegmentType == TextSegmentType.UserVisible)
        Console.Write(e.Text);
};

Console.WriteLine("Ask questions about your documents (or 'quit' to exit):\n");

while (true)
{
    Console.ForegroundColor = ConsoleColor.Green;
    Console.Write("Question: ");
    Console.ResetColor();

    string? question = Console.ReadLine();
    if (string.IsNullOrWhiteSpace(question) || question.Equals("quit", StringComparison.OrdinalIgnoreCase))
        break;

    var matches = rag.FindMatchingPartitions(question, topK: 5, minScore: 0.25f);

    if (matches.Count == 0)
    {
        Console.WriteLine("No relevant passages found.\n");
        continue;
    }

    Console.ForegroundColor = ConsoleColor.DarkGray;
    foreach (var m in matches)
        Console.WriteLine($"  [{m.SectionIdentifier}] score={m.Similarity:F3}");
    Console.ResetColor();

    Console.ForegroundColor = ConsoleColor.Cyan;
    Console.Write("\nAnswer: ");
    Console.ResetColor();

    var answer = rag.QueryPartitions(question, matches, chat);
    Console.WriteLine($"\n  [{answer.Response.GeneratedTokenCount} tokens]\n");
}
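
Both retrieval knobs are worth tuning: raising minScore suppresses weak matches at the risk of dropping loosely worded but relevant passages, while raising topK gives the chat model more context at the cost of prompt tokens and latency.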

Step 6: Update and Delete Documents

Keep the knowledge base current by updating and removing documents:

// Setup identical to Step 3: usings, license key, console encoding,
// embedding model load, FileSystemVectorStore, and DocumentRag creation.

string dataSourceId = "knowledge-base";

// Delete an outdated document
string outdatedDocId = "old-policy-v1";
bool deleted = await rag.DeleteDocumentAsync(outdatedDocId, dataSourceId);
if (deleted)
    Console.WriteLine($"Removed '{outdatedDocId}' from knowledge base.");

// Update a document (delete + re-index)
string updatedDocId = "employee-handbook";
await rag.DeleteDocumentAsync(updatedDocId, dataSourceId);

using var updatedDoc = new Attachment("documents/employee-handbook-v2.pdf");
var updatedMeta = new DocumentRag.DocumentMetadata(
    attachment: updatedDoc,
    id: updatedDocId,
    sourceUri: "documents/employee-handbook-v2.pdf");

await rag.ImportDocumentAsync(updatedDoc, updatedMeta, dataSourceId);
Console.WriteLine($"Updated '{updatedDocId}' in knowledge base.");

Step 7: Adding Metadata for Filtered Queries

Tag documents with metadata for filtered retrieval:

// Setup identical to Step 3: usings, license key, console encoding,
// embedding model load, FileSystemVectorStore, and DocumentRag creation.

// Custom metadata for filtering
var customMeta = new MetadataCollection
{
    { "department", "finance" },
    { "year", "2024" },
    { "quarter", "Q4" },
    { "confidentiality", "internal" }
};

var taggedMetadata = new DocumentRag.DocumentMetadata(
    name: "Q4 2024 Earnings Report",
    id: "earnings-q4-2024",
    sourceUri: "reports/earnings-q4-2024.pdf",
    customMetadata: customMeta);

using var attachment = new Attachment("reports/earnings-q4-2024.pdf");
await rag.ImportDocumentAsync(attachment, taggedMetadata, "financial-reports");
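
This step does not show a server-side metadata filter, so here is a client-side sketch: over-fetch matches, then keep only those whose document id carries the desired tag. The departmentById map is an illustration that mirrors the custom metadata above, and the sketch assumes SectionIdentifier carries the document id, as the ingestion check in Step 4 suggests.

// Local map from document id to the tags attached at import time.
var departmentById = new Dictionary<string, string>
{
    ["earnings-q4-2024"] = "finance"
};

// Over-fetch, then filter down to finance documents.
var matches = rag.FindMatchingPartitions("What was Q4 revenue?", topK: 10, minScore: 0.25f)
    .Where(m => departmentById.TryGetValue(m.SectionIdentifier, out var dept)
                && dept == "finance")
    .ToList();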

Model Selection

Embedding Models

Model ID              Size     Dimensions  Best For
embeddinggemma-300m   ~300 MB  256         General-purpose, fast, low memory (recommended)
qwen3-embedding:0.6b  ~600 MB  1024        Higher dimension, better recall
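
Embeddings are model-specific: vectors produced by different models (or at different dimensions) are not comparable, so switching embedding models means deleting the existing store and re-indexing.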

Chat Models

Model ID   VRAM     Best For
gemma3:4b  ~3.5 GB  Good quality, fast responses
qwen3:8b   ~6 GB    Best balance for knowledge base Q&A

Common Issues

Problem: Slow first run, fast subsequent runs
  Cause: The first run embeds every document; subsequent runs load from cache.
  Fix:   None needed; this is the expected behavior.

Problem: "Already indexed" reported for every file
  Cause: The documents were indexed on a previous run.
  Fix:   Delete the knowledge_base_store/ folder to force a full re-index.

Problem: Large disk usage
  Cause: Many documents with high-dimensional embeddings.
  Fix:   Use embeddinggemma-300m (256 dimensions) for smaller storage.

Problem: Stale results after a document update
  Cause: The old embeddings are still in the store.
  Fix:   Delete the document before re-indexing it (see Step 6).

Problem: Out of memory during large batch ingestion
  Cause: Too many documents are processed at once.
  Fix:   Process in batches and dispose each attachment after import (see the sketch below).
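
For the last fix, a minimal batching sketch over the Step 4 ingestion loop; the batch size is arbitrary, and files, rag, and dataSourceId are the variables from Step 4.

// Process files in fixed-size batches; each attachment is disposed right
// after its import, so native buffers never accumulate across the run.
const int batchSize = 25;
int processed = 0;

foreach (string[] batch in files.Chunk(batchSize))
{
    foreach (string filePath in batch)
    {
        using var attachment = new Attachment(filePath);   // disposed per file
        var metadata = new DocumentRag.DocumentMetadata(
            attachment: attachment,
            id: Path.GetFileNameWithoutExtension(filePath),
            sourceUri: Path.GetFullPath(filePath));
        await rag.ImportDocumentAsync(attachment, metadata, dataSourceId);
    }

    processed += batch.Length;
    Console.WriteLine($"  Checkpoint: {processed}/{files.Length} documents indexed.");
}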

Next Steps