How Do I Build a Private Document Q&A System?

TL;DR

LM-Kit.NET provides two approaches: PdfChat, for quick setup with automatic size-based routing (full-document context for small files, passage retrieval for large ones), and RagEngine, for fully customizable RAG pipelines. Both run entirely on your machine with no cloud dependency. Supported formats include PDF, DOCX, XLSX, PPTX, HTML, EML, MBOX, and images (via OCR).

Approach 1: PdfChat (Quick Setup)

PdfChat is a ready-to-use conversational Q&A class that handles document loading, chunking, retrieval, and multi-turn chat automatically:

using LMKit.Model;
using LMKit.Retrieval;

using LM chatModel = LM.LoadFromModelID("qwen3.5:9b");
using LM embeddingModel = LM.LoadFromModelID("embeddinggemma-300m");

var pdfChat = new PdfChat(chatModel, embeddingModel);

// Load one or more documents
pdfChat.LoadDocument("quarterly-report.pdf");
pdfChat.LoadDocument("product-manual.docx");

// Ask questions (multi-turn, conversational)
string answer1 = pdfChat.Submit("What was the Q3 revenue?");
string answer2 = pdfChat.Submit("How does that compare to Q2?");

// Source references from the last answer
foreach (var source in pdfChat.LastSourceReferences)
    Console.WriteLine($"  Source: {source.DocumentName}, Page {source.PageNumber}");

Automatic Size-Based Routing

PdfChat automatically chooses a strategy for each document based on its token count:

Document Size	Strategy	How It Works
Under 4096 tokens (default)	Full-document context	Entire document included in the prompt
Over 4096 tokens	Passage retrieval	Only relevant passages retrieved via embedding search

The threshold is configurable:

pdfChat.FullDocumentTokenBudget = 8192;  // Raise for larger full-doc inclusion
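
The routing rule in the table can be pictured as a single comparison against the token budget. The following is only a minimal sketch of that idea, not PdfChat's actual implementation: FitsFullDocument and DescribeStrategy are hypothetical helpers, and treating a document that sits exactly at the budget as "fits" is an assumption, since the table only covers the under and over cases.

```csharp
using System;

// Hypothetical helper, not part of the LM-Kit.NET API: returns true when a
// document is small enough to be placed in the prompt in full.
// Assumption: a document exactly at the budget still counts as fitting.
static bool FitsFullDocument(int documentTokens, int fullDocumentTokenBudget = 4096)
{
    if (documentTokens < 0)
        throw new ArgumentOutOfRangeException(nameof(documentTokens));
    return documentTokens <= fullDocumentTokenBudget;
}

// Hypothetical helper: names the strategy from the table for a given size.
static string DescribeStrategy(int documentTokens, int budget = 4096) =>
    FitsFullDocument(documentTokens, budget) ? "full-document context" : "passage retrieval";

Console.WriteLine(DescribeStrategy(1_200));        // full-document context
Console.WriteLine(DescribeStrategy(6_000));        // passage retrieval
Console.WriteLine(DescribeStrategy(6_000, 8192));  // full-document context
```

The last call shows the effect of raising the budget: a 6,000-token document that would be routed to passage retrieval under the default now fits in the prompt as a whole.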

Query Generation Modes

PdfChat can transform a question before retrieval, which matters most for follow-up questions in multi-turn conversations:

Mode	Behavior
Original (default)	Use the question as-is
Contextual	Reformulate using conversation history ("How about Q2?" → "What was the Q2 revenue?")
MultiQuery	Generate multiple query variants for broader retrieval
HypotheticalAnswer	Generate a hypothetical answer first, then retrieve (HyDE)
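
To make the three non-default modes concrete, here is a conceptual sketch of what each one hands to the retriever. It is not how PdfChat implements them: every function name is hypothetical, the prompts are illustrative, and the llm delegate stands in for any text-generation call. A stub model is used so the sketch runs without loading anything.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// All names below are hypothetical; 'llm' stands in for any text-generation call.

// Contextual: fold the conversation history into a standalone question.
static string ContextualQuery(IEnumerable<string> history, string question, Func<string, string> llm) =>
    llm("Rewrite the last question as a standalone question.\n" +
        "History:\n" + string.Join("\n", history) + "\nLast question: " + question).Trim();

// MultiQuery: ask for several phrasings, one per line, and keep the original too.
static List<string> MultiQueries(string question, Func<string, string> llm, int variants = 3)
{
    string raw = llm($"Write {variants} different phrasings of this question, one per line:\n{question}");
    var queries = new List<string> { question };
    queries.AddRange(raw.Split('\n', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries));
    return queries.Distinct(StringComparer.OrdinalIgnoreCase).Take(variants + 1).ToList();
}

// HypotheticalAnswer (HyDE): search with a drafted answer instead of the question.
static string HypotheticalAnswerQuery(string question, Func<string, string> llm) =>
    llm("Write a short passage that plausibly answers: " + question).Trim();

// Stub model with canned replies, used only to make the sketch runnable.
Func<string, string> stubLlm = prompt =>
    prompt.StartsWith("Rewrite", StringComparison.Ordinal) ? "What was the Q2 revenue?" :
    prompt.StartsWith("Write 3", StringComparison.Ordinal)
        ? "What was revenue in Q2?\nHow much revenue did Q2 bring in?\nQ2 revenue figure?"
        : "A placeholder passage that reads like an answer about Q2 revenue.";

var history = new[] { "User: What was the Q3 revenue?", "Assistant: (previous answer)" };
Console.WriteLine(ContextualQuery(history, "How about Q2?", stubLlm));
foreach (string q in MultiQueries("What was the Q2 revenue?", stubLlm))
    Console.WriteLine(q);
Console.WriteLine(HypotheticalAnswerQuery("What was the Q2 revenue?", stubLlm));
```

The trade-off is cost versus recall: Original needs no extra generation, while the other three each spend at least one model call before retrieval starts.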

Approach 2: RagEngine (Full Control)

For custom RAG pipelines with fine-grained control over chunking, retrieval, and reranking:

using LMKit.Model;
using LMKit.Retrieval;
using LMKit.Embeddings;
using LMKit.TextGeneration;

using LM embeddingModel = LM.LoadFromModelID("bge-m3");
var ragEngine = new RagEngine(embeddingModel);

// Customize chunking before importing, so the settings apply to the documents below
ragEngine.DefaultChunking = new TextChunking
{
    MaxChunkSize = 300,    // Maximum tokens per chunk
    MaxOverlapSize = 50    // Tokens shared between neighboring chunks to preserve context
};

// Import documents with automatic chunking
ragEngine.ImportDocument("knowledge-base/manual.pdf");
ragEngine.ImportDocument("knowledge-base/faq.html");

// Optional: add a reranker for higher precision
using LM rerankerModel = LM.LoadFromModelID("bge-m3-reranker");
ragEngine.Reranker = new Reranker(rerankerModel);

// Query with a conversation (chat model loaded as in Approach 1)
using LM chatModel = LM.LoadFromModelID("qwen3.5:9b");
var chat = new MultiTurnConversation(chatModel);
string answer = ragEngine.QueryWithContext(chat, "What are the safety requirements?");
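
To show what the chunk size and overlap settings mean, and how embedding search picks passages, the sketch below splits a token sequence into overlapping windows and ranks toy vectors by cosine similarity. It is a simplified illustration under stated assumptions, not LM-Kit.NET code: ChunkWithOverlap, Cosine, and TopK are hypothetical helpers, the tokens are placeholders, and the hand-written vectors stand in for real embeddings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helpers; they only illustrate the idea and are not LM-Kit.NET code.

// Split a token sequence into windows of at most maxChunkSize tokens, where each
// window starts with the last maxOverlapSize tokens of the previous one.
static List<string[]> ChunkWithOverlap(string[] tokens, int maxChunkSize, int maxOverlapSize)
{
    if (maxChunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(maxChunkSize));
    if (maxOverlapSize < 0 || maxOverlapSize >= maxChunkSize)
        throw new ArgumentOutOfRangeException(nameof(maxOverlapSize));

    var chunks = new List<string[]>();
    int step = maxChunkSize - maxOverlapSize;
    for (int start = 0; start < tokens.Length; start += step)
    {
        int length = Math.Min(maxChunkSize, tokens.Length - start);
        chunks.Add(tokens[start..(start + length)]);
        if (start + length >= tokens.Length) break; // this window reached the end
    }
    return chunks;
}

// Cosine similarity between two vectors of equal length (0 if either is all zeros).
static double Cosine(float[] a, float[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return (normA == 0 || normB == 0) ? 0 : dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Indices of the k chunk vectors most similar to the query vector, best first.
static List<int> TopK(float[] query, IReadOnlyList<float[]> chunkVectors, int k) =>
    Enumerable.Range(0, chunkVectors.Count)
        .OrderByDescending(i => Cosine(query, chunkVectors[i]))
        .Take(k)
        .ToList();

// 700 placeholder tokens with the settings used above: 300 per chunk, 50 overlapping.
string[] tokens = Enumerable.Range(0, 700).Select(i => $"t{i}").ToArray();
List<string[]> chunks = ChunkWithOverlap(tokens, maxChunkSize: 300, maxOverlapSize: 50);
Console.WriteLine(chunks.Count);      // 3
Console.WriteLine(chunks[1][0]);      // t250

// Toy 2-D vectors standing in for embeddings of three chunks.
var chunkVectors = new List<float[]> { new[] { 0f, 1f }, new[] { 1f, 0.1f }, new[] { 0.5f, 0.5f } };
Console.WriteLine(string.Join(", ", TopK(new[] { 1f, 0f }, chunkVectors, k: 2)));  // 1, 2
```

With a 50-token overlap, a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing and embedding some text twice.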

Supported Document Formats

Format	Extension	Notes
PDF	.pdf	Native text extraction + optional OCR for scanned pages
Word	.docx	Full text and table extraction
Excel	.xlsx	Cell-level data extraction
PowerPoint	.pptx	Slide text extraction
HTML	.html	Structure-aware parsing
Email	.eml	Headers, body, and attachments
Mail archive	.mbox	Multi-message archive processing
Images	.png, .jpg, .tiff, etc.	Via VLM OCR or LM-Kit OCR
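
When loading a whole folder, it can help to filter files by the extensions in the table first. The sketch below assumes nothing about LM-Kit.NET itself: IsSupported and FindLoadableFiles are hypothetical helpers, and the extension set is copied from the table, whose image row ends in "etc.", so it is deliberately incomplete.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Extensions copied from the table above; the image list there ends in "etc.",
// so this set is deliberately incomplete.
var supported = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
{
    ".pdf", ".docx", ".xlsx", ".pptx", ".html", ".eml", ".mbox", ".png", ".jpg", ".tiff"
};

// Hypothetical helper: true when the file's extension is in the set (case-insensitive).
bool IsSupported(string path) => supported.Contains(Path.GetExtension(path));

// Hypothetical helper: lazily lists loadable files under a folder, recursively.
IEnumerable<string> FindLoadableFiles(string root) =>
    Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories).Where(IsSupported);

Console.WriteLine(IsSupported("quarterly-report.PDF"));  // True
Console.WriteLine(IsSupported("notes.txt"));             // False
```

Each path returned by FindLoadableFiles could then be passed to a loading call such as the LoadDocument or ImportDocument calls shown earlier.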

Privacy: Everything Stays Local

Both PdfChat and RagEngine run entirely on your machine:

No cloud calls. Documents are processed locally.
No data leaves your network. Embeddings are computed locally.
No API keys required. Models run via local inference.

This makes LM-Kit.NET well suited to sensitive documents such as legal contracts, medical records, financial reports, and proprietary code.

Related Topics

What embedding models does LM-Kit.NET support?: Choose the right embedding model for your Q&A system.
How do I handle documents larger than the model's context window?: Chunking and overflow strategies.
What OCR options does LM-Kit.NET provide?: Process scanned documents in your Q&A pipeline.
Build a RAG Pipeline Over Your Own Documents: Step-by-step RAG implementation guide.
