Build a RAG Pipeline Over Your Own Documents

Retrieval-Augmented Generation (RAG) grounds LLM responses in your own data. Instead of relying on the model's training data alone, RAG retrieves relevant passages from your documents and injects them into the prompt context. This reduces hallucinations on domain-specific questions and keeps answers current without retraining.

This tutorial builds a working RAG system that indexes text files, persists the index to disk, and answers questions using retrieved context.


Why Local RAG Matters

Two real-world problems that on-device RAG solves:

  1. Data sovereignty in regulated industries. Healthcare, finance, and legal organizations cannot send proprietary documents to cloud APIs. Local RAG keeps all data on-premises while still delivering AI-powered Q&A.
  2. Offline knowledge bases for field workers. Technicians, inspectors, and engineers need access to manuals and procedures in environments with no internet connectivity. Local RAG runs entirely on a laptop or edge device.

Prerequisites

Requirement   Minimum
.NET SDK      8.0+
RAM           16 GB recommended
VRAM          6 GB (for both models simultaneously)
Disk          ~4 GB free for model downloads

You will load two models: an embedding model (for indexing and search) and a chat model (for generating answers).


Step 1: Create the Project

dotnet new console -n RagQuickstart
cd RagQuickstart
dotnet add package LM-Kit.NET

Step 2: Understand the RAG Architecture

┌──────────────┐    chunk + embed    ┌────────────────┐
│  Your Docs   │ ──────────────────► │  DataSource    │
│  (.txt, .md) │                     │  (vector index)│
└──────────────┘                     └───────┬────────┘
                                             │ similarity search
┌──────────────┐    embed query              │
│  User Query  │ ───────────────────────────►│
└──────────────┘                             │
                                             ▼
                                     ┌───────────────┐
                                     │  Top-K chunks │
                                     └───────┬───────┘
                                             │ inject into prompt
                                             ▼
                                     ┌───────────────┐
                                     │  Chat Model   │ ──► Answer
                                     └───────────────┘

Key classes:

Class                    Role
RagEngine                Orchestrates indexing, search, and LLM querying
DataSource               Stores chunk embeddings (in-memory or file-backed)
TextChunking             Splits text into overlapping chunks
Embedder                 Generates vector embeddings
SingleTurnConversation   Generates the final answer from retrieved context

Step 3: Write the Program

using System.Text;
using LMKit.Data;
using LMKit.Model;
using LMKit.Retrieval;
using LMKit.TextGeneration;

LMKit.Licensing.LicenseManager.SetLicenseKey("");

Console.InputEncoding = Encoding.UTF8;
Console.OutputEncoding = Encoding.UTF8;

// ──────────────────────────────────────
// 1. Load models
// ──────────────────────────────────────
Console.WriteLine("Loading embedding model...");
using LM embeddingModel = LM.LoadFromModelID("embeddinggemma-300m",
    downloadingProgress: DownloadProgress,
    loadingProgress: LoadProgress);
Console.WriteLine(" Done.\n");

Console.WriteLine("Loading chat model...");
using LM chatModel = LM.LoadFromModelID("gemma3:4b",
    downloadingProgress: DownloadProgress,
    loadingProgress: LoadProgress);
Console.WriteLine(" Done.\n");

// ──────────────────────────────────────
// 2. Create the RAG engine with a file-backed index
// ──────────────────────────────────────
const string IndexPath = "knowledge_base.dat";

DataSource dataSource;
if (File.Exists(IndexPath))
{
    Console.WriteLine("Loading existing index from disk...");
    dataSource = DataSource.LoadFromFile(IndexPath, readOnly: false);
}
else
{
    dataSource = DataSource.CreateFileDataSource(IndexPath, "KnowledgeBase", embeddingModel);
}

var rag = new RagEngine(embeddingModel);
rag.AddDataSource(dataSource);

// Configure chunking
rag.DefaultIChunking = new TextChunking
{
    MaxChunkSize = 500,    // tokens per chunk
    MaxOverlapSize = 50    // overlap for context continuity
};

// ──────────────────────────────────────
// 3. Index documents (skip sections already indexed)
// ──────────────────────────────────────
string[] docs = {
    "docs/product-manual.txt",
    "docs/faq.txt",
    "docs/troubleshooting.txt"
};

foreach (string docPath in docs)
{
    string sectionName = Path.GetFileNameWithoutExtension(docPath);

    if (dataSource.HasSection(sectionName))
    {
        Console.WriteLine($"  Skipping {sectionName} (already indexed)");
        continue;
    }

    if (!File.Exists(docPath))
    {
        Console.WriteLine($"  Skipping {docPath} (file not found)");
        continue;
    }

    Console.WriteLine($"  Indexing {sectionName}...");
    string content = File.ReadAllText(docPath);
    rag.ImportText(content, "KnowledgeBase", sectionName);
}

Console.WriteLine($"\nIndex contains {dataSource.Sections.Count()} section(s).\n");

// ──────────────────────────────────────
// 4. Query loop
// ──────────────────────────────────────
var chat = new SingleTurnConversation(chatModel)
{
    SystemPrompt = "Answer the question using only the provided context. " +
                   "If the context does not contain the answer, say so.",
    MaximumCompletionTokens = 512
};

chat.AfterTextCompletion += (_, e) =>
{
    if (e.SegmentType == TextSegmentType.UserVisible)
        Console.Write(e.Text);
};

Console.WriteLine("Ask a question about your documents (or 'quit' to exit):\n");

while (true)
{
    Console.ForegroundColor = ConsoleColor.Green;
    Console.Write("Question: ");
    Console.ResetColor();

    string? query = Console.ReadLine();
    if (string.IsNullOrWhiteSpace(query) || query.Equals("quit", StringComparison.OrdinalIgnoreCase))
        break;

    // Retrieve top-3 most relevant chunks
    var matches = rag.FindMatchingPartitions(query, topK: 3, minScore: 0.3f);

    if (matches.Count == 0)
    {
        Console.WriteLine("No relevant passages found in the index.\n");
        continue;
    }

    // Show which sections were matched
    Console.ForegroundColor = ConsoleColor.DarkGray;
    foreach (var m in matches)
        Console.WriteLine($"  [{m.SectionIdentifier}] score={m.Similarity:F3}");
    Console.ResetColor();

    // Generate answer grounded in the retrieved context
    Console.ForegroundColor = ConsoleColor.Cyan;
    Console.Write("\nAnswer: ");
    Console.ResetColor();

    var result = rag.QueryPartitions(query, matches, chat);
    Console.WriteLine($"\n  [{result.GeneratedTokenCount} tokens, {result.TokenGenerationRate:F1} tok/s]\n");
}

// ──────────────────────────────────────
// Helper callbacks
// ──────────────────────────────────────
static bool DownloadProgress(string path, long? contentLength, long bytesRead)
{
    if (contentLength.HasValue)
        Console.Write($"\r  Downloading: {(double)bytesRead / contentLength.Value * 100:F1}%   ");
    return true;
}

static bool LoadProgress(float progress)
{
    Console.Write($"\r  Loading: {progress * 100:F0}%   ");
    return true;
}

Step 4: Create Sample Documents and Run

Create a docs/ folder with a few .txt files containing your content, then:

dotnet run

Example session:

Loading embedding model...
  Loading: 100%    Done.

Loading chat model...
  Loading: 100%    Done.

  Indexing product-manual...
  Indexing faq...

Index contains 2 section(s).

Ask a question about your documents (or 'quit' to exit):

Question: How do I reset the device to factory settings?
  [product-manual] score=0.847
  [faq] score=0.612

Answer: To reset the device to factory settings, press and hold the power button
and volume-down button simultaneously for 10 seconds until the LED flashes red.
The device will restart and all user data will be erased.
  [52 tokens, 38.7 tok/s]

Choosing an Embedding Model

Model ID              Dimensions   Size      Best For
embeddinggemma-300m   256          ~300 MB   General-purpose, fast, low memory
nomic-embed-text      768          ~260 MB   High-quality text embeddings

Both are downloaded automatically by LoadFromModelID. Start with embeddinggemma-300m; it is small, fast, and a reasonable general-purpose default.
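
If you switch embedding models, delete and rebuild the index: vectors produced by different models live in different embedding spaces and are not comparable. A minimal sketch of the swap, reusing the progress callbacks from Step 3:

// Only the model ID changes; the rest of the pipeline is untouched.
// Remember to delete knowledge_base.dat so the index is re-embedded.
using LM embeddingModel = LM.LoadFromModelID("nomic-embed-text",
    downloadingProgress: DownloadProgress,
    loadingProgress: LoadProgress);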


Tuning Retrieval Quality

Chunk Size

Chunk Size (tokens)   Effect
Small (200-300)       More precise matches, but may split important context
Medium (400-500)      Good default balance
Large (800-1000)      Better for long-form content, less precise matching
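
These sizes map directly onto the TextChunking settings from Step 3. A sketch for long-form content; the exact numbers are illustrative starting points, not tuned values:

rag.DefaultIChunking = new TextChunking
{
    MaxChunkSize = 800,     // larger chunks keep long passages intact
    MaxOverlapSize = 100    // scale the overlap up with the chunk size
};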

Search Parameters

var matches = rag.FindMatchingPartitions(
    query,
    topK: 5,                    // return up to 5 chunks
    minScore: 0.3f,             // minimum cosine similarity threshold
    forceUniqueSection: true    // at most one result per section
);

Lowering minScore returns more results (higher recall, lower precision). Raising it returns fewer, more relevant results.
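As a sketch of the trade-off (both thresholds are illustrative and corpus-dependent):

// High precision: accept only strong matches
var strict = rag.FindMatchingPartitions(query, topK: 3, minScore: 0.5f);

// High recall: cast a wider net, e.g. ahead of a reranker
var broad = rag.FindMatchingPartitions(query, topK: 10, minScore: 0.15f);
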

Adding a Reranker

A reranker re-scores retrieved chunks using a cross-encoder, improving ranking quality at a small latency cost:

rag.Reranker = new RagEngine.RagReranker(embeddingModel, rerankedAlpha: 0.7f);
// rerankedAlpha: 0.0 = only original score, 1.0 = only reranker score

Persistence and Incremental Updates

The DataSource.CreateFileDataSource approach persists embeddings to disk. On subsequent runs, DataSource.LoadFromFile reloads the existing index without re-embedding any content.

To add new documents later:

if (!dataSource.HasSection("new-document"))
{
    string content = File.ReadAllText("docs/new-document.txt");
    rag.ImportText(content, "KnowledgeBase", "new-document");
}
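
The same guard scales to an entire folder. A sketch that picks up any new .txt files dropped into docs/ between runs (the folder layout carries over from Step 3):

// Index only files whose section is not already in the store
foreach (string path in Directory.GetFiles("docs", "*.txt"))
{
    string section = Path.GetFileNameWithoutExtension(path);
    if (!dataSource.HasSection(section))
        rag.ImportText(File.ReadAllText(path), "KnowledgeBase", section);
}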

Scaling Up: PDF and Markdown Documents

For PDF files, use DocumentRag instead of RagEngine for built-in document parsing:

var docRag = new DocumentRag(embeddingModel);

var attachment = new Attachment("report.pdf");
var metadata = new DocumentMetadata(attachment, id: "q4-report");
await docRag.ImportDocumentAsync(attachment, metadata, "Reports");

var matches = docRag.FindMatchingPartitions("quarterly revenue", topK: 5);

For the highest-level PDF Q&A experience (with chat history and source references), use PdfChat:

var pdfChat = new PdfChat(chatModel, embeddingModel);
await pdfChat.LoadDocumentAsync("report.pdf");
var response = await pdfChat.SubmitAsync("What were the key findings?");
Console.WriteLine(response.Response.Completion);

Custom Prompt Templates

Override how retrieved context is injected into the prompt:

string customTemplate = @"Use the following reference material to answer the user's question.
If the material does not contain the answer, state that clearly.

## Reference Material:
@context

## User Question:
@question";

var result = rag.QueryPartitions(query, customTemplate, matches, chat);

The placeholders @context and @question are replaced automatically.


Common Issues

Problem                            Cause                                        Fix
Low similarity scores              Embedding model not suited to your domain    Try nomic-embed-text or increase chunk overlap
Answers ignore retrieved context   System prompt too weak                       Strengthen the instruction: "Answer ONLY from the provided context"
Index file grows large             Many large documents                         Use MarkdownChunking for structured docs, or reduce MaxChunkSize
Slow indexing                      Large corpus on CPU                          Use GPU-accelerated embedding, or batch-index offline
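
If your sources are Markdown, the MarkdownChunking mentioned above plugs in the same way TextChunking does. A minimal sketch, assuming a parameterless constructor; check the API reference for its actual options:

// Split on Markdown structure instead of raw token windows
// (default construction is an assumption, not verified here)
rag.DefaultIChunking = new MarkdownChunking();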

Next Steps