Boost Retrieval with Hybrid Search

Pure vector search finds semantically similar passages but can miss exact keyword matches. Pure keyword search (BM25) finds exact terms but misses paraphrases and synonyms. Hybrid search runs both and merges their ranked results with Reciprocal Rank Fusion (RRF), so a single query benefits from both semantic and lexical matching.

This tutorial shows how to switch from the default VectorRetrievalStrategy to HybridRetrievalStrategy, tune BM25 parameters for your domain, adjust fusion weights, add custom stopwords, and combine hybrid retrieval with reranking.

Why This Matters

Two enterprise problems that hybrid search solves:

Exact terminology retrieval in technical domains. A developer searching for "NullReferenceException" needs the exact error string, not a passage about "unexpected null values." Vector search alone may rank the paraphrase higher. BM25 catches the exact match, and RRF fusion ensures it surfaces.
Multilingual knowledge bases with mixed vocabulary. When documents contain product names, part numbers, or domain-specific acronyms, vector embeddings may not capture these well. Hybrid search ensures that lexical matches on these identifiers are combined with semantic understanding of the surrounding context.

Prerequisites

Requirement	Minimum
.NET SDK	8.0+
RAM	16 GB recommended
VRAM	6 GB (for both models simultaneously)
Disk	~4 GB free for model downloads

Step 1: Create the Project

dotnet new console -n HybridSearchQuickstart
cd HybridSearchQuickstart
dotnet add package LM-Kit.NET

Step 2: Understand the Architecture

                          User Query
                              │
                    ┌─────────┴─────────┐
                    ▼                   ▼
            ┌──────────────┐   ┌──────────────┐
            │    Vector    │   │    BM25      │
            │   Strategy   │   │   Strategy   │
            │ (semantic)   │   │  (keyword)   │
            └──────┬───────┘   └──────┬───────┘
                   │                  │
                   ▼                  ▼
              Ranked List A     Ranked List B
                   │                  │
                   └────────┬─────────┘
                            ▼
                   ┌────────────────┐
                   │  RRF Fusion    │
                   │  (weighted)    │
                   └───────┬────────┘
                           ▼
                    Final Ranked List

Key classes:

Class	Role
`HybridRetrievalStrategy`	Orchestrates vector + BM25 retrieval and merges results with RRF
`Bm25RetrievalStrategy`	Lexical keyword search with BM25+ scoring
`VectorRetrievalStrategy`	Cosine similarity search over embeddings (default)
`RagEngine`	Orchestrates indexing, search, and LLM querying
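
The fusion stage at the bottom of the diagram can be sketched in plain C#. This is a minimal illustration, assuming the commonly used weighted RRF formula in which each document earns weight / (k + rank) from every list it appears in, with 1-based ranks. The function name FuseWithRrf and the document ids are hypothetical; how HybridRetrievalStrategy implements fusion internally is not shown in this tutorial.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative weighted Reciprocal Rank Fusion (assumption: not LM-Kit's internal code).
// Each document earns weight / (k + rank) per list it appears in; rank is 1-based.
static List<(string Id, double Score)> FuseWithRrf(
    IReadOnlyList<string> vectorRanking,
    IReadOnlyList<string> keywordRanking,
    double vectorWeight = 1.0,
    double keywordWeight = 1.0,
    int k = 60)
{
    var scores = new Dictionary<string, double>();

    void Accumulate(IReadOnlyList<string> ranking, double weight)
    {
        for (int i = 0; i < ranking.Count; i++)
        {
            // A document missing from the dictionary starts at 0.0
            scores.TryGetValue(ranking[i], out double current);
            scores[ranking[i]] = current + weight / (k + i + 1);
        }
    }

    Accumulate(vectorRanking, vectorWeight);
    Accumulate(keywordRanking, keywordWeight);

    return scores
        .OrderByDescending(pair => pair.Value)
        .Select(pair => (Id: pair.Key, Score: pair.Value))
        .ToList();
}

// Hypothetical document ids, ordered best-first by each retriever.
var semanticHits = new[] { "doc-null-errors", "doc-nre-4021", "doc-pooling" };
var lexicalHits = new[] { "doc-nre-4021", "doc-pool-size", "doc-pooling" };

var fused = FuseWithRrf(semanticHits, lexicalHits);
foreach (var (id, score) in fused)
    Console.WriteLine($"{score:F4}  {id}");
```

In this sample, doc-nre-4021 comes out on top because it sits near the top of both lists, which is the behavior the diagram describes.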

Step 3: Enable Hybrid Search

The simplest way to enable hybrid search is to set RetrievalStrategy on your RagEngine:

using System.Text;
using LMKit.Data;
using LMKit.Model;
using LMKit.Retrieval;
using LMKit.Retrieval.Bm25;
using LMKit.TextGeneration;
using LMKit.TextGeneration.Chat;

LMKit.Licensing.LicenseManager.SetLicenseKey("");

Console.InputEncoding = Encoding.UTF8;
Console.OutputEncoding = Encoding.UTF8;

// ──────────────────────────────────────
// 1. Load models
// ──────────────────────────────────────
Console.WriteLine("Loading embedding model...");
using LM embeddingModel = LM.LoadFromModelID("embeddinggemma-300m", // or "harrier-oss:0.6b"
    downloadingProgress: (_, len, read) =>
    {
        if (len.HasValue) Console.Write($"\r  Downloading: {(double)read / len.Value * 100:F1}%   ");
        return true;
    },
    loadingProgress: p => { Console.Write($"\r  Loading: {p * 100:F0}%   "); return true; });
Console.WriteLine(" Done.\n");

Console.WriteLine("Loading chat model...");
using LM chatModel = LM.LoadFromModelID("gemma4:e4b",
    downloadingProgress: (_, len, read) =>
    {
        if (len.HasValue) Console.Write($"\r  Downloading: {(double)read / len.Value * 100:F1}%   ");
        return true;
    },
    loadingProgress: p => { Console.Write($"\r  Loading: {p * 100:F0}%   "); return true; });
Console.WriteLine(" Done.\n");

// ──────────────────────────────────────
// 2. Create RAG engine with hybrid search
// ──────────────────────────────────────
var dataSource = DataSource.CreateInMemoryDataSource("KnowledgeBase", embeddingModel);
var rag = new RagEngine(embeddingModel);
rag.AddDataSource(dataSource);

// Switch from default VectorRetrievalStrategy to hybrid
rag.RetrievalStrategy = new HybridRetrievalStrategy();

// ──────────────────────────────────────
// 3. Index sample documents
// ──────────────────────────────────────
string[] docs =
{
    "Error code NRE-4021: A NullReferenceException occurs when the connection pool is exhausted under high concurrency.",
    "Connection pooling improves performance by reusing database connections instead of creating new ones for each request.",
    "When the application throws unexpected null errors, verify that all services are registered in the dependency injection container.",
    "The maximum pool size defaults to 100 connections. Increase it in the connection string with 'Max Pool Size=200'."
};

foreach (string doc in docs)
    rag.ImportText(doc, "KnowledgeBase", "troubleshooting");

// ──────────────────────────────────────
// 4. Query: hybrid search catches both semantic AND keyword matches
// ──────────────────────────────────────
string query = "NullReferenceException connection pool";
var matches = rag.FindMatchingPartitions(query, topK: 3, minScore: 0.1f);

Console.WriteLine($"Query: \"{query}\"\n");
Console.WriteLine("Hybrid search results:");
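foreach (var m in matches)
{
    // Show at most 90 characters; add an ellipsis only when the text was truncated
    string text = m.Payload.Content;
    string preview = text.Length > 90 ? text.Substring(0, 90) + "..." : text;
    Console.WriteLine($"  score={m.Similarity:F3}  {preview}");
}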

With hybrid search, the passage containing the exact error code "NRE-4021" and "NullReferenceException" surfaces strongly (BM25 keyword match), while the passage about "unexpected null errors" also ranks well (semantic match). Neither strategy alone would produce this combined ranking.

Step 4: Tune BM25 Parameters

The Bm25RetrievalStrategy exposes several parameters that control how keyword matching behaves:

var bm25 = new Bm25RetrievalStrategy
{
    K1 = 1.5f,              // Term frequency saturation (default: 1.2)
    B = 0.8f,               // Document length normalization (default: 0.75)
    Delta = 1.0f,            // BM25+ lower-bound floor (default: 1.0)
    ProximityWeight = 0.5f,  // Boost for co-occurring terms (default: 0.3)
    Language = Language.English
};

rag.RetrievalStrategy = new HybridRetrievalStrategy(
    new VectorRetrievalStrategy(),
    bm25
);

Parameter	Default	Effect
`K1`	1.2	Controls term frequency saturation. Higher values let repeated terms keep raising the score for longer before it levels off.
`B`	0.75	Length normalization. `0.0` ignores document length; `1.0` fully normalizes.
`Delta`	1.0	BM25+ lower bound added to each matching term's score, so long documents that contain the term are not over-penalized by length normalization.
`ProximityWeight`	0.3	Boosts passages where query terms appear close together.
`Language`	English	Controls stopword removal and stemming rules.

For most English-language corpora, the defaults work well. Increase ProximityWeight when phrase-level matching matters (e.g., searching for "machine learning" should prefer passages where those words appear adjacent).
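
To make K1, B, and Delta concrete, the sketch below computes the textbook BM25+ contribution of a single query term. Treat it as an illustration under stated assumptions: the function Bm25PlusTermScore is hypothetical, the IDF value is passed in rather than computed, and LM-Kit's exact scoring (including the ProximityWeight boost) may differ.

```csharp
using System;

// Textbook BM25+ contribution of one query term to one passage.
// Assumption: illustrative only, not LM-Kit's internal scoring code.
static double Bm25PlusTermScore(
    double termFrequency,   // occurrences of the term in the passage
    double passageLength,   // passage length in tokens
    double averageLength,   // mean passage length across the corpus
    double idf,             // inverse document frequency of the term
    double k1 = 1.2,
    double b = 0.75,
    double delta = 1.0)
{
    // BM25+ adds delta only for terms that actually occur in the passage
    if (termFrequency <= 0) return 0.0;

    // b = 0 ignores passage length; b = 1 normalizes fully
    double lengthNorm = 1.0 - b + b * (passageLength / averageLength);

    // Grows with term frequency but stays below k1 + 1
    double saturated = termFrequency * (k1 + 1.0) / (termFrequency + k1 * lengthNorm);

    return idf * (saturated + delta);
}

// Saturation: in an average-length passage, each extra occurrence adds less.
foreach (int tf in new[] { 1, 2, 5, 20 })
    Console.WriteLine($"tf={tf,2}  score={Bm25PlusTermScore(tf, 100, 100, idf: 1.0):F3}");
```

With an average-length passage, a single occurrence scores idf * (1 + delta), and further occurrences approach but never exceed idf * (k1 + 1 + delta).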

Step 5: Adjust Fusion Weights

By default, vector and BM25 results are weighted equally during RRF fusion. You can bias toward one strategy:

var hybrid = new HybridRetrievalStrategy
{
    VectorWeight = 1.0f,    // Weight for semantic results (default: 1.0)
    KeywordWeight = 1.5f,   // Weight for BM25 results (default: 1.0)
    RrfK = 60              // RRF smoothing constant (default: 60)
};

rag.RetrievalStrategy = hybrid;

Weight Configuration	Best For
Equal weights (1.0 / 1.0)	General-purpose, balanced starting point
Higher `KeywordWeight` (1.0 / 1.5)	Technical docs with exact identifiers, error codes, part numbers
Higher `VectorWeight` (1.5 / 1.0)	Conversational queries, questions with varied phrasing

The RrfK constant controls how quickly a result's contribution decays with its rank. Lower values (e.g., 20) widen the gap between top-ranked and lower-ranked results. Higher values (e.g., 100) flatten the curve, so appearing in both lists counts for more than the exact rank in either one.
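
As a quick check on the effect of RrfK, assume the standard RRF contribution of weight / (k + rank) with 1-based ranks and equal weights (the exact form LM-Kit uses is not stated in this tutorial). The ratio between the contributions of rank 1 and rank 10 is then:

```latex
\frac{1/(k+1)}{1/(k+10)} = \frac{k+10}{k+1}
\qquad
k=20:\ \tfrac{30}{21} \approx 1.43,
\quad
k=60:\ \tfrac{70}{61} \approx 1.15,
\quad
k=100:\ \tfrac{110}{101} \approx 1.09
```

Under that assumption, the top hit contributes about 43% more than the tenth hit at k = 20, but only about 9% more at k = 100.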

Step 6: Add Custom Stopwords

For domain-specific corpora, you can add custom stopwords to prevent common but uninformative terms from inflating BM25 scores:

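var bm25 = new Bm25RetrievalStrategy
{
    Language = Language.English,
    CustomStopWords = new[] { "system", "error", "log", "info", "debug", "warning" }
};

// Plug the customized BM25 strategy into hybrid retrieval (same constructor as in Step 4)
rag.RetrievalStrategy = new HybridRetrievalStrategy(new VectorRetrievalStrategy(), bm25);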

This is useful when your documents contain repetitive boilerplate terms (e.g., log levels in application logs) that would otherwise dominate BM25 scores.
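
The effect of a stopword list can be sketched with a toy tokenizer. This is an illustration only: the Tokenize helper, its separator set, and the sample log line are hypothetical, and LM-Kit's own text analysis (stemming, language rules) is not reproduced here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy tokenizer with stopword removal (assumption: not LM-Kit's analyzer).
static List<string> Tokenize(string text, ISet<string> stopWords)
{
    char[] separators = { ' ', ',', '.', ':', ';', '[', ']', '(', ')' };
    return text
        .ToLowerInvariant()
        .Split(separators, StringSplitOptions.RemoveEmptyEntries)
        .Where(token => !stopWords.Contains(token))
        .ToList();
}

var customStopWords = new HashSet<string> { "system", "error", "log", "info", "debug", "warning" };

// Boilerplate terms are dropped, leaving only the informative ones.
string logLine = "[ERROR] System log: connection pool exhausted";
Console.WriteLine(string.Join(" ", Tokenize(logLine, customStopWords)));
// prints "connection pool exhausted"

// A query made only of stopwords leaves nothing for keyword matching.
Console.WriteLine(Tokenize("system error log", customStopWords).Count);
// prints 0
```

The second call shows the trade-off: an aggressive stopword list can strip a query down to nothing, which is the "BM25 returns no results" case listed under Common Issues.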

Step 7: Combine with Reranking

Hybrid search and reranking are complementary. Hybrid search improves recall (finding the right passages), while reranking improves precision (ordering them correctly):

// Hybrid retrieval + reranking for maximum quality
rag.RetrievalStrategy = new HybridRetrievalStrategy();
rag.Reranker = new RagEngine.RagReranker(embeddingModel, rerankedAlpha: 0.7f);

var matches = rag.FindMatchingPartitions("NullReferenceException pool exhausted", topK: 5, minScore: 0.1f);

The pipeline becomes: hybrid retrieval (broad recall) followed by cross-encoder reranking (precise ordering).

When to Use Each Strategy

Strategy	Strengths	Weaknesses	Best For
`VectorRetrievalStrategy`	Catches synonyms, paraphrases, semantic similarity	Misses exact keywords, identifiers	Conversational Q&A, natural language queries
`Bm25RetrievalStrategy`	Exact keyword matching, fast, no embeddings needed	Misses semantic meaning, synonyms	Keyword search, known-item retrieval
`HybridRetrievalStrategy`	Combines both, broad recall	Slightly higher latency	Production systems, mixed query types

Common Issues

Problem	Cause	Fix
BM25 returns no results	All query terms are stopwords	Check the `Language` setting or remove entries from `CustomStopWords`
Hybrid results same as vector-only	BM25 scores all zero (index not built)	Ensure documents are imported before querying
Slow first query	BM25 builds inverted index lazily on first search	Expected. Subsequent queries reuse the cached index.
Wrong language stemming	`Language` does not match document language	Set `Language` to match your corpus

Next Steps

Build a RAG Pipeline Over Your Own Documents: start with the foundational RAG tutorial if you have not set up indexing yet.
Improve RAG Results with Reranking: add a cross-encoder reranker on top of hybrid retrieval.
Build Conversational RAG with RagChat: wrap hybrid search in a multi-turn conversational interface.
Improve Recall with Multi-Query and HyDE Retrieval: generate query variants to capture additional relevant passages.
Diversify and Filter RAG Results: reduce redundancy and scope retrieval with MMR and metadata filtering.
Glossary: Hybrid Search: deep dive into how vector and BM25 fusion works.
Glossary: Reciprocal Rank Fusion: understand the RRF algorithm used to merge ranked lists.
Samples: Retrieval Quality Tuning: interactive demo comparing retrieval strategies.
