Property CustomStopWords

Namespace
LMKit.Retrieval.Bm25
Assembly
LM-Kit.NET.dll

CustomStopWords

Gets or sets custom stopwords to filter during tokenization in addition to the language-specific stopwords selected by Language.

public IEnumerable<string> CustomStopWords { get; set; }

Property Value

IEnumerable<string>

A collection of lowercase words to exclude from indexing and queries, or null to use only the language stopwords. Default is null.

Examples

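using LMKit.Retrieval.Bm25;

// Add domain-specific stopwords to filter out common boilerplate terms.
// Entries are lowercase, matching the documented property value.
var bm25 = new Bm25RetrievalStrategy
{
    CustomStopWords = new[] { "acme", "corp", "inc" }
};

// ragEngine is an existing RAG engine instance created elsewhere.
ragEngine.RetrievalStrategy = bm25;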

Remarks

Use this property to add domain-specific terms that should be excluded from BM25 scoring (e.g., project names, boilerplate identifiers). The custom words are merged with the language-specific stopwords; they do not replace them.

Assigning this property invalidates the cached index, forcing a full rebuild on the next query. A defensive copy of the assigned collection is stored internally; subsequent modifications to the original collection have no effect.
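
The merge and defensive-copy behavior described above can be illustrated with a small standalone sketch. It does not use LM-Kit.NET and is not the library's implementation: the language stopword set, the whitespace tokenizer, the lowercasing step, and the Filter helper are all hypothetical stand-ins chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the language-specific stopwords (assumption).
var languageStopWords = new HashSet<string> { "the", "a", "of" };

// Collection owned by the caller.
var custom = new List<string> { "acme", "corp" };

// Defensive copy: snapshot the collection at assignment time.
var snapshot = custom.ToArray();

// Merge: custom words are added to the language stopwords, not substituted for them.
var effective = new HashSet<string>(languageStopWords);
effective.UnionWith(snapshot);

// Mutating the original afterwards does not change the effective set.
custom.Add("inc");

// Hypothetical tokenizer: lowercase, split on spaces, drop stopwords (assumption).
string[] Filter(string text) =>
    text.ToLowerInvariant()
        .Split(' ', StringSplitOptions.RemoveEmptyEntries)
        .Where(token => !effective.Contains(token))
        .ToArray();

var kept = Filter("The Acme Corp Inc report of sales");
Console.WriteLine(string.Join(" ", kept)); // prints "inc report sales"
```

In this sketch "inc" survives filtering because it was added to the original list after the snapshot was taken. With the real property, the equivalent step would be to assign CustomStopWords again with the updated collection, which per the remarks above also triggers an index rebuild on the next query.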