Property CustomStopWords
CustomStopWords
Gets or sets custom stopwords to filter during tokenization in addition to the language-specific stopwords selected by Language.
public IEnumerable<string> CustomStopWords { get; set; }
Property Value
- IEnumerable<string>
A collection of lowercase words to exclude from indexing and queries, or
null to use only the language stopwords. Default is null.
Examples
// Add domain-specific stopwords to filter out common boilerplate terms.
var bm25 = new Bm25RetrievalStrategy
{
    CustomStopWords = new[] { "acme", "corp", "inc" }
};
ragEngine.RetrievalStrategy = bm25;
Remarks
Use this property to add domain-specific terms that should be excluded from BM25 scoring (e.g., project names, boilerplate identifiers). The custom words are merged with the language-specific stopwords; they do not replace them.
Assigning this property invalidates the cached index, forcing a full rebuild on the next query. A defensive copy of the assigned collection is stored internally; subsequent modifications to the original collection have no effect.
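The merge, defensive-copy, and invalidation behavior described above can be pictured with a small stand-alone sketch. It is not the library's implementation: the class StopWordSettingsSketch, the IndexIsStale flag, the EffectiveStopWords method, and the three-word language list are all hypothetical names and data introduced purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the strategy's internals; NOT the library's actual code.
public sealed class StopWordSettingsSketch
{
    // Assumed placeholder for the language-specific list selected by Language.
    private static readonly string[] LanguageStopWords = { "the", "a", "of" };

    // Private snapshot of whatever collection the caller assigned (may be null).
    private string[] _customStopWords;

    // Assumed flag standing in for "the cached index must be rebuilt".
    public bool IndexIsStale { get; private set; }

    public IEnumerable<string> CustomStopWords
    {
        get => _customStopWords;
        set
        {
            // Defensive copy: later changes to the caller's collection are not observed.
            _customStopWords = value?.ToArray();
            // Any assignment marks the index for a rebuild on the next query.
            IndexIsStale = true;
        }
    }

    // Effective filter is the union of language and custom stopwords, not a replacement.
    public ISet<string> EffectiveStopWords()
    {
        var merged = new HashSet<string>(LanguageStopWords, StringComparer.Ordinal);
        if (_customStopWords != null)
        {
            merged.UnionWith(_customStopWords);
        }
        return merged;
    }
}
```

In this sketch, assigning a List<string> and then appending to that list leaves the effective set unchanged, and assigning null falls back to the language list alone. Under the same assumptions, the practical consequence is that changing the custom words after construction requires reassigning the property, which in turn triggers the index rebuild.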