Property MaxOverlapSize

Namespace
LMKit.Retrieval
Assembly
LM-Kit.NET.dll

MaxOverlapSize

Gets or sets the maximum number of tokens duplicated (overlapped) between consecutive text chunks. The overlap preserves context at chunk boundaries, so that text spanning two chunks stays continuous. This matters most for cohesive text analysis and generation.

public int MaxOverlapSize { get; set; }

Property Value

int

The default value is 50.

Remarks

A "token" is the smallest unit of text that the text-processing pipeline identifies, such as a word, part of a word, or a punctuation mark.
Tokens are enumerated in the Vocabs property of the Vocabulary object, which provides a comprehensive index of recognizable text elements.
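
Example

The following is a minimal, self-contained sketch of what an overlap between consecutive chunks means. It is not LM-Kit's implementation and does not call the LM-Kit API: the names OverlapSketch, Chunk, and maxChunkSize are hypothetical, and each whitespace-separated word is treated as one token, which a real tokenizer would not do.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: NOT LM-Kit's implementation. Each whitespace-separated
// word stands in for one "token" (an assumption made for readability).
public static class OverlapSketch
{
    // Splits tokens into chunks of at most maxChunkSize tokens. Every chunk after
    // the first starts with the last maxOverlapSize tokens of the previous chunk.
    public static List<string[]> Chunk(string[] tokens, int maxChunkSize, int maxOverlapSize)
    {
        if (maxChunkSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(maxChunkSize));
        if (maxOverlapSize < 0 || maxOverlapSize >= maxChunkSize)
            throw new ArgumentOutOfRangeException(nameof(maxOverlapSize));

        var chunks = new List<string[]>();
        int step = maxChunkSize - maxOverlapSize; // tokens that are new in each chunk

        for (int start = 0; start < tokens.Length; start += step)
        {
            int length = Math.Min(maxChunkSize, tokens.Length - start);
            var chunk = new string[length];
            Array.Copy(tokens, start, chunk, 0, length);
            chunks.Add(chunk);

            // Stop once the current chunk has reached the end of the input.
            if (start + length >= tokens.Length) break;
        }
        return chunks;
    }

    public static void Main()
    {
        string[] tokens = "a b c d e f g h i j".Split(' ');
        foreach (var chunk in Chunk(tokens, maxChunkSize: 4, maxOverlapSize: 1))
            Console.WriteLine(string.Join(" ", chunk));
        // Prints:
        // a b c d
        // d e f g
        // g h i j
    }
}
```

With an overlap of 1, the last token of each chunk ("d", then "g") is repeated at the start of the next one. A larger overlap carries more context across a boundary, at the cost of more chunks and more duplicated tokens.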