Property MaxOverlapSize
MaxOverlapSize
Gets or sets the maximum number of tokens duplicated (overlapped) between consecutive text chunks. The overlap preserves context at chunk boundaries and helps maintain continuity across chunks, which is especially important for cohesive text analysis and generation.
public int MaxOverlapSize { get; set; }
Property Value
- int
The default value is 50.
Remarks
A "token" is the smallest unit of text recognized during text processing, such as a word or punctuation mark.
Tokens are enumerated in the Vocabs property of the Vocabulary object, which provides a comprehensive index of recognizable text elements.
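To make the effect of the overlap concrete, the following is a minimal sketch of sliding-window chunking with overlap. It is not the library's implementation: the `OverlapSketch` class, its `Chunk` method, and the `maxChunkSize` parameter are hypothetical names introduced only for illustration, tokens are represented as plain strings, and the sketch always repeats exactly `maxOverlapSize` tokens, whereas the property only sets an upper bound.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of the documented library.
public static class OverlapSketch
{
    // Splits tokens into chunks of at most maxChunkSize tokens. Every chunk after
    // the first begins with the last maxOverlapSize tokens of the previous chunk.
    public static List<List<string>> Chunk(
        IReadOnlyList<string> tokens, int maxChunkSize, int maxOverlapSize)
    {
        if (tokens == null)
            throw new ArgumentNullException(nameof(tokens));
        if (maxChunkSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(maxChunkSize));
        // The overlap must be smaller than the chunk, or the window never advances.
        if (maxOverlapSize < 0 || maxOverlapSize >= maxChunkSize)
            throw new ArgumentOutOfRangeException(nameof(maxOverlapSize));

        var chunks = new List<List<string>>();
        int step = maxChunkSize - maxOverlapSize; // new tokens contributed by each chunk

        for (int start = 0; start < tokens.Count; start += step)
        {
            int length = Math.Min(maxChunkSize, tokens.Count - start);
            var chunk = new List<string>(length);
            for (int i = 0; i < length; i++)
                chunk.Add(tokens[start + i]);
            chunks.Add(chunk);

            // Stop once the final token is covered, so no chunk consists
            // solely of tokens already repeated from the previous one.
            if (start + length >= tokens.Count)
                break;
        }
        return chunks;
    }
}
```

Under these assumptions, calling `OverlapSketch.Chunk` on the eight tokens `a b c d e f g h` with a chunk size of 4 and an overlap of 1 yields three chunks: `a b c d`, `d e f g`, and `g h`. Each chunk repeats the final token of the one before it, which is the boundary context that the overlap is meant to preserve.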