Property MaxChunkSize
MaxChunkSize
Gets or sets the maximum size in characters for text chunks during document import.
public int MaxChunkSize { get; set; }
Property Value
- int
The maximum chunk size in characters. The default is LMKit.Retrieval.TextChunking.DEFAULT_MAX_CHUNK_SIZE.
Remarks
Text extracted from each document page is split into chunks for embedding generation. Smaller chunks provide more granular retrieval but may lose context; larger chunks preserve more context but may reduce retrieval precision.
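The optimal chunk size depends on your embedding model's context window and your retrieval requirements. Each chunk must fit within the model's input limit, which is typically measured in tokens, whereas this property counts characters. As a starting point, most embedding models work well with chunks of 256–512 characters.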
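The effect of a character limit on chunk count can be illustrated with a minimal, self-contained sketch. The `SplitIntoChunks` helper below is hypothetical and is not part of LM-Kit; it assumes a simple strategy (cut at the last space inside the window, otherwise cut hard at the limit) and is not a description of how the library chunks text internally.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkingSketch
{
    // Hypothetical helper (not an LM-Kit API): splits text into chunks of at most
    // maxChunkSize characters, preferring to break at the last space in the window.
    public static List<string> SplitIntoChunks(string text, int maxChunkSize)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (maxChunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(maxChunkSize));

        var chunks = new List<string>();
        int start = 0;

        while (start < text.Length)
        {
            int length = Math.Min(maxChunkSize, text.Length - start);
            int end = start + length;

            // Unless this window reaches the end of the text, back up to the
            // last space inside it so that words are not cut in half.
            if (end < text.Length)
            {
                int lastSpace = text.LastIndexOf(' ', end - 1, length);
                if (lastSpace > start) end = lastSpace;
            }

            // Trim the separator left at the chunk boundary; skip empty chunks.
            string chunk = text.Substring(start, end - start).Trim();
            if (chunk.Length > 0) chunks.Add(chunk);

            start = end;
        }

        return chunks;
    }

    public static void Main()
    {
        string page = "Text extracted from each document page is split into " +
                      "chunks for embedding generation.";

        // A smaller limit yields more, finer-grained chunks from the same text.
        foreach (int size in new[] { 20, 40 })
        {
            List<string> chunks = SplitIntoChunks(page, size);
            Console.WriteLine($"max {size} chars -> {chunks.Count} chunks");
        }
    }
}
```

Running a sketch like this on a representative page of your own documents, with a few candidate limits, gives a rough feel for how many chunks (and therefore embeddings) a given MaxChunkSize value would produce.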