Property ProcessingMode
ProcessingMode
Gets or sets the page processing mode that determines how document pages are analyzed and text is extracted.
public PageProcessingMode ProcessingMode { get; set; }
Property Value
- PageProcessingMode
The processing mode applied to each page during ingestion. The default is Auto.
Examples
// Load the embedding model and create the DocumentRag instance
LM embeddingModel = LM.LoadFromModelID("embeddinggemma-300m");
var docRag = new DocumentRag(embeddingModel);
// Force vision-based processing for all pages:
// load a vision model and assign it as the VisionParser
LM visionModel = LM.LoadFromModelID("gemma3:4b");
docRag.VisionParser = new VlmOcr(visionModel);
docRag.ProcessingMode = PageProcessingMode.DocumentUnderstanding;
Remarks
The processing mode affects both the quality and performance of document ingestion:
- Auto: Automatically selects a strategy for each page. Uses DocumentUnderstanding for image-heavy pages when VisionParser is configured; otherwise, it falls back to TextExtraction.
- TextExtraction: Extracts text directly from the document structure. Uses OcrEngine for pages that require OCR (e.g., scanned images).
- DocumentUnderstanding: Uses the configured VisionParser to analyze page images and extract structured content as markdown. Provides better results than TextExtraction for complex layouts, tables, and diagrams.
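The per-page selection described for Auto can be pictured with a small, self-contained sketch. It is only an illustration of the rules listed above, not the library's implementation: the PageMode enum, the ProcessingModeSketch class, and the isImageHeavy and hasVisionParser inputs are hypothetical stand-ins, and how a page is actually classified as image-heavy is not specified here. The sketch also does not model what happens when DocumentUnderstanding is requested without a VisionParser.

```csharp
using System;

// Hypothetical stand-in for PageProcessingMode (not the library's type).
public enum PageMode { Auto, TextExtraction, DocumentUnderstanding }

public static class ProcessingModeSketch
{
    // Returns the strategy that would be applied to a single page,
    // following the rules in the Remarks list.
    // isImageHeavy and hasVisionParser are assumed inputs for illustration.
    public static PageMode Resolve(PageMode requested, bool isImageHeavy, bool hasVisionParser)
    {
        // An explicitly requested mode is used as-is for every page.
        if (requested != PageMode.Auto)
            return requested;

        // Auto: vision-based analysis only for image-heavy pages,
        // and only when a vision parser is available.
        return isImageHeavy && hasVisionParser
            ? PageMode.DocumentUnderstanding
            : PageMode.TextExtraction;
    }
}
```

Under these assumptions, Resolve(PageMode.Auto, isImageHeavy: true, hasVisionParser: false) yields TextExtraction, which mirrors the fallback described for Auto when no VisionParser is configured.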