Property OcrImageParallelism

Namespace
LMKit.Document.Conversion
Assembly
LM-Kit.NET.dll

OcrImageParallelism

Gets or sets the maximum number of concurrent OCR calls used when enriching text-extraction pages with their embedded raster images. Input is clamped to the [1, 12] range: values <= 1 run each image sequentially, values >= 12 cap to 12 to avoid over-subscribing the OCR engine's internal worker pool. Defaults to 4.

public int OcrImageParallelism { get; set; }

Property Value

int

Examples

var options = new DocumentToMarkdownOptions
{
    Strategy            = DocumentToMarkdownStrategy.TextExtraction,
    OcrEngine           = new LMKitOcr(),
    OcrImageParallelism = 8 // up to 8 concurrent OCR calls per page
};
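
The clamping described in the summary can be pictured with a small standalone sketch. The helper name `EffectiveParallelism` is hypothetical and is not part of LM-Kit.NET; it only mirrors the documented [1, 12] range so you can see which value would be in effect for a given input.

```csharp
using System;

public static class OcrParallelismSketch
{
    // Hypothetical helper (not a library API): reproduces the documented
    // clamp of OcrImageParallelism to the [1, 12] range.
    public static int EffectiveParallelism(int requested)
    {
        return Math.Max(1, Math.Min(12, requested));
    }
}
```

Under this sketch, a requested value of 0 yields 1 (sequential), 8 stays 8, and 64 is capped to 12.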

Remarks

Only relevant when OcrEngine is set AND the page carries multiple embedded raster images. Raise the value on machines with spare CPU cores to speed up image-heavy PDFs; lower it when the OCR engine has its own internal thread pool you want to protect from over-subscription, or when you need to bound memory usage.

Ignored when OcrEngine is null or when the page has at most one embedded image. Does not affect the full-page OCR fallback (which is inherently single-threaded per page).
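
One possible way to pick a value from the available cores is sketched here. The "half the logical cores" rule and the `SuggestFromCores` helper are assumptions made for illustration, not a recommendation from the library; measure on your own documents before settling on a number.

```csharp
using System;

public static class OcrParallelismTuning
{
    // Hypothetical heuristic (an assumption, not library guidance):
    // use half of the logical cores, kept inside the documented [1, 12] range.
    public static int SuggestFromCores(int logicalCores)
    {
        int half = logicalCores / 2;
        return Math.Max(1, Math.Min(12, half));
    }
}

// Possible usage with the options type shown in Examples:
//   options.OcrImageParallelism =
//       OcrParallelismTuning.SuggestFromCores(Environment.ProcessorCount);
```

Leaving some cores free gives headroom to an OCR engine that runs its own internal thread pool, which is the over-subscription concern noted above.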
