Class LMKitOcr

Namespace
LMKit.Extraction.Ocr
Assembly
LM-Kit.NET.dll

Provides high-throughput OCR functionality optimized for business documents, with advanced page layout analysis, automatic language and orientation detection, and automatic model download support. Implements IDisposable to release native OCR resources.

public class LMKitOcr : OcrEngine, IDisposable
Inheritance
OcrEngine
LMKitOcr
Implements
IDisposable

Examples

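// myVisionModel, ocrParameters, and cancellationToken are assumed to be defined by the caller.
using var ocr = new LMKitOcr();

// Optional: enable automatic language detection (requires a vision-capable model)
ocr.VisionModel = myVisionModel;
ocr.EnableLanguageDetection = true;

// Run OCR; disposing ocr at the end of the scope releases its native resources.
var result = await ocr.RunAsync(ocrParameters, cancellationToken);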

Remarks

LM-Kit OCR is engineered for speed, accuracy, and complex page layout handling. It is tuned for business documents such as invoices, contracts, reports, and forms, and sustains high throughput in batch processing scenarios.

Key capabilities:

  • High-throughput processing optimized for large-scale document workflows
  • Very high accuracy on business documents (invoices, contracts, reports, forms)
  • Complex page layout handling with intelligent reading order reconstruction
  • Automatic language detection (requires a vision-capable language model)
  • Automatic page orientation detection and correction
  • Automatic deskewing of scanned documents
  • On-demand downloading of OCR dictionaries from Hugging Face

Constructors

LMKitOcr()

Initializes a new instance of the LMKitOcr class using the default model storage directory.

LMKitOcr(string)

Initializes a new instance of the LMKitOcr class with the specified OCR resource path.

Properties

DefaultLanguage

Gets or sets the default ISO 639-2/T language code used when a specific language model is not available or language detection is disabled.

EnableAutoDeskew

Gets or sets a value indicating whether automatic deskewing is applied to the input image before OCR.

EnableDespeckle

Gets or sets a value indicating whether speckle noise removal is applied to the binarized image before OCR recognition.

EnableLanguageDetection

Gets or sets a value indicating whether automatic language detection is performed before OCR.

EnableModelDownload

Gets or sets a value indicating whether missing OCR dictionaries should be automatically downloaded.

EnableOrientationDetection

Gets or sets a value indicating whether automatic orientation detection is performed before OCR.

EnableSmartBinarization

Gets or sets a value indicating whether adaptive (smart) binarization is used to convert the input image to a binary (black-and-white) representation.

VisionModel

Gets or sets the vision-capable language model used for automatic language detection.
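
As an illustration of how the properties above might be combined, here is a minimal configuration sketch. The property names are taken from this page. The chosen values are illustrative only, and the sketch assumes that DefaultLanguage is a string and that each Enable* property is a bool, which this page does not state explicitly.

```csharp
using LMKit.Extraction.Ocr;

// Sketch only: values are illustrative, not recommended defaults.
using var ocr = new LMKitOcr
{
    DefaultLanguage = "eng",            // ISO 639-2/T code for English (assumes a string property)
    EnableAutoDeskew = true,            // straighten skewed scans before recognition
    EnableDespeckle = true,             // remove speckle noise from the binarized image
    EnableSmartBinarization = true,     // adaptive black-and-white conversion
    EnableOrientationDetection = true,  // detect page orientation before OCR
    EnableModelDownload = true,         // fetch missing OCR dictionaries automatically
    EnableLanguageDetection = false     // left off here because no VisionModel is assigned
};
```

With language detection disabled, as in this sketch, the description of DefaultLanguage indicates that its value is the language used for recognition.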

Methods

ClearCache()

Clears all cached OCR engines and releases their associated resources.

Dispose()

Releases all resources used by this LMKitOcr instance.

RunAsync(OcrParameters, CancellationToken)

Runs OCR on the provided image data asynchronously.
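
Because RunAsync accepts a CancellationToken, a caller can bound how long a single OCR run may take. The following is a rough sketch of that idea. OcrTimeoutSketch and TryRunAsync are hypothetical names introduced for illustration and are not part of LM-Kit. It also assumes that cancellation surfaces as an OperationCanceledException, which is the usual .NET convention but is not confirmed on this page.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using LMKit.Extraction.Ocr;

public static class OcrTimeoutSketch
{
    // Hypothetical helper (not part of LM-Kit): runs OCR with a time limit.
    // Returns true if the run completed, false if the time limit was reached.
    public static async Task<bool> TryRunAsync(
        LMKitOcr ocr, OcrParameters parameters, TimeSpan timeout)
    {
        // The token source cancels itself once the timeout elapses.
        using var cts = new CancellationTokenSource(timeout);
        try
        {
            var result = await ocr.RunAsync(parameters, cts.Token);
            // Inspect result here; its members are not listed on this page.
            return true;
        }
        catch (OperationCanceledException)
        {
            // Assumption: a cancelled run throws OperationCanceledException.
            return false;
        }
    }
}
```

A caller would build the OcrParameters instance elsewhere and pass it in, for example `await OcrTimeoutSketch.TryRunAsync(ocr, parameters, TimeSpan.FromSeconds(30))`.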

Events

LanguageDetected

Occurs when a language is detected during OCR processing.

OrientationDetected

Occurs when page orientation is detected during OCR processing.
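
The two events can be observed by attaching handlers before calling RunAsync. The sketch below assumes that both events follow the conventional (sender, e) handler shape. The event argument types and their members are not listed on this page, so the handlers only log that the event fired.

```csharp
using System;
using LMKit.Extraction.Ocr;

using var ocr = new LMKitOcr
{
    EnableOrientationDetection = true
};

// Assumption: both events use the conventional (sender, e) handler shape.
// LanguageDetected is only expected to fire when language detection is enabled,
// which in turn requires a VisionModel to be assigned.
ocr.LanguageDetected += (sender, e) =>
    Console.WriteLine("Language detection reported a result.");

ocr.OrientationDetected += (sender, e) =>
    Console.WriteLine("Orientation detection reported a result.");
```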
