Class OcrEngine
- Namespace
- LMKit.Extraction.Ocr
- Assembly
- LM-Kit.NET.dll
Represents an Optical Character Recognition (OCR) engine that processes image data and extracts text. Concrete implementations must override RunAsync(OcrParameters, CancellationToken) to perform the actual OCR. The class exposes events that fire just before and just after OCR executes, and a handler for the starting event can cancel the operation.
public abstract class OcrEngine
- Inheritance
-
OcrEngine
Examples
Example: Implement a custom OCR engine and subscribe to events
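using System;
using System.Threading;
using System.Threading.Tasks;
using LMKit.Extraction;      // assumed home of TextExtraction
using LMKit.Extraction.Ocr;  // OcrEngine (per the namespace listed above)

public class MyOcrEngine : OcrEngine
{
    public override async Task<OcrResult> RunAsync(
        OcrParameters ocrParameters,
        CancellationToken cancellationToken = default)
    {
        // Call your third-party OCR library here. MyOcrLibrary is a placeholder;
        // forward cancellationToken to it if your library accepts one.
        string recognizedText = await MyOcrLibrary.RecognizeAsync(ocrParameters.ImageData);
        return new OcrResult(recognizedText);
    }
}

// Usage (inside a method; 'model' is a model instance you have already loaded):
var engine = new MyOcrEngine();
engine.OcrStarting += (s, e) => Console.WriteLine($"OCR starting on page {e.PageIndex}");
engine.OcrCompleted += (s, e) => Console.WriteLine($"OCR done: {e.Result?.PageText?.Length ?? 0} chars");
var extractor = new TextExtraction(model);
extractor.OcrEngine = engine;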
Methods
- OnOcrCompleted(OcrCompletedEventArgs)
Raises the OcrCompleted event (if any subscribers exist).
- OnOcrStarting(OcrStartingEventArgs)
Raises the OcrStarting event (if any subscribers exist).
- RunAsync(Attachment, int, CancellationToken)
Runs OCR on a specific page of the given attachment. This convenience overload handles image extraction from the attachment internally, then delegates to RunOcrAsync(ImageBuffer, Attachment, int, CancellationToken).
- RunAsync(OcrParameters, CancellationToken)
Executes the OCR process using the provided parameters. Concrete subclasses must override this method to implement specific OCR logic (e.g., calling a third-party OCR library).
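When the third-party library only offers a blocking call, an override can still honor the cancellation token. The following is a minimal sketch of that idea, not LM-Kit.NET code: SketchOcrEngine, SketchParameters, SketchResult, BlockingLibraryEngine, and RecognizeBlocking are hypothetical stand-ins so the block compiles on its own.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// Stand-in types for illustration only; these are NOT the LM-Kit.NET types.
public sealed class SketchParameters
{
    public SketchParameters(byte[] imageBytes) => ImageBytes = imageBytes;
    public byte[] ImageBytes { get; }
}

public sealed class SketchResult
{
    public SketchResult(string text) => Text = text;
    public string Text { get; }
}

public abstract class SketchOcrEngine
{
    // The single method every concrete engine has to implement.
    public abstract Task<SketchResult> RunAsync(
        SketchParameters parameters,
        CancellationToken cancellationToken = default);
}

public sealed class BlockingLibraryEngine : SketchOcrEngine
{
    public override async Task<SketchResult> RunAsync(
        SketchParameters parameters,
        CancellationToken cancellationToken = default)
    {
        if (parameters is null) throw new ArgumentNullException(nameof(parameters));

        // Fail fast if the caller has already canceled.
        cancellationToken.ThrowIfCancellationRequested();

        // Offload the blocking call to the thread pool so the caller is not blocked.
        string text = await Task.Run(
            () => RecognizeBlocking(parameters.ImageBytes, cancellationToken),
            cancellationToken).ConfigureAwait(false);

        return new SketchResult(text);
    }

    // Placeholder for a synchronous third-party call (hypothetical).
    private static string RecognizeBlocking(byte[] image, CancellationToken ct)
    {
        ct.ThrowIfCancellationRequested();
        return $"{image.Length} bytes processed";
    }
}
```

The token is checked once before any work starts and again inside the blocking routine. A real library may expose its own cancellation hook, in which case forwarding the token directly is preferable to wrapping the call in Task.Run.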
Events
- OcrCompleted
Raised after OCR finishes (whether it succeeded, was canceled, or faulted). Subscribers can inspect the parameters, result, and/or any exception that occurred.
- OcrStarting
Raised just before OCR begins. Subscribers can inspect the attachment and, if they wish, set Cancel = true to abort.
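The event lifecycle described above (a starting event that can cancel, and a completed event that fires on success, cancellation, or failure) can be sketched with the standard .NET event pattern. This is an illustration under assumptions, not the library's implementation: DemoEngine, DemoStartingEventArgs, DemoCompletedEventArgs, FixedTextEngine, and RunPageAsync are hypothetical names, and the real event argument types may expose different members.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// Stand-in types for illustration only; these are NOT the LM-Kit.NET types.
public sealed class DemoStartingEventArgs : EventArgs
{
    public DemoStartingEventArgs(int pageIndex) => PageIndex = pageIndex;
    public int PageIndex { get; }
    public bool Cancel { get; set; }   // a subscriber sets this to skip the page
}

public sealed class DemoCompletedEventArgs : EventArgs
{
    public string? Text { get; set; }
    public Exception? Error { get; set; }
    public bool Canceled { get; set; }
}

public abstract class DemoEngine
{
    public event EventHandler<DemoStartingEventArgs>? Starting;
    public event EventHandler<DemoCompletedEventArgs>? Completed;

    // Null-conditional invoke is a no-op when nobody has subscribed.
    protected virtual void OnStarting(DemoStartingEventArgs e) => Starting?.Invoke(this, e);
    protected virtual void OnCompleted(DemoCompletedEventArgs e) => Completed?.Invoke(this, e);

    // Subclasses supply the actual recognition logic.
    protected abstract Task<string> RecognizeAsync(int pageIndex, CancellationToken ct);

    public async Task<string?> RunPageAsync(int pageIndex, CancellationToken ct = default)
    {
        var starting = new DemoStartingEventArgs(pageIndex);
        OnStarting(starting);

        var completed = new DemoCompletedEventArgs { Canceled = starting.Cancel };
        try
        {
            if (!starting.Cancel)
            {
                completed.Text = await RecognizeAsync(pageIndex, ct).ConfigureAwait(false);
            }
            return completed.Text;   // null when a subscriber canceled
        }
        catch (OperationCanceledException)
        {
            completed.Canceled = true;
            throw;
        }
        catch (Exception ex)
        {
            completed.Error = ex;
            throw;
        }
        finally
        {
            // Runs on success, cancellation, and failure alike.
            OnCompleted(completed);
        }
    }
}

// Trivial concrete engine so the sketch can be exercised.
public sealed class FixedTextEngine : DemoEngine
{
    protected override Task<string> RecognizeAsync(int pageIndex, CancellationToken ct)
        => Task.FromResult($"text of page {pageIndex}");
}
```

In this sketch, a Starting handler that sets Cancel to true for a given page causes RunPageAsync to return null for that page, while Completed still fires with Canceled set. Raising the completed event from a finally block is one way to guarantee subscribers are notified on every code path.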