Method RunAsync
- Namespace: LMKit.Extraction.Ocr
- Assembly: LM-Kit.NET.dll
RunAsync(OcrParameters, CancellationToken)
Executes the OCR process using the provided parameters. Concrete subclasses must override this method to implement specific OCR logic (e.g., calling a third‐party OCR library).
public abstract Task<OcrResult> RunAsync(OcrParameters ocrParameters, CancellationToken cancellationToken = default)
Parameters
- ocrParameters (OcrParameters)
An OcrParameters instance that encapsulates the image buffer, any associated attachment metadata, and any additional configuration options.
- cancellationToken (CancellationToken)
A CancellationToken that can be used to cancel the OCR operation at any time.
Returns
- Task<OcrResult>
A Task<TResult> that, when completed, provides an OcrResult containing the extracted text, layout information, and any other data produced by the OCR engine.
Examples
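public class TesseractOcrEngine : OcrEngine
{
    public override async Task<OcrResult> RunAsync(
        OcrParameters ocrParameters,
        CancellationToken cancellationToken = default)
    {
        // Tesseract.RecognizeAsync stands in for a third-party OCR call; replace it with your library's real API.
        string text = await Tesseract.RecognizeAsync(ocrParameters.ImageData, cancellationToken);

        // Wrap the recognized text in the result type expected by callers.
        return new OcrResult(text);
    }
}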
Exceptions
- OperationCanceledException
Thrown if the operation is canceled via the provided cancellationToken.
- Exception
Concrete implementations may throw other exceptions to indicate failures in the underlying OCR processing (e.g., I/O errors, service faults, invalid image format). Document these exception types in the subclass.
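The cancellation behavior described above can be exercised from the calling side. The following is a minimal sketch, not part of the library: OcrRunner and RunWithTimeoutAsync are hypothetical names, and the sketch assumes only the RunAsync(OcrParameters, CancellationToken) signature documented on this page.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;
using LMKit.Extraction.Ocr;

// Hypothetical helper (not part of LM-Kit): runs OCR under a time limit.
public static class OcrRunner
{
    // Returns the OCR result, or null if the operation was canceled
    // (for example because the timeout elapsed).
    public static async Task<OcrResult?> RunWithTimeoutAsync(
        OcrEngine engine,
        OcrParameters parameters,
        TimeSpan timeout)
    {
        // The token source requests cancellation once the timeout elapses.
        using var cts = new CancellationTokenSource(timeout);

        try
        {
            return await engine.RunAsync(parameters, cts.Token);
        }
        catch (OperationCanceledException)
        {
            // Cancellation is reported as "no result" rather than as a failure.
            return null;
        }
    }
}
```

Other exceptions are deliberately left uncaught here, since their types depend on the concrete engine; callers can handle them according to what the subclass documents.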
RunAsync(Attachment, int, CancellationToken)
Runs OCR on a specific page of the given attachment. This convenience overload handles image extraction from the attachment internally, then delegates to RunOcrAsync(ImageBuffer, Attachment, int, CancellationToken).
public Task<OcrResult> RunAsync(Attachment attachment, int pageIndex, CancellationToken cancellationToken = default)
Parameters
- attachment (Attachment)
The Attachment containing the document or image to process (e.g., a PDF, TIFF, or single-page image file).
- pageIndex (int)
The zero-based index of the page within the attachment to run OCR on. For single-page images, use 0.
- cancellationToken (CancellationToken)
A CancellationToken that can be used to cancel the OCR operation.
Returns
- Task<OcrResult>
A Task<TResult> that, when completed, provides an OcrResult containing the extracted text and layout information for the specified page.
Examples
Example: Run OCR on the first page of a PDF attachment
var attachment = new Attachment("invoice.pdf");
var engine = new MyOcrEngine();
OcrResult result = await engine.RunAsync(attachment, pageIndex: 0);
Console.WriteLine(result.PageText);
Exceptions
- OperationCanceledException
Thrown if the operation is canceled via cancellationToken or by an OcrStarting event subscriber.
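Because pageIndex is zero-based and each call processes a single page, multi-page documents need one call per page. The sketch below illustrates this under stated assumptions: PageOcr and ReadPagesAsync are hypothetical names, the page count is supplied by the caller because this page does not document how to query it, and the LMKit.Data namespace for Attachment is an assumption. PageText is the property used in the example above.

```csharp
using System.Text;
using System.Threading;
using System.Threading.Tasks;
using LMKit.Data;             // assumption: namespace that declares Attachment
using LMKit.Extraction.Ocr;

// Hypothetical helper (not part of LM-Kit): OCRs a range of pages in order.
public static class PageOcr
{
    // Runs OCR on pages 0..pageCount-1 and concatenates their text,
    // one page per line block. pageCount is provided by the caller.
    public static async Task<string> ReadPagesAsync(
        OcrEngine engine,
        Attachment attachment,
        int pageCount,
        CancellationToken cancellationToken = default)
    {
        var text = new StringBuilder();

        for (int pageIndex = 0; pageIndex < pageCount; pageIndex++)
        {
            // The same token is passed to every call, so canceling it stops
            // the loop at the current page with OperationCanceledException.
            OcrResult result = await engine.RunAsync(attachment, pageIndex, cancellationToken);
            text.AppendLine(result.PageText);
        }

        return text.ToString();
    }
}
```

Pages are processed sequentially to keep the sketch simple; whether a given engine supports concurrent calls depends on the subclass.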