Class TextractOcr
- Namespace: LMKit.Integrations.AWS.Ocr.Textract
- Assembly: LM-Kit.NET.dll
An OCR engine implementation that leverages AWS Textract to extract text from image data.
public sealed class TextractOcr : OcrEngine
- Inheritance: OcrEngine → TextractOcr
Remarks
This class extends OcrEngine and implements the AWS Signature Version 4 signing process needed to call the Textract DetectDocumentText API. It converts the input image bytes into the JSON payload that the API requires, signs the request, and parses the LINE-level text blocks out of the Textract response. The recognized text is returned in an OcrResult instance.
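The payload and signing steps described above can be sketched with the .NET base class library alone. This is not the code that TextractOcr uses internally; it is a minimal illustration of the publicly documented Signature Version 4 scheme applied to a DetectDocumentText request, assuming .NET 6 or later. The class name TextractSigningSketch, its method names, and its parameters are hypothetical, and the credentials are placeholder values.

```csharp
using System;
using System.Globalization;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper for illustration only; not part of LM-Kit.NET.
public static class TextractSigningSketch
{
    // Lowercase hex SHA-256 digest of a UTF-8 string.
    static string Sha256Hex(string s) =>
        Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(s))).ToLowerInvariant();

    // HMAC-SHA256 of a UTF-8 string under the given key.
    static byte[] Hmac(byte[] key, string data) =>
        HMACSHA256.HashData(key, Encoding.UTF8.GetBytes(data));

    // DetectDocumentText takes the image as base64 inside a JSON document.
    // Base64 characters need no JSON escaping, so plain concatenation is safe.
    public static string BuildPayload(byte[] imageBytes) =>
        "{\"Document\":{\"Bytes\":\"" + Convert.ToBase64String(imageBytes) + "\"}}";

    // Builds the Authorization header value for a POST to the Textract endpoint.
    // utcNow must be a UTC timestamp; the same value is sent as X-Amz-Date.
    public static string BuildAuthorizationHeader(
        string accessKeyId, string secretAccessKey, string region,
        string payload, DateTime utcNow)
    {
        const string service = "textract";
        const string signedHeaders = "content-type;host;x-amz-date;x-amz-target";
        string host = $"textract.{region}.amazonaws.com";
        string amzDate = utcNow.ToString("yyyyMMdd'T'HHmmss'Z'", CultureInfo.InvariantCulture);
        string dateStamp = utcNow.ToString("yyyyMMdd", CultureInfo.InvariantCulture);

        // Step 1: canonical request (headers lowercase, sorted, each ending in '\n').
        string canonicalHeaders =
            "content-type:application/x-amz-json-1.1\n" +
            $"host:{host}\n" +
            $"x-amz-date:{amzDate}\n" +
            "x-amz-target:Textract.DetectDocumentText\n";
        string canonicalRequest = string.Join("\n",
            "POST", "/", "", canonicalHeaders, signedHeaders, Sha256Hex(payload));

        // Step 2: string to sign, scoped to date, region, and service.
        string scope = $"{dateStamp}/{region}/{service}/aws4_request";
        string stringToSign = string.Join("\n",
            "AWS4-HMAC-SHA256", amzDate, scope, Sha256Hex(canonicalRequest));

        // Step 3: derive the signing key through the SigV4 HMAC chain.
        byte[] kDate = Hmac(Encoding.UTF8.GetBytes("AWS4" + secretAccessKey), dateStamp);
        byte[] kRegion = Hmac(kDate, region);
        byte[] kService = Hmac(kRegion, service);
        byte[] kSigning = Hmac(kService, "aws4_request");

        // Step 4: sign and assemble the header value.
        string signature = Convert.ToHexString(Hmac(kSigning, stringToSign)).ToLowerInvariant();
        return $"AWS4-HMAC-SHA256 Credential={accessKeyId}/{scope}, " +
               $"SignedHeaders={signedHeaders}, Signature={signature}";
    }

    public static void Main()
    {
        // Placeholder image bytes and credentials; substitute real values.
        byte[] imageBytes = { 0x89, 0x50, 0x4E, 0x47 };
        string payload = BuildPayload(imageBytes);
        string authorization = BuildAuthorizationHeader(
            "AKIAIOSFODNN7EXAMPLE", "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
            "us-east-1", payload, DateTime.UtcNow);

        Console.WriteLine(payload);
        Console.WriteLine(authorization);
    }
}
```

For the signature to validate, the actual HTTP request would have to carry the same Content-Type, X-Amz-Date, and X-Amz-Target values that were signed, plus the exact payload string that was hashed.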
To use this class, you must supply valid AWS credentials (an access key ID and a secret access key) and specify an AWSRegion. You can then assign the TextractOcr instance to the OcrEngine property of a TextExtraction object, or invoke it directly with RunAsync(OcrParameters, CancellationToken).
Constructors
- TextractOcr(string, string, AWSRegion)
Initializes a new instance of the TextractOcr class with the specified AWS credentials and region.
Properties
- Timeout
Gets or sets the timeout duration for HTTP requests to the AWS Textract service.
Methods
- RunAsync(OcrParameters, CancellationToken)
Executes the OCR process using the provided parameters. This is the TextractOcr override of the method that OcrEngine requires every concrete engine to implement: it sends the image to the Textract DetectDocumentText API and returns the recognized text in an OcrResult.