Class TextractOcr

Namespace: LMKit.Integrations.AWS.Ocr.Textract

Assembly: LM-Kit.NET.dll

An OCR engine implementation that leverages AWS Textract to extract text from image data.

public sealed class TextractOcr : OcrEngine

Inheritance: object → OcrEngine → TextractOcr

Inherited Members: OcrEngine.OcrStarting, OcrEngine.OcrCompleted, object.Equals(object), object.Equals(object, object), object.GetHashCode(), object.GetType(), object.ReferenceEquals(object, object), object.ToString()

Remarks

This class extends OcrEngine and implements the necessary AWS Signature Version 4 signing process to call the Textract “DetectDocumentText” API. It converts input image bytes into the required JSON payload, signs the request, and parses out any recognized LINE‐level text blocks returned by Textract. The resulting text is returned inside an OcrResult instance.
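The steps described above (building the JSON payload, deriving a Signature Version 4 signing key, and filtering LINE blocks from the response) can be sketched roughly as follows. This is a minimal illustration using only the .NET base class library (.NET 5 or later), not the code inside LM-Kit.NET. The class name TextractRequestSketch and all of its members are hypothetical. A complete SigV4 implementation also has to build the canonical request and the string to sign, which are omitted here.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;

// Illustrative sketch only: TextractRequestSketch and its members are
// hypothetical names and are not part of LM-Kit.NET.
public static class TextractRequestSketch
{
    // Wraps raw image bytes in the JSON body expected by DetectDocumentText:
    // {"Document":{"Bytes":"<base64>"}}. Base64 output needs no JSON escaping.
    public static string BuildPayload(byte[] imageBytes)
    {
        string base64 = Convert.ToBase64String(imageBytes);
        return "{\"Document\":{\"Bytes\":\"" + base64 + "\"}}";
    }

    // HMAC-SHA256 of a UTF-8 string under the given key.
    private static byte[] Hmac(byte[] key, string data)
    {
        using var hmac = new HMACSHA256(key);
        return hmac.ComputeHash(Encoding.UTF8.GetBytes(data));
    }

    // Derives the SigV4 signing key by chaining HMACs over date, region,
    // service, and the literal "aws4_request".
    // dateStamp is the UTC date formatted as yyyyMMdd; region is e.g. "us-east-1".
    public static byte[] DeriveSigningKey(
        string secretAccessKey, string dateStamp, string region, string service = "textract")
    {
        byte[] kDate = Hmac(Encoding.UTF8.GetBytes("AWS4" + secretAccessKey), dateStamp);
        byte[] kRegion = Hmac(kDate, region);
        byte[] kService = Hmac(kRegion, service);
        return Hmac(kService, "aws4_request");
    }

    // Signs an already-built SigV4 "string to sign" and returns lowercase hex.
    public static string Sign(byte[] signingKey, string stringToSign)
    {
        return Convert.ToHexString(Hmac(signingKey, stringToSign)).ToLowerInvariant();
    }

    // Collects the Text of every block whose BlockType is "LINE".
    public static List<string> ExtractLines(string responseJson)
    {
        var lines = new List<string>();
        using JsonDocument doc = JsonDocument.Parse(responseJson);
        if (!doc.RootElement.TryGetProperty("Blocks", out JsonElement blocks)
            || blocks.ValueKind != JsonValueKind.Array)
        {
            return lines;
        }

        foreach (JsonElement block in blocks.EnumerateArray())
        {
            if (block.TryGetProperty("BlockType", out JsonElement type)
                && type.GetString() == "LINE"
                && block.TryGetProperty("Text", out JsonElement text))
            {
                lines.Add(text.GetString() ?? string.Empty);
            }
        }
        return lines;
    }
}
```

In this sketch, the hex string returned by Sign would go into the Signature component of the Authorization header, and the request would be posted with the X-Amz-Target header set to Textract.DetectDocumentText. TextractOcr handles all of this internally, so callers do not need to write any of it themselves.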

To use this class, supply valid AWS credentials (an access key ID and a secret access key) and an AWSRegion. You can then assign a TextractOcr instance to the OcrEngine property of a TextExtraction object, or invoke it directly with RunAsync(OcrParameters, CancellationToken).

Constructors

TextractOcr(string, string, AWSRegion): Initializes a new instance of the TextractOcr class with the specified AWS credentials and region.

Properties

Timeout: Gets or sets the timeout duration for HTTP requests to the AWS Textract service.

Methods

RunAsync(OcrParameters, CancellationToken): Executes the OCR process using the provided parameters. This override of the OcrEngine method sends the image to the Textract “DetectDocumentText” API and returns the recognized text in an OcrResult.
