Class PageElement

Namespace
LMKit.Extraction.Layout
Assembly
LM-Kit.NET.dll

Represents the content of a single page, including its textual elements and a plain text aggregation. Typically used for layout-aware extraction results from documents such as PDFs or OCR-processed images.

public class PageElement
Inheritance
PageElement

Constructors

PageElement(IEnumerable<TextElement>)

Initializes a new instance of the PageElement class with structured text elements only.

PageElement(IEnumerable<TextElement>, double, double, int, double)

Initializes a new instance of the PageElement class with structured text elements, page dimensions, rotation, and skew information.

PageElement(string)

Initializes a new instance of the PageElement class with plain unstructured text only. This constructor should be used when no layout or bounding box information is available.
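
As a minimal sketch of the plain-text constructor, the snippet below builds a page from a string and reads it back through the Text property. It assumes the LM-Kit.NET package is referenced by the project; the sample text is invented for illustration.

```csharp
using System;
using LMKit.Extraction.Layout;

// Build a page from plain text only; no layout or bounding box data is supplied.
// The sample content is hypothetical.
var page = new PageElement("Invoice 2024-001\nTotal due: 120.00 EUR");

// Text exposes the page content as a single aggregated string.
Console.WriteLine(page.Text);
```

When bounding boxes are available, the overloads taking IEnumerable<TextElement> are likely the better fit, since this constructor carries no layout information.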

Properties

Height

Gets the height of the page in the original document, in points.

Rotation

Gets the detected rotation of the page in degrees clockwise (e.g., 0, 90, 180, or 270).

Skew

Gets the detected skew angle of the page in degrees clockwise.

Text

Gets the full textual content of the page as a single aggregated string.

TextElements

Gets the collection of TextElement instances found on the page. Each text element may optionally include bounding box coordinates describing its layout.

Width

Gets the width of the page in the original document, in points.

Methods

Clone()

Creates a deep copy of this PageElement, including cloned text elements and preserving current layout settings (size, rotation, skew, and formatting flag).

FromJson(string)

Deserializes the specified JSON string into a PageElement instance.

ToJson()

Serializes this PageElement into a JSON-formatted string.
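
The following sketch shows how ToJson() and FromJson(string) might be combined for a round trip, for example to cache an extraction result. It assumes that FromJson is a static method on PageElement and that the LM-Kit.NET package is referenced; neither detail is stated on this page, and the sample text is invented.

```csharp
using System;
using LMKit.Extraction.Layout;

// Hypothetical sample page built from plain text.
var original = new PageElement("First line\nSecond line");

// Serialize the page to a JSON-formatted string.
string json = original.ToJson();

// Assumption: FromJson is a static factory method on PageElement.
PageElement restored = PageElement.FromJson(json);

// Print the restored text so it can be compared with the original by eye.
Console.WriteLine(restored.Text);
```

For an in-memory copy that does not need to be stored or transmitted, Clone() avoids the serialization step.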