Class TextRegion

Namespace
LMKit.Document.Layout
Assembly
LM-Kit.NET.dll

A region of text in a document: its character-span and (optional) 2D layout bounds.

public class TextRegion
Inheritance
TextRegion

Examples

Example: Inspect search results as TextRegion instances.

using System;
using System.Collections.Generic;
using LMKit.Document.Layout;
using LMKit.Document.Pdf;
using LMKit.Document.Search;

// Load the PDF and get the layout of its first page.
PdfInfo info = PdfInfo.Load("document.pdf");
PageElement page = info.Pages[0].GetLayout();

// Search the page layout for a term.
var engine = new LayoutSearchEngine();
List<TextMatch> matches = engine.FindText(page, "important");

// Inspect each match through its TextRegion members.
foreach (TextRegion region in matches)
{
    Console.WriteLine($"Span : [{region.StartIndex}, {region.EndIndex})");
    Console.WriteLine($"Page : {region.PageIndex}");
    Console.WriteLine($"Bounds : {region.Bounds}");
    Console.WriteLine($"Elements : {region.Elements.Count} contributing words");
}

Properties

Bounds

Optional 2D bounds of this region (e.g., page coordinates from OCR/PDF). May be null.

Elements

The list of TextElement nodes that contributed to this region.

EndIndex

Zero-based index immediately after the last character of the region (exclusive). -1 if the position could not be determined.

PageIndex

Zero-based page index for paged inputs. -1 if the page index is unknown.

StartIndex

Zero-based index of the first character of the region in the original content. -1 if the position could not be determined.
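
StartIndex and EndIndex together describe a half-open span, [StartIndex, EndIndex), and either may be -1. The sketch below shows one way to turn such a span into the covered text. RegionSpans and TrySlice are hypothetical names introduced for illustration and are not part of LM-Kit. The sketch also assumes the caller supplies the same string that the indexes were computed against.

```csharp
using System;

// Hypothetical helper, not part of LM-Kit.
static class RegionSpans
{
    // Extracts the text covered by the half-open span [startIndex, endIndex).
    // Returns false when the span is undetermined or does not fit the content.
    public static bool TrySlice(string content, int startIndex, int endIndex, out string text)
    {
        text = string.Empty;

        // -1 signals that the position could not be determined.
        if (content == null || startIndex < 0 || endIndex < 0)
            return false;

        // Reject spans that are reversed or run past the supplied content.
        if (startIndex > endIndex || endIndex > content.Length)
            return false;

        // The end index is exclusive, so the length is end minus start.
        text = content.Substring(startIndex, endIndex - startIndex);
        return true;
    }
}
```

With a region in hand, a call would look like RegionSpans.TrySlice(content, region.StartIndex, region.EndIndex, out string text). Because the end index is exclusive, the span length needs no off-by-one adjustment, and an empty span (StartIndex equal to EndIndex) yields an empty string rather than an error.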

Methods

ToString()