Class TextRegion
A region of text in a document: its character span and, optionally, its 2D layout bounds.
public class TextRegion
- Inheritance
TextRegion
Examples
Example: Inspect search results as TextRegion instances.
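using System;
using System.Collections.Generic;
using LMKit.Document.Layout;
using LMKit.Document.Pdf;
using LMKit.Document.Search;

// Load the PDF and get the layout of its first page.
PdfInfo info = PdfInfo.Load("document.pdf");
PageElement page = info.Pages[0].GetLayout();

// Search the page layout for a literal string.
var engine = new LayoutSearchEngine();
List<TextMatch> matches = engine.FindText(page, "important");

// Each match is read here through its TextRegion members.
foreach (TextRegion region in matches)
{
    // StartIndex is inclusive and EndIndex is exclusive; either is -1 when unknown.
    Console.WriteLine($"Span : [{region.StartIndex}, {region.EndIndex})");
    Console.WriteLine($"Page : {region.PageIndex}");
    // Bounds may be null, in which case it prints as an empty string.
    Console.WriteLine($"Bounds : {region.Bounds}");
    Console.WriteLine($"Elements : {region.Elements.Count} contributing words");
}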
Properties
- Bounds
Optional 2D bounds of this region (e.g., page coordinates from OCR/PDF). May be null.
- Elements
The list of TextElement nodes that contributed to this region.
- EndIndex
Zero-based index immediately after the last character of the region (exclusive). -1 if the position could not be determined.
- PageIndex
Zero-based page index for paged inputs. -1 if the page index is unknown.
- StartIndex
Zero-based index of the first character of the region in the original content. -1 if the position could not be determined.
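Because StartIndex is inclusive, EndIndex is exclusive, and both fall back to -1 when the position is unknown, callers may want to validate a span before slicing text with it. The following is a minimal sketch of that check, not part of the LM-Kit API: SpanInfo, SpanHelpers, and TryExtractText are hypothetical names, and the sketch assumes the indices refer to a content string the caller already holds. It uses a positional record, so it needs C# 9 or later.

```csharp
using System;

// Hypothetical stand-in that mirrors only the index properties documented above.
// It is NOT the LM-Kit TextRegion type.
public sealed record SpanInfo(int StartIndex, int EndIndex, int PageIndex);

public static class SpanHelpers
{
    // Succeeds only when both indices are known (not -1) and form a valid
    // half-open range [StartIndex, EndIndex) inside the content string.
    public static bool TryExtractText(SpanInfo span, string content, out string text)
    {
        bool valid = span.StartIndex >= 0
                     && span.EndIndex >= span.StartIndex
                     && span.EndIndex <= content.Length;

        // Length of a half-open span is EndIndex - StartIndex.
        text = valid
            ? content.Substring(span.StartIndex, span.EndIndex - span.StartIndex)
            : string.Empty;
        return valid;
    }
}

public static class Program
{
    public static void Main()
    {
        string content = "This is an important note.";

        // "important" occupies characters 11..19, so the exclusive end is 20.
        var known = new SpanInfo(StartIndex: 11, EndIndex: 20, PageIndex: 0);
        // All three values use the documented -1 "unknown" sentinel.
        var unknown = new SpanInfo(StartIndex: -1, EndIndex: -1, PageIndex: -1);

        foreach (SpanInfo span in new[] { known, unknown })
        {
            if (SpanHelpers.TryExtractText(span, content, out string text))
                Console.WriteLine($"[{span.StartIndex}, {span.EndIndex}) -> \"{text}\"");
            else
                Console.WriteLine("Span position unknown or out of range.");
        }
    }
}
```

With a real TextRegion, the same guard would apply to region.StartIndex and region.EndIndex, with PageIndex checked against -1 separately before it is used as an index.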