Class LayoutSearchEngine
Provides advanced, layout-aware search capabilities over PageElement instances and their TextElement children. Supports exact, regex, fuzzy, region-based, proximity, and block-level queries, each with a cross-page overload. Returns the bounding box and contributing elements for each match.
public sealed class LayoutSearchEngine
- Inheritance
LayoutSearchEngine
- Inherited Members
Constructors
- LayoutSearchEngine(LayoutSearchOptions)
Initializes a new instance of the LayoutSearchEngine class.
Methods
- FindBetween(PageElement, string, string, BetweenOptions)
Extracts the text located between the first occurrence of startQuery and the first occurrence of endQuery. Can optionally include the anchors and cross line/block boundaries (within the same page).
- FindBetween(IEnumerable<PageElement>, string, string, BetweenOptions)
Extracts text located between the first occurrences of startQuery and endQuery within each page, across multiple pages. This overload does not span across page boundaries.
- FindFuzzy(PageElement, string, FuzzySearchOptions)
Performs token-aware fuzzy search using Damerau–Levenshtein distance over sliding windows of the page text. Useful when the source contains OCR noise or minor typos. Normalization (whitespace/diacritics/optional char-stripping) is applied to both the page text and the query.
- FindFuzzy(IEnumerable<PageElement>, string, FuzzySearchOptions)
Performs fuzzy search across multiple pages.
- FindInRegion(PageElement, Rectangle, RegionSearchOptions)
Returns text matches within a geometric region. You can choose intersection or containment semantics and whether to merge adjacent elements.
- FindInRegion(IEnumerable<PageElement>, Rectangle, RegionSearchOptions)
Returns text matches found within region across multiple pages. The same rectangle is applied to each page independently, in page-local coordinates.
- FindNear(PageElement, string, ProximityOptions)
Finds instances of query located in proximity to the specified anchor region.
- FindNear(IEnumerable<PageElement>, string, ProximityOptions)
Finds instances of query located in proximity to the specified anchor region across multiple pages. The same anchor region and radius are applied to each page independently (page-local coordinates).
- FindRegex(PageElement, string, RegexSearchOptions)
Finds regular expression matches within a page's text and returns layout-aware results. The regex runs over the normalized page text (options are applied to the text, not the pattern).
- FindRegex(IEnumerable<PageElement>, string, RegexSearchOptions)
Finds regular expression matches across multiple pages.
- FindText(PageElement, string, TextSearchOptions)
Finds exact (substring) matches of query within a page's text, honoring textOptions. Results include the matched text, a context snippet, the union bounding box, and contributing elements. Normalization (whitespace/diacritics/optional char-stripping) is applied to both the page text and the query.
- FindText(IEnumerable<PageElement>, string, TextSearchOptions)
Finds exact matches across multiple pages and annotates each result with its page index.
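
The FindFuzzy description mentions Damerau–Levenshtein distance over sliding windows of the page text. The sketch below illustrates that general idea only; it is not the library's implementation. FuzzySketch, Distance, Tokenize, and Find are hypothetical names, the distance is the restricted (optimal string alignment) variant, and normalization is reduced to lowercasing and whitespace splitting.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only: none of these names belong to LayoutSearchEngine.
public static class FuzzySketch
{
    // Optimal string alignment distance: the restricted Damerau-Levenshtein
    // variant (insert, delete, substitute, adjacent transposition).
    public static int Distance(string a, string b)
    {
        var d = new int[a.Length + 1, b.Length + 1];
        for (int i = 0; i <= a.Length; i++) d[i, 0] = i;
        for (int j = 0; j <= b.Length; j++) d[0, j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                int best = Math.Min(
                    Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                    d[i - 1, j - 1] + cost);

                // Adjacent transposition, e.g. "bm" versus "mb".
                if (i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1])
                    best = Math.Min(best, d[i - 2, j - 2] + 1);

                d[i, j] = best;
            }
        }
        return d[a.Length, b.Length];
    }

    // Simplified stand-in for normalization: lowercase, split on whitespace.
    public static string[] Tokenize(string text) =>
        text.ToLowerInvariant().Split(
            new[] { ' ', '\t', '\r', '\n' },
            StringSplitOptions.RemoveEmptyEntries);

    // Slides a window of as many tokens as the query has over the page text
    // and keeps every window within maxDistance edits of the query.
    public static List<(int TokenIndex, string Text, int Edits)> Find(
        string pageText, string query, int maxDistance)
    {
        var hits = new List<(int TokenIndex, string Text, int Edits)>();
        var tokens = Tokenize(pageText);
        var queryTokens = Tokenize(query);
        if (queryTokens.Length == 0) return hits;

        var normalizedQuery = string.Join(" ", queryTokens);
        for (int start = 0; start + queryTokens.Length <= tokens.Length; start++)
        {
            var window = string.Join(" ", tokens.Skip(start).Take(queryTokens.Length));
            int edits = Distance(window, normalizedQuery);
            if (edits <= maxDistance) hits.Add((start, window, edits));
        }
        return hits;
    }
}
```

With this sketch, FuzzySketch.Find("Invoice Nubmer: 12345 Total Amuont due", "invoice number:", 2) yields a single hit at token index 0 with one edit, because "nubmer" differs from "number" by one adjacent transposition.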
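
FindInRegion distinguishes intersection from containment semantics, and results carry a union bounding box. The following sketch shows what those three geometric notions usually mean. Box and RegionSketch are hypothetical types written for this illustration; they do not mirror the library's Rectangle or RegionSearchOptions, whose members are not documented here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only: a minimal axis-aligned box in page-local coordinates.
public readonly struct Box
{
    public Box(double left, double top, double right, double bottom)
    {
        Left = left; Top = top; Right = right; Bottom = bottom;
    }

    public double Left { get; }
    public double Top { get; }
    public double Right { get; }
    public double Bottom { get; }

    // True when the two boxes share any area.
    public bool Intersects(Box other) =>
        Left < other.Right && other.Left < Right &&
        Top < other.Bottom && other.Top < Bottom;

    // True when 'other' lies entirely inside this box.
    public bool Contains(Box other) =>
        Left <= other.Left && other.Right <= Right &&
        Top <= other.Top && other.Bottom <= Bottom;

    // Smallest box that covers both inputs.
    public Box Union(Box other) => new Box(
        Math.Min(Left, other.Left), Math.Min(Top, other.Top),
        Math.Max(Right, other.Right), Math.Max(Bottom, other.Bottom));
}

public static class RegionSketch
{
    // Keeps elements that intersect the region, or only those fully
    // contained in it when requireContainment is true.
    public static List<Box> Select(
        IEnumerable<Box> elements, Box region, bool requireContainment) =>
        elements
            .Where(e => requireContainment ? region.Contains(e) : region.Intersects(e))
            .ToList();

    // Union bounding box of a non-empty list (throws on an empty list).
    public static Box UnionAll(IReadOnlyList<Box> boxes) =>
        boxes.Aggregate((acc, b) => acc.Union(b));
}
```

For a region of (0, 0, 100, 50) and elements at (10, 10, 40, 20), (90, 10, 130, 20), and (200, 10, 240, 20), intersection semantics keep the first two elements while containment keeps only the first. The union of the first two is (10, 10, 130, 20).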
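
FindRegex notes that options are applied to the text, not the pattern. The sketch below shows one practical consequence under an assumed normalization (diacritic stripping plus whitespace collapsing): the pattern must be written against the normalized form. NormalizedRegexSketch is a hypothetical helper built on standard .NET calls and makes no claim about how RegexSearchOptions is implemented.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

// Illustrative only: an assumed normalization pipeline, not the library's.
public static class NormalizedRegexSketch
{
    public static string Normalize(string text)
    {
        // Decompose accented letters, then drop the combining marks.
        var decomposed = text.Normalize(NormalizationForm.FormD);
        var stripped = new string(decomposed
            .Where(c => CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            .ToArray());

        // Collapse whitespace runs into single spaces and recompose.
        return Regex.Replace(stripped, @"\s+", " ")
            .Trim()
            .Normalize(NormalizationForm.FormC);
    }

    // The pattern is used verbatim; only the page text is normalized.
    public static bool IsMatch(string pageText, string pattern) =>
        Regex.IsMatch(Normalize(pageText), pattern);
}
```

Here NormalizedRegexSketch.Normalize("Café   No.\t42") produces "Cafe No. 42", so the pattern @"Cafe No\. \d+" matches, while a pattern containing "Café" does not, because the accent no longer exists in the searched text.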