Method GetTextAsync
GetTextAsync(CancellationToken)
Asynchronously extracts and returns the textual content from the attachment.
public Task<string> GetTextAsync(CancellationToken cancellationToken = default)
Parameters
- cancellationToken CancellationToken
A token to monitor for cancellation requests. Default: None.
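Returns
- Task<string>
A task whose result is the extracted plain-text content; an empty string if no text is available.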
GetTextAsync(string, CancellationToken)
Asynchronously extracts and returns the textual content from the specified pages of the attachment.
public Task<string> GetTextAsync(string pageRange, CancellationToken cancellationToken = default)
Parameters
- pageRange string
A page range specification using 1-based page numbers (e.g., "1-5, 7, 9-12"). Use null, an empty string, or "*" to include all pages. Invalid page numbers are silently ignored.
- cancellationToken CancellationToken
A token to monitor for cancellation requests. Default: None.
Returns
- Task<string>
A task whose result is the extracted plain-text content from the specified pages; an empty string if no text is available or if the page range resolves to no valid pages.
Remarks
Page numbers in the range are 1-based (first page is 1). Ranges can be specified as:
- "3" - single page
- "1-5" - page range (inclusive)
- "1-3, 7, 10-12" - multiple ranges and individual pages
- "5-1" - reversed ranges are normalized automatically
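The range rules above can be illustrated with a small parser. This is a minimal sketch, not the library's implementation: `PageRangeSketch`, `Resolve`, and the `pageCount` parameter are hypothetical names. It assumes that malformed tokens and out-of-bounds pages are skipped individually and that the result is sorted and de-duplicated; the documentation only states that invalid page numbers are silently ignored.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the documented API.
public static class PageRangeSketch
{
    // Resolves a range string to sorted, distinct 1-based page numbers.
    public static List<int> Resolve(string pageRange, int pageCount)
    {
        var pages = new SortedSet<int>();

        // null, empty string, or "*" selects every page.
        if (string.IsNullOrEmpty(pageRange) || pageRange.Trim() == "*")
        {
            for (int p = 1; p <= pageCount; p++) pages.Add(p);
            return new List<int>(pages);
        }

        foreach (var rawPart in pageRange.Split(','))
        {
            var part = rawPart.Trim();
            if (part.Length == 0) continue;

            // "3" yields one bound, "1-5" yields two; anything else is skipped.
            var bounds = part.Split('-');
            if (bounds.Length > 2) continue;
            if (!int.TryParse(bounds[0].Trim(), out int first)) continue;

            int last = first;
            if (bounds.Length == 2 && !int.TryParse(bounds[1].Trim(), out last)) continue;

            // Reversed ranges such as "5-1" are normalized to "1-5".
            if (first > last) (first, last) = (last, first);

            // Assumption: pages outside 1..pageCount are dropped one by one.
            for (int p = Math.Max(first, 1); p <= Math.Min(last, pageCount); p++)
            {
                pages.Add(p);
            }
        }

        return new List<int>(pages);
    }
}
```

Under these assumptions, `PageRangeSketch.Resolve("1-3, 7, 10-12", 10)` yields pages 1, 2, 3, 7, and 10, and `PageRangeSketch.Resolve("99", 5)` yields an empty list, which corresponds to the empty-string result described under Returns.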
GetTextAsync(TextOutputMode, CancellationToken)
Asynchronously extracts and returns the textual content formatted with the given mode.
public Task<string> GetTextAsync(TextOutputMode mode, CancellationToken cancellationToken = default)
Parameters
- mode TextOutputMode
Controls how raw lines are grouped and spaced in the output. See TextOutputMode: RawLines, GridAligned, ParagraphFlow, or Structured.
- cancellationToken CancellationToken
A token to observe while performing extraction. If cancellation is requested, the operation throws OperationCanceledException.
Returns
- Task<string>
A task that completes with the extracted plain-text content (UTF-8, Unix line endings) formatted according to mode; the result is an empty string when the attachment has no extractable text.
Remarks
The first invocation performs extraction and caches page elements; later calls reuse the cache.
The layout mode is applied at formatting time without re-extracting text.
For image-only inputs, provide OCR text via SetText(string) or SetText(PageElement) to obtain non-empty output.
If you want the default layout, use GetTextAsync(CancellationToken).
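The extract-once, format-per-call behavior described in these remarks can be modeled with a small stand-in class. This is only a sketch under stated assumptions: `CachedTextSketch` and `ExtractionCount` are hypothetical, the enum is redeclared locally so the example compiles on its own, and the two formatting branches merely imitate the idea of different layouts because the real formatting rules are not documented here. The sketch is not thread-safe.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Local stand-in for the library's enum; member names come from the parameter description.
public enum TextOutputMode { RawLines, GridAligned, ParagraphFlow, Structured }

// Hypothetical class, not the library's type: models "extract once, format per call".
public sealed class CachedTextSketch
{
    private readonly Func<CancellationToken, Task<IReadOnlyList<string>>> _extract;
    private IReadOnlyList<string> _lines; // raw lines cached after the first call

    // Counts how often the (expensive) extraction delegate actually ran.
    public int ExtractionCount { get; private set; }

    public CachedTextSketch(Func<CancellationToken, Task<IReadOnlyList<string>>> extract)
    {
        _extract = extract;
    }

    public async Task<string> GetTextAsync(TextOutputMode mode, CancellationToken cancellationToken = default)
    {
        // Throws OperationCanceledException if cancellation was already requested.
        cancellationToken.ThrowIfCancellationRequested();

        if (_lines == null)
        {
            _lines = await _extract(cancellationToken);
            ExtractionCount++;
        }

        // The mode only changes formatting; the cached lines are reused as-is.
        // Assumption: ParagraphFlow joins lines with spaces, other modes keep line breaks.
        return mode == TextOutputMode.ParagraphFlow
            ? string.Join(" ", _lines)
            : string.Join("\n", _lines);
    }
}
```

Calling the sketch twice with different modes runs the extraction delegate once, so `ExtractionCount` stays at 1 while the returned strings differ only in layout.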
GetTextAsync(TextOutputMode, string, CancellationToken)
Asynchronously extracts and returns the textual content from the specified pages, formatted with the given mode.
public Task<string> GetTextAsync(TextOutputMode mode, string pageRange, CancellationToken cancellationToken = default)
Parameters
- mode TextOutputMode
Controls how raw lines are grouped and spaced in the output. See TextOutputMode: RawLines, GridAligned, ParagraphFlow, or Structured.
- pageRange string
A page range specification using 1-based page numbers (e.g., "1-5, 7, 9-12"). Use null, an empty string, or "*" to include all pages. Invalid page numbers are silently ignored.
- cancellationToken CancellationToken
A token to observe while performing extraction. If cancellation is requested, the operation throws OperationCanceledException.
Returns
- Task<string>
A task that completes with the extracted plain-text content from the specified pages, formatted according to mode; the result is an empty string when the attachment has no extractable text or if the page range resolves to no valid pages.
Remarks
The first invocation performs extraction and caches page elements; later calls reuse the cache.
The pageRange filter is applied after extraction.
Page numbers in the range are 1-based (first page is 1). Ranges can be specified as:
- "3" - single page
- "1-5" - page range (inclusive)
- "1-3, 7, 10-12" - multiple ranges and individual pages
- "5-1" - reversed ranges are normalized automatically
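A caller-side sketch of this overload with a time limit might look like the following. The declaring type is not named in this section, so `IAttachmentText` is a hypothetical interface that only mirrors the documented signature, and the enum is redeclared locally to keep the example self-contained. `TryGetPagesAsync` and its tuple result are illustrative choices, not part of the API.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Local stand-in for the library's enum; member names come from the parameter description.
public enum TextOutputMode { RawLines, GridAligned, ParagraphFlow, Structured }

// Hypothetical interface mirroring the documented signature.
public interface IAttachmentText
{
    Task<string> GetTextAsync(TextOutputMode mode, string pageRange, CancellationToken cancellationToken = default);
}

public static class PageTextCaller
{
    // Requests the given pages in ParagraphFlow layout and gives up after the timeout.
    // Completed is false when the call was cancelled; Text is then empty.
    public static async Task<(bool Completed, string Text)> TryGetPagesAsync(
        IAttachmentText attachment, string pageRange, TimeSpan timeout)
    {
        using (var cts = new CancellationTokenSource(timeout))
        {
            try
            {
                string text = await attachment.GetTextAsync(TextOutputMode.ParagraphFlow, pageRange, cts.Token);
                return (true, text);
            }
            catch (OperationCanceledException)
            {
                // Documented behavior: cancellation surfaces as OperationCanceledException.
                return (false, string.Empty);
            }
        }
    }
}
```

Returning a separate `Completed` flag keeps a cancelled call distinguishable from the documented empty-string result for pages without extractable text.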