Class DocumentToMarkdownPageResult

Namespace: LMKit.Document.Conversion

Assembly: LM-Kit.NET.dll

Represents the Markdown conversion outcome for a single page of a document.

public sealed class DocumentToMarkdownPageResult

Inheritance: object → DocumentToMarkdownPageResult

Inherited Members: object.Equals(object)

object.Equals(object, object)

object.GetHashCode()

object.GetType()

object.ReferenceEquals(object, object)

object.ToString()

Examples

Flag low-quality VLM pages for re-processing.

using System;
using LMKit.Document.Conversion;

var converter = new DocumentToMarkdown();
var result = converter.Convert("report.pdf");

foreach (var page in result.Pages)
{
    // QualityScore is null for pages handled by text extraction, so check HasValue first.
    if (page.StrategyUsed == DocumentToMarkdownStrategy.VlmOcr &&
        page.QualityScore.HasValue && page.QualityScore.Value < 0.6)
    {
        Console.WriteLine($"Review page {page.PageNumber} (quality {page.QualityScore.Value:F2}).");
    }
}

Remarks

Instances of this class are created by DocumentToMarkdown and exposed through Pages. They carry both the per-page Markdown body and diagnostics about which strategy handled the page.

GeneratedTokenCount and QualityScore are populated only when the page was transcribed by the vision-language model; for pages handled by the text-extraction strategy they are 0 and null, respectively. Use GeneratedTokenCount to detect pages that hit the completion-token cap, and QualityScore to flag pages whose conversion should be re-run at higher fidelity.
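The two checks described above can be sketched as a small filter. This is an illustrative sketch, not part of the library: PageDiagnostics is a hypothetical stand-in for the three values read from each DocumentToMarkdownPageResult, and maxCompletionTokens and minQuality are caller-supplied assumptions (this page does not state where the actual completion-token cap is configured).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the diagnostics read from DocumentToMarkdownPageResult.
public sealed record PageDiagnostics(int PageNumber, int GeneratedTokenCount, double? QualityScore);

public static class PageTriage
{
    // Returns the page numbers worth re-running.
    // maxCompletionTokens and minQuality are assumptions supplied by the caller.
    public static List<int> PagesToRerun(
        IEnumerable<PageDiagnostics> pages, int maxCompletionTokens, double minQuality)
    {
        return pages
            .Where(p =>
                // A token count at the cap suggests the transcription was cut short.
                p.GeneratedTokenCount >= maxCompletionTokens ||
                // A null score means the page came from text extraction, so it is skipped here.
                (p.QualityScore.HasValue && p.QualityScore.Value < minQuality))
            .Select(p => p.PageNumber)
            .ToList();
    }
}
```

With real results, each PageDiagnostics would presumably be built from page.PageNumber, page.GeneratedTokenCount, and page.QualityScore while iterating result.Pages.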

Properties

Certainty: Gets a confidence score in the [0, 1] range that the page's Markdown faithfully represents its source. A value of 1.0 means the converter is very confident the output is a correct and complete rendering; values below 0.70 are worth reviewing or routing to a more thorough pipeline.

Elapsed: Gets the wall-clock time spent processing this page.

GeneratedTokenCount: Gets the number of tokens emitted by the vision model for this page, or 0 when the page was handled by the text-extraction strategy.

HasExtractableText: Gets a value indicating whether the source page exposed an extractable text layer at the time of conversion.

Markdown: Gets the Markdown content produced for this page. May be empty when the page contains no textual content or when the conversion could not recover any text.

OcrPerformed: Gets a value indicating whether OCR was performed on this page during conversion.

PageElement: Gets the structured layout extracted for this page: its TextElements with their geometry (positions and bounding boxes) arranged in the page's reading structure.

PageIndex: Gets the zero-based index of the page within the source document.

PageNumber: Gets the 1-based page number, convenient for user-facing output.

QualityScore: Gets the quality score reported by the vision model for this page, or null when the page was handled by the text-extraction strategy.

StrategyUsed: Gets the strategy that was actually applied to this page. When the converter runs in Hybrid, this value reflects the per-page decision (either TextExtraction or VlmOcr).

Warning: Gets an optional warning message associated with this page (for example, a notice that a page was empty because no vision model was available).
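As a rough illustration of how Certainty and Warning might be combined into a review queue, the sketch below flags any page whose certainty falls under the 0.70 guideline mentioned above or that carries a warning. PageSummary and ReviewQueue are hypothetical names introduced for this example; whether an absent Warning is null or empty is an assumption, so both cases are handled.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in carrying PageNumber, Certainty, and Warning from a page result.
public sealed record PageSummary(int PageNumber, double Certainty, string? Warning);

public static class ReviewQueue
{
    // Review guideline taken from the Certainty description.
    public const double ReviewThreshold = 0.70;

    // A page needs review when certainty is low or any warning is attached.
    public static bool NeedsReview(PageSummary page) =>
        page.Certainty < ReviewThreshold || !string.IsNullOrEmpty(page.Warning);

    // Returns page numbers to review, least certain first.
    public static IReadOnlyList<int> Build(IEnumerable<PageSummary> pages) =>
        pages.Where(NeedsReview)
             .OrderBy(p => p.Certainty)
             .Select(p => p.PageNumber)
             .ToList();
}
```

Ordering by ascending certainty puts the pages most likely to be wrong at the front of the queue; a different ordering (for example by page number) may suit user-facing reports better.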
