Class DocumentSplitting
- Namespace: LMKit.Extraction
- Assembly: LM-Kit.NET.dll
Provides functionality to detect logical document boundaries within a multi-page file using a vision language model (VLM).
public sealed class DocumentSplitting
- Inheritance: object → DocumentSplitting
Examples
Example: Detect documents in a multi-page PDF
using LMKit.Model;
using LMKit.Extraction;
using LMKit.Data;
using System;

// Load a vision-capable model (8B or larger recommended)
LM model = LM.LoadFromModelID("qwen3-vl:8b");

// Create the splitter
DocumentSplitting splitter = new DocumentSplitting(model);

// Analyze a multi-page PDF
DocumentSplittingResult result = splitter.Split(new Attachment("multi_document_scan.pdf"));

// Display results
Console.WriteLine($"Multiple documents: {result.ContainsMultipleDocuments}");
Console.WriteLine($"Document count: {result.DocumentCount}");
Console.WriteLine($"Confidence: {result.Confidence:P0}");

foreach (DocumentSegment segment in result.Segments)
{
    Console.WriteLine($" Pages {segment.StartPage}-{segment.EndPage}: {segment.Label} ({segment.PageCount} pages)");
}
Example: Use guidance to improve detection accuracy
using LMKit.Model;
using LMKit.Extraction;
using LMKit.Data;
using System;

// Load a vision-capable model (8B or larger recommended)
LM model = LM.LoadFromModelID("qwen3-vl:8b");

// Provide guidance about the expected document types
DocumentSplitting splitter = new DocumentSplitting(model)
{
    Guidance = "The file contains a mix of invoices and purchase orders."
};

DocumentSplittingResult result = splitter.Split(new Attachment("scanned_batch.pdf"));

foreach (DocumentSegment segment in result.Segments)
{
    Console.WriteLine($"Pages {segment.StartPage}-{segment.EndPage}: {segment.Label}");
}
Remarks
The DocumentSplitting class analyzes a multi-page PDF attachment and determines whether it contains multiple logical documents. It returns the page range of each detected document.
This class requires a vision-capable language model. The model must have
HasVision set to true. Page images are fed directly to
the VLM for visual boundary detection, which allows reliable splitting even on
scanned documents or documents with complex layouts.
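The vision requirement above can be checked up front, before the splitter is created. The following is a minimal sketch, assuming HasVision is exposed as a boolean property on the loaded LM instance (the remarks name the property but do not show its declaration) and reusing the model ID from the examples. The exception type and message are illustrative choices, not part of the library.

```csharp
using LMKit.Model;
using LMKit.Extraction;
using System;

// Load the model used in the examples on this page
LM model = LM.LoadFromModelID("qwen3-vl:8b");

// Assumption: HasVision is a bool property on LM, as the remarks imply.
if (!model.HasVision)
{
    // Illustrative failure mode; pick whatever error handling suits your application.
    throw new InvalidOperationException(
        "DocumentSplitting requires a vision-capable model.");
}

DocumentSplitting splitter = new DocumentSplitting(model);
```

Failing fast here keeps the error close to model loading. Whether the constructor performs its own check is not stated on this page.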
Key Features
- Detect whether a multi-page PDF contains multiple logical documents
- Identify the page range for each detected document
- Optional OCR engine integration for scanned documents
- Guidance text to improve detection accuracy
Typical Workflow
- Create a DocumentSplitting instance with a vision language model
- Optionally configure Guidance or OcrEngine
- Call Split(Attachment, CancellationToken) or SplitAsync(Attachment, CancellationToken) with the PDF attachment to analyze
- Access results via DocumentSplittingResult
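The workflow above can be sketched with the asynchronous overload as follows. This assumes SplitAsync returns an awaitable DocumentSplittingResult and honors the CancellationToken in the usual .NET way, so that cancellation surfaces as an OperationCanceledException. Neither the return type nor the cancellation behavior is spelled out on this page. The file name, guidance text, and two-minute timeout are placeholders.

```csharp
using LMKit.Model;
using LMKit.Extraction;
using LMKit.Data;
using System;
using System.Threading;

// 1. Create the splitter with a vision language model
LM model = LM.LoadFromModelID("qwen3-vl:8b");
DocumentSplitting splitter = new DocumentSplitting(model)
{
    // 2. Optional configuration
    Guidance = "The file contains a mix of invoices and purchase orders."
};

// Request cancellation if the analysis runs longer than two minutes (arbitrary limit)
using CancellationTokenSource cts = new CancellationTokenSource(TimeSpan.FromMinutes(2));

try
{
    // 3. Run the analysis (assumed to return Task<DocumentSplittingResult>)
    DocumentSplittingResult result =
        await splitter.SplitAsync(new Attachment("scanned_batch.pdf"), cts.Token);

    // 4. Read the results
    foreach (DocumentSegment segment in result.Segments)
    {
        Console.WriteLine($"Pages {segment.StartPage}-{segment.EndPage}: {segment.Label}");
    }
}
catch (OperationCanceledException)
{
    // Reached only if the library propagates cancellation this way (assumption)
    Console.WriteLine("Splitting was canceled before it completed.");
}
```

Passing a token with a timeout is useful for batch jobs, where one slow file should not stall the rest of the queue.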
Constructors
- DocumentSplitting(LM)
Initializes a new instance of the DocumentSplitting class with the specified vision language model.
Properties
- Guidance
Gets or sets semantic guidance for the splitting process.
- MaximumContextLength
Gets or sets the maximum context length (in tokens) allowed for the language model during splitting.
- Model
Gets the vision language model instance used to drive the document splitting process.
Methods
- Split(Attachment, CancellationToken)
Synchronously detects logical document boundaries within the specified attachment.
- SplitAsync(Attachment, CancellationToken)
Asynchronously detects logical document boundaries within the specified attachment.