Method SplitAsync
- Namespace: LMKit.Extraction
- Assembly: LM-Kit.NET.dll
SplitAsync(Attachment, CancellationToken)
Asynchronously detects logical document boundaries within the specified attachment.
public Task<DocumentSplittingResult> SplitAsync(Attachment attachment, CancellationToken cancellationToken = default)
Parameters
- attachment (Attachment)
The multi-page PDF Attachment to analyze. Cannot be null.
- cancellationToken (CancellationToken)
A token to monitor for cancellation requests while splitting is running.
Returns
- Task<DocumentSplittingResult>
A task that represents the asynchronous operation. The task result contains a DocumentSplittingResult with the detected document segments.
Examples
using LMKit.Model;
using LMKit.Extraction;
using LMKit.Document.Pdf;
using LMKit.Data;
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
// Load a vision-capable model (8B or larger recommended)
LM model = LM.LoadFromModelID("qwen3-vl:8b");
// Create the splitter
var splitter = new DocumentSplitting(model);
// Detect logical document boundaries
var source = new Attachment("multi_doc.pdf");
DocumentSplittingResult result = await splitter.SplitAsync(source);
// Display results
Console.WriteLine($"Found {result.DocumentCount} document(s)");
Console.WriteLine($"Confidence: {result.Confidence:P0}");
foreach (DocumentSegment segment in result.Segments)
{
    Console.WriteLine($"  Pages {segment.StartPage}-{segment.EndPage}: {segment.Label}");
}
// Physically split the PDF using PdfSplitter
if (result.ContainsMultipleDocuments)
{
    List<Attachment> documents = PdfSplitter.Split(source, result);
    Console.WriteLine($"Split into {documents.Count} separate PDFs");
}
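The cancellationToken parameter can be used to bound how long boundary detection is allowed to run. The following is a minimal sketch, not taken from the library documentation: the two-minute timeout is an arbitrary value, and it assumes that cancellation surfaces as an OperationCanceledException, which is the usual .NET convention but is not stated on this page.

```csharp
using LMKit.Data;
using LMKit.Extraction;
using LMKit.Model;
using System;
using System.Threading;
using System.Threading.Tasks;

LM model = LM.LoadFromModelID("qwen3-vl:8b");
var splitter = new DocumentSplitting(model);
var source = new Attachment("multi_doc.pdf");

// Request cancellation automatically after two minutes (arbitrary value for illustration).
using var cts = new CancellationTokenSource(TimeSpan.FromMinutes(2));

try
{
    // Pass the token so the operation can observe the cancellation request.
    DocumentSplittingResult result = await splitter.SplitAsync(source, cts.Token);
    Console.WriteLine($"Found {result.DocumentCount} document(s)");
}
catch (OperationCanceledException)
{
    // Assumption: a cancelled run throws OperationCanceledException
    // (or a derived type such as TaskCanceledException).
    Console.WriteLine("Splitting was cancelled before it completed.");
}
```

The same token can also be triggered manually with cts.Cancel(), for example from a UI "Stop" button.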
Exceptions
- ArgumentNullException
Thrown if attachment is null.
SplitAsync(Attachment, bool, string, CancellationToken)
Asynchronously detects logical document boundaries within the specified attachment, and optionally splits the source PDF into separate files for each detected segment.
public Task<DocumentSplittingResult> SplitAsync(Attachment attachment, bool splitDocument, string outputDirectory = null, CancellationToken cancellationToken = default)
Parameters
- attachment (Attachment)
The multi-page PDF Attachment to analyze. Cannot be null.
- splitDocument (bool)
When true, the source PDF is physically split into separate PDF files for each detected segment. The file paths are available via Documents. When false, behavior is identical to SplitAsync(Attachment, CancellationToken).
- outputDirectory (string)
The directory where split PDF files will be written. Created if it does not exist. Required when splitDocument is true.
- cancellationToken (CancellationToken)
A token to monitor for cancellation requests while splitting is running.
Returns
- Task<DocumentSplittingResult>
A task that represents the asynchronous operation. The task result contains a DocumentSplittingResult with the detected document segments and, when splitDocument is true, the paths to the split PDF files via Documents.
Examples
using LMKit.Model;
using LMKit.Extraction;
using LMKit.Data;
using System;
using System.Threading.Tasks;
LM model = LM.LoadFromModelID("qwen3-vl:8b");
var splitter = new DocumentSplitting(model);
// Detect boundaries AND split the PDF into separate files in one call
DocumentSplittingResult result = await splitter.SplitAsync(
    new Attachment("multi_doc_scan.pdf"),
    splitDocument: true,
    outputDirectory: "output/split_docs");
Console.WriteLine($"Found {result.DocumentCount} document(s)");
Console.WriteLine($"Confidence: {result.Confidence:P0}");
for (int i = 0; i < result.Segments.Count; i++)
{
    Console.WriteLine($"  {result.Segments[i].Label}: {result.Documents[i]}");
}
Exceptions
- ArgumentNullException
Thrown if attachment is null, or if splitDocument is true and outputDirectory is null.
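Because a null outputDirectory is only rejected when splitDocument is true, a caller whose output location is optional can derive splitDocument from whether a directory was configured. The sketch below illustrates one way to do that; the SPLIT_OUTPUT_DIR environment variable is a hypothetical name chosen for this example, and the loop assumes, as the example above does, that Documents has one entry per segment.

```csharp
using LMKit.Data;
using LMKit.Extraction;
using LMKit.Model;
using System;
using System.Threading.Tasks;

LM model = LM.LoadFromModelID("qwen3-vl:8b");
var splitter = new DocumentSplitting(model);

// Hypothetical configuration source: an environment variable that may be unset.
var outputDirectory = Environment.GetEnvironmentVariable("SPLIT_OUTPUT_DIR");

// Only request physical splitting when a target directory is available,
// so the ArgumentNullException for a null outputDirectory cannot occur.
bool splitDocument = !string.IsNullOrWhiteSpace(outputDirectory);

DocumentSplittingResult result = await splitter.SplitAsync(
    new Attachment("multi_doc_scan.pdf"),
    splitDocument,
    outputDirectory);

Console.WriteLine($"Found {result.DocumentCount} document(s)");

if (splitDocument)
{
    // Assumption: Documents[i] corresponds to Segments[i].
    for (int i = 0; i < result.Segments.Count; i++)
    {
        Console.WriteLine($"  {result.Segments[i].Label}: {result.Documents[i]}");
    }
}
```

When no directory is configured, the call only detects boundaries, matching the documented behavior of SplitAsync(Attachment, CancellationToken).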