Method Split
Split(Attachment, IEnumerable<string>)
Splits a PDF attachment into multiple attachments based on the provided page ranges.
public static List<Attachment> Split(Attachment source, IEnumerable<string> pageRanges)
Parameters
- source Attachment
The source PDF attachment.
- pageRanges IEnumerable<string>
A collection of 1-based page range strings (e.g., "1-3", "4-6", "7"). Each range produces one output attachment.
Returns
- List<Attachment>
A list of Attachment instances, one per page range.
Examples
using System;
using System.Collections.Generic;
using LMKit.Data;
using LMKit.Document;

// Load a 12-page document
var source = new Attachment("quarterly_report.pdf");

// Split into 3 parts: Q1, Q2, Q3+Q4
List<Attachment> parts = PdfSplitter.Split(
    source,
    new[] { "1-3", "4-6", "7-12" });

for (int i = 0; i < parts.Count; i++)
{
    Console.WriteLine($"Part {i + 1}: {parts[i].PageCount} pages");
}
Exceptions
- ArgumentNullException
Thrown when source or pageRanges is null.
- ArgumentException
Thrown when the source attachment is not a PDF.
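The exceptions listed above cover a null argument and a non-PDF source; this page does not say how Split treats a malformed or out-of-bounds range string. A caller who wants to reject bad input early could check each string first. The sketch below is one way to do that, assuming the "N" and "N-M" forms shown in the parameter description are the only accepted ones. PageRangeCheck and TryParse are hypothetical names and are not part of LMKit.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of LMKit.
public static class PageRangeCheck
{
    // Parses "N" or "N-M" into 1-based inclusive bounds and checks them
    // against the page count of the document. Returns false on any problem.
    public static bool TryParse(string range, int pageCount, out int first, out int last)
    {
        first = 0;
        last = 0;

        if (string.IsNullOrWhiteSpace(range))
            return false;

        string[] parts = range.Split('-');
        if (parts.Length > 2)
            return false;

        // NumberStyles.None accepts digits only (no sign, no embedded spaces).
        if (!int.TryParse(parts[0].Trim(), NumberStyles.None,
                          CultureInfo.InvariantCulture, out first))
            return false;

        // A single page such as "7" is treated as the range 7-7.
        last = first;
        if (parts.Length == 2 &&
            !int.TryParse(parts[1].Trim(), NumberStyles.None,
                          CultureInfo.InvariantCulture, out last))
            return false;

        return first >= 1 && first <= last && last <= pageCount;
    }
}
```

With this helper, a caller could loop over the intended ranges, call PageRangeCheck.TryParse(range, source.PageCount, out _, out _) on each, and only pass the collection to PdfSplitter.Split when every check succeeds. Whether Split performs equivalent checks itself is not documented here.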
Split(Attachment, DocumentSplittingResult)
Splits a PDF attachment into multiple attachments based on the segments detected by DocumentSplitting.
public static List<Attachment> Split(Attachment source, DocumentSplittingResult splittingResult)
Parameters
- source Attachment
The source PDF attachment.
- splittingResult DocumentSplittingResult
The result of a DocumentSplitting.Split(Attachment, CancellationToken) or DocumentSplitting.SplitAsync(Attachment, CancellationToken) operation.
Returns
- List<Attachment>
A list of Attachment instances, one per DocumentSegment.
Examples
using System;
using System.Collections.Generic;
using LMKit.Data;
using LMKit.Document;
using LMKit.Extraction;
using LMKit.Model;

// Load a multi-document PDF scan
var source = new Attachment("batch_scan.pdf");

// Use a vision model to detect logical boundaries
var model = LM.LoadFromModelID("qwen2-vl:8b");
var splitting = new DocumentSplitting(model);
DocumentSplittingResult result = splitting.Split(source);

// Extract each detected segment as a separate PDF attachment
List<Attachment> documents = PdfSplitter.Split(source, result);

foreach (var segment in result.Segments)
{
    Console.WriteLine($"Detected: {segment.Label} (pages {segment.StartPage}-{segment.EndPage})");
}

Console.WriteLine($"Extracted {documents.Count} separate documents");
Exceptions
- ArgumentNullException
Thrown when source or splittingResult is null.
- ArgumentException
Thrown when the source attachment is not a PDF.