Method Split

Namespace
LMKit.Document.Pdf
Assembly
LM-Kit.NET.dll

Split(Attachment, IEnumerable<string>)

Splits a PDF attachment into multiple attachments based on the provided page ranges.

public static List<Attachment> Split(Attachment source, IEnumerable<string> pageRanges)

Parameters

source Attachment

The source PDF attachment.

pageRanges IEnumerable<string>

A collection of 1-based page range strings (e.g., "1-3", "4-6", "7"). Each range produces one output attachment.

Returns

List<Attachment>

A list of Attachment instances, one per page range.

Examples

using System;
using System.Collections.Generic;
using LMKit.Data;
using LMKit.Document;
using LMKit.Document.Pdf; // PdfSplitter is declared in this namespace

// Load a 12-page document
var source = new Attachment("quarterly_report.pdf");

// Split into 3 parts: Q1, Q2, Q3+Q4
List<Attachment> parts = PdfSplitter.Split(
    source,
    new[] { "1-3", "4-6", "7-12" });

for (int i = 0; i < parts.Count; i++)
{
    Console.WriteLine($"Part {i + 1}: {parts[i].PageCount} pages");
}
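
When the ranges follow a regular pattern, they can be generated instead of hard-coded. The sketch below builds fixed-size chunks in the 1-based "start-end" form described under Parameters. PageRangeBuilder and its Build method are hypothetical helpers written for this illustration; they are not part of LM-Kit.NET.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of LM-Kit.NET.
public static class PageRangeBuilder
{
    // Returns 1-based range strings that cover pages 1..totalPages
    // in consecutive chunks of at most pagesPerPart pages.
    public static List<string> Build(int totalPages, int pagesPerPart)
    {
        if (totalPages < 1) throw new ArgumentOutOfRangeException(nameof(totalPages));
        if (pagesPerPart < 1) throw new ArgumentOutOfRangeException(nameof(pagesPerPart));

        var ranges = new List<string>();

        // long arithmetic avoids overflow when pagesPerPart is very large.
        for (long start = 1; start <= totalPages; start += pagesPerPart)
        {
            long end = Math.Min(start + pagesPerPart - 1, totalPages);

            // A single-page chunk is written as "7" rather than "7-7".
            ranges.Add(start == end ? start.ToString() : $"{start}-{end}");
        }

        return ranges;
    }
}
```

Assuming source is the attachment loaded above, the result could be passed straight to the splitter, for example PdfSplitter.Split(source, PageRangeBuilder.Build(source.PageCount, 4)).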

Exceptions

ArgumentNullException

Thrown when source or pageRanges is null.

ArgumentException

Thrown when the source attachment is not a PDF.
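
The exceptions listed above cover null arguments and non-PDF sources. How a malformed or out-of-bounds range string is handled is not stated here, so a caller may prefer to validate ranges before calling Split. The following is a minimal sketch of such a check. PageRangeValidator is a hypothetical helper, not an LM-Kit.NET type, and it deliberately accepts only the strict "N" and "N-M" forms shown in the parameter description.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of LM-Kit.NET.
public static class PageRangeValidator
{
    // Returns true when text is "N" or "N-M" with 1 <= N <= M <= pageCount.
    public static bool IsValid(string text, int pageCount)
    {
        if (string.IsNullOrWhiteSpace(text)) return false;

        string[] parts = text.Split('-');
        if (parts.Length > 2) return false;

        if (!TryParsePage(parts[0], out int start)) return false;

        // A single page "N" is treated as the range N..N.
        int end = start;
        if (parts.Length == 2 && !TryParsePage(parts[1], out end)) return false;

        return start >= 1 && start <= end && end <= pageCount;
    }

    // Digits only: no sign, no whitespace, no thousands separators.
    private static bool TryParsePage(string s, out int page) =>
        int.TryParse(s, NumberStyles.None, CultureInfo.InvariantCulture, out page);
}
```

A caller could reject the whole request if any entry fails IsValid, rather than relying on behavior this page does not document.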

Split(Attachment, DocumentSplittingResult)

Splits a PDF attachment into multiple attachments based on the segments detected by DocumentSplitting.

public static List<Attachment> Split(Attachment source, DocumentSplittingResult splittingResult)

Parameters

source Attachment

The source PDF attachment.

splittingResult DocumentSplittingResult

The result of a DocumentSplitting.Split(Attachment, CancellationToken) or DocumentSplitting.SplitAsync(Attachment, CancellationToken) operation.

Returns

List<Attachment>

A list of Attachment instances, one per DocumentSegment.

Examples

using System;
using System.Collections.Generic;
using LMKit.Data;
using LMKit.Document;
using LMKit.Document.Pdf; // PdfSplitter is declared in this namespace
using LMKit.Extraction;
using LMKit.Model;

// Load a multi-document PDF scan
var source = new Attachment("batch_scan.pdf");

// Use a vision model to detect logical boundaries
var model = LM.LoadFromModelID("qwen2-vl:8b");
var splitting = new DocumentSplitting(model);
DocumentSplittingResult result = splitting.Split(source);

// Extract each detected segment as a separate PDF attachment
List<Attachment> documents = PdfSplitter.Split(source, result);

foreach (var segment in result.Segments)
{
    Console.WriteLine($"Detected: {segment.Label} (pages {segment.StartPage}-{segment.EndPage})");
}
Console.WriteLine($"Extracted {documents.Count} separate documents");
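
If only some of the detected segments are needed, one option is to convert the selected segments into range strings and call the Split(Attachment, IEnumerable<string>) overload instead. The sketch below shows that conversion under two assumptions: that StartPage and EndPage are 1-based and inclusive, and that a segment can be described by the three properties used in the example above. SegmentInfo is a stand-in record and SegmentRanges is a hypothetical helper; neither is an LM-Kit.NET type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the segment shape used in the example (Label, StartPage, EndPage).
// Hypothetical: the real DocumentSegment type may expose more or different members.
public record SegmentInfo(string Label, int StartPage, int EndPage);

// Hypothetical helper, not part of LM-Kit.NET.
public static class SegmentRanges
{
    // Converts the segments whose label matches (case-insensitive)
    // into range strings, assuming 1-based inclusive page numbers.
    public static List<string> ForLabel(IEnumerable<SegmentInfo> segments, string label) =>
        segments
            .Where(s => string.Equals(s.Label, label, StringComparison.OrdinalIgnoreCase))
            .Select(s => s.StartPage == s.EndPage
                ? s.StartPage.ToString()
                : $"{s.StartPage}-{s.EndPage}")
            .ToList();
}
```

In practice the SegmentInfo values would be projected from result.Segments, and the returned list would be passed as the pageRanges argument.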

Exceptions

ArgumentNullException

Thrown when source or splittingResult is null.

ArgumentException

Thrown when the source attachment is not a PDF.