Method SetContent

Namespace: LMKit.Extraction
Assembly: LM-Kit.NET.dll

SetContent(string)

Sets the text content from which elements will be extracted.

public void SetContent(string content)

Parameters

content string

The unstructured text content to process; cannot be null or empty.

Examples

// Set the content to extract data from
textExtraction.SetContent("John Doe, aged 30, lives at 123 Main St, Anytown, 12345.");

Exceptions

ArgumentException

Thrown if content is null or empty.
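
Because this overload rejects null or empty text, callers that handle untrusted input may prefer to check the text before calling it. The following is a minimal sketch, not part of LM-Kit.NET: TrySetText is a hypothetical helper that receives the setter as a delegate so the snippet runs without the library. It mirrors only the documented null-or-empty rule; how whitespace-only text is treated is not stated on this page.

```csharp
using System;

// Hypothetical helper (not an LM-Kit.NET API): forwards text to a
// SetContent(string)-style setter only when it is neither null nor empty,
// mirroring the documented ArgumentException condition.
static bool TrySetText(Action<string> setContent, string text)
{
    if (string.IsNullOrEmpty(text))
    {
        return false; // the documented overload would throw ArgumentException here
    }

    setContent(text);
    return true;
}

// Stand-in setter for demonstration; with the library, the delegate would be
// textExtraction.SetContent.
string captured = "";
bool accepted = TrySetText(t => captured = t, "John Doe, aged 30.");
bool emptyAccepted = TrySetText(t => captured = t, "");
Console.WriteLine($"accepted={accepted}, emptyAccepted={emptyAccepted}");
```

With the library available, the call would presumably read TrySetText(textExtraction.SetContent, input), since the Action<string> target type selects the string overload.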

SetContent(Attachment)

Adds all pages of an attachment to be processed for data extraction.

public void SetContent(Attachment content)

Parameters

content Attachment

The Attachment to process. Supports various file formats including images, PDF documents, HTML files, and Microsoft Office formats. See Attachment for the complete list of supported formats.

Examples

// Create an attachment from an image file
Attachment imageAttachment = new Attachment("path/to/image.png");
textExtraction.SetContent(imageAttachment);

// Create an attachment from a PDF document
Attachment pdfAttachment = new Attachment("path/to/document.pdf");
textExtraction.SetContent(pdfAttachment);

// Create an attachment from a Word document
Attachment docxAttachment = new Attachment("path/to/report.docx");
textExtraction.SetContent(docxAttachment);

Remarks

All pages of the attachment are included for processing. Calling this method multiple times accumulates pages rather than replacing previously added content.

Exceptions

ArgumentNullException

Thrown if content is null.

SetContent(Attachment, int)

Adds a specific page of an attachment to be processed for data extraction.

public void SetContent(Attachment content, int pageIndex)

Parameters

content Attachment

The Attachment to process. Supports various file formats including images, PDF documents, HTML files, and Microsoft Office formats.

pageIndex int

The zero-based index of the page to process.

Examples

// Create an attachment from a PDF document
Attachment attachment = new Attachment("path/to/document.pdf");

// Add only the first page for extraction
textExtraction.SetContent(attachment, 0);

Remarks

Calling this method multiple times accumulates pages rather than replacing previously added content.

Exceptions

ArgumentNullException

Thrown if content is null.

ArgumentOutOfRangeException

Thrown if pageIndex is negative or greater than or equal to the attachment's page count.
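
Note that pageIndex is zero-based, whereas the pageRange string accepted by SetContent(Attachment, string) uses 1-based page numbers. The sketch below shows one way to convert a 1-based page number to a valid index before calling this overload. TryToPageIndex is a hypothetical helper, and pageCount is assumed to be known by the caller; how to obtain it from an Attachment is not covered on this page.

```csharp
using System;

// Hypothetical helper (not an LM-Kit.NET API): converts a 1-based page number,
// as shown in a document viewer, to the zero-based pageIndex expected by
// SetContent(Attachment, int). Returns false when the index would fall outside
// [0, pageCount), the condition documented to raise ArgumentOutOfRangeException.
static bool TryToPageIndex(int pageNumber, int pageCount, out int pageIndex)
{
    pageIndex = pageNumber - 1;
    return pageIndex >= 0 && pageIndex < pageCount;
}

// pageCount (3 here) is an assumed value supplied by the caller.
if (TryToPageIndex(1, 3, out int firstIndex))
{
    Console.WriteLine($"Page 1 maps to pageIndex {firstIndex}");
}
```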

SetContent(Attachment, string)

Adds specified pages of an attachment to be processed for data extraction.

public void SetContent(Attachment content, string pageRange)

Parameters

content Attachment

The Attachment to process. Supports various file formats including images, PDF documents, HTML files, and Microsoft Office formats.

pageRange string

A string specifying which pages to include, using 1-based page numbers (e.g., "1-5, 7, 9-12"). Use null, an empty string, or "*" to include all pages.

Examples

// Create an attachment from a PDF document
Attachment attachment = new Attachment("path/to/document.pdf");

// Add pages 1 through 5, page 7, and pages 9 through 12
textExtraction.SetContent(attachment, "1-5, 7, 9-12");

// Add only the first three pages
textExtraction.SetContent(attachment, "1-3");

Remarks

Calling this method multiple times accumulates pages rather than replacing previously added content. Page numbers outside the valid range are ignored.

Exceptions

ArgumentNullException

Thrown if content is null.
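
To make the pageRange semantics concrete, the sketch below expands a range string into zero-based page indexes according to the rules documented above: 1-based numbers, comma-separated items, hyphenated ranges, null, empty, or "*" for all pages, and out-of-range numbers ignored. It illustrates the documented behavior and is not the library's parser. ExpandPageRange is a hypothetical helper, pageCount is assumed to be known, and skipping malformed tokens is an assumption because this page does not say how they are handled.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not an LM-Kit.NET API): expands a 1-based page range
// string into sorted, de-duplicated zero-based page indexes.
static SortedSet<int> ExpandPageRange(string pageRange, int pageCount)
{
    var indexes = new SortedSet<int>();

    // null, an empty string, or "*" selects every page.
    if (string.IsNullOrEmpty(pageRange) || pageRange.Trim() == "*")
    {
        for (int i = 0; i < pageCount; i++) indexes.Add(i);
        return indexes;
    }

    foreach (string token in pageRange.Split(','))
    {
        string[] bounds = token.Split('-');

        // Assumption: malformed tokens are skipped.
        if (bounds.Length > 2 || !int.TryParse(bounds[0].Trim(), out int first)) continue;

        int last = first;
        if (bounds.Length == 2 && !int.TryParse(bounds[1].Trim(), out last)) continue;

        // Page numbers outside 1..pageCount are ignored.
        for (int page = Math.Max(first, 1); page <= Math.Min(last, pageCount); page++)
        {
            indexes.Add(page - 1);
        }
    }

    return indexes;
}

// For a 10-page document, pages 11 and 12 are dropped.
Console.WriteLine(string.Join(", ", ExpandPageRange("1-5, 7, 9-12", 10)));
// prints "0, 1, 2, 3, 4, 6, 8, 9"
```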

SetContent(ImageBuffer)

Sets the content for extraction from the specified image buffer.

public void SetContent(ImageBuffer content)

Parameters

content ImageBuffer

The ImageBuffer representing the image to process; cannot be null.

Examples

// Load an image into an ImageBuffer
ImageBuffer buffer = ImageLoader.Load("path/to/image.png");

// Set the image content to extract data from
textExtraction.SetContent(buffer);

Exceptions

ArgumentNullException

Thrown if content is null.

SetContent(IEnumerable<Attachment>)

Adds multiple attachments to be processed for data extraction.

public void SetContent(IEnumerable<Attachment> content)

Parameters

content IEnumerable<Attachment>

An enumerable of Attachment objects to process. Each attachment can be an image, PDF document, HTML file, Microsoft Office document, or any other supported format.

Examples

// Create attachments from multiple image files
var attachments = new List<Attachment>
{
    new Attachment("path/to/image1.png"),
    new Attachment("path/to/image2.jpg")
};

// Add all images for extraction
textExtraction.SetContent(attachments);

// Combine different file types
var mixedAttachments = new List<Attachment>
{
    new Attachment("path/to/photo.png"),
    new Attachment("path/to/invoice.pdf"),
    new Attachment("path/to/report.docx")
};

// Add all content for extraction
textExtraction.SetContent(mixedAttachments);

Remarks

All pages of each attachment are included for processing. Calling this method multiple times accumulates pages rather than replacing previously added content.

Exceptions

ArgumentNullException

Thrown if content is null.

ArgumentException

Thrown if content contains no items.
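
The two exceptions above distinguish a null collection from an empty one. When the collection comes from a filter or a lazy query, it may help to materialize it once and check that it is non-empty before calling this overload. The following is a minimal sketch under that assumption: TryMaterialize is a hypothetical helper, and strings stand in for Attachment objects so the snippet runs without the library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not an LM-Kit.NET API): enumerates a sequence once and
// reports whether the result is non-empty, covering both documented failure
// cases (null collection, collection with no items).
static bool TryMaterialize<T>(IEnumerable<T> items, out List<T> list)
{
    list = items == null ? new List<T>() : items.ToList();
    return list.Count > 0;
}

// Strings stand in for Attachment objects in this sketch.
var paths = new[] { "path/to/photo.png", "path/to/invoice.pdf" };
if (TryMaterialize(paths.Where(p => p.EndsWith(".pdf")), out List<string> pdfs))
{
    // With the library, the materialized list would be passed to SetContent here.
    Console.WriteLine($"{pdfs.Count} item(s) ready for extraction");
}
```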