Method SetContent
- Namespace: LMKit.Extraction
- Assembly: LM-Kit.NET.dll
SetContent(string)
Sets the text content from which elements will be extracted.
public void SetContent(string content)
Parameters
content string
The unstructured text content to process; cannot be null or empty.
Examples
// Set the content to extract data from
textExtraction.SetContent("John Doe, aged 30, lives at 123 Main St, Anytown, 12345.");
Exceptions
- ArgumentException
Thrown if content is null or empty.
SetContent(Attachment)
Adds all pages of an attachment to be processed for data extraction.
public void SetContent(Attachment content)
Parameters
content Attachment
The Attachment to process. Supports various file formats including images, PDF documents, HTML files, and Microsoft Office formats. See Attachment for the complete list of supported formats.
Examples
// Create an attachment from an image file
Attachment imageAttachment = new Attachment("path/to/image.png");
textExtraction.SetContent(imageAttachment);
// Create an attachment from a PDF document
Attachment pdfAttachment = new Attachment("path/to/document.pdf");
textExtraction.SetContent(pdfAttachment);
// Create an attachment from a Word document
Attachment docxAttachment = new Attachment("path/to/report.docx");
textExtraction.SetContent(docxAttachment);
Remarks
All pages of the attachment are included for processing. Calling this method multiple times accumulates pages rather than replacing previously added content.
Exceptions
- ArgumentNullException
Thrown if content is null.
SetContent(Attachment, int)
Adds a specific page of an attachment to be processed for data extraction.
public void SetContent(Attachment content, int pageIndex)
Parameters
content Attachment
The Attachment to process. Supports various file formats including images, PDF documents, HTML files, and Microsoft Office formats.
pageIndex int
The zero-based index of the page to process.
Examples
// Create an attachment from a PDF document
Attachment attachment = new Attachment("path/to/document.pdf");
// Add only the first page for extraction
textExtraction.SetContent(attachment, 0);
Remarks
Calling this method multiple times accumulates pages rather than replacing previously added content.
Exceptions
- ArgumentNullException
Thrown if content is null.
- ArgumentOutOfRangeException
Thrown if pageIndex is negative or greater than or equal to the attachment's page count.
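This overload takes a zero-based index, while the page-range overload uses 1-based page numbers, so an off-by-one error is easy to make when switching between them. The sketch below illustrates the conversion and the documented validity rule (0 <= pageIndex < page count). It is not LM-Kit code: TryGetPageIndex and the pageCount value are hypothetical, and the page count is assumed to be known from elsewhere.

```csharp
using System;

public static class PageIndexSketch
{
    // Hypothetical helper, not part of LM-Kit: converts a 1-based page number
    // to a zero-based index and reports whether that index is in range,
    // i.e. neither negative nor greater than or equal to pageCount.
    public static bool TryGetPageIndex(int pageNumber, int pageCount, out int pageIndex)
    {
        pageIndex = pageNumber - 1;
        return pageIndex >= 0 && pageIndex < pageCount;
    }

    public static void Main()
    {
        const int pageCount = 3; // assumed page count of some document

        foreach (int pageNumber in new[] { 0, 1, 3, 4 })
        {
            bool valid = TryGetPageIndex(pageNumber, pageCount, out int pageIndex);
            Console.WriteLine($"page {pageNumber} -> index {pageIndex}, valid: {valid}");
        }
    }
}
```

Running a check like this before calling SetContent(attachment, pageIndex) is one way to avoid the ArgumentOutOfRangeException instead of catching it.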
SetContent(Attachment, string)
Adds specified pages of an attachment to be processed for data extraction.
public void SetContent(Attachment content, string pageRange)
Parameters
content Attachment
The Attachment to process. Supports various file formats including images, PDF documents, HTML files, and Microsoft Office formats.
pageRange string
A string specifying which pages to include, using 1-based page numbers (e.g., "1-5, 7, 9-12"). Use null, an empty string, or "*" to include all pages.
Examples
// Create an attachment from a PDF document
Attachment attachment = new Attachment("path/to/document.pdf");
// Add pages 1 through 5, page 7, and pages 9 through 12
textExtraction.SetContent(attachment, "1-5, 7, 9-12");
// Add only the first three pages
textExtraction.SetContent(attachment, "1-3");
Remarks
Calling this method multiple times accumulates pages rather than replacing previously added content. Page numbers outside the valid range are ignored.
Exceptions
- ArgumentNullException
Thrown if content is null.
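To make the page-range rules concrete, the sketch below expands a range string into zero-based page indices. It follows the rules documented for this overload: page numbers are 1-based; null, an empty string, or "*" selects every page; page numbers outside the valid range are ignored. It is an illustration only, not the library's parser. ToZeroBasedIndices and its pageCount parameter are hypothetical. Skipping malformed tokens and keeping duplicate pages are assumptions, since the documentation does not say how LM-Kit handles either case.

```csharp
using System;
using System.Collections.Generic;

public static class PageRangeSketch
{
    // Hypothetical helper, not part of LM-Kit: expands a 1-based page range
    // string such as "1-5, 7, 9-12" into zero-based page indices.
    public static List<int> ToZeroBasedIndices(string pageRange, int pageCount)
    {
        var indices = new List<int>();

        // null, empty, or "*" selects every page.
        if (string.IsNullOrWhiteSpace(pageRange) || pageRange.Trim() == "*")
        {
            for (int i = 0; i < pageCount; i++) indices.Add(i);
            return indices;
        }

        foreach (string token in pageRange.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries))
        {
            // A token is either a single page ("7") or a range ("9-12").
            string[] bounds = token.Split('-');
            if (bounds.Length > 2 || !int.TryParse(bounds[0].Trim(), out int first)) continue;

            int last = first;
            if (bounds.Length == 2 && !int.TryParse(bounds[1].Trim(), out last)) continue;

            // Clamp to the valid range so out-of-range pages are ignored.
            int start = Math.Max(first, 1);
            int end = Math.Min(last, pageCount);
            for (int page = start; page <= end; page++)
            {
                indices.Add(page - 1); // 1-based page number -> zero-based index
            }
        }

        return indices;
    }

    public static void Main()
    {
        // A 10-page document: pages 11 and 12 fall outside the range and are dropped.
        List<int> indices = ToZeroBasedIndices("1-5, 7, 9-12", pageCount: 10);
        Console.WriteLine(string.Join(", ", indices)); // prints "0, 1, 2, 3, 4, 6, 8, 9"
    }
}
```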
SetContent(ImageBuffer)
Sets the content for extraction from the specified image buffer.
public void SetContent(ImageBuffer content)
Parameters
content ImageBuffer
The ImageBuffer representing the image to process; cannot be null.
Examples
// Load an image into an ImageBuffer
ImageBuffer buffer = ImageLoader.Load("path/to/image.png");
// Set the image content to extract data from
textExtraction.SetContent(buffer);
Exceptions
- ArgumentNullException
Thrown if content is null.
SetContent(IEnumerable<Attachment>)
Adds multiple attachments to be processed for data extraction.
public void SetContent(IEnumerable<Attachment> content)
Parameters
content IEnumerable<Attachment>
An enumerable of Attachment objects to process. Each attachment can be an image, PDF document, HTML file, Microsoft Office document, or any other supported format.
Examples
// Create attachments from multiple image files
var imageAttachments = new List<Attachment>
{
    new Attachment("path/to/image1.png"),
    new Attachment("path/to/image2.jpg")
};
// Add all images for extraction
textExtraction.SetContent(imageAttachments);
// Combine different file types (a separate variable name avoids redeclaring the first list)
var mixedAttachments = new List<Attachment>
{
    new Attachment("path/to/photo.png"),
    new Attachment("path/to/invoice.pdf"),
    new Attachment("path/to/report.docx")
};
// Add all content for extraction
textExtraction.SetContent(mixedAttachments);
Remarks
All pages of each attachment are included for processing. Calling this method multiple times accumulates pages rather than replacing previously added content.
Exceptions
- ArgumentNullException
Thrown if content is null.
- ArgumentException
Thrown if content contains no items.