Class TextExtractionElement
- Namespace
- LMKit.Extraction
- Assembly
- LM-Kit.NET.dll
Represents an element used in text extraction processes, encapsulating metadata such as name, type, description, and optional nested elements for complex data structures.
public sealed class TextExtractionElement
- Inheritance
- object
TextExtractionElement
Examples
Example: Simple flat schema
using LMKit.Extraction;
using LMKit.Data;
using System;
// A flat schema with primitive types
var invoiceNo = new TextExtractionElement("InvoiceNumber", ElementType.String, "Human-readable invoice ID", isRequired: true);
var issueDate = new TextExtractionElement("IssueDate", ElementType.Date, "Date the invoice was issued");
var total = new TextExtractionElement("Total", ElementType.Double, "Total amount including tax");
var isPaid = new TextExtractionElement("IsPaid", ElementType.Bool, "Whether payment has been received");
Example: Nested object structure
using LMKit.Extraction;
using LMKit.Data;
using System;
// A nested object for vendor information
var vendor = new TextExtractionElement(
    "Vendor",
    new[]
    {
        new TextExtractionElement("Name", ElementType.String, "Company name"),
        new TextExtractionElement("Country", ElementType.String, "Country of origin"),
        new TextExtractionElement("TaxId", ElementType.String, "Tax identification number")
    },
    isArray: false,
    description: "Supplier information");
Example: Array of objects
using LMKit.Extraction;
using LMKit.Data;
using System;
// An array of line items
var lineItems = new TextExtractionElement(
    "LineItems",
    new[]
    {
        new TextExtractionElement("Sku", ElementType.String, "Product SKU"),
        new TextExtractionElement("Description", ElementType.String, "Item description"),
        new TextExtractionElement("Quantity", ElementType.Integer, "Number of units"),
        new TextExtractionElement("UnitPrice", ElementType.Double, "Price per unit")
    },
    isArray: true,
    description: "List of items on the invoice");
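The flat, nested-object, and array-of-objects examples can be combined under a single root element. The following is a minimal sketch, not a documented pattern: it assumes the nested-element constructor accepts children that are themselves objects or arrays (the InnerElements property suggests this, but this page does not state any nesting limits). The root name "Invoice" and its description are illustrative.

```csharp
using LMKit.Extraction;
using LMKit.Data;

// Leaf field built with the simple-element constructor.
var invoiceNo = new TextExtractionElement(
    "InvoiceNumber", ElementType.String, "Human-readable invoice ID", isRequired: true);

// Nested object: isArray is false, so this models a single vendor.
var vendor = new TextExtractionElement(
    "Vendor",
    new[]
    {
        new TextExtractionElement("Name", ElementType.String, "Company name"),
        new TextExtractionElement("TaxId", ElementType.String, "Tax identification number")
    },
    isArray: false,
    description: "Supplier information");

// Array of objects: isArray is true, so this models a list of line items.
var lineItems = new TextExtractionElement(
    "LineItems",
    new[]
    {
        new TextExtractionElement("Sku", ElementType.String, "Product SKU"),
        new TextExtractionElement("Quantity", ElementType.Integer, "Number of units")
    },
    isArray: true,
    description: "List of items on the invoice");

// Assumption: objects and arrays may be passed as children of another object.
var invoice = new TextExtractionElement(
    "Invoice",
    new[] { invoiceNo, vendor, lineItems },
    isArray: false,
    description: "A complete invoice record");
```

Grouping related fields under one root keeps the schema in a single tree, which can then be inspected through InnerElements.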
Example: Restricting a field to enum-like allowed values
using LMKit.Extraction;
using LMKit.Data;
using System;
using System.Collections.Generic;
// Element with constrained allowed values
var priority = new TextExtractionElement("Priority", ElementType.String, "Task priority level");
priority.Format.AllowedValues = new List<string> { "Low", "Medium", "High", "Critical" };
var status = new TextExtractionElement("Status", ElementType.String, "Current status");
status.Format.AllowedValues = new List<string> { "Pending", "InProgress", "Completed", "Cancelled" };
Remarks
The TextExtractionElement class is designed to model elements within a text extraction schema. It supports simple data types as well as complex types with nested elements, enabling the representation of hierarchical data models.
Supported Element Types
- String - Text values
- Integer - Whole numbers
- Double - Floating-point numbers
- Bool - True/false values
- Date - Date values, including date and time values
- StringArray - Arrays of strings
- IntegerArray - Arrays of integers
- DoubleArray - Arrays of floating-point numbers
- Object - Nested object with child elements
- ObjectArray - Array of nested objects
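The primitive array types in this list are not covered by the examples at the top of the page. The sketch below assumes they are passed to the simple-element constructor in the same way as scalar types. The field names and descriptions are illustrative only.

```csharp
using LMKit.Extraction;
using LMKit.Data;

// Assumption: array element types use the same (name, type, description)
// constructor as scalar types such as ElementType.String.
var tags = new TextExtractionElement(
    "Tags", ElementType.StringArray, "Keywords describing the document");

var pageNumbers = new TextExtractionElement(
    "PageNumbers", ElementType.IntegerArray, "Pages on which the item is mentioned");

var amounts = new TextExtractionElement(
    "Amounts", ElementType.DoubleArray, "All monetary amounts found in the text");
```

For arrays whose items have several fields, the nested-element constructor with isArray: true is the form shown in the "Array of objects" example.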
Key Properties
- Name - Unique identifier for the element
- ElementType - Data type of the element
- Description - Helps the LLM understand what to extract
- IsRequired - Whether the element must be present
- Format - Formatting options (case, length, etc.), including AllowedValues to constrain values to a specific set
- InnerElements - Child elements for nested objects
Constructors
- TextExtractionElement(string, ElementType, string, bool)
Initializes a new instance of the TextExtractionElement class representing a simple data element.
- TextExtractionElement(string, IEnumerable<TextExtractionElement>, bool, string, bool)
Initializes a new instance of the TextExtractionElement class representing a complex object or an array with nested elements.
Properties
- Description
Gets the descriptive text associated with the extraction element.
- DetectedEntityKind
Gets the semantic entity kind that was automatically detected from the element name and type (for example, EmailAddress, Iban, or PhoneNumber).
- ElementType
Gets the data type of the extraction element.
- Format
Gets the format settings applied to this extraction element.
- InnerElements
Gets a read-only list of nested TextExtractionElement instances if this element contains inner elements.
- IsArray
Gets a value indicating whether this element represents an array.
- IsArrayOfObject
Gets a value indicating whether this element represents an array of objects.
- IsObject
Gets a value indicating whether this element represents a complex object with nested elements.
- IsRequired
Gets or sets a value indicating whether this element must be present in the extracted output.
- Name
Gets the original name of the extraction element.
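The properties above are enough to inspect a schema after it has been built. The sketch below walks an element tree and prints an outline. PrintOutline is a hypothetical helper, not part of the library, and the null check on InnerElements is defensive, since this page does not say whether leaf elements return null or an empty list.

```csharp
using System;
using LMKit.Extraction;
using LMKit.Data;

var address = new TextExtractionElement(
    "Address",
    new[]
    {
        new TextExtractionElement("City", ElementType.String, "City name"),
        new TextExtractionElement("PostalCode", ElementType.String, "Postal or ZIP code")
    },
    isArray: false,
    description: "Postal address");

var customer = new TextExtractionElement(
    "Customer",
    new[]
    {
        new TextExtractionElement("Email", ElementType.String, "Contact email", isRequired: true),
        address
    },
    isArray: false,
    description: "Customer details");

// IsRequired is documented as get/set, so it can be changed after construction.
customer.IsRequired = true;

PrintOutline(customer, 0);

// Hypothetical helper: prints one line per element, indented by depth.
static void PrintOutline(TextExtractionElement element, int depth)
{
    string indent = new string(' ', depth * 2);
    string required = element.IsRequired ? " (required)" : "";
    Console.WriteLine($"{indent}{element.Name}: {element.ElementType}{required}");

    // Defensive check: leaf elements may expose null or an empty list.
    if (element.InnerElements == null)
    {
        return;
    }

    foreach (var child in element.InnerElements)
    {
        PrintOutline(child, depth + 1);
    }
}
```

A walker like this is a quick way to confirm that nesting and required flags match the intended schema.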