Class TextExtractionElement

Namespace: LMKit.Extraction
Assembly: LM-Kit.NET.dll

Represents an element used in text extraction processes, encapsulating metadata such as name, type, description, and optional nested elements for complex data structures.

public sealed class TextExtractionElement
Inheritance
TextExtractionElement
Inherited Members

Examples

Example: Simple flat schema

using LMKit.Extraction;
using LMKit.Data;
using System;

// A flat schema with primitive types
var invoiceNo = new TextExtractionElement("InvoiceNumber", ElementType.String, "Human-readable invoice ID", isRequired: true);
var issueDate = new TextExtractionElement("IssueDate", ElementType.Date, "Date the invoice was issued");
var total = new TextExtractionElement("Total", ElementType.Double, "Total amount including tax");
var isPaid = new TextExtractionElement("IsPaid", ElementType.Bool, "Whether payment has been received");

Example: Nested object structure

using LMKit.Extraction;
using LMKit.Data;
using System;

// A nested object for vendor information
var vendor = new TextExtractionElement(
    "Vendor",
    new[]
    {
        new TextExtractionElement("Name", ElementType.String, "Company name"),
        new TextExtractionElement("Country", ElementType.String, "Country of origin"),
        new TextExtractionElement("TaxId", ElementType.String, "Tax identification number")
    },
    isArray: false,
    description: "Supplier information");

Example: Array of objects

using LMKit.Extraction;
using LMKit.Data;
using System;

// An array of line items
var lineItems = new TextExtractionElement(
    "LineItems",
    new[]
    {
        new TextExtractionElement("Sku", ElementType.String, "Product SKU"),
        new TextExtractionElement("Description", ElementType.String, "Item description"),
        new TextExtractionElement("Quantity", ElementType.Integer, "Number of units"),
        new TextExtractionElement("UnitPrice", ElementType.Double, "Price per unit")
    },
    isArray: true,
    description: "List of items on the invoice");

Example: Restricting an element to enumerated values

using LMKit.Extraction;
using LMKit.Data;
using System;
using System.Collections.Generic;

// Element with constrained allowed values
var priority = new TextExtractionElement("Priority", ElementType.String, "Task priority level");
priority.Format.AllowedValues = new List<string> { "Low", "Medium", "High", "Critical" };

var status = new TextExtractionElement("Status", ElementType.String, "Current status");
status.Format.AllowedValues = new List<string> { "Pending", "InProgress", "Completed", "Cancelled" };

Remarks

The TextExtractionElement class models a single element of a text extraction schema. An element is either a simple value with a primitive type, or a complex object or array that contains nested elements, so a tree of elements can describe a hierarchical data model.
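To make the hierarchical aspect concrete, the following minimal sketch builds a small nested schema with the two constructors listed on this page and prints it as an indented tree. PrintSchema is a hypothetical helper written for this illustration, not part of LM-Kit.NET, and the null check assumes that InnerElements may be null for simple elements.

```csharp
using System;
using LMKit.Data;
using LMKit.Extraction;

// Build a two-level schema: an Invoice object that contains a nested Vendor object.
var invoice = new TextExtractionElement(
    "Invoice",
    new[]
    {
        new TextExtractionElement("InvoiceNumber", ElementType.String, "Human-readable invoice ID", isRequired: true),
        new TextExtractionElement(
            "Vendor",
            new[]
            {
                new TextExtractionElement("Name", ElementType.String, "Company name"),
                new TextExtractionElement("Country", ElementType.String, "Country of origin")
            },
            isArray: false,
            description: "Supplier information")
    },
    isArray: false,
    description: "Top-level invoice object");

PrintSchema(invoice, 0);

// Hypothetical helper (not part of LM-Kit.NET): prints one line per element,
// indented by nesting depth, then recurses into any nested elements.
static void PrintSchema(TextExtractionElement element, int depth)
{
    string indent = new string(' ', depth * 2);
    Console.WriteLine($"{indent}{element.Name} ({element.ElementType})");

    // Assumption: InnerElements may be null (or empty) for simple elements.
    if (element.InnerElements != null)
    {
        foreach (var child in element.InnerElements)
        {
            PrintSchema(child, depth + 1);
        }
    }
}
```

The same traversal pattern can be used to validate a schema or to generate documentation from it before running an extraction.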

Supported Element Types

Key Properties

Constructors

TextExtractionElement(string, ElementType, string, bool)

Initializes a new instance of the TextExtractionElement class representing a simple data element.

TextExtractionElement(string, IEnumerable<TextExtractionElement>, bool, string, bool)

Initializes a new instance of the TextExtractionElement class representing a complex object or an array with nested elements.

Properties

Description

Gets the descriptive text associated with the extraction element.

DetectedEntityKind

Gets the semantic entity kind that was automatically detected from the element name and type (for example, EmailAddress, Iban, or PhoneNumber).
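As a rough illustration, the detected kind can be inspected right after constructing an element. The element name and description below are made up for this sketch, and which kind (if any) is detected for a given name is not specified on this page, so no output is shown.

```csharp
using System;
using LMKit.Data;
using LMKit.Extraction;

// Illustrative element; whether the name "Email" maps to a particular
// entity kind is an assumption to verify by running the code.
var email = new TextExtractionElement("Email", ElementType.String, "Contact email address");

// Print whatever the library detected for this element.
Console.WriteLine(email.DetectedEntityKind);
```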

ElementType

Gets the data type of the extraction element.

Format

Gets the format settings applied to this extraction element.

InnerElements

Gets a read-only list of nested TextExtractionElement instances if this element contains inner elements.

IsArray

Gets a value indicating whether this element represents an array.

IsArrayOfObject

Gets a value indicating whether this element represents an array of objects.

IsObject

Gets a value indicating whether this element represents a complex object with nested elements.
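The three shape flags (IsArray, IsArrayOfObject, IsObject) can be compared side by side. The sketch below constructs a simple element, a nested object, and an array of objects, then prints the flags for each. The exact combination reported for each shape is best confirmed by running it, so no expected output is given.

```csharp
using System;
using LMKit.Data;
using LMKit.Extraction;

// A simple value, a nested object, and an array of objects.
var total = new TextExtractionElement("Total", ElementType.Double, "Total amount including tax");

var vendor = new TextExtractionElement(
    "Vendor",
    new[] { new TextExtractionElement("Name", ElementType.String, "Company name") },
    isArray: false,
    description: "Supplier information");

var lineItems = new TextExtractionElement(
    "LineItems",
    new[] { new TextExtractionElement("Sku", ElementType.String, "Product SKU") },
    isArray: true,
    description: "List of items on the invoice");

// Print the shape flags for each element.
foreach (var element in new[] { total, vendor, lineItems })
{
    Console.WriteLine(
        $"{element.Name}: IsObject={element.IsObject}, IsArray={element.IsArray}, IsArrayOfObject={element.IsArrayOfObject}");
}
```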

IsRequired

Gets or sets a value indicating whether this element must be present in the extracted output.
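Because IsRequired has a setter, it can be changed after construction as well as passed through the isRequired constructor parameter. A minimal sketch, using an illustrative DueDate element:

```csharp
using System;
using LMKit.Data;
using LMKit.Extraction;

// Created without the isRequired argument.
var dueDate = new TextExtractionElement("DueDate", ElementType.Date, "Payment due date");

// Mark the element as mandatory after the fact.
dueDate.IsRequired = true;

Console.WriteLine($"{dueDate.Name} required: {dueDate.IsRequired}");
```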

Name

Gets the original name of the extraction element.