Method ImportText

Namespace
LMKit.Retrieval
Assembly
LM-Kit.NET.dll

ImportText(string, TextChunking, string, string, CancellationToken)

Imports a text string into a DataSource, creating a new section.

public DataSource ImportText(string data, TextChunking textChunking, string dataSourceIdentifier, string sectionIdentifier, CancellationToken cancellationToken = default)

Parameters

data string

The text to import.

textChunking TextChunking

Specifies how to split the text into chunks.

dataSourceIdentifier string

The unique identifier for the target DataSource. If the identifier exists, the text is added as a new section; otherwise, a new data source is created.

sectionIdentifier string

The identifier for the new section.

cancellationToken CancellationToken

A token that can be used to cancel the import. Defaults to CancellationToken.None.

Returns

DataSource

The DataSource containing the imported text.

Examples

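// Example: Import a string into a data source.
// The LMKit.Data and LMKit.Model namespaces are assumed here; adjust them to your LM-Kit.NET version.
using System;
using LMKit.Data;
using LMKit.Model;
using LMKit.Retrieval;
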
LM embeddingModel = LM.LoadFromModelID("nomic-embed-text");
RagEngine ragEngine = new RagEngine(embeddingModel);
string textData = "Lorem ipsum dolor sit amet, consectetur adipiscing elit.";
DataSource dataSource = ragEngine.ImportText(
    data: textData,
    textChunking: new TextChunking() { MaxChunkSize = 200 },
    dataSourceIdentifier: "latinData",
    sectionIdentifier: "introduction"
);
Console.WriteLine($"Data imported into source: {dataSource.Identifier}");

Exceptions

ArgumentNullException

Thrown if data or dataSourceIdentifier is null or empty.

OperationCanceledException

Thrown if the operation is canceled.
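
One way a caller might bound the import time is to pass a token from a CancellationTokenSource that cancels itself after a timeout. The sketch below is illustrative only: ImportWithTimeout is a hypothetical helper, not part of LM-Kit.NET, and the usage comment assumes the ragEngine and textData variables from the example above. How quickly ImportText observes the token is not stated in this reference.

```csharp
using System;
using System.Threading;

public static class ImportHelpers
{
    // Hypothetical helper (not part of LM-Kit.NET): runs an import delegate
    // with a token that is canceled automatically once the timeout elapses.
    // Returns null if the delegate throws OperationCanceledException.
    public static T ImportWithTimeout<T>(Func<CancellationToken, T> import, TimeSpan timeout)
        where T : class
    {
        using (var cts = new CancellationTokenSource(timeout))
        {
            try
            {
                return import(cts.Token);
            }
            catch (OperationCanceledException)
            {
                return null;
            }
        }
    }
}

// Possible usage with the first overload:
// DataSource ds = ImportHelpers.ImportWithTimeout(
//     token => ragEngine.ImportText(
//         textData,
//         new TextChunking() { MaxChunkSize = 200 },
//         "latinData",
//         "introduction",
//         token),
//     TimeSpan.FromSeconds(30));
```

Returning null on cancellation is a design choice for this sketch; rethrowing the exception may suit callers that need to distinguish a timeout from other outcomes.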

ImportText(string, TextChunking, string, string, MetadataCollection, CancellationToken)

Imports a text string into a DataSource, creating a new section and attaching additional metadata to it.

public DataSource ImportText(string data, TextChunking textChunking, string dataSourceIdentifier, string sectionIdentifier, MetadataCollection additionalMetadata, CancellationToken cancellationToken = default)

Parameters

data string

The text to import.

textChunking TextChunking

Specifies how to split the text into chunks.

dataSourceIdentifier string

The unique identifier for the target DataSource. If the identifier exists, the text is added as a new section; otherwise, a new data source is created.

sectionIdentifier string

The identifier for the new section.

additionalMetadata MetadataCollection

Metadata to associate with the imported section.

cancellationToken CancellationToken

A token that can be used to cancel the import. Defaults to CancellationToken.None.

Returns

DataSource

The DataSource containing the imported text.

ImportText(IList<string>, TextChunking, string, IList<string>, IList<MetadataCollection>, CancellationToken)

Imports a list of text strings into a DataSource, creating a new section for each string.

public DataSource ImportText(IList<string> data, TextChunking textChunking, string dataSourceIdentifier, IList<string> sectionIdentifiers, IList<MetadataCollection> metadataCollections = null, CancellationToken cancellationToken = default)

Parameters

data IList<string>

A list of text strings to import, each representing a page.

textChunking TextChunking

Specifies how to split each text into chunks.

dataSourceIdentifier string

The unique identifier for the target DataSource. If the data source exists, new sections are added; otherwise, a new one is created.

sectionIdentifiers IList<string>

A list of unique identifiers for the new sections. The number of identifiers must match the number of text strings.

metadataCollections IList<MetadataCollection>

An optional list of metadata collections to associate with each imported section.

cancellationToken CancellationToken

A token that can be used to cancel the import. Defaults to CancellationToken.None.

Returns

DataSource

The updated or newly created DataSource with the imported text pages.

Examples

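// Example: Import a list of text pages, one section per page.
// The LMKit.Data and LMKit.Model namespaces are assumed here; adjust them to your LM-Kit.NET version.
using System;
using System.Collections.Generic;
using LMKit.Data;
using LMKit.Model;
using LMKit.Retrieval;
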
LM embeddingModel = LM.LoadFromModelID("nomic-embed-text");
RagEngine ragEngine = new RagEngine(embeddingModel);
var pages = new List<string>
{
    "Page 1: Introduction to RAG systems.",
    "Page 2: Advanced RAG techniques."
};
var sectionIds = new List<string> { "chapter1", "chapter2" };
DataSource ds = ragEngine.ImportText(
    data: pages,
    textChunking: new TextChunking() { MaxChunkSize = 300 },
    dataSourceIdentifier: "ragBook",
    sectionIdentifiers: sectionIds
);
Console.WriteLine($"Imported {pages.Count} pages into DataSource: {ds.Identifier}");

Exceptions

ArgumentNullException

Thrown if data or dataSourceIdentifier is null or empty.

ArgumentOutOfRangeException

Thrown if the count of sectionIdentifiers does not match the number of text strings.
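
Because the list overload requires one section identifier per text string, a caller may want to check the inputs before starting an import. The following is a minimal sketch of such a pre-check, mirroring the rules documented above. ImportTextGuards and ValidateParallelLists are hypothetical names, not part of LM-Kit.NET, and the duplicate check with ArgumentException is an assumption based on the identifiers being described as unique.

```csharp
using System;
using System.Collections.Generic;

public static class ImportTextGuards
{
    // Hypothetical pre-check (not part of LM-Kit.NET): catches mismatched
    // inputs before ImportText is called.
    public static void ValidateParallelLists(IList<string> data, IList<string> sectionIdentifiers)
    {
        if (data == null || data.Count == 0)
            throw new ArgumentNullException(nameof(data));
        if (sectionIdentifiers == null)
            throw new ArgumentNullException(nameof(sectionIdentifiers));

        // Documented rule: one identifier per text string.
        if (sectionIdentifiers.Count != data.Count)
            throw new ArgumentOutOfRangeException(
                nameof(sectionIdentifiers),
                $"Expected {data.Count} section identifiers but got {sectionIdentifiers.Count}.");

        // Assumption: identifiers must be unique within a single call.
        var seen = new HashSet<string>();
        foreach (string id in sectionIdentifiers)
        {
            if (!seen.Add(id))
                throw new ArgumentException($"Duplicate section identifier: {id}", nameof(sectionIdentifiers));
        }
    }
}
```

Calling ImportTextGuards.ValidateParallelLists(pages, sectionIds) just before ragEngine.ImportText would surface a count mismatch with a descriptive message before any import work begins.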