Method ImportText

Namespace
LMKit.Retrieval
Assembly
LM-Kit.NET.dll

ImportText(string, TextChunking, string, string, CancellationToken)

Imports text into the specified DataSource, creating a new Section entry or updating an existing one.

public DataSource ImportText(string data, TextChunking textChunking, string dataSourceIdentifier, string sectionIdentifier = "default", CancellationToken cancellationToken = default)

Parameters

data string

The textual data to be imported.

textChunking TextChunking

A TextChunking object specifying how text is chunked.

dataSourceIdentifier string

The unique identifier for the DataSource. If it matches an existing DataSource, the data is added as a new section. Otherwise, a new DataSource is created.

sectionIdentifier string

Optional. The new Section's identifier. Defaults to 'default'.

cancellationToken CancellationToken

Optional. A CancellationToken to cancel the operation.

Returns

DataSource

The DataSource object into which the data has been imported.

Examples

using LMKit.Data;
using LMKit.Model;
using LMKit.Retrieval;
using System;

class Example
{
    static void Main()
    {
        // Placeholder URI: point this at an actual embedding model.
        LM embeddingModel = new LM(new Uri("https://example-model-uri.com"));
        RagEngine ragEngine = new RagEngine(embeddingModel);

        string textData = "Lorem ipsum dolor sit amet, consectetur adipiscing elit.";

        // Import the text into the "latinData" source under the "introduction" section.
        DataSource dataSource = ragEngine.ImportText(
            data: textData,
            textChunking: new TextChunking() { MaxChunkSize = 200 },
            dataSourceIdentifier: "latinData",
            sectionIdentifier: "introduction"
        );

        Console.WriteLine($"Data imported into source: {dataSource.Identifier}");
    }
}

Exceptions

ArgumentNullException

Thrown if 'data' or 'dataSourceIdentifier' is null or empty.

OperationCanceledException

Thrown if the operation is canceled.
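
The cancellationToken parameter and the OperationCanceledException described above can be combined to bound how long an import may run. The following is a minimal sketch, not an official sample: the ImportWithTimeout class, the TryImport helper, and the choice to return null on cancellation are assumptions made for illustration. Only ImportText, TextChunking, and the standard .NET CancellationTokenSource come from the documentation or the framework.

```csharp
using LMKit.Data;
using LMKit.Retrieval;
using System;
using System.Threading;

static class ImportWithTimeout
{
    // Hypothetical helper (not part of LM-Kit): imports text and gives up
    // once the supplied timeout elapses.
    public static DataSource TryImport(RagEngine ragEngine, string text, TimeSpan timeout)
    {
        // The token source requests cancellation automatically after the timeout.
        using var cts = new CancellationTokenSource(timeout);

        try
        {
            return ragEngine.ImportText(
                data: text,
                textChunking: new TextChunking() { MaxChunkSize = 200 },
                dataSourceIdentifier: "latinData",
                sectionIdentifier: "introduction",
                cancellationToken: cts.Token
            );
        }
        catch (OperationCanceledException)
        {
            // Assumption: the caller treats a canceled import as "no result".
            Console.WriteLine("Import was canceled before completion.");
            return null;
        }
    }
}
```

A caller could invoke it as ImportWithTimeout.TryImport(ragEngine, textData, TimeSpan.FromSeconds(30)) and check the result for null before using it.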

ImportText(IList<string>, TextChunking, string, IList<string>, CancellationToken)

Imports a list of text strings into the specified DataSource, creating a new Section entry for each item.

public DataSource ImportText(IList<string> data, TextChunking textChunking, string dataSourceIdentifier, IList<string> sectionIdentifiers, CancellationToken cancellationToken = default)

Parameters

data IList<string>

A list of text strings to import. Each string can represent a "page" of data.

textChunking TextChunking

A TextChunking object specifying how text is chunked.

dataSourceIdentifier string

A unique identifier for the DataSource. If it matches an existing DataSource, new sections are added to it; otherwise, a new one is created.

sectionIdentifiers IList<string>

A list of identifiers for the new Sections. The length must match the length of 'data'.

cancellationToken CancellationToken

Optional. A CancellationToken to cancel the operation.

Returns

DataSource

The updated or newly created DataSource.

Examples

using LMKit.Data;
using LMKit.Model;
using LMKit.Retrieval;
using System;
using System.Collections.Generic;

class Example
{
    static void Main()
    {
        // Placeholder URI: point this at an actual embedding model.
        LM embeddingModel = new LM(new Uri("https://example-model-uri.com"));
        RagEngine ragEngine = new RagEngine(embeddingModel);

        var pages = new List<string>
        {
            "Page 1: Introduction to RAG systems.",
            "Page 2: Advanced RAG techniques."
        };
        // One section identifier per page, in the same order as the pages.
        var sectionIds = new List<string> { "chapter1", "chapter2" };

        // Import every page into the "ragBook" source, one section per page.
        DataSource ds = ragEngine.ImportText(
            data: pages,
            textChunking: new TextChunking() { MaxChunkSize = 300 },
            dataSourceIdentifier: "ragBook",
            sectionIdentifiers: sectionIds
        );

        Console.WriteLine($"Imported {pages.Count} pages into DataSource: {ds.Identifier}");
    }
}

Exceptions

ArgumentNullException

Thrown if 'data' or 'dataSourceIdentifier' is null or empty.

ArgumentOutOfRangeException

Thrown if 'sectionIdentifiers' has more identifiers than 'data' has pages, or if it has duplicates.

OperationCanceledException

Thrown if the operation is canceled.
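
Because a mismatch between 'data' and 'sectionIdentifiers', or a duplicate identifier, surfaces as an exception, it can help to check the inputs before calling ImportText. The sketch below is one possible pre-check, written under stated assumptions: the SectionIdentifierCheck class and its Validate method are hypothetical and not part of LM-Kit, the check requires equal lengths (the stricter reading of the parameter description), and identifiers are compared case-sensitively, which may differ from how the library compares them.

```csharp
using System;
using System.Collections.Generic;

static class SectionIdentifierCheck
{
    // Hypothetical pre-check (not part of LM-Kit). Returns null when the
    // inputs look consistent, otherwise a description of the first problem.
    public static string Validate(IList<string> data, IList<string> sectionIdentifiers)
    {
        if (data == null || data.Count == 0)
            return "data is null or empty";

        if (sectionIdentifiers == null)
            return "sectionIdentifiers is null";

        // Assumption: exactly one identifier is expected per item in data.
        if (sectionIdentifiers.Count != data.Count)
            return $"expected {data.Count} identifiers, got {sectionIdentifiers.Count}";

        // Assumption: identifiers are compared with ordinal (case-sensitive) rules.
        var seen = new HashSet<string>(StringComparer.Ordinal);
        foreach (string id in sectionIdentifiers)
        {
            if (!seen.Add(id))
                return $"duplicate identifier: {id}";
        }

        return null;
    }
}
```

Calling SectionIdentifierCheck.Validate(pages, sectionIds) before the import and logging any non-null result gives a readable message in place of a raw exception. The library's own validation still applies afterwards.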