Method ImportText
ImportText(string, TextChunking, string, string, CancellationToken)
Imports text data into a specified DataSource object, creating a new Section entry or updating an existing one.
public DataSource ImportText(string data, TextChunking textChunking, string dataSourceIdentifier, string sectionIdentifier = "default", CancellationToken cancellationToken = default)
Parameters
data
string
The textual data to be imported.
textChunking
TextChunking
A TextChunking object specifying how text is chunked.
dataSourceIdentifier
string
The unique identifier for the DataSource. If it matches an existing DataSource, the data is added as a new section. Otherwise, a new DataSource is created.
sectionIdentifier
string
Optional. The new Section's identifier. Defaults to "default".
cancellationToken
CancellationToken
Optional. A CancellationToken to cancel the operation.
Returns
- DataSource
The DataSource object into which the data has been imported.
Examples
using LMKit.Data;
using LMKit.Model;
using LMKit.Retrieval;
using System;

class Example
{
    static void Main()
    {
        // Load the embedding model (placeholder URI) and create the RAG engine.
        LM embeddingModel = new LM(new Uri("https://example-model-uri.com"));
        RagEngine ragEngine = new RagEngine(embeddingModel);

        string textData = "Lorem ipsum dolor sit amet, consectetur adipiscing elit.";

        // Import the text into the "latinData" data source as the "introduction" section.
        DataSource dataSource = ragEngine.ImportText(
            data: textData,
            textChunking: new TextChunking() { MaxChunkSize = 200 },
            dataSourceIdentifier: "latinData",
            sectionIdentifier: "introduction"
        );

        Console.WriteLine($"Data imported into source: {dataSource.Identifier}");
    }
}
Exceptions
- ArgumentNullException
Thrown if 'data' or 'dataSourceIdentifier' is null or empty.
- OperationCanceledException
Thrown if the operation is canceled.
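The cancellationToken parameter and the OperationCanceledException listed above suggest a common pattern: bound the import with a timeout and treat cancellation as a normal outcome. The following is a minimal sketch using only standard .NET types. ImportHelpers and TryRunWithTimeout are hypothetical names introduced for illustration and are not part of the library; the sketch also assumes the import honors the token and reports cancellation through OperationCanceledException, as documented.

```csharp
using System;
using System.Threading;

static class ImportHelpers
{
    // Hypothetical helper (not part of the library): runs an import delegate
    // with a token that is canceled automatically once 'timeout' elapses.
    // Returns false if the delegate reports cancellation, true otherwise.
    public static bool TryRunWithTimeout<T>(
        Func<CancellationToken, T> import,
        TimeSpan timeout,
        out T result)
    {
        using var cts = new CancellationTokenSource(timeout);
        try
        {
            result = import(cts.Token);
            return true;
        }
        catch (OperationCanceledException)
        {
            // Cancellation is treated as an expected outcome, not a failure.
            result = default;
            return false;
        }
    }
}
```

With the ragEngine and textData variables from the example above, a call could look like ImportHelpers.TryRunWithTimeout(ct => ragEngine.ImportText(textData, new TextChunking() { MaxChunkSize = 200 }, "latinData", "introduction", ct), TimeSpan.FromSeconds(30), out DataSource dataSource). The 30-second timeout is an arbitrary value chosen for illustration.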
ImportText(IList<string>, TextChunking, string, IList<string>, CancellationToken)
Imports a list of text strings into a specified DataSource object, creating a new Section entry for each item.
public DataSource ImportText(IList<string> data, TextChunking textChunking, string dataSourceIdentifier, IList<string> sectionIdentifiers, CancellationToken cancellationToken = default)
Parameters
data
IList<string>
A list of text strings to be imported. Each string can represent a "page" of data.
textChunking
TextChunking
A TextChunking object specifying how text is chunked.
dataSourceIdentifier
string
A unique identifier for the DataSource. If it matches an existing DataSource, new sections are added to it; otherwise, a new one is created.
sectionIdentifiers
IList<string>
A list of identifiers for the new Sections. The length must match the length of 'data'.
cancellationToken
CancellationToken
Optional. A CancellationToken to cancel the operation.
Returns
- DataSource
The updated or newly created DataSource.
Examples
using LMKit.Data;
using LMKit.Model;
using LMKit.Retrieval;
using System;
using System.Collections.Generic;

class Example
{
    static void Main()
    {
        // Load the embedding model (placeholder URI) and create the RAG engine.
        LM embeddingModel = new LM(new Uri("https://example-model-uri.com"));
        RagEngine ragEngine = new RagEngine(embeddingModel);

        // Each string is imported as its own section.
        var pages = new List<string>
        {
            "Page 1: Introduction to RAG systems.",
            "Page 2: Advanced RAG techniques."
        };

        // One identifier per page, in the same order as 'pages'.
        var sectionIds = new List<string> { "chapter1", "chapter2" };

        DataSource ds = ragEngine.ImportText(
            data: pages,
            textChunking: new TextChunking() { MaxChunkSize = 300 },
            dataSourceIdentifier: "ragBook",
            sectionIdentifiers: sectionIds
        );

        Console.WriteLine($"Imported {pages.Count} pages into DataSource: {ds.Identifier}");
    }
}
Exceptions
- ArgumentNullException
Thrown if 'data' or 'dataSourceIdentifier' is null or empty.
- ArgumentOutOfRangeException
Thrown if 'sectionIdentifiers' contains more identifiers than 'data' contains items, or if it contains duplicate identifiers.
- OperationCanceledException
Thrown if the operation is canceled.
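Because this overload rejects mismatched or duplicate section identifiers, it can help to check the inputs before starting an import. The sketch below mirrors the documented constraints using only standard .NET types; it is not the library's own validation. ImportValidation and ValidateSections are hypothetical names, and the case-sensitive (ordinal) comparison of identifiers is an assumption.

```csharp
using System;
using System.Collections.Generic;

static class ImportValidation
{
    // Hypothetical pre-check (not part of the library) that mirrors the
    // documented constraints on 'data' and 'sectionIdentifiers'.
    public static void ValidateSections(IList<string> data, IList<string> sectionIdentifiers)
    {
        if (data == null || data.Count == 0)
        {
            throw new ArgumentNullException(nameof(data), "No text to import.");
        }
        if (sectionIdentifiers == null)
        {
            throw new ArgumentNullException(nameof(sectionIdentifiers));
        }

        // The parameter description requires one identifier per data item.
        if (sectionIdentifiers.Count != data.Count)
        {
            throw new ArgumentOutOfRangeException(
                nameof(sectionIdentifiers),
                $"Expected {data.Count} identifiers but got {sectionIdentifiers.Count}.");
        }

        // Assumption: identifiers are compared case-sensitively.
        var seen = new HashSet<string>(StringComparer.Ordinal);
        foreach (string id in sectionIdentifiers)
        {
            if (!seen.Add(id))
            {
                throw new ArgumentOutOfRangeException(
                    nameof(sectionIdentifiers),
                    $"Duplicate section identifier: {id}");
            }
        }
    }
}
```

In the example above, calling ImportValidation.ValidateSections(pages, sectionIds) before ragEngine.ImportText would surface a mismatch early, before any text is chunked or embedded.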