Method ImportTextFromFile

Namespace
LMKit.Retrieval
Assembly
LM-Kit.NET.dll

ImportTextFromFile(string, Encoding, TextChunking, string, string, CancellationToken)

Imports text from a file into a DataSource by creating a new section.

public DataSource ImportTextFromFile(string path, Encoding encoding, TextChunking textChunking, string dataSourceIdentifier, string sectionIdentifier, CancellationToken cancellationToken = default)

Parameters

path string

The path to the text file to import.

encoding Encoding

The character encoding used in the file.

textChunking TextChunking

Specifies how to split the text into chunks.

dataSourceIdentifier string

The unique identifier for the target DataSource. If a matching data source exists, the text is added as a new section; otherwise, a new data source is created.

sectionIdentifier string

The identifier for the new section.

cancellationToken CancellationToken

An optional token that can be used to cancel the import.

Returns

DataSource

The DataSource containing the imported text.

Examples

// Example: Import a text file into a data source as a new section.
string filePath = "sample.txt";

// Load an embedding model and create the RAG engine from it.
LM embeddingModel = LM.LoadFromModelID("nomic-embed-text");
RagEngine ragEngine = new RagEngine(embeddingModel);

// Import the file. "myDataSource" is created if no matching data source exists.
DataSource dataSource = ragEngine.ImportTextFromFile(
    path: filePath,
    encoding: Encoding.UTF8,
    textChunking: new TextChunking() { MaxChunkSize = 500 },
    dataSourceIdentifier: "myDataSource",
    sectionIdentifier: "intro"
);
Console.WriteLine($"Imported text into DataSource: {dataSource.Identifier}");
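The dataSourceIdentifier description implies that several files can be collected under one data source by repeating the call with the same identifier and a different sectionIdentifier. The following is a minimal sketch of that pattern, not a definitive implementation. The file names intro.txt and usage.txt and the identifier "manual" are hypothetical, and the using directives for LM-Kit namespaces other than LMKit.Retrieval are assumptions.

```csharp
using System;
using System.Text;
using LMKit.Data;      // assumption: namespace that contains DataSource
using LMKit.Model;     // assumption: namespace that contains LM
using LMKit.Retrieval; // RagEngine (namespace documented on this page)
// Assumption: TextChunking is reachable through one of the namespaces above.

LM embeddingModel = LM.LoadFromModelID("nomic-embed-text");
RagEngine ragEngine = new RagEngine(embeddingModel);

// First call: no data source named "manual" exists yet, so one is created.
DataSource manual = ragEngine.ImportTextFromFile(
    path: "intro.txt", // hypothetical file
    encoding: Encoding.UTF8,
    textChunking: new TextChunking() { MaxChunkSize = 500 },
    dataSourceIdentifier: "manual",
    sectionIdentifier: "intro"
);

// Second call: "manual" now exists, so the text is added as a new section.
manual = ragEngine.ImportTextFromFile(
    path: "usage.txt", // hypothetical file
    encoding: Encoding.UTF8,
    textChunking: new TextChunking() { MaxChunkSize = 500 },
    dataSourceIdentifier: "manual",
    sectionIdentifier: "usage"
);

Console.WriteLine($"Imported two sections into DataSource: {manual.Identifier}");
```

Each file keeps its own section identifier while sharing one data source.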

Exceptions

ArgumentNullException

Thrown if path or dataSourceIdentifier is null or empty.

OperationCanceledException

Thrown if the operation is canceled.
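The two exceptions above can be handled as in the following sketch, which also shows one way to supply the cancellationToken parameter. It uses the standard .NET CancellationTokenSource with a timeout. The 30-second limit is an arbitrary illustrative value, and the using directives for LM-Kit namespaces other than LMKit.Retrieval are assumptions.

```csharp
using System;
using System.Text;
using System.Threading;
using LMKit.Data;      // assumption: namespace that contains DataSource
using LMKit.Model;     // assumption: namespace that contains LM
using LMKit.Retrieval; // RagEngine (namespace documented on this page)
// Assumption: TextChunking is reachable through one of the namespaces above.

LM embeddingModel = LM.LoadFromModelID("nomic-embed-text");
RagEngine ragEngine = new RagEngine(embeddingModel);

// Request cancellation automatically after 30 seconds (arbitrary value).
using CancellationTokenSource cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));

try
{
    DataSource dataSource = ragEngine.ImportTextFromFile(
        path: "sample.txt",
        encoding: Encoding.UTF8,
        textChunking: new TextChunking() { MaxChunkSize = 500 },
        dataSourceIdentifier: "myDataSource",
        sectionIdentifier: "intro",
        cancellationToken: cts.Token
    );
    Console.WriteLine($"Imported text into DataSource: {dataSource.Identifier}");
}
catch (OperationCanceledException)
{
    // Documented above: thrown if the operation is canceled.
    Console.WriteLine("The import was canceled before it completed.");
}
catch (ArgumentNullException ex)
{
    // Documented above: thrown if path or dataSourceIdentifier is null or empty.
    Console.WriteLine($"Missing argument: {ex.ParamName}");
}
```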

ImportTextFromFile(string, Encoding, TextChunking, string, string, MetadataCollection, CancellationToken)

Imports text from a file into a DataSource by creating a new section, and attaches additional metadata to that section.

public DataSource ImportTextFromFile(string path, Encoding encoding, TextChunking textChunking, string dataSourceIdentifier, string sectionIdentifier, MetadataCollection additionalMetadata, CancellationToken cancellationToken = default)

Parameters

path string

The path to the text file to import.

encoding Encoding

The character encoding used in the file.

textChunking TextChunking

Specifies how to split the text into chunks.

dataSourceIdentifier string

The unique identifier for the target DataSource. If a matching data source exists, the text is added as a new section; otherwise, a new data source is created.

sectionIdentifier string

The identifier for the new section.

additionalMetadata MetadataCollection

Metadata to associate with the imported section.

cancellationToken CancellationToken

An optional token that can be used to cancel the import.

Returns

DataSource

The DataSource containing the imported text.
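A call to this overload might look like the sketch below. This page does not describe how a MetadataCollection is populated, so the sketch takes an existing collection as a parameter rather than guessing at that API. The ImportHelpers class and ImportWithMetadata method are hypothetical names introduced for illustration, and the using directive for LMKit.Data is an assumption.

```csharp
using System.Text;
using System.Threading;
using LMKit.Data;      // assumption: namespace that contains DataSource and MetadataCollection
using LMKit.Retrieval; // RagEngine (namespace documented on this page)
// Assumption: TextChunking is reachable through one of the namespaces above.

// Hypothetical helper, not part of LM-Kit.NET.
public static class ImportHelpers
{
    // Imports a file as section "intro" of "myDataSource" and attaches
    // the caller-supplied metadata to that section.
    public static DataSource ImportWithMetadata(
        RagEngine ragEngine,
        string path,
        MetadataCollection metadata,
        CancellationToken cancellationToken = default)
    {
        return ragEngine.ImportTextFromFile(
            path: path,
            encoding: Encoding.UTF8,
            textChunking: new TextChunking() { MaxChunkSize = 500 },
            dataSourceIdentifier: "myDataSource",
            sectionIdentifier: "intro",
            additionalMetadata: metadata,
            cancellationToken: cancellationToken
        );
    }
}
```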