Method LoadDocumentAsync

Namespace
LMKit.Retrieval
Assembly
LM-Kit.NET.dll

LoadDocumentAsync(string, DocumentMetadata, CancellationToken)

Asynchronously loads a PDF document from the specified file path.

public Task<DocumentIndexingResult> LoadDocumentAsync(string filePath, DocumentRag.DocumentMetadata documentMetadata = null, CancellationToken cancellationToken = default)

Parameters

filePath string

Path to the PDF file.

documentMetadata DocumentRag.DocumentMetadata

Optional metadata to associate with the document. If null, default metadata is created using the file name. Use this to specify a custom document name, reference URL, or additional metadata fields for source attribution in query results.

cancellationToken CancellationToken

Token to cancel the operation.

Returns

Task<DocumentIndexingResult>

A task containing a DocumentIndexingResult describing how the document was processed, including the indexing mode, page count, and token count.

Examples

// Load with custom metadata for source tracking.
// `chat` is an existing instance of the class that exposes LoadDocumentAsync.
var metadata = new DocumentMetadata("Q4 Financial Report")
{
    SourceUri = "https://intranet.example.com/docs/report.pdf",
    AdditionalMetadata = new MetadataCollection
    {
        { "author", "Finance Team" },
        { "confidentiality", "Internal" }
    }
};
var result = await chat.LoadDocumentAsync("report.pdf", metadata);

Exceptions

InvalidOperationException

Thrown if the conversation has already started.

FileNotFoundException

Thrown if the specified file does not exist.

ObjectDisposedException

Thrown if this instance has been disposed.
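
The exceptions listed above, together with the cancellationToken parameter, suggest a calling pattern like the following. This is a minimal sketch and not part of the library: the LoadDocumentSketch class, the TryLoadAsync helper, and the returned status strings are hypothetical names introduced for illustration. It also assumes that cancellation surfaces as OperationCanceledException, which is the usual .NET convention but is not stated on this page. The loader is passed in as a delegate so the sketch does not depend on any specific LM-Kit type.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class LoadDocumentSketch
{
    // Hypothetical helper: runs a loader delegate under a timeout and maps
    // the documented exceptions to a short status string.
    public static async Task<string> TryLoadAsync<TResult>(
        Func<string, CancellationToken, Task<TResult>> load,
        string filePath,
        TimeSpan timeout)
    {
        // The token is canceled automatically once the timeout elapses.
        using var cts = new CancellationTokenSource(timeout);
        try
        {
            await load(filePath, cts.Token);
            return "loaded";
        }
        catch (FileNotFoundException)
        {
            return "file not found";
        }
        catch (ObjectDisposedException)
        {
            // Must precede InvalidOperationException, which is its base class.
            return "instance disposed";
        }
        catch (InvalidOperationException)
        {
            return "conversation already started";
        }
        catch (OperationCanceledException)
        {
            // Assumption: cancellation follows the standard .NET convention.
            return "canceled";
        }
    }
}
```

With a chat instance like the one in the example above, a call might look like `await LoadDocumentSketch.TryLoadAsync((path, ct) => chat.LoadDocumentAsync(path, cancellationToken: ct), "report.pdf", TimeSpan.FromSeconds(30));`. The catch order matters because ObjectDisposedException derives from InvalidOperationException in .NET, so the more specific handler has to come first.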

LoadDocumentAsync(Stream, string, DocumentMetadata, CancellationToken)

Asynchronously loads a PDF document from a stream.

public Task<DocumentIndexingResult> LoadDocumentAsync(Stream stream, string documentName, DocumentRag.DocumentMetadata documentMetadata = null, CancellationToken cancellationToken = default)

Parameters

stream Stream

Stream containing PDF data.

documentName string

Display name for the document.

documentMetadata DocumentRag.DocumentMetadata

Optional metadata to associate with the document. If null, default metadata is created using documentName. Use this to specify a custom document name, reference URL, or additional metadata fields for source attribution in query results.

cancellationToken CancellationToken

Token to cancel the operation.

Returns

Task<DocumentIndexingResult>

A task containing a DocumentIndexingResult describing how the document was processed, including the indexing mode, page count, and token count.

Examples

using var stream = File.OpenRead("contract.pdf");

// Create metadata with reference URL and custom fields
var metadata = new DocumentMetadata("contract.pdf")
{
    SourceUri = "file:///contracts/2024/contract-001.pdf",
    AdditionalMetadata = new MetadataCollection
    {
        { "client", "Acme Corp" },
        { "status", "Active" }
    }
};

var result = await chat.LoadDocumentAsync(stream, "contract.pdf", metadata);

// Later, when querying, source references include the metadata
var response = await chat.SubmitAsync("What are the payment terms?");
foreach (var source in response.SourceReferences)
{
    if (source.Metadata.TryGet("client", out var clientMeta))
        Console.WriteLine($"From client: {clientMeta.Value}");
}

Exceptions

InvalidOperationException

Thrown if the conversation has already started, or if PageProcessingMode is DocumentUnderstanding and DocumentVisionParser is not set.

ArgumentNullException

Thrown if stream is null, or if documentName is null or empty.

ObjectDisposedException

Thrown if this instance has been disposed.