Class PdfChat
Provides conversational question-answering over PDF documents by combining intelligent document understanding with grounded response generation.
public sealed class PdfChat : IMultiTurnConversation, IConversation, IKVCache, IDisposable
- Inheritance
object → PdfChat
- Implements
IMultiTurnConversation, IConversation, IKVCache, IDisposable
Examples
using var chat = new PdfChat(chatModel, embeddingModel);
// Enable vision-based document understanding for complex layouts
chat.DocumentVisionParser = new VlmOcr(visionModel);
// Monitor all operations
chat.DocumentImportProgress += (s, e) =>
{
    if (e.Phase == DocumentImportPhase.PageProcessingStarted)
        Console.WriteLine($"Processing page {e.PageIndex + 1}/{e.TotalPages}");
};
chat.CacheAccessed += (s, e) =>
    Console.WriteLine($"Cache {(e.IsHit ? "hit" : "miss")}: {e.DocumentName}");
chat.PassageRetrievalCompleted += (s, e) =>
    Console.WriteLine($"Retrieved {e.RetrievedCount} passages in {e.Elapsed.TotalMilliseconds:F0}ms");
chat.ResponseGenerationStarted += (s, e) =>
    Console.WriteLine($"Generating response ({(e.UsesFullContext ? "full context" : $"{e.PassageCount} passages")})");
// Load one or more documents
var result = await chat.LoadDocumentAsync("report.pdf");
Console.WriteLine($"Loaded {result.DocumentName}: {result.IndexingMode} ({result.TokenCount} tokens)");
// Ask questions - responses are grounded in document content
var response = await chat.SubmitAsync("What are the key findings?");
Console.WriteLine(response.Response.Completion);
Remarks
Unlike simple text extraction, PdfChat interprets both the physical layout (where content appears on the page) and the logical structure (what that content means) of documents. This enables accurate answers from complex materials including multi-column layouts, tables, forms, and scanned pages.
Document preparation is automatic: smaller documents are provided in full to the model for complete context, while larger documents use passage retrieval to inject only the most relevant excerpts per question.
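The choice between full inclusion and passage retrieval can be pictured as a token-budget check. The sketch below is an illustration only, not the library's implementation: the ChooseMode helper and its mode strings are hypothetical, and it assumes the decision depends solely on the document's token count, the remaining full-document budget (compare RemainingDocumentTokenBudget), and a preference flag (compare PreferFullDocumentContext).

```csharp
using System;

// Hypothetical helper, not part of the PdfChat API. Assumes the decision is
// driven only by token count, remaining budget, and the preference flag.
string ChooseMode(int documentTokens, int remainingBudget, bool preferFullContext)
{
    // A document that fits the remaining budget is provided in full.
    if (preferFullContext && documentTokens <= remainingBudget)
        return "FullDocument";

    // Anything else is answered from the most relevant retrieved passages.
    return "PassageRetrieval";
}

Console.WriteLine(ChooseMode(1200, 4096, true));   // FullDocument
Console.WriteLine(ChooseMode(50000, 4096, true));  // PassageRetrieval
Console.WriteLine(ChooseMode(1200, 4096, false));  // PassageRetrieval
```

In practice the mode actually chosen for a document is reported by the IndexingMode value on the load result, as the example above prints.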
Load documents using LoadDocument(string, DocumentMetadata, CancellationToken) or LoadDocumentAsync(Stream, string, DocumentMetadata, CancellationToken), then ask questions with Submit(string, CancellationToken). Follow-up questions maintain conversation context for natural multi-turn dialogue.
Constructors
- PdfChat(LM, LM, IVectorStore, byte[])
Restores a previous PdfChat session from serialized bytes, with an optional vector store.
- PdfChat(LM, LM, IVectorStore, int)
Initializes a new instance using separate chat and embedding models with an optional vector store for caching.
- PdfChat(LM, LM, IVectorStore, string)
Restores a previous PdfChat session from a file on disk, with an optional vector store.
- PdfChat(LM, LM, byte[])
Restores a previous PdfChat session from serialized bytes.
- PdfChat(LM, LM, int)
Initializes a new instance using separate chat and embedding models.
- PdfChat(LM, LM, string)
Restores a previous PdfChat session from a file on disk.
Properties
- ChatHistory
Gets the conversation history containing all exchanged messages.
- ContextRemainingSpace
Remaining token budget currently available in the context window.
- ContextSize
Gets the total token context size available for this conversation.
- ContextWindow
Gets or sets the number of neighboring partitions to include around each matched partition for contextual continuity.
- ContextualizationOptions
Gets the options that control how follow-up questions are reformulated when QueryGenerationMode is set to Contextual.
- DocumentCount
Gets the total number of documents currently loaded.
- DocumentProcessingModality
Gets or sets the modality used for processing document content.
- DocumentVisionParser
Gets or sets an optional vision-based analyzer for document understanding.
- EmbeddingModel
Gets the embedding model used for computing text embeddings during passage retrieval.
- FullDocumentCount
Gets the number of documents loaded in full within the context.
- FullDocumentTokenBudget
Gets or sets the maximum token budget for full document inclusion.
- HasDocuments
Gets whether any documents have been loaded.
- HydeOptions
Gets the options that control how hypothetical answers are generated when QueryGenerationMode is set to HypotheticalAnswer.
- ImageDetail
Gets or sets the level of detail used when processing images for vision models. Controls the maximum pixel budget allocated to images, which directly affects token consumption and visual fidelity. Default is High.
- IncludePageRenderingsInContext
Gets or sets whether page renderings are included alongside retrieved passages during question answering.
- MaxRetrievedPassages
Gets or sets the maximum number of passages retrieved per query.
- MaximumCompletionTokens
Gets or sets the maximum number of tokens to generate per response.
- MaximumRecallTokens
Maximum number of tokens recalled from Memory per turn.
Defaults to ContextSize / 4. The effective value is automatically capped at ContextSize / 2.
- Memory
Long-term memory store used to recall relevant context across turns.
Assign an AgentMemory implementation to enable retrieval of relevant text partitions. Retrieved snippets are injected as hidden context up to MaximumRecallTokens.
- MinRelevanceScore
Gets or sets the minimum relevance score for retrieved passages.
- MmrLambda
Gets or sets the Maximal Marginal Relevance (MMR) lambda parameter that controls the balance between relevance and diversity in retrieval results.
- Model
Gets the language model used for generating responses.
- MultiQueryOptions
Gets the options that control how query variants are generated when QueryGenerationMode is set to MultiQuery.
- OcrEngine
Gets or sets an optional OCR engine for extracting text from image-based pages.
- PageProcessingMode
Gets or sets how document pages are processed when loading documents.
- PassageRetrievalDocumentCount
Gets the number of documents loaded with passage retrieval.
- PreferFullDocumentContext
Gets or sets whether small documents should be provided in full to the model.
- QueryGenerationMode
Gets or sets the mode used to generate retrieval queries from user input.
- ReasoningLevel
Gets or sets the reasoning level used during response generation.
- RemainingDocumentTokenBudget
Gets the remaining token budget available for full document inclusion.
- RepetitionPenalty
Gets the repetition penalty configuration used to reduce repetitive outputs.
- Reranker
Gets or sets the reranker used to refine passage retrieval results.
- RetrievalStrategy
Gets or sets the retrieval strategy that controls how candidate partitions are scored during the initial retrieval phase.
- SamplingMode
Gets or sets the token sampling strategy for text generation.
- Skills
Registry of Agent Skills available to this conversation.
Skills provide modular capabilities with specialized knowledge and workflows, following the Agent Skills specification.
- SystemPrompt
Gets or sets the system prompt that defines the assistant's behavior.
- ToolPolicy
Per-turn tool-calling policy used by the conversation runtime.
Controls whether tools are allowed, required, disabled, or whether a specific tool must be used on the current turn.
- Tools
Registry of model-callable tools available to this conversation.
Register tools before the first user turn so they are advertised to the model. Tool invocation requires a model that supports tool calls.
- UsedDocumentTokens
Gets the number of tokens currently consumed by full-context documents.
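To make the relevance/diversity trade-off behind MmrLambda concrete, here is a minimal sketch of the textbook greedy Maximal Marginal Relevance selection. It is not taken from the library: the SelectMmr and Cosine helpers, the toy two-dimensional vectors, and the convention that a lambda of 1 means pure relevance are assumptions made for illustration, so check the MmrLambda documentation for the direction and range the library actually uses.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cosine similarity between two vectors of equal length.
double Cosine(double[] a, double[] b)
{
    double dot = 0, na = 0, nb = 0;
    for (int n = 0; n < a.Length; n++)
    {
        dot += a[n] * b[n];
        na += a[n] * a[n];
        nb += b[n] * b[n];
    }
    return dot / (Math.Sqrt(na) * Math.Sqrt(nb));
}

// Greedy MMR: returns the indices of up to k candidates in selection order.
// Score = lambda * relevance - (1 - lambda) * similarity to the closest
// candidate that has already been selected.
List<int> SelectMmr(double[] queryVec, double[][] candidates, int k, double lambda)
{
    var selected = new List<int>();
    var remaining = Enumerable.Range(0, candidates.Length).ToList();

    while (selected.Count < k && remaining.Count > 0)
    {
        int best = -1;
        double bestScore = double.NegativeInfinity;
        foreach (int c in remaining)
        {
            double relevance = Cosine(queryVec, candidates[c]);
            double redundancy = selected.Count == 0
                ? 0.0
                : selected.Max(s => Cosine(candidates[c], candidates[s]));
            double score = lambda * relevance - (1 - lambda) * redundancy;
            if (score > bestScore) { bestScore = score; best = c; }
        }
        selected.Add(best);
        remaining.Remove(best); // removes the value, not the position
    }
    return selected;
}

var question = new[] { 1.0, 0.0 };
var passages = new[]
{
    new[] { 1.0, 0.0 },   // 0: identical direction to the question
    new[] { 0.99, 0.14 }, // 1: near-duplicate of passage 0
    new[] { 0.6, 0.8 },   // 2: less relevant, but covers different content
};

Console.WriteLine(string.Join(", ", SelectMmr(question, passages, 2, 1.0))); // 0, 1
Console.WriteLine(string.Join(", ", SelectMmr(question, passages, 2, 0.3))); // 0, 2
```

With lambda at 1.0 the near-duplicate passage is picked second because only relevance counts; lowering lambda to 0.3 swaps it for the less similar passage.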
Methods
- ClearDocuments()
Removes all loaded documents and resets the conversation.
- ClearHistory()
Clears the conversation history while keeping loaded documents.
- Dispose()
Releases all resources used by this instance.
- ~PdfChat()
Releases unmanaged resources.
- LoadDocument(Stream, string, DocumentMetadata, CancellationToken)
Loads a PDF document from a stream.
- LoadDocument(string, DocumentMetadata, CancellationToken)
Loads a PDF document from the specified file path.
- LoadDocumentAsync(Stream, string, DocumentMetadata, CancellationToken)
Asynchronously loads a PDF document from a stream.
- LoadDocumentAsync(string, DocumentMetadata, CancellationToken)
Asynchronously loads a PDF document from the specified file path.
- RegenerateResponse(CancellationToken)
Regenerates the last response using the same context.
- RegenerateResponseAsync(CancellationToken)
Asynchronously regenerates the last response using the same context.
- SaveSession()
Saves the current session state (conversation, documents, and configuration) to a byte array.
- SaveSession(string)
Saves the current session state to a file on disk.
- Submit(string, CancellationToken)
Submits a question and returns a response based on the loaded documents.
- SubmitAsync(string, CancellationToken)
Asynchronously submits a question and returns a response based on the loaded documents.
Events
- AfterTextCompletion
Occurs during response generation as text is produced.
- AfterToolInvocation
Fired after a tool invocation finishes, including when it was cancelled or failed.
- BeforeToolInvocation
Fired before a tool invocation. Handlers may cancel the call.
- CacheAccessed
Occurs when the document cache is accessed during loading.
- DocumentImportProgress
Occurs during document import to report progress.
- MemoryRecall
Fired when one or more memory partitions are recalled for this turn.
Subscribers may inspect the recalled content and optionally cancel injection by setting Cancel to true.
- PassageRetrievalCompleted
Occurs when passage retrieval completes for a query.
- ResponseGenerationStarted
Occurs when response generation begins.
- ToolApprovalRequired
Fired when a tool invocation requires user approval before execution.
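Events such as BeforeToolInvocation and MemoryRecall let a handler veto the pending operation. The sketch below shows the general .NET cancelable-event pattern using the standard System.ComponentModel.CancelEventArgs type. It is a stand-in only: the real event-args types for these events are not shown in this reference, and the recallHandlers delegate and RaiseAndCheck helper are hypothetical names.

```csharp
using System;
using System.ComponentModel;

// Stand-in for a cancelable event. CancelEventArgs is the standard .NET type
// with a settable Cancel flag; the library's own event-args types may differ.
EventHandler<CancelEventArgs> recallHandlers = (sender, e) => { };

// A subscriber vetoes the operation by setting Cancel to true.
recallHandlers += (sender, e) => e.Cancel = true;

// The raiser invokes every handler, then inspects the flag.
// Returns true when the operation should proceed.
bool RaiseAndCheck(EventHandler<CancelEventArgs> handlers)
{
    var eventArgs = new CancelEventArgs();
    handlers(null, eventArgs);
    return !eventArgs.Cancel;
}

Console.WriteLine(RaiseAndCheck(recallHandlers) ? "inject" : "skipped"); // skipped
```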