Namespace LMKit.Agents.Tools.BuiltIn.Document

Classes

A built-in tool for extracting embedded attachments from documents.

Enables agents to list and save embedded files from PDF, EML, and MBOX documents.

A built-in tool for extracting text content from documents.

Enables agents to extract text from PDF, DOCX, XLSX, PPTX, EML, MBOX, and HTML files with optional page range selection.

DocumentTextInfoTool

A built-in tool for retrieving document text information.

Enables agents to inspect document metadata such as page count, MIME type, text availability, and file size without extracting the full content.

DocumentToMarkdownTool: Converts any supported document (PDF, DOCX, HTML, EML, MBOX, XLSX, PPTX, images, plain text) to Markdown using the unified DocumentToMarkdown converter.

EmlToPdfTool: Converts an EML (email) file into a PDF document with embedded attachments.

ImageCropTool

A built-in tool for cropping images.

Enables agents to automatically detect and remove uniform borders from scanned documents and images using tolerance-based edge detection.
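As an illustration of what tolerance-based border detection can look like, here is a minimal Python sketch. It is not the ImageCropTool implementation or its API: the function name `find_content_box`, the grayscale list-of-rows input, and the choice of the top-left pixel as the border color are all assumptions made for the example.

```python
def find_content_box(pixels, tolerance=10):
    """Locate the non-border region of a grayscale image.

    pixels: list of rows, each a list of ints (0-255).
    Returns (left, top, right, bottom) with right/bottom exclusive,
    or None when the whole image is uniform within the tolerance.
    """
    if not pixels or not pixels[0]:
        raise ValueError("image must be non-empty")
    height, width = len(pixels), len(pixels[0])
    border = pixels[0][0]  # assumption: the top-left pixel has the border color

    def is_border(values):
        # A row or column counts as border if every pixel is close to the border color.
        return all(abs(v - border) <= tolerance for v in values)

    top = 0
    while top < height and is_border(pixels[top]):
        top += 1
    if top == height:
        return None  # nothing but border

    bottom = height
    while bottom > top and is_border(pixels[bottom - 1]):
        bottom -= 1

    rows = pixels[top:bottom]
    left = 0
    while left < width and is_border(row[left] for row in rows):
        left += 1
    right = width
    while right > left and is_border(row[right - 1] for row in rows):
        right -= 1

    return left, top, right, bottom


def crop(pixels, box):
    """Return the sub-image described by box = (left, top, right, bottom)."""
    left, top, right, bottom = box
    return [row[left:right] for row in pixels[top:bottom]]
```

A larger tolerance absorbs scanner noise in the border at the cost of possibly trimming faint content near the edges.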

ImageDeskewTool

Detect and correct skew (rotation) in scanned documents and images.

Straightens scanned documents and photos using Sobel edge detection and structure tensor analysis, then saves the corrected image.

ImageInfoTool: Get image dimensions, pixel format, and file size.

ImageMeasureSkewTool: Measure the skew angle of a scanned document or image without modifying it.

ImageResizeBoxTool: Resize an image to fit within a bounding box while preserving aspect ratio, with padding.
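The arithmetic behind fitting an image into a bounding box can be sketched as follows. This is a Python illustration under stated assumptions, not the ImageResizeBoxTool API: the function name `fit_within_box` and the centered placement of the padding are hypothetical choices for the example.

```python
def fit_within_box(src_w, src_h, box_w, box_h):
    """Compute the scaled size and padding needed to fit an image in a box.

    Returns (new_w, new_h, (pad_left, pad_top, pad_right, pad_bottom)).
    The scaled image is centered (an assumption of this sketch).
    """
    if min(src_w, src_h, box_w, box_h) <= 0:
        raise ValueError("all dimensions must be positive")

    # One scale factor for both axes preserves the aspect ratio;
    # taking the smaller ratio guarantees the result fits the box.
    scale = min(box_w / src_w, box_h / src_h)
    new_w = max(1, min(box_w, round(src_w * scale)))
    new_h = max(1, min(box_h, round(src_h * scale)))

    # Whatever the scaled image does not cover becomes padding.
    pad_x = box_w - new_w
    pad_y = box_h - new_h
    left, top = pad_x // 2, pad_y // 2
    return new_w, new_h, (left, top, pad_x - left, pad_y - top)
```

For a 4000x3000 source and an 800x800 box, the limiting ratio is 800/4000 = 0.2, so the image scales to 800x600 and the remaining 200 rows are split as padding above and below.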

ImageResizeTool: Resize an image to exact dimensions with optional pixel format conversion.

ImageToPdfTool

A built-in tool for converting image files into a single PDF document.

Enables agents to combine one or more images (JPEG, PNG, BMP) into a PDF, with each image on its own page sized to match the image dimensions.

MarkdownToDocxTool: Converts Markdown text to a DOCX file.

MarkdownToHtmlTool: Converts Markdown text to HTML.

MarkdownToPdfTool: Converts Markdown content to a PDF file with full formatting support.

OcrRecognizeTool

Extract text from images using OCR (Optical Character Recognition).

Supports 34 languages, including English, French, German, Chinese, Japanese, and Arabic.

PdfExtractTool: Extracts specific pages from a PDF and saves them as a single output file.

PdfMergeTool

A built-in tool for merging multiple PDF files into one.

Enables agents to combine multiple PDF documents into a single output file, preserving all pages in the specified order.

PdfMetadataTool

A built-in tool for retrieving PDF metadata and basic document information.

Enables agents to inspect PDF page count, file size, version, and metadata fields (title, author, subject, keywords, creation date).

PdfPagesTool

A built-in tool for inspecting PDF page details and extracting page text.

Enables agents to retrieve page dimensions, text-only flags, and extract text content from specific pages of a PDF document.

PdfSearchHighlightTool

A built-in tool for searching text in a PDF and producing a highlighted copy with all matches visually marked.

Enables agents to find query occurrences and generate an annotated PDF that highlights every match.

PdfSearchTool

A built-in tool for searching text inside PDF documents.

Enables agents to find query occurrences across all pages or a selected page range, returning page numbers and text snippets.

PdfSplitTool: Splits a PDF into multiple output files by page ranges.

PdfToImageTool

A built-in tool for rendering PDF pages as images.

Enables agents to convert PDF pages to JPEG, PNG, or BMP image files with configurable resolution, format, and quality options.

PdfUnlockTool

A built-in tool for removing password protection from PDF documents.

Enables agents to unlock a password-protected PDF by providing the known password, producing an unprotected copy that can be freely opened.
