👉 Try the demo:
https://github.com/LM-Kit/lm-kit-net-samples/tree/main/console_net/document_summarizer

Document Summarizer in .NET Applications

🎯 Purpose of the Demo

Document Summarizer demonstrates how to use LM-Kit.NET to load a local model and generate a short summary, optionally with an auto-generated title, from a document on disk.

This demo is especially useful for PDFs and images, which are among the most common formats in real-world document workflows:

PDF files: text-based PDFs (selectable text) and scanned PDFs (image-based pages).
Images: screenshots, photos of documents, receipts, forms, etc.

The sample shows how to:

Download and load a model with progress callbacks.
Select a predefined model from a simple menu (or paste a custom model URI).
Wrap a file as an Attachment.
Summarize the document using LM-Kit’s Summarizer.
Configure summarization output (title, summary text, max length, optional guidance).
Run in a loop to summarize one document after another.

Why summarize documents with LM-Kit.NET?

Local-first: summarize sensitive files on your own hardware.
Unified input: PDFs and images go through the same Attachment entry point.
Configurable: tune output length and add guidance (example: always summarize in French).
Simple developer experience: minimal C# console app with a readable flow.

👥 Target Audience

Product and Platform - add summarization to existing .NET services.
Data and Document Processing - quickly digest large sets of PDFs, scans, and screenshots.
RPA and Back-office - summarize reports, tickets, receipts, and back-office documents.
Demo and Education - minimal example of model loading plus a practical document task in C#.

🚀 Problem Solved

Turn long documents into short summaries: get a quick overview without reading everything.
Summarize PDFs and images: handle common formats like PDF reports and screenshot captures.
Model flexibility: pick a model based on your VRAM and latency requirements.
Repeatable loop: summarize multiple files in one console session.
Optional control: generate a title, control summary length, and add guidance.

💻 Sample Application Description

Console app that:

Lets you choose a model (or paste a custom model URI).
Downloads the model if needed, with live progress updates.
Loads the model with progress reporting.
Creates a Summarizer configured to:
- Generate a title
- Generate summary content
- Limit summary size to 100 words
- Apply optional guidance
Repeatedly prompts for a document path:
- Loads it as an Attachment (PDF, image, and other supported formats).
- Calls summarizer.Summarize(attachment).
- Prints Title and Summary.
Stops when you submit an empty path.

✨ Key Features

🧠 Document summarization: generates a short summary from a document input.
🏷️ Auto-title: generates a title from the document content.
📄 PDF-focused: great for reports and multi-page PDFs (including scanned PDFs).
🖼️ Image-friendly: summarize screenshots and photos of documents.
📥 Interactive loop: enter a path, get a summary, repeat.
📏 Output control:
- MaxContentWords to cap length
- Guidance to steer style or language
📦 Model lifecycle:
- Automatic download on first use
- Loading progress shown in the console
❌ Nice errors: friendly message when a file path is invalid or inaccessible.

🧰 Built-In Models (menu)

On startup, the demo shows a model selection menu:

Option	Model	Approx. VRAM Needed
0	MiniCPM o 4.5 9B	~5.9 GB VRAM
1	Alibaba Qwen 3.5 2B	~2 GB VRAM
2	Alibaba Qwen 3.5 4B	~3.5 GB VRAM
3	Alibaba Qwen 3.5 9B	~7 GB VRAM
4	Google Gemma 3 4B	~5.7 GB VRAM
5	Google Gemma 3 12B	~11 GB VRAM
6	Alibaba Qwen 3.5 27B	~18 GB VRAM
other	Custom model URI (GGUF / LMK...)	depends on model

Choosing a model for PDFs and images

For text-based PDFs, most models work well since the input already contains clean text.
For images and scanned PDFs, prefer a vision-capable model if your pipeline needs to read pixels (screenshots, photos, scanned pages). If the document is already selectable text, vision capability is usually less important.

🧠 Supported Models

The demo is pre-wired to LM-Kit’s predefined model cards:

minicpm-o-45
qwen3.5:2b
qwen3.5:4b
qwen3.5:9b
gemma3:4b
gemma3:12b
qwen3.5:27b

Internally:

modelLink = ModelCard
    .GetPredefinedModelCardByModelID("gemma3:4b")
    .ModelUri
    .ToString();

You can also provide any valid model URI manually (including local paths or custom model servers) by typing or pasting it when prompted.
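
The mapping from a menu choice to a model ID can be sketched as below. This is a minimal illustration, not the demo's actual code: `ModelMenu` and `TryResolveModelId` are hypothetical names, and the pairing of menu options with model IDs is an assumption inferred from the order of the two lists above. Anything that is not a menu option is treated as a custom model URI.

```csharp
using System;
using System.Collections.Generic;

public static class ModelMenu
{
    // Menu option -> predefined model ID.
    // Assumption: options and IDs are paired by the order in which they are listed.
    private static readonly Dictionary<string, string> PredefinedIds = new()
    {
        ["0"] = "minicpm-o-45",
        ["1"] = "qwen3.5:2b",
        ["2"] = "qwen3.5:4b",
        ["3"] = "qwen3.5:9b",
        ["4"] = "gemma3:4b",
        ["5"] = "gemma3:12b",
        ["6"] = "qwen3.5:27b",
    };

    // Returns true and the model ID when the input is a menu option.
    // Otherwise returns false and the trimmed input, to be used as a custom model URI.
    public static bool TryResolveModelId(string input, out string value)
    {
        string trimmed = (input ?? string.Empty).Trim();
        if (PredefinedIds.TryGetValue(trimmed, out var id))
        {
            value = id;
            return true;
        }

        value = trimmed;
        return false;
    }
}
```

When the method returns true, the ID can be passed to ModelCard.GetPredefinedModelCardByModelID as in the excerpt above; when it returns false, the returned value is used directly as the model link.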

🛠️ Commands and Flow

Inside the console loop:

On startup
- Select a model (0-6) or paste a custom model URI.
- The model is downloaded (if needed) and loaded with progress reporting.
Per document
- The app prompts: Enter the path to a document:
- Type a file path and press Enter.
- The app loads it into an Attachment (PDF, image, and other supported formats).
- The app runs summarization:
  - var result = summarizer.Summarize(attachment);
- The app prints:
  - Title: ...
  - Summary: ...
Quit
- Submitting an empty path exits the loop.
- The app then waits for a key press to close.
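
The per-document flow above can be sketched as a small loop. This is a hedged illustration rather than the demo's source: `SummaryLoop` and its `summarize` delegate are hypothetical, and the delegate stands in for wrapping the path in an Attachment and calling summarizer.Summarize(attachment). Keeping the LM-Kit calls behind a delegate lets the loop run and be tested without a model.

```csharp
using System;
using System.IO;

public static class SummaryLoop
{
    // Reads paths until an empty line, prints Title and Summary for each,
    // and returns the number of documents summarized successfully.
    // 'summarize' is a stand-in for:
    //   path => { var r = summarizer.Summarize(new Attachment(path)); return (r.Title, r.Summary); }
    public static int Run(
        TextReader input,
        TextWriter output,
        Func<string, (string Title, string Summary)> summarize)
    {
        int count = 0;

        while (true)
        {
            output.Write("Enter the path to a document: ");
            var path = input.ReadLine()?.Trim();

            // An empty path (or end of input) exits the loop.
            if (string.IsNullOrEmpty(path))
            {
                break;
            }

            try
            {
                var result = summarize(path);
                output.WriteLine($"Title: {result.Title}");
                output.WriteLine($"Summary: {result.Summary}");
                count++;
            }
            catch (Exception ex)
            {
                // Friendly message for an invalid or inaccessible path; keep looping.
                output.WriteLine($"Error: {ex.Message}");
            }
        }

        return count;
    }
}
```

In a console app this would be called as SummaryLoop.Run(Console.In, Console.Out, ...), followed by the final key press.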

🗣️ Example Use Cases

Try the demo with:

A PDF report -> produce a short recap for quick review.
A multi-page scanned PDF -> get a summary without reading the whole scan.
A screenshot of a web page -> capture the key idea and main sections.
A photo of a document (phone capture) -> sanity-check what the model understood.
A receipt or invoice image -> generate a short description of the purchase and totals.
Multi-language content -> test guidance like “always summarize in French”.

For PDFs and images, results often depend on the source quality:

Higher resolution and clean contrast usually improve output.
Cropped images with only the relevant content usually summarize better than full-screen clutter.

⚙️ Behavior and Policies (quick reference)

Model selection: exactly one model per process. To change models, restart the app.
Primary formats: the demo is most commonly used with PDFs and images, but any format supported by Attachment can work.
Download and load:
- ModelDownloadingProgress prints Downloading model XX.XX% (or bytes).
- ModelLoadingProgress clears the console once the download has finished, then prints Loading model XX%.
Summarization settings (as configured in the demo):
- GenerateTitle = true
- GenerateContent = true
- MaxContentWords = 100
- Guidance = "" (optional)
Exit condition: submitting an empty document path ends the loop.
Licensing:
- You can set an optional license key via LicenseManager.SetLicenseKey("").
- A free community license is available from the LM-Kit website.
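
The download and load messages listed above can be produced by small formatting helpers such as the following. This is a sketch under assumptions: `ProgressText` is a hypothetical helper, the download callback is assumed to receive a nullable content length (unknown size) plus the bytes read so far, and the loading callback is assumed to receive a fraction between 0 and 1.

```csharp
using System;
using System.Globalization;

public static class ProgressText
{
    // Assumption: contentLength is null when the server does not report a size.
    public static string Download(long? contentLength, long bytesRead)
    {
        if (contentLength.HasValue && contentLength.Value > 0)
        {
            double percent = (double)bytesRead / contentLength.Value * 100.0;
            return string.Format(
                CultureInfo.InvariantCulture, "Downloading model {0:0.00}%", percent);
        }

        // Fall back to a byte count when the total size is unknown.
        return string.Format(
            CultureInfo.InvariantCulture, "Downloading model {0} bytes", bytesRead);
    }

    // Assumption: progress is a fraction in the range [0, 1].
    public static string Loading(float progress)
    {
        return string.Format(
            CultureInfo.InvariantCulture, "Loading model {0}%", Math.Round(progress * 100));
    }
}
```

A progress callback can write the returned text with a leading "\r" so that the line updates in place, then return true as the callbacks in the integration snippet do.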

💻 Minimal Integration Snippet

using System;
using System.Text;
using LMKit.Data;
using LMKit.Model;
using LMKit.TextGeneration;

public class DocumentSummarizerSample
{
    public void SummarizeFile(string modelUri, string filePath)
    {
        Console.InputEncoding = Encoding.UTF8;
        Console.OutputEncoding = Encoding.UTF8;

        // Load the model
        var model = new LM(
            new Uri(modelUri),
            downloadingProgress: (path, contentLength, bytesRead) => true,
            loadingProgress: progress => true);

        // Create summarizer
        var summarizer = new Summarizer(model)
        {
            GenerateTitle = true,
            GenerateContent = true,
            MaxContentWords = 100,
            Guidance = "" // Example: "Always summarize in French"
        };

        // Wrap the file as an Attachment (PDF, image, etc.)
        var attachment = new Attachment(filePath);

        // Run summarization
        var result = summarizer.Summarize(attachment);

        Console.WriteLine($"Title: {result.Title}");
        Console.WriteLine($"Summary: {result.Summary}");
    }
}

Use this pattern to integrate summarization into web APIs, background workers, or desktop apps.

🛠️ Getting Started

📋 Prerequisites

.NET 8.0 or later

📥 Download

git clone https://github.com/LM-Kit/lm-kit-net-samples
cd lm-kit-net-samples/console_net/document_summarizer

Project Link: document_summarizer (same path as above)

▶️ Run

dotnet build
dotnet run

Then:

Select a model by typing 0-6, or paste a custom model URI.
Wait for the model to download (first run) and load.
When prompted, type the path to a PDF or image (or any other supported document file).
Inspect the generated Title and Summary.
Enter another path to summarize the next file, or submit an empty path to exit.

🔍 Notes on Key Types

LM (LMKit.Model) - model wrapper used by LM-Kit.NET:
- Accepts a Uri pointing to the model.
- Uses callbacks for download and load progress.
Summarizer (LMKit.TextGeneration) - document summarization engine:
- Summarize(Attachment) returns a result with fields like Title and Summary.
- Controlled by properties such as GenerateTitle, GenerateContent, MaxContentWords, and Guidance.
Attachment (LMKit.Data) - wraps external data:
- new Attachment(string path) loads a file from disk.
- Common real-world usage includes PDFs and images (screenshots, scanned pages, photos).
- Exceptions are raised when the path is invalid or inaccessible.

🔧 Extend the Demo

Display elapsed time (the demo already measures it with Stopwatch but does not print it yet).
Add PDF-focused options:
- summarize only the first N pages
- summarize selected pages (example: --pages 1,3-5)
- page-by-page summaries then a final combined summary
Add CLI flags:
- --max-words 200
- --no-title
- --guidance "Always summarize in French"
Add batch mode: summarize every PDF or image in a directory.
Write output to disk (example: output.md or output.json) instead of only printing to console.
Add formatting modes:
- bullet summary
- executive summary
- “key takeaways” list
Chain with LM-Kit’s Structured Extraction to go from: document -> summary -> structured data
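
As a starting point for the page-selection idea above, a page specification such as "1,3-5" could be parsed as sketched below. `PageSpec` is a hypothetical helper that is not part of LM-Kit.NET or the demo; it only turns the flag value into a sorted list of page numbers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PageSpec
{
    // Parses a spec such as "1,3-5" into sorted, distinct 1-based page numbers.
    // Throws FormatException when a part is not a number or a valid range.
    public static List<int> Parse(string spec)
    {
        var pages = new SortedSet<int>();

        foreach (string part in spec.Split(',', StringSplitOptions.RemoveEmptyEntries))
        {
            string[] bounds = part.Split('-');
            if (bounds.Length > 2)
            {
                throw new FormatException($"Invalid page range: '{part}'");
            }

            int first = int.Parse(bounds[0].Trim());
            int last = bounds.Length == 2 ? int.Parse(bounds[1].Trim()) : first;

            if (first < 1 || last < first)
            {
                throw new FormatException($"Invalid page range: '{part}'");
            }

            for (int page = first; page <= last; page++)
            {
                pages.Add(page);
            }
        }

        return pages.ToList();
    }
}
```

How the resulting page list is applied to a document depends on the page-selection support available in your LM-Kit.NET version, which this sketch does not assume.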

📚 Related Content

How-To: Summarize Documents and Text: Step-by-step guide to using the Summarizer API for document summarization.
Glossary: Text Summarization: Explains AI-driven text summarization concepts and techniques.
Glossary: Intelligent Document Processing: Covers the broader field of automated document understanding.
Chat with PDF Demo: Interactive document Q&A that complements summarization with retrieval-augmented answers.
