👉 Try the demo: https://github.com/LM-Kit/lm-kit-net-samples/tree/main/console_net/document-intelligence/structured-data-extraction/receipt_expense_scanner

AI Receipt & Expense Scanner for C# .NET Applications

🎯 Purpose of the Demo

AI Receipt & Expense Scanner demonstrates how to use LM-Kit.NET with vision-language models to extract structured expense data from receipts entirely on local hardware. Feed it any receipt, whether a text file, PDF, or a scanned photo, and get back structured JSON with store info, line items, totals, tax breakdown, payment details, and an automatic expense category.

The sample shows how to:

Use TextExtraction with a VLM to extract structured data from both text and image-based receipts.
Define extraction schemas with TextExtractionElement including line item arrays.
Automatically categorize expenses (Meals, Office Supplies, Travel, Groceries, etc.).
Parse text content with SetContent(string) or load documents and images with Attachment.

Why a Local Receipt Scanner with Vision-Language Models?

Financial data stays private: process expense receipts without uploading to cloud services.
Process any format: VLMs understand scanned receipt photos, PDFs, and text files natively.
One API call: a single Parse() call extracts all receipt data. No agents, no chaining.
Line item detail: each purchased item extracted with quantity, unit price, and total.
Works offline: no API keys, no internet connection required.

👥 Who Should Use This Demo

Finance Teams: automate receipt processing for expense reports.
Accountants: extract structured data from receipts for bookkeeping, including scanned paper receipts.
Small Business Owners: digitize paper receipts for tax preparation.
Expense Management Developers: integrate local receipt scanning into expense apps.
Developers: learn the TextExtraction API with VLM models and nested array elements for line items.

🚀 What Problem It Solves

Paper receipt scanning: extract structured data from receipt photos and scans using vision-language models.
Manual data entry: eliminate typing receipt details into expense reports.
Data privacy: process receipts with sensitive financial data without cloud exposure.
Categorization: automatically classify expenses into accounting categories.
Integration: produce JSON output ready for accounting software or ERP systems.

💻 Demo Application Overview

Console app that:

Lets you choose from multiple vision-language models optimized for document understanding.
Downloads models if needed, with live progress updates.
Creates a TextExtraction instance with a receipt-specific schema.
Enters an interactive loop where you can:
- Type sample to scan a built-in grocery store receipt.
- Enter a file path to scan your own receipt (.txt, .pdf, .png, .jpg).
- View the extracted expense data with color-coded output.
- View the full JSON output for integration.
Loops until you type q.

Key Features

Vision-Language Models: Process scanned receipt photos, PDFs, and text files natively.
Line Item Extraction: Each item parsed with description, quantity, unit price, and total.
Tax Breakdown: Tax rate and tax amount extracted separately.
Auto-Categorization: AI suggests the expense category for accounting.
Built-In Sample: Realistic grocery receipt for instant demo.
JSON Output: Machine-readable structured data for any accounting system.
Zero Configuration: No API keys, no cloud accounts, no external dependencies.

🏗️ Architecture

  ┌─────────────────────────────────────────────────┐
  │     Receipt (text, PDF, PNG, JPG, ...)          │
  └───────────────────┬─────────────────────────────┘
                      │
                      ▼
  ┌─────────────────────────────────────────────────┐
  │    TextExtraction + Vision-Language Model        │
  │                                                  │
  │   Schema:                                        │
  │   ├── Store Name (String)                        │
  │   ├── Store Address (String)                     │
  │   ├── Date (Date)                                │
  │   ├── Time (String)                              │
  │   ├── Items (Array)                              │
  │   │   ├── Description, Quantity                  │
  │   │   └── Unit Price, Total                      │
  │   ├── Subtotal (Float)                           │
  │   ├── Tax Rate (String) / Tax Amount (Float)     │
  │   ├── Discount (Float)                           │
  │   ├── Total (Float)                              │
  │   ├── Payment Method (String)                    │
  │   ├── Transaction ID (String)                    │
  │   └── Expense Category (String)                  │
  │                                                  │
  └───────────────────┬─────────────────────────────┘
                      │
                      ▼
  ┌─────────────────────────────────────────────────┐
  │           Structured Expense Report              │
  │           (formatted display + JSON)             │
  └─────────────────────────────────────────────────┘

Extraction Schema

Field	Type	Description
Store Name	String	Name of the store or business
Store Address	String	Full address of the store
Date	Date	Transaction date
Time	String	Transaction time
Items	Array	Description, quantity, unit price, line total
Subtotal	Float	Total before tax and discounts
Tax Rate	String	Applied tax rate
Tax Amount	Float	Total tax charged
Discount	Float	Discount applied
Total	Float	Final amount paid
Payment Method	String	Cash, card, etc.
Transaction ID	String	Receipt reference number
Expense Category	String	AI-suggested category

Code Highlights

using LMKit.Data;
using LMKit.Extraction;
using LMKit.Model;

// Load a vision-language model for receipt understanding
LM model = LM.LoadFromModelID("qwen3.5:9b");

var textExtraction = new TextExtraction(model);
textExtraction.Elements = new List<TextExtractionElement>
{
    new("Store Name", ElementType.String, "Name of the store."),
    new("Date", ElementType.Date, "Transaction date."),
    new("Items",
        new List<TextExtractionElement>
        {
            new("Description", ElementType.String, "Item name."),
            new("Quantity", ElementType.Integer, "Quantity purchased."),
            new("Unit Price", ElementType.Float, "Price per unit."),
            new("Total", ElementType.Float, "Line total.")
        },
        isArray: true,
        "Purchased items."
    ),
    new("Total", ElementType.Float, "Final amount paid."),
    new("Expense Category", ElementType.String, "Suggested category."),
    // ... more elements
};

// Works with text, PDF, and receipt photos
textExtraction.SetContent(new Attachment("receipt.jpg"));
var result = textExtraction.Parse();

Console.WriteLine(result.Json);
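
The JSON printed by the snippet above can be read back with System.Text.Json. The following is a minimal sketch, not code from the demo. It assumes the output is a JSON object keyed by the element names from the schema ("Items", "Description", "Quantity", "Unit Price", "Total"). That shape is an assumption, so check it against the JSON your model actually produces. ReceiptJsonReader and LineItem are hypothetical names, not LM-Kit.NET types.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.Json;

// Hypothetical helper for illustration; it is not part of LM-Kit.NET.
public static class ReceiptJsonReader
{
    // One purchased item; a value is null when the field is missing or unreadable.
    public sealed record LineItem(string? Description, decimal? Quantity, decimal? UnitPrice, decimal? Total);

    // Reads the "Items" array from the extraction JSON.
    // Assumption: the root is an object keyed by element name ("Items", "Description", ...).
    public static List<LineItem> ReadItems(string json)
    {
        var items = new List<LineItem>();
        using JsonDocument doc = JsonDocument.Parse(json);
        JsonElement root = doc.RootElement;

        if (root.ValueKind != JsonValueKind.Object ||
            !root.TryGetProperty("Items", out JsonElement array) ||
            array.ValueKind != JsonValueKind.Array)
        {
            return items; // no line items found
        }

        foreach (JsonElement item in array.EnumerateArray())
        {
            if (item.ValueKind != JsonValueKind.Object) continue;

            items.Add(new LineItem(
                ReadString(item, "Description"),
                ReadDecimal(item, "Quantity"),
                ReadDecimal(item, "Unit Price"),
                ReadDecimal(item, "Total")));
        }

        return items;
    }

    private static string? ReadString(JsonElement obj, string name) =>
        obj.TryGetProperty(name, out JsonElement value) && value.ValueKind == JsonValueKind.String
            ? value.GetString()
            : null;

    // Accepts JSON numbers as well as numeric strings such as "3.49".
    public static decimal? ReadDecimal(JsonElement obj, string name)
    {
        if (!obj.TryGetProperty(name, out JsonElement value)) return null;

        if (value.ValueKind == JsonValueKind.Number && value.TryGetDecimal(out decimal number))
        {
            return number;
        }

        if (value.ValueKind == JsonValueKind.String &&
            decimal.TryParse(value.GetString(), NumberStyles.Number, CultureInfo.InvariantCulture, out decimal parsed))
        {
            return parsed;
        }

        return null;
    }
}
```

Calling ReceiptJsonReader.ReadItems(result.Json) would then return the line items as typed values. Amounts are read as decimal rather than double to avoid binary rounding on currency.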

⚙️ Getting Started

Prerequisites

.NET 8.0 or later
Sufficient VRAM for the selected model (roughly 2 to 22 GB depending on model choice; see the Models table below)

Download

git clone https://github.com/LM-Kit/lm-kit-net-samples
cd lm-kit-net-samples/console_net/document-intelligence/structured-data-extraction/receipt_expense_scanner

Run

dotnet build
dotnet run

Then:

Select a vision-language model from the menu, or paste a custom model URI.
Wait for the model to download (first run) and load.
Type sample to try the built-in receipt, or enter a file path (text, PDF, or image).
View the extracted expense data and JSON output.
Type q to exit.

Models

Option	Model	Approx. VRAM
0	Z.ai GLM-V 4.6 Flash 10B	~7 GB
1	MiniCPM o 4.5 9B	~5.9 GB
2	Alibaba Qwen 3.5 2B	~2 GB
3	Alibaba Qwen 3.5 4B	~3.5 GB
4	Alibaba Qwen 3.5 9B (Recommended)	~7 GB
6	Google Gemma 4 E4B	~6 GB
7	Alibaba Qwen 3.6 27B	~18 GB
8	Alibaba Qwen 3.6 35B-A3B	~22 GB
9	Mistral Ministral 3 8B	~6.5 GB

🔧 Troubleshooting

Incomplete line items
- Try a larger model (8B+) for better accuracy on receipts with many items.
- Ensure the receipt text or image is complete and legible.
Scanned receipt photos
- Use a vision-capable model (Qwen 3.5, GLM-V, or Gemma 4) for best results with photos.
- Ensure the photo is clear, well-lit, and at sufficient resolution.
Incorrect totals
- The model extracts values as-is from the receipt; it does not recalculate them.
- Verify the original receipt data is correct.
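
Because extracted values are not recalculated, a small consistency check after extraction can flag receipts that need a manual look. The sketch below is one possible approach, not part of the demo. ReceiptTotalsCheck is a hypothetical helper. It assumes Total = Subtotal + Tax Amount - Discount, with the discount stored as a positive amount. Receipts with tax-inclusive prices or tips will need a different rule.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical post-extraction check; it is not part of LM-Kit.NET.
public static class ReceiptTotalsCheck
{
    // Returns human-readable warnings; an empty list means the figures agree
    // within the given tolerance (one cent by default).
    public static List<string> Validate(
        IEnumerable<(decimal Quantity, decimal UnitPrice, decimal Total)> items,
        decimal subtotal,
        decimal taxAmount,
        decimal discount,
        decimal total,
        decimal tolerance = 0.01m)
    {
        var warnings = new List<string>();
        decimal itemSum = 0m;
        int index = 0;

        foreach (var (quantity, unitPrice, lineTotal) in items)
        {
            index++;
            itemSum += lineTotal;

            if (Math.Abs(quantity * unitPrice - lineTotal) > tolerance)
            {
                warnings.Add($"Item {index}: quantity x unit price differs from line total.");
            }
        }

        if (Math.Abs(itemSum - subtotal) > tolerance)
        {
            warnings.Add("Sum of line totals differs from subtotal.");
        }

        // Assumption: total = subtotal + tax - discount, with discount as a positive amount.
        if (Math.Abs(subtotal + taxAmount - discount - total) > tolerance)
        {
            warnings.Add("Subtotal + tax - discount differs from total.");
        }

        return warnings;
    }
}
```

A non-empty result does not say whether the receipt or the extraction is at fault. It only marks the receipt for review.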

🚀 Extend the Demo

Batch scanning: process a folder of receipts and generate an expense summary.
Budget tracking: aggregate extracted totals by category over time.
Accounting integration: export JSON to QuickBooks, Xero, or FreshBooks.
Multi-currency: extract and convert currency for international receipts.
OCR enhancement: combine with LM-Kit OCR for degraded or low-quality scans.
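
As a starting point for the batch scanning and budget tracking ideas above, the aggregation step could look like the following sketch. It assumes each receipt has already been reduced to its extracted Expense Category and Total values. ExpenseSummary is a hypothetical helper, not an LM-Kit.NET type, and the "Uncategorized" fallback label is an illustrative choice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical aggregation helper; it is not part of LM-Kit.NET.
public static class ExpenseSummary
{
    // Sums receipt totals per expense category.
    // Category names are trimmed and compared case-insensitively, because a model
    // may return "Groceries" for one receipt and "groceries" for another.
    public static Dictionary<string, decimal> TotalsByCategory(
        IEnumerable<(string Category, decimal Total)> receipts)
    {
        return receipts
            .GroupBy(
                r => string.IsNullOrWhiteSpace(r.Category) ? "Uncategorized" : r.Category.Trim(),
                StringComparer.OrdinalIgnoreCase)
            .ToDictionary(
                g => g.Key,
                g => g.Sum(r => r.Total),
                StringComparer.OrdinalIgnoreCase);
    }
}
```

The returned dictionary also uses case-insensitive keys, so a lookup for "travel" finds totals stored under "Travel".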

📚 Additional Resources

How-To: Extract Structured Data: General guide to defining extraction schemas with nested arrays for line items.
How-To: Extract Invoice Data from Documents: Similar VLM-powered extraction workflow applied to invoices.
Glossary: Structured Data Extraction: Explains schema-driven extraction concepts used for receipt parsing.
Glossary: Vision Language Models (VLM): Covers the vision-language model capabilities used in this demo.
Invoice Data Extraction Demo: Companion demo for extracting structured data from invoice documents.
