Preprocess Images for Vision Pipelines

Before feeding images to a Vision Language Model (VLM) or OCR engine, preprocessing can dramatically improve results. Scanned documents often have skewed alignment, excess borders, or blank pages that waste context tokens and reduce accuracy. LM-Kit.NET's ImageBuffer class provides auto-crop, deskew, background detection, and blank page detection, all running locally with zero external dependencies. This tutorial builds a document preprocessing pipeline that cleans images before vision analysis.

Why Image Preprocessing Matters

Two real-world problems that image preprocessing solves:

Scanned documents with skewed pages. A scanner that feeds pages at a slight angle produces images where text lines are tilted. VLMs and OCR handle horizontal text best. Deskewing corrects the rotation automatically.
Wasted context on borders and blank pages. Scanned images often include wide white margins or entirely blank pages. Auto-cropping removes unused borders, and blank detection skips empty pages. This reduces the image data sent to the model, improving speed and accuracy.

Prerequisites

Requirement	Minimum
.NET SDK	8.0+
RAM	4 GB
Input formats	PNG, JPEG, BMP, TIFF, WebP

No GPU is required for image preprocessing. A GPU is only needed if you follow up with VLM inference.

Step 1: Create the Project

dotnet new console -n ImagePreprocessing
cd ImagePreprocessing
dotnet add package LM-Kit.NET

Step 2: Understand the Preprocessing Pipeline

┌──────────────┐
│  Load Image  │
│  (LoadAsRGB) │
└──────┬───────┘
       │
       ▼
┌──────────────┐     ┌──────────────┐
│  Is Blank?   │─Yes─│  Skip Page   │
└──────┬───────┘     └──────────────┘
       │ No
       ▼
┌──────────────┐
│  Detect      │
│  Background  │
└──────┬───────┘
       │
       ▼
┌──────────────┐
│  Auto-Crop   │
│  (borders)   │
└──────┬───────┘
       │
       ▼
┌──────────────┐
│  Deskew      │
│  (rotation)  │
└──────┬───────┘
       │
       ▼
┌──────────────┐
│  Save / Send │
│  to VLM      │
└──────────────┘

Method	Purpose
`ImageBuffer.LoadAsRGB(path)`	Load an image from disk as 24-bit RGB
`IsBlank(tolerance)`	Check if the image is uniform (blank page)
`TryDetectBorderBackgroundColor(out color, minAccuracy)`	Detect the border background color
`CropAuto(margin, tolerance)`	Remove uniform borders
`Deskew(parameters)`	Correct page rotation

Step 3: Write the Program

using System.Text;
using LMKit.Graphics.Primitives;
using LMKit.Media.Image;

Console.InputEncoding = Encoding.UTF8;
Console.OutputEncoding = Encoding.UTF8;

// ──────────────────────────────────────
// 1. Define the input and output paths
// ──────────────────────────────────────
string inputFolder = "scanned_pages";
string outputFolder = "preprocessed";

if (!Directory.Exists(inputFolder))
{
    Console.WriteLine($"Create a '{inputFolder}' folder with scanned images, then run again.");
    return;
}

Directory.CreateDirectory(outputFolder);

// Accept every input format listed in the prerequisites
string[] extensions = { ".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff", ".webp" };

string[] imageFiles = Directory.GetFiles(inputFolder)
    .Where(f => extensions.Contains(Path.GetExtension(f), StringComparer.OrdinalIgnoreCase))
    .ToArray();

Console.WriteLine($"Found {imageFiles.Length} image(s) in '{inputFolder}'\n");

int processedCount = 0;
int skippedBlank = 0;

// ──────────────────────────────────────
// 2. Process each image
// ──────────────────────────────────────
foreach (string filePath in imageFiles)
{
    string fileName = Path.GetFileName(filePath);
    Console.ForegroundColor = ConsoleColor.White;
    Console.WriteLine($"Processing: {fileName}");
    Console.ResetColor();

    // Load as RGB24
    using ImageBuffer original = ImageBuffer.LoadAsRGB(filePath);
    Console.WriteLine($"  Loaded: {original.Width}x{original.Height} ({original.Format})");

    // ──────────────────────────────────────
    // 2a. Blank page detection
    // ──────────────────────────────────────
    if (original.IsBlank(tolerance: 10))
    {
        Console.ForegroundColor = ConsoleColor.DarkGray;
        Console.WriteLine("  Blank page detected. Skipping.\n");
        Console.ResetColor();
        skippedBlank++;
        continue;
    }

    // ──────────────────────────────────────
    // 2b. Background color detection
    // ──────────────────────────────────────
    if (original.TryDetectBorderBackgroundColor(out Color32 bgColor, minAccuracy: 0.8f))
    {
        Console.WriteLine($"  Background color: RGB({bgColor.R}, {bgColor.G}, {bgColor.B})");
    }
    else
    {
        Console.WriteLine("  No uniform background detected.");
    }

    // Start with the original; each step may produce a new buffer
    ImageBuffer current = original;
    bool ownsBuffer = false;

    // ──────────────────────────────────────
    // 2c. Auto-crop to remove borders
    // ──────────────────────────────────────
    ImageBuffer cropped = current.CropAuto(margin: 5, tolerance: 15);

    if (cropped != null)
    {
        int widthDiff = current.Width - cropped.Width;
        int heightDiff = current.Height - cropped.Height;
        Console.WriteLine($"  Cropped: {current.Width}x{current.Height} -> " +
                          $"{cropped.Width}x{cropped.Height} " +
                          $"(removed {widthDiff}px width, {heightDiff}px height)");

        if (ownsBuffer) current.Dispose();
        current = cropped;
        ownsBuffer = true;
    }
    else
    {
        Console.WriteLine("  No borders to crop.");
    }

    // ──────────────────────────────────────
    // 2d. Deskew (correct rotation)
    // ──────────────────────────────────────
    var deskewParams = new ImageBuffer.DeskewParameters
    {
        MaxAngle = 15f,       // search up to 15 degrees
        MinAngle = 0.3f,      // ignore angles below 0.3 degrees
        Interpolate = true    // bilinear interpolation for quality
    };

    ImageBuffer.DeskewResult deskewResult = current.Deskew(deskewParams);

    if (deskewResult.Image != null)
    {
        Console.WriteLine($"  Deskewed: {deskewResult.Angle:F2} degrees corrected");

        if (ownsBuffer) current.Dispose();
        current = deskewResult.Image;
        ownsBuffer = true;
    }
    else
    {
        Console.WriteLine($"  Skew angle: {deskewResult.Angle:F2} degrees (below threshold, no correction)");
    }

    // ──────────────────────────────────────
    // 2e. Save the preprocessed image
    // ──────────────────────────────────────
    // ChangeExtension swaps only the trailing extension and also handles files without one
    string outputPath = Path.Combine(outputFolder,
        Path.ChangeExtension(fileName, ".png"));
    current.SaveAsPng(outputPath, compressionLevel: 6);
    Console.ForegroundColor = ConsoleColor.Green;
    Console.WriteLine($"  Saved: {outputPath} ({current.Width}x{current.Height})\n");
    Console.ResetColor();

    if (ownsBuffer) current.Dispose();
    processedCount++;
}

// ──────────────────────────────────────
// 3. Summary
// ──────────────────────────────────────
Console.WriteLine("=== Summary ===");
Console.WriteLine($"  Processed: {processedCount}");
Console.WriteLine($"  Blank pages skipped: {skippedBlank}");
Console.WriteLine($"  Output folder: {Path.GetFullPath(outputFolder)}");

Step 4: Run the Pipeline

Create a scanned_pages/ folder with some scanned document images, then:

dotnet run

Expected output:

Found 3 image(s) in 'scanned_pages'

Processing: page1.png
  Loaded: 2480x3508 (RGB24)
  Background color: RGB(255, 255, 255)
  Cropped: 2480x3508 -> 2300x3400 (removed 180px width, 108px height)
  Deskewed: 1.35 degrees corrected
  Saved: preprocessed\page1.png (2300x3400)

Processing: page2.png
  Loaded: 2480x3508 (RGB24)
  Blank page detected. Skipping.

Processing: page3.jpg
  Loaded: 1240x1754 (RGB24)
  Background color: RGB(248, 248, 245)
  No borders to crop.
  Skew angle: 0.12 degrees (below threshold, no correction)
  Saved: preprocessed\page3.png (1240x1754)

=== Summary ===
  Processed: 2
  Blank pages skipped: 1
  Output folder: C:\projects\ImagePreprocessing\preprocessed

Method Reference

IsBlank

Checks if all pixels are within tolerance of the first pixel:

using LMKit.Media.Image;

// Any scanned page works here; no model is needed for preprocessing
using ImageBuffer image = ImageBuffer.LoadAsRGB("scan.png");

bool exactMatch = image.IsBlank(tolerance: 0);    // exact match only
bool nearBlank = image.IsBlank(tolerance: 10);    // allow slight variation (noise)
bool lenient = image.IsBlank(tolerance: 30);      // allow noticeable variation

Returns false immediately on the first pixel that exceeds the tolerance, so the check is O(1) in the best case and O(n) in the worst case, where n is the number of pixels.
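
The comparison rule can be illustrated without the library. The following is a minimal sketch on a raw 8-bit grayscale buffer, not LM-Kit.NET's implementation: `IsUniform` is a hypothetical helper, and treating an empty buffer as blank is an assumption made for the example.

```csharp
using System;

// Hypothetical helper, not part of LM-Kit.NET: compares every pixel of a raw
// 8-bit grayscale buffer with the first pixel.
static bool IsUniform(ReadOnlySpan<byte> pixels, int tolerance)
{
    if (pixels.IsEmpty) return true;  // assumption: an empty buffer counts as blank

    byte reference = pixels[0];
    foreach (byte p in pixels)
    {
        // Early exit: one out-of-tolerance pixel is enough to rule out a blank page
        if (Math.Abs(p - reference) > tolerance) return false;
    }
    return true;  // reached only after every pixel has been checked
}

byte[] noisyBlank = { 250, 252, 249, 251, 250 };  // scanner noise on white paper
byte[] withText = { 250, 252, 30, 251, 250 };     // one dark "ink" pixel

Console.WriteLine(IsUniform(noisyBlank, tolerance: 0));   // False: 252 differs from 250
Console.WriteLine(IsUniform(noisyBlank, tolerance: 10));  // True
Console.WriteLine(IsUniform(withText, tolerance: 10));    // False
```

The early return is what makes a page with content cheap to reject, while a truly blank page costs a full scan.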

TryDetectBorderBackgroundColor

Scans the image perimeter to find the dominant border color:

using LMKit.Graphics.Primitives;
using LMKit.Media.Image;

// Any scanned page works here; no model is needed for preprocessing
using ImageBuffer image = ImageBuffer.LoadAsRGB("scan.png");

if (image.TryDetectBorderBackgroundColor(out Color32 bg, minAccuracy: 0.85f))
{
    // bg contains the detected background color
    byte luminance = bg.GetLuminance();  // Rec. 601 luminance
}

The minAccuracy parameter (0.0 to 1.0) sets the minimum fraction of border pixels that must share the same color. Lower values accept noisier borders.
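
To make the role of the accuracy threshold concrete, here is a small sketch that works on a plain grayscale array. It is an illustration under stated assumptions, not the library's algorithm: `TryDominantBorderValue` is a hypothetical helper, it looks only at the one-pixel-wide perimeter, and it requires exact value matches.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not LM-Kit.NET code: finds the most frequent value on the
// perimeter and accepts it only if enough border pixels share it.
static bool TryDominantBorderValue(byte[,] image, double minAccuracy, out byte value)
{
    int h = image.GetLength(0), w = image.GetLength(1);
    var border = new List<byte>();

    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            if (y == 0 || y == h - 1 || x == 0 || x == w - 1)
                border.Add(image[y, x]);

    value = 0;
    if (border.Count == 0) return false;

    // Most frequent border value and the share of border pixels that carry it
    var dominant = border.GroupBy(v => v).OrderByDescending(g => g.Count()).First();
    value = dominant.Key;
    return (double)dominant.Count() / border.Count >= minAccuracy;
}

// 12 border pixels: 10 are white (255), 2 are light gray (200)
byte[,] scan =
{
    { 255, 255, 255, 255 },
    { 255,   0,   0, 200 },
    { 255,   0,   0, 200 },
    { 255, 255, 255, 255 },
};

Console.WriteLine(TryDominantBorderValue(scan, 0.80, out byte loose));   // True: 10 of 12 agree
Console.WriteLine(TryDominantBorderValue(scan, 0.85, out byte strict));  // False: 10/12 is about 0.83
```

The same border passes at 0.80 and fails at 0.85, which is the trade-off the parameter controls.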

CropAuto

Removes uniform-color borders from all four edges:

using LMKit.Media.Image;

// Any scanned page works here; no model is needed for preprocessing
using ImageBuffer image = ImageBuffer.LoadAsRGB("scan.png");

// Basic crop (exact color match)
ImageBuffer exactCrop = image.CropAuto();

// With margin and tolerance
ImageBuffer tolerantCrop = image.CropAuto(
    margin: 10,       // keep 10px extra on each side
    tolerance: 20     // per-channel color tolerance (0-255)
);

// Either result can be null; dispose non-null results when done
exactCrop?.Dispose();
tolerantCrop?.Dispose();

Returns null when no border is detected or the entire image is one color. Supports GRAY8, RGB24, and RGBA32 formats. For RGBA32, fully transparent pixels (alpha = 0) are treated as border.
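
How margin and tolerance interact can be sketched on a plain grayscale array. This is an illustrative approximation, not the library's implementation: `FindCropBox` is a hypothetical helper, it assumes a non-empty image whose top-left pixel is background, and it returns only the crop rectangle rather than a new image.

```csharp
using System;

// Hypothetical helper, not LM-Kit.NET code: returns the crop rectangle for a
// grayscale image, or null when there is nothing to crop.
static (int X, int Y, int Width, int Height)? FindCropBox(byte[,] image, int margin, int tolerance)
{
    int h = image.GetLength(0), w = image.GetLength(1);
    byte background = image[0, 0];  // assumption: the top-left pixel is background
    int minX = w, minY = h, maxX = -1, maxY = -1;

    // Bounding box of every pixel that differs from the background by more than the tolerance
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            if (Math.Abs(image[y, x] - background) > tolerance)
            {
                minX = Math.Min(minX, x); maxX = Math.Max(maxX, x);
                minY = Math.Min(minY, y); maxY = Math.Max(maxY, y);
            }

    if (maxX < 0) return null;  // the whole image is one color

    // Grow the content box by the margin, clamped to the image bounds
    int left = Math.Max(0, minX - margin), top = Math.Max(0, minY - margin);
    int right = Math.Min(w - 1, maxX + margin), bottom = Math.Min(h - 1, maxY + margin);

    if (left == 0 && top == 0 && right == w - 1 && bottom == h - 1) return null;  // no border left to remove
    return (left, top, right - left + 1, bottom - top + 1);
}

// 8x6 white page with a 3x2 block of "ink" at columns 3-5, rows 2-3
var page = new byte[6, 8];
for (int y = 0; y < 6; y++)
    for (int x = 0; x < 8; x++)
    {
        bool ink = y >= 2 && y <= 3 && x >= 3 && x <= 5;
        page[y, x] = ink ? (byte)0 : (byte)255;
    }

var box = FindCropBox(page, margin: 1, tolerance: 15);
Console.WriteLine(box.HasValue
    ? $"X={box.Value.X}, Y={box.Value.Y}, {box.Value.Width}x{box.Value.Height}"  // X=2, Y=1, 5x4
    : "nothing to crop");
```

A larger margin keeps more of the original border around the content, and a margin that reaches every edge leaves nothing to remove, which this sketch reports as null.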

Deskew

Estimates and corrects page skew using edge detection and angular voting:

using LMKit.Media.Image;

// Any scanned page works here; no model is needed for preprocessing
using ImageBuffer image = ImageBuffer.LoadAsRGB("scan.png");

var parameters = new ImageBuffer.DeskewParameters
{
    MaxAngle = 15f,       // search range: [-15, +15] degrees
    MinAngle = 0.5f,      // ignore corrections below 0.5 degrees
    Interpolate = true    // bilinear (true) or nearest-neighbor (false)
};

ImageBuffer.DeskewResult result = image.Deskew(parameters);

if (result.Image != null)
{
    // result.Image is the corrected image
    // result.Angle is the detected skew (positive = clockwise)
    using var deskewed = result.Image;
    deskewed.SaveAsPng("corrected.png");
}

Parameter	Default	Range	Effect
`MaxAngle`	15	2 to 45	Wider range detects larger skews but takes longer
`MinAngle`	0.5	0+	Angles below this threshold are ignored
`Interpolate`	true	true/false	Bilinear gives smoother results; nearest-neighbor is faster

Format Conversions

ImageBuffer supports conversion between pixel formats:

using LMKit.Graphics.Primitives;
using LMKit.Media.Image;

using ImageBuffer rgb = ImageBuffer.LoadAsRGB("photo.jpg");

// Convert to grayscale (useful for OCR preprocessing)
using ImageBuffer gray = rgb.ConvertGRAY8();

// Convert to RGBA (useful when transparency is needed)
using ImageBuffer rgba = rgb.ConvertRGBA32();

// Resize to specific dimensions
using ImageBuffer resized = rgb.Resize(800, 600);

// Resize preserving aspect ratio (adds padding)
using ImageBuffer boxed = rgb.ResizeBox(800, 600,
    new Color32(255, 255, 255));  // white padding
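
The arithmetic behind an aspect-preserving resize with padding can be sketched as follows. This is an illustration, not LM-Kit.NET's code: `FitInBox` is a hypothetical helper, and centering the scaled image inside the box is an assumption about where the padding goes.

```csharp
using System;

// Hypothetical helper, not LM-Kit.NET code: computes the size and position of the
// scaled image inside the target box; the rest of the box is padding.
static (int Width, int Height, int OffsetX, int OffsetY) FitInBox(
    int srcWidth, int srcHeight, int boxWidth, int boxHeight)
{
    // The smaller ratio is the largest scale at which both dimensions still fit
    double scale = Math.Min((double)boxWidth / srcWidth, (double)boxHeight / srcHeight);

    int width = Math.Max(1, (int)Math.Round(srcWidth * scale));
    int height = Math.Max(1, (int)Math.Round(srcHeight * scale));

    // Assumption: the image is centered, so padding is split between the two sides
    return (width, height, (boxWidth - width) / 2, (boxHeight - height) / 2);
}

// An A4 scan at 300 DPI placed in an 800x600 box
var fit = FitInBox(2480, 3508, 800, 600);
Console.WriteLine($"{fit.Width}x{fit.Height} at ({fit.OffsetX}, {fit.OffsetY})");  // 424x600 at (188, 0)
```

For a portrait page in a landscape box, height is the limiting dimension, so the padding ends up on the left and right.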

Feeding Preprocessed Images to a VLM

After preprocessing, send the cleaned image to a Vision Language Model:

using LMKit.Data;
using LMKit.Media.Image;
using LMKit.Model;
using LMKit.TextGeneration;

using LM vlm = LM.LoadFromModelID("gemma3:4b");

// Load preprocessed image as an attachment
using ImageBuffer preprocessed = ImageBuffer.LoadAsRGB("preprocessed/page1.png");
var attachment = new Attachment(preprocessed);

// Use in a chat conversation
var chat = new SingleTurnConversation(vlm)
{
    SystemPrompt = "Extract all text from this document image.",
    MaximumCompletionTokens = 1024
};

var result = chat.Submit("Read this document.", attachments: new[] { attachment });
Console.WriteLine(result.Completion);

Common Issues

Problem	Cause	Fix
`CropAuto` returns null	Image has no uniform border	Increase `tolerance` or skip the crop step
Deskew produces artifacts	Angle too large (>30 degrees)	Use `Rotate(90)` or `Rotate(270)` first for heavily rotated pages
`IsBlank` returns false on near-blank pages	Tolerance too low	Increase tolerance to 20-30 for scanned pages with noise
Wrong background color detected	Mixed-color borders	Increase `minAccuracy` to reject noisy edges

Agent-Based Image Preprocessing

If you are building an AI agent that needs to preprocess images as part of a larger workflow, LM-Kit.NET provides built-in tools that wrap the same ImageBuffer capabilities shown above. The agent can call these tools autonomously based on natural language instructions:

using LMKit.Agents;
using LMKit.Agents.Tools.BuiltIn;

// 'model' is an LM instance loaded beforehand, for example with LM.LoadFromModelID
var agent = Agent.CreateBuilder(model)
    .WithPersona("Document Preprocessing Agent")
    .WithTools(tools =>
    {
        tools.Register(BuiltInTools.ImageDeskew);   // Correct page rotation
        tools.Register(BuiltInTools.ImageCrop);      // Remove uniform borders
        tools.Register(BuiltInTools.ImageResize);    // Resize or convert formats
    })
    .Build();

var result = await agent.RunAsync(
    "Deskew the scanned page at 'scan.png', then crop its borders, " +
    "and resize to 1200x1600 preserving aspect ratio.");

See Equip an Agent with Built-In Tools for the complete Document tools reference.

Next Steps

Analyze Images with Vision Language Models: send images to VLMs for understanding.
Convert Documents to Markdown with VLM OCR: full document-to-text pipeline.
Import and Query Documents with Vision Understanding: vision-based document indexing.
Equip an Agent with Built-In Tools: use ImageDeskew, ImageCrop, and ImageResize tools in agent workflows.
