Preprocess Images for Vision Pipelines
Before feeding images to a Vision Language Model (VLM) or OCR engine, preprocessing can dramatically improve results. Scanned documents often have skewed alignment, excess borders, or blank pages that waste context tokens and reduce accuracy. LM-Kit.NET's ImageBuffer class provides auto-crop, deskew, background detection, and blank page detection, all running locally with zero external dependencies. This tutorial builds a document preprocessing pipeline that cleans images before vision analysis.
Why Image Preprocessing Matters
Image preprocessing solves two real-world problems:
- Scanned documents with skewed pages. A scanner that feeds pages at a slight angle produces images where text lines are tilted. VLMs and OCR handle horizontal text best. Deskewing corrects the rotation automatically.
- Wasted context on borders and blank pages. Scanned images often include wide white margins or entirely blank pages. Auto-cropping removes unused borders, and blank detection skips empty pages. This reduces the image data sent to the model, improving speed and accuracy.
Prerequisites
| Requirement | Minimum |
|---|---|
| .NET SDK | 8.0+ |
| RAM | 4 GB |
| Input formats | PNG, JPEG, BMP, TIFF, WebP |
No GPU is required for image preprocessing. GPU is only needed if you follow up with VLM inference.
Step 1: Create the Project
dotnet new console -n ImagePreprocessing
cd ImagePreprocessing
dotnet add package LM-Kit.NET
Step 2: Understand the Preprocessing Pipeline
┌──────────────┐
│  Load Image  │
│ (LoadAsRGB)  │
└──────┬───────┘
       │
       ▼
┌──────────────┐     ┌──────────────┐
│  Is Blank?   │─Yes─│  Skip Page   │
└──────┬───────┘     └──────────────┘
       │ No
       ▼
┌──────────────┐
│    Detect    │
│  Background  │
└──────┬───────┘
       │
       ▼
┌──────────────┐
│  Auto-Crop   │
│  (borders)   │
└──────┬───────┘
       │
       ▼
┌──────────────┐
│    Deskew    │
│  (rotation)  │
└──────┬───────┘
       │
       ▼
┌──────────────┐
│ Save / Send  │
│    to VLM    │
└──────────────┘
| Method | Purpose |
|---|---|
| ImageBuffer.LoadAsRGB(path) | Load an image from disk as 24-bit RGB |
| IsBlank(tolerance) | Check if the image is uniform (blank page) |
| TryDetectBorderBackgroundColor(out color, minAccuracy) | Detect the border background color |
| CropAuto(margin, tolerance) | Remove uniform borders |
| Deskew(parameters) | Correct page rotation |
Step 3: Write the Program
using System.Text;
using LMKit.Graphics.Primitives;
using LMKit.Media.Image;

Console.InputEncoding = Encoding.UTF8;
Console.OutputEncoding = Encoding.UTF8;

// ──────────────────────────────────────
// 1. Define the input and output paths
// ──────────────────────────────────────
string inputFolder = "scanned_pages";
string outputFolder = "preprocessed";

if (!Directory.Exists(inputFolder))
{
    Console.WriteLine($"Create a '{inputFolder}' folder with scanned images, then run again.");
    return;
}

Directory.CreateDirectory(outputFolder);

// Only PNG, JPEG, and BMP files are picked up here; extend the filter
// if your scans use one of the other supported input formats.
string[] imageFiles = Directory.GetFiles(inputFolder, "*.*")
    .Where(f => f.EndsWith(".png", StringComparison.OrdinalIgnoreCase) ||
                f.EndsWith(".jpg", StringComparison.OrdinalIgnoreCase) ||
                f.EndsWith(".jpeg", StringComparison.OrdinalIgnoreCase) ||
                f.EndsWith(".bmp", StringComparison.OrdinalIgnoreCase))
    .ToArray();

Console.WriteLine($"Found {imageFiles.Length} image(s) in '{inputFolder}'\n");

int processedCount = 0;
int skippedBlank = 0;

// ──────────────────────────────────────
// 2. Process each image
// ──────────────────────────────────────
foreach (string filePath in imageFiles)
{
    string fileName = Path.GetFileName(filePath);
    Console.ForegroundColor = ConsoleColor.White;
    Console.WriteLine($"Processing: {fileName}");
    Console.ResetColor();

    // Load as RGB24
    using ImageBuffer original = ImageBuffer.LoadAsRGB(filePath);
    Console.WriteLine($" Loaded: {original.Width}x{original.Height} ({original.Format})");

    // ──────────────────────────────────────
    // 2a. Blank page detection
    // ──────────────────────────────────────
    if (original.IsBlank(tolerance: 10))
    {
        Console.ForegroundColor = ConsoleColor.DarkGray;
        Console.WriteLine(" Blank page detected. Skipping.\n");
        Console.ResetColor();
        skippedBlank++;
        continue;
    }

    // ──────────────────────────────────────
    // 2b. Background color detection
    // ──────────────────────────────────────
    if (original.TryDetectBorderBackgroundColor(out Color32 bgColor, minAccuracy: 0.8f))
    {
        Console.WriteLine($" Background color: RGB({bgColor.R}, {bgColor.G}, {bgColor.B})");
    }
    else
    {
        Console.WriteLine(" No uniform background detected.");
    }

    // Start with the original; each step may produce a new buffer.
    // ownsBuffer tracks whether `current` is an intermediate buffer that
    // this loop must dispose (the original is disposed by its using declaration).
    ImageBuffer current = original;
    bool ownsBuffer = false;

    // ──────────────────────────────────────
    // 2c. Auto-crop to remove borders
    // ──────────────────────────────────────
    ImageBuffer cropped = current.CropAuto(margin: 5, tolerance: 15);
    if (cropped != null)
    {
        int widthDiff = current.Width - cropped.Width;
        int heightDiff = current.Height - cropped.Height;
        Console.WriteLine($" Cropped: {current.Width}x{current.Height} -> " +
                          $"{cropped.Width}x{cropped.Height} " +
                          $"(removed {widthDiff}px width, {heightDiff}px height)");
        if (ownsBuffer) current.Dispose();
        current = cropped;
        ownsBuffer = true;
    }
    else
    {
        Console.WriteLine(" No borders to crop.");
    }

    // ──────────────────────────────────────
    // 2d. Deskew (correct rotation)
    // ──────────────────────────────────────
    var deskewParams = new ImageBuffer.DeskewParameters
    {
        MaxAngle = 15f,     // search up to 15 degrees
        MinAngle = 0.3f,    // ignore angles below 0.3 degrees
        Interpolate = true  // bilinear interpolation for quality
    };

    ImageBuffer.DeskewResult deskewResult = current.Deskew(deskewParams);
    if (deskewResult.Image != null)
    {
        Console.WriteLine($" Deskewed: {deskewResult.Angle:F2} degrees corrected");
        if (ownsBuffer) current.Dispose();
        current = deskewResult.Image;
        ownsBuffer = true;
    }
    else
    {
        Console.WriteLine($" Skew angle: {deskewResult.Angle:F2} degrees (below threshold, no correction)");
    }

    // ──────────────────────────────────────
    // 2e. Save the preprocessed image
    // ──────────────────────────────────────
    // Path.ChangeExtension swaps only the extension; a plain string Replace
    // could also alter a matching substring elsewhere in the file name.
    string outputPath = Path.Combine(outputFolder, Path.ChangeExtension(fileName, ".png"));
    current.SaveAsPng(outputPath, compressionLevel: 6);

    Console.ForegroundColor = ConsoleColor.Green;
    Console.WriteLine($" Saved: {outputPath} ({current.Width}x{current.Height})\n");
    Console.ResetColor();

    if (ownsBuffer) current.Dispose();
    processedCount++;
}

// ──────────────────────────────────────
// 3. Summary
// ──────────────────────────────────────
Console.WriteLine("=== Summary ===");
Console.WriteLine($" Processed: {processedCount}");
Console.WriteLine($" Blank pages skipped: {skippedBlank}");
Console.WriteLine($" Output folder: {Path.GetFullPath(outputFolder)}");
Step 4: Run the Pipeline
Create a scanned_pages/ folder with some scanned document images, then:
dotnet run
Expected output:
Found 3 image(s) in 'scanned_pages'
Processing: page1.png
 Loaded: 2480x3508 (RGB24)
 Background color: RGB(255, 255, 255)
 Cropped: 2480x3508 -> 2300x3400 (removed 180px width, 108px height)
 Deskewed: 1.35 degrees corrected
 Saved: preprocessed\page1.png (2300x3400)
Processing: page2.png
 Loaded: 2480x3508 (RGB24)
 Blank page detected. Skipping.
Processing: page3.jpg
 Loaded: 1240x1754 (RGB24)
 Background color: RGB(248, 248, 245)
 No borders to crop.
 Skew angle: 0.12 degrees (below threshold, no correction)
 Saved: preprocessed\page3.png (1240x1754)
=== Summary ===
 Processed: 2
 Blank pages skipped: 1
 Output folder: C:\projects\ImagePreprocessing\preprocessed
Method Reference
IsBlank
Checks whether all pixels are within tolerance of the first pixel:
bool exactBlank = image.IsBlank(tolerance: 0);  // exact match only
bool noisyBlank = image.IsBlank(tolerance: 10); // allow slight variation (noise)
bool looseBlank = image.IsBlank(tolerance: 30); // allow noticeable variation
Returns false as soon as a pixel exceeds the tolerance, so performance is O(1) in the best case (an early mismatch) and O(n) in the worst case (a truly blank page, where every pixel is checked).
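To make the early-exit behavior concrete, here is a minimal sketch of the same idea on a raw byte array. It is not LM-Kit.NET's implementation: the helper name IsUniform, the packed RGB24 layout, and the per-channel comparison are all assumptions made for illustration.

```csharp
using System;

// Hypothetical helper (not part of LM-Kit.NET): returns true when every pixel
// of a packed RGB24 buffer stays within `tolerance` of the first pixel.
// Assumption: the tolerance is applied to each channel independently.
static bool IsUniform(byte[] rgb, int tolerance)
{
    if (rgb.Length < 3) return true; // nothing to compare against

    for (int i = 3; i + 2 < rgb.Length; i += 3)
    {
        if (Math.Abs(rgb[i] - rgb[0]) > tolerance ||
            Math.Abs(rgb[i + 1] - rgb[1]) > tolerance ||
            Math.Abs(rgb[i + 2] - rgb[2]) > tolerance)
        {
            return false; // early exit: one outlier is enough to reject
        }
    }
    return true; // every pixel was visited, the O(n) case
}

// Three pixels each: a slightly noisy white page and a page with one dark pixel.
byte[] cleanPage = { 255, 255, 255, 250, 252, 255, 255, 255, 249 };
byte[] inkedPage = { 255, 255, 255, 20, 20, 20, 255, 255, 255 };

Console.WriteLine(IsUniform(cleanPage, tolerance: 10)); // True
Console.WriteLine(IsUniform(cleanPage, tolerance: 0));  // False
Console.WriteLine(IsUniform(inkedPage, tolerance: 10)); // False
```

The second call shows why a tolerance of 0 rarely suits scans: sensor noise alone is enough to make a visually blank page fail an exact match.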
TryDetectBorderBackgroundColor
Scans the image perimeter to find the dominant border color:
if (image.TryDetectBorderBackgroundColor(out Color32 bg, minAccuracy: 0.85f))
{
    // bg contains the detected background color
    byte luminance = bg.GetLuminance(); // Rec. 601 luminance
}
The minAccuracy parameter (0.0 to 1.0) sets the minimum fraction of border pixels that must share the same color. Lower values accept noisier borders.
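The role of minAccuracy can be sketched with a small perimeter vote. This is an illustrative approximation rather than the library's algorithm: the helper TryFindBorderColor, the packed 0xRRGGBB grid, and exact-match color counting are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of LM-Kit.NET): counts the colors found on the
// image perimeter and accepts the most frequent one only when its share of
// the perimeter pixels reaches minAccuracy.
static bool TryFindBorderColor(int[,] pixels, float minAccuracy, out int color)
{
    int rows = pixels.GetLength(0), cols = pixels.GetLength(1);
    var counts = new Dictionary<int, int>();
    int total = 0;

    for (int r = 0; r < rows; r++)
    {
        for (int c = 0; c < cols; c++)
        {
            bool onBorder = r == 0 || r == rows - 1 || c == 0 || c == cols - 1;
            if (!onBorder) continue;

            counts.TryGetValue(pixels[r, c], out int seen);
            counts[pixels[r, c]] = seen + 1;
            total++;
        }
    }

    color = 0;
    if (total == 0) return false;

    var dominant = counts.OrderByDescending(kv => kv.Value).First();
    color = dominant.Key;
    return (float)dominant.Value / total >= minAccuracy;
}

const int White = 0xFFFFFF, Black = 0x000000;
int[,] scan =
{
    { White, White, White, White },
    { White, Black, Black, White },
    { White, Black, Black, White },
    { White, White, White, Black }, // one stray dark pixel on the border
};

// 11 of the 12 perimeter pixels are white (about 0.92).
Console.WriteLine(TryFindBorderColor(scan, 0.85f, out int relaxed)); // True
Console.WriteLine(TryFindBorderColor(scan, 0.95f, out int strict));  // False
```

A single stray pixel is tolerated at 0.85 but rejected at 0.95, which is the trade-off the parameter controls.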
CropAuto
Removes uniform-color borders from all four edges:
// Basic crop (exact color match)
ImageBuffer croppedExact = image.CropAuto();
// With margin and tolerance
ImageBuffer croppedWithMargin = image.CropAuto(
    margin: 10,    // keep 10px extra on each side
    tolerance: 20  // per-channel color tolerance (0-255)
);
Returns null when no border is detected or the entire image is one color. Supports GRAY8, RGB24, and RGBA32 formats. For RGBA32, fully transparent pixels (alpha = 0) are treated as border.
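The interaction between margin, tolerance, and the null return can be illustrated with a bounding-box search over a grayscale grid. This is a simplified sketch under stated assumptions, not the library's code: the helper FindContentBox is hypothetical, and it assumes the top-left pixel represents the border color.

```csharp
using System;

// Hypothetical helper (not part of LM-Kit.NET): finds the bounding box of all
// pixels that differ from the top-left pixel by more than `tolerance`, grows it
// by `margin`, and clamps it to the image. Returns null when the image is a
// single color or when there is no border left to remove.
static (int X, int Y, int Width, int Height)? FindContentBox(byte[,] gray, int margin, int tolerance)
{
    int rows = gray.GetLength(0), cols = gray.GetLength(1);
    if (rows == 0 || cols == 0) return null;

    byte background = gray[0, 0]; // assumption: the corner pixel is border
    int minX = cols, minY = rows, maxX = -1, maxY = -1;

    for (int r = 0; r < rows; r++)
    {
        for (int c = 0; c < cols; c++)
        {
            if (Math.Abs(gray[r, c] - background) <= tolerance) continue;
            minX = Math.Min(minX, c); maxX = Math.Max(maxX, c);
            minY = Math.Min(minY, r); maxY = Math.Max(maxY, r);
        }
    }
    if (maxX < 0) return null; // the entire image is one color

    minX = Math.Max(0, minX - margin);
    minY = Math.Max(0, minY - margin);
    maxX = Math.Min(cols - 1, maxX + margin);
    maxY = Math.Min(rows - 1, maxY + margin);
    if (minX == 0 && minY == 0 && maxX == cols - 1 && maxY == rows - 1)
        return null; // the box covers the whole image, so nothing to crop

    return (minX, minY, maxX - minX + 1, maxY - minY + 1);
}

// 6x6 white page with a small dark mark in the middle.
var sheet = new byte[6, 6];
for (int r = 0; r < 6; r++)
    for (int c = 0; c < 6; c++)
        sheet[r, c] = 255;
sheet[2, 2] = 0; sheet[2, 3] = 0; sheet[3, 3] = 0;

Console.WriteLine(FindContentBox(sheet, margin: 1, tolerance: 15)); // (1, 1, 4, 4)
Console.WriteLine(FindContentBox(sheet, margin: 0, tolerance: 15)); // (2, 2, 2, 2)
```

Note how a margin large enough to reach every edge makes the sketch report "nothing to crop", which mirrors why callers should always handle a null result.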
Deskew
Estimates and corrects page skew using edge detection and angular voting:
var parameters = new ImageBuffer.DeskewParameters
{
    MaxAngle = 15f,     // search range: [-15, +15] degrees
    MinAngle = 0.5f,    // ignore corrections below 0.5 degrees
    Interpolate = true  // bilinear (true) or nearest-neighbor (false)
};
ImageBuffer.DeskewResult result = image.Deskew(parameters);
if (result.Image != null)
{
    // result.Image is the corrected image
    // result.Angle is the detected skew (positive = clockwise)
    using var deskewed = result.Image;
    deskewed.SaveAsPng("corrected.png");
}
| Parameter | Default | Range | Effect |
|---|---|---|---|
| MaxAngle | 15 | 2 to 45 | Wider range detects larger skews but takes longer |
| MinAngle | 0.5 | 0+ | Angles below this threshold are ignored |
| Interpolate | true | true/false | Bilinear gives smoother results; nearest-neighbor is faster |
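The difference between the two Interpolate settings comes down to how a rotated pixel, which usually lands between source pixels, is sampled. The sketch below shows the two generic sampling strategies on a tiny grayscale patch; the helper names are hypothetical and this is not the library's rotation code.

```csharp
using System;

// Hypothetical helpers (not part of LM-Kit.NET) that sample a grayscale grid
// at a fractional coordinate, as a rotation step has to do for every pixel.

// Nearest-neighbor: snap to the closest source pixel (one lookup).
static double SampleNearest(byte[,] img, double x, double y)
{
    int rows = img.GetLength(0), cols = img.GetLength(1);
    int xi = Math.Clamp((int)Math.Round(x), 0, cols - 1);
    int yi = Math.Clamp((int)Math.Round(y), 0, rows - 1);
    return img[yi, xi];
}

// Bilinear: blend the four surrounding pixels by distance (four lookups).
static double SampleBilinear(byte[,] img, double x, double y)
{
    int rows = img.GetLength(0), cols = img.GetLength(1);
    x = Math.Clamp(x, 0, cols - 1);
    y = Math.Clamp(y, 0, rows - 1);

    int x0 = (int)Math.Floor(x), y0 = (int)Math.Floor(y);
    int x1 = Math.Min(x0 + 1, cols - 1), y1 = Math.Min(y0 + 1, rows - 1);
    double fx = x - x0, fy = y - y0;

    double top = img[y0, x0] * (1 - fx) + img[y0, x1] * fx;
    double bottom = img[y1, x0] * (1 - fx) + img[y1, x1] * fx;
    return top * (1 - fy) + bottom * fy;
}

byte[,] patch = { { 0, 100 }, { 100, 200 } };

Console.WriteLine(SampleNearest(patch, 0.25, 0.25));  // 0
Console.WriteLine(SampleBilinear(patch, 0.25, 0.25)); // 50
```

Nearest-neighbor keeps hard edges but produces jagged text strokes after rotation, while bilinear smooths them at the cost of extra reads per pixel.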
Format Conversions
ImageBuffer supports conversion between pixel formats:
using ImageBuffer rgb = ImageBuffer.LoadAsRGB("photo.jpg");
// Convert to grayscale (useful for OCR preprocessing)
using ImageBuffer gray = rgb.ConvertGRAY8();
// Convert to RGBA (useful when transparency is needed)
using ImageBuffer rgba = rgb.ConvertRGBA32();
// Resize to specific dimensions
using ImageBuffer resized = rgb.Resize(800, 600);
// Resize preserving aspect ratio (adds padding)
using ImageBuffer boxed = rgb.ResizeBox(800, 600,
    new Color32(255, 255, 255)); // white padding
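The arithmetic behind an aspect-preserving resize with padding can be sketched as follows. The helper FitInBox is hypothetical, and centering the scaled image inside the box is an assumption; the source only states that ResizeBox adds padding.

```csharp
using System;

// Hypothetical helper (not part of LM-Kit.NET): computes the scaled size that
// fits inside a target box without distortion, plus the padding on the left
// and top. Assumption: the scaled image is centered in the box.
static (int Width, int Height, int PadX, int PadY) FitInBox(int srcW, int srcH, int boxW, int boxH)
{
    // The smaller ratio is the one that keeps both dimensions inside the box.
    double scale = Math.Min((double)boxW / srcW, (double)boxH / srcH);
    int scaledW = Math.Max(1, (int)Math.Round(srcW * scale));
    int scaledH = Math.Max(1, (int)Math.Round(srcH * scale));
    return (scaledW, scaledH, (boxW - scaledW) / 2, (boxH - scaledH) / 2);
}

// A portrait A4 scan in a landscape box: height is the limiting dimension.
Console.WriteLine(FitInBox(2480, 3508, 800, 600)); // (424, 600, 188, 0)

// Same aspect ratio as the box: no padding is needed.
Console.WriteLine(FitInBox(1600, 1200, 800, 600)); // (800, 600, 0, 0)
```

In the first case nearly half of the 800-pixel width is padding, so for portrait documents a portrait target box wastes far less of the image sent to the model.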
Feeding Preprocessed Images to a VLM
After preprocessing, send the cleaned image to a Vision Language Model:
using LMKit.Data;
using LMKit.Model;
using LMKit.TextGeneration;
using LM vlm = LM.LoadFromModelID("gemma3:4b");
// Load preprocessed image as an attachment
using ImageBuffer preprocessed = ImageBuffer.LoadAsRGB("preprocessed/page1.png");
var attachment = new Attachment(preprocessed);
// Use in a chat conversation
var chat = new SingleTurnConversation(vlm)
{
    SystemPrompt = "Extract all text from this document image.",
    MaximumCompletionTokens = 1024
};
var result = chat.Submit("Read this document.", attachments: new[] { attachment });
Console.WriteLine(result.Completion);
Common Issues
| Problem | Cause | Fix |
|---|---|---|
| CropAuto returns null | Image has no uniform border | Increase tolerance or skip the crop step |
| Deskew produces artifacts | Angle too large (>30 degrees) | Use Rotate(90) or Rotate(270) first for heavily rotated pages |
| IsBlank returns false on near-blank pages | Tolerance too low | Increase tolerance to 20-30 for scanned pages with noise |
| Wrong background color detected | Mixed-color borders | Increase minAccuracy to reject noisy edges |
Agent-Based Image Preprocessing
If you are building an AI agent that needs to preprocess images as part of a larger workflow, LM-Kit.NET provides built-in tools that wrap the same ImageBuffer capabilities shown above. The agent can call these tools autonomously based on natural language instructions:
using LMKit.Agents;
using LMKit.Agents.Tools.BuiltIn;
// `model` is assumed to be an already loaded LM instance.
var agent = Agent.CreateBuilder(model)
    .WithPersona("Document Preprocessing Agent")
    .WithTools(tools =>
    {
        tools.Register(BuiltInTools.ImageDeskew);  // Correct page rotation
        tools.Register(BuiltInTools.ImageCrop);    // Remove uniform borders
        tools.Register(BuiltInTools.ImageResize);  // Resize or convert formats
    })
    .Build();
var result = await agent.RunAsync(
    "Deskew the scanned page at 'scan.png', then crop its borders, " +
    "and resize to 1200x1600 preserving aspect ratio.");
See Equip an Agent with Built-In Tools for the complete Document tools reference.
Next Steps
- Analyze Images with Vision Language Models: send images to VLMs for understanding.
- Convert Documents to Markdown with VLM OCR: full document-to-text pipeline.
- Import and Query Documents with Vision Understanding: vision-based document indexing.
- Equip an Agent with Built-In Tools: use ImageDeskew, ImageCrop, and ImageResize tools in agent workflows.