👉 Try the demo: https://github.com/LM-Kit/lm-kit-net-samples/tree/main/console_net/vision/image-classification/zero_shot_image_classifier

Photo Auto-Sorter (Zero-Shot) for C# .NET Applications


🎯 Purpose of the Demo

An interactive console app that classifies images against a caller-defined category list and reorganises them on disk. It copies or moves each image into a per-category sub-folder, routes low-confidence cases to _uncertain/, and records every decision in a manifest CSV.

Built on LM-Kit.NET's Categorization engine — a deterministic classifier that returns a category index plus a calibrated Confidence, not a free-form chat completion.


👥 Industry Target Audience

  • E-commerce: auto-bucketing product photos by type.
  • Marketing operations: routing ad-creative assets by intent / channel.
  • Photo libraries / DAM: ingest auto-tagging.
  • Mailroom / forms: pre-classification before downstream OCR / extraction.
  • Personal archives: cleaning years of IMG_*.jpg.

🚀 Problem Solved

A free-form VLM prompt asks "what is this image?" and gets a sentence back. That's the wrong shape for sorting a folder. Categorization instead asks "which of these N labels matches?" and returns an index + confidence. The demo plugs that into a real folder workflow: confidence threshold → bucket → copy/move → audit CSV.
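
The "confidence threshold → bucket" step can be sketched as a small pure function. This is a minimal illustration, not the demo's actual code: the BucketRouter class, the ChooseBucket method, and the choice to treat an out-of-range index as uncertain are all assumptions made for the example. Only the _uncertain bucket name and the "below the threshold" rule come from the description above.

```csharp
using System;
using System.Collections.Generic;

public static class BucketRouter
{
    // Bucket for low-confidence images, named after the demo's _uncertain/ folder.
    public const string UncertainBucket = "_uncertain";

    // Maps a classifier verdict (category index + confidence) to a bucket name.
    // Hypothetical helper: the real demo may structure this differently.
    public static string ChooseBucket(
        IReadOnlyList<string> categories, int index, float confidence, float threshold)
    {
        // Assumption: an invalid index is handled like a low-confidence verdict.
        bool indexIsValid = index >= 0 && index < categories.Count;

        // Images strictly below the threshold go to the review bucket.
        if (!indexIsValid || confidence < threshold)
            return UncertainBucket;

        return categories[index];
    }
}
```

Keeping the decision free of file I/O makes it easy to unit-test the threshold behaviour separately from the copy/move step.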


💻 Application Overview

The app is driven by an interactive menu (no command-line arguments) that offers two modes plus a Quit option. The model is loaded once at startup.

Mode   | What it does
Live   | Pick a category list, then type image paths one at a time; immediate verdict + confidence.
Folder | Pick a category list, point at an input folder, choose copy/move + a confidence threshold, end up with a sorted folder tree and a manifest CSV.
Quit   | Exit.

Three ways to supply the category list:

Source  | Description
Default | 10 sensible categories (product photo, people photo, screenshot, document scan, chart, outdoor, indoor, food, vehicle, other).
Custom  | Type one label per line at the prompt; a blank line ends the list.
File    | Read from a text file, one label per line. Lines starting with # are treated as comments.
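
As an illustration of the File source, the following sketch parses a category file with one label per line and # comment lines. The CategoryFile class and its method names are hypothetical; trimming whitespace and skipping blank lines are assumptions, since the description above only specifies the one-label-per-line layout and the comment marker.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class CategoryFile
{
    // Parses label lines: one label per line, lines starting with '#' are comments.
    // Assumption: surrounding whitespace is trimmed and blank lines are skipped.
    public static List<string> Parse(IEnumerable<string> lines)
    {
        var labels = new List<string>();
        foreach (string raw in lines)
        {
            string line = raw.Trim();
            if (line.Length == 0 || line.StartsWith('#'))
                continue;
            labels.Add(line);
        }
        return labels;
    }

    // Convenience wrapper that reads the labels from a text file on disk.
    public static List<string> Load(string path) => Parse(File.ReadLines(path));
}
```

For example, a file containing "# product shots", "product photo", an empty line, and "screenshot" would yield the two labels "product photo" and "screenshot".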

Folder mode emits:

  • A <bucket>/... sub-folder per category under the chosen output directory.
  • A _uncertain/ bucket for images below the confidence threshold.
  • sort_manifest.csv with every decision: source, category, confidence, destination.
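
A manifest row with the four fields listed above could be produced roughly as follows. This is a sketch under assumptions: the Manifest class, the column order, the header text, and the three-decimal confidence format are illustrative choices, not the demo's confirmed output format. Fields are quoted in the usual CSV way so that paths containing commas or quotes stay in one column.

```csharp
using System;
using System.Globalization;

public static class Manifest
{
    // Assumed header; the demo's actual column names may differ.
    public const string Header = "source,category,confidence,destination";

    // Quotes a field when it contains a comma, quote, or line break,
    // doubling any embedded quotes (conventional CSV escaping).
    public static string Escape(string field)
    {
        if (field.IndexOfAny(new[] { ',', '"', '\n', '\r' }) < 0)
            return field;
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // Formats one sort decision as a CSV row.
    public static string FormatRow(
        string source, string category, float confidence, string destination)
    {
        return string.Join(",",
            Escape(source),
            Escape(category),
            // Invariant culture keeps '.' as the decimal separator on every machine.
            confidence.ToString("0.000", CultureInfo.InvariantCulture),
            Escape(destination));
    }
}
```

Formatting the confidence with the invariant culture matters here: on locales that use a decimal comma, a culture-sensitive format would otherwise split the value across two CSV columns.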

✨ Key Features

  • Categorization.GetBestCategory + .Confidence: one call per image, deterministic verdict.
  • Custom taxonomies: the category list is the contract; supply any set of labels.
  • Confidence floor: low-confidence cases get a separate bucket so a human can review.
  • Copy or move: pick the operation per session.
  • Manifest CSV: every sort decision recorded with source + destination + confidence.
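
The copy-or-move step can be sketched with standard System.IO calls. The FilePlacer class and its Place method are hypothetical names for this example. The sketch also assumes that the label is usable as a folder name as-is and that an existing destination file should raise an error instead of being overwritten; the demo may handle both cases differently.

```csharp
using System;
using System.IO;

public static class FilePlacer
{
    // Copies or moves one image into <outputRoot>/<bucket>/ and returns the
    // destination path, which can then be recorded in the manifest.
    public static string Place(string sourcePath, string outputRoot, string bucket, bool move)
    {
        string bucketDir = Path.Combine(outputRoot, bucket);
        Directory.CreateDirectory(bucketDir); // no-op when the folder already exists

        string destination = Path.Combine(bucketDir, Path.GetFileName(sourcePath));

        // Both calls throw IOException if the destination already exists,
        // so a name collision never silently overwrites an earlier file.
        if (move)
            File.Move(sourcePath, destination);
        else
            File.Copy(sourcePath, destination);

        return destination;
    }
}
```

Labels containing characters that are invalid in folder names (for example a slash) would need sanitising before being passed as the bucket.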

🧠 Supported Models

  • Google Gemma 3 VL 4B (~3 GB VRAM) — fast default.
  • Google Gemma 3 VL 12B (~8 GB VRAM).
  • Alibaba Qwen 2 VL 2B / 7B (~2 / ~6 GB VRAM).
  • MiniCPM o 4.5 9B (~5.9 GB VRAM).
  • Any custom vision-language model URI.

🛠️ Getting Started

📋 Prerequisites

  • .NET 8.0 or later
  • VRAM appropriate to the selected vision model

▶️ Running the Application

git clone https://github.com/LM-Kit/lm-kit-net-samples
cd lm-kit-net-samples/console_net/vision/image-classification/zero_shot_image_classifier
dotnet run

Pick a model, pick a mode, supply the categories, then supply image paths (Live mode) or an input folder (Folder mode).
