👉 Try the demo: https://github.com/LM-Kit/lm-kit-net-samples/tree/main/console_net/resume_parser

AI Resume Parser for C# .NET Applications


🎯 Purpose of the Demo

AI Resume Parser demonstrates how to use LM-Kit.NET with vision-language models to extract structured candidate profiles from resumes entirely on local hardware. Feed it any resume, whether a text file, PDF, Word document, or a scanned image, and get back structured JSON with contact info, work experience, education, skills, certifications, and languages.

The sample shows how to:

  • Use TextExtraction with a VLM to extract structured data from both text and image-based resumes.
  • Define extraction schemas with TextExtractionElement including nested arrays for complex structures.
  • Parse text content with SetContent(string) or load documents and images with Attachment.
  • Handle both flat fields (name, email) and nested array fields (work experience, education).

Why a Local Resume Parser with Vision-Language Models?

  • Candidate data stays private: parse sensitive resumes without uploading to cloud services.
  • Process any format: VLMs understand scanned resume images, PDFs, and text files natively.
  • One API call: a single Parse() call extracts the entire profile. No agents, no chaining.
  • Structured output: get machine-readable JSON ready for ATS or HR system integration.
  • Works offline: no API keys, no internet connection required.

👥 Who Should Use This Demo

  • HR Teams: automate resume screening without sending candidate data to third parties.
  • Recruiters: quickly extract and compare candidate profiles from stacks of resumes, including scanned copies.
  • ATS Developers: integrate local AI-powered resume parsing into applicant tracking systems.
  • Staffing Agencies: process high volumes of resumes with full data sovereignty.
  • Developers: learn the TextExtraction API with VLM models and nested array elements.

🚀 What Problem It Solves

  • Scanned resumes: extract structured data from resume images and scans using vision-language models.
  • Data privacy: parse resumes containing personal information without cloud exposure.
  • Manual data entry: eliminate copy-pasting candidate details into spreadsheets or databases.
  • Inconsistent formats: extract structured data from any resume format or layout.
  • Integration: produce JSON output ready for any HR system, database, or API.

💻 Demo Application Overview

Console app that:

  • Lets you choose from multiple vision-language models optimized for document understanding.
  • Downloads models if needed, with live progress updates.
  • Creates a TextExtraction instance with a resume-specific schema.
  • Enters an interactive loop where you can:
    • Type sample to parse a built-in senior engineer resume.
    • Enter a file path to parse your own resume (.txt, .pdf, .docx, .png, .jpg).
    • View the extracted candidate profile with color-coded output.
    • View the full JSON output for integration.
  • Loops until you type q.
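
The file-path branch of the loop has to decide how each input reaches the extractor. Below is a minimal sketch of that routing, assuming plain-text files are read and passed to SetContent(string) while every other supported format goes through Attachment. The helper name RouteFor and the two extension sets are illustrative and are not taken from the sample's source.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical extension sets, built from the formats this page lists.
var textExtensions = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { ".txt" };
var attachmentExtensions = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
{
    ".pdf", ".docx", ".png", ".jpg", ".bmp", ".tiff"
};

// Returns "text", "attachment", or "unsupported" for a given path.
// The comparison is case-insensitive, so "scan.PNG" is treated like "scan.png".
string RouteFor(string path)
{
    string ext = Path.GetExtension(path);
    if (textExtensions.Contains(ext)) return "text";             // read the file, then SetContent(string)
    if (attachmentExtensions.Contains(ext)) return "attachment"; // SetContent(new Attachment(path))
    return "unsupported";
}

foreach (string path in new[] { "resume.txt", "scan.PNG", "cv.docx", "notes.xyz" })
{
    Console.WriteLine($"{path} -> {RouteFor(path)}");
}
```

Rejecting unknown extensions up front keeps a typo in the path from reaching the model as an unreadable attachment.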

Key Features

  • Vision-Language Models: Process scanned resumes, images, and photos alongside text documents.
  • Nested Extraction: Work experience and education extracted as structured arrays.
  • Built-In Sample: Realistic senior engineer resume for instant demo.
  • File Support: Parse text files, PDF, DOCX, or image files (PNG, JPG, BMP, TIFF) via Attachment.
  • JSON Output: Machine-readable structured data alongside formatted display.
  • Zero Configuration: No API keys, no cloud accounts, no external dependencies.

🏗️ Architecture

  ┌─────────────────────────────────────────────────┐
  │     Resume (text, PDF, DOCX, PNG, JPG, ...)     │
  └───────────────────┬─────────────────────────────┘
                      │
                      ▼
  ┌─────────────────────────────────────────────────┐
  │    TextExtraction + Vision-Language Model        │
  │                                                  │
  │   Schema:                                        │
  │   ├── Full Name (String)                         │
  │   ├── Email (String)                             │
  │   ├── Phone (String)                             │
  │   ├── Location (String)                          │
  │   ├── Professional Summary (String)              │
  │   ├── Work Experience (Array)                    │
  │   │   ├── Company, Job Title, Period             │
  │   │   └── Key Achievements                       │
  │   ├── Education (Array)                          │
  │   │   └── Institution, Degree, Year              │
  │   ├── Skills (String)                            │
  │   ├── Certifications (String)                    │
  │   └── Languages (String)                         │
  │                                                  │
  └───────────────────┬─────────────────────────────┘
                      │
                      ▼
  ┌─────────────────────────────────────────────────┐
  │         Structured Candidate Profile             │
  │         (formatted display + JSON)               │
  └─────────────────────────────────────────────────┘

Extraction Schema

  Field                  Type     Description
  ---------------------  -------  ------------------------------------
  Full Name              String   Candidate's full name
  Email                  String   Email address
  Phone                  String   Phone number
  Location               String   City, state, or country
  Professional Summary   String   Career objective or summary
  Work Experience        Array    Company, title, period, achievements
  Education              Array    Institution, degree, year
  Skills                 String   Technical and professional skills
  Certifications         String   Professional certifications
  Languages              String   Languages with proficiency levels
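
To illustrate how the flat and array fields differ once the result is serialized, here is a small sketch that reads both kinds with System.Text.Json. It assumes the JSON is an object keyed by the schema's field names, with array fields holding one object per entry. The sample JSON is hand-written for illustration and is not real output from the demo, so the actual shape may differ.

```csharp
using System;
using System.Text.Json;

// ASSUMPTION: the extraction JSON is an object keyed by the schema's field names.
// This hand-written sample stands in for real output.
string json = """
{
  "Full Name": "Jane Doe",
  "Email": "jane.doe@example.com",
  "Work Experience": [
    { "Company": "Acme", "Job Title": "Senior Engineer", "Period": "2019-2024", "Key Achievements": "Led platform migration." },
    { "Company": "Initech", "Job Title": "Engineer", "Period": "2015-2019", "Key Achievements": "Built billing API." }
  ]
}
""";

using JsonDocument doc = JsonDocument.Parse(json);
JsonElement root = doc.RootElement;

// Flat fields are plain string properties on the root object.
string fullName = root.GetProperty("Full Name").GetString() ?? "";
Console.WriteLine($"Name: {fullName}");

// Array fields hold one object per entry, each with its own sub-fields.
int jobCount = 0;
foreach (JsonElement job in root.GetProperty("Work Experience").EnumerateArray())
{
    string company = job.GetProperty("Company").GetString() ?? "";
    string title = job.GetProperty("Job Title").GetString() ?? "";
    Console.WriteLine($"  {title} at {company}");
    jobCount++;
}
```

GetProperty throws when a key is absent, so code that handles real resumes would more likely use TryGetProperty for optional fields.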

Code Highlights

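using System;
using System.Collections.Generic;
using LMKit.Data;        // Attachment (assumed namespace)
using LMKit.Extraction;  // TextExtraction, TextExtractionElement, ElementType
using LMKit.Model;       // LM
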
// Load a vision-language model for document understanding
LM model = LM.LoadFromModelID("qwen3.5:9b");

var textExtraction = new TextExtraction(model);
textExtraction.Elements = new List<TextExtractionElement>
{
    new("Full Name", ElementType.String, "Full name of the candidate."),
    new("Email", ElementType.String, "Email address."),
    new("Work Experience",
        new List<TextExtractionElement>
        {
            new("Company", ElementType.String, "Company name."),
            new("Job Title", ElementType.String, "Job title or role."),
            new("Period", ElementType.String, "Employment period."),
            new("Key Achievements", ElementType.String, "Key achievements.")
        },
        isArray: true,
        "Work experience entries."
    ),
    // ... more elements
};

// Works with text, PDF, DOCX, and image files
textExtraction.SetContent(new Attachment("resume.pdf"));
var result = textExtraction.Parse();

Console.WriteLine(result.Json);

⚙️ Getting Started

Prerequisites

  • .NET 8.0 or later
  • Sufficient VRAM for the selected model (roughly 2 to 18 GB depending on model choice; see Models)

Download

git clone https://github.com/LM-Kit/lm-kit-net-samples
cd lm-kit-net-samples/console_net/resume_parser

Run

dotnet build
dotnet run

Then:

  1. Select a vision-language model from the menu, or paste a custom model URI.
  2. Wait for the model to download (first run) and load.
  3. Type sample to try the built-in resume, or enter a file path (text, PDF, DOCX, or image).
  4. View the extracted candidate profile and JSON output.
  5. Type q to exit.

Models

  Option  Model                                Approx. VRAM
  ------  -----------------------------------  ------------
  0       Z.ai GLM-V 4.6 Flash 10B             ~7 GB
  1       MiniCPM o 4.5 9B                     ~5.9 GB
  2       Alibaba Qwen 3.5 2B                  ~2 GB
  3       Alibaba Qwen 3.5 4B                  ~3.5 GB
  4       Alibaba Qwen 3.5 9B (Recommended)    ~7 GB
  5       Google Gemma 3 4B                    ~5.7 GB
  6       Google Gemma 3 12B                   ~11 GB
  7       Alibaba Qwen 3.5 27B                 ~18 GB
  8       Mistral Ministral 3 8B               ~6.5 GB
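
If you want to narrow the menu to what your GPU can hold, the figures above can be filtered with a few lines of code. The sketch below copies the approximate VRAM numbers from the table. The helper ModelsThatFit is hypothetical, and it treats the table values as hard limits even though real usage also depends on context size and other running processes.

```csharp
using System;
using System.Linq;

// Approximate VRAM figures (GB) copied from the Models table.
var models = new (string Name, double VramGb)[]
{
    ("Z.ai GLM-V 4.6 Flash 10B", 7.0),
    ("MiniCPM o 4.5 9B", 5.9),
    ("Alibaba Qwen 3.5 2B", 2.0),
    ("Alibaba Qwen 3.5 4B", 3.5),
    ("Alibaba Qwen 3.5 9B", 7.0),
    ("Google Gemma 3 4B", 5.7),
    ("Google Gemma 3 12B", 11.0),
    ("Alibaba Qwen 3.5 27B", 18.0),
    ("Mistral Ministral 3 8B", 6.5),
};

// Hypothetical helper: names of models whose approximate footprint fits the budget,
// ordered from largest footprint to smallest.
string[] ModelsThatFit(double availableGb) =>
    models.Where(m => m.VramGb <= availableGb)
          .OrderByDescending(m => m.VramGb)
          .Select(m => m.Name)
          .ToArray();

foreach (string name in ModelsThatFit(8.0))
{
    Console.WriteLine(name);
}
```

Footprint is not a quality ranking, so the list only tells you which options are worth trying on a given card.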

🔧 Troubleshooting

  • Incomplete extraction

    • Try a larger model (8B+) for better accuracy on complex resume layouts.
    • Ensure the resume content is complete and not truncated.
  • Scanned resume images

    • Use a vision-capable model (Qwen 3.5, GLM-V, or Gemma 3) for best results with images.
    • Ensure the image is clear, well-lit, and at sufficient resolution.
  • Missing fields

    • The model returns empty values for fields not present in the resume.
    • Verify the resume contains the expected information.
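
To see at a glance which fields came back empty, you can scan the JSON output before displaying it. This is a sketch under assumptions: it treats a missing key, a null, a blank string, or an empty array as "empty", and the sample JSON is hand-written rather than real demo output. How the demo actually encodes empty values may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// ASSUMPTION: output is an object keyed by field name; this sample is hand-written.
string json = """
{ "Full Name": "Jane Doe", "Email": "", "Phone": null, "Skills": "C#, SQL", "Education": [] }
""";

// Fields we expect to find; a subset of the extraction schema for brevity.
string[] expected = { "Full Name", "Email", "Phone", "Skills", "Education", "Languages" };

var missing = new List<string>();
using JsonDocument doc = JsonDocument.Parse(json);
foreach (string field in expected)
{
    bool present = doc.RootElement.TryGetProperty(field, out JsonElement value);
    // A field counts as empty when absent, null, a blank string, or an empty array.
    bool empty = !present || (value.ValueKind switch
    {
        JsonValueKind.Null => true,
        JsonValueKind.String => string.IsNullOrWhiteSpace(value.GetString()),
        JsonValueKind.Array => value.GetArrayLength() == 0,
        _ => false
    });
    if (empty) missing.Add(field);
}

Console.WriteLine("Missing or empty: " + string.Join(", ", missing));
```

A report like this helps separate "the resume never mentioned it" from "the model missed it", which is the check the bullets above ask you to make by hand.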

🚀 Extend the Demo

  • Batch processing: scan a folder of resumes and generate a comparison spreadsheet.
  • Scoring: add extraction elements for years of experience or skill match percentage.
  • Database integration: pipe JSON output into an ATS or candidate database.
  • Multi-language: test with resumes in different languages using multilingual VLMs.
  • OCR enhancement: combine with Tesseract OCR for degraded or low-quality scans.
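
As a starting point for the batch-processing idea, the sketch below turns a list of resume paths into CSV rows. The parseResume delegate is a hypothetical stand-in that returns canned JSON. In a real extension it would wrap the extraction call and return its JSON string. The column list and the assumed JSON shape (an object keyed by field name) are illustrative as well.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.Json;

// HYPOTHETICAL stand-in for the real extraction step: returns canned JSON per file.
Func<string, string> parseResume = path => JsonSerializer.Serialize(new Dictionary<string, string>
{
    ["Full Name"] = Path.GetFileNameWithoutExtension(path),
    ["Email"] = "unknown@example.com",
});

// Quote a CSV cell and double any embedded quotes.
string Csv(string value) => "\"" + value.Replace("\"", "\"\"") + "\"";

// Columns to export; any flat field from the schema could be added here.
string[] columns = { "Full Name", "Email" };

// Builds one header line plus one line per resume.
string BuildCsv(IEnumerable<string> resumePaths)
{
    var lines = new List<string> { string.Join(",", columns.Select(Csv)) };
    foreach (string path in resumePaths)
    {
        using JsonDocument doc = JsonDocument.Parse(parseResume(path));
        // Missing or non-string fields become empty cells.
        var cells = columns.Select(c =>
            doc.RootElement.TryGetProperty(c, out JsonElement v) && v.ValueKind == JsonValueKind.String
                ? v.GetString() ?? ""
                : "");
        lines.Add(string.Join(",", cells.Select(Csv)));
    }
    return string.Join(Environment.NewLine, lines);
}

// In a real run the paths would come from something like Directory.EnumerateFiles(folder).
Console.WriteLine(BuildCsv(new[] { "alice.pdf", "bob.png" }));
```

Keeping the extraction step behind a delegate lets the CSV logic be tested without loading a model.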

📚 Additional Resources
