Local AI Agent Platform for .NET Developers
Your AI. Your Data. On Your Device.
LM-Kit.NET is the complete local AI stack for .NET: high-performance inference, multi-agent orchestration, document intelligence, RAG pipelines, and production-ready tooling in a single NuGet package. Everything runs in-process, with no cloud services and no external dependencies, giving you full control over data, latency, and cost from C# or VB.NET.
🔒 100% Local ⚡ No Signup 🌐 Cross-Platform
Why LM-Kit.NET
Add AI to any .NET app in minutes. Install one NuGet package and start building. No Python runtimes, no containers, no external services, no dependencies to manage. LM-Kit.NET fits into your existing architecture and deployment pipeline.
Built by experts, updated continuously. Our team ships the latest advances in generative AI, symbolic AI, and NLP research directly into the SDK. Check our changelog to see the pace of innovation.
From prompts to production agents. Multi-agent orchestration, resilience policies, and comprehensive observability let you ship reliable AI workflows, not just prototypes.
What You Can Build
Agents and Automation
- Autonomous AI agents that reason, plan, and execute multi-step tasks using a growing catalog of built-in tools or your custom APIs
- Multi-agent systems with pipeline, parallel, router, and supervisor orchestration patterns
- Research assistants that search the web, analyze results, and synthesize findings using ReAct planning
- Task automation workflows with agent delegation, resilience policies, and comprehensive observability
Document and Knowledge Workflows
- RAG-powered knowledge assistants over local documents, databases, and enterprise data sources
- PDF chat and document Q&A with retrieval, reranking, and grounded generation
- OCR and extraction pipelines for invoices, forms, IDs, emails, and scanned documents
- Intelligent document splitting that detects logical boundaries in multi-page PDFs using vision models
Multimodal and Compliance
- Voice-driven assistants with speech-to-text, reasoning, and function calling
- Compliance-focused text intelligence with PII extraction, NER, classification, and sentiment analysis
Core Capabilities
🤖 AI Agents and Orchestration
Build autonomous AI agents that reason, plan, and execute complex workflows within your applications.
- Agent Framework - Complete agent infrastructure with `Agent`, `AgentBuilder`, `AgentExecutor`, and `AgentRegistry`
- Multi-Agent Orchestration - Coordinate multiple agents with `PipelineOrchestrator`, `ParallelOrchestrator`, `RouterOrchestrator`, and `SupervisorOrchestrator`
- Planning Strategies - ReAct, Chain-of-Thought, Tree-of-Thought, Plan-and-Execute, and Reflection
- Agent-to-Agent Delegation - Delegate tasks to specialized sub-agents with `DelegationManager` and `DelegateTool`
- Agent Templates - Pre-built templates including Chat, Assistant, Code, Research, Analyst, Planner, and more
- Extensive Built-in Tools - A growing catalog of ready-to-use tools across eight categories (Data, Text, Numeric, Security, Utility, Document, IO, Net), each following the 1 tool = 1 feature atomic design
- MCP Client Support - Connect to Model Context Protocol servers for extended tool access, resources, and prompts
- Function Calling - Let models dynamically invoke your application's methods with structured parameters
- Resilience Policies - Retry, Circuit Breaker, Timeout, Rate Limit, Bulkhead, and Fallback
- Streaming Support - Real-time response streaming with buffered, multicast, and delegate handlers
- Agent Observability - Full tracing and metrics with `AgentTracer`, `AgentMetrics`, and JSON export
- Agent Memory - Persistent memory across conversation sessions with RAG-based recall
- Reasoning Control - Adjust reasoning depth for models that support extended thinking
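As an illustration, two of the agents above could be chained with the pipeline orchestrator. This is a minimal sketch: the agent builder pattern comes from the Getting Started section below, but the `PipelineOrchestrator` constructor and `ExecuteAsync` call shown here are assumptions about the orchestration API, not verified signatures.

```csharp
using LMKit.Model;
using LMKit.Agents;

// Load one shared model for both agents
var model = new LM("path/to/model.gguf");

// Two single-purpose agents, built with the builder pattern
// shown in the Getting Started section
var summarizer = Agent.CreateBuilder(model)
    .WithSystemPrompt("Summarize the input in three sentences.")
    .Build();

var translator = Agent.CreateBuilder(model)
    .WithSystemPrompt("Translate the input into French.")
    .Build();

// Hypothetical pipeline wiring: run the agents sequentially,
// feeding each agent's output into the next
var pipeline = new PipelineOrchestrator(summarizer, translator);
var result = await pipeline.ExecuteAsync(longArticleText);
Console.WriteLine(result.Response);
```

The same two agents could instead be handed to a `RouterOrchestrator` or `SupervisorOrchestrator` when the routing decision should be made at runtime rather than fixed in advance.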
📄 Document Intelligence
Process, extract, and transform documents across PDF, DOCX, XLSX, PPTX, EML, MBOX, HTML, and image formats.
- VLM-Powered OCR - High-accuracy text extraction from images and scanned content using vision language models
- Structured Extraction - Define extraction targets with JSON schemas, custom elements, and pattern constraints
- Confidence Scoring and Validation - Per-element confidence scores, entity auto-detection, format validation, and human verification flags
- Named Entity Recognition (NER) - Extract people, organizations, locations, and custom entity types
- PII Detection - Identify and classify personal identifiers for privacy compliance
- Document Splitting - Detect logical document boundaries in multi-page files using vision-based analysis
- PDF Manipulation - Split, merge, search, extract pages, render to image, and unlock password-protected files
- Format Conversion - Convert between Markdown, HTML, and DOCX
- Layout-Aware Processing - Detect paragraphs and lines, support region-based workflows
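A structured invoice-extraction pass built on these capabilities might look roughly like the sketch below. The type and member names used here (`TextExtraction`, `TextExtractionElement`, `Parse`, and the `LMKit.Extraction` namespace) are illustrative assumptions about the extraction API, not confirmed signatures; the confidence-scoring idea comes from the feature list above.

```csharp
using LMKit.Model;
using LMKit.Extraction; // namespace name is an assumption

var model = new LM("path/to/vision-model.gguf");

// Hypothetical extraction definition: each element is one field
// to pull from the document, with an expected type
var extraction = new TextExtraction(model);
extraction.Elements.Add(new TextExtractionElement("InvoiceNumber", ElementType.String));
extraction.Elements.Add(new TextExtractionElement("TotalAmount", ElementType.Currency));
extraction.Elements.Add(new TextExtractionElement("DueDate", ElementType.Date));

// Run extraction over a scanned invoice; per-element confidence
// scores let you route low-confidence fields to human review
var result = extraction.Parse("invoices/inv-0042.pdf");
foreach (var item in result.Items)
    Console.WriteLine($"{item.Name}: {item.Value} (confidence {item.Confidence:P0})");
```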
📚 Retrieval-Augmented Generation (RAG)
Ground AI responses in your organization's knowledge with a flexible, extensible RAG framework.
- Modular RAG Architecture - Use built-in pipelines or implement custom retrieval strategies
- Built-in Vector Database - Store and search embeddings without external dependencies
- PDF Chat and Document RAG - Chat and retrieve over documents with dedicated workflows
- Multimodal RAG - Retrieve relevant content from both text and images
- Advanced Chunking - Markdown-aware, HTML-aware, semantic, and layout-based chunking strategies
- Reranking - Improve retrieval precision with semantic reranking
- External Vector Store Integration - Connect to Qdrant and other vector databases
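A minimal grounded-QA loop over local files could be sketched as follows. The `RagEngine` type, its method names, and the `LMKit.Retrieval` namespace are assumptions used for illustration only; consult the API reference for the actual surface.

```csharp
using LMKit.Model;
using LMKit.Retrieval; // namespace name is an assumption

var embeddingModel = new LM("path/to/embedding-model.gguf");

// Hypothetical wiring: index local documents into the built-in
// vector store, then retrieve the best-matching chunks
var rag = new RagEngine(embeddingModel);
rag.ImportDocument("docs/employee-handbook.pdf");

var matches = rag.Search("How many vacation days do new hires get?", topK: 3);

// The retrieved chunks are then passed to a chat model as grounding
// context so the answer cites your data instead of model memory
foreach (var match in matches)
    Console.WriteLine($"{match.Score:F2}: {match.Text}");
```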
🔍 Vision, Speech, and Content Analysis
Process visual, audio, and textual content.
- Vision Language Models (VLM) - Analyze images, extract information, answer questions about visual content
- Image Embeddings - Generate semantic representations of images for similarity search
- Speech-to-Text - Transcribe audio with voice activity detection and multi-language support
- Sentiment and Emotion Analysis - Detect emotional tone from text and images
- Custom Classification - Categorize text and images into your defined classes
- Language Detection - Identify languages from text, images, or audio
- Summarization - Condense long content with configurable strategies
- Keyword Extraction - Identify key terms and phrases
✍️ Text Generation and Transformation
Generate and refine content with precise control.
- Conversational AI - Build context-aware chatbots with multi-turn memory
- Constrained Generation - Guide model outputs using JSON schemas, templates, or custom grammar rules
- Translation - Convert text between languages with confidence scoring
- Text Enhancement - Improve clarity, fix grammar, adapt tone
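For example, a translation call with confidence scoring might be shaped like this. The `TextTranslation` class, `Translate` method, and `Language` enum shown are hypothetical stand-ins for whichever translation surface the SDK exposes.

```csharp
using LMKit.Model;
using LMKit.Translation; // namespace name is an assumption

var model = new LM("path/to/model.gguf");

// Hypothetical translation API: translate a sentence and
// print the result
var translator = new TextTranslation(model);
var french = translator.Translate("The meeting is moved to Friday.", Language.French);
Console.WriteLine(french);
```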
🛠️ Model Customization
Tailor models to your specific domain.
- Fine-Tuning - Train models on your data with LoRA support
- Dynamic LoRA Loading - Switch adapters at runtime without reloading base models
- Quantization - Optimize models for your deployment constraints
- Training Dataset Tools - Prepare and export datasets in standard formats
Supported Models
LM-Kit.NET ships domain-tuned models optimized for real-world tasks and supports a wide range of open models across four modalities:
- Text - Chat, reasoning, code generation, and tool calling
- Vision - Image understanding, visual Q&A, and VLM-powered OCR
- Embeddings - Semantic search and retrieval
- Speech - Transcription with voice activity detection
Note
New model families are added continuously. Browse the full list in the Model Catalog, or load any compatible model directly from Hugging Face.
Performance and Hardware
Hardware Acceleration
LM-Kit.NET automatically leverages the best available acceleration on any hardware:
- NVIDIA GPUs - CUDA backends with optimized kernels
- AMD/Intel GPUs - Vulkan backend for cross-vendor GPU support
- Apple Silicon - Metal acceleration for M-series chips
- Multi-GPU - Distribute models across multiple GPUs
- CPU Fallback - Optimized CPU inference when no GPU is available
Dual Backend Architecture
Choose the optimal inference engine for your use case:
- llama.cpp Backend - Broad model compatibility, memory efficiency
- ONNX Runtime - Optimized inference for supported model formats
Observability
Gain full visibility into AI operations with comprehensive instrumentation:
- OpenTelemetry Integration - GenAI semantic conventions for distributed tracing and metrics
- Inference Metrics - Token counts, processing rates, generation speeds, context utilization, perplexity scores, and sampling statistics
- Event Callbacks - Fine-grained hooks for token sampling, tool invocations, and generation lifecycle
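Because the SDK follows the OpenTelemetry GenAI semantic conventions, standard .NET OpenTelemetry setup applies. The sketch below uses the real OpenTelemetry .NET API; the `"LMKit.*"` source name is an assumption about which `ActivitySource` the SDK emits on, so check the observability docs for the actual name.

```csharp
using OpenTelemetry;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

// Standard OpenTelemetry .NET tracer setup; spans emitted by the
// SDK's ActivitySource flow to whichever exporter you configure
using var tracerProvider = Sdk.CreateTracerProviderBuilder()
    .ConfigureResource(r => r.AddService("my-ai-app"))
    .AddSource("LMKit.*") // source name is an assumption
    .AddConsoleExporter() // swap for OTLP in production
    .Build();
```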
Platform Support
Operating Systems
- Windows - Windows 7 through Windows 11
- macOS - macOS 11+ (Intel and Apple Silicon)
- Linux - glibc 2.27+ (x64 and ARM64)
.NET Frameworks
Compatible from .NET Framework 4.6.2 through the latest .NET releases, with optimized binaries for each version.
Integration
Zero Dependencies
LM-Kit.NET ships as a single NuGet package with absolutely no external dependencies:
Tip

```bash
dotnet add package LM-Kit.NET
```
No Python runtime. No containers. No external services. No native libraries to manage separately. Everything runs in-process.
Ecosystem Connections
- Microsoft Semantic Kernel - Use LM-Kit.NET as a local inference provider for Microsoft Semantic Kernel via the open-source `LM-Kit.NET.SemanticKernel` bridge package
- Microsoft.Extensions.AI - Plug LM-Kit.NET into any .NET application that targets the standard `IChatClient` and `IEmbeddingGenerator` abstractions via the open-source `LM-Kit.NET.ExtensionsAI` bridge package
- Vector Databases - Integrate with Qdrant via open-source connectors
- MCP Servers - Connect to Model Context Protocol servers for extended tool access
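The benefit of targeting the standard abstractions is that application code stays provider-agnostic. The helper below is written purely against recent versions of the real `Microsoft.Extensions.AI` API; how the bridge package constructs the `IChatClient` instance behind it is not shown and would come from the `LM-Kit.NET.ExtensionsAI` documentation.

```csharp
using Microsoft.Extensions.AI;

// Code written against IChatClient works unchanged whether the
// client wraps a local LM-Kit model or any other provider
async Task<string> AskAsync(IChatClient client, string question)
{
    var response = await client.GetResponseAsync(question);
    return response.Text;
}
```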
Data Privacy and Security
Running inference locally provides inherent security advantages:
- No data transmission - Content never leaves your network
- No third-party access - No external services process your data
- Audit-friendly - Complete visibility into AI operations
- Air-gapped deployment - Works in disconnected environments
This architecture simplifies compliance with GDPR, HIPAA, SOC 2, and other regulatory frameworks.
Getting Started
Basic Chat
```csharp
using LMKit.Model;
using LMKit.TextGeneration;

// Load a model
var model = new LM("path/to/model.gguf");

// Create a conversation
var conversation = new MultiTurnConversation(model);

// Chat
var response = await conversation.SubmitAsync("Explain quantum computing briefly.");
Console.WriteLine(response);
```
AI Agent with Tools
```csharp
using LMKit.Model;
using LMKit.Agents;
using LMKit.Agents.Tools.BuiltIn;

// Load a model
var model = new LM("path/to/model.gguf");

// Build an agent with built-in tools
var agent = Agent.CreateBuilder(model)
    .WithSystemPrompt("You are a helpful research assistant.")
    .WithTools(tools =>
    {
        tools.Register(BuiltInTools.WebSearch);
        tools.Register(BuiltInTools.Calculator);
        tools.Register(BuiltInTools.DateTimeNow);
    })
    .WithPlanning(PlanningStrategy.ReAct)
    .Build();

// Execute a task
var result = await agent.ExecuteAsync("What is the current population of Tokyo?");
Console.WriteLine(result.Response);
```