Local AI Agent Platform for .NET Developers
Your AI. Your Data. On Your Device.
LM-Kit.NET is the complete local AI stack for .NET: high-performance inference, multi-agent orchestration, document intelligence, RAG pipelines, and production-ready tooling in a single NuGet package. Everything runs in-process, with no cloud services and no external dependencies, giving you full control over data, latency, and cost from C# or VB.NET.
🔒 100% Local ⚡ No Signup 🌐 Cross-Platform
Why LM-Kit.NET
Add AI to any .NET app in minutes. Install one NuGet package and start building. No Python runtimes, no containers, no external services, no dependencies to manage. LM-Kit.NET fits into your existing architecture and deployment pipeline.
Built by experts, updated continuously. Our team ships the latest advances in generative AI, symbolic AI, and NLP research directly into the SDK. Check our changelog to see the pace of innovation.
From prompts to production agents. Multi-agent orchestration, resilience policies, and comprehensive observability let you ship reliable AI workflows, not just prototypes.
What You Can Build
Agents and Automation
- Autonomous AI agents that reason, plan, and execute multi-step tasks using a growing catalog of built-in tools or your custom APIs
- Multi-agent systems with pipeline, parallel, router, and supervisor orchestration patterns
- Research assistants that search the web, analyze results, and synthesize findings using ReAct planning
- Task automation workflows with agent delegation, resilience policies, and comprehensive observability
Document and Knowledge Workflows
- RAG-powered knowledge assistants over local documents, databases, and enterprise data sources
- PDF chat and document Q&A with retrieval, reranking, and grounded generation
- OCR and extraction pipelines for invoices, forms, IDs, emails, and scanned documents
- Intelligent document splitting that detects logical boundaries in multi-page PDFs using vision models
Multimodal and Compliance
- Voice-driven assistants with speech-to-text, reasoning, and function calling
- Compliance-focused text intelligence with PII extraction, NER, classification, and sentiment analysis
Core Capabilities
🤖 AI Agents and Orchestration
Build autonomous AI agents that reason, plan, and execute complex workflows within your applications.
- Agent Framework - Complete agent infrastructure with `Agent`, `AgentBuilder`, `AgentExecutor`, and `AgentRegistry`
- Multi-Agent Orchestration - Coordinate multiple agents with `PipelineOrchestrator`, `ParallelOrchestrator`, `RouterOrchestrator`, and `SupervisorOrchestrator`
- Planning Strategies - ReAct, Chain-of-Thought, Tree-of-Thought, Plan-and-Execute, and Reflection
- Agent-to-Agent Delegation - Delegate tasks to specialized sub-agents with `DelegationManager` and `DelegateTool`
- Agent Templates - Pre-built templates including Chat, Assistant, Code, Research, Analyst, Planner, and more
- Extensive Built-in Tools - A growing catalog of ready-to-use tools across eight categories (Data, Text, Numeric, Security, Utility, Document, IO, Net), each following the 1 tool = 1 feature atomic design
- MCP Client Support - Connect to Model Context Protocol servers for extended tool access, resources, and prompts
- Function Calling - Let models dynamically invoke your application's methods with structured parameters
- Resilience Policies - Retry, Circuit Breaker, Timeout, Rate Limit, Bulkhead, and Fallback
- Streaming Support - Real-time response streaming with buffered, multicast, and delegate handlers
- Agent Observability - Full tracing and metrics with `AgentTracer`, `AgentMetrics`, and JSON export
- Agent Memory - Persistent memory across conversation sessions with RAG-based recall
- Reasoning Control - Adjust reasoning depth for models that support extended thinking
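As an illustration, two of the agents above could be chained with the pipeline orchestrator. This is a minimal sketch: the agent builder pattern comes from the Getting Started section below, but the `PipelineOrchestrator` constructor and `ExecuteAsync` call shown here are assumptions about the orchestration API, not verified signatures.

```csharp
using LMKit.Model;
using LMKit.Agents;

// Load one shared model for both agents
var model = new LM("path/to/model.gguf");

// Two single-purpose agents, built with the builder pattern
// shown in the Getting Started section
var summarizer = Agent.CreateBuilder(model)
    .WithSystemPrompt("Summarize the input in three sentences.")
    .Build();

var translator = Agent.CreateBuilder(model)
    .WithSystemPrompt("Translate the input into French.")
    .Build();

// Hypothetical pipeline wiring: run the agents sequentially,
// feeding each agent's output into the next
var pipeline = new PipelineOrchestrator(summarizer, translator);
var result = await pipeline.ExecuteAsync(longArticleText);
Console.WriteLine(result.Response);
```

The same two agents could instead be handed to a `RouterOrchestrator` or `SupervisorOrchestrator` when the routing decision should be made at runtime rather than fixed in advance.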
📄 Document Intelligence
Process, extract, and transform documents across PDF, DOCX, XLSX, PPTX, EML, MBOX, HTML, and image formats.
- VLM-Powered OCR - High-accuracy text extraction from images and scanned content using vision language models
- Structured Extraction - Define extraction targets with JSON schemas, custom elements, and pattern constraints
- Confidence Scoring and Validation - Per-element confidence scores, entity auto-detection, format validation, and human verification flags
- Named Entity Recognition (NER) - Extract people, organizations, locations, and custom entity types
- PII Detection - Identify and classify personal identifiers for privacy compliance
- Document Splitting - Detect logical document boundaries in multi-page files using vision-based analysis
- PDF Manipulation - Split, merge, search, extract pages, render to image, and unlock password-protected files
- Format Conversion - Convert between Markdown, HTML, and DOCX
- Layout-Aware Processing - Detect paragraphs and lines, support region-based workflows
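A structured invoice-extraction pass built on these capabilities might look roughly like the sketch below. The type and member names used here (`TextExtraction`, `TextExtractionElement`, `Parse`, and the `LMKit.Extraction` namespace) are illustrative assumptions about the extraction API, not confirmed signatures; the confidence-scoring idea comes from the feature list above.

```csharp
using LMKit.Model;
using LMKit.Extraction; // namespace name is an assumption

var model = new LM("path/to/vision-model.gguf");

// Hypothetical extraction definition: each element is one field
// to pull from the document, with an expected type
var extraction = new TextExtraction(model);
extraction.Elements.Add(new TextExtractionElement("InvoiceNumber", ElementType.String));
extraction.Elements.Add(new TextExtractionElement("TotalAmount", ElementType.Currency));
extraction.Elements.Add(new TextExtractionElement("DueDate", ElementType.Date));

// Run extraction over a scanned invoice; per-element confidence
// scores let you route low-confidence fields to human review
var result = extraction.Parse("invoices/inv-0042.pdf");
foreach (var item in result.Items)
    Console.WriteLine($"{item.Name}: {item.Value} (confidence {item.Confidence:P0})");
```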
📚 Retrieval-Augmented Generation (RAG)
Ground AI responses in your organization's knowledge with a flexible, extensible RAG framework.
- Modular RAG Architecture - Use built-in pipelines or implement custom retrieval strategies
- Built-in Vector Database - Store and search embeddings without external dependencies
- PDF Chat and Document RAG - Chat and retrieve over documents with dedicated workflows
- Multimodal RAG - Retrieve relevant content from both text and images
- Advanced Chunking - Markdown-aware, HTML-aware, semantic, and layout-based chunking strategies
- Reranking - Improve retrieval precision with semantic reranking
- External Vector Store Integration - Connect to Qdrant and other vector databases
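A minimal grounded-QA loop over local files could be sketched as follows. The `RagEngine` type, its method names, and the `LMKit.Retrieval` namespace are assumptions used for illustration only; consult the API reference for the actual surface.

```csharp
using LMKit.Model;
using LMKit.Retrieval; // namespace name is an assumption

var embeddingModel = new LM("path/to/embedding-model.gguf");

// Hypothetical wiring: index local documents into the built-in
// vector store, then retrieve the best-matching chunks
var rag = new RagEngine(embeddingModel);
rag.ImportDocument("docs/employee-handbook.pdf");

var matches = rag.Search("How many vacation days do new hires get?", topK: 3);

// The retrieved chunks are then passed to a chat model as grounding
// context so the answer cites your data instead of model memory
foreach (var match in matches)
    Console.WriteLine($"{match.Score:F2}: {match.Text}");
```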
🔍 Vision, Speech, and Content Analysis
Process visual, audio, and textual content.
- Vision Language Models (VLM) - Analyze images, extract information, answer questions about visual content
- Image Embeddings - Generate semantic representations of images for similarity search
- Speech-to-Text - Transcribe audio with voice activity detection and multi-language support
- Sentiment and Emotion Analysis - Detect emotional tone from text and images
- Custom Classification - Categorize text and images into your defined classes
- Language Detection - Identify languages from text, images, or audio
- Summarization - Condense long content with configurable strategies
- Keyword Extraction - Identify key terms and phrases
✍️ Text Generation and Transformation
Generate and refine content with precise control.
- Conversational AI - Build context-aware chatbots with multi-turn memory
- Constrained Generation - Guide model outputs using JSON schemas, templates, or custom grammar rules
- Translation - Convert text between languages with confidence scoring
- Text Enhancement - Improve clarity, fix grammar, adapt tone
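For example, a translation call with confidence scoring might be shaped like this. The `TextTranslation` class, `Translate` method, and `Language` enum shown are hypothetical stand-ins for whichever translation surface the SDK exposes.

```csharp
using LMKit.Model;
using LMKit.Translation; // namespace name is an assumption

var model = new LM("path/to/model.gguf");

// Hypothetical translation API: translate a sentence and
// print the result
var translator = new TextTranslation(model);
var french = translator.Translate("The meeting is moved to Friday.", Language.French);
Console.WriteLine(french);
```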
🛠️ Model Customization
Tailor models to your specific domain.
- Fine-Tuning - Train models on your data with LoRA support
- Dynamic LoRA Loading - Switch adapters at runtime without reloading base models
- Quantization - Optimize models for your deployment constraints
- Training Dataset Tools - Prepare and export datasets in standard formats
Supported Models
LM-Kit.NET ships domain-tuned models optimized for real-world tasks and supports a wide range of open models across four modalities:
- Text - Chat, reasoning, code generation, and tool calling
- Vision - Image understanding, visual Q&A, and VLM-powered OCR
- Embeddings - Semantic search and retrieval
- Speech - Transcription with voice activity detection
Note
New model families are added continuously. Browse the full list in the Model Catalog, or load any compatible model directly from Hugging Face.
Performance and Hardware
Hardware Acceleration
LM-Kit.NET automatically leverages the best available acceleration on any hardware:
- NVIDIA GPUs - CUDA backends with optimized kernels
- AMD/Intel GPUs - Vulkan backend for cross-vendor GPU support
- Apple Silicon - Metal acceleration for M-series chips
- Multi-GPU - Distribute models across multiple GPUs
- CPU Fallback - Optimized CPU inference when no GPU is available
Dual Backend Architecture
Choose the optimal inference engine for your use case:
- llama.cpp Backend - Broad model compatibility, memory efficiency
- ONNX Runtime - Optimized inference for supported model formats
Observability
Gain full visibility into AI operations with comprehensive instrumentation:
- OpenTelemetry Integration - GenAI semantic conventions for distributed tracing and metrics
- Inference Metrics - Token counts, processing rates, generation speeds, context utilization, perplexity scores, and sampling statistics
- Event Callbacks - Fine-grained hooks for token sampling, tool invocations, and generation lifecycle
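Because the SDK follows the OpenTelemetry GenAI semantic conventions, standard .NET OpenTelemetry setup applies. The sketch below uses the real OpenTelemetry .NET API; the `"LMKit.*"` source name is an assumption about which `ActivitySource` the SDK emits on, so check the observability docs for the actual name.

```csharp
using OpenTelemetry;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

// Standard OpenTelemetry .NET tracer setup; spans emitted by the
// SDK's ActivitySource flow to whichever exporter you configure
using var tracerProvider = Sdk.CreateTracerProviderBuilder()
    .ConfigureResource(r => r.AddService("my-ai-app"))
    .AddSource("LMKit.*") // source name is an assumption
    .AddConsoleExporter() // swap for OTLP in production
    .Build();
```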
Platform Support
Operating Systems
- Windows - Windows 7 through Windows 11
- macOS - macOS 11+ (Intel and Apple Silicon)
- Linux - glibc 2.27+ (x64 and ARM64)
.NET Frameworks
Compatible from .NET Framework 4.6.2 through the latest .NET releases, with optimized binaries for each version.
Integration
Zero Dependencies
LM-Kit.NET ships as a single NuGet package with absolutely no external dependencies:
Tip

```bash
dotnet add package LM-Kit.NET
```
No Python runtime. No containers. No external services. No native libraries to manage separately. Everything runs in-process.
Ecosystem Connections
- Microsoft Semantic Kernel - Use LM-Kit.NET as a local inference provider for Microsoft Semantic Kernel via the open-source `LM-Kit.NET.SemanticKernel` bridge package
- Microsoft.Extensions.AI - Plug LM-Kit.NET into any .NET application that targets the standard `IChatClient` and `IEmbeddingGenerator` abstractions via the open-source `LM-Kit.NET.ExtensionsAI` bridge package
- Vector Databases - Integrate with Qdrant via open-source connectors
- MCP Servers - Connect to Model Context Protocol servers for extended tool access
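The benefit of targeting the standard abstractions is that application code stays provider-agnostic. The helper below is written purely against recent versions of the real `Microsoft.Extensions.AI` API; how the bridge package constructs the `IChatClient` instance behind it is not shown and would come from the `LM-Kit.NET.ExtensionsAI` documentation.

```csharp
using Microsoft.Extensions.AI;

// Code written against IChatClient works unchanged whether the
// client wraps a local LM-Kit model or any other provider
async Task<string> AskAsync(IChatClient client, string question)
{
    var response = await client.GetResponseAsync(question);
    return response.Text;
}
```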
Data Privacy and Security
Running inference locally provides inherent security advantages:
- No data transmission - Content never leaves your network
- No third-party access - No external services process your data
- Audit-friendly - Complete visibility into AI operations
- Air-gapped deployment - Works in disconnected environments
This architecture simplifies compliance with GDPR, HIPAA, SOC 2, and other regulatory frameworks.
Getting Started
Basic Chat
```csharp
using LMKit.Model;
using LMKit.TextGeneration;

// Load a model
var model = new LM("path/to/model.gguf");

// Create a conversation
var conversation = new MultiTurnConversation(model);

// Chat
var response = await conversation.SubmitAsync("Explain quantum computing briefly.");
Console.WriteLine(response);
```
AI Agent with Tools
```csharp
using LMKit.Model;
using LMKit.Agents;
using LMKit.Agents.Tools.BuiltIn;

// Load a model
var model = new LM("path/to/model.gguf");

// Build an agent with built-in tools
var agent = Agent.CreateBuilder(model)
    .WithSystemPrompt("You are a helpful research assistant.")
    .WithTools(tools =>
    {
        tools.Register(BuiltInTools.WebSearch);
        tools.Register(BuiltInTools.Calculator);
        tools.Register(BuiltInTools.DateTimeNow);
    })
    .WithPlanning(PlanningStrategy.ReAct)
    .Build();

// Execute a task
var result = await agent.ExecuteAsync("What is the current population of Tokyo?");
Console.WriteLine(result.Response);
```