On-Device AI Agent Platform for .NET Developers
Your AI. Your Data. On Your Device.
LM-Kit.NET is a full-stack AI framework for .NET that unifies everything you need to build and deploy AI agents, with no cloud dependency and no external dependencies. It combines the fastest .NET inference engine, production-ready trained models, agent orchestration, RAG pipelines, and MCP-compatible tool calling in a single in-process SDK for C# and VB.NET. That makes LM-Kit.NET a category of one in the .NET ecosystem.
🔒 100% Local ⚡ No Signup 🌐 Cross-Platform
Why LM-Kit.NET
A complete AI stack with no moving parts. LM-Kit.NET integrates inference, models, orchestration, and RAG into your .NET application as a single NuGet package. No Python runtimes, no containers, no external services, no dependencies to manage. Everything runs in-process.
Built by experts, updated continuously. Our team ships the latest advances in generative AI, symbolic AI, and NLP research directly into the SDK. Check our changelog to see the pace of innovation.
Not every problem requires a massive LLM. Dedicated task agents deliver faster execution, lower costs, and higher accuracy for specific workflows. Running them on your own hardware adds further benefits:
- Complete data sovereignty - sensitive information stays within your infrastructure
- Zero network latency - responses as fast as your hardware allows
- No per-token costs - unlimited inference once deployed
- Offline operation - works without internet connectivity
- Regulatory compliance - helps meet GDPR, HIPAA, and data residency requirements, because data never leaves your infrastructure
What You Can Build
- Autonomous AI agents that reason, plan, and execute multi-step tasks using your application's tools and APIs
- RAG-powered knowledge assistants over local documents, databases, and enterprise data sources
- PDF chat and document Q&A with retrieval, reranking, and grounded generation
- Multi-agent workflows that orchestrate specialized task agents for complex business processes
- Voice-driven assistants with speech-to-text, reasoning, and function calling
- OCR and extraction pipelines for invoices, forms, IDs, emails, and scanned documents
- Compliance-focused text intelligence - PII extraction, NER, classification, sentiment analysis
Core Capabilities
LM-Kit.NET delivers a complete AI stack: the fastest .NET inference engine, domain-tuned models that solve real-world problems out of the box, and a comprehensive orchestration layer for building agents and RAG applications.
🤖 AI Agents and Orchestration
Build autonomous AI agents that reason, plan, and execute complex workflows within your applications.
- Task Agents - Reusable specialists designed for specific tasks with high speed and accuracy
- Agent Orchestration - Compose multi-agent workflows with RAG, tools, and APIs under strict control
- Function Calling - Let models dynamically invoke your application's methods with structured parameters
- Tool Registry - Define and manage collections of tools agents can use
- MCP Client Support - Connect to Model Context Protocol servers for extended capabilities including resources, prompts, and tool discovery
- Agent Memory - Persistent memory that survives across conversation sessions
- Reasoning Control - Adjust reasoning depth for models that support extended thinking
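As a rough illustration of how function calling and a tool registry fit together, the sketch below maps tool names to handlers and dispatches a JSON tool call of the kind a model might emit. It is plain C# and does not use the LM-Kit.NET API: ToolRegistry, Register, Dispatch, the get_weather tool, and the JSON shape are all hypothetical names chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical registry, not LM-Kit.NET's API: tool name -> handler taking JSON arguments.
var registry = new ToolRegistry();
registry.Register("get_weather", args =>
{
    string city = args.GetProperty("city").GetString() ?? "unknown";
    return $"Sunny in {city}"; // placeholder result; a real tool would call your own code
});

// Simulated tool call, in an assumed shape: { "name": ..., "arguments": { ... } }
string toolCallJson = "{\"name\":\"get_weather\",\"arguments\":{\"city\":\"Paris\"}}";
Console.WriteLine(registry.Dispatch(toolCallJson)); // prints "Sunny in Paris"

sealed class ToolRegistry
{
    private readonly Dictionary<string, Func<JsonElement, string>> _tools =
        new(StringComparer.Ordinal);

    // Adds a tool, or replaces an existing tool with the same name.
    public void Register(string name, Func<JsonElement, string> handler) =>
        _tools[name] = handler;

    // Parses the tool call, looks up the handler by name, and invokes it.
    public string Dispatch(string toolCallJson)
    {
        using JsonDocument doc = JsonDocument.Parse(toolCallJson);
        JsonElement root = doc.RootElement;
        string name = root.GetProperty("name").GetString() ?? "";

        if (!_tools.TryGetValue(name, out var handler))
            return $"error: unknown tool '{name}'";

        return handler(root.GetProperty("arguments"));
    }
}
```

Returning an error string for an unknown tool, rather than throwing, lets the calling loop feed the failure back to the model so it can try a different tool.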
🔍 Multimodal Intelligence
Process and understand content across text, images, documents, and audio.
- Vision Language Models (VLM) - Analyze images, extract information, answer questions about visual content
- VLM-Based OCR - High-accuracy text extraction from images and scanned content
- Speech-to-Text - Transcribe audio with voice activity detection and multi-language support
- Document Processing - Native support for PDF, DOCX, XLSX, PPTX, HTML, and image formats
- Image Embeddings - Generate semantic representations of images for similarity search
📚 Retrieval-Augmented Generation (RAG)
Ground AI responses in your organization's knowledge with a flexible, extensible RAG framework.
- Modular RAG Architecture - Use built-in pipelines or implement custom retrieval strategies
- Built-in Vector Database - Store and search embeddings without external dependencies
- PDF Chat and Document RAG - Chat and retrieve over documents with dedicated workflows
- Multimodal RAG - Retrieve relevant content from both text and images
- Advanced Chunking - Markdown-aware, semantic, and layout-based chunking strategies
- Reranking - Improve retrieval precision with semantic reranking
- External Vector Store Integration - Connect to Qdrant and other vector databases
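To make the retrieval step concrete, here is a minimal sketch of what a vector search does conceptually: score stored embeddings against a query embedding by cosine similarity and keep the top results. It is plain C# and is not how LM-Kit.NET implements its vector database. The VectorSearch class, the sample sentences, and the three-dimensional vectors are made up for illustration; real embeddings come from an embedding model and have hundreds of dimensions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy "vector store": text chunks paired with hand-written embeddings (illustrative only).
var store = new List<(string Text, float[] Vector)>
{
    ("Invoices are due in 30 days.", new[] { 0.9f, 0.1f, 0.0f }),
    ("The office closes at 6 pm.",   new[] { 0.1f, 0.9f, 0.0f }),
    ("Late payments incur a fee.",   new[] { 0.8f, 0.2f, 0.1f }),
};

// Pretend embedding of a payment-related question.
float[] query = { 1.0f, 0.0f, 0.0f };

foreach (var hit in VectorSearch.TopK(store, query, k: 2))
    Console.WriteLine($"{hit.Score:F3}  {hit.Text}");

static class VectorSearch
{
    // Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 for a zero-length vector.
    public static double Cosine(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same dimension.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot   += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) return 0;
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Brute-force search: score every entry, sort descending, keep the best k.
    public static List<(string Text, double Score)> TopK(
        IEnumerable<(string Text, float[] Vector)> store, float[] query, int k) =>
        store.Select(e => (e.Text, Score: Cosine(e.Vector, query)))
             .OrderByDescending(e => e.Score)
             .Take(k)
             .ToList();
}
```

With this toy data, the two payment-related sentences rank above the office-hours sentence. A reranking stage would then rescore those candidates with a stronger model before they are passed to generation.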
📊 Structured Data Extraction
Transform unstructured content into structured, actionable data.
- Schema-Based Extraction - Define extraction targets using JSON schemas or custom elements
- Named Entity Recognition (NER) - Extract people, organizations, locations, and custom entity types
- PII Detection - Identify and classify personal identifiers for privacy compliance
- Multimodal Extraction - Extract structured data from images and documents
- Layout-Aware Processing - Detect paragraphs and lines, support region-based workflows
💡 Content Intelligence
Analyze and understand text and visual content.
- Sentiment and Emotion Analysis - Detect emotional tone from text and images
- Custom Classification - Categorize text and images into your defined classes
- Keyword Extraction - Identify key terms and phrases
- Language Detection - Identify languages from text, images, or audio
- Summarization - Condense long content with configurable strategies
✍️ Text Generation and Transformation
Generate and refine content with precise control.
- Conversational AI - Build context-aware chatbots with multi-turn memory
- Constrained Generation - Guide model outputs using JSON schemas, templates, or custom grammar rules
- Translation - Convert text between languages with confidence scoring
- Text Enhancement - Improve clarity, fix grammar, adapt tone
🛠️ Model Customization
Tailor models to your specific domain.
- Fine-Tuning - Train models on your data with LoRA support
- Dynamic LoRA Loading - Switch adapters at runtime without reloading base models
- Quantization - Optimize models for your deployment constraints
- Training Dataset Tools - Prepare and export datasets in standard formats
Supported Models
LM-Kit.NET includes domain-tuned models optimized for real-world tasks, plus broad compatibility with models from leading providers:
Text Models: LLaMA, Mistral, Mixtral, Qwen, Phi, Gemma, Granite, DeepSeek, Falcon, and more
Vision Models: Qwen-VL, MiniCPM-V, Pixtral, Gemma Vision, LightOnOCR
Embedding Models: BGE, Nomic, Qwen Embedding, Gemma Embedding
Speech Models: Whisper (all sizes), with voice activity detection
Browse production-ready models in the Model Catalog, or load models directly from any Hugging Face repository.
Performance and Hardware
The Fastest .NET Inference Engine
LM-Kit.NET automatically leverages the best available acceleration on any hardware:
- NVIDIA GPUs - CUDA backends with optimized kernels
- AMD/Intel GPUs - Vulkan backend for cross-vendor GPU support
- Apple Silicon - Metal acceleration for M-series chips
- Multi-GPU - Distribute models across multiple GPUs
- CPU Fallback - Optimized CPU inference when no GPU is available
Dual Backend Architecture
Choose the optimal inference engine for your use case:
- llama.cpp Backend - Broad model compatibility, memory efficiency
- ONNX Runtime - Optimized inference for supported model formats
Observability
Gain full visibility into AI operations with comprehensive instrumentation:
- OpenTelemetry Integration - GenAI semantic conventions for distributed tracing and metrics
- Inference Metrics - Token counts, processing rates, generation speeds, context utilization, perplexity scores, and sampling statistics
- Event Callbacks - Fine-grained hooks for token sampling, tool invocations, and generation lifecycle
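For readers unfamiliar with how OpenTelemetry-style tracing surfaces in .NET, the sketch below emits and captures one span tagged with attribute names from the OpenTelemetry GenAI semantic conventions, using only System.Diagnostics. It does not show LM-Kit.NET's own instrumentation: the source name Demo.GenAI, the model name demo-model, and the token counts are assumptions made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

var captured = new List<string>();

// Listen to one ActivitySource and record the tags of every finished span.
using var listener = new ActivityListener
{
    ShouldListenTo = src => src.Name == "Demo.GenAI", // hypothetical source name
    Sample = (ref ActivityCreationOptions<ActivityContext> _) => ActivitySamplingResult.AllData,
    ActivityStopped = activity =>
    {
        foreach (var tag in activity.TagObjects)
            captured.Add($"{tag.Key}={tag.Value}");
    }
};
ActivitySource.AddActivityListener(listener);

using var source = new ActivitySource("Demo.GenAI");

// Simulate one inference call wrapped in a span with GenAI convention attributes.
using (var activity = source.StartActivity("chat demo-model"))
{
    activity?.SetTag("gen_ai.operation.name", "chat");
    activity?.SetTag("gen_ai.request.model", "demo-model"); // made-up model name
    activity?.SetTag("gen_ai.usage.input_tokens", 12);      // made-up counts
    activity?.SetTag("gen_ai.usage.output_tokens", 48);
}

foreach (string line in captured)
    Console.WriteLine(line);
```

In a real application you would typically register an OpenTelemetry exporter instead of a hand-written listener, so spans flow to your tracing backend.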
Platform Support
Operating Systems
- Windows - Windows 7 through Windows 11
- macOS - macOS 11+ (Intel and Apple Silicon)
- Linux - glibc 2.27+ (x64 and ARM64)
.NET Frameworks
Compatible from .NET Framework 4.6.2 through the latest .NET releases, with optimized binaries for each version.
Integration
Zero Dependencies
LM-Kit.NET ships as a single NuGet package with no external dependencies:
dotnet add package LM-Kit.NET
No Python runtime. No containers. No external services. No native libraries to manage separately. The entire AI stack runs in-process within your .NET application, making deployment as simple as any other NuGet package.
Ecosystem Connections
- Semantic Kernel - Use LM-Kit.NET as a backend for Microsoft Semantic Kernel
- Vector Databases - Integrate with Qdrant via open-source connectors
- MCP Servers - Connect to Model Context Protocol servers for extended tool access
Data Privacy and Security
Running inference locally provides inherent security advantages:
- No data transmission - Content never leaves your network
- No third-party access - No external services process your data
- Audit-friendly - Complete visibility into AI operations
- Air-gapped deployment - Works in disconnected environments
This architecture simplifies compliance with GDPR, HIPAA, SOC 2, and other regulatory frameworks.
Getting Started
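using System;
using LMKit;
using LMKit.Model;
using LMKit.TextGeneration; // namespace of MultiTurnConversation
// Load a model from a local GGUF file
var model = new LM("path/to/model.gguf");
// Create a conversation that keeps multi-turn context
var conversation = new MultiTurnConversation(model);
// Send a prompt and print the model's reply
var response = await conversation.SubmitAsync("Explain quantum computing briefly.");
Console.WriteLine(response);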
Explore more: