On-Device AI Agent Platform for .NET Developers

Your AI. Your Data. On Your Device.

LM-Kit.NET is a full-stack AI framework for .NET that unifies everything you need to build and deploy AI agents, with zero cloud dependency and zero external dependencies. It combines the fastest .NET inference engine, production-ready trained models, agent orchestration, RAG pipelines, and MCP-compatible tool calling in a single in-process SDK for C# and VB.NET. That makes LM-Kit.NET a category of one in the .NET ecosystem.

🔒 100% Local    ⚡ No Signup    🌐 Cross-Platform


Why LM-Kit.NET

A complete AI stack with no moving parts. LM-Kit.NET integrates inference, models, orchestration, and RAG into your .NET application as a single NuGet package. No Python runtimes, no containers, no external services, no dependencies to manage. Everything runs in-process.

Built by experts, updated continuously. Our team ships the latest advances in generative AI, symbolic AI, and NLP research directly into the SDK. Check our changelog to see the pace of innovation.

Not every problem requires a massive LLM. Dedicated task agents deliver faster execution, lower costs, and higher accuracy for specific workflows.

Running inference on your own hardware also gives you:

  • Complete data sovereignty - sensitive information stays within your infrastructure
  • Zero network latency - responses as fast as your hardware allows
  • No per-token costs - unlimited inference once deployed
  • Offline operation - works without internet connectivity
  • Regulatory compliance - simplifies meeting GDPR, HIPAA, and data residency requirements because data never leaves your infrastructure

What You Can Build

  • Autonomous AI agents that reason, plan, and execute multi-step tasks using your application's tools and APIs
  • RAG-powered knowledge assistants over local documents, databases, and enterprise data sources
  • PDF chat and document Q&A with retrieval, reranking, and grounded generation
  • Multi-agent workflows that orchestrate specialized task agents for complex business processes
  • Voice-driven assistants with speech-to-text, reasoning, and function calling
  • OCR and extraction pipelines for invoices, forms, IDs, emails, and scanned documents
  • Compliance-focused text intelligence - PII extraction, NER, classification, sentiment analysis

Core Capabilities

LM-Kit.NET delivers a complete AI stack: the fastest .NET inference engine, domain-tuned models that solve real-world problems out of the box, and a comprehensive orchestration layer for building agents and RAG applications.

🤖 AI Agents and Orchestration

Build autonomous AI agents that reason, plan, and execute complex workflows within your applications.

  • Task Agents - Reusable specialists designed for specific tasks with high speed and accuracy
  • Agent Orchestration - Compose multi-agent workflows with RAG, tools, and APIs under strict control
  • Function Calling - Let models dynamically invoke your application's methods with structured parameters
  • Tool Registry - Define and manage collections of tools agents can use
  • MCP Client Support - Connect to Model Context Protocol servers for extended capabilities including resources, prompts, and tool discovery
  • Agent Memory - Persistent memory that survives across conversation sessions
  • Reasoning Control - Adjust reasoning depth for models that support extended thinking
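To make the function-calling idea concrete, here is a minimal, library-independent sketch of the dispatch step: the model emits a structured call, and the host application routes it to a registered method. It does not use the LM-Kit.NET API. The ToolDispatchSketch class, the "add" tool, and the {"name": ..., "arguments": ...} JSON shape are all hypothetical names chosen for illustration, and the code assumes a runtime where System.Text.Json is available (for example .NET 6 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.Json;

public static class ToolDispatchSketch
{
    // Hypothetical registry: tool name -> handler that receives the parsed arguments.
    private static readonly Dictionary<string, Func<JsonElement, string>> Tools =
        new Dictionary<string, Func<JsonElement, string>>
        {
            // Illustrative tool: adds two numbers passed as "a" and "b".
            ["add"] = args =>
                (args.GetProperty("a").GetDouble() + args.GetProperty("b").GetDouble())
                    .ToString(CultureInfo.InvariantCulture),
        };

    // Parses a call such as {"name":"add","arguments":{"a":2,"b":3}}
    // and runs the matching handler, or reports an unknown tool.
    public static string Dispatch(string toolCallJson)
    {
        using JsonDocument doc = JsonDocument.Parse(toolCallJson);
        JsonElement root = doc.RootElement;
        string name = root.GetProperty("name").GetString() ?? "";

        if (!Tools.TryGetValue(name, out var handler))
        {
            return $"error: unknown tool '{name}'";
        }

        return handler(root.GetProperty("arguments"));
    }
}
```

In a real agent loop the returned string would be fed back to the model as the tool result. Keeping the registry explicit is one way to keep tool access under the application's control.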

🔍 Multimodal Intelligence

Process and understand content across text, images, documents, and audio.

  • Vision Language Models (VLM) - Analyze images, extract information, answer questions about visual content
  • VLM-Based OCR - High-accuracy text extraction from images and scanned content
  • Speech-to-Text - Transcribe audio with voice activity detection and multi-language support
  • Document Processing - Native support for PDF, DOCX, XLSX, PPTX, HTML, and image formats
  • Image Embeddings - Generate semantic representations of images for similarity search

📚 Retrieval-Augmented Generation (RAG)

Ground AI responses in your organization's knowledge with a flexible, extensible RAG framework.

  • Modular RAG Architecture - Use built-in pipelines or implement custom retrieval strategies
  • Built-in Vector Database - Store and search embeddings without external dependencies
  • PDF Chat and Document RAG - Chat and retrieve over documents with dedicated workflows
  • Multimodal RAG - Retrieve relevant content from both text and images
  • Advanced Chunking - Markdown-aware, semantic, and layout-based chunking strategies
  • Reranking - Improve retrieval precision with semantic reranking
  • External Vector Store Integration - Connect to Qdrant and other vector databases
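As a rough illustration of what the retrieval step in a RAG pipeline does, the sketch below ranks stored chunk embeddings by cosine similarity to a query embedding and returns the best matches. It is a conceptual example only, not LM-Kit.NET's implementation or API. The VectorSearchSketch class and its method names are hypothetical, and the tiny two-dimensional vectors in the usage note stand in for real embeddings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VectorSearchSketch
{
    // Cosine similarity between two equal-length vectors (0 if either is all zeros).
    public static double Cosine(float[] a, float[] b)
    {
        if (a.Length != b.Length)
        {
            throw new ArgumentException("Vectors must have the same length.");
        }

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        if (normA == 0 || normB == 0)
        {
            return 0;
        }

        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Returns the ids of the k chunks most similar to the query, best match first.
    public static List<string> TopK(
        float[] query, IReadOnlyDictionary<string, float[]> chunks, int k)
    {
        return chunks
            .OrderByDescending(c => Cosine(query, c.Value))
            .Take(k)
            .Select(c => c.Key)
            .ToList();
    }
}
```

For example, with chunks "a" = (1, 0), "b" = (0, 1), and "c" = (0.9, 0.1), a query of (1, 0) and k = 2 selects "a" and then "c". A reranking stage would then reorder those candidates with a more precise model before generation.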

📊 Structured Data Extraction

Transform unstructured content into structured, actionable data.

  • Schema-Based Extraction - Define extraction targets using JSON schemas or custom elements
  • Named Entity Recognition (NER) - Extract people, organizations, locations, and custom entity types
  • PII Detection - Identify and classify personal identifiers for privacy compliance
  • Multimodal Extraction - Extract structured data from images and documents
  • Layout-Aware Processing - Detect paragraphs and lines, support region-based workflows

💡 Content Intelligence

Analyze and understand text and visual content.

  • Sentiment and Emotion Analysis - Detect emotional tone from text and images
  • Custom Classification - Categorize text and images into your defined classes
  • Keyword Extraction - Identify key terms and phrases
  • Language Detection - Identify languages from text, images, or audio
  • Summarization - Condense long content with configurable strategies

✍️ Text Generation and Transformation

Generate and refine content with precise control.

  • Conversational AI - Build context-aware chatbots with multi-turn memory
  • Constrained Generation - Guide model outputs using JSON schemas, templates, or custom grammar rules
  • Translation - Convert text between languages with confidence scoring
  • Text Enhancement - Improve clarity, fix grammar, adapt tone

🛠️ Model Customization

Tailor models to your specific domain.

  • Fine-Tuning - Train models on your data with LoRA support
  • Dynamic LoRA Loading - Switch adapters at runtime without reloading base models
  • Quantization - Optimize models for your deployment constraints
  • Training Dataset Tools - Prepare and export datasets in standard formats

Supported Models

LM-Kit.NET includes domain-tuned models optimized for real-world tasks, plus broad compatibility with models from leading providers:

Text Models: LLaMA, Mistral, Mixtral, Qwen, Phi, Gemma, Granite, DeepSeek, Falcon, and more

Vision Models: Qwen-VL, MiniCPM-V, Pixtral, Gemma Vision, LightOnOCR

Embedding Models: BGE, Nomic, Qwen Embedding, Gemma Embedding

Speech Models: Whisper (all sizes), with voice activity detection

Browse production-ready models in the Model Catalog, or load models directly from any Hugging Face repository.


Performance and Hardware

The Fastest .NET Inference Engine

LM-Kit.NET automatically leverages the best available acceleration on any hardware:

  • NVIDIA GPUs - CUDA backends with optimized kernels
  • AMD/Intel GPUs - Vulkan backend for cross-vendor GPU support
  • Apple Silicon - Metal acceleration for M-series chips
  • Multi-GPU - Distribute models across multiple GPUs
  • CPU Fallback - Optimized CPU inference when no GPU is available

Dual Backend Architecture

Choose the optimal inference engine for your use case:

  • llama.cpp Backend - Broad model compatibility, memory efficiency
  • ONNX Runtime - Optimized inference for supported model formats

Observability

Gain full visibility into AI operations with comprehensive instrumentation:

  • OpenTelemetry Integration - GenAI semantic conventions for distributed tracing and metrics
  • Inference Metrics - Token counts, processing rates, generation speeds, context utilization, perplexity scores, and sampling statistics
  • Event Callbacks - Fine-grained hooks for token sampling, tool invocations, and generation lifecycle

Platform Support

Operating Systems

  • Windows - Windows 7 through Windows 11
  • macOS - macOS 11+ (Intel and Apple Silicon)
  • Linux - glibc 2.27+ (x64 and ARM64)

.NET Frameworks

Compatible from .NET Framework 4.6.2 through the latest .NET releases, with optimized binaries for each version.


Integration

Zero Dependencies

LM-Kit.NET ships as a single NuGet package with no external dependencies:

dotnet add package LM-Kit.NET

No Python runtime. No containers. No external services. No native libraries to manage separately. The entire AI stack runs in-process within your .NET application, making deployment as simple as any other NuGet package.

Ecosystem Connections

  • Semantic Kernel - Use LM-Kit.NET as a backend for Microsoft Semantic Kernel
  • Vector Databases - Integrate with Qdrant via open-source connectors
  • MCP Servers - Connect to Model Context Protocol servers for extended tool access

Data Privacy and Security

Running inference locally provides inherent security advantages:

  • No data transmission - Content never leaves your network
  • No third-party access - No external services process your data
  • Audit-friendly - Complete visibility into AI operations
  • Air-gapped deployment - Works in disconnected environments

This architecture simplifies compliance with GDPR, HIPAA, SOC 2, and other regulatory frameworks.


Getting Started

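using System;
using LMKit;
using LMKit.Model;
using LMKit.TextGeneration; // namespace that contains MultiTurnConversation

// Load a model from a local GGUF file
var model = new LM("path/to/model.gguf");

// Create a conversation that keeps multi-turn history
var conversation = new MultiTurnConversation(model);

// Submit a prompt and print the generated text
var response = await conversation.SubmitAsync("Explain quantum computing briefly.");
Console.WriteLine(response.Completion);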

Explore more: