Local AI Agent Platform for .NET Developers

Your AI. Your Data. On Your Device.

LM-Kit.NET is a full-stack AI framework for .NET that unifies everything you need to build and deploy AI agents, with no cloud dependency and no external dependencies. It combines the fastest .NET inference engine, production-ready trained models, agent orchestration, RAG pipelines, and MCP-compatible tool calling in a single in-process SDK for C# and VB.NET. That makes LM-Kit.NET a category of one in the .NET ecosystem.

🔒 100% Local    ⚡ No Signup    🌐 Cross-Platform


Why LM-Kit.NET

A complete AI stack with no moving parts. LM-Kit.NET integrates inference, models, orchestration, and RAG into your .NET application as a single NuGet package. No Python runtimes, no containers, no external services, no dependencies to manage. Everything runs in-process.

Built by experts, updated continuously. Our team ships the latest advances in generative AI, symbolic AI, and NLP research directly into the SDK. Check our changelog to see the pace of innovation.

Not every problem requires a massive LLM. Dedicated task agents deliver faster execution, lower costs, and higher accuracy for specific workflows.

Running everything locally also means:

  • Complete data sovereignty - sensitive information stays within your infrastructure
  • Zero network latency - responses as fast as your hardware allows
  • No per-token costs - unlimited inference once deployed
  • Offline operation - works without internet connectivity
  • Regulatory compliance - simplifies meeting GDPR, HIPAA, and data residency requirements because data never leaves your infrastructure

What You Can Build

  • Autonomous AI agents that reason, plan, and execute multi-step tasks using 56 built-in tools or your custom APIs
  • Multi-agent systems with pipeline, parallel, router, and supervisor orchestration patterns
  • Research assistants that search the web, analyze results, and synthesize findings using ReAct planning
  • RAG-powered knowledge assistants over local documents, databases, and enterprise data sources
  • PDF chat and document Q&A with retrieval, reranking, and grounded generation
  • Task automation workflows with agent delegation, resilience policies, and comprehensive observability
  • Voice-driven assistants with speech-to-text, reasoning, and function calling
  • OCR and extraction pipelines for invoices, forms, IDs, emails, and scanned documents
  • Compliance-focused text intelligence - PII extraction, NER, classification, sentiment analysis
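
To make the orchestration patterns named above concrete, here is a minimal sketch of pipeline, parallel, and router composition. It deliberately does not use LM-Kit.NET types: each "agent" is a hypothetical stand-in modeled as an async function from text to text, and the helper names (RunPipeline, RunParallel, Route) are invented for illustration. It assumes .NET 6 or later with C# top-level statements.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Stand-in "agents": plain async functions, not LM-Kit.NET types.
Func<string, Task<string>> summarizer = input => Task.FromResult($"summary({input})");
Func<string, Task<string>> translator = input => Task.FromResult($"translation({input})");
Func<string, Task<string>> calculator = input => Task.FromResult($"math({input})");

// Pipeline: each stage receives the previous stage's output.
async Task<string> RunPipeline(string input, params Func<string, Task<string>>[] stages)
{
    var current = input;
    foreach (var stage in stages)
        current = await stage(current);
    return current;
}

// Parallel: every agent sees the same input; results come back in agent order.
async Task<string[]> RunParallel(string input, params Func<string, Task<string>>[] agents)
{
    return await Task.WhenAll(agents.Select(agent => agent(input)));
}

// Router: a rule picks exactly one agent for the input.
Task<string> Route(string input) =>
    input.Any(char.IsDigit) ? calculator(input) : summarizer(input);

Console.WriteLine(await RunPipeline("report", summarizer, translator)); // translation(summary(report))
Console.WriteLine(string.Join(" | ", await RunParallel("report", summarizer, translator))); // summary(report) | translation(report)
Console.WriteLine(await Route("2 + 2")); // math(2 + 2)
```

A supervisor pattern would add a coordinating agent that decides which of these calls to make and when to stop; it is omitted here to keep the sketch short.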

Core Capabilities

LM-Kit.NET delivers a complete AI stack: the fastest .NET inference engine, domain-tuned models that solve real-world problems out of the box, and a comprehensive orchestration layer for building agents and RAG applications.

🤖 AI Agents and Orchestration

Build autonomous AI agents that reason, plan, and execute complex workflows within your applications.

  • Agent Framework - Complete agent infrastructure with Agent, AgentBuilder, AgentExecutor, and AgentRegistry for building production-ready AI agents
  • Multi-Agent Orchestration - Coordinate multiple agents with PipelineOrchestrator, ParallelOrchestrator, RouterOrchestrator, and SupervisorOrchestrator
  • Planning Strategies - Multiple reasoning approaches including ReAct, Chain-of-Thought, Tree-of-Thought, Plan-and-Execute, and Reflection handlers
  • 56 Built-in Tools - Ready-to-use tools across categories: Data (JSON, XML, CSV, YAML), Text (Diff, Regex, Template), Numeric (Calculator, Stats), Security (Hash, Crypto, JWT), IO (FileSystem, HTTP, Network), and more
  • Agent-to-Agent Delegation - Enable agents to delegate tasks to specialized sub-agents with DelegationManager and DelegateTool
  • Agent Templates - 18 pre-built templates including Chat, Assistant, Code, Research, Analyst, Planner, and more for rapid development
  • Resilience Policies - Production-grade reliability with Retry, Circuit Breaker, Timeout, Rate Limit, Bulkhead, and Fallback policies
  • Streaming Support - Real-time response streaming with buffered, multicast, and delegate handlers
  • Agent Observability - Full tracing and metrics with AgentTracer, AgentMetrics, and JSON export capabilities
  • MCP Client Support - Connect to Model Context Protocol servers for extended capabilities including resources, prompts, and tool discovery
  • Agent Memory - Persistent memory that survives across conversation sessions with RAG-based recall
  • Reasoning Control - Adjust reasoning depth for models that support extended thinking
  • Function Calling - Let models dynamically invoke your application's methods with structured parameters
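
As an illustration of what the Retry and Fallback policies in the list above do conceptually, the following sketch retries an async operation with exponential backoff and falls back to a default once attempts are exhausted. WithRetryAsync and its parameters are hypothetical names written for this example; they are not the SDK's resilience API, which may be configured quite differently. It assumes .NET 6 or later with C# top-level statements.

```csharp
using System;
using System.Threading.Tasks;

// Retry with exponential backoff, then fall back. Illustrative only; not an LM-Kit.NET API.
async Task<T> WithRetryAsync<T>(
    Func<Task<T>> operation,
    int maxAttempts,
    TimeSpan baseDelay,
    Func<Exception, T> fallback)
{
    for (var attempt = 1; ; attempt++)
    {
        try
        {
            return await operation();
        }
        catch (Exception) when (attempt < maxAttempts)
        {
            // Delay doubles after each failed attempt: base, 2x base, 4x base, ...
            var factor = Math.Pow(2, attempt - 1);
            await Task.Delay(TimeSpan.FromMilliseconds(baseDelay.TotalMilliseconds * factor));
        }
        catch (Exception ex)
        {
            // Attempts exhausted: hand the last error to the fallback.
            return fallback(ex);
        }
    }
}

// Simulated flaky call that succeeds on the third attempt.
var calls = 0;
var answer = await WithRetryAsync(
    async () =>
    {
        calls++;
        await Task.Yield();
        if (calls < 3) throw new TimeoutException("simulated transient failure");
        return "ok after " + calls + " attempts";
    },
    maxAttempts: 4,
    baseDelay: TimeSpan.FromMilliseconds(10),
    fallback: ex => "fallback: " + ex.Message);

Console.WriteLine(answer); // ok after 3 attempts
```

Circuit breaker, timeout, rate limit, and bulkhead policies follow the same wrapping idea: the agent call is passed in as a delegate and the policy decides whether, when, and how often to invoke it.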

🔍 Multimodal Intelligence

Process and understand content across text, images, documents, and audio.

  • Vision Language Models (VLM) - Analyze images, extract information, answer questions about visual content
  • VLM-Based OCR - High-accuracy text extraction from images and scanned content
  • Speech-to-Text - Transcribe audio with voice activity detection and multi-language support
  • Document Processing - Native support for PDF, DOCX, XLSX, PPTX, HTML, and image formats
  • Image Embeddings - Generate semantic representations of images for similarity search

📚 Retrieval-Augmented Generation (RAG)

Ground AI responses in your organization's knowledge with a flexible, extensible RAG framework.

  • Modular RAG Architecture - Use built-in pipelines or implement custom retrieval strategies
  • Built-in Vector Database - Store and search embeddings without external dependencies
  • PDF Chat and Document RAG - Chat and retrieve over documents with dedicated workflows
  • Multimodal RAG - Retrieve relevant content from both text and images
  • Advanced Chunking - Markdown-aware, semantic, and layout-based chunking strategies
  • Reranking - Improve retrieval precision with semantic reranking
  • External Vector Store Integration - Connect to Qdrant and other vector databases
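
The retrieval step at the heart of any RAG pipeline can be sketched in a few lines: embed the query, score every stored chunk by cosine similarity, and keep the top results. The example below uses toy three-dimensional vectors and hypothetical helper names (Cosine, TopK) rather than the SDK's vector store, so it only illustrates the idea, not how LM-Kit.NET implements it. It assumes .NET 6 or later with C# top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cosine similarity between two vectors of equal length.
double Cosine(float[] a, float[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (var i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Return the k chunks whose embeddings are most similar to the query embedding.
List<(string Text, double Score)> TopK(
    float[] query, IEnumerable<(string Text, float[] Vector)> chunks, int k)
{
    return chunks
        .Select(entry => (entry.Text, Score: Cosine(query, entry.Vector)))
        .OrderByDescending(scored => scored.Score)
        .Take(k)
        .ToList();
}

// Toy "embeddings"; a real embedding model produces hundreds of dimensions.
var corpus = new List<(string Text, float[] Vector)>
{
    ("Invoices are due within 30 days.", new[] { 0.9f, 0.1f, 0.0f }),
    ("The office is closed on Sundays.", new[] { 0.0f, 0.2f, 0.9f }),
    ("Late payments incur a 2% fee.",    new[] { 0.8f, 0.3f, 0.1f }),
};

// Stands in for the embedding of a query such as "payment terms".
var queryVector = new[] { 1.0f, 0.2f, 0.0f };

foreach (var hit in TopK(queryVector, corpus, k: 2))
    Console.WriteLine($"{hit.Score:F3}  {hit.Text}");
```

With these vectors the two payment-related chunks rank first and the unrelated one is dropped. A reranking stage would then rescore only those top candidates with a more expensive model before they are passed to generation.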

📊 Structured Data Extraction

Transform unstructured content into structured, actionable data.

  • Schema-Based Extraction - Define extraction targets using JSON schemas or custom elements
  • Named Entity Recognition (NER) - Extract people, organizations, locations, and custom entity types
  • PII Detection - Identify and classify personal identifiers for privacy compliance
  • Multimodal Extraction - Extract structured data from images and documents
  • Layout-Aware Processing - Detect paragraphs and lines, support region-based workflows

💡 Content Intelligence

Analyze and understand text and visual content.

  • Sentiment and Emotion Analysis - Detect emotional tone from text and images
  • Custom Classification - Categorize text and images into your defined classes
  • Keyword Extraction - Identify key terms and phrases
  • Language Detection - Identify languages from text, images, or audio
  • Summarization - Condense long content with configurable strategies

✍️ Text Generation and Transformation

Generate and refine content with precise control.

  • Conversational AI - Build context-aware chatbots with multi-turn memory
  • Constrained Generation - Guide model outputs using JSON schemas, templates, or custom grammar rules
  • Translation - Convert text between languages with confidence scoring
  • Text Enhancement - Improve clarity, fix grammar, adapt tone

🛠️ Model Customization

Tailor models to your specific domain.

  • Fine-Tuning - Train models on your data with LoRA support
  • Dynamic LoRA Loading - Switch adapters at runtime without reloading base models
  • Quantization - Optimize models for your deployment constraints
  • Training Dataset Tools - Prepare and export datasets in standard formats
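
To see why quantization matters for deployment constraints, a back-of-the-envelope estimate helps: weight memory is roughly the parameter count times bits per weight, divided by eight. The helper below is a hypothetical illustration, not an SDK function. It ignores the KV cache, activations, and file metadata, and real quantization formats often use slightly more than their nominal bit width.

```csharp
using System;

// Approximate weight memory in gigabytes: parameters * bitsPerWeight / 8 bytes.
double WeightGigabytes(double parameters, double bitsPerWeight) =>
    parameters * bitsPerWeight / 8 / 1e9;

foreach (var bits in new[] { 16.0, 8.0, 4.0 })
    Console.WriteLine($"7B parameters at {bits}-bit: ~{WeightGigabytes(7e9, bits):F1} GB");
```

For a 7-billion-parameter model this gives roughly 14 GB at 16-bit, 7 GB at 8-bit, and 3.5 GB at 4-bit, which is the difference between needing a workstation GPU and fitting on a laptop.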

Supported Models

LM-Kit.NET includes domain-tuned models optimized for real-world tasks, plus broad compatibility with models from leading providers:

Text Models: LLaMA, Mistral, Mixtral, Qwen, Phi, Gemma, Granite, DeepSeek, Falcon, and more

Vision Models: Qwen-VL, MiniCPM-V, Pixtral, Gemma Vision, LightOnOCR

Embedding Models: BGE, Nomic, Qwen Embedding, Gemma Embedding

Speech Models: Whisper (all sizes), with voice activity detection

Browse production-ready models in the Model Catalog, or load models directly from any Hugging Face repository.


Performance and Hardware

The Fastest .NET Inference Engine

LM-Kit.NET automatically leverages the best available acceleration on any hardware:

  • NVIDIA GPUs - CUDA backends with optimized kernels
  • AMD/Intel GPUs - Vulkan backend for cross-vendor GPU support
  • Apple Silicon - Metal acceleration for M-series chips
  • Multi-GPU - Distribute models across multiple GPUs
  • CPU Fallback - Optimized CPU inference when no GPU is available

Dual Backend Architecture

Choose the optimal inference engine for your use case:

  • llama.cpp Backend - Broad model compatibility, memory efficiency
  • ONNX Runtime - Optimized inference for supported model formats

Observability

Gain full visibility into AI operations with comprehensive instrumentation:

  • OpenTelemetry Integration - GenAI semantic conventions for distributed tracing and metrics
  • Inference Metrics - Token counts, processing rates, generation speeds, context utilization, perplexity scores, and sampling statistics
  • Event Callbacks - Fine-grained hooks for token sampling, tool invocations, and generation lifecycle
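
In .NET, OpenTelemetry traces are built on System.Diagnostics.ActivitySource, so spans can be observed in-process with an ActivityListener before any exporter is configured. The sketch below simulates one instrumented inference call and reads back tags named after the OpenTelemetry GenAI semantic conventions. The source name "Demo.GenAI", the model name, and the token count are placeholders; the source name and exact tag set that LM-Kit.NET emits are not stated here, so check the SDK documentation before relying on them. It assumes .NET 6 or later with C# top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical source name for this demo; not taken from the SDK.
const string SourceName = "Demo.GenAI";

var captured = new List<string>();

// Subscribe to spans from the chosen source and record them when they stop.
using var listener = new ActivityListener
{
    ShouldListenTo = candidate => candidate.Name == SourceName,
    Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
        ActivitySamplingResult.AllDataAndRecorded,
    ActivityStopped = stopped =>
        captured.Add($"{stopped.OperationName} model={stopped.GetTagItem("gen_ai.request.model")} " +
                     $"out_tokens={stopped.GetTagItem("gen_ai.usage.output_tokens")}")
};
ActivitySource.AddActivityListener(listener);

// Simulate an instrumented inference call that emits a single span.
using var activitySource = new ActivitySource(SourceName);
using (var activity = activitySource.StartActivity("chat"))
{
    activity?.SetTag("gen_ai.operation.name", "chat");
    activity?.SetTag("gen_ai.request.model", "my-local-model");
    activity?.SetTag("gen_ai.usage.output_tokens", 42);
}

captured.ForEach(Console.WriteLine); // chat model=my-local-model out_tokens=42
```

In production the listener is usually replaced by the OpenTelemetry SDK with an exporter, but the same source-name filter decides which spans are collected.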

Platform Support

Operating Systems

  • Windows - Windows 7 through Windows 11
  • macOS - macOS 11+ (Intel and Apple Silicon)
  • Linux - glibc 2.27+ (x64 and ARM64)

.NET Frameworks

Compatible from .NET Framework 4.6.2 through the latest .NET releases, with optimized binaries for each version.


Integration

Zero Dependencies

LM-Kit.NET ships as a single NuGet package with no external dependencies:

dotnet add package LM-Kit.NET

No Python runtime. No containers. No external services. No native libraries to manage separately. The entire AI stack runs in-process within your .NET application, making deployment as simple as any other NuGet package.

Ecosystem Connections

  • Semantic Kernel - Use LM-Kit.NET as a backend for Microsoft Semantic Kernel
  • Vector Databases - Integrate with Qdrant via open-source connectors
  • MCP Servers - Connect to Model Context Protocol servers for extended tool access

Data Privacy and Security

Running inference locally provides inherent security advantages:

  • No data transmission - Content never leaves your network
  • No third-party access - No external services process your data
  • Audit-friendly - Complete visibility into AI operations
  • Air-gapped deployment - Works in disconnected environments

This architecture simplifies compliance with GDPR, HIPAA, SOC 2, and other regulatory frameworks.


Getting Started

Basic Chat

using System;
using LMKit.Model;
using LMKit.TextGeneration;

// Load a model
var model = new LM("path/to/model.gguf");

// Create a conversation
var conversation = new MultiTurnConversation(model);

// Chat
var response = await conversation.SubmitAsync("Explain quantum computing briefly.");
Console.WriteLine(response);

AI Agent with Tools

using System;
using LMKit.Model;
using LMKit.Agents;
using LMKit.Agents.Tools.BuiltIn;

// Load a model
var model = new LM("path/to/model.gguf");

// Build an agent with built-in tools
var agent = Agent.CreateBuilder(model)
    .WithSystemPrompt("You are a helpful research assistant.")
    .WithTools(tools =>
    {
        tools.Register(BuiltInTools.WebSearch);
        tools.Register(BuiltInTools.Calculator);
        tools.Register(BuiltInTools.DateTime);
    })
    .WithPlanning(PlanningStrategy.ReAct)
    .Build();

// Execute a task
var result = await agent.ExecuteAsync("What is the current population of Tokyo?");
Console.WriteLine(result.Response);

Explore more: