Can LM-Kit.NET Run Completely Offline Without Internet Access?
TL;DR
Yes. LM-Kit.NET is designed for fully local execution. Once a model file is on disk, every capability works without any network connection: text generation, chat, RAG, embeddings, speech-to-text, vision, agents, and document processing. No data is sent to external servers, and no cloud API is called during inference.
What Requires Internet (One Time Only)
The only operation that requires internet access is downloading a model file for the first time. Models are hosted on HuggingFace and are fetched using the LM-Kit.NET model loading API:
using LMKit.Model;
// First run: downloads the model from HuggingFace (~7 GB for qwen3.5:9b)
// Subsequent runs: loads from the local cache instantly
using LM model = LM.LoadFromModelID("qwen3.5:9b");
After the initial download, the model is cached locally and never re-downloaded. You can also pre-download models using the direct URI and distribute them with your application installer.
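When bundling a model with your installer, the application can load it by absolute path at startup instead of resolving a model ID. A minimal sketch, assuming the model ships in a `models` subfolder next to the application binaries (the folder and file names below are placeholders, not LM-Kit defaults):

```csharp
using System;
using System.IO;
using LMKit.Model;

// Resolve a model file distributed alongside the application.
// "models/qwen3.5-9b-Q4_K_M.lmk" is an illustrative path only.
string modelPath = Path.Combine(AppContext.BaseDirectory, "models", "qwen3.5-9b-Q4_K_M.lmk");

if (!File.Exists(modelPath))
    throw new FileNotFoundException("Bundled model not found.", modelPath);

// Loading from a file URI never touches the network.
using LM model = new LM(new Uri(modelPath));
```

Because the file is already on disk, this path skips the download step entirely, so first launch behaves identically with or without connectivity.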
What Works Offline (Everything Else)
Once the model file is available locally, all LM-Kit.NET features run entirely on your hardware:
| Feature | Offline? | Details |
|---|---|---|
| Text generation and chat | Yes | All inference runs on local CPU or GPU |
| AI agents with tool calling | Yes | Agents reason and call local tools without network access |
| RAG (Retrieval-Augmented Generation) | Yes | Document indexing, embedding, and querying are fully local |
| Embeddings | Yes | Text and image embeddings are computed on-device |
| Speech-to-text | Yes | Whisper models run locally for transcription |
| Document processing | Yes | PDF extraction, OCR, and format conversion use local native libraries |
| Vision and image analysis | Yes | Vision language models process images on-device |
| Classification and NER | Yes | Sentiment analysis, entity extraction, and all NLP tasks run locally |
| Fine-tuning | Yes | Model fine-tuning runs entirely on local hardware |
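Everything in the table above runs from a local model file. As a minimal offline chat sketch, using the multi-turn conversation pattern from LM-Kit's public samples (type and method names are taken from those samples; verify them against the version you install):

```csharp
using System;
using LMKit.Model;
using LMKit.TextGeneration;

// Load directly from disk: no network access at any point.
using LM model = new LM(new Uri("file:///C:/models/qwen3.5-9b-Q4_K_M.lmk"));

// Multi-turn chat runs entirely on local CPU/GPU.
var chat = new MultiTurnConversation(model);
var answer = chat.Submit("Summarize the benefits of local inference in one sentence.");
Console.WriteLine(answer);
```

You can confirm the offline behavior empirically by disabling the machine's network adapter before running this: inference proceeds unchanged.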
The Exception: Web-Connected Agent Tools
If you build an AI agent that uses the WebSearchTool or HttpGetTool, those specific tool calls will need internet access because they fetch data from external sources by design. However, the agent's reasoning, planning, and all other tool calls remain fully local. You can also build agents that use only local tools (file system, calculator, document processing) for a completely air-gapped setup.
using LMKit.Agents;
using LMKit.Agents.Tools.BuiltIn;
// This agent works completely offline: no network tools registered
var offlineAgent = Agent.CreateBuilder(model)
    .WithPersona("Document Analyst")
    .WithTools(tools =>
    {
        tools.Register(BuiltInTools.Calculator);
        tools.Register(BuiltInTools.DateTime);
        tools.AddFileSystemTools();
    })
    .Build();
How to Set Up an Air-Gapped Deployment
For environments with no internet access at all (military, healthcare, industrial), follow these steps:
1. Download models on a connected machine
// On a machine with internet access
using LM model = LM.LoadFromModelID("qwen3.5:9b",
    downloadingProgress: (_, total, downloaded) =>
    {
        if (total.HasValue) Console.Write($"\r {(double)downloaded / total.Value * 100:F1}%");
        return true;
    });
// The model is now cached in the default model storage directory
2. Copy the model file to the target machine
Transfer the .lmk model file from the cache directory to the target machine using a USB drive, network share, or any secure file transfer method.
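Since this step is ordinary file copying, any tool works; the one detail worth adding is verifying the transfer before unplugging the drive, because a truncated multi-gigabyte copy fails only at load time. A self-contained sketch (it uses temp directories and a dummy file as stand-ins for the real cache directory and USB mount):

```shell
# Demo of step 2: copy a cached .lmk file and verify the transfer.
# In practice SRC is LM-Kit's model cache and DEST the removable drive.
SRC=$(mktemp -d)
DEST=$(mktemp -d)
head -c 1048576 /dev/urandom > "$SRC/model.lmk"   # stand-in for the real model file
cp "$SRC/model.lmk" "$DEST/model.lmk"
# Byte-for-byte comparison catches truncated or corrupted copies.
cmp -s "$SRC/model.lmk" "$DEST/model.lmk" && echo "copy verified"
```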
3. Load from a local path
// On the air-gapped machine: load directly from a known path
using LM model = new LM(new Uri("file:///C:/models/qwen3.5-9b-Q4_K_M.lmk"));
Why This Matters
Running AI locally without internet access solves critical business requirements:
- Data sovereignty. Regulated industries (healthcare, finance, legal, defense) cannot send sensitive data to cloud APIs. LM-Kit.NET keeps all processing on-premises.
- Operational continuity. Field workers, manufacturing floors, and remote sites need AI capabilities even when connectivity is unavailable or unreliable.
- Latency and cost. Local inference eliminates network round trips and per-token cloud API charges. Once you own the hardware, inference is free.
- Privacy by design. No telemetry, no usage tracking, no data leaving your network. Your documents, conversations, and model outputs stay on your machine.
📚 Related Content
- How much disk space do LM-Kit.NET binaries add to my application?: Plan your deployment footprint, including pre-bundled model files.
- Getting Started: Install LM-Kit.NET and run your first local AI application.
- Build a RAG Pipeline: Create an offline knowledge base that answers questions from your own documents.
- Glossary: Inference: Understand how local model inference works under the hood.
- Understanding Model Loading and Caching: Learn how models are cached and resolved from local storage.