Can LM-Kit.NET Run Completely Offline Without Internet Access?
TL;DR
Yes. LM-Kit.NET is designed for fully local execution. Once a model file is on disk, every capability works without any network connection: text generation, chat, RAG, embeddings, speech-to-text, vision, agents, and document processing. No data is sent to external servers, and no cloud API is called during inference.
What Requires Internet (One Time Only)
The only operation that requires internet access is downloading a model file for the first time. Models are hosted on HuggingFace and are fetched using the LM-Kit.NET model loading API:
using LMKit.Model;
// First run: downloads the model from HuggingFace (~7 GB for qwen3.5:9b)
// Subsequent runs: loads from the local cache instantly
using LM model = LM.LoadFromModelID("qwen3.5:9b");
After the initial download, the model is cached locally and never re-downloaded. You can also pre-download models using the direct URI and distribute them with your application installer.
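When bundling a model with your installer, the application can load it by absolute path at startup instead of resolving a model ID. A minimal sketch, assuming the model ships in a `models` subfolder next to the application binaries (the folder and file names below are placeholders, not LM-Kit defaults):

```csharp
using System;
using System.IO;
using LMKit.Model;

// Resolve a model file distributed alongside the application.
// "models/qwen3.5-9b-Q4_K_M.lmk" is an illustrative path only.
string modelPath = Path.Combine(AppContext.BaseDirectory, "models", "qwen3.5-9b-Q4_K_M.lmk");

if (!File.Exists(modelPath))
    throw new FileNotFoundException("Bundled model not found.", modelPath);

// Loading from a file URI never touches the network.
using LM model = new LM(new Uri(modelPath));
```

Because the file is already on disk, this path skips the download step entirely, so first launch behaves identically with or without connectivity.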
What Works Offline (Everything Else)
Once the model file is available locally, all LM-Kit.NET features run entirely on your hardware:
| Feature | Offline? | Details |
|---|---|---|
| Text generation and chat | Yes | All inference runs on local CPU or GPU |
| AI agents with tool calling | Yes | Agents reason and call local tools without network access |
| RAG (Retrieval-Augmented Generation) | Yes | Document indexing, embedding, and querying are fully local |
| Embeddings | Yes | Text and image embeddings are computed on-device |
| Speech-to-text | Yes | Whisper models run locally for transcription |
| Document processing | Yes | PDF extraction, OCR, and format conversion use local native libraries |
| Vision and image analysis | Yes | Vision language models process images on-device |
| Classification and NER | Yes | Sentiment analysis, entity extraction, and all NLP tasks run locally |
| Fine-tuning | Yes | Model fine-tuning runs entirely on local hardware |
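Everything in the table above runs from a local model file. As a minimal offline chat sketch, using the multi-turn conversation pattern from LM-Kit's public samples (type and method names are taken from those samples; verify them against the version you install):

```csharp
using System;
using LMKit.Model;
using LMKit.TextGeneration;

// Load directly from disk: no network access at any point.
using LM model = new LM(new Uri("file:///C:/models/qwen3.5-9b-Q4_K_M.lmk"));

// Multi-turn chat runs entirely on local CPU/GPU.
var chat = new MultiTurnConversation(model);
var answer = chat.Submit("Summarize the benefits of local inference in one sentence.");
Console.WriteLine(answer);
```

You can confirm the offline behavior empirically by disabling the machine's network adapter before running this: inference proceeds unchanged.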
The Exception: Web-Connected Agent Tools
If you build an AI agent that uses the WebSearchTool or HttpGetTool, those specific tool calls will need internet access because they fetch data from external sources by design. However, the agent's reasoning, planning, and all other tool calls remain fully local. You can also build agents that use only local tools (file system, calculator, document processing) for a completely air-gapped setup.
using LMKit.Agents;
using LMKit.Agents.Tools.BuiltIn;
// This agent works completely offline: no network tools registered
var offlineAgent = Agent.CreateBuilder(model)
    .WithPersona("Document Analyst")
    .WithTools(tools =>
    {
        tools.Register(BuiltInTools.Calculator);
        tools.Register(BuiltInTools.DateTime);
        tools.AddFileSystemTools();
    })
    .Build();
How to Set Up an Air-Gapped Deployment
For environments with no internet access at all (military, healthcare, industrial), follow these steps:
1. Download models on a connected machine
// On a machine with internet access
using LM model = LM.LoadFromModelID("qwen3.5:9b",
    downloadingProgress: (_, total, downloaded) =>
    {
        if (total.HasValue) Console.Write($"\r {(double)downloaded / total.Value * 100:F1}%");
        return true;
    });
// The model is now cached in the default model storage directory
2. Copy the model file to the target machine
Transfer the .lmk model file from the cache directory to the target machine using a USB drive, network share, or any secure file transfer method.
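Since this step is ordinary file copying, any tool works; the one detail worth adding is verifying the transfer before unplugging the drive, because a truncated multi-gigabyte copy fails only at load time. A self-contained sketch (it uses temp directories and a dummy file as stand-ins for the real cache directory and USB mount):

```shell
# Demo of step 2: copy a cached .lmk file and verify the transfer.
# In practice SRC is LM-Kit's model cache and DEST the removable drive.
SRC=$(mktemp -d)
DEST=$(mktemp -d)
head -c 1048576 /dev/urandom > "$SRC/model.lmk"   # stand-in for the real model file
cp "$SRC/model.lmk" "$DEST/model.lmk"
# Byte-for-byte comparison catches truncated or corrupted copies.
cmp -s "$SRC/model.lmk" "$DEST/model.lmk" && echo "copy verified"
```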
3. Load from a local path
// On the air-gapped machine: load directly from a known path
using LM model = new LM(new Uri("file:///C:/models/qwen3.5-9b-Q4_K_M.lmk"));
Why This Matters
Running AI locally without internet access solves critical business requirements:
- Data sovereignty. Regulated industries (healthcare, finance, legal, defense) cannot send sensitive data to cloud APIs. LM-Kit.NET keeps all processing on-premises.
- Operational continuity. Field workers, manufacturing floors, and remote sites need AI capabilities even when connectivity is unavailable or unreliable.
- Latency and cost. Local inference eliminates network round trips and per-token cloud API charges. Once you own the hardware, inference is free.
- Privacy by design. No telemetry, no usage tracking, no data leaving your network. Your documents, conversations, and model outputs stay on your machine.
📚 Related Content
- How much disk space do LM-Kit.NET binaries add to my application?: Plan your deployment footprint, including pre-bundled model files.
- Getting Started: Install LM-Kit.NET and run your first local AI application.
- Build a RAG Pipeline: Create an offline knowledge base that answers questions from your own documents.
- Glossary: Inference: Understand how local model inference works under the hood.
- Understanding Model Loading and Caching: Learn how models are cached and resolved from local storage.