
👉 Try the demo: https://github.com/LM-Kit/lm-kit-net-samples/tree/main/console_net/model-optimization/lora-integration/lora_adapter_hot_swap

LoRA Adapter Hot-Swap for C# .NET Applications


🎯 Purpose of the Demo

An interactive console app that loads one base model and swaps multiple LoRA adapters in and out at inference time. Compare baseline vs each adapter on the same prompt, apply/remove/list adapters interactively, or chat with whatever combination is currently active.

All inference runs on-device.


👥 Industry Target Audience

  • Multi-tenant SaaS that ships per-tenant fine-tunes on a shared base.
  • Personas / brand-voice teams swapping stylistic adapters per request.
  • Domain-specialist agents (legal, medical, finance) sharing one base model.
  • Inference servers that need to switch behaviors without reloading the base.

🚀 Problem Solved

A naive multi-persona deployment loads N full models, which multiplies VRAM and disk usage by N. LoRA adapters are kilobytes to a few megabytes each, so one base model in VRAM plus N adapters on disk costs a fraction of that. The demo wraps the API in a menu so the savings are easy to demonstrate.
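To make the comparison concrete, here is a minimal sketch of the arithmetic. The sizes (a 500 MB base, 5 MB adapters, 10 personas) are hypothetical round numbers chosen for illustration, not measurements from this demo, and the class and method names are invented for the example.

```csharp
using System;

static class AdapterCostEstimate
{
    // Naive deployment: every persona is a full copy of the model.
    public static long NaiveMegabytes(int personas, long baseModelMb)
        => personas * baseModelMb;

    // Shared deployment: one base model plus one small adapter per persona.
    public static long SharedMegabytes(int personas, long baseModelMb, long adapterMb)
        => baseModelMb + personas * adapterMb;

    static void Main()
    {
        // Hypothetical sizes, chosen only to keep the arithmetic easy to follow.
        const int personas = 10;
        const long baseModelMb = 500;
        const long adapterMb = 5;

        long naive = NaiveMegabytes(personas, baseModelMb);              // 10 * 500 = 5000
        long shared = SharedMegabytes(personas, baseModelMb, adapterMb); // 500 + 10 * 5 = 550

        Console.WriteLine($"Naive:  {naive} MB");
        Console.WriteLine($"Shared: {shared} MB");
    }
}
```

With these assumed numbers the shared setup needs 550 MB instead of 5000 MB, and the gap widens as the number of personas grows.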


💻 Application Overview

Interactive menu (no command-line arguments) with five modes plus a Quit option:

Mode      What it does
Compare   Paste adapter paths + prompt + scale; run baseline and each adapter cleanly.
Apply     Apply one adapter at a chosen scale to the running model.
Remove    Remove one applied adapter (or all).
List      Print currently-applied adapters.
Chat      Free-form prompt with whatever adapters are active.
Quit      Exit.

The base model loads once at startup. Generate adapters via the LoRA fine-tuning how-to guide, or bring your own.

✨ Key Features

  • LoraAdapterSource(path, scale) constructor.
  • LoraAdapterSource.ValidateFormat(path, throwException) pre-check.
  • LM.ApplyLoraAdapter(LoraAdapterSource) hot-load.
  • LM.Adapters enumeration returning LoraAdapter { Path, Scale, Identifier }.
  • LM.RemoveLoraAdapter(LoraAdapter) to remove one adapter while others stay active.
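The members listed above could be combined roughly as follows. This is a sketch, not the demo's source. The member names come from the list, while the namespaces, the `LM.LoadFromModelID` loading call, the `bool` return of `ValidateFormat`, the `float` scale, and the adapter path are assumptions to check against the LM-Kit API reference. It needs the LM-Kit.NET package and a real adapter file to run.

```csharp
// Sketch only: member names come from the feature list above.
// Namespaces, the loader call, return types, and the path are assumptions.
using System;
using System.Linq;
using LMKit.Finetuning; // assumed home of LoraAdapterSource
using LMKit.Model;      // assumed home of LM

class HotSwapSketch
{
    static void Main()
    {
        // Assumed loader; substitute however your app creates its LM instance.
        LM model = LM.LoadFromModelID("qwen3.5:0.8b");

        string path = "adapters/legal.gguf"; // hypothetical adapter file

        // Pre-check the file. Assumed to return false instead of throwing
        // when throwException is false.
        if (!LoraAdapterSource.ValidateFormat(path, false))
        {
            Console.WriteLine($"Not a usable LoRA adapter: {path}");
            return;
        }

        // Hot-load the adapter at 80% strength onto the already-loaded base.
        model.ApplyLoraAdapter(new LoraAdapterSource(path, 0.8f));

        // List what is currently applied.
        foreach (var adapter in model.Adapters)
            Console.WriteLine($"{adapter.Identifier}  scale={adapter.Scale}  {adapter.Path}");

        // Remove every adapter. Snapshot the collection first so it is not
        // modified while being enumerated.
        foreach (var adapter in model.Adapters.ToList())
            model.RemoveLoraAdapter(adapter);
    }
}
```

The base model is loaded once and only the adapters change, which is what keeps the swap cheap compared with reloading a full model.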

🧠 Model

  • qwen3.5:0.8b (the smallest current dense Qwen 3.5 model; pick any base that is compatible with your adapters).

🛠️ Getting Started

📋 Prerequisites

  • .NET 8.0 or later
  • One or more LoRA .gguf adapters trained on a compatible base.

▶️ Running the Application

git clone https://github.com/LM-Kit/lm-kit-net-samples
cd lm-kit-net-samples/console_net/model-optimization/lora-integration/lora_adapter_hot_swap
dotnet run

Pick a mode from the menu.
