Can LM-Kit.NET Work with Models from Hugging Face?
TL;DR
Yes, directly. You can load any GGUF model from Hugging Face by passing its HTTPS URL to the LM constructor. LM-Kit.NET downloads the file, verifies its checksum, caches it locally, and loads it. The built-in model catalog also points to Hugging Face for all its downloads. However, only the GGUF format is supported. Models in safetensors, PyTorch, GPTQ, or AWQ format must be converted to GGUF first.
Loading a Model from Hugging Face
Pass the direct download URL of a GGUF file on Hugging Face:
```csharp
using LMKit.Model;

// Load any GGUF model directly from Hugging Face
using LM model = new LM(new Uri(
    "https://huggingface.co/bartowski/Llama-3.2-3B-Instruct-GGUF/resolve/main/Llama-3.2-3B-Instruct-Q4_K_M.gguf"
));
```
The SDK handles:
- Downloading the file with progress callbacks and retry logic.
- Checksum verification against Hugging Face metadata (SHA256).
- Local caching so subsequent loads skip the download.
- Model loading into memory with GPU offloading if available.
You can track download progress with a callback:
```csharp
using LM model = new LM(
    new Uri("https://huggingface.co/..."),
    downloadingProgress: (sender, progress) =>
    {
        Console.Write($"\rDownloading: {progress.ProgressPercentage:F1}%");
    }
);
```
The Built-In Model Catalog
The LM-Kit.NET model catalog is a curated set of models, all hosted on Hugging Face. Loading from the catalog uses the same download mechanism but with a simpler API:
```csharp
// Load by model ID (downloads from Hugging Face automatically)
using LM model = LM.LoadFromModelID("qwen3.5:9b");
```
Catalog models use the .lmk file extension, which is standard GGUF with additional LM-Kit metadata (capability flags, recommended settings). Any .lmk file is fully GGUF-compatible.
Supported Architectures
LM-Kit.NET supports a wide range of GGUF model architectures:
| Category | Architectures |
|---|---|
| General text | llama, qwen2, qwen3, gemma2, gemma3, gemma4, mistral3, phi3, deepseek2, falcon-h1, glm4, granite, smollm3, nomic-bert |
| Mixture of Experts | qwen3 MoE variants |
| Vision | qwen2vl, qwen3vl, paddleocr |
| Embeddings | bert, gemma-embedding |
| Speech | whisper |
| Specialized | u2net (image segmentation) |
If your model uses one of these architectures and is in GGUF format, it will work in LM-Kit.NET regardless of where you downloaded it.
Format Requirements
LM-Kit.NET requires GGUF format. Models in other formats need conversion before use:
| Format | How to Convert |
|---|---|
| safetensors | Use llama.cpp's convert_hf_to_gguf.py tool |
| PyTorch (.bin, .pt) | Use llama.cpp conversion tools |
| GPTQ | Download or convert to GGUF quantization instead |
| AWQ | Download or convert to GGUF quantization instead |
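The conversion path for safetensors and PyTorch checkpoints typically runs through llama.cpp. The sketch below shows a common workflow; `convert_hf_to_gguf.py` ships in the llama.cpp repository, but the exact script location, binary name, and flags depend on your checkout and build, so treat the details as assumptions to verify against the llama.cpp documentation.

```shell
# Run from a llama.cpp checkout.
# 1. Convert a Hugging Face model directory to GGUF at F16 precision.
python convert_hf_to_gguf.py /path/to/hf-model-dir \
    --outfile model-f16.gguf --outtype f16

# 2. Quantize the F16 GGUF down to Q4_K_M for a smaller footprint.
./llama-quantize model-f16.gguf model-Q4_K_M.gguf Q4_K_M
```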
All standard GGUF quantizations are supported: Q2_K through Q8_0, and F16/F32 for full precision.
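If you are unsure whether a downloaded file is actually GGUF, you can check its 4-byte magic header before handing it to the SDK. This helper is a hypothetical utility, not an LM-Kit.NET API; it relies only on the fact that GGUF files begin with the ASCII bytes "GGUF":

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper (not part of LM-Kit.NET): returns true if the
// file starts with the 4-byte ASCII magic "GGUF" that all GGUF files use.
static bool LooksLikeGguf(string path)
{
    using FileStream fs = File.OpenRead(path);
    byte[] magic = new byte[4];
    return fs.Read(magic, 0, 4) == 4
        && Encoding.ASCII.GetString(magic) == "GGUF";
}
```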
Finding GGUF Models on Hugging Face
Many model publishers provide GGUF versions on Hugging Face. Look for:
- Repositories with "GGUF" in the name (e.g., `bartowski/Llama-3.2-3B-Instruct-GGUF`)
- Files ending in `.gguf` in the repository's file list
- The `Q4_K_M` quantization as a good balance of quality and size
The URL format for direct download is:
```
https://huggingface.co/{publisher}/{repository}/resolve/main/{filename}.gguf
```
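That URL pattern is simple enough to assemble with a small helper. The function below is a hypothetical convenience, not part of LM-Kit.NET; the `revision` parameter accepts either a branch name such as `main` or a commit hash, which is the same mechanism used for version pinning:

```csharp
using System;

// Hypothetical helper (not an LM-Kit.NET API): builds a direct-download
// URL for a file hosted on Hugging Face. "revision" is a branch name
// such as "main", or a commit hash for version pinning.
static Uri BuildHuggingFaceUrl(string publisher, string repository,
                               string fileName, string revision = "main")
{
    return new Uri(
        $"https://huggingface.co/{publisher}/{repository}/resolve/{revision}/{fileName}");
}

Uri url = BuildHuggingFaceUrl(
    "bartowski", "Llama-3.2-3B-Instruct-GGUF",
    "Llama-3.2-3B-Instruct-Q4_K_M.gguf");
Console.WriteLine(url);
// https://huggingface.co/bartowski/Llama-3.2-3B-Instruct-GGUF/resolve/main/Llama-3.2-3B-Instruct-Q4_K_M.gguf
```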
Version Pinning
To lock a specific model version, replace `main` in the URL with a commit hash:
```csharp
using LM model = new LM(new Uri(
    "https://huggingface.co/lm-kit/qwen3.5-9b-instruct-lmk/resolve/abc123def456/Qwen3.5-9B-Instruct-Q4_K_M.lmk"
));
```
This ensures you always load the exact same file, even if the publisher updates the repository.
📚 Related Content
- Can I use my own GGUF model files with LM-Kit.NET?: All three loading methods (local path, HTTPS URL, catalog).
- What model formats does LM-Kit.NET support?: GGUF, LMK, and ONNX format details.
- How do I handle model versioning and updates?: Version pinning and safe migration workflows.
- Model Catalog: Browse all curated models with hardware recommendations.