👉 Try the demo: https://github.com/LM-Kit/lm-kit-net-samples/tree/main/console_net/local-inference/backends/runtime_diagnostic_report
Hardware Backends Inspection for C# .NET Applications
🎯 Purpose of the Demo
Reports which native backend LM-Kit picked (CPU, CUDA 12/13, Vulkan, Metal, AVX2, AVX, SYCL), enumerates every visible GPU with its total and free VRAM, and shows how a model's layers actually land on devices after LM.LoadFromModelID().
👥 Who Should Use This Demo
- Anyone shipping LM-Kit.NET in a desktop app and wants to log the user's hardware on first run.
- Support engineers diagnosing "the model is slow" tickets.
- DevOps building startup diagnostics for a GPU server fleet.
🚀 What Problem It Solves
When a customer says "performance is bad", you cannot fix it without knowing whether the layers are on GPU at all. This demo is the diagnostic harness you can paste into your own crash handler or first-run setup wizard.
💻 Demo Application Overview
A console app that:
- Calls
LMKit.Global.Runtime.Initialize(). - Prints
Runtime.Backend,Runtime.Version,Runtime.HasGpuSupport,Runtime.EnableCuda,Runtime.EnableVulkan,Runtime.BackendDirectory. - Lists every entry in
GpuDeviceInfo.Deviceswith name, total VRAM, free VRAM. - Loads a small chat model (
gemma3:270m) with auto configuration and printsLM.LayerCountandLM.GpuLayerCount. - Optionally reloads the same model pinned to CPU (
GpuLayerCount = 0) so the placement difference is visible.
✨ Key Features
Runtime.Backendenum:CPU,Cuda12,Cuda13,Metal,Vulkan,Sycl,Avx,Avx2.GpuDeviceInfo.Devicesis aIReadOnlyList<GpuDeviceInfo>withDeviceName,TotalMemorySize,FreeMemorySize,DeviceNumber.LM.DeviceConfiguration.GpuLayerCount = 0forces CPU-only loading.LM.DeviceConfiguration.AutoFitToVram = truelets the loader retry with fewer GPU layers on OOM.
Example Output
LM-Kit runtime version : 2026.5.x
Selected backend : Cuda13
Has GPU support : True
CUDA enabled : True
Vulkan enabled : False
Detected 1 GPU device(s):
# | Device | Total VRAM | Free VRAM
--------------------------------------------------------------------------------
0 | NVIDIA GeForce RTX 4090 | 24.0 GB | 22.4 GB
Layers in model : 26
Layers offloaded GPU : 26
Layers on CPU : 0
⚙️ Getting Started
Run:
cd lm-kit-net-samples/console_net/local-inference/backends/runtime_diagnostic_report
dotnet run
🔧 Troubleshooting
- "No GPU device visible" on a machine with a GPU -> verify the LM-Kit CUDA / Vulkan / Metal backend NuGet was installed and the appropriate driver is present.
Runtime.Backendreports CPU on a machine with CUDA -> the runtime did not find the CUDA shared libraries. CheckRuntime.BackendDirectory.
🚀 Extend the Demo
- Plug the report into your application's About dialog.
- On startup, refuse to load any model larger than
GpuDeviceInfo.Devices[0].FreeMemorySizeand surface the message in your UI. - Add a self-test that runs a 50-token completion and reports tokens/sec by backend.