What Languages Can LM-Kit.NET Models Understand and Generate?

TL;DR

Most models in the LM-Kit.NET catalog are multilingual by training. The Qwen 3.5 and Gemma 4 families support dozens of languages out of the box, including English, Chinese, Japanese, Korean, French, German, Spanish, Arabic, Russian, and many more. LM-Kit.NET also provides dedicated translation and language detection APIs for structured multilingual workflows.

Multilingual Models in the Catalog

Language coverage depends on the model you choose. The families with the broadest coverage are:

| Model Family | Languages | Notes |
| --- | --- | --- |
| Qwen 3.5 (`qwen3.5:0.8b` to `qwen3.6:27b`) | 30+ languages | Strong multilingual instruction following. Excellent for Chinese, Japanese, Korean, and European languages. |
| Qwen 3 Embedding (`qwen3-embedding:*`) | 30+ languages | Multilingual semantic search and RAG. |
| Gemma 4 (`gemma4:e4b` to `gemma4:26b-a4b`) | 30+ languages | Broad multilingual support. Good for European and Asian languages. |
| Mistral / Magistral (`mistral-small3.2`, `magistral-small1.2`) | 20+ languages | Strong for European languages, especially French. |
| Llama 3.1 (`llama3.1:8b`) | 8 languages | English, German, French, Italian, Portuguese, Hindi, Spanish, Thai. |
| BGE-M3 (`bge-m3`) | 100+ languages | Multilingual embedding model. Excellent for cross-language search. |
| Whisper (`whisper-*`) | 99 languages | Speech-to-text transcription across nearly all major languages. |

Translation API

LM-Kit.NET includes a dedicated TextTranslation class that handles translation between any supported language pair:

using LMKit.Model;
using LMKit.Translation;

using LM model = LM.LoadFromModelID("qwen3.5:9b");

var translator = new TextTranslation(model);
var result = translator.Translate(
    text: "How do I configure GPU backends?",
    sourceLanguage: Language.English,
    targetLanguage: Language.French
);

Console.WriteLine(result);
// Example output (exact wording depends on the model):
// "Comment configurer les backends GPU ?"

The translation API automatically handles text chunking for long documents and preserves formatting.

Language Detection

LM-Kit.NET can automatically detect the language of input text, including specialized refiners for challenging language families:

using LMKit.Translation;

// Reuses the `model` instance loaded in the Translation API example above.
var detector = new LanguageDetection(model);
var detection = detector.DetectLanguage("Bonjour, comment allez-vous ?");

Console.WriteLine($"Language: {detection.Language}");     // French
Console.WriteLine($"Confidence: {detection.Confidence}"); // e.g. 0.98

The detection engine includes specialized components for distinguishing between:

CJK languages (Chinese, Japanese, Korean)
Cyrillic-script languages (Russian, Ukrainian, Bulgarian, Serbian)
Latin-script Slavic languages (Polish, Czech, Slovak, Croatian)
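
A detected language is usually only acted on when the confidence is high enough. The sketch below is a hypothetical helper, not part of LM-Kit.NET: it takes the detected language name and confidence as plain values and falls back to a default language when the confidence is below a threshold. The names `DetectionRouting` and `ResolveLanguage` and the 0.8 default are assumptions chosen for illustration.

```csharp
using System;

// Hypothetical helper (not an LM-Kit.NET API): decides whether to trust
// a language detection result based on its confidence score.
public static class DetectionRouting
{
    // Returns the detected language when confidence meets the threshold,
    // otherwise the caller-supplied fallback language.
    public static string ResolveLanguage(
        string detectedLanguage,
        double confidence,
        string fallbackLanguage,
        double threshold = 0.8) // assumed default; tune for your data
    {
        if (confidence < 0.0 || confidence > 1.0)
            throw new ArgumentOutOfRangeException(nameof(confidence), "Expected a value between 0 and 1.");

        return confidence >= threshold ? detectedLanguage : fallbackLanguage;
    }
}
```

For example, `DetectionRouting.ResolveLanguage("French", 0.98, "English")` keeps the detected language, while a confidence of 0.40 would return the fallback. Short or mixed-language inputs are where a fallback of this kind tends to matter most.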

Multilingual RAG

For multilingual knowledge bases, use a multilingual embedding model so queries in one language can retrieve documents written in another:

using LMKit.Model;
using LMKit.Retrieval;

// BGE-M3 supports 100+ languages for cross-language retrieval
using LM embeddingModel = LM.LoadFromModelID("bge-m3");
var ragEngine = new RagEngine(embeddingModel);

// Index documents in multiple languages
ragEngine.ImportDocument("manual-en.pdf");
ragEngine.ImportDocument("manual-fr.pdf");
ragEngine.ImportDocument("manual-de.pdf");

// Query in any language retrieves relevant passages regardless of document language
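
To illustrate why cross-language retrieval can work, the sketch below ranks passages by cosine similarity between embedding vectors. It assumes the common approach in which a multilingual embedding model places texts with similar meaning close together regardless of language; the source does not state which similarity metric `RagEngine` uses internally. The class `CrossLanguageRanking` and its methods are hypothetical, and any vectors passed in are supplied by the caller rather than produced by a model here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration (not an LM-Kit.NET API): ranks passages by
// cosine similarity to a query embedding.
public static class CrossLanguageRanking
{
    // Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for zero vectors.
    public static double CosineSimilarity(IReadOnlyList<double> a, IReadOnlyList<double> b)
    {
        if (a.Count != b.Count)
            throw new ArgumentException("Vectors must have the same length.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Count; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        if (normA == 0 || normB == 0) return 0;
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Returns passage keys ordered from most to least similar to the query.
    public static List<string> Rank(
        IReadOnlyList<double> query,
        IReadOnlyDictionary<string, double[]> passages)
    {
        return passages
            .OrderByDescending(p => CosineSimilarity(query, p.Value))
            .Select(p => p.Key)
            .ToList();
    }
}
```

With real embeddings, a French query and an English passage about the same topic would be expected to score higher than an unrelated passage in the same language as the query, which is the property cross-language RAG relies on.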

Tips for Non-English Use Cases

Choose Qwen 3.5 or Gemma 4 for the broadest language coverage in chat and generation tasks.
Use BGE-M3 for multilingual embeddings and cross-language RAG pipelines.
Use Whisper for multilingual speech-to-text. The `whisper-large-turbo3` model offers the best accuracy across all 99 supported languages.
Larger models perform better on non-English languages. If quality in your target language is insufficient with a 4B model, try 8B or larger.
System prompts in the target language often improve output quality for non-English tasks.
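
The first three tips can be captured as a small lookup. This is a minimal sketch with hypothetical names (`MultilingualTask`, `ModelPicker`), not part of LM-Kit.NET. The model IDs are the ones mentioned in this article; it assumes they are valid catalog IDs, so check them against the Model Catalog before relying on them.

```csharp
using System;

// Hypothetical task categories used only for this illustration.
public enum MultilingualTask
{
    ChatOrGeneration,
    Embedding,
    SpeechToText
}

// Hypothetical helper (not an LM-Kit.NET API): maps a task to a model ID
// following the tips above.
public static class ModelPicker
{
    public static string SuggestModelId(MultilingualTask task) => task switch
    {
        // Qwen 3.5 is one of the two families suggested for broad coverage.
        MultilingualTask.ChatOrGeneration => "qwen3.5:9b",
        // BGE-M3 for multilingual embeddings and cross-language RAG.
        MultilingualTask.Embedding => "bge-m3",
        // Whisper for multilingual speech-to-text (assumed to be a valid model ID).
        MultilingualTask.SpeechToText => "whisper-large-turbo3",
        _ => throw new ArgumentOutOfRangeException(nameof(task))
    };
}
```

The returned string could then be passed to `LM.LoadFromModelID`, as in the examples earlier in this article. A Gemma 4 ID would be an equally reasonable choice for the chat case.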

Related Topics

How do I choose the right model size for my hardware?: Larger models tend to have better multilingual capabilities.
Model Catalog: Browse all models with their language and capability details.
Can I use my own GGUF model files?: Load specialized multilingual models from HuggingFace.
Can LM-Kit.NET process images, PDFs, and audio in one application?: Combine multilingual text processing with speech and vision.
