What Languages Can LM-Kit.NET Models Understand and Generate?
TL;DR
Most models in the LM-Kit.NET catalog are multilingual by training. The Qwen 3.5 and Gemma 4 families support dozens of languages out of the box, including English, Chinese, Japanese, Korean, French, German, Spanish, Arabic, Russian, and many more. LM-Kit.NET also provides dedicated translation and language detection APIs for structured multilingual workflows.
Multilingual Models in the Catalog
Language coverage depends on the model you choose. The main model families and the languages they cover are:
| Model Family | Languages | Notes |
|---|---|---|
| Qwen 3.5 (qwen3.5:0.8b to qwen3.5:27b) | 30+ languages | Strong multilingual instruction following. Excellent for Chinese, Japanese, Korean, and European languages. |
| Qwen 3 Embedding (qwen3-embedding:*) | 30+ languages | Multilingual semantic search and RAG. |
| Gemma 4 (gemma4:e4b to gemma4:26b-a4b) | 30+ languages | Broad multilingual support. Good for European and Asian languages. |
| Mistral / Magistral (mistral-small, magistral-small) | 20+ languages | Strong for European languages, especially French. |
| Llama 3.1 (llama3.1:8b) | 8 languages | English, German, French, Italian, Portuguese, Hindi, Spanish, Thai. |
| BGE-M3 (bge-m3) | 100+ languages | Multilingual embedding model. Excellent for cross-language search. |
| Whisper (whisper-*) | 99 languages | Speech-to-text transcription across nearly all major languages. |
Translation API
LM-Kit.NET includes a dedicated TextTranslation class that handles translation between any supported language pair:
using LMKit.Model;
using LMKit.Translation;
using LM model = LM.LoadFromModelID("qwen3.5:9b");
var translator = new TextTranslation(model);
var result = translator.Translate(
    text: "How do I configure GPU backends?",
    sourceLanguage: Language.English,
    targetLanguage: Language.French
);
Console.WriteLine(result);
// "Comment configurer les backends GPU ?"
The translation API automatically handles text chunking for long documents and preserves formatting.
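LM-Kit.NET performs this chunking internally, so you do not need to write it yourself. To make the idea concrete, here is a minimal sketch of one common approach: grouping paragraphs into chunks that stay under a character budget. `ParagraphChunker`, `Split`, and `maxChars` are hypothetical names used for illustration only. They are not part of the LM-Kit.NET API, and the library's actual strategy may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Usage: split a three-paragraph text with a 40-character budget.
string document = "First paragraph.\n\nSecond paragraph, a bit longer.\n\nThird.";
foreach (string chunk in ParagraphChunker.Split(document, maxChars: 40))
{
    Console.WriteLine($"Chunk of {chunk.Length} characters");
}

// Hypothetical helper (not an LM-Kit.NET type): groups paragraphs into chunks
// of at most maxChars characters, keeping blank-line paragraph breaks intact.
// A single paragraph longer than maxChars is emitted as its own oversized chunk.
static class ParagraphChunker
{
    public static List<string> Split(string text, int maxChars)
    {
        var chunks = new List<string>();
        var current = new StringBuilder();

        foreach (string paragraph in text.Split("\n\n", StringSplitOptions.RemoveEmptyEntries))
        {
            // Flush the current chunk if adding this paragraph (plus the
            // two-character separator) would exceed the budget.
            if (current.Length > 0 && current.Length + 2 + paragraph.Length > maxChars)
            {
                chunks.Add(current.ToString());
                current.Clear();
            }

            if (current.Length > 0)
            {
                current.Append("\n\n");
            }
            current.Append(paragraph);
        }

        if (current.Length > 0)
        {
            chunks.Add(current.ToString());
        }
        return chunks;
    }
}
```

Splitting on paragraph boundaries rather than at a fixed offset keeps sentences whole, which matters for translation because a sentence cut in half loses the context needed to translate it correctly.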
Language Detection
LM-Kit.NET can automatically detect the language of input text, including specialized refiners for challenging language families:
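using LMKit.Model;
using LMKit.Translation;
// Load a model for detection (same model ID as in the translation example)
using LM model = LM.LoadFromModelID("qwen3.5:9b");
var detector = new LanguageDetection(model);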
var result = detector.DetectLanguage("Bonjour, comment allez-vous ?");
Console.WriteLine($"Language: {result.Language}"); // French
Console.WriteLine($"Confidence: {result.Confidence}"); // 0.98
The detection engine includes specialized components for distinguishing between:
- CJK languages (Chinese, Japanese, Korean)
- Cyrillic-script languages (Russian, Ukrainian, Bulgarian, Serbian)
- Latin-script Slavic languages (Polish, Czech, Slovak, Croatian)
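To see why these groups need extra care, consider CJK text. Korean is written in Hangul and Japanese usually contains kana, so inspecting the script settles those cases, but text made up only of Han characters could be Chinese or kanji-only Japanese. The sketch below is a simple Unicode-range heuristic that shows where script inspection stops being enough. `ScriptHint` and `Guess` are hypothetical names, and this is not how LM-Kit.NET implements detection.

```csharp
using System;

Console.WriteLine(ScriptHint.Guess("こんにちは、元気ですか")); // Japanese
Console.WriteLine(ScriptHint.Guess("안녕하세요"));             // Korean
Console.WriteLine(ScriptHint.Guess("你好吗"));                 // Chinese or kanji-only Japanese
Console.WriteLine(ScriptHint.Guess("Bonjour"));                // not CJK

// Hypothetical helper (not an LM-Kit.NET type): guesses a CJK language
// from the Unicode blocks that appear in the text.
static class ScriptHint
{
    public static string Guess(string text)
    {
        bool hasKana = false, hasHangul = false, hasHan = false;

        foreach (char c in text)
        {
            if (c >= '\u3040' && c <= '\u30FF') hasKana = true;        // Hiragana and Katakana
            else if (c >= '\uAC00' && c <= '\uD7A3') hasHangul = true; // Hangul syllables
            else if (c >= '\u4E00' && c <= '\u9FFF') hasHan = true;    // CJK Unified Ideographs
        }

        if (hasKana) return "Japanese";
        if (hasHangul) return "Korean";
        // Han characters alone do not identify the language.
        if (hasHan) return "Chinese or kanji-only Japanese";
        return "not CJK";
    }
}
```

The ambiguous third case, like the choice between closely related Cyrillic-script or Slavic languages, is where vocabulary and context have to decide, which is the role a model-based detector plays.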
Multilingual RAG
For multilingual knowledge bases, use a multilingual embedding model so queries in one language can retrieve documents written in another:
using LMKit.Model;
using LMKit.Retrieval;
// BGE-M3 supports 100+ languages for cross-language retrieval
using LM embeddingModel = LM.LoadFromModelID("bge-m3");
var ragEngine = new RagEngine(embeddingModel);
// Index documents in multiple languages
ragEngine.ImportDocument("manual-en.pdf");
ragEngine.ImportDocument("manual-fr.pdf");
ragEngine.ImportDocument("manual-de.pdf");
// Query in any language retrieves relevant passages regardless of document language
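Cross-language retrieval works because a multilingual embedding model places texts with similar meaning close together in vector space, whatever language they are written in. Retrieval then comes down to ranking passages by vector similarity. The sketch below shows that ranking step with cosine similarity. The three-dimensional vectors are made up for illustration (real embeddings have far more dimensions), and `Similarity.Cosine` is a hypothetical helper, not part of the RagEngine API.

```csharp
using System;
using System.Linq;

// Made-up 3-dimensional vectors standing in for real embeddings.
float[] query = { 0.9f, 0.1f, 0.0f }; // e.g. an English question about GPU setup
var passages = new (string Label, float[] Vector)[]
{
    ("manual-de.pdf: licensing",         new[] { 0.1f, 0.1f, 0.9f }),
    ("manual-fr.pdf: GPU configuration", new[] { 0.8f, 0.2f, 0.1f }),
};

// Rank passages from most to least similar to the query.
foreach (var passage in passages.OrderByDescending(x => Similarity.Cosine(query, x.Vector)))
{
    Console.WriteLine($"{Similarity.Cosine(query, passage.Vector):F3}  {passage.Label}");
}

// Hypothetical helper (not an LM-Kit.NET type).
static class Similarity
{
    // Cosine of the angle between two vectors: values near 1 mean similar
    // direction, values near 0 mean unrelated. Returns NaN for a zero vector.
    public static double Cosine(float[] a, float[] b)
    {
        if (a.Length != b.Length)
        {
            throw new ArgumentException("Vectors must have the same length.");
        }

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }
}
```

With these toy vectors the French GPU passage ranks above the German licensing passage even though the query is in English, which is the behavior a multilingual embedding model is meant to provide.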
Tips for Non-English Use Cases
- Choose Qwen 3.5 or Gemma 4 for the broadest language coverage in chat and generation tasks.
- Use BGE-M3 for multilingual embeddings and cross-language RAG pipelines.
- Use Whisper for multilingual speech-to-text. The whisper-large-turbo3 model offers the best accuracy across all 99 supported languages.
- Larger models perform better on non-English languages. If quality in your target language is insufficient with a 4B model, try 8B or larger.
- System prompts in the target language often improve output quality for non-English tasks.
📚 Related Content
- How do I choose the right model size for my hardware?: Larger models tend to have better multilingual capabilities.
- Model Catalog: Browse all models with their language and capability details.
- Can I use my own GGUF model files?: Load specialized multilingual models from HuggingFace.
- Can LM-Kit.NET process images, PDFs, and audio in one application?: Combine multilingual text processing with speech and vision.