A model can be loaded by its Model ID using the following code:
var model = LM.LoadFromModelID("gemma3:4b"); // Load the Gemma 3 model (4 billion parameters) by its unique Model ID
LM-Kit Model Catalog
Models
| Model & ID | Capabilities | Context | Parameters | Format | License | Download | Details |
|---|---|---|---|---|---|---|---|
| BAAI bge small en v1.5 id: bge-small | Text Embeddings | 512 | 0.03 B | GGUF | mit | bge-small-en-v1.5-f16.gguf | details |
| DeepSeek Coder V2 Lite id: deepseek-coder-v2:16b | Code Completion | 163840 | 15.71 B | GGUF | deepseek | DeepSeek-Coder-2-Lite-15.7B-Instruct-Q4_K_M.gguf | details |
| DeepSeek R1 Distill Llama id: deepseek-r1:8b | Text Embeddings, Text Generation, Chat, Code Completion, Math | 131072 | 8.03 B | GGUF | mit | DeepSeek-R1-Distill-Llama-8B-Q4_K_M.gguf | details |
| TII Falcon 3 Instruct id: falcon3:3b | Text Embeddings, Text Generation, Chat, Code Completion, Math | 32768 | 3.23 B | GGUF | falcon-llm-license | Falcon3-3B-Instruct-q4_k_m.gguf | details |
| TII Falcon 3 Instruct id: falcon3:7b | Text Embeddings, Text Generation, Chat, Code Completion, Math | 32768 | 7.62 B | GGUF | falcon-llm-license | Falcon-3-7.6B-Instruct-Q4_K_M.gguf | details |
| TII Falcon 3 Instruct id: falcon3:10b | Text Embeddings, Text Generation, Chat, Code Completion, Math | 32768 | 10.31 B | GGUF | falcon-llm-license | Falcon3-10B-Instruct-q4_k_m.gguf | details |
| Google Gemma 3 id: gemma3:1b | Text Embeddings, Text Generation, Chat | 32768 | 1.00 B | GGUF | gemma | gemma-3-it-1B-Q4_K_M.gguf | details |
| Google Gemma 3 id: gemma3:4b | Text Embeddings, Text Generation, Chat, Code Completion, Math, Vision | 131072 | 3.88 B | GGUF | gemma | gemma-3-4b-it-Q4_K_M.lmk | details |
| Google Gemma 3 id: gemma3:12b | Text Embeddings, Text Generation, Chat, Code Completion, Math, Vision | 131072 | 11.77 B | GGUF | gemma | gemma-3-12b-it-Q4_K_M.lmk | details |
| Google Gemma 3 id: gemma3:27b | Text Embeddings, Text Generation, Chat, Code Completion, Math, Vision | 131072 | 27.01 B | GGUF | gemma | gemma-3-27b-it-Q4_K_M.lmk | details |
| IBM Granite 3.3 Instruct id: granite3.3:2b | Text Embeddings, Text Generation, Chat, Code Completion | 131072 | 2.53 B | GGUF | apache-2.0 | granite-3.3-2B-Instruct-Q4_K_M.gguf | details |
| IBM Granite 3.3 Instruct id: granite3.3:8b | Text Embeddings, Text Generation, Chat, Code Completion | 131072 | 8.17 B | GGUF | apache-2.0 | granite-3.3-8B-Instruct-Q4_K_M.gguf | details |
| Meta Llama 3.1 Instruct id: llama3.1 | Text Embeddings, Text Generation, Chat | 131072 | 8.03 B | GGUF | llama3.1 | Llama-3.1-8B-Instruct-Q4_K_M.gguf | details |
| Meta Llama 3.2 Instruct id: llama3.2:1b | Text Embeddings, Text Generation, Chat | 131072 | 1.24 B | GGUF | llama3.2 | Llama-3.2-1B-Instruct-Q4_K_M.gguf | details |
| Meta Llama 3.2 Instruct id: llama3.2:3b | Text Embeddings, Text Generation, Chat | 131072 | 3.21 B | GGUF | llama3.2 | Llama-3.2-3B-Instruct-Q4_K_M.gguf | details |
| Meta Llama 3.3 Instruct id: llama3.3 | Text Embeddings, Text Generation, Chat, Code Completion, Math | 131072 | 70.55 B | GGUF | llama3.3 | Llama-3.3-70B-Instruct-Q4_K_M.gguf | details |
| LM-Kit Sarcasm Detection V1 id: lmkit-sarcasm-detection | Sentiment Analysis | 2048 | 1.10 B | GGUF | lm-kit | LM-Kit.Sarcasm_Detection-TinyLlama-1.1B-1T-OpenOrca-en-q4.gguf | details |
| LM-Kit Sentiment Analysis V2 id: lmkit-sentiment-analysis | Sentiment Analysis | 131072 | 1.24 B | GGUF | lm-kit | lm-kit-sentiment-analysis-2.0-1b-q4.gguf | details |
| OpenBMB MiniCPM o 2.6 Vision id: minicpm-o | Text Embeddings, Text Generation, Chat, Vision | 32768 | 8.12 B | LMK | OpenBMB | MiniCPM-o-V-2.6-Q4_K_M.lmk | details |
| Mistral Nemo Instruct 2407 id: mistral-nemo | Text Embeddings, Text Generation, Chat | 1024000 | 12.25 B | GGUF | apache-2.0 | Mistral-Nemo-2407-12.2B-Instruct-Q4_K_M.gguf | details |
| Mistral Small 3.1 Instruct 2503 id: mistral-small3.1 | Text Embeddings, Text Generation, Chat, Code Completion, Math | 131072 | 23.57 B | GGUF | apache-2.0 | Mistral-Small-3.1-24B-Instruct-2503-Q4_K_M.gguf | details |
| Nomic embed text v1.5 id: nomic-embed-text | Text Embeddings | 2048 | 0.14 B | GGUF | apache-2.0 | nomic-embed-text-1.5-Q4_K_M.gguf | details |
| Nomic embed vision v1.5 id: nomic-embed-vision | Image Embeddings | 197 | 0.09 B | ONNX | apache-2.0 | nomic-embed-vision-1.5-Q8.lmk | details |
| Microsoft Phi 4 Instruct id: phi4 | Text Embeddings, Text Generation, Chat, Math | 16384 | 14.66 B | GGUF | mit | Phi-4-14.7B-Instruct-Q4_K_M.gguf | details |
| Microsoft Phi 4 Mini Instruct id: phi4-mini | Text Embeddings, Text Generation, Chat | 131072 | 3.84 B | GGUF | mit | Phi-4-mini-Instruct-Q4_K_M.gguf | details |
| Alibaba Qwen 2 Vision id: qwen2-vl:2b | Text Embeddings, Text Generation, Chat, Vision | 32768 | 2.21 B | LMK | apache-2.0 | Qwen2-VL-2B-Instruct-Q4_K_M.lmk | details |
| Alibaba Qwen 2 Vision id: qwen2-vl:8b | Text Embeddings, Text Generation, Chat, Vision | 32768 | 8.29 B | LMK | apache-2.0 | Qwen2-VL-8.3B-Instruct-Q4_K_M.lmk | details |
| Alibaba Qwen 2.5 Instruct id: qwen2.5:0.5b | Text Embeddings, Text Generation, Chat | 32768 | 0.49 B | GGUF | apache-2.0 | Qwen-2.5-0.5B-Instruct-Q4_K_M.gguf | details |
| Alibaba Qwen 2.5 Instruct id: qwen2.5:3b | Text Embeddings, Text Generation, Chat | 32768 | 3.09 B | GGUF | qwen-research | Qwen-2.5-3.1B-Instruct-Q4_K_M.gguf | details |
| Alibaba Qwen 2.5 Instruct id: qwen2.5:7b | Text Embeddings, Text Generation, Chat | 32768 | 7.62 B | GGUF | apache-2.0 | Qwen-2.5-7B-Instruct-Q4_K_M.gguf | details |
| Alibaba Qwen QwQ id: qwq | Text Embeddings, Text Generation, Chat, Code Completion, Math | 40960 | 32.76 B | GGUF | apache-2.0 | QwQ-32B-Q4_K_M.gguf | details |
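The id shown in the first column is the string accepted by LM.LoadFromModelID. A minimal sketch (the surrounding project setup and using directives are assumed to match your LM-Kit installation; only the LoadFromModelID call itself is documented above):

```
// Any Model ID from the table can be passed to LM.LoadFromModelID.
var embedder = LM.LoadFromModelID("bge-small");             // compact English text-embedding model
var coder = LM.LoadFromModelID("deepseek-coder-v2:16b");    // mixture-of-experts code-completion model
var chat = LM.LoadFromModelID("qwen2.5:7b");                // general chat / text-generation model
```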
Model Details
BAAI bge small en v1.5 (bge-small)
Description:
An efficient, CPU-friendly English embedding model (BAAI General Embedding) designed for lightweight applications.
Specifications:
- Capabilities: Text Embeddings
- Architecture: bert
- Context Length: 512 tokens
- Parameter Count: 33,212,160
- Quantization Precision: 16-bit
- File Size: 64.45 MB
- Format: GGUF
- License: mit
- SHA256:
cd5790da23df71e7e20fe20bb523bd4586a533a4ee813cc562e32b37929141c1
Download:
bge-small-en-v1.5-f16.gguf
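Each entry above lists a SHA256 checksum for its download. A minimal sketch for verifying a downloaded model file against the catalog value, as a standalone .NET console program (requires .NET 5+ for Convert.ToHexString; independent of LM-Kit):

```
using System;
using System.IO;
using System.Security.Cryptography;

class VerifyDownload
{
    static void Main(string[] args)
    {
        // args[0]: path to the downloaded model file
        // args[1]: expected SHA256 value from the catalog (lowercase hex)
        string path = args[0];
        string expected = args[1];

        using var sha = SHA256.Create();
        using var stream = File.OpenRead(path);

        // Hash the file and compare against the published checksum.
        string actual = Convert.ToHexString(sha.ComputeHash(stream)).ToLowerInvariant();
        Console.WriteLine(actual == expected ? "Checksum OK" : "Checksum MISMATCH");
    }
}
```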
DeepSeek Coder V2 Lite 15.7B (deepseek-coder-v2:16b)
Description:
An open-source mixture-of-experts code model tailored for code completion tasks. Early evaluations indicated competitive performance relative to leading code models.
Specifications:
- Capabilities: Code Completion
- Architecture: deepseek2
- Context Length: 163840 tokens
- Parameter Count: 15,706,484,224
- Quantization Precision: 4-bit
- File Size: 9884.28 MB
- Format: GGUF
- License: deepseek
- SHA256:
ac398e8c1c670d3c362d3c1182614916bab7c364708ec073fcf947f6802d509e
Download:
DeepSeek-Coder-2-Lite-15.7B-Instruct-Q4_K_M.gguf
DeepSeek R1 Distill Llama 8B (deepseek-r1:8b)
Description:
DeepSeek-R1 enhances its predecessor by integrating cold-start data to overcome repetition and readability issues, achieving state-of-the-art performance in math, code, and reasoning tasks, with all models open-sourced.
Specifications:
- Capabilities: Text Embeddings, Text Generation, Chat, Code Completion, Math
- Architecture: llama
- Context Length: 131072 tokens
- Parameter Count: 8,030,261,312
- Quantization Precision: 4-bit
- File Size: 4692.78 MB
- Format: GGUF
- License: mit
- SHA256:
596fce705423e44831fe63367a30ccc7b36921c1bfdd4b9dfde85a5aa97ac2ef
Download:
DeepSeek-R1-Distill-Llama-8B-Q4_K_M.gguf
TII Falcon 3 Instruct 3.2B (falcon3:3b)
Description:
Designed for multilingual tasks including chat, text generation, and code completion, supporting extended context lengths.
Specifications:
- Capabilities: Text Embeddings, Text Generation, Chat, Code Completion, Math
- Architecture: llama
- Context Length: 32768 tokens
- Parameter Count: 3,227,655,168
- Quantization Precision: 4-bit
- File Size: 1912.77 MB
- Format: GGUF
- License: falcon-llm-license
- SHA256:
81c6b52d221c2f0eea3db172fc74de28534f2fd15f198ecbfcc55577d20cbf8a
Download:
Falcon3-3B-Instruct-q4_k_m.gguf
TII Falcon 3 Instruct 7.6B (falcon3:7b)
Description:
Offers robust performance across chat, text generation, and mathematical reasoning tasks with extended context support.
Specifications:
- Capabilities: Text Embeddings, Text Generation, Chat, Code Completion, Math
- Architecture: llama
- Context Length: 32768 tokens
- Parameter Count: 7,615,616,512
- Quantization Precision: 4-bit
- File Size: 4358.03 MB
- Format: GGUF
- License: falcon-llm-license
- SHA256:
4ce1da546d76e04ce77eb076556eb25e1096faf6155ee429245e4bfa3f5ddf5d
Download:
Falcon-3-7.6B-Instruct-Q4_K_M.gguf
TII Falcon 3 Instruct 10.3B (falcon3:10b)
Description:
A larger variant tailored for multilingual dialogue, code completion, and complex reasoning tasks with extended context support.
Specifications:
- Capabilities: Text Embeddings, Text Generation, Chat, Code Completion, Math
- Architecture: llama
- Context Length: 32768 tokens
- Parameter Count: 10,305,653,760
- Quantization Precision: 4-bit
- File Size: 5996.25 MB
- Format: GGUF
- License: falcon-llm-license
- SHA256:
a0c0edbd35019ff26d972a0373b25b4c8d72315395a3b6036aca5e6bafa3d819
Download:
Falcon3-10B-Instruct-q4_k_m.gguf
Google Gemma 3 1B (gemma3:1b)
Description:
Gemma is Google's lightweight, multimodal, open AI model family based on Gemini technology, supporting text and image inputs, 128K context windows, multilingual capabilities in over 140 languages, and optimized for resource-limited environments.
Specifications:
- Capabilities: Text Embeddings, Text Generation, Chat
- Architecture: gemma3
- Context Length: 32768 tokens
- Parameter Count: 999,885,952
- Quantization Precision: 4-bit
- File Size: 768.72 MB
- Format: GGUF
- License: gemma
- SHA256:
bacfe3de6eee9fba412d5c0415630172c2a602dae26bb353e1b20dd67194a226
Download:
gemma-3-it-1B-Q4_K_M.gguf
Google Gemma 3 3.9B (gemma3:4b)
Description:
Gemma is Google's lightweight, multimodal, open AI model family based on Gemini technology, supporting text and image inputs, 128K context windows, multilingual capabilities in over 140 languages, and optimized for resource-limited environments.
Specifications:
- Capabilities: Text Embeddings, Text Generation, Chat, Code Completion, Math, Vision
- Architecture: gemma3
- Context Length: 131072 tokens
- Parameter Count: 3,880,099,328
- Quantization Precision: 4-bit
- File Size: 2938.40 MB
- Format: GGUF
- License: gemma
- SHA256:
abb283e96c0abf58468a18127ce6e8b2bfc98e48f1ec618f658495c09254bdae
Download:
gemma-3-4b-it-Q4_K_M.lmk
Google Gemma 3 11.8B (gemma3:12b)
Description:
Gemma is Google's lightweight, multimodal, open AI model family based on Gemini technology, supporting text and image inputs, 128K context windows, multilingual capabilities in over 140 languages, and optimized for resource-limited environments.
Specifications:
- Capabilities: Text Embeddings, Text Generation, Chat, Code Completion, Math, Vision
- Architecture: gemma3
- Context Length: 131072 tokens
- Parameter Count: 11,765,788,416
- Quantization Precision: 4-bit
- File Size: 7529.17 MB
- Format: GGUF
- License: gemma
- SHA256:
d6f01cdb4369769ea87c5211a7fd865e12dbb9e2a937b43ef281a5b7e9ba2e35
Download:
gemma-3-12b-it-Q4_K_M.lmk
Google Gemma 3 27.2B (gemma3:27b)
Description:
Gemma is Google's lightweight, multimodal, open AI model family based on Gemini technology, supporting text and image inputs, 128K context windows, multilingual capabilities in over 140 languages, and optimized for resource-limited environments.
Specifications:
- Capabilities: Text Embeddings, Text Generation, Chat, Code Completion, Math, Vision
- Architecture: gemma3
- Context Length: 131072 tokens
- Parameter Count: 27,009,002,240
- Quantization Precision: 4-bit
- File Size: 16350.05 MB
- Format: GGUF
- License: gemma
- SHA256:
2d0e4382259ae2da28b9c0342e982a58eafbddad7c05bbfe6e104f2b3c165994
Download:
gemma-3-27b-it-Q4_K_M.lmk
IBM Granite 3.3 Instruct 2.5B (granite3.3:2b)
Description:
A long-context instruct model finetuned with a mix of open source and synthetic datasets. Designed for dialogue and text generation tasks.
Specifications:
- Capabilities: Text Embeddings, Text Generation, Chat, Code Completion
- Architecture: granite
- Context Length: 131072 tokens
- Parameter Count: 2,533,539,840
- Quantization Precision: 4-bit
- File Size: 1473.72 MB
- Format: GGUF
- License: apache-2.0
- SHA256:
dbe4dd51bd6c1e39f96c831bf086454c9b313bd1c279ebb7166f2a37d86598da
Download:
granite-3.3-2B-Instruct-Q4_K_M.gguf
IBM Granite 3.3 Instruct 8.2B (granite3.3:8b)
Description:
An extended-context model optimized for dialogue and code completion tasks. Developed with diverse training data to enhance long-context understanding.
Specifications:
- Capabilities: Text Embeddings, Text Generation, Chat, Code Completion
- Architecture: granite
- Context Length: 131072 tokens
- Parameter Count: 8,170,864,640
- Quantization Precision: 4-bit
- File Size: 4713.89 MB
- Format: GGUF
- License: apache-2.0
- SHA256:
1c890e740d7ecb010716a858eda315c01ac5bb0edfaf68bf17118868a26bb8ff
Download:
granite-3.3-8B-Instruct-Q4_K_M.gguf
Meta Llama 3.1 Instruct 8B (llama3.1)
Description:
A multilingual generative model optimized for dialogue and text generation tasks. Designed for robust performance on common benchmarks.
Specifications:
- Capabilities: Text Embeddings, Text Generation, Chat
- Architecture: llama
- Context Length: 131072 tokens
- Parameter Count: 8,030,261,312
- Quantization Precision: 4-bit
- File Size: 4692.78 MB
- Format: GGUF
- License: llama3.1
- SHA256:
ad00fe50a62d1e009b4e06cd57ab55c9a30cbf5e7f183de09115d75ada73bd5b
Download:
Llama-3.1-8B-Instruct-Q4_K_M.gguf
Meta Llama 3.2 Instruct 1.2B (llama3.2:1b)
Description:
A multilingual instruct-tuned model optimized for dialogue, retrieval, and summarization tasks.
Specifications:
- Capabilities: Text Embeddings, Text Generation, Chat
- Architecture: llama
- Context Length: 131072 tokens
- Parameter Count: 1,235,814,432
- Quantization Precision: 4-bit
- File Size: 770.28 MB
- Format: GGUF
- License: llama3.2
- SHA256:
88725e821cf35f1a0dbeaa4a3bebeb91e6c6b6a9d50f808ab42d64233284cce1
Download:
Llama-3.2-1B-Instruct-Q4_K_M.gguf
Meta Llama 3.2 Instruct 3.2B (llama3.2:3b)
Description:
A multilingual dialogue model with robust text generation and summarization capabilities.
Specifications:
- Capabilities: Text Embeddings, Text Generation, Chat
- Architecture: llama
- Context Length: 131072 tokens
- Parameter Count: 3,212,749,888
- Quantization Precision: 4-bit
- File Size: 1925.83 MB
- Format: GGUF
- License: llama3.2
- SHA256:
6810bf3cce69d440a22b85a3b3e28f57c868f1c98686abd995f1dc5d9b955cfe
Download:
Llama-3.2-3B-Instruct-Q4_K_M.gguf
Meta Llama 3.3 Instruct 70.6B (llama3.3)
Description:
A large multilingual generative model optimized for dialogue, text tasks, code completion, and mathematical reasoning with extended context support.
Specifications:
- Capabilities: Text Embeddings, Text Generation, Chat, Code Completion, Math
- Architecture: llama
- Context Length: 131072 tokens
- Parameter Count: 70,553,706,560
- Quantization Precision: 4-bit
- File Size: 40550.61 MB
- Format: GGUF
- License: llama3.3
- SHA256:
57f78fe3b141afa56406278265656524c51c9837edb3537ad43708b6d4ecc04d
Download:
Llama-3.3-70B-Instruct-Q4_K_M.gguf
LM-Kit Sarcasm Detection V1 1.1B (lmkit-sarcasm-detection)
Description:
Optimized for detecting sarcasm in English text within the LM-Kit framework. Suitable for CPU-based inference.
Specifications:
- Capabilities: Sentiment Analysis
- Architecture: llama
- Context Length: 2048 tokens
- Parameter Count: 1,100,048,384
- Quantization Precision: 4-bit
- File Size: 636.88 MB
- Format: GGUF
- License: lm-kit
- SHA256:
cc82abd224dba9c689b19d368db6078d6167ca84897b21870d7d6a2c0f09d7d0
Download:
LM-Kit.Sarcasm_Detection-TinyLlama-1.1B-1T-OpenOrca-en-q4.gguf
LM-Kit Sentiment Analysis V2 1.2B (lmkit-sentiment-analysis)
Description:
Designed for multilingual sentiment analysis tasks, this LM-Kit model is optimized for efficient CPU-based inference.
Specifications:
- Capabilities: Sentiment Analysis
- Architecture: llama
- Context Length: 131072 tokens
- Parameter Count: 1,235,814,432
- Quantization Precision: 4-bit
- File Size: 770.28 MB
- Format: GGUF
- License: lm-kit
- SHA256:
e12f4abf6453a8431985ce1d6350c265cd58b25210156a917e3608c850fd7add
Download:
lm-kit-sentiment-analysis-2.0-1b-q4.gguf
OpenBMB MiniCPM o 2.6 Vision 8.1B (minicpm-o)
Description:
An end-to-end multimodal model supporting real-time speech, image, and text understanding. Offers enhanced performance for conversational tasks.
Specifications:
- Capabilities: Text Embeddings, Text Generation, Chat, Vision
- Architecture: qwen2
- Context Length: 32768 tokens
- Parameter Count: 8,116,736,752
- Quantization Precision: 4-bit
- File Size: 5120.87 MB
- Format: LMK
- License: OpenBMB
- SHA256:
6fd17ed1f46bfcddb5a3482dd882dd022a46aa8c33cb93d75f809cd4d118ab53
Download:
MiniCPM-o-V-2.6-Q4_K_M.lmk
Mistral Nemo Instruct 2407 12.2B (mistral-nemo)
Description:
An instruct-tuned variant developed in collaboration with NVIDIA, balancing model size with performance for conversational tasks.
Specifications:
- Capabilities: Text Embeddings, Text Generation, Chat
- Architecture: llama
- Context Length: 1024000 tokens
- Parameter Count: 12,247,782,400
- Quantization Precision: 4-bit
- File Size: 7130.82 MB
- Format: GGUF
- License: apache-2.0
- SHA256:
579ab8f5178f5900d0c4e14534929aa0dba97e3f97be76b31ebe537ffd6cf169
Download:
Mistral-Nemo-2407-12.2B-Instruct-Q4_K_M.gguf
Mistral Small 3.1 Instruct 2503 24B (mistral-small3.1)
Description:
Mistral Small 3.1 (24B) enhances Mistral Small 3 with advanced vision, 128k context, multilingual support, agentic features, and efficient local deployment.
Specifications:
- Capabilities: Text Embeddings, Text Generation, Chat, Code Completion, Math
- Architecture: llama
- Context Length: 131072 tokens
- Parameter Count: 23,572,403,200
- Quantization Precision: 4-bit
- File Size: 13669.88 MB
- Format: GGUF
- License: apache-2.0
- SHA256:
68922ff3a311c81bc4e983f86e665a12213ee84710c210522f10e65ce980bda7
Download:
Mistral-Small-3.1-24B-Instruct-2503-Q4_K_M.gguf
Nomic embed text v1.5 (nomic-embed-text)
Description:
Provides flexible production embeddings using Matryoshka Representation Learning.
Specifications:
- Capabilities: Text Embeddings
- Architecture: nomic-bert
- Context Length: 2048 tokens
- Parameter Count: 136,731,648
- Quantization Precision: 4-bit
- File Size: 85.86 MB
- Format: GGUF
- License: apache-2.0
- SHA256:
1a60949a331b30bb754ad60b7bdff80d8e563a56b3f7f3f1aed68db8c143003e
Download:
nomic-embed-text-1.5-Q4_K_M.gguf
Nomic embed vision v1.5 (nomic-embed-vision)
Description:
ViT-B/16-based image embedding model trained on 1.5B image-text pairs using Matryoshka Representation Learning. Outputs 768-dim embeddings aligned with Nomic Embed Text v1.5 for multimodal search, retrieval, and zero-shot classification.
Specifications:
- Capabilities: Image Embeddings
- Architecture: ViT-B/16
- Context Length: 197 tokens
- Parameter Count: 92,384,769
- Quantization Precision: 8-bit
- File Size: 92.26 MB
- Format: ONNX
- License: apache-2.0
- SHA256:
4f6f6a765625a4b74ec3e62141b7b83e1db1fb904afeda1fa00c1fefefbcc714
Download:
nomic-embed-vision-1.5-Q8.lmk
Microsoft Phi 4 Instruct 14.7B (phi4)
Description:
An enhanced generative model trained on a diverse dataset to improve instruction adherence and reasoning capabilities.
Specifications:
- Capabilities: Text Embeddings, Text Generation, Chat, Math
- Architecture: phi3
- Context Length: 16384 tokens
- Parameter Count: 14,659,507,200
- Quantization Precision: 4-bit
- File Size: 8633.72 MB
- Format: GGUF
- License: mit
- SHA256:
03af8f5c5a87d526047f5c20c99e32bbafd5db6dbfdee8d498d0fe1a3c45af55
Download:
Phi-4-14.7B-Instruct-Q4_K_M.gguf
Microsoft Phi 4 Mini Instruct 3.8B (phi4-mini)
Description:
A lightweight open model from the Phi-4 family that uses synthetic and curated public data for reasoning-dense outputs, supports a 128K token context, and is enhanced through fine-tuning and preference optimization for precise instruction adherence and robust safety.
Specifications:
- Capabilities: Text Embeddings, Text Generation, Chat
- Architecture: phi3
- Context Length: 131072 tokens
- Parameter Count: 3,836,021,856
- Quantization Precision: 4-bit
- File Size: 2376.44 MB
- Format: GGUF
- License: mit
- SHA256:
556492e72efc8d33406b236830ad38d25669482ea7ad91fc643de237e942b9f9
Download:
Phi-4-mini-Instruct-Q4_K_M.gguf
Alibaba Qwen 2 Vision 2.2B (qwen2-vl:2b)
Description:
A multilingual vision-language model featuring dynamic resolution processing for advanced image and long-video understanding.
Specifications:
- Capabilities: Text Embeddings, Text Generation, Chat, Vision
- Architecture: qwen2vl
- Context Length: 32768 tokens
- Parameter Count: 2,208,985,700
- Quantization Precision: 4-bit
- File Size: 1303.99 MB
- Format: LMK
- License: apache-2.0
- SHA256:
b4e546acfd2271f5a0960b64445cae1091e5fc4192d74db72ae57c28729bd0b8
Download:
Qwen2-VL-2B-Instruct-Q4_K_M.lmk
Alibaba Qwen 2 Vision 8.3B (qwen2-vl:8b)
Description:
An extended variant in the Qwen 2 Vision family for multilingual vision-language tasks, including advanced video analysis.
Specifications:
- Capabilities: Text Embeddings, Text Generation, Chat, Vision
- Architecture: qwen2vl
- Context Length: 32768 tokens
- Parameter Count: 8,291,375,716
- Quantization Precision: 4-bit
- File Size: 4835.38 MB
- Format: LMK
- License: apache-2.0
- SHA256:
90b3eb60611559ba7521590ecccdf1d2a4dfab007566221c6a42f19b91b48686
Download:
Qwen2-VL-8.3B-Instruct-Q4_K_M.lmk
Alibaba Qwen 2.5 Instruct 0.5B (qwen2.5:0.5b)
Description:
A compact variant from the Alibaba Qwen 2.5 family, optimized for instruction following across chat, embeddings, and text generation tasks.
Specifications:
- Capabilities: Text Embeddings, Text Generation, Chat
- Architecture: qwen2
- Context Length: 32768 tokens
- Parameter Count: 494,032,768
- Quantization Precision: 4-bit
- File Size: 379.38 MB
- Format: GGUF
- License: apache-2.0
- SHA256:
09b44ff0ef0a160ffe50778c0828754201bb3a40522a941839c23acfbc9ceec0
Download:
Qwen-2.5-0.5B-Instruct-Q4_K_M.gguf
Alibaba Qwen 2.5 Instruct 3.1B (qwen2.5:3b)
Description:
A mid-sized model from the Alibaba Qwen 2.5 series, designed for diverse tasks including chat, embeddings, and text generation.
Specifications:
- Capabilities: Text Embeddings, Text Generation, Chat
- Architecture: qwen2
- Context Length: 32768 tokens
- Parameter Count: 3,085,938,688
- Quantization Precision: 4-bit
- File Size: 1840.50 MB
- Format: GGUF
- License: qwen-research
- SHA256:
fb88cca2303e7f7d4d52679d633efe66d9c3e3555573b4444abe5ab8af4a97f7
Download:
Qwen-2.5-3.1B-Instruct-Q4_K_M.gguf
Alibaba Qwen 2.5 Instruct 7.6B (qwen2.5:7b)
Description:
A larger variant from the Alibaba Qwen 2.5 series that supports extended context and multiple tasks including chat, embeddings, and text generation.
Specifications:
- Capabilities: Text Embeddings, Text Generation, Chat
- Architecture: qwen2
- Context Length: 32768 tokens
- Parameter Count: 7,615,616,512
- Quantization Precision: 4-bit
- File Size: 4466.13 MB
- Format: GGUF
- License: apache-2.0
- SHA256:
2bf11b8a7d566bddfcc2b222ed7b918afc51239c5f919532de8b9403981ad866
Download:
Qwen-2.5-7B-Instruct-Q4_K_M.gguf
Alibaba Qwen QwQ 32.5B (qwq)
Description:
QwQ is a reasoning-focused model in the Qwen series that significantly outperforms conventional instruction-tuned models on challenging tasks, with QwQ-32B demonstrating competitive performance compared to top reasoning models like DeepSeek-R1 and o1-mini.
Specifications:
- Capabilities: Text Embeddings, Text Generation, Chat, Code Completion, Math
- Architecture: qwen2
- Context Length: 40960 tokens
- Parameter Count: 32,763,876,352
- Quantization Precision: 4-bit
- File Size: 18931.71 MB
- Format: GGUF
- License: apache-2.0
- SHA256:
6c2c72d16bbf5b0c30ac22031e0800b982b7d5c4e4d27daa62b66ee61c565d17
Download:
QwQ-32B-Q4_K_M.gguf