What is Few-Shot Learning in Large Language Models?
TL;DR
Few-shot learning is the ability of a language model to perform a task after seeing just a few examples in the prompt, without any parameter updates. Zero-shot means no examples (just an instruction), one-shot means one example, and few-shot means two or more examples. This capability, known as in-context learning, is an emergent property of large transformer-based models. In LM-Kit.NET, zero-shot learning powers the Categorization class (classify text with just category names), while few-shot patterns can be implemented through prompt engineering, ChatHistory examples, and the Guidance property for steering model behavior without fine-tuning.
What is Few-Shot Learning?
Definition: Few-shot learning refers to a model's ability to generalize from a very small number of examples. In the context of LLMs, this happens entirely through the prompt: you show the model a few input-output pairs, then present a new input, and the model produces the correct output by recognizing the pattern. No weights are updated. No training occurs. The model learns "in context."
The Spectrum
+--------------+   +--------------+   +--------------+   +--------------+
|  Zero-Shot   |   |   One-Shot   |   |   Few-Shot   |   |  Many-Shot   |
| (0 examples) |   | (1 example)  |   |    (2-10     |   |   (10-100+   |
|              |   |              |   |  examples)   |   |  examples)   |
+--------------+   +--------------+   +--------------+   +--------------+
       |                  |                  |                  |
"Classify this     "Here is one       "Here are five     Fine-tuning
 email as spam      example of a       examples of        may be more
 or not spam"       spam email..."     spam..."           efficient
| Approach | Examples in Prompt | Model Weights Updated | Best For |
|---|---|---|---|
| Zero-shot | None | No | Tasks the model already understands from pre-training |
| One-shot | 1 | No | Demonstrating output format or style |
| Few-shot | 2-10 | No | Complex tasks needing pattern demonstration |
| Fine-tuning | 100-10,000+ | Yes | Domain-specific tasks requiring persistent adaptation |
Why In-Context Learning Works
In-context learning is an emergent property of large transformer models. Several factors explain it:
1. Pattern Recognition at Scale
During pre-training on trillions of tokens, models encounter millions of implicit "tasks" formatted as input-output pairs (Q&A, translations, code comments, etc.). At inference time, the model recognizes a new set of examples as following a similar pattern and continues it.
2. Attention Over Examples
The attention mechanism allows every token in the model's response to attend to every token in the examples. The model can directly copy patterns, formats, and reasoning styles from the examples in its context window.
3. Implicit Task Identification
Even with zero examples (zero-shot), large models can identify the task from the instruction alone, because they have seen similar task descriptions during pre-training. Adding examples (few-shot) reduces ambiguity and improves accuracy by making the task specification more concrete.
Zero-Shot, One-Shot, and Few-Shot in Practice
Zero-Shot: Instruction Only
The model relies entirely on its pre-trained knowledge and the task description:
System: You are a sentiment analyzer.
User: Classify the sentiment of this review as positive, negative, or neutral:
"The food was decent but the service was painfully slow."
Assistant: Negative
Zero-shot works well for tasks that models encounter frequently during pre-training (sentiment analysis, translation, summarization). It struggles with novel formats, domain-specific labels, or ambiguous instructions.
One-Shot: One Example
A single example demonstrates the expected format and behavior:
System: Extract the product name and price from customer reviews.
User: Example:
Review: "Bought the AirPods Pro for $249, great noise cancellation!"
Output: {"product": "AirPods Pro", "price": "$249"}
Now extract from this review:
"The Sony WH-1000XM5 at $348 has the best sound quality I've heard."
Assistant: {"product": "Sony WH-1000XM5", "price": "$348"}
Few-Shot: Multiple Examples
Multiple examples cover edge cases and establish a consistent pattern:
System: Classify support tickets by priority.
User: Examples:
"Server is down, all users affected" → Critical
"Login button color is wrong" → Low
"Payment processing fails for some users" → High
"Add dark mode to settings page" → Low
Classify: "Database backup failed overnight, no data loss yet"
Assistant: High
Each additional example reduces ambiguity. The model learns not just the mapping but the nuances: "server down" = Critical, "fails for some users" = High (not Critical because it is partial).
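Mechanically, a few-shot prompt is nothing more than the instruction, the labeled examples, and the new input concatenated into one text that the model completes by continuing the pattern. A language-agnostic sketch (illustrative only, not LM-Kit.NET code; `build_few_shot_prompt` is a hypothetical helper):

```python
# Illustrative sketch: assemble a few-shot classification prompt from
# (input, label) pairs. The model's continuation supplies the final label.
def build_few_shot_prompt(instruction, examples, query):
    lines = [instruction, "Examples:"]
    for text, label in examples:                  # each pair is one "shot"
        lines.append(f'"{text}" -> {label}')
    lines.append(f'Classify: "{query}"')          # the new input to complete
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "Classify support tickets by priority.",
    [("Server is down, all users affected", "Critical"),
     ("Login button color is wrong", "Low")],
    "Database backup failed overnight, no data loss yet")
print(prompt)
```

Whether the examples are inlined in a single user message (as above) or injected as alternating chat turns, the model sees the same kind of pattern to continue.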
Practical Application in LM-Kit.NET SDK
Zero-Shot Classification with Categorization
The Categorization class performs zero-shot classification. You provide category names (and optional descriptions) with no labeled examples:
using LMKit.Model;
using LMKit.TextAnalysis;
var model = LM.LoadFromModelID("gemma3:4b");
var categorizer = new Categorization(model);
// Zero-shot: just category names, no examples
var categories = new List<string>
{
"Technical Issue",
"Billing Question",
"Feature Request",
"General Inquiry"
};
string ticket = "My invoice shows a charge for a service I cancelled last month.";
int result = categorizer.GetBestCategory(categories, ticket);
Console.WriteLine($"Category: {categories[result]}"); // "Billing Question"
Console.WriteLine($"Confidence: {categorizer.Confidence:P0}");
Enhanced Zero-Shot with Category Descriptions
Adding descriptions to categories improves zero-shot accuracy by giving the model more context:
var categories = new List<string>
{
"Critical",
"High",
"Medium",
"Low"
};
var descriptions = new List<string>
{
"System-wide outage affecting all users, data loss risk",
"Major feature broken for a significant subset of users",
"Minor feature issue or degraded performance",
"Cosmetic issue, enhancement request, or documentation"
};
int priority = categorizer.GetBestCategory(categories, descriptions, ticketText);
Zero-Shot with Guidance Steering
The Guidance property lets you inject natural-language instructions that steer classification behavior, approximating few-shot reasoning without explicit examples:
categorizer.Guidance = """
When classifying support tickets:
- If the issue mentions 'down', 'outage', or 'all users', classify as Critical
- Payment and authentication failures affecting users are High
- UI bugs and slow performance are Medium
- Feature requests and documentation are Low
""";
int result = categorizer.GetBestCategory(categories, descriptions, ticketText);
Few-Shot via ChatHistory
For tasks where you need explicit input-output examples, build them into the conversation history:
using LMKit.Model;
using LMKit.TextGeneration;
using LMKit.TextGeneration.Chat;
var model = LM.LoadFromModelID("gemma3:12b");
using var chat = new MultiTurnConversation(model);
chat.SystemPrompt = "Extract the company name and role from job postings. " +
"Output as JSON: {\"company\": \"...\", \"role\": \"...\"}";
// Few-shot examples injected as conversation history
chat.ChatHistory.AddMessage(AuthorRole.User,
"Senior Engineer at Anthropic, working on AI safety research.");
chat.ChatHistory.AddMessage(AuthorRole.Assistant,
"{\"company\": \"Anthropic\", \"role\": \"Senior Engineer\"}");
chat.ChatHistory.AddMessage(AuthorRole.User,
"Google DeepMind is hiring a Research Scientist for their London office.");
chat.ChatHistory.AddMessage(AuthorRole.Assistant,
"{\"company\": \"Google DeepMind\", \"role\": \"Research Scientist\"}");
// Now the model follows the established pattern
var result = await chat.SubmitAsync(
"Join our team as a Staff ML Engineer at OpenAI.");
// Output: {"company": "OpenAI", "role": "Staff ML Engineer"}
Embedding-Based Zero-Shot Classification
For large category sets or when embedding similarity is more appropriate than generative classification:
var categorizer = new Categorization(model)
{
UseEmbeddingClassifier = true // Use semantic similarity instead of generation
};
// Embedding-based classification scales to hundreds of categories
// without consuming context window tokens
int result = categorizer.GetBestCategory(largeCategories, text);
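Conceptually, embedding-based zero-shot classification embeds the text and every category once, then picks the category whose vector is most similar to the text's vector. A minimal sketch of that idea (not the LM-Kit.NET internals; toy 2-d vectors stand in for real embedding output):

```python
import math

def cosine(a, b):
    # cosine similarity between two vectors
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

def best_category(text_vec, category_vecs):
    # argmax over similarity: no text generation, no context-window cost,
    # which is why this scales to hundreds of categories
    scores = [cosine(text_vec, c) for c in category_vecs]
    return scores.index(max(scores))

# Toy vectors standing in for, e.g., "Billing Question" and "Technical Issue"
categories = [[1.0, 0.0], [0.0, 1.0]]
print(best_category([0.9, 0.1], categories))
```

Because category embeddings can be precomputed and cached, the per-query cost is one text embedding plus a similarity scan.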
Few-Shot vs. Fine-Tuning
| Dimension | Few-Shot Prompting | Fine-Tuning |
|---|---|---|
| Examples needed | 2-10 | 100-10,000+ |
| Setup time | Immediate | Hours to days |
| Weight updates | None | Yes (new LoRA or full weights) |
| Flexibility | Change examples at runtime | Requires retraining |
| Accuracy on domain tasks | Good | Excellent |
| Context window cost | Each example consumes tokens | No runtime token cost |
| Persistence | Per-conversation only | Permanent |
Rule of thumb: Start with zero-shot. If accuracy is insufficient, try few-shot. If few-shot with 5-10 examples still falls short, consider fine-tuning. Each step up the ladder requires more effort but yields higher accuracy for domain-specific tasks.
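The escalation ladder above can be made concrete as a small decision helper (a hypothetical sketch, assuming you measure accuracy on a held-out evaluation set at each step):

```python
# Hypothetical helper encoding the rule of thumb: escalate only when the
# measured accuracy of the cheaper approach stays below the target.
def choose_approach(zero_shot_acc, few_shot_acc, target):
    if zero_shot_acc >= target:
        return "zero-shot"      # instruction alone is good enough
    if few_shot_acc >= target:
        return "few-shot"       # 2-10 examples close the gap
    return "fine-tuning"        # persistent weight adaptation as last resort

print(choose_approach(0.70, 0.92, 0.90))
```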
Key Terms
- Zero-Shot Learning: Performing a task from instructions alone, with no examples.
- One-Shot Learning: Performing a task after seeing exactly one example.
- Few-Shot Learning: Performing a task after seeing a small number of examples (typically 2-10).
- In-Context Learning (ICL): The model's ability to learn from examples placed in the prompt, without updating weights.
- Shot: A single input-output example provided in the prompt.
- Exemplar: A carefully chosen example that demonstrates the desired behavior.
- Task Specification: The combination of instructions and examples that defines what the model should do.
- Embedding-Based Classification: Using vector similarity instead of text generation for categorization.
Related API Documentation
- Categorization: Zero-shot classification with category names and descriptions
- ChatHistory: Conversation history for injecting few-shot examples
- AuthorRole: User/Assistant roles for structuring example pairs
- TextExtraction: Structured data extraction (zero-shot with schema)
- MultiTurnConversation: Conversation class supporting few-shot via chat history
Related Glossary Topics
- Prompt Engineering: The broader practice that includes few-shot prompting techniques
- Chain-of-Thought (CoT): Few-shot CoT provides reasoning examples alongside input-output pairs
- Fine-Tuning: The next step when few-shot prompting is not sufficient
- LoRA Adapters: Efficient fine-tuning that creates lightweight task-specific adapters
- Classification: A primary use case for zero-shot and few-shot techniques
- Structured Data Extraction: Extraction tasks that benefit from few-shot examples
- Embeddings: Embedding-based zero-shot classification as an alternative to generative classification
- Context Windows: Few-shot examples consume context tokens
- Temperature: Lower temperature improves few-shot consistency
- Hallucination: Few-shot examples reduce hallucination by grounding the model's behavior
External Resources
- Language Models are Few-Shot Learners (Brown et al., 2020): The GPT-3 paper that demonstrated in-context learning at scale
- Rethinking the Role of Demonstrations (Min et al., 2022): Analysis of why few-shot examples work (format matters more than label correctness)
- Chain-of-Thought Prompting Elicits Reasoning (Wei et al., 2022): Few-shot CoT for complex reasoning
- A Survey on In-Context Learning (Dong et al., 2023): Comprehensive survey of in-context learning mechanisms and applications
Summary
Few-shot learning is the ability of large language models to perform new tasks after seeing just a few examples in the prompt, without any weight updates. This in-context learning capability spans a spectrum from zero-shot (instruction only) to few-shot (multiple examples) to fine-tuning (persistent weight adaptation). In LM-Kit.NET, zero-shot learning is built into the Categorization class for classification, while few-shot patterns are implemented through ChatHistory examples, the Guidance property for natural-language steering, and embedding-based classification for large category sets. Understanding when to use each point on the spectrum (zero-shot for common tasks, few-shot for complex patterns, fine-tuning for domain expertise) is fundamental to building effective AI applications with minimal data.