What is Few-Shot Learning in Large Language Models?
TL;DR
Few-shot learning is the ability of a language model to perform a task after seeing just a few examples in the prompt, without any parameter updates. Zero-shot means no examples (just an instruction), one-shot means one example, and few-shot means two or more examples. This capability, known as in-context learning, is an emergent property of large transformer-based models. In LM-Kit.NET, zero-shot learning powers the Categorization class (classify text with just category names), while few-shot patterns can be implemented through prompt engineering, ChatHistory examples, and the Guidance property for steering model behavior without fine-tuning.
What is Few-Shot Learning?
Definition: Few-shot learning refers to a model's ability to generalize from a very small number of examples. In the context of LLMs, this happens entirely through the prompt: you show the model a few input-output pairs, then present a new input, and the model produces the correct output by recognizing the pattern. No weights are updated. No training occurs. The model learns "in context."
The Spectrum
+--------------+   +--------------+   +--------------+   +--------------+
|  Zero-Shot   |   |   One-Shot   |   |   Few-Shot   |   |  Many-Shot   |
| (0 examples) |   | (1 example)  |   |    (2-10     |   |   (10-100+   |
|              |   |              |   |  examples)   |   |  examples)   |
+--------------+   +--------------+   +--------------+   +--------------+
       |                  |                  |                  |
"Classify this     "Here is one       "Here are five     Fine-tuning
 email as spam      example of a       examples of        may be more
 or not spam"       spam email..."     spam..."           efficient
| Approach | Examples in Prompt | Model Weights Updated | Best For |
|---|---|---|---|
| Zero-shot | None | No | Tasks the model already understands from pre-training |
| One-shot | 1 | No | Demonstrating output format or style |
| Few-shot | 2-10 | No | Complex tasks needing pattern demonstration |
| Fine-tuning | 100-10,000+ | Yes | Domain-specific tasks requiring persistent adaptation |
Why In-Context Learning Works
In-context learning is an emergent property of large transformer models. Several factors explain it:
1. Pattern Recognition at Scale
During pre-training on trillions of tokens, models encounter millions of implicit "tasks" formatted as input-output pairs (Q&A, translations, code comments, etc.). At inference time, the model recognizes a new set of examples as following a similar pattern and continues it.
2. Attention Over Examples
The attention mechanism allows every token in the model's response to attend to every token in the examples. The model can directly copy patterns, formats, and reasoning styles from the examples in its context window.
3. Implicit Task Identification
Even with zero examples (zero-shot), large models can identify the task from the instruction alone, because they have seen similar task descriptions during pre-training. Adding examples (few-shot) reduces ambiguity and improves accuracy by making the task specification more concrete.
Zero-Shot, One-Shot, and Few-Shot in Practice
Zero-Shot: Instruction Only
The model relies entirely on its pre-trained knowledge and the task description:
System: You are a sentiment analyzer.
User: Classify the sentiment of this review as positive, negative, or neutral:
"The food was decent but the service was painfully slow."
Assistant: Negative
Zero-shot works well for tasks that models encounter frequently during pre-training (sentiment analysis, translation, summarization). It struggles with novel formats, domain-specific labels, or ambiguous instructions.
One-Shot: One Example
A single example demonstrates the expected format and behavior:
System: Extract the product name and price from customer reviews.
User: Example:
Review: "Bought the AirPods Pro for $249, great noise cancellation!"
Output: {"product": "AirPods Pro", "price": "$249"}
Now extract from this review:
"The Sony WH-1000XM5 at $348 has the best sound quality I've heard."
Assistant: {"product": "Sony WH-1000XM5", "price": "$348"}
Few-Shot: Multiple Examples
Multiple examples cover edge cases and establish a consistent pattern:
System: Classify support tickets by priority.
User: Examples:
"Server is down, all users affected" → Critical
"Login button color is wrong" → Low
"Payment processing fails for some users" → High
"Add dark mode to settings page" → Low
Classify: "Database backup failed overnight, no data loss yet"
Assistant: High
Each additional example reduces ambiguity. The model learns not just the mapping but the nuances: "server down" = Critical, "fails for some users" = High (not Critical because it is partial).
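Mechanically, a few-shot prompt is nothing more than the instruction, the labeled examples, and the new input concatenated into one text that the model completes by continuing the pattern. A language-agnostic sketch (illustrative only, not LM-Kit.NET code; `build_few_shot_prompt` is a hypothetical helper):

```python
# Illustrative sketch: assemble a few-shot classification prompt from
# (input, label) pairs. The model's continuation supplies the final label.
def build_few_shot_prompt(instruction, examples, query):
    lines = [instruction, "Examples:"]
    for text, label in examples:                  # each pair is one "shot"
        lines.append(f'"{text}" -> {label}')
    lines.append(f'Classify: "{query}"')          # the new input to complete
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "Classify support tickets by priority.",
    [("Server is down, all users affected", "Critical"),
     ("Login button color is wrong", "Low")],
    "Database backup failed overnight, no data loss yet")
print(prompt)
```

Whether the examples are inlined in a single user message (as above) or injected as alternating chat turns, the model sees the same kind of pattern to continue.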
Practical Application in LM-Kit.NET SDK
Zero-Shot Classification with Categorization
The Categorization class performs zero-shot classification. You provide category names (and optional descriptions) with no labeled examples:
using LMKit.Model;
using LMKit.TextAnalysis;
var model = LM.LoadFromModelID("gemma3:4b");
var categorizer = new Categorization(model);
// Zero-shot: just category names, no examples
var categories = new List<string>
{
"Technical Issue",
"Billing Question",
"Feature Request",
"General Inquiry"
};
string ticket = "My invoice shows a charge for a service I cancelled last month.";
int result = categorizer.GetBestCategory(categories, ticket);
Console.WriteLine($"Category: {categories[result]}"); // "Billing Question"
Console.WriteLine($"Confidence: {categorizer.Confidence:P0}");
Enhanced Zero-Shot with Category Descriptions
Adding descriptions to categories improves zero-shot accuracy by giving the model more context:
var categories = new List<string>
{
"Critical",
"High",
"Medium",
"Low"
};
var descriptions = new List<string>
{
"System-wide outage affecting all users, data loss risk",
"Major feature broken for a significant subset of users",
"Minor feature issue or degraded performance",
"Cosmetic issue, enhancement request, or documentation"
};
int priority = categorizer.GetBestCategory(categories, descriptions, ticketText);
Zero-Shot with Guidance Steering
The Guidance property lets you inject natural-language instructions that steer classification behavior, approximating few-shot reasoning without explicit examples:
categorizer.Guidance = """
When classifying support tickets:
- If the issue mentions 'down', 'outage', or 'all users', classify as Critical
- Payment and authentication failures affecting users are High
- UI bugs and slow performance are Medium
- Feature requests and documentation are Low
""";
int result = categorizer.GetBestCategory(categories, descriptions, ticketText);
Few-Shot via ChatHistory
For tasks where you need explicit input-output examples, build them into the conversation history:
using LMKit.Model;
using LMKit.TextGeneration;
using LMKit.TextGeneration.Chat;
var model = LM.LoadFromModelID("gemma3:12b");
using var chat = new MultiTurnConversation(model);
chat.SystemPrompt = "Extract the company name and role from job postings. " +
"Output as JSON: {\"company\": \"...\", \"role\": \"...\"}";
// Few-shot examples injected as conversation history
chat.ChatHistory.AddMessage(AuthorRole.User,
"Senior Engineer at Anthropic, working on AI safety research.");
chat.ChatHistory.AddMessage(AuthorRole.Assistant,
"{\"company\": \"Anthropic\", \"role\": \"Senior Engineer\"}");
chat.ChatHistory.AddMessage(AuthorRole.User,
"Google DeepMind is hiring a Research Scientist for their London office.");
chat.ChatHistory.AddMessage(AuthorRole.Assistant,
"{\"company\": \"Google DeepMind\", \"role\": \"Research Scientist\"}");
// Now the model follows the established pattern
var result = await chat.SubmitAsync(
"Join our team as a Staff ML Engineer at OpenAI.");
// Output: {"company": "OpenAI", "role": "Staff ML Engineer"}
Embedding-Based Zero-Shot Classification
For large category sets or when embedding similarity is more appropriate than generative classification:
var categorizer = new Categorization(model)
{
UseEmbeddingClassifier = true // Use semantic similarity instead of generation
};
// Embedding-based classification scales to hundreds of categories
// without consuming context window tokens
int result = categorizer.GetBestCategory(largeCategories, text);
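Conceptually, embedding-based zero-shot classification embeds the text and every category once, then picks the category whose vector is most similar to the text's vector. A minimal sketch of that idea (not the LM-Kit.NET internals; toy 2-d vectors stand in for real embedding output):

```python
import math

def cosine(a, b):
    # cosine similarity between two vectors
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

def best_category(text_vec, category_vecs):
    # argmax over similarity: no text generation, no context-window cost,
    # which is why this scales to hundreds of categories
    scores = [cosine(text_vec, c) for c in category_vecs]
    return scores.index(max(scores))

# Toy vectors standing in for, e.g., "Billing Question" and "Technical Issue"
categories = [[1.0, 0.0], [0.0, 1.0]]
print(best_category([0.9, 0.1], categories))
```

Because category embeddings can be precomputed and cached, the per-query cost is one text embedding plus a similarity scan.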
Few-Shot vs. Fine-Tuning
| Dimension | Few-Shot Prompting | Fine-Tuning |
|---|---|---|
| Examples needed | 2-10 | 100-10,000+ |
| Setup time | Immediate | Hours to days |
| Weight updates | None | Yes (new LoRA or full weights) |
| Flexibility | Change examples at runtime | Requires retraining |
| Accuracy on domain tasks | Good | Excellent |
| Context window cost | Each example consumes tokens | No runtime token cost |
| Persistence | Per-conversation only | Permanent |
Rule of thumb: Start with zero-shot. If accuracy is insufficient, try few-shot. If few-shot with 5-10 examples still falls short, consider fine-tuning. Each step up the ladder requires more effort but yields higher accuracy for domain-specific tasks.
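The escalation ladder above can be made concrete as a small decision helper (a hypothetical sketch, assuming you measure accuracy on a held-out evaluation set at each step):

```python
# Hypothetical helper encoding the rule of thumb: escalate only when the
# measured accuracy of the cheaper approach stays below the target.
def choose_approach(zero_shot_acc, few_shot_acc, target):
    if zero_shot_acc >= target:
        return "zero-shot"      # instruction alone is good enough
    if few_shot_acc >= target:
        return "few-shot"       # 2-10 examples close the gap
    return "fine-tuning"        # persistent weight adaptation as last resort

print(choose_approach(0.70, 0.92, 0.90))
```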
Key Terms
- Zero-Shot Learning: Performing a task from instructions alone, with no examples.
- One-Shot Learning: Performing a task after seeing exactly one example.
- Few-Shot Learning: Performing a task after seeing a small number of examples (typically 2-10).
- In-Context Learning (ICL): The model's ability to learn from examples placed in the prompt, without updating weights.
- Shot: A single input-output example provided in the prompt.
- Exemplar: A carefully chosen example that demonstrates the desired behavior.
- Task Specification: The combination of instructions and examples that defines what the model should do.
- Embedding-Based Classification: Using vector similarity instead of text generation for categorization.
Related API Documentation
- Categorization: Zero-shot classification with category names and descriptions
- ChatHistory: Conversation history for injecting few-shot examples
- AuthorRole: User/Assistant roles for structuring example pairs
- TextExtraction: Structured data extraction (zero-shot with schema)
- MultiTurnConversation: Conversation class supporting few-shot via chat history
Related Glossary Topics
- Prompt Engineering: The broader practice that includes few-shot prompting techniques
- Chain-of-Thought (CoT): Few-shot CoT provides reasoning examples alongside input-output pairs
- Fine-Tuning: The next step when few-shot prompting is not sufficient
- LoRA Adapters: Efficient fine-tuning that creates lightweight task-specific adapters
- Classification: A primary use case for zero-shot and few-shot techniques
- Structured Data Extraction: Extraction tasks that benefit from few-shot examples
- Embeddings: Embedding-based zero-shot classification as an alternative to generative classification
- Context Windows: Few-shot examples consume context tokens
- Temperature: Lower temperature improves few-shot consistency
- Hallucination: Few-shot examples reduce hallucination by grounding the model's behavior
External Resources
- Language Models are Few-Shot Learners (Brown et al., 2020): The GPT-3 paper that demonstrated in-context learning at scale
- Rethinking the Role of Demonstrations (Min et al., 2022): Analysis of why few-shot examples work (format matters more than label correctness)
- Chain-of-Thought Prompting Elicits Reasoning (Wei et al., 2022): Few-shot CoT for complex reasoning
- A Survey on In-Context Learning (Dong et al., 2023): Comprehensive survey of in-context learning mechanisms and applications
Summary
Few-shot learning is the ability of large language models to perform new tasks after seeing just a few examples in the prompt, without any weight updates. This in-context learning capability spans a spectrum from zero-shot (instruction only) to few-shot (multiple examples) to fine-tuning (persistent weight adaptation). In LM-Kit.NET, zero-shot learning is built into the Categorization class for classification, while few-shot patterns are implemented through ChatHistory examples, the Guidance property for natural-language steering, and embedding-based classification for large category sets. Understanding when to use each point on the spectrum (zero-shot for common tasks, few-shot for complex patterns, fine-tuning for domain expertise) is fundamental to building effective AI applications with minimal data.