What is Chain-of-Thought (CoT) Reasoning?
TL;DR
Chain-of-Thought (CoT) is a prompting and planning technique that guides a language model to reason step by step before producing a final answer. Instead of jumping directly to a conclusion, the model articulates intermediate reasoning steps, dramatically improving accuracy on complex tasks like math, logic, multi-hop question answering, and planning. In LM-Kit.NET, CoT is implemented as the ChainOfThoughtHandler planning strategy, configurable via PlanningStrategy.ChainOfThought on agents, and is complemented by native model reasoning support through ReasoningLevel on conversations.
What is Chain-of-Thought?
Definition: Chain-of-Thought (CoT) is a reasoning technique where a language model generates an explicit sequence of intermediate steps before arriving at a final answer. The term was introduced by Wei et al. (2022) to describe how prompting models to "think step by step" unlocks reasoning capabilities that are latent in the model's weights but not activated by direct question-answer prompting.
Without CoT vs. With CoT
Without CoT:
Q: "If a train travels 120 km in 2 hours, and then 180 km in 3 hours,
what is its average speed for the entire trip?"
A: "72 km/h" ← Wrong (jumped to a guess)
With CoT:
Q: Same question
A: "Let's think through this step by step:
1. Total distance = 120 + 180 = 300 km
2. Total time = 2 + 3 = 5 hours
3. Average speed = 300 / 5 = 60 km/h
Final Answer: 60 km/h" ← Correct
The intermediate steps serve two purposes: they guide the model's own token prediction toward the correct answer, and they make the reasoning transparent and auditable for the developer.
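The two prompt styles above differ only in how the prompt ends. A minimal, library-agnostic sketch (shown in Python, with the model call itself omitted) of the prompt construction, plus the arithmetic the CoT trace walks through:

```python
# Sketch: direct vs. zero-shot CoT prompting for the same question.
# The model call is omitted; only the prompt construction differs.

QUESTION = (
    "If a train travels 120 km in 2 hours, and then 180 km in 3 hours, "
    "what is its average speed for the entire trip?"
)

def direct_prompt(question: str) -> str:
    """Direct prompting: the model must answer in one leap."""
    return f"Q: {question}\nA:"

def cot_prompt(question: str) -> str:
    """Zero-shot CoT: the trailing instruction nudges the model
    into generating intermediate steps before the answer."""
    return f"Q: {question}\nA: Let's think through this step by step:"

# The steps the CoT trace walks through, verified numerically:
total_distance = 120 + 180   # 300 km
total_time = 2 + 3           # 5 hours
average_speed = total_distance / total_time  # 60.0 km/h
```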
Why CoT Works
1. Decomposition
Complex problems become manageable when broken into smaller sub-problems. Each step is simpler to solve correctly than the entire problem at once: the model's chance of getting every small step right is far higher than its chance of leaping directly to the correct answer, and the chain of correct steps carries it to the correct final result.
2. Working Memory
LLMs have no working memory beyond the tokens in their context window. By writing intermediate results into the context, CoT provides a form of external working memory: the model can "look back" at its own reasoning steps, much as a human writes intermediate calculations on paper.
3. Latent Capability Activation
Large models contain reasoning capabilities learned during pre-training on mathematical proofs, code, scientific papers, and logical arguments. CoT prompting activates these latent pathways by matching the pattern of step-by-step reasoning the model saw during training.
CoT Variants
| Variant | Description | Best For |
|---|---|---|
| Zero-shot CoT | Add "Let's think step by step" to the prompt. No examples needed. | Quick reasoning boost on any task |
| Few-shot CoT | Provide examples with explicit reasoning chains before the question | Complex domain-specific reasoning |
| Self-Consistency | Generate multiple CoT paths, then vote on the most common answer | High-stakes decisions requiring confidence |
| ReAct | Interleave reasoning (Thought) with actions (tool calls) and observations | Agentic tasks with external data access |
| Tree-of-Thought | Explore multiple reasoning paths with backtracking | Problems with many possible solution paths |
| Plan-and-Execute | Generate a full plan first, then execute steps sequentially | Multi-step workflows with dependencies |
| Reflection | Generate, self-critique, then refine the answer | Quality-sensitive tasks requiring self-correction |
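Self-Consistency from the table above can be sketched language-agnostically: sample several independent CoT completions at nonzero temperature, extract each final answer, and take a majority vote. Here `sample_cot` is a hypothetical stand-in for a model call, not an LM-Kit API:

```python
from collections import Counter

def self_consistency(sample_cot, question, n=5):
    """Run n independent CoT samples and return the most common final answer.

    sample_cot(question) is a stand-in for a (temperature > 0) model call
    that returns a reasoning trace ending in 'Final Answer: <value>'.
    """
    answers = []
    for _ in range(n):
        trace = sample_cot(question)
        # Keep only the text after the last 'Final Answer:' marker.
        answers.append(trace.rsplit("Final Answer:", 1)[-1].strip())
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / n  # answer plus its vote share as a crude confidence

# Usage with canned samples: three chains agree on 60 km/h, one diverges.
canned = iter([
    "... Final Answer: 60 km/h",
    "... Final Answer: 60 km/h",
    "... Final Answer: 72 km/h",
    "... Final Answer: 60 km/h",
])
answer, confidence = self_consistency(lambda q: next(canned), "average speed?", n=4)
# answer == "60 km/h", confidence == 0.75
```

The vote share doubles as a rough confidence signal: unanimous chains suggest a robust answer, while a split vote flags a question worth escalating.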
Planning Strategies in LM-Kit.NET
LM-Kit.NET implements CoT and its variants as planning strategies through the PlanningStrategy enum and corresponding handler classes in the LMKit.Agents.Planning namespace:
| Strategy | Handler | Description |
|---|---|---|
| None | NonePlanningHandler | Direct response, no reasoning overhead |
| ChainOfThought | ChainOfThoughtHandler | Step-by-step reasoning before a final answer |
| ReAct | ReActHandler | Thought-Action-Observation loop with tool calls |
| PlanAndExecute | PlanAndExecuteHandler | Two-phase: create a plan, then execute each step |
| Reflection | ReflectionHandler | Generate, self-critique, then refine |
| TreeOfThought | TreeOfThoughtHandler | Multi-path exploration with backtracking |
ChainOfThoughtHandler
The ChainOfThoughtHandler injects a reasoning instruction into the prompt and parses the model's output for a structured final answer:
- ReasoningInstruction: Customizable prompt prefix (default: "Let's think through this step by step:")
- RequireFinalAnswerMarker: When true, the model must emit "Final Answer:" to separate reasoning from the answer
- Instance: Singleton for the default configuration
- WithInstruction(string): Factory for custom instructions
- WithRequiredMarker(): Factory requiring the final answer marker
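What a required final-answer marker implies — splitting the completion into a reasoning trace and an answer, and detecting when the marker is missing — can be sketched generically. This is an illustration of the parsing idea, not LM-Kit's implementation:

```python
def split_cot_output(text: str, marker: str = "Final Answer:"):
    """Split a CoT completion into (reasoning, answer) at the last marker.

    Returns (text, None) when the marker is absent, so a caller that
    requires the marker can detect the failure and retry or re-prompt.
    """
    if marker not in text:
        return text, None
    reasoning, _, answer = text.rpartition(marker)
    return reasoning.strip(), answer.strip()

reasoning, answer = split_cot_output(
    "1. Total distance = 300 km\n2. Total time = 5 h\nFinal Answer: 60 km/h"
)
# answer == "60 km/h"
```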
ReActHandler
The ReActHandler extends CoT with an action loop. The model alternates between:
- Thought: Reasoning about what to do next
- Action: Calling a tool with specific arguments
- Observation: Receiving the tool's result
This cycle continues until the model reaches a Final Answer. ReAct is the standard strategy for agents that need to interact with external systems.
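The Thought-Action-Observation cycle above can be sketched as a simple loop. Everything here is hypothetical scaffolding — `complete` stands in for a model call, the `Action: tool[input]` format and tool names are illustrative, and none of it is LM-Kit's ReActHandler internals:

```python
def react_loop(complete, tools, question, max_steps=8):
    """Minimal ReAct loop sketch.

    complete(transcript) is a stand-in model call returning the next
    'Thought: ... Action: tool[input]' or 'Final Answer: ...' segment;
    tools maps tool names to callables.
    """
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = complete(transcript)
        transcript += step + "\n"
        if "Final Answer:" in step:
            return step.rsplit("Final Answer:", 1)[-1].strip()
        # Parse 'Action: name[argument]', run the tool, feed back the result.
        action = step.rsplit("Action:", 1)[-1].strip()
        name, arg = action.split("[", 1)
        result = tools[name.strip()](arg.rstrip("]"))
        transcript += f"Observation: {result}\n"
    raise RuntimeError("No final answer within step budget")

# Usage with a canned two-step model and a calculator tool.
steps = iter([
    "Thought: I need 15% of 340.\nAction: calculator[340 * 0.15]",
    "Thought: The result is 51.\nFinal Answer: 51",
])
answer = react_loop(
    complete=lambda transcript: next(steps),
    tools={"calculator": lambda expr: eval(expr)},  # toy calculator for the sketch
    question="What is 15% of 340?",
)
# answer == "51"
```

The step budget matters in practice: without it, a model that never emits the final-answer marker would loop indefinitely, burning tokens on every iteration.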
Code Example
Agent with Chain-of-Thought Planning
```csharp
using LMKit.Model;
using LMKit.Agents;

var model = LM.LoadFromModelID("gemma3:12b");

// Build an agent with CoT planning
var agent = Agent.CreateBuilder(model)
    .WithPlanning(PlanningStrategy.ChainOfThought)
    .WithSystemPrompt("You are a precise analytical assistant. Always show your reasoning.")
    .Build();

// The agent will reason step by step before answering
using var executor = new AgentExecutor();
var result = await executor.ExecuteAsync(agent, "What is 15% of 340, rounded to the nearest integer?");

// Model output includes visible reasoning steps + "Final Answer: 51"
```
Agent with ReAct (CoT + Tools)
```csharp
using LMKit.Model;
using LMKit.Agents;
using LMKit.Agents.Tools.BuiltIn;

var model = LM.LoadFromModelID("glm4.7-flash");

// ReAct combines step-by-step reasoning with tool calls
var agent = Agent.CreateBuilder(model)
    .WithPlanning(PlanningStrategy.ReAct)
    .WithTools(tools =>
    {
        tools.Register(BuiltInTools.WebSearch);
        tools.Register(BuiltInTools.Calculator);
    })
    .Build();

// The agent will: Think → Search → Observe → Think → Calculate → Final Answer
using var executor = new AgentExecutor();
var result = await executor.ExecuteAsync(agent,
    "What is the population of Tokyo divided by the population of Paris?");
```
Custom CoT Instruction
```csharp
using LMKit.Agents;
using LMKit.Agents.Planning;

// Customize the reasoning instruction for a specific domain
var cotHandler = ChainOfThoughtHandler.WithInstruction(
    "Analyze this medical report systematically:\n" +
    "1. Identify key findings\n" +
    "2. Note any abnormal values\n" +
    "3. Consider possible implications\n" +
    "4. Provide your assessment");

var agent = Agent.CreateBuilder(model)
    .WithPlanning(cotHandler)
    .Build();
```
Native Model Reasoning Level
```csharp
using LMKit.Model;
using LMKit.TextGeneration;
using LMKit.TextGeneration.Chat;

var model = LM.LoadFromModelID("qwen3:14b");
using var chat = new MultiTurnConversation(model);

// Enable the model's built-in reasoning mode (thinking tokens)
chat.ReasoningLevel = ReasoningLevel.High;

// The model internally generates chain-of-thought tokens
// before producing the visible response
var answer = await chat.SubmitAsync("Solve: x^2 + 5x + 6 = 0");
```
ReasoningLevel controls the model's native thinking mode (available on models like Qwen 3 that support it). This is distinct from the ChainOfThoughtHandler planning strategy: reasoning level triggers internal thinking tokens at the model level, while CoT planning structures the visible prompt and output format at the application level. Both can be used together.
When to Use CoT
| Scenario | Strategy | Why |
|---|---|---|
| Math, logic, or multi-step reasoning | ChainOfThought | Makes intermediate steps explicit and auditable |
| Tasks requiring external data | ReAct | Combines reasoning with tool calls |
| Complex workflows with ordered steps | PlanAndExecute | Plans the full sequence before acting |
| Quality-critical output | Reflection | Self-correction improves accuracy |
| Problems with many possible paths | TreeOfThought | Explores alternatives before committing |
| Simple, straightforward questions | None | CoT adds unnecessary overhead and latency |
The Cost of CoT
CoT is not free. Each reasoning step consumes tokens from the context window and adds latency. For simple factual questions ("What is the capital of France?"), CoT wastes tokens on unnecessary reasoning. Use CoT selectively for tasks where explicit reasoning demonstrably improves accuracy.
Key Terms
- Chain-of-Thought (CoT): A technique where the model generates intermediate reasoning steps before the final answer.
- Zero-Shot CoT: Adding "Let's think step by step" to the prompt without providing examples.
- Few-Shot CoT: Providing examples with explicit reasoning chains as context before the question.
- ReAct (Reasoning and Acting): Interleaving thoughts, tool calls, and observations.
- Tree-of-Thought (ToT): Exploring multiple reasoning branches and backtracking from dead ends.
- Self-Consistency: Generating multiple reasoning chains and selecting the most common final answer.
- Reasoning Level: A model-native parameter controlling how much internal thinking the model performs.
- Final Answer Marker: A delimiter ("Final Answer:") that separates the reasoning trace from the answer.
Related API Documentation
- ChainOfThoughtHandler: CoT planning handler with customizable instructions
- ReActHandler: ReAct planning handler for reasoning with tools
- PlanningStrategy: Enum for selecting agent planning strategies
- ReasoningLevel: Native model reasoning intensity
- TreeOfThoughtHandler: Multi-path reasoning with backtracking
- ReflectionHandler: Generate-critique-refine pattern
Related Glossary Topics
- AI Agent Reasoning: The broader reasoning framework that CoT is part of
- AI Agent Planning: Planning strategies including CoT, ReAct, and Tree-of-Thought
- AI Agent Tools: Tools that ReAct agents invoke during reasoning
- Prompt Engineering: Crafting prompts that elicit step-by-step reasoning
- AI Agents: Autonomous systems that use planning strategies
- AI Agent Reflection: Self-correction as a post-reasoning strategy
- Context Windows: CoT reasoning consumes context tokens
- Hallucination: CoT reduces hallucination on reasoning tasks by making logic explicit
- Few-Shot Learning: Few-shot CoT provides reasoning examples in the prompt
External Resources
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (Wei et al., 2022): The foundational CoT paper
- ReAct: Synergizing Reasoning and Acting in Language Models (Yao et al., 2023): The ReAct pattern for tool-using agents
- Tree of Thoughts: Deliberate Problem Solving with Large Language Models (Yao et al., 2023): Multi-path reasoning with backtracking
- Self-Consistency Improves Chain of Thought Reasoning (Wang et al., 2023): Voting across multiple reasoning paths
Summary
Chain-of-Thought (CoT) is the technique of guiding language models to reason step by step before producing a final answer. It dramatically improves accuracy on complex tasks by decomposing problems, providing working memory through the context window, and activating latent reasoning capabilities. In LM-Kit.NET, CoT is implemented through the ChainOfThoughtHandler planning strategy, with related strategies for tool-augmented reasoning (ReAct), multi-path exploration (Tree-of-Thought), and self-correction (Reflection). The ReasoningLevel property provides an additional, model-native reasoning knob. CoT is essential for agentic AI applications where accuracy, transparency, and auditability of the reasoning process matter.