What is Chain-of-Thought (CoT) Reasoning?


TL;DR

Chain-of-Thought (CoT) is a prompting and planning technique that guides a language model to reason step by step before producing a final answer. Instead of jumping directly to a conclusion, the model articulates intermediate reasoning steps, dramatically improving accuracy on complex tasks like math, logic, multi-hop question answering, and planning. In LM-Kit.NET, CoT is implemented as the ChainOfThoughtHandler planning strategy, configurable via PlanningStrategy.ChainOfThought on agents, and is complemented by native model reasoning support through ReasoningLevel on conversations.


What is Chain-of-Thought?

Definition: Chain-of-Thought (CoT) is a reasoning technique where a language model generates an explicit sequence of intermediate steps before arriving at a final answer. The term was introduced by Wei et al. (2022) to describe how prompting models to "think step by step" unlocks reasoning capabilities that are latent in the model's weights but not activated by direct question-answer prompting.

Without CoT vs. With CoT

Without CoT:
  Q: "If a train travels 120 km in 1 hour, and then 180 km in 3 hours,
      what is its average speed for the entire trip?"
  A: "90 km/h"  ← Wrong (averaged the two leg speeds: (120 + 60) / 2)

With CoT:
  Q: Same question
  A: "Let's think through this step by step:
      1. Total distance = 120 + 180 = 300 km
      2. Total time = 1 + 3 = 4 hours
      3. Average speed = 300 / 4 = 75 km/h
      Final Answer: 75 km/h"  ← Correct

The intermediate steps serve two purposes: they guide the model's own token prediction toward the correct answer, and they make the reasoning transparent and auditable for the developer.


Why CoT Works

1. Decomposition

Complex problems become manageable when broken into smaller sub-problems. Each step is simpler to solve correctly than the entire problem at once: the model's probability of getting each small step right is high, which makes it far more likely that the chain as a whole reaches the correct final answer.

2. Working Memory

LLMs have no persistent memory between tokens. By writing intermediate results into the context window, CoT provides a form of external working memory. The model can "look back" at its own reasoning steps in the same way a human writes intermediate calculations on paper.

3. Latent Capability Activation

Large models contain reasoning capabilities learned during pre-training on mathematical proofs, code, scientific papers, and logical arguments. CoT prompting activates these latent pathways by matching the pattern of step-by-step reasoning the model saw during training.


CoT Variants

  Variant          | Description                                                                | Best For
  Zero-shot CoT    | Add "Let's think step by step" to the prompt. No examples needed.          | Quick reasoning boost on any task
  Few-shot CoT     | Provide examples with explicit reasoning chains before the question        | Complex domain-specific reasoning
  Self-Consistency | Generate multiple CoT paths, then vote on the most common answer           | High-stakes decisions requiring confidence
  ReAct            | Interleave reasoning (Thought) with actions (tool calls) and observations  | Agentic tasks with external data access
  Tree-of-Thought  | Explore multiple reasoning paths with backtracking                         | Problems with many possible solution paths
  Plan-and-Execute | Generate a full plan first, then execute steps sequentially                | Multi-step workflows with dependencies
  Reflection       | Generate, self-critique, then refine the answer                            | Quality-sensitive tasks requiring self-correction
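
Self-Consistency in particular can be layered on top of any CoT-capable generation API: sample several reasoning chains for the same question, then vote on the final answers. A minimal sketch of the voting step in C#, assuming you have already collected the completions and that each ends with a "Final Answer:" line (the helper names here are illustrative, not part of LM-Kit.NET):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SelfConsistency
{
    // Extract the text after the "Final Answer:" marker, or null if absent.
    static string? ExtractFinalAnswer(string completion)
    {
        const string marker = "Final Answer:";
        int idx = completion.LastIndexOf(marker, StringComparison.OrdinalIgnoreCase);
        return idx < 0 ? null : completion[(idx + marker.Length)..].Trim();
    }

    // Majority vote across several CoT completions of the same question.
    public static string? Vote(IEnumerable<string> completions) =>
        completions
            .Select(ExtractFinalAnswer)
            .Where(a => a != null)
            .Select(a => a!)
            .GroupBy(a => a, StringComparer.OrdinalIgnoreCase)
            .OrderByDescending(g => g.Count())
            .Select(g => g.Key)
            .FirstOrDefault();
}
```

Each chain may take a different route to its answer; sampling with nonzero temperature and voting on the final answers trades extra tokens for higher confidence.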

Planning Strategies in LM-Kit.NET

LM-Kit.NET implements CoT and its variants as planning strategies through the PlanningStrategy enum and corresponding handler classes in the LMKit.Agents.Planning namespace:

  Strategy        | Handler                 | Description
  None            | NonePlanningHandler     | Direct response, no reasoning overhead
  ChainOfThought  | ChainOfThoughtHandler   | Step-by-step reasoning before a final answer
  ReAct           | ReActHandler            | Thought-Action-Observation loop with tool calls
  PlanAndExecute  | PlanAndExecuteHandler   | Two-phase: create a plan, then execute each step
  Reflection      | ReflectionHandler       | Generate, self-critique, then refine
  TreeOfThought   | TreeOfThoughtHandler    | Multi-path exploration with backtracking

ChainOfThoughtHandler

The ChainOfThoughtHandler injects a reasoning instruction into the prompt and parses the model's output for a structured final answer:

  • ReasoningInstruction: Customizable prompt prefix (default: "Let's think through this step by step:")
  • RequireFinalAnswerMarker: When true, the model must emit "Final Answer:" to separate reasoning from the answer
  • Instance: Singleton for default configuration
  • WithInstruction(string): Factory for custom instructions
  • WithRequiredMarker(): Factory requiring the final answer marker
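
For instance, a handler that enforces the final answer marker, so the answer can be parsed reliably out of the reasoning trace, might be wired up as follows. This is a minimal sketch built only from the members listed above and the builder pattern shown in the code examples later in this article:

```csharp
using LMKit.Agents;
using LMKit.Agents.Planning;

// Require the model to emit "Final Answer:" to terminate its reasoning.
var cotHandler = ChainOfThoughtHandler.WithRequiredMarker();

var agent = Agent.CreateBuilder(model)   // 'model' loaded as in the examples below
    .WithPlanning(cotHandler)
    .Build();
```

When no customization is needed, ChainOfThoughtHandler.Instance provides the default configuration.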

ReActHandler

The ReActHandler extends CoT with an action loop. The model alternates between:

  1. Thought: Reasoning about what to do next
  2. Action: Calling a tool with specific arguments
  3. Observation: Receiving the tool's result

This cycle continues until the model reaches a Final Answer. ReAct is the standard strategy for agents that need to interact with external systems.
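An illustrative trace of this loop, using the Tokyo/Paris question from the code examples below (the figures and exact wording are illustrative and vary by model and prompt template):

```
Thought: I need the current population of Tokyo.
Action: WebSearch("Tokyo metropolitan population")
Observation: approximately 37 million
Thought: Now I need the population of Paris.
Action: WebSearch("Paris metropolitan population")
Observation: approximately 11 million
Thought: Divide the two figures.
Action: Calculator("37000000 / 11000000")
Observation: 3.36
Final Answer: Tokyo's population is roughly 3.4 times that of Paris.
```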


Code Example

Agent with Chain-of-Thought Planning

using LMKit.Model;
using LMKit.Agents;

var model = LM.LoadFromModelID("gemma3:12b");

// Build an agent with CoT planning
var agent = Agent.CreateBuilder(model)
    .WithPlanning(PlanningStrategy.ChainOfThought)
    .WithSystemPrompt("You are a precise analytical assistant. Always show your reasoning.")
    .Build();

// The agent will reason step by step before answering
using var executor = new AgentExecutor();
var result = await executor.ExecuteAsync(agent, "What is 15% of 340, rounded to the nearest integer?");
// Model output includes visible reasoning steps + "Final Answer: 51"

Agent with ReAct (CoT + Tools)

using LMKit.Model;
using LMKit.Agents;
using LMKit.Agents.Tools.BuiltIn;

var model = LM.LoadFromModelID("glm4.7-flash");

// ReAct combines step-by-step reasoning with tool calls
var agent = Agent.CreateBuilder(model)
    .WithPlanning(PlanningStrategy.ReAct)
    .WithTools(tools =>
    {
        tools.Register(BuiltInTools.WebSearch);
        tools.Register(BuiltInTools.Calculator);
    })
    .Build();

// The agent will: Think → Search → Observe → Think → Calculate → Final Answer
using var executor = new AgentExecutor();
var result = await executor.ExecuteAsync(agent,
    "What is the population of Tokyo divided by the population of Paris?");

Custom CoT Instruction

using LMKit.Agents;
using LMKit.Agents.Planning;

// Customize the reasoning instruction for a specific domain
var cotHandler = ChainOfThoughtHandler.WithInstruction(
    "Analyze this medical report systematically:\n" +
    "1. Identify key findings\n" +
    "2. Note any abnormal values\n" +
    "3. Consider possible implications\n" +
    "4. Provide your assessment");

var agent = Agent.CreateBuilder(model)
    .WithPlanning(cotHandler)
    .Build();

Native Model Reasoning Level

using LMKit.Model;
using LMKit.TextGeneration;
using LMKit.TextGeneration.Chat;

var model = LM.LoadFromModelID("qwen3:14b");
using var chat = new MultiTurnConversation(model);

// Enable the model's built-in reasoning mode (thinking tokens)
chat.ReasoningLevel = ReasoningLevel.High;

// The model internally generates chain-of-thought tokens
// before producing the visible response
var answer = await chat.SubmitAsync("Solve: x^2 + 5x + 6 = 0");

ReasoningLevel controls the model's native thinking mode (available on models like Qwen 3 that support it). This is distinct from the ChainOfThoughtHandler planning strategy: reasoning level triggers internal thinking tokens at the model level, while CoT planning structures the visible prompt and output format at the application level. Both can be used together.


When to Use CoT

  Scenario                              | Strategy        | Why
  Math, logic, or multi-step reasoning  | ChainOfThought  | Makes intermediate steps explicit and auditable
  Tasks requiring external data         | ReAct           | Combines reasoning with tool calls
  Complex workflows with ordered steps  | PlanAndExecute  | Plans the full sequence before acting
  Quality-critical output               | Reflection      | Self-correction improves accuracy
  Problems with many possible paths     | TreeOfThought   | Explores alternatives before committing
  Simple, straightforward questions     | None            | CoT adds unnecessary overhead and latency

The Cost of CoT

CoT is not free. Each reasoning step consumes tokens from the context window and adds latency. For simple factual questions ("What is the capital of France?"), CoT wastes tokens on unnecessary reasoning. Use CoT selectively for tasks where explicit reasoning demonstrably improves accuracy.
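
One way to apply CoT selectively is to choose the planning strategy per request, falling back to PlanningStrategy.None for simple lookups. A hedged sketch; the classification heuristic here is deliberately crude and illustrative, not part of LM-Kit.NET:

```csharp
using System;
using System.Linq;
using LMKit.Agents;

// Illustrative routing: send short factual lookups to a no-planning agent,
// and questions that look multi-step to a CoT agent.
static PlanningStrategy ChooseStrategy(string question)
{
    bool looksMultiStep =
        question.Contains("calculate", StringComparison.OrdinalIgnoreCase) ||
        question.Contains("step", StringComparison.OrdinalIgnoreCase) ||
        question.Count(c => c == '?') > 1 ||
        question.Length > 200;

    return looksMultiStep ? PlanningStrategy.ChainOfThought : PlanningStrategy.None;
}
```

In production, a cheap classifier model or a cached per-task-type decision is a common replacement for a keyword heuristic like this.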


Key Terms

  • Chain-of-Thought (CoT): A technique where the model generates intermediate reasoning steps before the final answer.
  • Zero-Shot CoT: Adding "Let's think step by step" to the prompt without providing examples.
  • Few-Shot CoT: Providing examples with explicit reasoning chains as context before the question.
  • ReAct (Reasoning and Acting): Interleaving thoughts, tool calls, and observations.
  • Tree-of-Thought (ToT): Exploring multiple reasoning branches and backtracking from dead ends.
  • Self-Consistency: Generating multiple reasoning chains and selecting the most common final answer.
  • Reasoning Level: A model-native parameter controlling how much internal thinking the model performs.
  • Final Answer Marker: A delimiter ("Final Answer:") that separates the reasoning trace from the answer.



Summary

Chain-of-Thought (CoT) is the technique of guiding language models to reason step by step before producing a final answer. It dramatically improves accuracy on complex tasks by decomposing problems, providing working memory through the context window, and activating latent reasoning capabilities. In LM-Kit.NET, CoT is implemented through the ChainOfThoughtHandler planning strategy, with related strategies for tool-augmented reasoning (ReAct), multi-path exploration (Tree-of-Thought), and self-correction (Reflection). The ReasoningLevel property provides an additional, model-native reasoning knob. CoT is essential for agentic AI applications where accuracy, transparency, and auditability of the reasoning process matter.
