Control Reasoning and Chain-of-Thought in Conversations
Reasoning models produce intermediate "thinking" tokens before generating their final answer. This chain-of-thought process improves accuracy on complex tasks like math, logic, and multi-step analysis, but costs extra tokens and time. LM-Kit.NET's ReasoningLevel property gives you a dial to control how much reasoning a model performs. This tutorial shows how to configure reasoning levels for different task types and measure the quality and performance trade-offs.
Why Reasoning Control Matters
Two real-world problems that reasoning control solves:
- Balancing speed vs. accuracy per task. A customer support bot answering "What are your business hours?" doesn't need chain-of-thought reasoning. But the same bot solving "Calculate the total cost with the 15% loyalty discount, $8.50 shipping, and the buy-2-get-1-free promotion" benefits from step-by-step reasoning. Setting `ReasoningLevel.None` for simple queries and `ReasoningLevel.High` for complex ones optimizes both latency and correctness (see the sketch after this list).
- Controlling token budgets in production. Reasoning tokens count toward context usage and generation time. For high-throughput systems, setting `ReasoningLevel.Low` caps the reasoning overhead while still allowing the model to show its work on tricky inputs.
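A minimal sketch of that per-query routing, assuming `model` is an already-loaded reasoning-capable model (loading is covered in Step 3) and that the `isComplex` flag comes from your own request classification:

```csharp
// Route cheap queries to a non-reasoning conversation and hard ones to a reasoning one.
// Assumes `model` is loaded as shown in Step 3; `isComplex` is your own classification flag.
var simpleChat = new SingleTurnConversation(model) { ReasoningLevel = ReasoningLevel.None };
var complexChat = new SingleTurnConversation(model) { ReasoningLevel = ReasoningLevel.High };

TextGenerationResult Answer(string query, bool isComplex) =>
    isComplex ? complexChat.Submit(query) : simpleChat.Submit(query);
```

Step 6 develops this idea into a heuristic selector that chooses between all four levels.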
Prerequisites
| Requirement | Minimum |
|---|---|
| .NET SDK | 8.0+ |
| Chat model | A model with reasoning capability (e.g., qwen3:4b, qwen3:8b) |
| VRAM | 4 GB+ |
Models without reasoning support will ignore the ReasoningLevel setting and produce normal completions.
Step 1: Create the Project
dotnet new console -n ReasoningControl
cd ReasoningControl
dotnet add package LM-Kit.NET
Step 2: Understand Reasoning Levels
┌───────────────────────────────────────────────────┐
│ ReasoningLevel Spectrum │
├──────────┬───────────┬───────────┬────────────────┤
│ None │ Low │ Medium │ High │
│ │ │ │ │
│ No extra │ Brief │ Balanced │ Deep │
│ thinking │ scratch │ reasoning │ deliberation │
│ │ notes │ │ │
│ Fastest │ Fast │ Moderate │ Slowest │
│ Simple │ Moderate │ Complex │ Very complex │
│ queries │ tasks │ problems │ reasoning │
└──────────┴───────────┴───────────┴────────────────┘
| Level | Internal Behavior | Token Overhead | Best For |
|---|---|---|---|
| `None` | No reasoning tokens produced | Zero | Simple Q&A, lookup, greetings |
| `Low` | Minimal scratch space when helpful | Low | Moderate tasks, formatting |
| `Medium` | Balanced reasoning depth | Moderate | Multi-step problems, analysis |
| `High` | Maximum deliberation | High | Math, logic puzzles, complex code |
Step 3: Configure Reasoning in SingleTurnConversation
using System.Text;
using System.Diagnostics;
using LMKit.Model;
using LMKit.TextGeneration;
using LMKit.TextGeneration.Chat;
LMKit.Licensing.LicenseManager.SetLicenseKey("");
Console.InputEncoding = Encoding.UTF8;
Console.OutputEncoding = Encoding.UTF8;
// ──────────────────────────────────────
// 1. Load a reasoning-capable model
// ──────────────────────────────────────
using LM model = LM.LoadFromModelID("qwen3:4b",
loadingProgress: p =>
{
Console.Write($"\r Loading: {p * 100:F0}% ");
return true;
});
Console.WriteLine($"\n Model loaded: {model.ModelName}\n");
// ──────────────────────────────────────
// 2. No reasoning: fast, simple answers
// ──────────────────────────────────────
Console.WriteLine("=== ReasoningLevel.None ===\n");
var fastChat = new SingleTurnConversation(model)
{
ReasoningLevel = ReasoningLevel.None,
SystemPrompt = "You are a helpful assistant. Be concise.",
MaximumCompletionTokens = 256
};
var sw = Stopwatch.StartNew();
TextGenerationResult fastResult = fastChat.Submit("What is the capital of France?");
sw.Stop();
Console.WriteLine($"Answer: {fastResult.Completion}");
Console.WriteLine($"Tokens: {fastResult.GeneratedTokenCount}, Time: {sw.ElapsedMilliseconds}ms");
Console.WriteLine($"Speed: {fastResult.TokenGenerationRate:F1} tokens/s\n");
// ──────────────────────────────────────
// 3. High reasoning: thorough analysis
// ──────────────────────────────────────
Console.WriteLine("=== ReasoningLevel.High ===\n");
var reasoningChat = new SingleTurnConversation(model)
{
ReasoningLevel = ReasoningLevel.High,
SystemPrompt = "You are a math tutor. Show your work step by step.",
MaximumCompletionTokens = 1024
};
sw.Restart();
TextGenerationResult reasoningResult = reasoningChat.Submit(
"A store offers 20% off all items. If you buy 3 shirts at $45 each " +
"and 2 pants at $60 each, what is the total after the discount? " +
"Also calculate the per-item average cost.");
sw.Stop();
Console.WriteLine($"Answer: {reasoningResult.Completion}");
Console.WriteLine($"Tokens: {reasoningResult.GeneratedTokenCount}, Time: {sw.ElapsedMilliseconds}ms");
Console.WriteLine($"Speed: {reasoningResult.TokenGenerationRate:F1} tokens/s\n");
Step 4: Configure Reasoning in MultiTurnConversation
// ──────────────────────────────────────
// Multi-turn conversation with reasoning
// ──────────────────────────────────────
Console.WriteLine("=== Multi-Turn with Medium Reasoning ===\n");
var multiTurn = new MultiTurnConversation(model)
{
ReasoningLevel = ReasoningLevel.Medium,
SystemPrompt = "You are a logical reasoning assistant. Think through problems carefully.",
MaximumCompletionTokens = 512
};
// First turn: pose the problem
TextGenerationResult turn1 = multiTurn.Submit(
"I have a 3-gallon jug and a 5-gallon jug. How do I measure exactly 4 gallons?");
Console.WriteLine($"Turn 1: {turn1.Completion}\n");
// Follow-up: ask for verification
TextGenerationResult turn2 = multiTurn.Submit(
"Can you verify each step by tracking the water level in both jugs?");
Console.WriteLine($"Turn 2: {turn2.Completion}\n");
Console.WriteLine($"Total tokens generated: " +
$"{turn1.GeneratedTokenCount + turn2.GeneratedTokenCount}");
Step 5: Compare Reasoning Levels Side by Side
// ──────────────────────────────────────
// Benchmark different reasoning levels
// ──────────────────────────────────────
Console.WriteLine("=== Reasoning Level Comparison ===\n");
string testPrompt = "If all roses are flowers and some flowers fade quickly, " +
"can we conclude that some roses fade quickly? Explain your reasoning.";
ReasoningLevel[] levels = { ReasoningLevel.None, ReasoningLevel.Low,
ReasoningLevel.Medium, ReasoningLevel.High };
Console.WriteLine($"{"Level",-10} {"Tokens",-10} {"Time (ms)",-12} {"Speed (t/s)",-12}");
Console.WriteLine(new string('─', 50));
foreach (ReasoningLevel level in levels)
{
var chat = new SingleTurnConversation(model)
{
ReasoningLevel = level,
SystemPrompt = "You are a logic instructor.",
MaximumCompletionTokens = 512
};
sw.Restart();
TextGenerationResult result = chat.Submit(testPrompt);
sw.Stop();
Console.WriteLine($"{level,-10} {result.GeneratedTokenCount,-10} " +
$"{sw.ElapsedMilliseconds,-12} {result.TokenGenerationRate,-12:F1}");
}
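The loop above reports only cost metrics. To compare answer quality as well, one option is to re-run the same loop and save each completion for manual review; the `reasoning-comparison.md` file name and the `results` list below are illustrative choices, not part of the LM-Kit.NET API:

```csharp
// Optional: capture the completions themselves for side-by-side quality review.
// Re-runs the same prompt at each level; the file name and list are illustrative.
var results = new List<string>();

foreach (ReasoningLevel level in levels)
{
    var chat = new SingleTurnConversation(model)
    {
        ReasoningLevel = level,
        SystemPrompt = "You are a logic instructor.",
        MaximumCompletionTokens = 512
    };

    TextGenerationResult result = chat.Submit(testPrompt);
    results.Add($"## {level} ({result.GeneratedTokenCount} tokens)\n\n{result.Completion}\n");
}

File.WriteAllText("reasoning-comparison.md", string.Join("\n", results));
```

In practice you would fold this into the benchmark loop above rather than generating each answer twice.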
Step 6: Adaptive Reasoning Based on Task Complexity
Build a system that automatically selects the reasoning level based on the input:
// ──────────────────────────────────────
// Adaptive reasoning selector
// ──────────────────────────────────────
Console.WriteLine("\n=== Adaptive Reasoning ===\n");
string[] prompts =
{
"Hello, how are you?",
"Summarize the key differences between TCP and UDP.",
"Prove that the square root of 2 is irrational.",
"What color is the sky?"
};
foreach (string prompt in prompts)
{
ReasoningLevel level = SelectReasoningLevel(prompt);
var chat = new SingleTurnConversation(model)
{
ReasoningLevel = level,
MaximumCompletionTokens = 512
};
TextGenerationResult result = chat.Submit(prompt);
Console.WriteLine($"[{level}] \"{prompt}\"");
Console.WriteLine($" → {result.Completion.Split('\n')[0]}...");
Console.WriteLine($" Tokens: {result.GeneratedTokenCount}\n");
}
// ──────────────────────────────────────
// Helper: heuristic reasoning level selector
// ──────────────────────────────────────
static ReasoningLevel SelectReasoningLevel(string prompt)
{
string lower = prompt.ToLowerInvariant();
// High reasoning for math, proofs, and logic
if (lower.Contains("prove") || lower.Contains("calculate") ||
lower.Contains("solve") || lower.Contains("irrational") ||
lower.Contains("equation"))
{
return ReasoningLevel.High;
}
// Medium for analysis and comparison tasks
if (lower.Contains("compare") || lower.Contains("analyze") ||
lower.Contains("differences") || lower.Contains("explain why") ||
lower.Contains("summarize"))
{
return ReasoningLevel.Medium;
}
// Low for moderate complexity
if (lower.Length > 100 || lower.Contains("describe") || lower.Contains("list"))
{
return ReasoningLevel.Low;
}
// None for simple queries
return ReasoningLevel.None;
}
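Creating a new `SingleTurnConversation` per prompt keeps the example simple but adds a small setup cost on every request. A variation is to keep one conversation per reasoning level and reuse it. The sketch below replaces the loop above (it is not meant to run in addition), and the `conversationCache` and `GetConversation` names are illustrative:

```csharp
// Variation: reuse one conversation per reasoning level instead of creating one per prompt.
var conversationCache = new Dictionary<ReasoningLevel, SingleTurnConversation>();

SingleTurnConversation GetConversation(ReasoningLevel level)
{
    if (!conversationCache.TryGetValue(level, out SingleTurnConversation? chat))
    {
        chat = new SingleTurnConversation(model)
        {
            ReasoningLevel = level,
            MaximumCompletionTokens = 512
        };
        conversationCache[level] = chat;
    }

    return chat;
}

foreach (string prompt in prompts)
{
    ReasoningLevel level = SelectReasoningLevel(prompt);
    TextGenerationResult result = GetConversation(level).Submit(prompt);
    Console.WriteLine($"[{level}] {result.GeneratedTokenCount} tokens: {result.Completion.Split('\n')[0]}");
}
```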
Step 7: Run the Application
dotnet run
Expected output pattern:
=== ReasoningLevel.None ===
Answer: The capital of France is Paris.
Tokens: 8, Time: 45ms
Speed: 177.8 tokens/s
=== ReasoningLevel.High ===
Answer: Let me work through this step by step...
Tokens: 186, Time: 1240ms
Speed: 150.0 tokens/s
=== Reasoning Level Comparison ===
Level Tokens Time (ms) Speed (t/s)
──────────────────────────────────────────────────
None 42 280 150.0
Low 78 510 152.9
Medium 134 870 154.0
High 198 1340 147.8
ReasoningLevel Reference
| Level | Enum Value | Description |
|---|---|---|
| `None` | 0 | No reasoning tokens requested or exposed. The model produces only the final answer |
| `Low` | 1 | Minimal internal deliberation. Brief scratch space used only when helpful |
| `Medium` | 2 | Balanced speed and quality. Moderate reasoning depth |
| `High` | 3 | Maximum reasoning depth. May trade off speed for thoroughness |
Common Issues
| Problem | Cause | Fix |
|---|---|---|
| `ReasoningLevel` has no visible effect | Model doesn't support reasoning | Use a reasoning-capable model (e.g., qwen3:4b or larger) |
| Tokens much higher than expected | `ReasoningLevel.High` produces many thinking tokens | Reduce to `Medium` or `Low` for less overhead |
| Error setting `ReasoningLevel` on `MultiTurnConversation` | Chat history already has messages | Set `ReasoningLevel` before the first `Submit` call |
| Same output regardless of level | Simple prompt doesn't need reasoning | Test with complex prompts (math, logic, multi-step analysis) |
Next Steps
- Control Token Sampling with Dynamic Strategies: fine-tune generation beyond reasoning level.
- Build a Conversational Assistant with Memory: reasoning in long-running conversations.
- Orchestrate Multi-Agent Workflows with Patterns: combine reasoning agents in pipelines.