Control Reasoning and Chain-of-Thought in Conversations

Reasoning models produce intermediate "thinking" tokens before generating their final answer. This chain-of-thought process improves accuracy on complex tasks like math, logic, and multi-step analysis, but costs extra tokens and time. LM-Kit.NET's ReasoningLevel property gives you a dial to control how much reasoning a model performs. This tutorial shows how to configure reasoning levels for different task types and measure the quality and performance trade-offs.


Why Reasoning Control Matters

Two real-world problems that reasoning control solves:

  1. Balancing speed vs. accuracy per task. A customer support bot answering "What are your business hours?" doesn't need chain-of-thought reasoning. But the same bot solving "Calculate the total cost with the 15% loyalty discount, $8.50 shipping, and the buy-2-get-1-free promotion" benefits from step-by-step reasoning. Setting ReasoningLevel.None for simple queries and ReasoningLevel.High for complex ones optimizes both latency and correctness.
  2. Controlling token budgets in production. Reasoning tokens count toward context usage and generation time. For high-throughput systems, setting ReasoningLevel.Low caps the reasoning overhead while still allowing the model to show its work on tricky inputs.

Prerequisites

Requirement | Minimum
.NET SDK    | 8.0+
Chat model  | A model with reasoning capability (e.g., qwen3:4b, qwen3:8b)
VRAM        | 4 GB+

Models without reasoning support will ignore the ReasoningLevel setting and produce normal completions.


Step 1: Create the Project

dotnet new console -n ReasoningControl
cd ReasoningControl
dotnet add package LM-Kit.NET

Step 2: Understand Reasoning Levels

┌───────────────────────────────────────────────────┐
│              ReasoningLevel Spectrum              │
├──────────┬───────────┬───────────┬────────────────┤
│   None   │    Low    │  Medium   │     High       │
│          │           │           │                │
│ No extra │ Brief     │ Balanced  │ Deep           │
│ thinking │ scratch   │ reasoning │ deliberation   │
│          │ notes     │           │                │
│ Fastest  │ Fast      │ Moderate  │ Slowest        │
│ Simple   │ Moderate  │ Complex   │ Very complex   │
│ queries  │ tasks     │ problems  │ reasoning      │
└──────────┴───────────┴───────────┴────────────────┘
Level  | Internal Behavior                  | Token Overhead | Best For
None   | No reasoning tokens produced       | Zero           | Simple Q&A, lookup, greetings
Low    | Minimal scratch space when helpful | Low            | Moderate tasks, formatting
Medium | Balanced reasoning depth           | Moderate       | Multi-step problems, analysis
High   | Maximum deliberation               | High           | Math, logic puzzles, complex code
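
The token-overhead column suggests pairing each level with a matching completion-token budget. Below is a minimal sketch of that idea; the helper name and the budget numbers are assumptions chosen for illustration (they loosely mirror the caps used in the steps that follow), and `model` refers to the LM instance loaded in Step 3, not an LM-Kit.NET API.

// Illustrative helper (not part of LM-Kit.NET): map each ReasoningLevel to a
// completion-token budget. The numbers are assumptions, not library defaults.
static int SuggestedCompletionBudget(ReasoningLevel reasoningLevel) => reasoningLevel switch
{
    ReasoningLevel.None => 256,     // short, direct answers
    ReasoningLevel.Low => 384,      // brief scratch notes plus the answer
    ReasoningLevel.Medium => 512,   // balanced reasoning
    ReasoningLevel.High => 1024,    // deep deliberation
    _ => 512
};

// Usage sketch: keep the two settings in sync (assumes `model` from Step 3).
var budgetedChat = new SingleTurnConversation(model)
{
    ReasoningLevel = ReasoningLevel.Medium,
    MaximumCompletionTokens = SuggestedCompletionBudget(ReasoningLevel.Medium)
};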

Step 3: Configure Reasoning in SingleTurnConversation

using System.Text;
using System.Diagnostics;
using LMKit.Model;
using LMKit.TextGeneration;
using LMKit.TextGeneration.Chat;

LMKit.Licensing.LicenseManager.SetLicenseKey("");

Console.InputEncoding = Encoding.UTF8;
Console.OutputEncoding = Encoding.UTF8;

// ──────────────────────────────────────
// 1. Load a reasoning-capable model
// ──────────────────────────────────────
using LM model = LM.LoadFromModelID("qwen3:4b",
    loadingProgress: p =>
    {
        Console.Write($"\r  Loading: {p * 100:F0}%   ");
        return true;
    });

Console.WriteLine($"\n  Model loaded: {model.ModelName}\n");

// ──────────────────────────────────────
// 2. No reasoning: fast, simple answers
// ──────────────────────────────────────
Console.WriteLine("=== ReasoningLevel.None ===\n");

var fastChat = new SingleTurnConversation(model)
{
    ReasoningLevel = ReasoningLevel.None,
    SystemPrompt = "You are a helpful assistant. Be concise.",
    MaximumCompletionTokens = 256
};

var sw = Stopwatch.StartNew();
TextGenerationResult fastResult = fastChat.Submit("What is the capital of France?");
sw.Stop();

Console.WriteLine($"Answer: {fastResult.Completion}");
Console.WriteLine($"Tokens: {fastResult.GeneratedTokenCount}, Time: {sw.ElapsedMilliseconds}ms");
Console.WriteLine($"Speed: {fastResult.TokenGenerationRate:F1} tokens/s\n");

// ──────────────────────────────────────
// 3. High reasoning: thorough analysis
// ──────────────────────────────────────
Console.WriteLine("=== ReasoningLevel.High ===\n");

var reasoningChat = new SingleTurnConversation(model)
{
    ReasoningLevel = ReasoningLevel.High,
    SystemPrompt = "You are a math tutor. Show your work step by step.",
    MaximumCompletionTokens = 1024
};

sw.Restart();
TextGenerationResult reasoningResult = reasoningChat.Submit(
    "A store offers 20% off all items. If you buy 3 shirts at $45 each " +
    "and 2 pants at $60 each, what is the total after the discount? " +
    "Also calculate the per-item average cost.");
sw.Stop();

Console.WriteLine($"Answer: {reasoningResult.Completion}");
Console.WriteLine($"Tokens: {reasoningResult.GeneratedTokenCount}, Time: {sw.ElapsedMilliseconds}ms");
Console.WriteLine($"Speed: {reasoningResult.TokenGenerationRate:F1} tokens/s\n");

Step 4: Configure Reasoning in MultiTurnConversation

// ──────────────────────────────────────
// Multi-turn conversation with reasoning
// ──────────────────────────────────────
Console.WriteLine("=== Multi-Turn with Medium Reasoning ===\n");

var multiTurn = new MultiTurnConversation(model)
{
    ReasoningLevel = ReasoningLevel.Medium,
    SystemPrompt = "You are a logical reasoning assistant. Think through problems carefully.",
    MaximumCompletionTokens = 512
};

// First turn: pose the problem
TextGenerationResult turn1 = multiTurn.Submit(
    "I have a 3-gallon jug and a 5-gallon jug. How do I measure exactly 4 gallons?");
Console.WriteLine($"Turn 1: {turn1.Completion}\n");

// Follow-up: ask for verification
TextGenerationResult turn2 = multiTurn.Submit(
    "Can you verify each step by tracking the water level in both jugs?");
Console.WriteLine($"Turn 2: {turn2.Completion}\n");

Console.WriteLine($"Total tokens generated: " +
    $"{turn1.GeneratedTokenCount + turn2.GeneratedTokenCount}");

Step 5: Compare Reasoning Levels Side by Side

// ──────────────────────────────────────
// Benchmark different reasoning levels
// ──────────────────────────────────────
Console.WriteLine("=== Reasoning Level Comparison ===\n");

string testPrompt = "If all roses are flowers and some flowers fade quickly, " +
                    "can we conclude that some roses fade quickly? Explain your reasoning.";

ReasoningLevel[] levels = { ReasoningLevel.None, ReasoningLevel.Low,
                            ReasoningLevel.Medium, ReasoningLevel.High };

Console.WriteLine($"{"Level",-10} {"Tokens",-10} {"Time (ms)",-12} {"Speed (t/s)",-12}");
Console.WriteLine(new string('─', 50));

foreach (ReasoningLevel level in levels)
{
    var chat = new SingleTurnConversation(model)
    {
        ReasoningLevel = level,
        SystemPrompt = "You are a logic instructor.",
        MaximumCompletionTokens = 512
    };

    sw.Restart();
    TextGenerationResult result = chat.Submit(testPrompt);
    sw.Stop();

    Console.WriteLine($"{level,-10} {result.GeneratedTokenCount,-10} " +
                      $"{sw.ElapsedMilliseconds,-12} {result.TokenGenerationRate,-12:F1}");
}

Step 6: Adaptive Reasoning Based on Task Complexity

Build a system that automatically selects the reasoning level based on the input:

// ──────────────────────────────────────
// Adaptive reasoning selector
// ──────────────────────────────────────
Console.WriteLine("\n=== Adaptive Reasoning ===\n");

string[] prompts =
{
    "Hello, how are you?",
    "Summarize the key differences between TCP and UDP.",
    "Prove that the square root of 2 is irrational.",
    "What color is the sky?"
};

foreach (string prompt in prompts)
{
    ReasoningLevel level = SelectReasoningLevel(prompt);

    var chat = new SingleTurnConversation(model)
    {
        ReasoningLevel = level,
        MaximumCompletionTokens = 512
    };

    TextGenerationResult result = chat.Submit(prompt);

    Console.WriteLine($"[{level}] \"{prompt}\"");
    Console.WriteLine($"  → {result.Completion.Split('\n')[0]}...");
    Console.WriteLine($"  Tokens: {result.GeneratedTokenCount}\n");
}

// ──────────────────────────────────────
// Helper: heuristic reasoning level selector
// ──────────────────────────────────────
static ReasoningLevel SelectReasoningLevel(string prompt)
{
    string lower = prompt.ToLowerInvariant();

    // High reasoning for math, proofs, and logic
    if (lower.Contains("prove") || lower.Contains("calculate") ||
        lower.Contains("solve") || lower.Contains("irrational") ||
        lower.Contains("equation"))
    {
        return ReasoningLevel.High;
    }

    // Medium for analysis and comparison tasks
    if (lower.Contains("compare") || lower.Contains("analyze") ||
        lower.Contains("differences") || lower.Contains("explain why") ||
        lower.Contains("summarize"))
    {
        return ReasoningLevel.Medium;
    }

    // Low for moderate complexity
    if (lower.Length > 100 || lower.Contains("describe") || lower.Contains("list"))
    {
        return ReasoningLevel.Low;
    }

    // None for simple queries
    return ReasoningLevel.None;
}

Step 7: Run the Application

dotnet run

Expected output pattern:

=== ReasoningLevel.None ===

Answer: The capital of France is Paris.
Tokens: 8, Time: 45ms
Speed: 177.8 tokens/s

=== ReasoningLevel.High ===

Answer: Let me work through this step by step...
Tokens: 186, Time: 1240ms
Speed: 150.0 tokens/s

=== Reasoning Level Comparison ===

Level      Tokens     Time (ms)    Speed (t/s)
──────────────────────────────────────────────────
None       42         280          150.0
Low        78         510          152.9
Medium     134        870          154.0
High       198        1340         147.8

ReasoningLevel Reference

Level  | Enum Value | Description
None   | 0          | No reasoning tokens requested or exposed. The model produces only the final answer
Low    | 1          | Minimal internal deliberation. Brief scratch space used only when helpful
Medium | 2          | Balanced speed and quality. Moderate reasoning depth
High   | 3          | Maximum reasoning depth. May trade off speed for thoroughness
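
If the level comes from configuration rather than being hard-coded, the enum names in the table above can be parsed with standard .NET enum handling. A small sketch, assuming the value arrives as a string such as an environment variable (the variable name here is made up for illustration):

// Read a reasoning level from configuration; fall back to Medium when the value
// is missing or invalid. "REASONING_LEVEL" is an illustrative name, not something
// LM-Kit.NET defines.
string? configured = Environment.GetEnvironmentVariable("REASONING_LEVEL");

ReasoningLevel configuredLevel = Enum.TryParse(configured, ignoreCase: true, out ReasoningLevel parsed)
    ? parsed
    : ReasoningLevel.Medium;

Console.WriteLine($"Using reasoning level: {configuredLevel} ({(int)configuredLevel})");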

Common Issues

Problem | Cause | Fix
ReasoningLevel has no visible effect | Model doesn't support reasoning | Use a reasoning-capable model (e.g., qwen3:4b or larger)
Tokens much higher than expected | ReasoningLevel.High produces many thinking tokens | Reduce to Medium or Low for less overhead
Error setting ReasoningLevel on MultiTurnConversation | Chat history already has messages | Set ReasoningLevel before the first Submit call
Same output regardless of level | Simple prompt doesn't need reasoning | Test with complex prompts (math, logic, multi-step analysis)
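
For the MultiTurnConversation issue in particular, a safe pattern is to fix the reasoning level before the first exchange, as sketched below (reusing the `model` loaded in Step 3):

// Decide the reasoning level up front, while the chat history is still empty.
var freshChat = new MultiTurnConversation(model)
{
    ReasoningLevel = ReasoningLevel.Medium,   // set first...
    SystemPrompt = "You are a logical reasoning assistant."
};

TextGenerationResult firstReply = freshChat.Submit("First question goes here.");  // ...then submit

// To change the level mid-session, a conservative approach is to create a new
// MultiTurnConversation with the desired ReasoningLevel rather than mutating this one.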

Next Steps