Route Prompts Across Models with RouterOrchestrator

Not every prompt needs the same model. A simple greeting wastes resources on a 12B-parameter model, while a complex reasoning task may fail on a 1B model. LM-Kit.NET's RouterOrchestrator lets you define named routes, each backed by a different agent (and potentially a different model), then direct each incoming prompt to the right agent automatically. Combined with FallbackAgentExecutor for resilience, you can build heterogeneous architectures that cut inference costs by 80% or more while maintaining quality where it matters. This guide builds a production-ready prompt routing system from scratch.

Why Prompt Routing Matters

Prompt routing solves two production problems:

Inference cost optimization at scale. Running every request through your largest model is the default approach, but 70% of production prompts are simple lookups, greetings, or short answers. Routing these to a small, fast model and reserving the large model for complex reasoning tasks reduces GPU utilization and latency dramatically. Organizations running thousands of requests per minute see immediate impact.
Specialized quality across domains. A model trained for code generation might underperform on creative writing, and vice versa. Routing code questions to a code-specialized agent and general questions to a chat-optimized agent ensures each domain gets the best available model, without requiring a single model that excels at everything.
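
To make the cost argument concrete, here is a back-of-the-envelope sketch of the blended cost of a request mix. The 70% share of simple prompts comes from the paragraph above; the relative cost of the small model (10% of the large model per request) is an illustrative assumption, not a measurement, and `BlendedCost` is a hypothetical helper.

```csharp
using System;

// Relative cost of serving a request mix when a share of prompts is
// routed to a cheaper model. All inputs are illustrative assumptions.
static double BlendedCost(double simpleShare, double smallCost, double largeCost)
{
    if (simpleShare < 0 || simpleShare > 1)
        throw new ArgumentOutOfRangeException(nameof(simpleShare));
    return simpleShare * smallCost + (1 - simpleShare) * largeCost;
}

// Assumption: the small model costs 10% of the large one per request.
double baseline = BlendedCost(0.0, smallCost: 0.1, largeCost: 1.0); // everything on the large model
double routed = BlendedCost(0.7, smallCost: 0.1, largeCost: 1.0);   // 70% routed to the small model

Console.WriteLine($"Baseline cost: {baseline:F2}");
Console.WriteLine($"Routed cost:   {routed:F2}");
Console.WriteLine($"Savings:       {(1 - routed / baseline) * 100:F0}%");
```

With these assumed numbers the routed mix costs 0.37 of the baseline, a saving of about 63%. Larger savings require a higher share of simple prompts or a cheaper small model, so it is worth measuring both on your own traffic.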

Prerequisites

Requirement	Minimum
.NET SDK	8.0+
VRAM	8+ GB (two models loaded simultaneously)
Disk	~6 GB free for model downloads

Note: This guide loads two models simultaneously. If VRAM is limited, you can use the same model for all agents and focus on the routing logic. The architecture remains identical.
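
As a rough sizing aid for the VRAM requirement, weight memory can be approximated as parameter count times bits per weight, divided by 8. The sketch below assumes 4-bit quantization and ignores KV cache and runtime overhead, which add to the total; the actual footprint of the models used in this guide may differ. `WeightGiB` is a hypothetical helper.

```csharp
using System;

// Approximate weight memory in GiB: parameters * bitsPerWeight / 8 bytes.
// Ignores KV cache, activations, and runtime overhead (assumption).
static double WeightGiB(double billionsOfParams, double bitsPerWeight)
{
    double bytes = billionsOfParams * 1e9 * bitsPerWeight / 8.0;
    return bytes / (1024.0 * 1024.0 * 1024.0);
}

double fast = WeightGiB(1, 4);     // 1B model, assumed 4-bit quantization
double capable = WeightGiB(4, 4);  // 4B model, assumed 4-bit quantization

Console.WriteLine($"Fast model weights:    ~{fast:F2} GiB");
Console.WriteLine($"Capable model weights: ~{capable:F2} GiB");
Console.WriteLine($"Combined weights:      ~{fast + capable:F2} GiB");
```

The weights alone are only part of the picture: context length and the number of concurrent agents increase memory use, which is why the table leaves headroom above the raw weight size.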

Step 1: Create the Project

dotnet new console -n PromptRouter
cd PromptRouter
dotnet add package LM-Kit.NET

Step 2: Load Multiple Models and Create Specialized Agents

using System.Text;
using LMKit.Model;
using LMKit.Agents;
using LMKit.Agents.Orchestration;
using LMKit.Agents.Resilience;
using LMKit.Agents.Tools.BuiltIn;

LMKit.Licensing.LicenseManager.SetLicenseKey("");

Console.InputEncoding = Encoding.UTF8;
Console.OutputEncoding = Encoding.UTF8;

// ──────────────────────────────────────
// 1. Load a fast model for simple tasks
// ──────────────────────────────────────
Console.WriteLine("Loading fast model (1B)...");
using LM fastModel = LM.LoadFromModelID("gemma3:1b",
    loadingProgress: p => { Console.Write($"\rFast model: {p * 100:F0}%   "); return true; });
Console.WriteLine();

// ──────────────────────────────────────
// 2. Load a capable model for complex tasks
// ──────────────────────────────────────
Console.WriteLine("Loading capable model (4B)...");
using LM capableModel = LM.LoadFromModelID("qwen3.5:4b",
    loadingProgress: p => { Console.Write($"\rCapable model: {p * 100:F0}%   "); return true; });
Console.WriteLine("\n");

// ──────────────────────────────────────
// 3. Create specialized agents
// ──────────────────────────────────────
Agent quickAgent = Agent.CreateBuilder(fastModel)
    .WithPersona("quick-responder")
    .WithInstruction(
        "You are a fast, concise assistant for simple questions. " +
        "Answer in 1 to 2 sentences maximum. Be direct.")
    .WithMaxIterations(1)
    .Build();

Agent reasoningAgent = Agent.CreateBuilder(capableModel)
    .WithPersona("reasoning-expert")
    .WithInstruction(
        "You are a thorough reasoning assistant for complex questions. " +
        "Think step by step, provide detailed explanations.")
    .WithTools(tools =>
    {
        tools.Register(BuiltInTools.CalcArithmetic);
        tools.Register(BuiltInTools.DateTimeNow);
    })
    .WithMaxIterations(5)
    .Build();

Agent codeAgent = Agent.CreateBuilder(capableModel)
    .WithPersona("code-specialist")
    .WithInstruction(
        "You are a coding assistant specialized in C# and .NET. " +
        "Provide clean, well-documented code with explanations.")
    .WithMaxIterations(3)
    .Build();

Console.WriteLine("All agents ready.\n");

Step 3: Configure the RouterOrchestrator

The RouterOrchestrator maps named routes to agents and uses a routing function to direct each prompt:

using System.Text;
using LMKit.Model;
using LMKit.Agents;
using LMKit.Agents.Orchestration;
using LMKit.Agents.Resilience;
using LMKit.Agents.Tools.BuiltIn;

LMKit.Licensing.LicenseManager.SetLicenseKey("");

Console.InputEncoding = Encoding.UTF8;
Console.OutputEncoding = Encoding.UTF8;

// ──────────────────────────────────────
// 1. Load a fast model for simple tasks
// ──────────────────────────────────────
Console.WriteLine("Loading fast model (1B)...");
using LM fastModel = LM.LoadFromModelID("gemma3:1b",
    loadingProgress: p => { Console.Write($"\rFast model: {p * 100:F0}%   "); return true; });
Console.WriteLine();

// ──────────────────────────────────────
// 2. Load a capable model for complex tasks
// ──────────────────────────────────────
Console.WriteLine("Loading capable model (4B)...");
using LM capableModel = LM.LoadFromModelID("qwen3.5:4b",
    loadingProgress: p => { Console.Write($"\rCapable model: {p * 100:F0}%   "); return true; });
Console.WriteLine("\n");

// ──────────────────────────────────────
// 3. Create specialized agents
// ──────────────────────────────────────
Agent quickAgent = Agent.CreateBuilder(fastModel)
    .WithPersona("quick-responder")
    .WithInstruction(
        "You are a fast, concise assistant for simple questions. " +
        "Answer in 1 to 2 sentences maximum. Be direct.")
    .WithMaxIterations(1)
    .Build();

Agent reasoningAgent = Agent.CreateBuilder(capableModel)
    .WithPersona("reasoning-expert")
    .WithInstruction(
        "You are a thorough reasoning assistant for complex questions. " +
        "Think step by step, provide detailed explanations.")
    .WithTools(tools =>
    {
        tools.Register(BuiltInTools.CalcArithmetic);
        tools.Register(BuiltInTools.DateTimeNow);
    })
    .WithMaxIterations(5)
    .Build();

Agent codeAgent = Agent.CreateBuilder(capableModel)
    .WithPersona("code-specialist")
    .WithInstruction(
        "You are a coding assistant specialized in C# and .NET. " +
        "Provide clean, well-documented code with explanations.")
    .WithMaxIterations(3)
    .Build();

// ──────────────────────────────────────
// 4. Build the router with named routes
// ──────────────────────────────────────
var router = new RouterOrchestrator()
    .AddRoute("quick", quickAgent)
    .AddRoute("reasoning", reasoningAgent)
    .AddRoute("code", codeAgent)
    .WithDefaultRoute("quick");

Console.WriteLine("Router configured with 3 routes: quick, reasoning, code\n");

Step 4: Implement Keyword-Based Routing

The simplest routing strategy uses keyword matching to classify prompts. This works well for clear-cut categories:

using System.Text;
using LMKit.Model;
using LMKit.Agents;
using LMKit.Agents.Orchestration;
using LMKit.Agents.Resilience;
using LMKit.Agents.Tools.BuiltIn;

LMKit.Licensing.LicenseManager.SetLicenseKey("");

Console.InputEncoding = Encoding.UTF8;
Console.OutputEncoding = Encoding.UTF8;

// ──────────────────────────────────────
// 1. Load a fast model for simple tasks
// ──────────────────────────────────────
Console.WriteLine("Loading fast model (1B)...");
using LM fastModel = LM.LoadFromModelID("gemma3:1b",
    loadingProgress: p => { Console.Write($"\rFast model: {p * 100:F0}%   "); return true; });
Console.WriteLine();

// ──────────────────────────────────────
// 2. Load a capable model for complex tasks
// ──────────────────────────────────────
Console.WriteLine("Loading capable model (4B)...");
using LM capableModel = LM.LoadFromModelID("qwen3.5:4b",
    loadingProgress: p => { Console.Write($"\rCapable model: {p * 100:F0}%   "); return true; });
Console.WriteLine("\n");

// ──────────────────────────────────────
// 3. Create specialized agents
// ──────────────────────────────────────
Agent quickAgent = Agent.CreateBuilder(fastModel)
    .WithPersona("quick-responder")
    .WithInstruction(
        "You are a fast, concise assistant for simple questions. " +
        "Answer in 1 to 2 sentences maximum. Be direct.")
    .WithMaxIterations(1)
    .Build();

Agent reasoningAgent = Agent.CreateBuilder(capableModel)
    .WithPersona("reasoning-expert")
    .WithInstruction(
        "You are a thorough reasoning assistant for complex questions. " +
        "Think step by step, provide detailed explanations.")
    .WithTools(tools =>
    {
        tools.Register(BuiltInTools.CalcArithmetic);
        tools.Register(BuiltInTools.DateTimeNow);
    })
    .WithMaxIterations(5)
    .Build();

Agent codeAgent = Agent.CreateBuilder(capableModel)
    .WithPersona("code-specialist")
    .WithInstruction(
        "You are a coding assistant specialized in C# and .NET. " +
        "Provide clean, well-documented code with explanations.")
    .WithMaxIterations(3)
    .Build();

// ──────────────────────────────────────
// 4. Build the router with named routes
// ──────────────────────────────────────
var router = new RouterOrchestrator()
    .AddRoute("quick", quickAgent)
    .AddRoute("reasoning", reasoningAgent)
    .AddRoute("code", codeAgent)
    .WithDefaultRoute("quick");

// ──────────────────────────────────────
// 5. Keyword-based routing function
// ──────────────────────────────────────
string[] codeKeywords = { "code", "function", "class", "implement", "debug", "compile",
                          "syntax", "C#", "csharp", "method", "algorithm", "API" };

string[] reasoningKeywords = { "explain", "why", "analyze", "compare", "calculate",
                               "evaluate", "pros and cons", "step by step", "reasoning",
                               "complex", "strategy", "tradeoff" };

router.WithRoutingFunction((input, agents) =>
{
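    // Match whole words only. A plain substring check would find "api" inside
    // "capital" and send "What is the capital of France?" to the code route.
    // Inflected forms such as "debugging" need their own keyword entries.
    bool HasKeyword(string word) =>
        System.Text.RegularExpressions.Regex.IsMatch(
            input,
            $@"(?<!\w){System.Text.RegularExpressions.Regex.Escape(word)}(?!\w)",
            System.Text.RegularExpressions.RegexOptions.IgnoreCase);

    // Check code keywords first (most specific)
    foreach (string keyword in codeKeywords)
    {
        if (HasKeyword(keyword))
            return "code";
    }

    // Check reasoning keywords
    foreach (string keyword in reasoningKeywords)
    {
        if (HasKeyword(keyword))
            return "reasoning";
    }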

    // Short prompts go to the fast model
    if (input.Split(' ').Length < 15)
        return "quick";

    // Default to reasoning for longer prompts
    return "reasoning";
});
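
Keyword routing is sensitive to how keywords are matched. With a plain substring check, "capital" contains "api", so "What is the capital of France?" would be classified as a code prompt. The sketch below isolates whole-word matching in a small, model-free helper so the routing rules can be checked without loading any model. `ContainsKeyword` and `Classify` are hypothetical names, not part of LM-Kit.NET, and the keyword lists are shortened for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: true when `keyword` appears as a whole word or phrase.
// Lookarounds are used instead of \b so that keywords ending in a symbol,
// such as "C#", still match.
static bool ContainsKeyword(string input, string keyword) =>
    Regex.IsMatch(input,
        $@"(?<!\w){Regex.Escape(keyword)}(?!\w)",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

// Hypothetical classifier mirroring the routing rules of this step.
static string Classify(string input)
{
    string[] codeKeywords = { "code", "function", "C#", "API", "algorithm" };
    string[] reasoningKeywords = { "explain", "why", "compare", "pros and cons" };

    if (Array.Exists(codeKeywords, k => ContainsKeyword(input, k))) return "code";
    if (Array.Exists(reasoningKeywords, k => ContainsKeyword(input, k))) return "reasoning";

    // Short prompts go to the fast route, longer ones to reasoning.
    int words = input.Split(' ', StringSplitOptions.RemoveEmptyEntries).Length;
    return words < 15 ? "quick" : "reasoning";
}

Console.WriteLine(Classify("What is the capital of France?"));
Console.WriteLine(Classify("Write a C# method that validates email addresses."));
Console.WriteLine(Classify("Explain why the sky is blue."));
```

The trade-off is that whole-word matching no longer catches inflected forms ("implementing", "debugging") unless they are listed explicitly, so the keyword lists may need to grow.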

Step 5: Test the Router

using System.Text;
using LMKit.Model;
using LMKit.Agents;
using LMKit.Agents.Orchestration;
using LMKit.Agents.Resilience;
using LMKit.Agents.Tools.BuiltIn;

LMKit.Licensing.LicenseManager.SetLicenseKey("");

Console.InputEncoding = Encoding.UTF8;
Console.OutputEncoding = Encoding.UTF8;

// ──────────────────────────────────────
// 1. Load a fast model for simple tasks
// ──────────────────────────────────────
Console.WriteLine("Loading fast model (1B)...");
using LM fastModel = LM.LoadFromModelID("gemma3:1b",
    loadingProgress: p => { Console.Write($"\rFast model: {p * 100:F0}%   "); return true; });
Console.WriteLine();

// ──────────────────────────────────────
// 2. Load a capable model for complex tasks
// ──────────────────────────────────────
Console.WriteLine("Loading capable model (4B)...");
using LM capableModel = LM.LoadFromModelID("qwen3.5:4b",
    loadingProgress: p => { Console.Write($"\rCapable model: {p * 100:F0}%   "); return true; });
Console.WriteLine("\n");

// ──────────────────────────────────────
// 3. Create specialized agents
// ──────────────────────────────────────
Agent quickAgent = Agent.CreateBuilder(fastModel)
    .WithPersona("quick-responder")
    .WithInstruction(
        "You are a fast, concise assistant for simple questions. " +
        "Answer in 1 to 2 sentences maximum. Be direct.")
    .WithMaxIterations(1)
    .Build();

Agent reasoningAgent = Agent.CreateBuilder(capableModel)
    .WithPersona("reasoning-expert")
    .WithInstruction(
        "You are a thorough reasoning assistant for complex questions. " +
        "Think step by step, provide detailed explanations.")
    .WithTools(tools =>
    {
        tools.Register(BuiltInTools.CalcArithmetic);
        tools.Register(BuiltInTools.DateTimeNow);
    })
    .WithMaxIterations(5)
    .Build();

Agent codeAgent = Agent.CreateBuilder(capableModel)
    .WithPersona("code-specialist")
    .WithInstruction(
        "You are a coding assistant specialized in C# and .NET. " +
        "Provide clean, well-documented code with explanations.")
    .WithMaxIterations(3)
    .Build();

// ──────────────────────────────────────
// 4. Build the router with keyword routing
// ──────────────────────────────────────
string[] codeKeywords = { "code", "function", "class", "implement", "debug", "compile",
                          "syntax", "C#", "csharp", "method", "algorithm", "API" };

string[] reasoningKeywords = { "explain", "why", "analyze", "compare", "calculate",
                               "evaluate", "pros and cons", "step by step", "reasoning",
                               "complex", "strategy", "tradeoff" };

var router = new RouterOrchestrator()
    .AddRoute("quick", quickAgent)
    .AddRoute("reasoning", reasoningAgent)
    .AddRoute("code", codeAgent)
    .WithDefaultRoute("quick")
    .WithRoutingFunction((input, agents) =>
    {
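        // Match whole words only. A plain substring check would find "api" inside
        // "capital" and route the first test prompt below to the code agent.
        bool HasKeyword(string word) =>
            System.Text.RegularExpressions.Regex.IsMatch(
                input,
                $@"(?<!\w){System.Text.RegularExpressions.Regex.Escape(word)}(?!\w)",
                System.Text.RegularExpressions.RegexOptions.IgnoreCase);

        foreach (string keyword in codeKeywords)
        {
            if (HasKeyword(keyword))
                return "code";
        }

        foreach (string keyword in reasoningKeywords)
        {
            if (HasKeyword(keyword))
                return "reasoning";
        }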

        if (input.Split(' ').Length < 15)
            return "quick";

        return "reasoning";
    });

// ──────────────────────────────────────
// 5. Route various prompts
// ──────────────────────────────────────
string[] testPrompts = {
    "What is the capital of France?",
    "Explain why transformer architectures use multi-head attention instead of single attention.",
    "Implement a binary search algorithm in C# with generic type support.",
    "Hello!",
    "Compare the pros and cons of microservices versus monolithic architectures for a team of 5.",
    "Write a C# method that validates email addresses using regex."
};

foreach (string prompt in testPrompts)
{
    Console.ForegroundColor = ConsoleColor.Cyan;
    Console.WriteLine($"User: {prompt}");
    Console.ResetColor();

    var result = await router.ExecuteAsync(prompt);

    Console.ForegroundColor = ConsoleColor.Green;
    Console.WriteLine($"  [Routed to: {result.AgentResults.LastOrDefault()?.AgentName}]");
    Console.ResetColor();
    Console.WriteLine($"  {result.Content}\n");
}

Step 6: Use an LLM as the Router

For more nuanced routing, replace the keyword function with a lightweight LLM that classifies prompts. The fast model itself can serve as the router:

using System.Text;
using LMKit.Model;
using LMKit.Agents;
using LMKit.Agents.Orchestration;
using LMKit.Agents.Resilience;
using LMKit.Agents.Tools.BuiltIn;

LMKit.Licensing.LicenseManager.SetLicenseKey("");

Console.InputEncoding = Encoding.UTF8;
Console.OutputEncoding = Encoding.UTF8;

// ──────────────────────────────────────
// 1. Load a fast model for simple tasks
// ──────────────────────────────────────
Console.WriteLine("Loading fast model (1B)...");
using LM fastModel = LM.LoadFromModelID("gemma3:1b",
    loadingProgress: p => { Console.Write($"\rFast model: {p * 100:F0}%   "); return true; });
Console.WriteLine();

// ──────────────────────────────────────
// 2. Load a capable model for complex tasks
// ──────────────────────────────────────
Console.WriteLine("Loading capable model (4B)...");
using LM capableModel = LM.LoadFromModelID("qwen3.5:4b",
    loadingProgress: p => { Console.Write($"\rCapable model: {p * 100:F0}%   "); return true; });
Console.WriteLine("\n");

// ──────────────────────────────────────
// 3. Create specialized agents
// ──────────────────────────────────────
Agent quickAgent = Agent.CreateBuilder(fastModel)
    .WithPersona("quick-responder")
    .WithInstruction(
        "You are a fast, concise assistant for simple questions. " +
        "Answer in 1 to 2 sentences maximum. Be direct.")
    .WithMaxIterations(1)
    .Build();

Agent reasoningAgent = Agent.CreateBuilder(capableModel)
    .WithPersona("reasoning-expert")
    .WithInstruction(
        "You are a thorough reasoning assistant for complex questions. " +
        "Think step by step, provide detailed explanations.")
    .WithTools(tools =>
    {
        tools.Register(BuiltInTools.CalcArithmetic);
        tools.Register(BuiltInTools.DateTimeNow);
    })
    .WithMaxIterations(5)
    .Build();

Agent codeAgent = Agent.CreateBuilder(capableModel)
    .WithPersona("code-specialist")
    .WithInstruction(
        "You are a coding assistant specialized in C# and .NET. " +
        "Provide clean, well-documented code with explanations.")
    .WithMaxIterations(3)
    .Build();

// ──────────────────────────────────────
// 4. LLM-powered routing
// ──────────────────────────────────────
Agent routerAgent = Agent.CreateBuilder(fastModel)
    .WithInstruction(
        "You are a prompt classifier. Given a user message, respond with exactly " +
        "one word: 'quick', 'reasoning', or 'code'. " +
        "Use 'quick' for greetings, simple facts, and short answers. " +
        "Use 'reasoning' for analysis, comparisons, and explanations. " +
        "Use 'code' for programming, implementation, and debugging tasks. " +
        "Respond with the single word only, nothing else.")
    .WithMaxIterations(1)
    .Build();

var smartRouter = new RouterOrchestrator()
    .AddRoute("quick", quickAgent)
    .AddRoute("reasoning", reasoningAgent)
    .AddRoute("code", codeAgent)
    .WithDefaultRoute("quick")
    .WithRouterAgent(routerAgent);

Console.WriteLine("── LLM-Powered Router ──\n");

var smartResult = await smartRouter.ExecuteAsync(
    "Can you help me optimize a LINQ query that's causing N+1 database calls?");

Console.WriteLine($"Routed to: {smartResult.AgentResults.LastOrDefault()?.AgentName}");
Console.WriteLine($"Response: {smartResult.Content}\n");

The WithRouterAgent method lets the router LLM decide the route based on semantic understanding rather than simple keywords.
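
Small models do not always follow a "one word only" instruction. If you classify prompts yourself, for example inside a custom routing function, it helps to normalize the raw model output before using it as a route name. The sketch below is a minimal, model-free illustration. `NormalizeRoute` is a hypothetical helper, and whether RouterOrchestrator applies similar cleanup internally is not covered here.

```csharp
using System;
using System.Linq;

// Hypothetical helper: map raw classifier text to a known route name,
// falling back to `defaultRoute` when nothing recognizable is found.
static string NormalizeRoute(string? raw, string[] validRoutes, string defaultRoute)
{
    if (string.IsNullOrWhiteSpace(raw))
        return defaultRoute;

    // Split on anything that is not a letter, so "Route: 'code'." yields ["Route", "code"].
    var tokens = new string(raw.Select(c => char.IsLetter(c) ? c : ' ').ToArray())
        .Split(' ', StringSplitOptions.RemoveEmptyEntries);

    // Return the first token that is a valid route name (case-insensitive).
    foreach (string token in tokens)
    {
        string? match = validRoutes.FirstOrDefault(
            r => r.Equals(token, StringComparison.OrdinalIgnoreCase));
        if (match != null)
            return match;
    }
    return defaultRoute;
}

string[] routes = { "quick", "reasoning", "code" };
Console.WriteLine(NormalizeRoute("  Code.\n", routes, "quick"));
Console.WriteLine(NormalizeRoute("The route is 'reasoning'", routes, "quick"));
Console.WriteLine(NormalizeRoute("I'm not sure", routes, "quick"));
```

Falling back to the default route keeps the system responsive when the classifier misbehaves, at the cost of occasionally sending a hard prompt to the fast agent.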

Step 7: Add Fallback Resilience

Use FallbackAgentExecutor so that if the primary agent fails (timeout, out-of-memory, model error), the system automatically falls back to an alternative. The example below chains the reasoning agent and the quick agent directly:

using System.Text;
using LMKit.Model;
using LMKit.Agents;
using LMKit.Agents.Orchestration;
using LMKit.Agents.Resilience;
using LMKit.Agents.Tools.BuiltIn;

LMKit.Licensing.LicenseManager.SetLicenseKey("");

Console.InputEncoding = Encoding.UTF8;
Console.OutputEncoding = Encoding.UTF8;

// ──────────────────────────────────────
// 1. Load a fast model for simple tasks
// ──────────────────────────────────────
Console.WriteLine("Loading fast model (1B)...");
using LM fastModel = LM.LoadFromModelID("gemma3:1b",
    loadingProgress: p => { Console.Write($"\rFast model: {p * 100:F0}%   "); return true; });
Console.WriteLine();

// ──────────────────────────────────────
// 2. Load a capable model for complex tasks
// ──────────────────────────────────────
Console.WriteLine("Loading capable model (4B)...");
using LM capableModel = LM.LoadFromModelID("qwen3.5:4b",
    loadingProgress: p => { Console.Write($"\rCapable model: {p * 100:F0}%   "); return true; });
Console.WriteLine("\n");

// ──────────────────────────────────────
// 3. Create specialized agents
// ──────────────────────────────────────
Agent quickAgent = Agent.CreateBuilder(fastModel)
    .WithPersona("quick-responder")
    .WithInstruction(
        "You are a fast, concise assistant for simple questions. " +
        "Answer in 1 to 2 sentences maximum. Be direct.")
    .WithMaxIterations(1)
    .Build();

Agent reasoningAgent = Agent.CreateBuilder(capableModel)
    .WithPersona("reasoning-expert")
    .WithInstruction(
        "You are a thorough reasoning assistant for complex questions. " +
        "Think step by step, provide detailed explanations.")
    .WithTools(tools =>
    {
        tools.Register(BuiltInTools.CalcArithmetic);
        tools.Register(BuiltInTools.DateTimeNow);
    })
    .WithMaxIterations(5)
    .Build();

// ──────────────────────────────────────
// 4. Fallback chain: capable → fast
// ──────────────────────────────────────
var fallbackExecutor = new FallbackAgentExecutor()
    .AddAgent(reasoningAgent)
    .AddAgent(quickAgent)
    .OnFallback((agent, ex, attempt) =>
    {
        Console.ForegroundColor = ConsoleColor.Yellow;
        Console.WriteLine($"  [FALLBACK] Agent '{agent.Identity.Persona}' failed on attempt {attempt}: {ex.Message}");
        Console.WriteLine($"  [FALLBACK] Trying next agent...");
        Console.ResetColor();
    });

Console.WriteLine("── Fallback Execution ──");
var fallbackResult = await fallbackExecutor.ExecuteAsync(
    "Explain the difference between async and parallel programming in .NET.");

Console.WriteLine($"Result from: {fallbackResult.AgentName}");
Console.WriteLine($"Response: {fallbackResult.Content}\n");
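
Conceptually, a fallback chain tries each handler in order and moves on when one throws. The sketch below illustrates that control flow with plain delegates. It is not FallbackAgentExecutor's implementation, and `RunWithFallbackAsync` is a hypothetical name introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical illustration: try each handler in order until one succeeds.
// `onFallback` receives the failed handler's index and the exception.
static async Task<string> RunWithFallbackAsync(
    IReadOnlyList<Func<string, Task<string>>> handlers,
    string input,
    Action<int, Exception>? onFallback = null)
{
    var failures = new List<Exception>();
    for (int i = 0; i < handlers.Count; i++)
    {
        try
        {
            return await handlers[i](input);
        }
        catch (Exception ex)
        {
            failures.Add(ex);
            onFallback?.Invoke(i, ex);
        }
    }
    // Every handler failed: surface all errors together.
    throw new AggregateException("All handlers failed.", failures);
}

var handlers = new List<Func<string, Task<string>>>
{
    _ => throw new TimeoutException("primary timed out"),        // simulated failure
    prompt => Task.FromResult($"fallback answer to: {prompt}")   // simulated backup
};

string answer = await RunWithFallbackAsync(
    handlers, "ping",
    (index, ex) => Console.WriteLine($"Handler {index} failed: {ex.Message}"));

Console.WriteLine(answer);
```

Ordering matters in a chain like this: placing the most capable handler first preserves quality in the common case, while a cheaper handler at the end keeps the system answering when the first one fails.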

Step 8: Interactive Router with Metrics

Build a complete interactive loop that tracks routing statistics:

using System.Text;
using LMKit.Model;
using LMKit.Agents;
using LMKit.Agents.Orchestration;
using LMKit.Agents.Resilience;
using LMKit.Agents.Tools.BuiltIn;

LMKit.Licensing.LicenseManager.SetLicenseKey("");

Console.InputEncoding = Encoding.UTF8;
Console.OutputEncoding = Encoding.UTF8;

// ──────────────────────────────────────
// 1. Load a fast model for simple tasks
// ──────────────────────────────────────
Console.WriteLine("Loading fast model (1B)...");
using LM fastModel = LM.LoadFromModelID("gemma3:1b",
    loadingProgress: p => { Console.Write($"\rFast model: {p * 100:F0}%   "); return true; });
Console.WriteLine();

// ──────────────────────────────────────
// 2. Load a capable model for complex tasks
// ──────────────────────────────────────
Console.WriteLine("Loading capable model (4B)...");
using LM capableModel = LM.LoadFromModelID("qwen3.5:4b",
    loadingProgress: p => { Console.Write($"\rCapable model: {p * 100:F0}%   "); return true; });
Console.WriteLine("\n");

// ──────────────────────────────────────
// 3. Create specialized agents
// ──────────────────────────────────────
Agent quickAgent = Agent.CreateBuilder(fastModel)
    .WithPersona("quick-responder")
    .WithInstruction(
        "You are a fast, concise assistant for simple questions. " +
        "Answer in 1 to 2 sentences maximum. Be direct.")
    .WithMaxIterations(1)
    .Build();

Agent reasoningAgent = Agent.CreateBuilder(capableModel)
    .WithPersona("reasoning-expert")
    .WithInstruction(
        "You are a thorough reasoning assistant for complex questions. " +
        "Think step by step, provide detailed explanations.")
    .WithTools(tools =>
    {
        tools.Register(BuiltInTools.CalcArithmetic);
        tools.Register(BuiltInTools.DateTimeNow);
    })
    .WithMaxIterations(5)
    .Build();

Agent codeAgent = Agent.CreateBuilder(capableModel)
    .WithPersona("code-specialist")
    .WithInstruction(
        "You are a coding assistant specialized in C# and .NET. " +
        "Provide clean, well-documented code with explanations.")
    .WithMaxIterations(3)
    .Build();

// ──────────────────────────────────────
// 4. Build the LLM-powered router
// ──────────────────────────────────────
Agent routerAgent = Agent.CreateBuilder(fastModel)
    .WithInstruction(
        "You are a prompt classifier. Given a user message, respond with exactly " +
        "one word: 'quick', 'reasoning', or 'code'. " +
        "Use 'quick' for greetings, simple facts, and short answers. " +
        "Use 'reasoning' for analysis, comparisons, and explanations. " +
        "Use 'code' for programming, implementation, and debugging tasks. " +
        "Respond with the single word only, nothing else.")
    .WithMaxIterations(1)
    .Build();

var smartRouter = new RouterOrchestrator()
    .AddRoute("quick", quickAgent)
    .AddRoute("reasoning", reasoningAgent)
    .AddRoute("code", codeAgent)
    .WithDefaultRoute("quick")
    .WithRouterAgent(routerAgent);

// ──────────────────────────────────────
// 5. Interactive loop with route tracking
// ──────────────────────────────────────
var routeStats = new Dictionary<string, int>
{
    ["quick"] = 0,
    ["reasoning"] = 0,
    ["code"] = 0
};

Console.WriteLine("── Interactive Router ──");
Console.WriteLine("Type a message (or 'quit' to exit, 'stats' for metrics):\n");

while (true)
{
    Console.ForegroundColor = ConsoleColor.White;
    Console.Write("You: ");
    string? input = Console.ReadLine();
    Console.ResetColor();

    if (string.IsNullOrWhiteSpace(input)) continue;
    if (input.Equals("quit", StringComparison.OrdinalIgnoreCase)) break;

    if (input.Equals("stats", StringComparison.OrdinalIgnoreCase))
    {
        int total = routeStats.Values.Sum();
        Console.WriteLine("\n── Routing Statistics ──");
        foreach (var stat in routeStats)
        {
            double pct = total > 0 ? (double)stat.Value / total * 100 : 0;
            Console.WriteLine($"  {stat.Key}: {stat.Value} requests ({pct:F1}%)");
        }
        Console.WriteLine($"  Total: {total} requests\n");
        continue;
    }

    var response = await smartRouter.ExecuteAsync(input);

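    // Note: the counters only increment when AgentName equals a route key
    // ("quick", "reasoning", "code"). Check what AgentName returns in your
    // version; if it is the persona, map it back to the route name first.
    string routedTo = response.AgentResults.LastOrDefault()?.AgentName ?? "unknown";
    if (routeStats.ContainsKey(routedTo))
        routeStats[routedTo]++;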

    Console.ForegroundColor = ConsoleColor.DarkGray;
    Console.WriteLine($"  [Route: {routedTo}]");
    Console.ResetColor();
    Console.WriteLine($"Assistant: {response.Content}\n");
}

// Final stats
Console.WriteLine("\n── Final Session Statistics ──");
foreach (var stat in routeStats)
    Console.WriteLine($"  {stat.Key}: {stat.Value} requests");
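
The counters in this loop only increment when the reported agent name equals a route key. If the name reported for an agent is its persona (for example "quick-responder") rather than the route name, nothing would be counted. The sketch below shows one way to handle both cases with an explicit lookup table. It assumes the persona strings from Step 2, and `Record` is a hypothetical helper; it does not call LM-Kit.NET.

```csharp
using System;
using System.Collections.Generic;

// Assumption: agents may be reported by persona; map each persona to its route key.
var personaToRoute = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
{
    ["quick-responder"] = "quick",
    ["reasoning-expert"] = "reasoning",
    ["code-specialist"] = "code"
};

var routeCounts = new Dictionary<string, int>();

// Record one request. Names that are neither a persona nor a route key
// are counted under "unknown" instead of being silently dropped.
void Record(string? agentName)
{
    string key = "unknown";
    if (agentName != null)
    {
        if (personaToRoute.TryGetValue(agentName, out var mapped))
            key = mapped;
        else if (personaToRoute.ContainsValue(agentName))
            key = agentName;
    }
    routeCounts[key] = routeCounts.GetValueOrDefault(key) + 1;
}

Record("quick-responder");
Record("code");
Record("code-specialist");
Record(null);

foreach (var (name, count) in routeCounts)
    Console.WriteLine($"{name}: {count}");
```

Counting unmatched names under "unknown" makes a naming mismatch visible in the statistics rather than leaving every counter at zero.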

Common Issues

Problem	Cause	Fix
Out-of-memory with two models	Insufficient VRAM for simultaneous loading	Use a single model for all agents, or use smaller quantizations
Router always picks the same route	Routing function logic too broad	Add more specific keywords or use `WithRouterAgent` for semantic routing
LLM router returns unexpected text	Router agent instruction not strict enough	Use grammar constraints with `CreateGrammarFromStringList` to force valid route names
Fallback not triggered	Exception type not caught by default handler	Use `HandleException` on `FallbackAgentExecutor` to specify which exceptions trigger fallback
Slow routing with LLM router	Router model is too large	Use the smallest available model (1B) for the router agent

Next Steps

Build a Multi-Agent Workflow: use Pipeline, Parallel, and Supervisor orchestration alongside routing.
Build a Resilient Production Agent: add retry policies and circuit breakers to your routed agents.
Enforce Structured Output with Grammar-Constrained Decoding: use grammar constraints on the router agent for deterministic route selection.
Stream Agent Responses in Real Time: add token streaming to your routed agents for responsive UIs.
