πŸ‘‰ Try the demo: https://github.com/LM-Kit/lm-kit-net-samples/tree/main/console_net/agents/filter_pipeline

Filter / Middleware Pipeline for C# .NET Applications


🎯 Purpose of the Demo

The Filter Pipeline demo shows how to use LM-Kit.NET's FilterPipeline to attach middleware-style filters that intercept the prompt, completion, and tool invocation stages of text generation. Filters follow the ASP.NET Core onion (middleware) pattern: code before await next(context) runs on the way in, code after runs on the way out.

The sample shows how to:

  • Attach prompt filters to log, rewrite, or short-circuit prompts before inference.
  • Attach completion filters to collect telemetry, enforce quality gates, or transform results.
  • Attach tool invocation filters to log, rate-limit, cache, cancel, or override individual tool calls.
  • Share cross-filter state via the Properties dictionary that flows through all three stages.
  • Use both inline lambdas and class-based filters (IPromptFilter, ICompletionFilter, IToolInvocationFilter).

Why Filters?

  • Separation of concerns: telemetry, security, caching, and moderation logic stays out of your main application code.
  • Composable: stack multiple filters in any order; each wraps the next like middleware layers.
  • Non-invasive: existing events (BeforeToolInvocation, AfterToolInvocation) continue to work alongside filters.
  • Portable: the same FilterPipeline instance can be shared between MultiTurnConversation and Agent.

πŸ‘₯ Who Should Use This Demo

  • Enterprise developers who need audit logging, content moderation, or compliance checks on every LLM interaction.
  • Platform engineers building shared AI infrastructure who want pluggable middleware for telemetry, rate limiting, and caching.
  • Agent builders who need fine-grained control over the tool-calling loop (cancel, override, or terminate).
  • Anyone familiar with ASP.NET Core middleware who wants the same composable pattern for LLM pipelines.

πŸš€ What Problem It Solves

Without filters, cross-cutting concerns like logging, caching, rate limiting, and prompt rewriting get tangled into application logic. The FilterPipeline provides clean interception points at three stages:

  1. Prompt stage: modify or validate the user prompt before it reaches the model.
  2. Completion stage: inspect, transform, or replace the model's output after inference.
  3. Tool invocation stage: intercept each individual tool call during the automatic tool-calling loop.

Each filter receives a context object and a next delegate. Calling next passes control to the next filter (or the core operation). Not calling next short-circuits the pipeline.
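The onion mechanics described above can be sketched in plain C#, independent of LM-Kit. The FilterContext, Filter, and MiniPipeline types here are illustrative stand-ins, not the SDK's actual API; the core delegate stands in for model inference:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var pipeline = new MiniPipeline()
    .Add(async (ctx, next) =>
    {
        Console.WriteLine("logger: in");   // runs on the way in
        await next(ctx);
        Console.WriteLine("logger: out");  // runs on the way out
    })
    .Add(async (ctx, next) =>
    {
        ctx.Prompt += " (be brief)";       // rewrite before the core operation
        await next(ctx);
    });

var context = new FilterContext { Prompt = "Hello" };
await pipeline.Run(context, ctx =>
{
    ctx.Result = $"echo: {ctx.Prompt}";    // stands in for model inference
    return Task.CompletedTask;
});
Console.WriteLine(context.Result);         // echo: Hello (be brief)

// Illustrative types -- not LM-Kit's actual API.
class FilterContext
{
    public string Prompt = "";
    public string? Result;
    public Dictionary<string, object> Properties = new();
}

delegate Task FilterDelegate(FilterContext ctx);
delegate Task Filter(FilterContext ctx, FilterDelegate next);

class MiniPipeline
{
    private readonly List<Filter> _filters = new();

    public MiniPipeline Add(Filter f) { _filters.Add(f); return this; }

    // Compose filters in reverse so the first-added filter wraps all the
    // others, ending at the core operation.
    public Task Run(FilterContext ctx, FilterDelegate core)
    {
        FilterDelegate next = core;
        for (int i = _filters.Count - 1; i >= 0; i--)
        {
            Filter filter = _filters[i];
            FilterDelegate downstream = next;
            next = c => filter(c, downstream);
        }
        return next(ctx);
    }
}
```

A filter that returns without calling next(ctx) short-circuits: every downstream filter and the core operation are skipped, exactly as in ASP.NET Core middleware.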


πŸ’» Demo Application Overview

The demo is a console app with three parts:

Part 1: Prompt & Completion Filters

  • Logger filter: logs every prompt and measures inference time.
  • Rewriter filter: appends a brevity constraint to the prompt.
  • Telemetry filter: reports token count, generation speed, and quality score after completion.

Part 2: Tool Invocation Filters with Agent

  • Logger filter: logs tool name, arguments, and batch position for every tool call.
  • Rate limiter filter: blocks tool calls after a configurable threshold.
  • Cache filter: stores and returns cached results for identical tool calls.

Part 3: Interactive Chat

  • Combines all three filter types in a live multi-turn chat.
  • Shows per-turn timing, tool invocation logs, and session statistics via /stats.

✨ Key Features

  • Lambda-friendly API: AddPromptFilter(), AddCompletionFilter(), AddToolInvocationFilter() accept inline delegates.
  • Method chaining: new FilterPipeline().AddPromptFilter(...).AddCompletionFilter(...).AddToolInvocationFilter(...).
  • Cross-filter state: Properties dictionary shared between prompt, completion, and tool filters within a single request.
  • Short-circuit support: set ctx.Result in a prompt filter to skip inference entirely (e.g., for caching).
  • Tool loop control: set ctx.Cancel to skip a tool, ctx.Terminate to stop the tool-calling loop.
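The tool loop control feature can be sketched as follows. The ctx.Cancel and ctx.Terminate property names come from the feature list above; the budget counter and its threshold are illustrative:

```csharp
int toolCallBudget = 5; // illustrative per-request budget

pipeline.AddToolInvocationFilter(async (ctx, next) =>
{
    if (toolCallBudget-- <= 0)
    {
        ctx.Cancel = true;       // skip this single tool call
        // ctx.Terminate = true; // or stop the entire tool-calling loop
        return;                  // short-circuit: downstream filters and the tool never run
    }

    await next(ctx);
});
```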

πŸ—οΈ Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                   FilterPipeline                       β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                        β”‚
β”‚  User Prompt                                           β”‚
β”‚       β”‚                                                β”‚
β”‚       β–Ό                                                β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”              β”‚
β”‚  β”‚ Prompt Filter 1 │──│ Prompt Filter 2 │──► Inference β”‚
β”‚  β”‚  (Logger)       β”‚  β”‚  (Rewriter)     β”‚              β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜              β”‚
β”‚                                                β”‚       β”‚
β”‚                                                β–Ό       β”‚
β”‚                                   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”‚
β”‚                                   β”‚ Completion     β”‚   β”‚
β”‚                                   β”‚ Filter         β”‚   β”‚
β”‚                                   β”‚ (Telemetry)    β”‚   β”‚
β”‚                                   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β”‚
β”‚                                                        β”‚
β”‚  During tool-calling loop:                             β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”         β”‚
β”‚  β”‚ Tool Filter│─│ Tool Filter│─│ Tool Filter│─► Tool   β”‚
β”‚  β”‚ (Logger)   β”‚ β”‚ (RateLimit)β”‚ β”‚ (Cache)    β”‚          β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜         β”‚
β”‚                                                        β”‚
β”‚  Onion pattern: code before next() runs inward,        β”‚
β”‚  code after next() runs outward.                       β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

βš™οΈ Getting Started

πŸ“‹ Prerequisites

  • .NET 8.0 or later
  • LM-Kit.NET SDK
  • 6 to 18 GB VRAM (depending on model choice)

πŸ“₯ Download the Project

▢️ Running the Application

cd console_net/agents/filter_pipeline
dotnet run

Select a model, then the demo walks through all three parts automatically before entering interactive mode.


πŸ’‘ Example Usage

Inline Prompt and Completion Filters

chat.Filters = new FilterPipeline()
    .AddPromptFilter(async (ctx, next) =>
    {
        Console.WriteLine($"Prompt: {ctx.Prompt}");
        await next(ctx);
    })
    .AddCompletionFilter(async (ctx, next) =>
    {
        await next(ctx);
        Console.WriteLine($"Tokens: {ctx.Result.GeneratedTokens.Count}");
    });

Tool Invocation Filter with Caching

var cache = new Dictionary<string, ToolCallResult>();

pipeline.AddToolInvocationFilter(async (ctx, next) =>
{
    string key = $"{ctx.ToolCall.Name}:{ctx.ToolCall.ArgumentsJson}";
    if (cache.TryGetValue(key, out var cached))
    {
        ctx.Result = cached;
        return; // skip actual tool call
    }

    await next(ctx);
    if (ctx.Result != null) cache[key] = ctx.Result;
});

Agent-Level Filters via AgentBuilder

var agent = Agent.CreateBuilder(model)
    .WithInstruction("You are a helpful assistant.")
    .WithTools(tools => tools.Register(BuiltInTools.CalcArithmetic))
    .WithFilters(filters =>
    {
        filters.AddPromptFilter(async (ctx, next) => { /* ... */ await next(ctx); });
        filters.AddToolInvocationFilter(async (ctx, next) => { /* ... */ await next(ctx); });
    })
    .Build();

πŸ”§ Troubleshooting

Issue: Filters not executing
Solution: Ensure chat.Filters or AgentBuilder.WithFilters() is set before calling Submit() or ExecuteAsync().

Issue: Prompt filter skips inference
Solution: A filter is setting ctx.Result without calling next(), which short-circuits the pipeline.

Issue: Tool filter blocks all calls
Solution: Check for a rate-limiting filter that sets ctx.Cancel = true.

Issue: Properties bag is empty in the completion filter
Solution: The Properties dictionary is shared by reference; ensure the prompt and completion contexts use the same instance.

πŸš€ Extend the Demo

  • Implement class-based filters (IPromptFilter, ICompletionFilter, IToolInvocationFilter) for reusable, testable middleware
  • Add a semantic cache prompt filter that checks embedding similarity before running inference
  • Build a content moderation prompt filter that rejects harmful inputs
  • Combine with ToolPermissionPolicy for defense-in-depth: policies for access control, filters for cross-cutting logic