πŸ‘‰ Try the demo: https://github.com/LM-Kit/lm-kit-net-samples/tree/main/console_net/agents/filter_pipeline

Filter / Middleware Pipeline for C# .NET Applications


🎯 Purpose of the Demo

The Filter Pipeline demo shows how to use LM-Kit.NET's FilterPipeline to attach middleware-style filters that intercept the prompt, completion, and tool invocation stages of text generation. Filters follow the ASP.NET Core onion (middleware) pattern: code before await next(context) runs on the way in, code after runs on the way out.

The sample shows how to:

  • Attach prompt filters to log, rewrite, or short-circuit prompts before inference.
  • Attach completion filters to collect telemetry, enforce quality gates, or transform results.
  • Attach tool invocation filters to log, rate-limit, cache, cancel, or override individual tool calls.
  • Share cross-filter state via the Properties dictionary that flows through all three stages.
  • Use both inline lambdas and class-based filters (IPromptFilter, ICompletionFilter, IToolInvocationFilter).

Why Filters?

  • Separation of concerns: telemetry, security, caching, and moderation logic stays out of your main application code.
  • Composable: stack multiple filters in any order; each wraps the next like middleware layers.
  • Non-invasive: existing events (BeforeToolInvocation, AfterToolInvocation) continue to work alongside filters.
  • Portable: the same FilterPipeline instance can be shared between MultiTurnConversation and Agent.

πŸ‘₯ Who Should Use This Demo

  • Enterprise developers who need audit logging, content moderation, or compliance checks on every LLM interaction.
  • Platform engineers building shared AI infrastructure who want pluggable middleware for telemetry, rate limiting, and caching.
  • Agent builders who need fine-grained control over the tool-calling loop (cancel, override, or terminate).
  • Anyone familiar with ASP.NET Core middleware who wants the same composable pattern for LLM pipelines.

πŸš€ What Problem It Solves

Without filters, cross-cutting concerns like logging, caching, rate limiting, and prompt rewriting get tangled into application logic. The FilterPipeline provides clean interception points at three stages:

  1. Prompt stage: modify or validate the user prompt before it reaches the model.
  2. Completion stage: inspect, transform, or replace the model's output after inference.
  3. Tool invocation stage: intercept each individual tool call during the automatic tool-calling loop.

Each filter receives a context object and a next delegate. Calling next passes control to the next filter (or the core operation). Not calling next short-circuits the pipeline.
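The onion mechanics described above can be sketched in plain C#, independent of LM-Kit. The FilterContext, Filter, and MiniPipeline types here are illustrative stand-ins, not the SDK's actual API; the core delegate stands in for model inference:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var pipeline = new MiniPipeline()
    .Add(async (ctx, next) =>
    {
        Console.WriteLine("logger: in");   // runs on the way in
        await next(ctx);
        Console.WriteLine("logger: out");  // runs on the way out
    })
    .Add(async (ctx, next) =>
    {
        ctx.Prompt += " (be brief)";       // rewrite before the core operation
        await next(ctx);
    });

var context = new FilterContext { Prompt = "Hello" };
await pipeline.Run(context, ctx =>
{
    ctx.Result = $"echo: {ctx.Prompt}";    // stands in for model inference
    return Task.CompletedTask;
});
Console.WriteLine(context.Result);         // echo: Hello (be brief)

// Illustrative types -- not LM-Kit's actual API.
class FilterContext
{
    public string Prompt = "";
    public string? Result;
    public Dictionary<string, object> Properties = new();
}

delegate Task FilterDelegate(FilterContext ctx);
delegate Task Filter(FilterContext ctx, FilterDelegate next);

class MiniPipeline
{
    private readonly List<Filter> _filters = new();

    public MiniPipeline Add(Filter f) { _filters.Add(f); return this; }

    // Compose filters in reverse so the first-added filter wraps all the
    // others, ending at the core operation.
    public Task Run(FilterContext ctx, FilterDelegate core)
    {
        FilterDelegate next = core;
        for (int i = _filters.Count - 1; i >= 0; i--)
        {
            Filter filter = _filters[i];
            FilterDelegate downstream = next;
            next = c => filter(c, downstream);
        }
        return next(ctx);
    }
}
```

A filter that returns without calling next(ctx) short-circuits: every downstream filter and the core operation are skipped, exactly as in ASP.NET Core middleware.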


πŸ’» Demo Application Overview

The demo is a console app with three parts:

Part 1: Prompt & Completion Filters

  • Logger filter: logs every prompt and measures inference time.
  • Rewriter filter: appends a brevity constraint to the prompt.
  • Telemetry filter: reports token count, generation speed, and quality score after completion.

Part 2: Tool Invocation Filters with Agent

  • Logger filter: logs tool name, arguments, and batch position for every tool call.
  • Rate limiter filter: blocks tool calls after a configurable threshold.
  • Cache filter: stores and returns cached results for identical tool calls.

Part 3: Interactive Chat

  • Combines all three filter types in a live multi-turn chat.
  • Shows per-turn timing, tool invocation logs, and session statistics via /stats.

✨ Key Features

  • Lambda-friendly API: AddPromptFilter(), AddCompletionFilter(), AddToolInvocationFilter() accept inline delegates.
  • Method chaining: new FilterPipeline().AddPromptFilter(...).AddCompletionFilter(...).AddToolInvocationFilter(...).
  • Cross-filter state: Properties dictionary shared between prompt, completion, and tool filters within a single request.
  • Short-circuit support: set ctx.Result in a prompt filter to skip inference entirely (e.g., for caching).
  • Tool loop control: set ctx.Cancel to skip a tool, ctx.Terminate to stop the tool-calling loop.
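The tool loop control feature can be sketched as follows. The ctx.Cancel and ctx.Terminate property names come from the feature list above; the budget counter and its threshold are illustrative:

```csharp
int toolCallBudget = 5; // illustrative per-request budget

pipeline.AddToolInvocationFilter(async (ctx, next) =>
{
    if (toolCallBudget-- <= 0)
    {
        ctx.Cancel = true;       // skip this single tool call
        // ctx.Terminate = true; // or stop the entire tool-calling loop
        return;                  // short-circuit: downstream filters and the tool never run
    }

    await next(ctx);
});
```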

πŸ—οΈ Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                   FilterPipeline                       β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                        β”‚
β”‚  User Prompt                                           β”‚
β”‚       β”‚                                                β”‚
β”‚       β–Ό                                                β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”              β”‚
β”‚  β”‚ Prompt Filter 1 │──│ Prompt Filter 2 │──► Inference β”‚
β”‚  β”‚  (Logger)       β”‚  β”‚  (Rewriter)     β”‚              β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜              β”‚
β”‚                                                β”‚       β”‚
β”‚                                                β–Ό       β”‚
β”‚                                   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”‚
β”‚                                   β”‚ Completion     β”‚   β”‚
β”‚                                   β”‚ Filter         β”‚   β”‚
β”‚                                   β”‚ (Telemetry)    β”‚   β”‚
β”‚                                   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β”‚
β”‚                                                        β”‚
β”‚  During tool-calling loop:                             β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”         β”‚
β”‚  β”‚ Tool Filter│─│ Tool Filter│─│ Tool Filter│─► Tool   β”‚
β”‚  β”‚ (Logger)   β”‚ β”‚ (RateLimit)β”‚ β”‚ (Cache)    β”‚          β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜         β”‚
β”‚                                                        β”‚
β”‚  Onion pattern: code before next() runs inward,        β”‚
β”‚  code after next() runs outward.                       β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

βš™οΈ Getting Started

πŸ“‹ Prerequisites

  • .NET 8.0 or later
  • LM-Kit.NET SDK
  • 6 to 18 GB VRAM (depending on model choice)

πŸ“₯ Download the Project

▢️ Running the Application

cd console_net/agents/filter_pipeline
dotnet run

Select a model, then the demo walks through all three parts automatically before entering interactive mode.


πŸ’‘ Example Usage

Inline Prompt and Completion Filters

chat.Filters = new FilterPipeline()
    .AddPromptFilter(async (ctx, next) =>
    {
        Console.WriteLine($"Prompt: {ctx.Prompt}");
        await next(ctx);
    })
    .AddCompletionFilter(async (ctx, next) =>
    {
        await next(ctx);
        Console.WriteLine($"Tokens: {ctx.Result.GeneratedTokens.Count}");
    });

Tool Invocation Filter with Caching

var cache = new Dictionary<string, ToolCallResult>();

pipeline.AddToolInvocationFilter(async (ctx, next) =>
{
    string key = $"{ctx.ToolCall.Name}:{ctx.ToolCall.ArgumentsJson}";
    if (cache.TryGetValue(key, out var cached))
    {
        ctx.Result = cached;
        return; // skip actual tool call
    }

    await next(ctx);
    if (ctx.Result != null) cache[key] = ctx.Result;
});

Agent-Level Filters via AgentBuilder

var agent = Agent.CreateBuilder(model)
    .WithInstruction("You are a helpful assistant.")
    .WithTools(tools => tools.Register(BuiltInTools.CalcArithmetic))
    .WithFilters(filters =>
    {
        filters.AddPromptFilter(async (ctx, next) => { /* ... */ await next(ctx); });
        filters.AddToolInvocationFilter(async (ctx, next) => { /* ... */ await next(ctx); });
    })
    .Build();

πŸ”§ Troubleshooting

Issue: Filters not executing
Solution: Ensure chat.Filters or AgentBuilder.WithFilters() is set before calling Submit() or ExecuteAsync().

Issue: Prompt filter skips inference
Solution: A filter is setting ctx.Result without calling next(), which short-circuits the pipeline.

Issue: Tool filter blocks all calls
Solution: Check for a rate-limiting filter that sets ctx.Cancel = true.

Issue: Properties bag is empty in the completion filter
Solution: The Properties dictionary is shared by reference; ensure the prompt and completion contexts use the same instance.

πŸš€ Extend the Demo

  • Implement class-based filters (IPromptFilter, ICompletionFilter, IToolInvocationFilter) for reusable, testable middleware
  • Add a semantic cache prompt filter that checks embedding similarity before running inference
  • Build a content moderation prompt filter that rejects harmful inputs
  • Combine with ToolPermissionPolicy for defense-in-depth: policies for access control, filters for cross-cutting logic