Table of Contents

Interface ICompletionFilter

Namespace
LMKit.TextGeneration.Filters
Assembly
LM-Kit.NET.dll

A filter that intercepts the completion pipeline after inference produces a result.

public interface ICompletionFilter

Examples

Telemetry filter that records token usage:

using LMKit.TextGeneration.Filters;
using System;
using System.Threading.Tasks;

public sealed class TokenTelemetryFilter : ICompletionFilter
{
    public async Task OnCompletionAsync(
        CompletionFilterContext context,
        Func<CompletionFilterContext, Task> next)
    {
        await next(context);

        Console.WriteLine(
            $"Prompt tokens: {context.Result.PromptTokenCount}, " +
            $"Generated: {context.Result.GeneratedTokenCount}, " +
            $"Quality: {context.Result.QualityScore:F2}");
    }
}

Quality gate that rejects low-quality completions:

using LMKit.TextGeneration.Filters;
using System;
using System.Threading.Tasks;

public sealed class QualityGateFilter : ICompletionFilter
{
    private readonly float _minScore;

    public QualityGateFilter(float minimumQualityScore = 0.5f)
    {
        _minScore = minimumQualityScore;
    }

    public async Task OnCompletionAsync(
        CompletionFilterContext context,
        Func<CompletionFilterContext, Task> next)
    {
        await next(context);

        if (context.Result != null &&
            context.Result.QualityScore < _minScore)
        {
            throw new InvalidOperationException(
                $"Completion quality {context.Result.QualityScore:F2} " +
                $"is below threshold {_minScore:F2}.");
        }
    }
}

Remarks

Completion filters execute in a middleware (onion) pattern around the point where the inference result is finalized and returned. Code before await next(context) runs before the result is finalized; code after runs with the result available on Result.

Common use cases:

  • Quality validation: check the completion meets minimum quality thresholds.
  • Output transformation: sanitize, format, or post-process the generated text.
  • Telemetry: record token usage, latency, and quality metrics.
  • Response caching: store the result for future semantic cache lookups.
  • Guardrails: reject completions that violate output policies.

Methods

OnCompletionAsync(CompletionFilterContext, Func<CompletionFilterContext, Task>)

Called when a completion result is being finalized.