How Do I Prevent an AI Agent from Misusing Tools in Production?


TL;DR

Use ToolPermissionPolicy to define exactly which tools an agent can use, with wildcard patterns such as process_* to cover whole families of tools. Every built-in tool exposes security metadata (risk level, side effects, read-only flag) that policies can filter on. For sensitive operations, require human approval via the ToolApprovalRequired event. Denied tool calls are reported back to the model so it can adapt its approach. The policy constrains what the agent may do, not how it reasons.


The Permission Policy

ToolPermissionPolicy is a centralized allow/deny system that evaluates every tool call before execution:

using LMKit.Agents;
using LMKit.Agents.Tools;

var policy = new ToolPermissionPolicy()
    // Allow safe categories
    .AllowCategory("data", "text", "numeric", "utility", "security")

    // Deny dangerous categories entirely
    .DenyCategory("io", "net")

    // Fine-grained: allow filesystem reads, deny writes and deletes
    .Allow("filesystem_read", "filesystem_list")
    .Deny("filesystem_write", "filesystem_delete")

    // Allow read-only HTTP, deny mutations
    .Allow("http_get", "http_head")
    .Deny("http_post", "http_put", "http_delete")

    // Require human approval for process execution
    .RequireApproval("process_*")

    // Set maximum risk level
    .SetMaxRiskLevel(ToolRiskLevel.High);

// 'model' is assumed to be an already-loaded model instance
var agent = Agent.CreateBuilder(model)
    .WithTools(tools =>
    {
        foreach (var tool in BuiltInTools.GetAll())
            tools.Register(tool);
    })
    .WithPermissionPolicy(policy)
    .Build();

Evaluation Order

When an agent tries to call a tool, the policy evaluates rules in this order:

  1. Explicit deny by name (highest priority)
  2. Deny by wildcard pattern (e.g., filesystem_*)
  3. Deny by category (e.g., DenyCategory("io"))
  4. Risk level gate (if SetMaxRiskLevel is configured)
  5. Approval required check
  6. Explicit allow by name
  7. Allow by wildcard pattern
  8. Allow by category
  9. Default action (configurable: Allow or Deny)

Deny always wins over allow at the same specificity level.
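
To make the ordering concrete, here is a minimal, self-contained sketch that takes the nine steps above literally. It is a toy model, not LM-Kit's implementation: SketchPolicy, SketchTool, SketchRisk, and SketchDecision are hypothetical stand-ins, the wildcard matching (a trailing or embedded * matches any run of characters) is an assumption, the risk gate is assumed to deny anything strictly above the configured maximum, and process_start is a made-up tool name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical stand-ins for illustration only; these are not LM-Kit types.
public enum SketchRisk { Low, Medium, High, Critical }
public enum SketchDecision { Allow, Deny, NeedsApproval }
public sealed record SketchTool(string Name, string Category, SketchRisk Risk);

public sealed class SketchPolicy
{
    public List<string> DenyRules { get; } = new();      // exact names or wildcard patterns
    public List<string> AllowRules { get; } = new();     // exact names or wildcard patterns
    public List<string> ApprovalRules { get; } = new();  // exact names or wildcard patterns
    public HashSet<string> DeniedCategories { get; } = new();
    public HashSet<string> AllowedCategories { get; } = new();
    public SketchRisk? MaxRisk { get; set; }
    public SketchDecision Default { get; set; } = SketchDecision.Deny;

    private static bool IsWildcard(string rule) => rule.Contains('*');

    // Assumed semantics: '*' matches any run of characters; without '*' the match is exact.
    private static bool Matches(string rule, string name) =>
        Regex.IsMatch(name, "^" + Regex.Escape(rule).Replace("\\*", ".*") + "$");

    public SketchDecision Evaluate(SketchTool tool)
    {
        // Steps 1-3: every deny check runs before any allow check.
        if (DenyRules.Any(r => !IsWildcard(r) && r == tool.Name)) return SketchDecision.Deny;
        if (DenyRules.Any(r => IsWildcard(r) && Matches(r, tool.Name))) return SketchDecision.Deny;
        if (DeniedCategories.Contains(tool.Category)) return SketchDecision.Deny;

        // Step 4: risk gate, only when a maximum is configured.
        if (MaxRisk is SketchRisk max && tool.Risk > max) return SketchDecision.Deny;

        // Step 5: approval check.
        if (ApprovalRules.Any(r => Matches(r, tool.Name))) return SketchDecision.NeedsApproval;

        // Steps 6-8: allow by name, then by wildcard, then by category.
        if (AllowRules.Any(r => !IsWildcard(r) && r == tool.Name)) return SketchDecision.Allow;
        if (AllowRules.Any(r => IsWildcard(r) && Matches(r, tool.Name))) return SketchDecision.Allow;
        if (AllowedCategories.Contains(tool.Category)) return SketchDecision.Allow;

        // Step 9: configurable default.
        return Default;
    }
}

public static class EvaluationOrderDemo
{
    public static void Run()
    {
        var policy = new SketchPolicy { MaxRisk = SketchRisk.High };
        policy.DenyRules.Add("filesystem_delete");
        policy.DeniedCategories.Add("net");
        policy.ApprovalRules.Add("process_*");
        policy.AllowRules.Add("filesystem_read");
        policy.AllowRules.Add("http_get");
        policy.AllowedCategories.Add("text");

        var tools = new[]
        {
            new SketchTool("filesystem_delete", "io", SketchRisk.High),
            new SketchTool("http_get", "net", SketchRisk.Low),
            new SketchTool("process_start", "io", SketchRisk.High),
            new SketchTool("filesystem_read", "io", SketchRisk.Low),
        };

        foreach (var tool in tools)
            Console.WriteLine($"{tool.Name}: {policy.Evaluate(tool)}");
    }
}
```

In this model, http_get is denied even though it is explicitly allowed by name, because the category deny at step 3 is reached before the name allow at step 6. If you rely on a narrow allow inside a broadly denied category, verify that combination against the real policy before shipping it.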


Tool Security Metadata

Every built-in tool exposes IToolMetadata with security-relevant properties:

Property        | Values                                                               | Purpose
RiskLevel       | Low, Medium, High, Critical                                          | How dangerous is this tool?
SideEffect      | None, LocalRead, LocalWrite, NetworkRead, NetworkWrite, Irreversible | What does the tool change?
IsReadOnly      | true/false                                                           | Does the tool mutate any state?
IsIdempotent    | true/false                                                           | Is it safe to retry?
DefaultApproval | Never, Conditional, Always                                           | Does the tool suggest needing approval?

// Query tools by metadata
var safeTools = BuiltInTools.GetByMaxRisk(ToolRiskLevel.Low);    // Pure computation only
var readOnly = BuiltInTools.GetReadOnly();                        // No state mutation
var ioTools = BuiltInTools.GetByCategory("io");                   // File system, process

Human-in-the-Loop Approval

For tools that require approval, subscribe to the ToolApprovalRequired event:

agent.ToolApprovalRequired += (sender, args) =>
{
    Console.WriteLine($"Agent wants to call: {args.ToolCall.Name}");
    Console.WriteLine($"  Arguments: {args.ToolCall.Arguments}");
    Console.WriteLine($"  Risk level: {args.RiskLevel}");
    Console.WriteLine($"  Side effect: {args.SideEffect}");

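    Console.Write("Allow? (y/n): ");
    // Compare case-insensitively without depending on the current culture
    var answer = Console.ReadLine()?.Trim();
    if (string.Equals(answer, "y", StringComparison.OrdinalIgnoreCase))
    {
        args.Approved = true;
    }
    else
    {
        // Deny explicitly and tell the model why
        args.Approved = false;
        args.DenialReason = "User declined this operation";
    }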
};

If no handler is subscribed, tool calls that require approval are automatically denied. The model receives a ToolCallResultType.Denied response with the reason, so it can try an alternative approach.
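
The fail-closed default described above can be illustrated with a small self-contained model. This is a sketch under assumptions, not LM-Kit code: SketchApprovalArgs and SketchApprovalGate are hypothetical types that only mirror the shape of the handler shown earlier (an Approved flag that starts as false, plus a DenialReason), and the default reason string is invented.

```csharp
#nullable enable
using System;

// Hypothetical stand-ins for illustration only; these are not LM-Kit types.
public sealed class SketchApprovalArgs : EventArgs
{
    public SketchApprovalArgs(string toolName) => ToolName = toolName;

    public string ToolName { get; }
    public bool Approved { get; set; }  // false unless a handler sets it
    public string DenialReason { get; set; } = "No approval handler responded";
}

public sealed class SketchApprovalGate
{
    public event EventHandler<SketchApprovalArgs>? ApprovalRequired;

    // Raises the event and reports the outcome. With no subscriber the event
    // is never invoked, so Approved keeps its default and the call is denied.
    public (bool Approved, string Reason) Request(string toolName)
    {
        var args = new SketchApprovalArgs(toolName);
        ApprovalRequired?.Invoke(this, args);

        if (args.Approved)
            return (true, "");
        return (false, args.DenialReason);
    }
}

public static class ApprovalGateDemo
{
    public static void Run()
    {
        var gate = new SketchApprovalGate();

        // No handler yet: the request is denied with the default reason.
        Console.WriteLine(gate.Request("filesystem_write"));

        // A non-interactive handler that pre-approves a single tool by name.
        gate.ApprovalRequired += (sender, args) =>
        {
            if (args.ToolName == "filesystem_write")
                args.Approved = true;
            else
                args.DenialReason = "Only filesystem_write is pre-approved";
        };

        Console.WriteLine(gate.Request("filesystem_write"));
        Console.WriteLine(gate.Request("filesystem_delete"));
    }
}
```

Starting from "denied" and requiring a handler to opt in means a missing or misconfigured subscription cannot silently let a sensitive call through. The same pattern suits unattended deployments, where a console prompt is not available and approval has to come from a rule or an external review queue.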


Common Policy Profiles

Safe Chat (No I/O, No Network)

var chatPolicy = new ToolPermissionPolicy()
    .AllowCategory("data", "text", "numeric", "utility", "security")
    .DenyCategory("io", "net")
    .SetMaxRiskLevel(ToolRiskLevel.Low);

Developer Assistant (Read + Controlled Write)

var devPolicy = new ToolPermissionPolicy()
    .AllowCategory("data", "text", "numeric", "utility", "security")
    .Allow("filesystem_read", "filesystem_list", "filesystem_stat")
    .Deny("filesystem_delete")
    .RequireApproval("filesystem_write", "filesystem_create")
    .Allow("http_get", "http_head")
    .Allow("websearch")
    .RequireApproval("process_*");

Research Agent (Read-Only + Web)

var researchPolicy = new ToolPermissionPolicy()
    .AllowCategory("data", "text", "numeric", "utility")
    .Allow("websearch", "http_get")
    .DenyCategory("io")
    .SetMaxRiskLevel(ToolRiskLevel.Medium);

What Happens When a Tool Is Denied

When the policy denies a tool call:

  1. The tool is not executed.
  2. A ToolCallResultType.Denied result is sent back to the model.
  3. The denial reason is included so the model understands why.
  4. The model can adapt: try a different tool, ask the user, or change its approach.

Because the model sees the reason for each denial, it is less likely to retry the same blocked call.
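
The first three steps can be sketched as a tiny dispatcher that checks a deny list before invoking anything. This is an illustrative model only: SketchDispatcher, SketchToolResult, and SketchResultType are hypothetical names, the tools are plain delegates, and the wording of the denial message is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for illustration only; these are not LM-Kit types.
public enum SketchResultType { Success, Denied }
public sealed record SketchToolResult(string ToolName, SketchResultType Type, string Content);

public sealed class SketchDispatcher
{
    private readonly Dictionary<string, Func<string, string>> _tools = new();
    private readonly HashSet<string> _denied = new();

    public void Register(string name, Func<string, string> implementation) =>
        _tools[name] = implementation;

    public void Deny(string name) => _denied.Add(name);

    public SketchToolResult Call(string name, string argument)
    {
        // A denied tool is never invoked; the result carries the reason instead.
        if (_denied.Contains(name))
        {
            return new SketchToolResult(
                name, SketchResultType.Denied, $"Tool '{name}' is denied by policy");
        }

        return new SketchToolResult(name, SketchResultType.Success, _tools[name](argument));
    }
}

public static class DenialDemo
{
    public static void Run()
    {
        var executions = 0;
        var dispatcher = new SketchDispatcher();
        dispatcher.Register("filesystem_delete", path => { executions++; return "deleted " + path; });
        dispatcher.Deny("filesystem_delete");

        var result = dispatcher.Call("filesystem_delete", "report.txt");

        // The result text is what would be handed back to the model as the tool output.
        Console.WriteLine($"{result.Type}: {result.Content}");
        Console.WriteLine($"Tool executions: {executions}");
    }
}
```

Returning the denial as an ordinary tool result, rather than throwing, keeps the conversation going: the model receives a message it can reason about and can choose a different tool or ask the user.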

