What Is Agent Governance? And Why Your LLM Needs It
"Governance" in a software context usually means policy documentation, access reviews, and the occasional audit. For AI agents, governance means something more immediate: defining in code exactly what an autonomous system is allowed to do, then enforcing it on every action it takes.
The distinction matters because an AI agent does not wait for a human to approve each step. It reasons, decides, and executes — often faster than any review process could intervene. By the time a human notices something went wrong, the agent may have read files it should not have read, called APIs it should not have called, or passed data to an external system in ways that were not anticipated when the original task was assigned.
Agent governance is the set of mechanisms that prevent that outcome.
What "Governance" Means When the Agent Is Autonomous
In a traditional software system, access control is enforced at the authentication boundary. A service account has a role; the role has permissions; the permissions gate what the service can do. Audits happen after the fact, but the boundary is at least well-defined.
An AI agent operating over MCP does not have a fixed call graph. The model decides which tools to invoke based on what it has been asked to do and what tools it can see. A sufficiently broad tool surface means a sufficiently broad blast radius if the model reasons incorrectly, follows an injected instruction, or encounters an edge case the developer did not anticipate.
Governance for autonomous agents means shifting the enforcement point from "what the developer intended when they wrote the code" to "what the policy says at the moment the tool call is made." Those two things are not the same, and the gap between them is where most incidents happen.
The Three Layers Where Things Go Wrong
Security failures in AI agent deployments tend to cluster in three places.
The model layer covers how the model itself interprets instructions. Prompt injection — where attacker-controlled content in the environment overrides the intended system prompt — is the canonical example. Defenses here include sandboxed execution, careful prompt construction, and limiting what user-controlled content can influence tool selection.
The tool surface is the set of MCP tools the agent can invoke. If the agent can read arbitrary files, call external APIs without constraint, or execute shell commands, then any flaw in reasoning or any injection that reaches the model has a large action space to exploit. This is the layer Navil addresses: by wrapping the MCP server configuration and enforcing a YAML policy on every tool call, the tool surface visible to the agent is scoped to exactly what the current task requires.
The data layer covers what data the agent can retrieve and where it can send it. Even a well-scoped tool surface can leak data if the agent is allowed to pass retrieved content to external services without review. Data-layer governance typically involves output filtering, egress controls, and logging.
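As an illustration of what a data-layer control can look like, independent of any particular product, here is a minimal sketch of an egress check that scans outbound tool-call arguments for content that should never leave the environment. The two patterns and the block decision are assumptions chosen for the example, not a complete data-loss-prevention setup.

import re

# Illustrative only: two patterns that should never appear in outbound
# tool-call arguments. A real deployment would use a broader, tested set.
EGRESS_PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "private_key_block": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}

def check_egress(tool_name: str, arguments: dict) -> list[str]:
    """Return the names of any sensitive patterns found in the call's arguments."""
    blob = repr(arguments)
    return [name for name, pattern in EGRESS_PATTERNS.items() if pattern.search(blob)]

# Example: an agent tries to forward file contents to an external webhook.
hits = check_egress("post_to_webhook", {"url": "https://example.com/hook",
                                        "body": "key=AKIAABCDEFGHIJKLMNOP"})
if hits:
    print(f"blocked outbound call: matched {hits}")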
Most teams, when they start thinking about agent security, focus on the model layer because that is where the interesting attacks are described. The tool surface is less glamorous but often more straightforward to harden — and hardening it limits the consequences of failures at the model layer.
Policy-First vs. Detect-Only Approaches
There are two philosophies for handling agent behavior at the tool layer: define what is allowed and block everything else, or allow everything and try to detect anomalies.
Detect-only approaches are better than nothing. If you log every tool call and alert on unusual patterns, you will catch some incidents. But detection has an inherent lag — you are responding to something that already happened — and in an automated environment, a lot can happen in the time between the anomaly and the alert.
Policy-first means writing an explicit allowlist of what the agent is permitted to do before it runs, enforcing that allowlist at the call layer, and treating everything outside the policy as blocked by default. The navil.yaml format supports this directly:
policy:
  allow:
    - tool: search_codebase
    - tool: read_file
      scope: "./src/**"
    - tool: create_pr
  deny:
    - tool: "*"
      default: true

The advantage of policy-first is that the blast radius of any incident — injection, model error, or supply chain compromise — is bounded by the policy. The agent literally cannot do what the policy does not permit.
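To make the default-deny behavior concrete, here is a minimal sketch of how an allowlist like the one above could be evaluated for a single tool call. It is illustrative only: the glob matching, and the assumption that the accessed path arrives in a "path" argument, are simplifications rather than Navil's actual matching rules.

from fnmatch import fnmatch

# The allow rules from the navil.yaml above, as plain dicts (illustrative).
ALLOW_RULES = [
    {"tool": "search_codebase"},
    {"tool": "read_file", "scope": "./src/**"},
    {"tool": "create_pr"},
]

def is_allowed(tool: str, arguments: dict) -> bool:
    """Deny by default: a call passes only if some allow rule matches it."""
    for rule in ALLOW_RULES:
        if rule["tool"] != tool:
            continue
        scope = rule.get("scope")
        if scope is None:
            return True
        # Assumes the accessed path arrives in a "path" argument. fnmatch's "*"
        # also crosses path separators, so it is only a loose stand-in for real
        # "**" glob semantics.
        if fnmatch(arguments.get("path", ""), scope):
            return True
    return False

print(is_allowed("read_file", {"path": "./src/app/main.py"}))  # True
print(is_allowed("read_file", {"path": "./.env"}))             # False: outside scope
print(is_allowed("exec_shell", {"command": "curl evil.sh"}))   # False: not in the allowlist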
That said, policy-first and detect-only are not mutually exclusive. Navil enforces policy at the call layer and also evaluates each call against a pattern library of 568 detection signatures across 36 categories. Blocks happen at the policy layer; anomalies that pass policy are logged for review. Both signals are useful.
What a Governance Proxy Actually Does
A governance proxy sits between the agent and the MCP server. Every tool call the agent makes passes through the proxy before it reaches the server.
At each call, the proxy does three things: it checks the call against the configured policy and blocks it if it does not match; it evaluates the call against the detection pattern library and flags anomalies; and it logs the decision — allowed or blocked — with enough context to reconstruct what happened and why.
This is distinct from wrapping tool calls in application code. Application-layer controls require every developer who ships an agent to remember to add them. A proxy-level control applies uniformly regardless of how the agent was written, which MCP server it talks to, or which model is driving it.
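In code, the per-call pipeline described above has a simple shape. The sketch below is a simplified stand-in, with a hard-coded allowlist and a single toy detection predicate in place of the real policy engine and pattern library; it is not Navil's implementation.

import json
import time

# Stand-ins for the real policy engine and detection pattern library.
ALLOWED_TOOLS = {"search_codebase", "read_file", "create_pr"}
DETECTION_PATTERNS = [
    ("destructive_shell_arguments", lambda call: "rm -rf" in json.dumps(call["arguments"])),
]

def govern(call: dict, audit_log: list) -> bool:
    """Run one tool call through the three proxy steps: policy, detection, logging."""
    allowed = call["tool"] in ALLOWED_TOOLS                       # 1. policy check
    anomalies = [name for name, matches in DETECTION_PATTERNS     # 2. detection pass
                 if matches(call)]
    audit_log.append({                                            # 3. audit record
        "ts": time.time(),
        "tool": call["tool"],
        "decision": "allowed" if allowed else "blocked",
        "anomalies": anomalies,
    })
    return allowed

log: list = []
govern({"tool": "read_file", "arguments": {"path": "./src/main.py"}}, log)
govern({"tool": "exec_shell", "arguments": {"command": "rm -rf /tmp/x"}}, log)
print(json.dumps(log, indent=2))

The point of the structure is that the second call is blocked by the policy check, flagged by the detection pass, and recorded in the same audit log either way.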
The Navil proxy wraps your existing MCP server configuration files — for example ~/.cursor/mcp.json or Claude Code's config — without requiring modifications to the server itself. Policy is defined in navil.yaml, versioned alongside your code, and evaluated at runtime on every call.
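If you have not looked inside one of these files, the standard mcp.json shape maps each server name to the command the client launches. The wrapped entry below is purely illustrative of how a proxy can interpose at launch time ("governance-proxy" and its flags are placeholders, not a real binary, and this is not Navil's actual rewrite format).

# Standard mcp.json shape: each named server is a command the client launches.
original_config = {
    "mcpServers": {
        "github": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-github"],
        }
    }
}

# Hypothetical wrapped form: the proxy becomes the launch command and starts
# the original server itself, so every tool call passes through it first.
wrapped_config = {
    "mcpServers": {
        "github": {
            "command": "governance-proxy",
            "args": ["--policy", "navil.yaml", "--",
                     "npx", "-y", "@modelcontextprotocol/server-github"],
        }
    }
}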
The Runtime Gap That SCA Tools Do Not Cover
Software Composition Analysis (SCA) tools — npm audit, pip-audit, Dependabot, Snyk — are a standard part of the modern security stack. They scan your dependency tree and flag packages with known CVEs. They should be in your CI pipeline regardless of whether you are using AI agents.
What they cannot do is enforce policy at runtime or detect behavioral anomalies in a running agent. An SCA tool will tell you that a package in your MCP server's dependency tree has a known ReDoS vulnerability. It will not tell you that an agent is currently making a series of tool calls that match the signature of a data exfiltration sequence.
The runtime gap is the space between "this package has a known vulnerability" and "this agent is exploiting that vulnerability right now." Static scanning closes the first gap. Runtime enforcement and monitoring close the second. Both gaps are real and both need to be addressed.
For the current CVE landscape across the MCP ecosystem, the State of MCP Security 2026 report has the full data.
How to Start
Agent governance does not require a large upfront investment. The first step is to get runtime visibility — know what tool calls your agents are making before you write policy to control them.
pip install navil
navil secure

navil secure wraps your existing MCP server configuration and completes setup in approximately 47 seconds. From that point, every tool call is logged and evaluated. You can start in observe mode — no blocking, just logging — to understand your agents' actual call patterns before you write policy.
Once you have visibility, write an initial navil.yaml that allows only the tools each agent demonstrably needs. Review the logs after a week of production traffic. Tighten the policy where the data supports it.
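One way to turn a week of observe-mode logs into a first allowlist is simply to count which tools each agent actually called. The log format assumed below (one JSON object per line with a "tool" field, in a file named toolcalls.jsonl) is an illustration; adapt it to whatever your logs actually contain.

import json
from collections import Counter

def summarize_tool_usage(log_path: str) -> Counter:
    """Count how often each tool was called in a newline-delimited JSON log."""
    counts: Counter = Counter()
    with open(log_path) as f:
        for line in f:
            counts[json.loads(line)["tool"]] += 1
    return counts

usage = summarize_tool_usage("toolcalls.jsonl")
for tool, count in usage.most_common():
    print(f"{tool}: {count} calls")
# Tools that never appear over a representative window are candidates to leave
# out of the initial allowlist entirely.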
For the full setup walkthrough, see the quickstart. For policy options, integration patterns, and the enterprise audit log format, see the features page and the enterprise page.
Enforce policy on every tool call
Navil wraps your MCP servers in under 60 seconds — no changes to agent code. 568 detection patterns, 2.7 µs overhead.