How 12 Statistical Detectors Catch Threats That Rules Can't
Your security rules are solid. You've blocked known attack vectors, written policies for credential handling, and locked down tool permissions. So why do you still wake up at 3 AM worried about the agent you can't fully predict?
Because the threats that keep security teams up at night don't match any existing pattern. They're novel. They're adaptive. They're behavioral.
Rule-based security is like a bouncer checking IDs. It catches anyone on the known blacklist. But it can't catch the regular who suddenly starts acting very differently. That's where anomaly detection steps in.
The Problem: Rules Are Always Behind
Static rules have a fundamental limitation: they describe what you know is bad. By the time you write the rule, the attacker has already moved to the next vector.
Consider a real scenario: an LLM agent gradually learns it can exfiltrate data by combining tool calls in a novel sequence--tool A to fetch data, tool B to format it, tool C to send it somewhere unexpected. Each tool call looks innocent individually. Your rules say "tool C is for logging, not exfiltration." But the combination--and the sequence--tells a different story.
Or imagine prompt injection that escalates over time. Day 1: benign requests. Day 5: slowly probing permissions. Day 10: the actual attack. Your static rules won't catch the pattern because each day looks acceptable in isolation.
The statistic: 98.3% of agent attacks are runtime behavioral anomalies that can't be caught by code scanning. The other 1.7%? That's tool vulnerabilities, poisoned dependencies, and code defects--the stuff static analysis finds. The vast majority of agent attacks happen at runtime, in how the agent uses its tools.
That's why Navil builds a behavioral baseline for each agent and detects when reality diverges from normal.
The Solution: 12 Adaptive Anomaly Detectors
Navil runs 12 statistical detectors that observe how YOUR agents actually behave--not how they should behave in theory. Each detector learns what "normal" looks like using Exponential Moving Average (EMA) baselines that adapt in real time.
Here's how it works:
1. Build a Baseline
When an agent first spins up, Navil starts collecting telemetry: tool call frequency, data volumes, timing patterns, which tools get combined, how long operations take. Over the first few hours or days, a behavioral fingerprint emerges.
This isn't a static threshold. It's an EMA baseline that continuously adapts. If an agent legitimately needs to call a tool 10x one day and 15x the next, the baseline adjusts. Normal behavior evolves with your agent--but sudden, dramatic changes trigger alerts.
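An EMA baseline of this kind can be sketched in a few lines. This is an illustrative toy, not Navil's implementation; the `alpha` and `ratio` parameters (and the class name) are assumptions for the sketch:

```python
class EmaBaseline:
    """Toy EMA baseline: flags values far above the adaptive average."""

    def __init__(self, alpha: float = 0.1, ratio: float = 10.0):
        self.alpha = alpha  # smoothing factor: higher = adapts faster
        self.ratio = ratio  # flag values more than `ratio` times baseline
        self.mean = None    # current baseline (None until first sample)

    def update(self, value: float) -> bool:
        """Record one observation; return True if it looks anomalous."""
        anomalous = self.mean is not None and value > self.ratio * self.mean
        if self.mean is None:
            self.mean = value  # first observation seeds the baseline
        else:
            # Normal drift (10 calls one day, 15 the next) folds in smoothly,
            # so "normal" evolves with the agent.
            self.mean = self.alpha * value + (1 - self.alpha) * self.mean
        return anomalous
```

Fed a steady stream of ~5 tool calls per hour, the baseline hovers near 5 and a sudden 100-call spike trips the ratio check, while gradual growth never does.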
2. Run Detection in Parallel
The 12 detectors cover these categories:
- Tool call frequency anomalies -- Agent suddenly calls a tool 50x when baseline is 5x per hour
- First-time tool usage -- Agent attempts to use a tool it's never accessed before (potential privilege escalation)
- Unusual data volumes -- Response payload is 10x larger than typical for this agent
- Timing anomalies -- Tool calls at unusual hours, unusual intervals, or unusual durations
- Tool sequence anomalies -- Combination of tools that individually look fine but together suggest data exfiltration or lateral movement
- Cross-agent correlation -- Two agents accessing the same sensitive data in a coordinated pattern
- Latency anomalies -- Tool calls taking 10x longer than baseline (possible denial of service or resource exhaustion)
- Error rate spikes -- Sudden increase in tool failures (possible probing or permission testing)
- Data type anomalies -- Agent accessing data types it typically never touches
- Geographic/IP anomalies -- Tools being called from unexpected network origins
- Credential reuse patterns -- Suspicious patterns in how credentials are accessed or rotated
- Behavioral entropy -- Agent's overall decision-making patterns become chaotic or unpredictable
Each detector publishes a confidence score. Multiple detectors firing simultaneously = high confidence alert.
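That "multiple detectors = high confidence" rule can be made concrete. A minimal sketch, assuming each detector reports a confidence in [0, 1] (the `fire_at` threshold and severity labels are illustrative, not Navil's actual tuning):

```python
def alert_level(scores: dict[str, float], fire_at: float = 0.7) -> str:
    """Map per-detector confidence scores to an alert severity.

    `scores` maps detector name -> confidence in [0, 1].
    """
    firing = [name for name, conf in scores.items() if conf >= fire_at]
    if len(firing) >= 3:
        return "high"    # several independent signals agree
    if len(firing) == 2:
        return "medium"
    if len(firing) == 1:
        return "low"     # a single detector may be noise
    return "none"
```

Counting independent detectors rather than summing raw scores keeps one noisy detector from dominating the severity.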
3. Learn from Operator Feedback
When you mark an alert as a false positive ("this is normal for my agent"), Navil learns. The detector adjusts the baseline upward for that specific agent. Over time, your detection becomes more precise--fewer false positives, higher signal-to-noise ratio.
This is the critical difference from static rules: your security adapts to your specific operational reality, not a generic threat model.
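One way the feedback step could work: fold the confirmed-benign observation into the baseline with extra weight, rather than waiting for the slow EMA drift. A hedged sketch; the function and its `weight` parameter are assumptions, not Navil's real API:

```python
def absorb_false_positive(baseline_mean: float, observed: float,
                          weight: float = 0.5) -> float:
    """Fold a confirmed-benign spike into the baseline with extra weight.

    A routine EMA update nudges the baseline slowly; an operator saying
    "this is normal for my agent" justifies a much larger step.
    """
    return weight * observed + (1 - weight) * baseline_mean
```

With a baseline of 5 calls/hour and a benign spike of 100, one feedback step moves the baseline to 52.5, so a similar spike tomorrow no longer fires.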
The Architecture: Security Off the Hot Path
Navil's design philosophy is simple: never block your agent. Security analysis must never become a bottleneck.
Here's the pipeline:
+---------------+
| Agent |
| Calls |
| Tool |
+------+--------+
|
| (real-time)
v
+------------------+
| Rust Proxy |
| (Hot Path) |
| - Log call |
| - Emit telemetry|
+------+-----------+
|
| (publish, sub-ms latency)
v
+------------------+
| Redis Broker |
| (Telemetry) |
+------+-----------+
|
| (async, never blocks)
v
+------------------------------+
| Python Workers |
| (12 Anomaly Detectors) |
| - Statistical analysis |
| - EMA baseline comparison |
| - Confidence scoring |
+------+-----------------------+
|
| (alerts, dashboards, webhooks)
v
+------------------+
| Alert Channel |
| - Slack/Discord |
| - Dashboard |
| - navil cli |
+------------------+
The Rust proxy handles the hot path and publishes to Redis with sub-millisecond latency. Python workers consume telemetry asynchronously. Security analysis never impacts agent performance.
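The decoupling can be shown in miniature. This toy uses an in-process queue and a worker thread as stand-ins for the Rust proxy and Redis broker (the real pipeline crosses process boundaries; names like `hot_path_log` and the payload-size detector are illustrative):

```python
import queue
import threading

telemetry: "queue.Queue[dict]" = queue.Queue()  # stands in for the Redis broker
alerts: list = []

def hot_path_log(event: dict) -> None:
    """What the proxy does inline: enqueue the event and return immediately."""
    telemetry.put(event)  # non-blocking publish; no analysis on the hot path

def worker() -> None:
    """Async consumer: runs detectors without ever blocking the agent."""
    while True:
        event = telemetry.get()
        if event is None:  # sentinel: shut down
            return
        if event.get("payload_bytes", 0) > 10_000:  # toy volume detector
            alerts.append({"agent": event["agent"], "reason": "large payload"})

t = threading.Thread(target=worker)
t.start()
hot_path_log({"agent": "data-processor", "payload_bytes": 120})
hot_path_log({"agent": "data-processor", "payload_bytes": 50_000})
telemetry.put(None)
t.join()
```

The key property is that `hot_path_log` only enqueues: however slow the detectors get, the agent-facing path stays fast.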
The Trust Score: Behavioral Reputation
Every agent gets its own trust score (0-100) based on behavioral analysis. Think of it as a reputation system for runtime behavior.
- Agent behaves normally? Trust is high.
- Anomaly detected? Trust drops immediately.
- Behavior normalizes? Trust gradually recovers (using EMA).
- Multiple detectors fire? Trust drops faster.
The trust score answers a key question: "Should I let this agent make that call right now?" Operators can set thresholds: if trust < 50, require human approval before sensitive operations. If trust < 20, restrict to read-only mode.
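Those dynamics can be sketched as a simple update rule. The penalty factor and recovery rate below are illustrative assumptions, not Navil's tuning:

```python
def update_trust(trust: float, detectors_firing: int,
                 recovery: float = 0.05) -> float:
    """Toy trust-score update: sharp drops on anomalies, gradual recovery.

    `trust` is the current score in [0, 100]; `detectors_firing` is how
    many detectors fired this interval.
    """
    if detectors_firing > 0:
        # Each additional detector firing compounds the drop.
        trust *= 0.7 ** detectors_firing
    else:
        # EMA-style recovery toward 100 while behavior stays normal.
        trust += recovery * (100 - trust)
    return max(0.0, min(100.0, trust))
```

A score of 90 falls below the 50-point "require approval" threshold after a single interval with three detectors firing, but climbs back only gradually once behavior normalizes.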
Seeing What's Happening
Three ways to monitor:
1. Real-time CLI:
navil alerts --last 1h --agent my-agent
Shows recent anomalies, confidence scores, affected detectors.
2. Dashboard: Head to navil.ai. See trust scores per agent, anomaly heatmaps, detector performance, false positive trends. Real-time Slack/Discord webhooks push critical alerts instantly.
3. Programmatic access: Query the Navil API to build custom dashboards or integrate with your SOC tooling.
What About False Positives?
They happen. You scale a legitimate operation, agent call volume spikes, detector fires, you mark it as false positive. The baseline adjusts. Over time, your detectors learn your operational patterns and false positive rate drops.
This is the advantage of adaptive baselines: they improve with operational feedback. Static rules either fire constantly (alert fatigue) or miss threats entirely (false negatives).
The 1.7% vs 98.3% Split
Here's the uncomfortable truth: static code scanning and vulnerability scanning catch 1.7% of agent security threats.
The other 98.3% are runtime behaviors:
- Tool poisoning -- Agent receives manipulated responses and acts on them
- Credential exfiltration -- Agent misuses credentials it's been given access to
- Session manipulation -- Agent is tricked into using another agent's session
- Progressive prompt injection -- Attack escalates slowly over multiple interactions
- Data exfiltration via combination -- Individual tool calls look fine, but the sequence exfiltrates data
- Lateral movement -- Agent uses legitimate tools to access systems it shouldn't
- Privilege escalation -- Agent gradually increases its own permissions through tool misuse
Only behavioral anomaly detection catches these. Rules catch none of them (until after the attack succeeds and you write a new rule).
Real-World Example
An agent normally calls your database tool 20-50 times per day. One afternoon it calls it 500 times. Your first detector (frequency anomaly) fires immediately.
The agent is also accessing data types it's never queried before (detector 9). And the response payloads are 5x larger than baseline (detector 3).
Three detectors firing together = high confidence alert. Trust score drops to 35. You open the alert, see which queries are unusual, approve the legitimate ones (false positive learning), and block the suspicious pattern.
The baseline adjusts. Same agent, legitimately higher volume, no more false positives.
Getting Started
Anomaly detection is built into Navil. Install it:
pip install navil
Initialize your workspace:
navil init
Define your agent:
from navil import Agent, Tool
my_agent = Agent(name="data-processor")
my_agent.add_tool(Tool(name="database", category="data"))
my_agent.add_tool(Tool(name="http_client", category="external"))
Start collecting telemetry:
navil start
Within hours, your baselines build. Within days, your first anomalies surface. Within weeks, false positives disappear as your detectors learn your normal.
What's Next
Anomaly detection is half the battle. The other half is response--what happens when an alert fires? Navil supports auto-responses (restrict permissions, require approval, sandbox the agent), integration with your incident response workflows, and forensic analysis to understand what happened.
But detection is where security begins. Without knowing when something is wrong, you can't defend against it.
Your agents are behaving right now. Navil's 12 detectors are learning what that looks like. The next attack--the one that doesn't match any rule--might be tomorrow. But it'll stand out against the baseline.
Install Navil
Protect your agents from behavioral threats that rules can't catch:
pip install navil
Visit navil.ai to explore the AI Policy Builder, see the dashboard, or join the community. Open source. Agent-native. Runtime security that actually works.
Get your coverage score
See how well your AI agents are protected against known threats.