v0.16.0 — 49 rules · 10/10 OWASP coverage · 1,142 tests passing

Security linting
for AI agents

Find vulnerabilities in your agent code before they reach production.
Static analysis built for tool-calling, MCP configs, and prompt flows.

49
Detection Rules
94.6%
Recall (AVB)
0.91
F1 Score
10/10
OWASP Coverage

Up and running in 6 lines

Install, scan, and gate your CI — all from the terminal.

agent-audit — bash
# Install $ pip install agent-audit
# Scan your project $ agent-audit scan ./your-agent-project
# Gate CI on high+ findings $ agent-audit scan . --fail-on high
# SARIF for GitHub Security tab $ agent-audit scan . --format sarif --output results.sarif
# Inspect a live MCP server (read-only) $ agent-audit inspect stdio -- npx -y @modelcontextprotocol/server-filesystem /tmp

What it catches

Seven categories of agent-specific vulnerabilities that general SAST tools miss entirely.

Injection Attacks

User input flows to exec(), subprocess, or SQL without sanitization. Taint-tracked from source to sink.

AGENT-001 AGENT-041 AGENT-049
🎯

Prompt Injection

User input concatenated into system prompts via f-strings, .format(), or string concatenation.

AGENT-010 AGENT-027
🔑

Leaked Secrets

API keys hardcoded in source or MCP configs. Three-stage filtering: regex → entropy + placeholder → context.

AGENT-004 AGENT-031
🛡️

Missing Input Validation

@tool functions accept raw strings from LLM output without type checks, allowlists, or sanitization.

AGENT-034 AGENT-026
🔌

Unsafe MCP Servers

No auth on SSE/HTTP transport, no version pinning, overly broad filesystem access, sensitive env leakage.

AGENT-029 AGENT-030 AGENT-033
⚙️

No Guardrails

Agents running without iteration limits, human approval, or kill switches. Cascading failures waiting to happen.

AGENT-028 AGENT-037 AGENT-024

Benchmark results

Evaluated on Agent-Vuln-Bench (19 samples, 3 vulnerability categories) against Bandit and Semgrep.

Tool Recall Precision F1
agent-audit 94.6% 87.5% 0.91
Bandit 1.8 29.7%
100% 0.46
Semgrep 1.x 27.0%
100% 0.43
Category agent-audit Bandit Semgrep
Set A — Injection / RCE 100% 68.8% 56.2%
Set B — MCP Config 100% 0% 0%
Set C — Data / Auth 84.6% 0% 7.7%
Validation snapshot as of 2026-02-19, v0.16 benchmark set.
Benchmark Details →   Competitive Comparison →
9
OSS Targets
10/10
OWASP Categories

How it works

Four specialized scanners feed into a unified rule engine. ~17,600 lines of Python.

Input
Source Files (.py, .json, .yaml, .env ...)
Scanners
PythonScanner
AST analysis · taint tracking · 3,973 lines
SecretScanner
40+ regex · 3-stage semantic filter · 835 lines
MCPConfigScanner
JSON/YAML parsing · provenance · auth · 1,082 lines
PrivilegeScanner
Daemon · sudoers · sandbox · 1,138 lines
Analysis Layer
TaintTracker
Source → sink reachability · BFS
SemanticAnalyzer
Entropy · placeholder · framework context
DangerousOpAnalyzer
Heuristic gating · safe patterns
Rule Engine → Output
RuleEngine
49 rules · dedup · confidence tiers
SARIF
JSON
Terminal

Full OWASP Agentic Top 10 coverage

49 detection rules mapped to every category of the OWASP Agentic Top 10 (2026).

01

Agent Goal Hijack

4 rules · prompt injection via f-string, concat
02

Tool Misuse

9 rules · @tool input to subprocess, SQL, SSRF
03

Identity & Privilege

4 rules · daemon escalation, excessive servers
04

Supply Chain

5 rules · unverified MCP source, unpinned npx
05

Code Execution

3 rules · eval/exec in tool without sandbox
06

Memory Poisoning

2 rules · unsanitized vector store upsert
07

Inter-Agent Comm

1 rule · multi-agent HTTP without TLS
08

Cascading Failures

3 rules · AgentExecutor without max_iterations
09

Trust Exploitation

6 rules · critical ops without human_in_the_loop
10

Rogue Agents

3 rules · no kill switch, no monitoring

Who is this for

👩‍💻

Agent Developers

Building with LangChain, CrewAI, AutoGen, OpenAI Agents SDK, or raw function-calling. Run it before every deploy.

🔒

Security Engineers

Reviewing agent codebases. Get structured SARIF reports for the GitHub Security tab. Integrate into existing CI gates.

🔌

MCP Server Operators

Validate your mcp.json and claude_desktop_config.json for secrets, auth gaps, and supply chain risks.