Find vulnerabilities in your agent code before they reach production.
Static analysis built for tool-calling, MCP configs, and prompt flows.
Install, scan, and gate your CI — all from the terminal.
Seven categories of agent-specific vulnerabilities that general SAST tools miss entirely.
User input flows to exec(), subprocess, or SQL without sanitization. Taint-tracked from source to sink.
User input concatenated into system prompts via f-strings, .format(), or string concatenation.
API keys hardcoded in source or MCP configs. Three-stage filtering: regex → entropy + placeholder → context.
@tool functions accept raw strings from LLM output without type checks, allowlists, or sanitization.
No auth on SSE/HTTP transport, no version pinning, overly broad filesystem access, sensitive env leakage.
Agents running without iteration limits, human approval, or kill switches. Cascading failures waiting to happen.
Evaluated on Agent-Vuln-Bench (19 samples, 3 vulnerability categories) against Bandit and Semgrep.
| Tool | Recall | Precision | F1 |
|---|---|---|---|
| agent-audit | 94.6% | 87.5% | 0.91 |
| Bandit 1.8 | 29.7% | 100% | 0.46 |
| Semgrep 1.x | 27.0% | 100% | 0.43 |
| Category | agent-audit | Bandit | Semgrep |
|---|---|---|---|
| Set A — Injection / RCE | 100% | 68.8% | 56.2% |
| Set B — MCP Config | 100% | 0% | 0% |
| Set C — Data / Auth | 84.6% | 0% | 7.7% |
Four specialized scanners feed into a unified rule engine. ~17,600 lines of Python.
49 detection rules mapped to every category of the OWASP Agentic Top 10 (2026).
Building with LangChain, CrewAI, AutoGen, OpenAI Agents SDK, or raw function-calling. Run it before every deploy.
Reviewing agent codebases. Get structured SARIF reports for the GitHub Security tab. Integrate into existing CI gates.
Validate your mcp.json and claude_desktop_config.json for secrets, auth gaps, and supply chain risks.