MCPShield is the most comprehensive MCP security scanner available. Here are our benchmark results, detection methodology, and public dataset.
Malicious + safe tool descriptions
50K samples from academic benchmark dataset
Poisoned tool descriptions
All 485 malicious descriptions detected by regex scanner
Crafted evasion descriptions
4 evasion techniques: semantic rephrasing, legitimacy framing, multi-step, jargon obfuscation
In-the-wild malicious MCP server
Throwaway account, .exe lure, 23 zero-width characters detected
We tested both our regex scanner and LLM judge (Claude Haiku) against all 485 MCPTox malicious tool descriptions to measure the marginal value of each detection layer.
The regex engine catches all known attack patterns. The LLM judge is conservative by design (low false positives) and provides value on novel semantic attacks that regex patterns haven't been written for yet.
We crafted 20 adversarial tool descriptions using evasion techniques from the AutoMalTool paper (arxiv 2509.21011). Both regex and LLM caught every single one.
"compliance validation" instead of "exfiltrate"
"built-in telemetry for quality assurance"
Attack split across tool chain references
"inode metadata traversal" for filesystem theft
Static pattern matching against tool descriptions, source code, and metadata. Covers OWASP MCP Top 10.
Strengths: Fast, deterministic, zero false negatives on known patterns
Deep inspection of every text surface in tool metadata: descriptions, annotations, parameter defaults, enum values, comments, examples, and nested properties.
Strengths: Catches injection in overlooked fields like inputSchema.$comment
AI-powered semantic analysis of tool descriptions. Detects manipulation that regex cannot catch through understanding intent.
Strengths: Catches novel semantic attacks, provides human-readable explanations
We scanned 1,901 MCP server repositories from GitHub and 60 live HTTP endpoints. Results are available as an open dataset.