Testing Methodology
How we validate that Vigile catches real vulnerabilities without false alarms.
Security tools have two failure modes: missing real vulnerabilities (false negatives) and flagging safe configurations (false positives). Both erode trust. We test for both using a five-layer testing pyramid adapted from established security testing standards.
Security Tool Testing Pyramid
Adversarial Testing
Can an attacker evade detection? We test obfuscation, encoding tricks, and novel attack patterns that attempt to bypass the scanner.
- Multi-layer obfuscation and encoding bypass attempts
- Novel exfiltration channels not covered by existing patterns
- Context-dependent payloads that change behavior based on parser
- Time-delayed activation designed to evade initial scans
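One way to picture a multi-layer encoding test is a normalizer that peels encodings off a payload and checks every intermediate form. The sketch below is illustrative only: `SUSPICIOUS`, `peel_layers`, and `is_flagged` are hypothetical names, the single regex stands in for a real rule set, and only base64 and percent-encoding are handled. It is not Vigile's actual detection code.

```python
import base64
import re
from typing import Optional
from urllib.parse import unquote

# Hypothetical stand-in for a real detection rule.
SUSPICIOUS = re.compile(r"curl\s+https?://", re.IGNORECASE)


def _try_base64(text: str) -> Optional[str]:
    """Return the decoded text if `text` is strict base64 that decodes to UTF-8."""
    try:
        return base64.b64decode(text, validate=True).decode("utf-8")
    except ValueError:  # covers binascii.Error and UnicodeDecodeError
        return None


def peel_layers(payload: str, max_depth: int = 5) -> list[str]:
    """Return the payload followed by each successively decoded form."""
    forms = [payload]
    current = payload
    for _ in range(max_depth):
        decoded = _try_base64(current.strip())
        if decoded is None:
            decoded = unquote(current)  # fall back to percent-decoding
        if decoded == current:
            break  # nothing left to peel
        forms.append(decoded)
        current = decoded
    return forms


def is_flagged(payload: str) -> bool:
    """Flag the payload if any decoded layer matches the pattern."""
    return any(SUSPICIOUS.search(form) for form in peel_layers(payload))
```

An adversarial test case would then assert that a payload wrapped in two or more encodings is flagged just as the plain payload is.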
Real-World Validation
Does the scanner work on unknown targets? We test against real-world MCP servers and deployed applications to measure accuracy at scale.
- Scan public MCP servers from community registries
- Validate findings against manually audited ground truth
- Measure detection rate vs. false positive rate at scale
- Cross-reference with other security tools for coverage gaps
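The two rates named above can be computed from a manually audited ground truth and the set of targets the scanner flagged. This is a minimal sketch using the standard definitions (detection rate as TP / (TP + FN), false positive rate as FP / (FP + TN)); the `Rates` type, the `score` function, and the target names are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    detection_rate: float       # TP / (TP + FN)
    false_positive_rate: float  # FP / (FP + TN)


def score(ground_truth: dict[str, bool], flagged: set[str]) -> Rates:
    """Compare scanner output with audited labels (True = really vulnerable)."""
    tp = fn = fp = tn = 0
    for target, vulnerable in ground_truth.items():
        hit = target in flagged
        if vulnerable and hit:
            tp += 1
        elif vulnerable:
            fn += 1
        elif hit:
            fp += 1
        else:
            tn += 1
    return Rates(
        detection_rate=tp / (tp + fn) if tp + fn else 0.0,
        false_positive_rate=fp / (fp + tn) if fp + tn else 0.0,
    )
```

Reporting both numbers together matters: a scanner that flags everything has a perfect detection rate and a useless false positive rate.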
Regression Corpus
Does fixing one thing break another? Every confirmed vulnerability becomes a permanent test case. The corpus only grows.
- Confirmed findings become permanent regression test cases
- Pattern changes must pass the full regression suite before merge
- Historical findings re-tested against every scanner release
- Corpus includes both true positives and confirmed false positives
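A regression corpus of this kind can be sketched as a list of samples, each with a recorded verdict, replayed against the scanner on every change. The `CorpusCase` shape, the `run_corpus` helper, and the `scan` callable are hypothetical; the real corpus format is not described here.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class CorpusCase:
    case_id: str
    sample: str
    expect_finding: bool  # True: confirmed vulnerability; False: confirmed false positive


def run_corpus(cases: Iterable[CorpusCase], scan: Callable[[str], bool]) -> list[str]:
    """Return the IDs of cases where the scanner disagrees with the recorded verdict."""
    return [c.case_id for c in cases if scan(c.sample) != c.expect_finding]
```

A merge gate would require `run_corpus(...)` to return an empty list. Keeping confirmed false positives in the same corpus means a pattern change that reintroduces old noise fails in the same way as one that misses an old vulnerability.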
Known-Vulnerable Fixtures
Does the scanner catch what it claims to catch? We maintain deliberately misconfigured targets that trigger every detection rule, alongside secure baselines that must produce zero findings.
- Purpose-built vulnerable targets covering all BaaS detection categories
- Secure baseline projects validated for zero false positives
- Both Supabase and Firebase configurations tested end-to-end
- Tests run against deployed cloud infrastructure, not mocks
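The two fixture guarantees (every rule fires on some vulnerable target, and the secure baseline stays silent) reduce to a pair of set checks once the scan results are collected. The sketch below assumes findings are identified by rule ID; the function name and the rule IDs in the usage example are made up for illustration.

```python
from typing import Iterable


def audit_fixtures(
    all_rules: Iterable[str],
    vulnerable_hits: Iterable[str],
    baseline_hits: Iterable[str],
) -> dict[str, list[str]]:
    """Summarize fixture problems; both lists should be empty on a healthy run."""
    return {
        # Rules that no vulnerable fixture managed to trigger.
        "untriggered": sorted(set(all_rules) - set(vulnerable_hits)),
        # Any finding on a secure baseline counts as a false positive.
        "baseline_noise": sorted(set(baseline_hits)),
    }
```

For example, `audit_fixtures({"SB-RLS", "SB-KEY"}, {"SB-RLS"}, set())` reports `SB-KEY` as untriggered, which signals either a missing fixture or a broken rule.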
Unit Tests
Does the detection logic work in isolation? Every pattern, parser, and scanner module has dedicated unit tests covering both positive and negative cases.
- Comprehensive coverage across pattern matching, probe logic, and bundle analysis
- Per-module test isolation with mocked API responses
- All tests run in CI on every commit, under a zero-flaky-tests policy
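A unit test with a mocked API response might look like the following. The probe `table_is_publicly_readable`, its `fetch` parameter, and the request path are assumptions made for this sketch rather than Vigile's real module layout; the point is that the test covers one positive and one negative case without touching the network.

```python
import unittest
from unittest.mock import Mock


def table_is_publicly_readable(fetch, table: str) -> bool:
    """Hypothetical probe: True when an anonymous request returns at least one row."""
    response = fetch(f"/rest/v1/{table}?select=*&limit=1")
    return response.status_code == 200 and len(response.json()) > 0


def fake_fetch(status: int, body: list) -> Mock:
    """Build a mock fetcher that always returns the given status and JSON body."""
    response = Mock(status_code=status)
    response.json.return_value = body
    return Mock(return_value=response)


class TableProbeTest(unittest.TestCase):
    def test_flags_table_that_returns_rows(self):
        fetch = fake_fetch(200, [{"id": 1}])
        self.assertTrue(table_is_publicly_readable(fetch, "profiles"))
        fetch.assert_called_once_with("/rest/v1/profiles?select=*&limit=1")

    def test_ignores_table_that_returns_nothing(self):
        self.assertFalse(table_is_publicly_readable(fake_fetch(200, []), "profiles"))

    def test_ignores_unauthorized_response(self):
        self.assertFalse(table_is_publicly_readable(fake_fetch(401, []), "profiles"))
```

Saved as a module, this runs with `python -m unittest`. Injecting `fetch` as a parameter is what keeps the test deterministic, which is a precondition for a no-flaky-tests policy.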
BaaS Scanner Coverage
Vigile tests Supabase and Firebase projects for security misconfigurations that expose data or allow unauthorized access.
Supabase — 7 security checks
| Category | Description | Severity |
|---|---|---|
| Data Access Control | Tables with RLS disabled, exposing data to anonymous reads | critical |
| Credential Exposure | Privileged keys leaked in client-side bundles | critical |
| Write Access | Unauthorized mutations allowed through the public API | critical |
| Authentication Config | Weak signup and confirmation settings | medium |
| Transport Security | Overly permissive CORS policies | medium |
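As an illustration of the Credential Exposure category, the sketch below looks for JWT-shaped strings in bundle text and decodes the payload to read its `role` claim. It assumes the JWT-style Supabase keys, whose payload carries a role such as `anon` or `service_role`; other key formats are not covered. The function names are hypothetical, no signature verification is attempted, and this is not presented as Vigile's implementation.

```python
import base64
import json
import re
from typing import Optional

# Three base64url segments separated by dots, starting with an encoded '{"'.
JWT_PATTERN = re.compile(r"eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")


def jwt_role(token: str) -> Optional[str]:
    """Return the `role` claim from a JWT payload, or None if it cannot be read."""
    payload = token.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)  # restore stripped padding
    try:
        claims = json.loads(base64.urlsafe_b64decode(padded))
    except ValueError:  # bad base64, bad UTF-8, or bad JSON
        return None
    return claims.get("role") if isinstance(claims, dict) else None


def find_privileged_keys(bundle: str) -> list[str]:
    """Return JWTs in the bundle text whose role claim is `service_role`."""
    return [t for t in JWT_PATTERN.findall(bundle) if jwt_role(t) == "service_role"]
```

A public `anon` key in a client bundle is expected, so the check keys off the role claim and not the mere presence of a token.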
Firebase — 5 security checks
| Category | Description | Severity |
|---|---|---|
| Database Rules | Firestore and Realtime Database with public read/write access | critical |
| Config Exposure | Firebase project config leaked in client-side JavaScript | high |
| Storage Security | Cloud Storage buckets with public listing or access | high |
| Hosting Headers | Missing security headers on Firebase Hosting | medium |
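The Hosting Headers category can be sketched as a case-insensitive comparison between a response's headers and a required set. The header list below is an assumption chosen for illustration, not the set Vigile actually enforces, and `missing_security_headers` is a hypothetical helper.

```python
# Assumed set of headers for this sketch; the real rule may differ.
REQUIRED_HEADERS = (
    "content-security-policy",
    "strict-transport-security",
    "x-content-type-options",
    "x-frame-options",
)


def missing_security_headers(headers: dict[str, str]) -> list[str]:
    """Return required headers absent from the response (names compared case-insensitively)."""
    present = {name.lower() for name in headers}
    return [name for name in REQUIRED_HEADERS if name not in present]
```

HTTP header names are case-insensitive, so lowercasing before the comparison avoids false positives on responses that send `X-Frame-Options` in mixed case.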
Standards Alignment
Our methodology draws from established security testing practices. Where no standard exists for AI agent security tooling, we derive from the closest analogue.
| Established Standard | Vigile Equivalent |
|---|---|
| OWASP WebGoat / Juice Shop | Purpose-built vulnerable fixtures for BaaS platforms |
| Nuclei Template Testing | Per-pattern unit tests with fixture matching |
| Snyk Vulnerability DB Methodology | Regression corpus with ground truth validation |
| NIST SP 800-53 (CA-7) | Continuous monitoring via automated test suite |
Transparency
The Vigile CLI scanner is open source under Apache 2.0. Security tools earn trust through transparency — you can verify how the scanner operates on your infrastructure. Our detection engine is continuously refined using proprietary threat intelligence and real-world validation data.