Suppose you hire a consultancy to perform a black-box assessment of your software. After executing the test, the firm produces a report outlining several vulnerabilities in your application. You remediate the vulnerabilities, submit the application for re-testing, and the next report comes back “clean” (i.e., no vulnerabilities found). At best, this tells you that the same testers could not break into your application again within the same time frame. It doesn't, however, tell you:
- What are the potential threats to your application?
- Which threats is your application “not vulnerable” to?
- Which threats did the testers not assess your application for? Which threats could not be tested from a runtime perspective?
- How did time and other constraints on the test affect the reliability of the results? For example, if the testers had five more days, what other security tests would they have executed?
- What was the skill level of the testers, and would a different tester or another consultancy produce the same set of results?
In our experience, organizations can't answer most of these questions: the tester doesn't understand the application's internals, and the organization requesting the test doesn't know much about the security posture of its own software. We're not the only ones who acknowledge this issue: Haroon Meer discussed the challenges of penetration testing at 44con. Most of these issues apply to every form of verification: automated dynamic testing, automated static testing, manual penetration testing, and manual code review. In fact, a recent paper describes similar challenges in source code review.
The opaque nature of verification means that effective management of software security requirements is essential. With requirements listed, testers can specify both whether they assessed a particular requirement and the techniques they used to do so. Critics argue that penetration testers shouldn't follow a “checklist approach to auditing” because no checklist can cover the breadth of obscure and domain-specific vulnerabilities. Yet the flexibility to find unique issues does not obviate the need to verify well-understood requirements. The situation closely parallels standard software Quality Assurance (QA): good QA testers both verify functional requirements AND think outside the box about creative ways to break functionality. Testing blindly and reporting defects without verifying functional requirements would dramatically reduce the value of QA. Why accept a lower standard from security testing?
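To make this concrete, here is a minimal sketch (in Python; the requirement IDs, fields, and statuses are invented for illustration and follow no particular standard's schema) of a requirements list that records whether, and how, each requirement was verified:

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    """Outcome of verifying a single requirement."""
    PASSED = "passed"          # tested; no issue found
    FAILED = "failed"          # tested; vulnerability confirmed
    NOT_TESTED = "not tested"  # never assessed (or not assessable)


@dataclass
class Requirement:
    req_id: str            # hypothetical identifier, e.g. ASVS-style
    description: str
    status: Status = Status.NOT_TESTED
    technique: str = ""    # how it was verified (code review, DAST, ...)


def coverage_report(reqs):
    """Print which requirements were actually verified, and how."""
    tested = [r for r in reqs if r.status is not Status.NOT_TESTED]
    print(f"{len(tested)}/{len(reqs)} requirements verified")
    for r in reqs:
        print(f"  {r.req_id:9} {r.status.value:11} {r.technique}")


reqs = [
    Requirement("AUTH-1", "Passwords stored with an adaptive hash",
                Status.PASSED, "manual code review"),
    Requirement("INJ-1", "All database access uses parameterized queries",
                Status.FAILED, "manual penetration test"),
    Requirement("CRYPTO-2", "TLS config disallows weak cipher suites"),
]
coverage_report(reqs)
```

The structure forces exactly the distinction argued for above: a requirement that was tested and passed reads differently from one that was never assessed at all.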
Before you perform your next security verification activity, make sure you have software security requirements to measure against and that you define which requirements are in scope for the verification. If you engage manual penetration testers or source code reviewers, it should be relatively simple for them to specify which requirements they tested for. If you use an automated tool or service, work with your vendor to find out which requirements their tool or service cannot reliably test for. No tester, product, or service is likely to guarantee an absence of false negatives (i.e., to certify that your application is not vulnerable to SQL injection), but knowing what they did and did not test for can dramatically increase your confidence that your system does not contain known, preventable security flaws.
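Continuing the illustration (again with invented identifiers), one simple way to capture a vendor's answer is to record, per requirement, whether their tool can reliably test it, leaving the remainder as an explicit manual-verification backlog:

```python
# Hypothetical sketch: record, per requirement ID, whether the chosen
# automated tool can reliably test it (per the vendor), so coverage
# gaps are explicit before the engagement starts.
tool_coverage = {
    "INJ-1":    True,   # SQL injection: reliably testable by the scanner
    "AUTH-1":   False,  # password storage: requires code review
    "CRYPTO-2": True,   # TLS configuration: testable by the scanner
}

needs_manual = sorted(rid for rid, ok in tool_coverage.items() if not ok)
print("Requires manual verification:", ", ".join(needs_manual))
```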