Abstract
Static analysis is now an essential component of many software development toolsets, attracting significant research interest and practical application. Unfortunately, the ever-increasing complexity of static analyzers makes their implementation error-prone. At the same time, the correctness and reliability of static analyzers are critical if they are to be integrated into production compilers and development environments. While there have been some notable successes in the validation of compilers, comparatively little work exists on the systematic validation of static analyzers. Contributing factors may be the intrinsic difficulty of formally verifying such complex code and of finding suitable oracles for testing it. In this paper, we propose a simple, automatic method for testing abstract interpretation-based static analyzers. Broadly, it consists of checking, over a suite of benchmarks, that the properties inferred statically are satisfied dynamically. The main advantage of our approach is its simplicity, which stems directly from framing it within the Ciao assertion-based validation framework and its blended static/dynamic assertion checking. We show that in this setting the analysis can be tested with little effort by combining the following components already present in the framework: 1) the static analyzer, which outputs its results as the original program source with assertions interspersed; 2) the assertion run-time checking mechanism, which instruments a program to ensure that no assertion is violated at run time; 3) the random test case generator, which generates random test cases satisfying the properties present in assertion preconditions; and 4) the unit-test framework, which executes those test cases. We show how these elements, combined with a trivial program transformation, form a tool that can effectively discover and locate errors in the different components of the static analyzer.
We apply our approach to test some of CiaoPP’s analysis domains over a wide range of programs, finding non-trivial, previously undetected bugs with little effort.