GrammaTalk

How Sound Static Analysis Complements Heuristic Analysis


Not all static analysis tools work the same way. There is, in fact, a spectrum of tools using a variety of techniques, ranging from relatively simple syntactic analysis to sophisticated abstract-interpretation-style algorithms that reason about potential executions. Each approach has its strengths and weaknesses, and tools such as GrammaTech CodeSonar often combine syntax analysis and abstract interpretation to achieve their goals. Sound static analyzers are a lesser-known flavor of analyzer that aim to prove the absence of runtime errors in code. These tools use formal methods to prove that certain properties of a program hold (e.g., “the program is memory safe”). However, this approach has limitations in areas such as ease of use, scalability, and applicability to code with dynamic memory allocation, threading, third-party libraries, and more. The more productive approach, then, is to consider how to combine the benefits of multiple static analysis tools so that they complement each other.

Program Soundness Analysis

The term “soundness,” as it applies to static analysis, means that if a bug exists in the code, a sound analysis is guaranteed to report it. The key property of these tools is that if they cannot rigorously prove that a defect is absent, they must report the potential defect as a warning. In many cases this leads either to a high rate of false positives or to the need to add annotations to the code to satisfy the proof system. The user then has to review the generated warnings and update the annotations as needed. In small bodies of code this is manageable; in large bodies of code, it can quickly become overwhelming.

One example of such a tool is Frama-C, “a suite of tools dedicated to the analysis of the source code of software written in C.” Frama-C supports plug-ins, and of interest here are the deductive verification plug-ins such as WP (weakest precondition). The tool relies on code annotations written in ACSL (ANSI/ISO C Specification Language), which express the properties the programmer wishes to establish about a section of code. From these annotations the tool generates proof obligations which, in turn, a “prover” plug-in must discharge.
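As a minimal sketch (a textbook-style example, not code from the ANSSI case study or any real project), the contract and loop annotations below show the kind of ACSL that WP consumes:

#include <stddef.h>

/* Returns the largest element of a[0 .. n-1]; the ACSL contract states this
   formally and also implies the absence of out-of-bounds reads. */
/*@ requires n > 0;
  @ requires \valid_read(a + (0 .. n - 1));
  @ assigns \nothing;
  @ ensures \forall integer k; 0 <= k < n ==> a[k] <= \result;
  @ ensures \exists integer k; 0 <= k < n && a[k] == \result;
  @*/
int max_element(const int *a, size_t n)
{
    int max = a[0];
    size_t i;
    /*@ loop invariant 1 <= i <= n;
      @ loop invariant \forall integer k; 0 <= k < i ==> a[k] <= max;
      @ loop invariant \exists integer k; 0 <= k < i && a[k] == max;
      @ loop assigns i, max;
      @ loop variant n - i;
      @*/
    for (i = 1; i < n; i++) {
        if (a[i] > max)
            max = a[i];
    }
    return max;
}

A command along the lines of frama-c -wp file.c asks WP to generate the proof obligations and pass them to an automated prover. If every obligation is discharged, the contract holds for all inputs satisfying the precondition; if not, the function is reported as unproven and the annotations or the code must be revisited.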

This paper by the French National Cybersecurity Agency (ANSSI) documents the application of Frama-C to show the absence of runtime errors in an X.509 parser. The results show that there is a learning curve to using the tool and writing the annotations, but the benefits are worth it for this critical subset of the full application. There are restrictions on the code constructs the tool handles, such as variable-length arrays, dynamic memory allocation, function pointers, and external dependencies, which may limit its use on larger code bases. When their conditions are met, sound tools can prove the absence of certain runtime errors, which is valuable in safety- and security-critical applications. Tools like CodeSonar, which use abstract interpretation, warn developers of potential issues, but the absence of reported errors is not proof of the absence of errors; the big advantage is that such tools are easier to apply and scale well.

Heuristic Analysis

Commercial tools such as CodeSonar must deal with real-world C and C++ code in projects of millions of lines of code that interface with hundreds of APIs and may be compiled with dozens of different compiler toolchains for many platforms. As well as finding the deep semantic errors that lead to serious bugs, these tools can find where the programmer has misused an API or violated the rules of safety and security coding standards. The analysis is supported by features that help teams manage the results and collaborate on fixing the discovered problems.

In order to handle such a wide variety of programs at such a large scale and deliver real value, CodeSonar necessarily makes approximations. The analysis techniques it uses are very similar to those used by the sound tools, but it additionally employs heuristics to generate actionable results in a reasonable amount of time.

In contrast to sound tools, which focus on verifying that everything in the code is correct, CodeSonar looks for positive evidence that there is a mistake in the code. Since most code works well most of the time, this heuristic approach is in many cases the most effective way to find those lingering bugs.
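As a contrived illustration (hypothetical code, not taken from any real project), the fragment below shows the kind of internal contradiction that gives a heuristic analyzer positive evidence of a bug: a pointer that is dereferenced before it is checked against NULL.

#include <stddef.h>
#include <string.h>

size_t header_length(const char *hdr)
{
    size_t len = strlen(hdr);   /* hdr is dereferenced here ...                */
    if (hdr == NULL)            /* ... yet checked for NULL afterwards: either
                                   the check is dead code or the call above can
                                   crash, and that inconsistency is what the
                                   analyzer reports                             */
        return 0;
    return len;
}

No proof of correctness is attempted; the tool simply observes that the programmer’s own NULL check contradicts the earlier unconditional dereference, which is strong evidence that one of the two is wrong.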

Working Together in a Commercial Environment

In a previous post, we discussed integrating the results from Clang into CodeSonar, using SARIF as the interchange format. As with Clang and other tools that report warnings or errors on source code, it is possible to integrate the results into CodeSonar’s repository, web portal, and reports. It is important to realize that reporting warnings is only one aspect of a static analysis tool; being able to track those warnings throughout the software development process and allowing development teams to collaborate on reviewing them is another.

Consider, for example, a safety-critical project that relies on a soundness analyzer to verify the safety functions of an industrial robot. These functions are contained in a relatively small piece of code, are well suited to soundness analysis, and require proof that they are free of errors. Not all parts of the system require this rigor, but the rest of the code still needs to be checked for coding standard compliance (e.g., MISRA C) and for serious runtime errors, which can be handled by another tool such as CodeSonar. This approach leverages the strengths of each tool and provides the project infrastructure needed for analysis and reporting. CodeSonar is well suited to combining these results through its SARIF interchange capability.

 
