CsmithEdge: more effective compiler testing by handling undefined behaviour less conservatively

File: Even-Mendoza2022_Article_CsmithEdgeMoreEffectiveCompile.pdf
Description: Published version
Size: 1.72 MB
Format: Adobe PDF
Title: CsmithEdge: more effective compiler testing by handling undefined behaviour less conservatively
Authors: Even-Mendoza, K
Cadar, C
Donaldson, A
Item Type: Journal Article
Abstract: Compiler fuzzing techniques require a means of generating programs that are free from undefined behaviour (UB) to reliably reveal miscompilation bugs. Existing program generators such as CSMITH achieve UB-freedom by heavily restricting the form of generated programs. The idiomatic nature of the resulting programs risks limiting the test coverage they can offer, and thus the compiler bugs they can discover. We investigate the idea of adapting existing fuzzers to be less restrictive concerning UB, in the practical setting of C compiler testing via a new tool, CSMITHEDGE, which extends CSMITH. CSMITHEDGE probabilistically weakens the constraints used to enforce UB-freedom, thus generated programs are no longer guaranteed to be UB-free. It then employs several off-the-shelf UB detection tools and a novel dynamic analysis to (a) detect cases where the generated program exhibits UB and (b) determine where CSMITH has been too conservative in its use of safe math wrappers that guarantee UB-freedom for arithmetic operations, removing the use of redundant ones. The resulting UB-free programs can be used to test for miscompilation bugs via differential testing. The non-UB-free programs can still be used to check that the compiler under test does not crash or hang. Our experiments on recent versions of GCC, LLVM and the Microsoft Visual Studio Compiler show that CSMITHEDGE was able to discover 7 previously unknown miscompilation bugs (5 already fixed in response to our reports) that could not be found via intensive testing using CSMITH, and 2 compiler-hang bugs that were fixed independently shortly before we considered reporting them.
Issue Date: 8-Jul-2022
Date of Acceptance: 6-Apr-2022
URI: http://hdl.handle.net/10044/1/96987
DOI: 10.1007/s10664-022-10146-1
ISSN: 1382-3256
Publisher: Springer
Journal / Book Title: Empirical Software Engineering: an international journal
Volume: 27
Copyright Statement: © The Author(s) 2022. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Sponsor/Funder: Engineering & Physical Science Research Council (EPSRC)
European Research Council (ERC)
Engineering & Physical Science Research Council (E
Funder's Grant Number: EP/R011605/1
819141
Ref: 542716
Keywords: Science & Technology
Technology
Computer Science, Software Engineering
Computer Science
Compilers
Fuzzing
Csmith
GCC
LLVM
MSVC
AUTOMATIC-GENERATION
BUGS
Software Engineering
0803 Computer Software
Publication Status: Published
Article Number: 129
Appears in Collections: Computing
Faculty of Engineering