A CIL Tutorial - Department of Computer Science - ETH Zürich
CHAPTER 15. AUTOMATED TEST GENERATION

    void (autotest foo)(int input a, int input b)
    {
      int c, d, e;
      c = a * b;
      d = a + b;
      e = c - d;
      if (e == 14862436) explode();
      if (d == 700) explode();
      return;
    }

Figure 15.1: Example

    let a = a0 in
    let b = b0 in
    let c = a * b in
    let d = a + b in
    let e = c - d in
    (e != 14862436) && (d != 700)

Figure 15.2: Path Condition
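To make the relationship between Figures 15.1 and 15.2 concrete, the following C sketch restates foo as a function that reports which explode() call a given input reaches, next to the path condition written as a predicate over the initial inputs a0 and b0. The names foo_outcome, path_condition, and check_figure_examples are hypothetical and are not part of the tutorial's code. The inputs are picked by hand here; in the approach described in this chapter they would be produced by the SMT solver. Since e = a*b - (a + b) = (a - 1)*(b - 1) - 1, the first test succeeds exactly when (a - 1)*(b - 1) = 14862437, for example with a = 2 and b = 14862438.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical restatement of foo from Figure 15.1: reports which
 * explode() call the inputs reach (0 = none, 1 = first, 2 = second). */
static int foo_outcome(int a, int b)
{
    int c = a * b;
    int d = a + b;
    int e = c - d;
    if (e == 14862436) return 1;
    if (d == 700) return 2;
    return 0;
}

/* The path condition of Figure 15.2 as a predicate over the initial
 * inputs: true exactly when a run falls through both conditionals. */
static bool path_condition(int a0, int b0)
{
    int a = a0, b = b0;
    int c = a * b;
    int d = a + b;
    int e = c - d;
    return (e != 14862436) && (d != 700);
}

/* Hand-picked inputs standing in for models returned by a solver. */
static void check_figure_examples(void)
{
    /* An arbitrary starting input satisfies the path condition. */
    assert(path_condition(1, 1) && foo_outcome(1, 1) == 0);
    /* Negating the first conjunct: (a - 1) * (b - 1) = 14862437. */
    assert(!path_condition(2, 14862438) && foo_outcome(2, 14862438) == 1);
    /* Keeping the first conjunct, negating the second: a + b = 700. */
    assert(!path_condition(0, 700) && foo_outcome(0, 700) == 2);
}
```

Each input that falsifies one conjunct of the path condition drives execution down a branch that the original run did not take, which is the property the test generator exploits.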
15.1 Background

This approach goes by many names, including directed automated random testing, concolic testing, whitebox fuzzing, and smart fuzzing. Researchers have created a number of tools implementing the approach, and variations on it, for a number of different languages. These include DART [3], CUTE [4], CREST [1], and PEX [5].

In particular, CREST also uses CIL as its compiler front-end and Yices [2] as its SMT solver, as we will here, and it is much more complete than the implementation in this tutorial. Therefore, as a starting point for further investigation of automated test generation based on CIL, CREST is likely to be a more appropriate choice. However, one possible advantage of this simpler tutorial implementation is that it uses OCaml rather than C++ to implement the calls to the SMT solver, relying on the features of the OCaml runtime that allow OCaml functions to be called from C code.

Since these more complete tools exist, we'll make some simplifying assumptions for the purposes of this tutorial. In particular, this implementation will handle only scalar values, and regular and null-terminated arrays of scalar values. That is, struct and union types are not handled. Also, only functions annotated autotest and instrument will be instrumented for symbolic execution. If a non-instrumented function is called from within an autotest function, only its concrete return value will be used in the path condition. In other words, inputs will not be generated to explore functions not annotated autotest or instrument. Finally, our path-exploration algorithm will give up when the SMT solver is unable to generate a new model for any of the available branches whose sense could be flipped. A more complete implementation would avoid getting stuck in this case.
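The exploration strategy described above, including the point at which it gives up, can be sketched in a few lines of C. This is a minimal illustration under loud assumptions, not the tutorial's implementation: the function under test (run) is a hypothetical variant of Figure 15.1 with small constants, the names record, matches, solve, and explore are invented for this sketch, and a brute-force search over a small input range stands in for the SMT solver. The bookkeeping follows the depth-first scheme popularized by DART: after each run, negate the deepest branch that has not yet been flipped, ask the "solver" for inputs that follow the same prefix and take the other side, and stop when no flippable branch yields new inputs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BRANCHES 8
#define RANGE 20 /* bound of the brute-force search standing in for the solver */

/* Outcomes of the conditionals seen on one run, in execution order. */
struct path { bool taken[MAX_BRANCHES]; size_t len; };

static bool record(struct path *p, bool outcome)
{
    assert(p->len < MAX_BRANCHES);
    p->taken[p->len++] = outcome;
    return outcome;
}

/* Hypothetical function under test, shaped like Figure 15.1 but with
 * small constants. Returns true if an "explode" branch is reached. */
static bool run(int a, int b, struct path *p)
{
    int c = a * b, d = a + b, e = c - d;
    p->len = 0;
    if (record(p, e == 11)) return true;
    if (record(p, d == 7)) return true;
    return false;
}

/* True if got begins with exactly the outcomes listed in want. */
static bool matches(const struct path *got, const struct path *want)
{
    if (got->len < want->len) return false;
    for (size_t i = 0; i < want->len; i++)
        if (got->taken[i] != want->taken[i]) return false;
    return true;
}

/* Solver stand-in: search a small range for inputs that follow want. */
static bool solve(const struct path *want, int *a, int *b)
{
    for (int x = -RANGE; x <= RANGE; x++)
        for (int y = -RANGE; y <= RANGE; y++) {
            struct path got = { { false }, 0 };
            run(x, y, &got);
            if (matches(&got, want)) { *a = x; *b = y; return true; }
        }
    return false;
}

/* Depth-first exploration; returns how many runs reached "explode". */
static int explore(int a, int b)
{
    bool flipped[MAX_BRANCHES] = { false };
    int failures = 0;
    for (;;) {
        struct path p = { { false }, 0 };
        bool progressed = false;
        if (run(a, b, &p)) failures++;
        /* Try to flip the deepest branch that has not been flipped yet. */
        for (size_t i = p.len; i > 0 && !progressed; i--) {
            size_t k = i - 1;
            if (flipped[k]) continue;
            flipped[k] = true;
            struct path want = p;
            want.len = k + 1;
            want.taken[k] = !p.taken[k];
            if (solve(&want, &a, &b)) {
                /* Branches below the flipped one are new territory. */
                for (size_t j = k + 1; j < MAX_BRANCHES; j++) flipped[j] = false;
                progressed = true;
            }
        }
        if (!progressed) return failures; /* no branch left to flip: give up */
    }
}
```

Starting from explore(0, 0), the sketch first flips the second conditional, then the first, and then stops because every branch on the last path has already been flipped. A real solver can also fail on a branch that is feasible, which is the situation in which the text notes that the tutorial's algorithm gets stuck.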
15.2 Organization

A bit more code than in previous tutorials is required to implement these features, so instead of listing and commenting on all of it, we'll take a short tour through a few select functions, types, and modules to get an idea of how the code works and of the high-level ideas behind it.

15.2.1 Instrumentation

The code that uses CIL to instrument a program with calls to the SMT solver is in the source file src/tut15.ml. Before carrying out the instrumentation, however, we use CIL's Simplify module to break down complex expressions and l-values. A full description of its effects can be found in the CIL documentation. For now, it suffices to point out that expressions are simplified to the extent that all binary and unary operations operate only on constants or l-values. The Simplify module achieves this by introducing additional temporary variables and assignments.

The instrumentation calls notify the automated testing runtime of a number of important events: assignments, conditionals, function calls and returns, and entering and leaving an autotest function. For assignments and conditionals, the calls are passed both the addresses and the values of the operands and results. Including the concrete values allows the symbolic execution to under-approximate the concrete execution when the SMT solver lacks a theory for some operation performed by the program. In particular, instead of representing the operation symbolically, the SMT solver can under-approximate the program's behavior by using the concrete values. This under-