1 parent 9a97713, commit f418ed9
Showing 12 changed files with 164 additions and 4 deletions.
@@ -0,0 +1,8 @@
## Definition
A type of [[Symbolic Execution 1]] that obtains a set of test cases that hit all possible branches in a program.
1. Start with a concrete input.
2. Run the program on it, collecting the path condition.
3. Choose one branch constraint in the path condition and negate it.
4. Solve the new path condition to get a new concrete input to try.
5. Repeat with the new concrete input, recording each input as a test case for the user.

It is a type of [[White-Box Fuzzing]].
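
The loop above can be sketched in a few lines of Python. This is a minimal illustration, not how a real concolic tool is built: `toy_program`, `solve` and `concolic` are hypothetical names, and the "solver" simply brute-forces a small integer range instead of calling an SMT solver.

```python
def toy_program(x: int, trace: list) -> str:
    """Toy program under test; every branch it takes is appended to `trace`."""
    def branch(name, pred):
        taken = pred(x)
        trace.append((name, pred, taken))
        return taken

    if branch("x > 10", lambda v: v > 10):
        if branch("x % 7 == 0", lambda v: v % 7 == 0):
            return "deep"
        return "big"
    return "small"


def solve(constraints, domain=range(-50, 51)):
    """Stand-in for a constraint solver: brute-force search over a small domain."""
    for v in domain:
        if all(pred(v) == want for _, pred, want in constraints):
            return v
    return None  # unsatisfiable within the domain


def concolic(seed: int) -> list:
    worklist, seen_paths, tests = [seed], set(), []
    while worklist:
        x = worklist.pop()
        trace = []
        toy_program(x, trace)                       # step 2: run, collect path condition
        path = tuple((name, taken) for name, _, taken in trace)
        if path in seen_paths:
            continue
        seen_paths.add(path)
        tests.append(x)                             # step 5: record input as a test case
        for i, (name, pred, taken) in enumerate(trace):
            # step 3: keep the prefix, negate the i-th branch constraint
            flipped = trace[:i] + [(name, pred, not taken)]
            new_input = solve(flipped)              # step 4: solve for a new input
            if new_input is not None:
                worklist.append(new_input)
    return tests


tests = concolic(seed=0)
```

Starting from the single seed `0`, the sketch discovers one input per path of `toy_program`.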
@@ -0,0 +1 @@
## Aims
@@ -0,0 +1,4 @@
## Definition
Static instrumentation of source code.
- Takes source files and instruments them, converting operations into calls to their symbolic counterparts.
- [EXE: Automatically Generating Inputs of Death](https://web.stanford.edu/~engler/exe-ccs-06.pdf)
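
As a loose analogy only (EXE itself instruments C, not Python), static instrumentation can be sketched with Python's `ast` module: the source is parsed, every `+` is rewritten into a call to a symbolic operation, and new source is emitted. The name `sym_add` is a hypothetical placeholder for such an operation.

```python
import ast


class Instrument(ast.NodeTransformer):
    """Rewrite `a + b` into `sym_add(a, b)` (sym_add is a hypothetical helper)."""

    def visit_BinOp(self, node: ast.BinOp) -> ast.AST:
        self.generic_visit(node)  # instrument nested expressions first
        if isinstance(node.op, ast.Add):
            call = ast.Call(
                func=ast.Name(id="sym_add", ctx=ast.Load()),
                args=[node.left, node.right],
                keywords=[],
            )
            return ast.copy_location(call, node)
        return node


def instrument(source: str) -> str:
    tree = Instrument().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # requires Python 3.9+


instrumented = instrument("y = a + b")
```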
@@ -0,0 +1,24 @@
## [Website](https://klee.github.io/)
## Summary
A symbolic execution tool that operates on [[LLVM IR]], so programs in languages that compile to LLVM (such as C/C++ through `clang`, or Rust through `rustc`) can be analysed.

It dynamically instruments the [[LLVM IR]] as it interprets it.

```bash
clang -c -emit-llvm bpf.c  # compile to LLVM bitcode (bpf.bc)
klee bpf.bc                # symbolically execute the bitcode
```
### Paths
Each path through the program is a branch of an execution tree. The executor maintains a set of states at different points in the tree and advances them to explore it incrementally.

| Name | Definition |
| ---- | ---- |
| Infeasible Path | A path whose path condition is unsatisfiable. These paths are not explored, as they can never be reached in a real execution. |
| Path Condition | The conjunction of constraints gathered along an execution path. |
### All-Value Checks
Can implicitly check for [[Generic Bugs]], such as issues with:
- pointer dereferencing
- array indexing
- division/modulo operations

It can also check for [[Functional Bugs]], for example through assert statements.
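
As an illustration of an all-value check, the sketch below asks, at a division, whether *any* input allowed by the current path condition makes the divisor zero. It is not KLEE's implementation: `find_division_by_zero` is a hypothetical helper, and a brute-force search over a small range stands in for the constraint solver.

```python
from typing import Callable, Iterable, Optional

Predicate = Callable[[int], bool]


def find_division_by_zero(
    path_condition: list[Predicate],
    divisor: Callable[[int], int],
    domain: Iterable[int] = range(-128, 128),  # assumed small input domain
) -> Optional[int]:
    """Return an input on this path that makes the divisor zero, or None."""
    for x in domain:
        if all(holds(x) for holds in path_condition) and divisor(x) == 0:
            return x
    return None


# Path reached under `x > 0`, about to execute `100 / (x - 3)`.
bug_input = find_division_by_zero([lambda x: x > 0], lambda x: x - 3)

# The same division reached under `x > 5` can never divide by zero.
no_bug = find_division_by_zero([lambda x: x > 5], lambda x: x - 3)
```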
@@ -0,0 +1,9 @@
## Definition
Humans writing tests.
### Pros
- High feature coverage
- Good oracles (the developer)
- Can run other analyses on the tests (e.g. [[Compiler Sanitizers]])
### Cons
- High effort (developer time is expensive)
- Missed corner cases (cannot cover cases the developer did not consider)
@@ -0,0 +1,78 @@
## Definition
Real-world programs contain many conditions, resulting in a huge number of paths (often exponential in the number of conditions).
- When using [[Symbolic Execution 1]] to find bugs, we should prioritise the most important paths.
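
To make the exponential growth concrete, the sketch below enumerates the paths through a hypothetical program made of `num_conditions` independent, sequential `if` statements; each condition doubles the number of paths.

```python
from itertools import product


def count_paths(num_conditions: int) -> int:
    """Count the paths through `num_conditions` independent, sequential ifs."""
    # Each path is one assignment of taken / not taken to every condition.
    return sum(1 for _ in product((True, False), repeat=num_conditions))


paths_for_10 = count_paths(10)  # 2 ** 10 paths
```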
## Solutions
### Search Heuristics

#### Depth First Search
Only one active state (the current one) needs to be kept. Under a time limit, however, it can miss other important code paths by exhaustively exploring one large subtree.
#### Breadth First Search
Many states for incomplete paths must be kept in memory.
#### Best-First Heuristic
Used by [[EXE]]. In [[EXE]] each fork runs as a separate process. The *best-first candidate* is the suspended process whose current line has been executed the fewest times.
The *best-first candidate* is then run depth-first for some time, before a new *best-first candidate* is selected.

This avoids exploding paths in loops (loop instructions are run many times, so processes at them are de-prioritised as *best-first candidates*).
#### MD2U
Minimum distance to uncovered. Available in [[KLEE]].
$$MD2U(s) = \min_{i \,\in\, \text{uncovered instructions}} \text{distance}(s, i)$$
States are selected with weight $\cfrac{1}{MD2U(s)^2}$ to prioritise states close to uncovered instructions.
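
A minimal sketch of this weighting, assuming the distance of each state to the nearest uncovered instruction is already known and is at least 1. The state names and the helpers `md2u_weights` and `select_state` are hypothetical and do not reflect KLEE's internals.

```python
import random


def md2u_weights(distances: dict[str, int]) -> dict[str, float]:
    """Weight each state by 1 / MD2U(s)^2 (assumes every distance is >= 1)."""
    return {state: 1.0 / (d * d) for state, d in distances.items()}


def select_state(distances: dict[str, int], rng: random.Random) -> str:
    """Pick a state at random, biased towards those near uncovered code."""
    weights = md2u_weights(distances)
    states = list(weights)
    return rng.choices(states, weights=[weights[s] for s in states], k=1)[0]


distances = {"s1": 1, "s2": 2, "s3": 4}
weights = md2u_weights(distances)   # s1 is 16x more likely to be chosen than s3
chosen = select_state(distances, random.Random(0))
```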
#### Random Path
Walks the execution tree from the root, choosing a branch at random at each fork, on the assumption that each subtree is equally likely to hit uncovered code (its size is not considered).
- A state's weighting of being chosen is based on its depth (states higher in the tree have a larger weighting).
- For a state at depth $n$ of a binary tree, $weight(n) = 2^{-n}$.

This helps avoid starvation: because path selection is random, we can never get stuck repeatedly choosing the same path.
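
The depth-based weighting falls out of a random walk from the root. The sketch below is illustrative only: it represents an execution tree as nested tuples with state names at the leaves, and `random_path` and `selection_probabilities` are hypothetical helpers.

```python
import random
from typing import Union

Tree = Union[str, tuple]  # a leaf is a state name, an inner node is a tuple of subtrees


def random_path(tree: Tree, rng: random.Random) -> str:
    """Walk from the root, picking a child uniformly at each fork."""
    node = tree
    while not isinstance(node, str):
        node = rng.choice(node)
    return node


def selection_probabilities(tree: Tree, p: float = 1.0) -> dict[str, float]:
    """Probability that random_path ends at each state."""
    if isinstance(tree, str):
        return {tree: p}
    probs: dict[str, float] = {}
    for child in tree:
        probs.update(selection_probabilities(child, p / len(tree)))
    return probs


# State A sits at depth 1; states B and C sit at depth 2.
tree = ("A", ("B", "C"))
probs = selection_probabilities(tree)
picked = random_path(tree, random.Random(0))
```

State `A` at depth 1 is chosen with probability $2^{-1}$, while `B` and `C` at depth 2 each get $2^{-2}$, regardless of how large the subtree below the second fork grows.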
#### Round Robin
Multiple heuristics can be combined by applying them in turn, in a round-robin fashion.
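
A minimal sketch of round-robin combination, assuming each heuristic is just a function that picks one state from the current set. The two heuristics and the state depths are hypothetical, and for brevity the selected state is not advanced or removed between steps.

```python
from itertools import cycle, islice


def shallowest(states: dict[str, int]) -> str:
    return min(states, key=states.get)


def deepest(states: dict[str, int]) -> str:
    return max(states, key=states.get)


def round_robin(heuristics, states: dict[str, int], steps: int) -> list[str]:
    """Alternate between the heuristics, one selection per step."""
    return [pick(states) for pick in islice(cycle(heuristics), steps)]


states = {"a": 1, "b": 2, "c": 3}  # state name -> depth in the execution tree
picks = round_robin([shallowest, deepest], states, steps=4)
```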
### Eliminating Redundant Paths
- If two paths reach the same [[Program Point]] with the same set of constraints, we can merge the states / prune one path.
- We can discard constraints on memory that is never read again (e.g. freed, or destroyed at end of scope).
The basic algorithm is as follows:
```python
from collections import defaultdict
from typing import NamedTuple

ProgramPoint = str
Location = str

class Constraint(NamedTuple):
    expr: str                        # e.g. "x > 0"
    locations: frozenset[Location]   # locations the constraint mentions

PathCond = frozenset[Constraint]

# The set of reduced path conditions seen before at each program point
cache_set: defaultdict[ProgramPoint, set[PathCond]] = defaultdict(set)

def compute_read_set(p: ProgramPoint, path_cond: PathCond) -> set[Location]:
    """Locations read from p onwards when p is reached with path_cond."""
    raise NotImplementedError("provided by the analysis")

def trans_cl(path_cond: PathCond, reads: set[Location]) -> PathCond:
    """Constraints that transitively share a location with the read set."""
    live, kept = set(reads), set()
    changed = True
    while changed:
        changed = False
        for c in path_cond - kept:
            if c.locations & live:
                kept.add(c)
                live |= c.locations
                changed = True
    return frozenset(kept)

# Called when program point p is reached with path condition path_cond
def reachedwith(p: ProgramPoint, path_cond: PathCond) -> bool:
    # Keep only the constraints relevant to what is read from p onwards
    reduced = trans_cl(path_cond, compute_read_set(p, path_cond))
    # An equal reduced path condition was already explored from p: prune
    if reduced in cache_set[p]:
        return False  # halt exploration of this path
    # Otherwise record it and continue exploring
    cache_set[p].add(reduced)
    return True
```
### Statically Merging Paths
Converting branching into an expression.
```rust
let x: i32;
if c {
    x = A();
} else {
    x = B();
}
```
Can be converted to:
```rust
let x: i32 = if c { A() } else { B() };
```
This is called *phi-node folding*; the expressions being folded must be free of side effects.
- There is no runtime overhead (a single path remains).
### Test-Suite-Based Prioritisation
Many applications already have large test suites, which:
- are designed by developers to test the features they consider important / prioritise
- typically have good coverage
- often miss corner cases / cases the developers did not consider

These test suites can be used to guide the tool and prioritise which paths to check first.

For example, [ZESTI](https://www.doc.ic.ac.uk/~cristic/papers/zesti-icse-12.pdf) uses an existing regression suite to determine which paths are interesting.
70024 - Software Reliability/Software Reliability Techniques.md
7 changes: 5 additions & 2 deletions
@@ -1,7 +1,10 @@
## Human Driven
### [[Manual Testing]]
### Coding Standards
### Code Review
### Tool Support
### [[Manual Testing]]

## Automated
## Automated
### [[Fuzzing]]
### [[Static Analysis]]
### [[Symbolic Execution 1]]
@@ -0,0 +1,28 @@
## Definition
The program is run on symbolic input (symbols that represent any value and can be logically constrained). Every path through the program is analysed by constraining the symbolic values to the set of values possible on that path.
- Program instructions act on symbolic values.
- At conditionals, fork the state and constrain each copy by the branch condition or its negation.
- On termination, generate a test case by solving the constraints on the path.
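
The three steps above can be sketched on a toy program. This is an illustration under strong assumptions: the program is hand-encoded as a nested tuple, `explore` and `solve` are hypothetical names, and a brute-force search over a small integer range stands in for a real constraint solver.

```python
def solve(path_condition, domain=range(-128, 128)):
    """Stand-in solver: first value in a small domain satisfying every constraint."""
    for x in domain:
        if all(pred(x) == want for pred, want in path_condition):
            return x
    return None  # infeasible path


def explore(node, path_condition=()):
    """Yield (result, test_input) for every feasible path through the program."""
    if node[0] == "return":
        # On termination, solve the path condition to get a test case.
        test_input = solve(path_condition)
        if test_input is not None:
            yield node[1], test_input
        return
    _, pred, then_branch, else_branch = node
    # At a conditional, fork: one state per side of the branch.
    yield from explore(then_branch, path_condition + ((pred, True),))
    yield from explore(else_branch, path_condition + ((pred, False),))


# if x > 10: (if x < 5: "unreachable" else: "big") else: "small"
program = (
    "if", lambda x: x > 10,
    ("if", lambda x: x < 5, ("return", "unreachable"), ("return", "big")),
    ("return", "small"),
)

test_cases = list(explore(program))
```

The path to `"unreachable"` requires `x > 10` and `x < 5` at once, so it is infeasible and produces no test case.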
### Advantages
| Advantage | Description |
| ---- | ---- |
| Automatic | Requires no input test cases, initial seeds or configuration (unlike [[Fuzzing]]/[[Swarm Testing]]). |
| Systematic | Can reach all possible paths in the program, and reason about all possible values on these paths. |
| Deep Bugs | Can reason about all values, including extremely rare edge-case values and memory layouts. |
| [[Functional Bugs]] | Can reason about logical statements given appropriate oracles, e.g. checking for crashes, given that failed assert statements cause crashes. |
| Test Cases | Generates test cases that let developers reproduce and debug the bugs found. |
### Disadvantages
| Disadvantage | Description |
| ---- | ---- |
| Complex Toolchain | Requires the source to be available, and compiled to some form the symbolic execution tool can interpret. |
| Constraint Solving | Constraint solving is expensive. |
## Mixed Concrete & Symbolic Execution
Many values are concrete (e.g. constants in the code, or concrete inputs set by the user).
- Only operations that involve symbolic values need to be executed symbolically; the rest can be executed as normal.
- This allows interaction with the outside environment (e.g. the operating system, un-instrumented libraries, etc.).
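
A minimal sketch of the first point, assuming a hypothetical `Sym` wrapper for symbolic expressions: the `add` operation computes directly when both operands are concrete and only builds a symbolic expression when at least one operand is symbolic.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Sym:
    """A symbolic expression, stored here simply as text."""
    expr: str


Value = Union[int, Sym]


def add(a: Value, b: Value) -> Value:
    # Both operands concrete: execute natively, no solver involvement.
    if isinstance(a, int) and isinstance(b, int):
        return a + b
    # Otherwise build a symbolic expression describing the result.
    left = a.expr if isinstance(a, Sym) else str(a)
    right = b.expr if isinstance(b, Sym) else str(b)
    return Sym(f"({left} + {right})")


concrete = add(2, 3)          # plain integer arithmetic
symbolic = add(Sym("x"), 3)   # symbolic result
```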
## Challenges
### [[Path Explosion]]
## Examples
KLEE, CREST, SPF, FuzzBall
### [[KLEE]]
### [[EXE]]