BASIL IR is the intermediate representation used during static analysis. This is in contrast to Boogie IR, which is used for specification annotation and is output as textual Boogie syntax that can be run through the Boogie verifier.
The grammar is described below. Note that the IR is a data structure without a concrete textual representation, so the grammar below only represents its structure. We omit the full description of the expression language because it is relatively standard.
The IR has a completely standard simple type system that is enforced at construction.
- The `GoTo` jump is a multi-target jump representing non-deterministic choice between its targets. Conditional structures are represented by these with a guard (an assume statement) beginning each target.
- The `Unreachable` jump is used to signify the absence of successors; it has the semantics of `assume false`.
- The `Return` jump passes control to the calling function; this is often over-approximated to all functions which call the statement's parent procedure.
- Immediately after loading the IR, return statements may appear in any block, or may be represented by indirect calls. The transform pass below replaces all calls to the link register (R30) with return statements. In the future, more proof is required to implement this soundly.
cilvisitor.visit_prog(transforms.ReplaceReturns(), ctx.program)
transforms.addReturnBlocks(ctx.program, true) // add return to all blocks because IDE solver expects it
cilvisitor.visit_prog(transforms.ConvertSingleReturn(), ctx.program)
This ensures that all returning, non-stub procedures have exactly one return statement residing in their `returnBlock`.
- The structure of the IR allows a call to appear anywhere in a block, but for all the analysis passes we hold the invariant that it only appears as the last statement. This is checked with the function `singleCallBlockEnd(p: Program)`. It means that for any call statement `c` we may `assert(c.parent.statements.lastOption.contains(c))`.
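The guard encoding of conditionals and the call-at-block-end invariant can be sketched on a miniature IR. This is only an illustration: the types and the `callsOnlyAtBlockEnd` function below are hypothetical stand-ins, not BASIL's classes or its `singleCallBlockEnd` implementation, and conditions are plain strings rather than IR expressions.

```scala
// Hypothetical miniature IR: illustrative only, not BASIL's actual classes.
object CallInvariantSketch {
  sealed trait Statement
  final case class Assume(cond: String) extends Statement // guard at the start of a branch target
  final case class LocalAssign(lhs: String, rhs: String) extends Statement
  final case class DirectCall(target: String) extends Statement

  sealed trait Jump
  final case class GoTo(targets: List[String]) extends Jump // non-deterministic choice between targets
  case object Return extends Jump
  case object Unreachable extends Jump // no successors, behaves like `assume false`

  final case class Block(label: String, statements: List[Statement], jump: Jump)

  // True when every call in every block is that block's last statement.
  def callsOnlyAtBlockEnd(blocks: List[Block]): Boolean =
    blocks.forall { b =>
      b.statements.zipWithIndex.forall {
        case (_: DirectCall, i) => i == b.statements.length - 1
        case _                  => true
      }
    }

  // `if (R0 == 0) callee() else R1 := 1`, encoded as a two-target GoTo
  // where each target block begins with its guard.
  val wellFormed: List[Block] = List(
    Block("entry", Nil, GoTo(List("then", "else"))),
    Block("then", List(Assume("R0 == 0"), DirectCall("callee")), GoTo(List("exit"))),
    Block("else", List(Assume("R0 != 0"), LocalAssign("R1", "1")), GoTo(List("exit"))),
    Block("exit", Nil, Return)
  )

  // A call followed by another statement violates the invariant.
  val illFormed: List[Block] = List(
    Block("entry", List(DirectCall("callee"), LocalAssign("R1", "1")), Return)
  )
}
```

In this sketch the check holds for `wellFormed` and fails for `illFormed`, since the latter places an assignment after the call.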
The 'DSL' is a set of convenience functions for constructing correct IR programs in Scala source files. This provides a simple way to construct IR programs for use in unit tests. Its source code can be found here.
An example can be seen below:
var program: Program = prog(
  proc("main",
    block("first_call",
      Assign(R0, bv64(1), None),
      Assign(R1, bv64(1), None),
      directCall("callee1"),
      goto("second_call")
    ),
    block("second_call",
      directCall("callee2"),
      goto("returnBlock")
    ),
    block("returnBlock",
      ret
    )
  ),
  // ... other procedures
)
As the example shows, the DSL syntax is:
program ::= prog ( procedure+ )
procedure ::= proc (procname, block+)
block ::= block(blocklabel, statement+, jump)
statement ::= <BASIL IR Statement>
jump ::= goto_s | ret | unreachable
call_s ::= directCall (procedurename, None | Some(blocklabel)) // target, fallthrough
goto_s ::= goto(blocklabel+) // targets
procname ::= String
blocklabel ::= String
If a block or procedure name is referenced in a target position, but no block or procedure is defined with that label, the DSL constructor will likely throw a match error.
Some additional constants are defined for convenience, e.g. `R0 = Register(R0, 64)`; see the source file for the full list.
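To see why an undefined label surfaces as an error at construction time, it may help to picture label resolution as two passes: first collect the defined labels, then check every target against them. The sketch below assumes hypothetical names (`BlockSpec`, `resolveTargets`) and is not the DSL's actual implementation. Unlike the DSL, it reports the missing labels as a value instead of throwing.

```scala
// Hypothetical sketch of two-pass label resolution; not the BASIL DSL's code.
object LabelResolutionSketch {
  // An unresolved block: its own label plus the labels it jumps to.
  final case class BlockSpec(label: String, gotoTargets: List[String])

  // Pass 1 collects the defined labels; pass 2 checks each goto target against them.
  // Returns the undefined labels (Left) or the resolved (source, target) edges (Right).
  def resolveTargets(blocks: List[BlockSpec]): Either[List[String], List[(String, String)]] = {
    val defined = blocks.map(_.label).toSet
    val edges = for (b <- blocks; t <- b.gotoTargets) yield (b.label, t)
    val missing = edges.map(_._2).filterNot(defined).distinct
    if (missing.isEmpty) Right(edges) else Left(missing)
  }
}
```

Returning the full list of missing labels makes a misspelled target easier to diagnose than a bare match error would.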
- For static analysis, the Il-CFG-Iterator is the current well-supported way to iterate over the IR. It currently uses the TIP framework, so you do not need to interact with the IR visitor directly. See BasicIRConstProp.scala for an example of its usage.
- This visits all procedures, blocks and statements in the IR program.
src/main/scala/ir/Visitor.scala defines visitors which can be used for extracting specific features from an IR program. This is useful if you want to modify all instances of a specific IR construct.
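As a rough picture of how a visitor extracts one kind of construct, the sketch below defines a default traversal and overrides a single hook to collect call targets. All types and method names here are hypothetical and simplified; consult `src/main/scala/ir/Visitor.scala` for the real interface, which may differ.

```scala
// Hypothetical visitor sketch over a miniature IR; not BASIL's Visitor API.
object VisitorSketch {
  sealed trait Statement
  final case class LocalAssign(lhs: String, rhs: String) extends Statement
  final case class DirectCall(target: String) extends Statement

  final case class Block(label: String, statements: List[Statement])
  final case class Procedure(name: String, blocks: List[Block])
  final case class Program(procedures: List[Procedure])

  // The default traversal walks the whole program; subclasses override only the hooks they need.
  trait Visitor {
    def visitProgram(p: Program): Unit = p.procedures.foreach(visitProcedure)
    def visitProcedure(p: Procedure): Unit = p.blocks.foreach(visitBlock)
    def visitBlock(b: Block): Unit = b.statements.foreach(visitStatement)
    def visitStatement(s: Statement): Unit = ()
  }

  // Collects the names of all directly called procedures, in traversal order.
  final class CallTargetCollector extends Visitor {
    val targets = scala.collection.mutable.ListBuffer.empty[String]
    override def visitStatement(s: Statement): Unit = s match {
      case DirectCall(t) => targets += t
      case _             => ()
    }
  }

  def callTargets(p: Program): List[String] = {
    val v = new CallTargetCollector
    v.visitProgram(p)
    v.targets.toList
  }
}
```

Keeping the traversal in the base trait means a feature extractor only has to state what it does at the construct it cares about.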
The CFG is a control-flow graph constructed from the IR; it wraps each statement in a `Node`.