REVEN-Axion 2015v1.1-r3
Terminology

This documentation of REVEN-Axion sometimes refers to its own internal terminology, which is explained here.

General

Project

A project is the basic work unit: it contains a recorded scenario, which may be based on a binary, and possibly multiple saved execution traces. A project lives on the REVEN server, to which Axion clients simply connect.

Scenario

Given an initial context, a program is first executed once in a controlled environment. All hardware events that occur during this initial execution are monitored. This becomes the scenario of the project.

Every analysis done in Reven refers to the scenario. If an execution drifts from the scenario by any means (typically by altering the state of execution with the inspector alter_exec), it will probably no longer match the scenario.

For example, let's say that some file should be opened by the program in the scenario. If the execution is modified to open another file, or not to open any file at all, then the execution will not match the scenario anymore.

Expression translation

Reven implements a symbolic version of a processor: the electrical operations of a real hardware processor are translated into mathematical formulas.
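As an illustration of the idea, here is a minimal sketch of how one instruction could be turned into a formula. It is not Reven's implementation: the classes Sym and Add and the function step_add are hypothetical names introduced for this example, and 32-bit wraparound and flags are deliberately ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sym:
    """A named unknown, e.g. the initial value of a register."""
    name: str

    def __str__(self):
        return self.name


@dataclass(frozen=True)
class Add:
    """The formula `left + right` (wraparound ignored in this sketch)."""
    left: object
    right: object

    def __str__(self):
        return f"({self.left} + {self.right})"


def step_add(state, dst, src):
    """Symbolic semantics of `add dst, src`: return the updated state."""
    new_state = dict(state)
    new_state[dst] = Add(state[dst], state[src])
    return new_state


# Start from fully symbolic registers and execute two instructions.
state = {"eax": Sym("eax0"), "ebx": Sym("ebx0")}
state = step_add(state, "eax", "ebx")   # add eax, ebx
state = step_add(state, "eax", "eax")   # add eax, eax
formula = str(state["eax"])
```

After the two steps, the content of eax is no longer a number but a formula over the initial register values, which is the kind of object later analyses can reason about.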

For more information, see http://en.wikipedia.org/wiki/Symbolic_execution and http://en.wikipedia.org/wiki/Symbolic_computation

Inspector

A Reven analysis typically consists of one or more execution or exploration processes. Inspectors are pluggable components that can be used to alter an execution or an exploration process.

Currently, we can split the inspectors into two categories:

Behavioral inspectors

These modify the behavior of Reven. For example, the inspector alter_execution modifies registers or memory at a specified point in time.

Data inspectors

These record data so that information can be collected later. For example, memory_range_history records all dereferences so that you can later build a history of memory accesses. string_access analyzes dereferences to detect strings that are built or read.
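The two categories can be pictured with a small sketch. Everything below is hypothetical and self-contained: the Inspector interface, the AlterRegister and MemoryHistory classes and the run function are assumptions made for illustration, not the Reven plug-in API.

```python
class Inspector:
    """Hypothetical plug-in interface: called once per executed step."""

    def on_step(self, time, state, accesses):
        pass


class AlterRegister(Inspector):
    """Behavioral inspector: overwrite a register at a given time."""

    def __init__(self, at_time, register, value):
        self.at_time = at_time
        self.register = register
        self.value = value

    def on_step(self, time, state, accesses):
        if time == self.at_time:
            state[self.register] = self.value


class MemoryHistory(Inspector):
    """Data inspector: record every memory access for later queries."""

    def __init__(self):
        self.history = []

    def on_step(self, time, state, accesses):
        for address, kind in accesses:
            self.history.append((time, address, kind))


def run(trace, inspectors):
    """Replay a toy trace, giving each inspector a chance at every step."""
    state = {"eax": 0}
    for time, accesses in enumerate(trace):
        for inspector in inspectors:
            inspector.on_step(time, state, accesses)
    return state


# Toy trace: one list of (address, kind) memory accesses per step.
trace = [[(0x1000, "read")], [], [(0x1004, "write")]]
recorder = MemoryHistory()
final_state = run(trace, [AlterRegister(1, "eax", 42), recorder])
```

The behavioral inspector changes what the execution does, while the data inspector only observes it and leaves the state untouched.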

Execution Trace

The result of an execution. An execution trace is persistent if it is saved on disk; otherwise it is discarded when the corresponding project is closed.

Preset

The configuration used for performing an analysis.

Dynamic analysis

Execution

Dynamic analysis of a set of binaries.

A Reven execution consists of running the specified scenario inside the symbolic processor emulated by Reven.

Once an execution is started, it can be paused and resumed. Some data can be modified during the execution so as to drift from the original scenario.

As Reven emulates the whole processor, you can view the OS code as well as the user code, extract data, and so on.

Run

A single flow of execution; it is a walk in the master graph. When performing an execution, several runs are generated.

A run can be interrupted by a hardware event (IRQ…) or a fault (page fault, general protection fault…). In this case, Reven generates another run, which is nested in the first one. This keeps these usually unwanted events from polluting the execution trace.

The containing sequence is then split, because once the nested run ends, execution does not necessarily continue in the current sequence. For example, if a page fault occurs that the OS cannot handle, the OS kills the process that was executing.
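The nesting of runs can be sketched as a small tree. The Run class, the event tuples and the sequence names below are hypothetical and only model the description above; they are not Reven data structures.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    """A flow of execution: sequence names interleaved with nested runs."""
    name: str
    items: list = field(default_factory=list)


def build_runs(events):
    """Build nested runs from ('seq', name), ('interrupt', cause), ('return',)."""
    root = Run("run 0")
    stack = [root]
    counter = 0
    for event in events:
        if event[0] == "seq":
            stack[-1].items.append(event[1])
        elif event[0] == "interrupt":
            # A fault or IRQ opens a new run nested in the current one.
            counter += 1
            nested = Run(f"run {counter} ({event[1]})")
            stack[-1].items.append(nested)
            stack.append(nested)
        elif event[0] == "return":
            stack.pop()
    return root


def own_sequences(run):
    """Sequences of a run, with nested runs filtered out."""
    return [item for item in run.items if not isinstance(item, Run)]


# Sequence A is interrupted by a page fault, so it is split in two parts.
events = [
    ("seq", "A_part1"),
    ("interrupt", "page fault"),
    ("seq", "pf_handler"),
    ("return",),
    ("seq", "A_part2"),
]
root = build_runs(events)
nested = root.items[1]
```

Reading only own_sequences(root) gives the user-level flow without the fault handler, which is the point of nesting the handler in its own run.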

Sequence in run

Each executed sequence has its own identifier. For example, a loop yields several "sequences in run" for the same sequence. For each run, the identifier of the sequence in run starts at 0 and is incremented at each sequence.

The identifier of a sequence in run is also frequently called its timestamp.
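A minimal sketch of this numbering, assuming a run is given as the ordered list of sequences it walks through (the sequence names and the timestamp_walk helper are hypothetical):

```python
def timestamp_walk(walk):
    """Pair each executed sequence with its timestamp, starting at 0."""
    return list(enumerate(walk))


# A loop body executed three times appears as three "sequences in run".
walk = ["init", "loop_body", "loop_body", "loop_body", "exit"]
stamped = timestamp_walk(walk)
loop_timestamps = [ts for ts, seq in stamped if seq == "loop_body"]
```

Here the single sequence loop_body is associated with the timestamps 1, 2 and 3 within the run.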

Execution point/range

When working on an execution, one frequently needs to refer to a specific instruction in time. This is represented using an execution point.

An execution point is an instruction inside a sequence in a run. It can be defined by a run, a timestamp and an instruction.

An execution range is a range of sequences or instructions located between two execution points.
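The following sketch models these two notions. The class and field names are assumptions made for illustration, and the range is assumed to stay within a single run with both bounds included; the actual Reven API may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class ExecutionPoint:
    """An instruction in time: (run, timestamp, instruction index)."""
    run: int
    timestamp: int
    instruction: int


@dataclass(frozen=True)
class ExecutionRange:
    """Everything between two execution points of the same run (inclusive)."""
    start: ExecutionPoint
    end: ExecutionPoint

    def __contains__(self, point):
        # Points are compared field by field: run, timestamp, instruction.
        return point.run == self.start.run and self.start <= point <= self.end


rng = ExecutionRange(ExecutionPoint(0, 3, 2), ExecutionPoint(0, 7, 0))
inside = ExecutionPoint(0, 5, 10) in rng
before = ExecutionPoint(0, 3, 1) in rng
other_run = ExecutionPoint(1, 5, 0) in rng
```

Ordering points by timestamp first and instruction second means that a point in a later sequence always comes after any instruction of an earlier sequence in the same run.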

Static analysis

Exploration

Static analysis of a set of binaries.

Given an input program and an environment in which it could be executed, the exploration detects all possible code paths that the input program could take.

The binary code of the program and its libraries is decoded into a number of sequences of instructions. Inspectors can then be plugged in to work on the sequences, either to find bugs or to provide additional information through the Python API.

Master graph

The graph of sequences built by analyzing a set of binaries.

The master graph contains any sequence that has been executed or explored.
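Since a run is described as a walk in the master graph, the relationship can be sketched with a plain adjacency mapping. The graph contents and the is_walk helper are hypothetical and only illustrate the definition.

```python
# Each sequence maps to the set of sequences that can follow it.
master_graph = {
    "init": {"loop_body"},
    "loop_body": {"loop_body", "exit"},
    "exit": set(),
}


def is_walk(graph, sequences):
    """True if every sequence is a node and each step follows an edge."""
    if any(seq not in graph for seq in sequences):
        return False
    return all(b in graph[a] for a, b in zip(sequences, sequences[1:]))


valid = is_walk(master_graph, ["init", "loop_body", "loop_body", "exit"])
invalid = is_walk(master_graph, ["init", "exit"])
```

A walk may visit the same sequence several times, as the loop body does here, which is why a run needs timestamps in addition to sequence identifiers.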

Sequence

A Reven sequence is a list of consecutive, unbreakable instructions (not taking faults or interrupts into account). This is similar to what is often called a basic block.

The main difference is that in Reven, some dynamic information can split a valid sequence into two smaller sequences. Typically, loops cause sequence breaks, with the initialization part located in a different sequence:

dot_inline_dotgraph_3.png
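The splitting described above can be sketched as follows, assuming the dynamic information takes the form of addresses that execution was observed to jump to (for instance, the head of a loop). The function name, the addresses and this representation are assumptions, not Reven internals.

```python
def split_sequence(addresses, entry_points):
    """Split a list of instruction addresses at each observed entry point."""
    pieces = []
    current = []
    for address in addresses:
        # A jump target in the middle of a block starts a new sequence.
        if address in entry_points and current:
            pieces.append(current)
            current = []
        current.append(address)
    if current:
        pieces.append(current)
    return pieces


# 0x10..0x12 initialize the loop; the back edge jumps to 0x15.
block = [0x10, 0x12, 0x15, 0x18, 0x1A]
pieces = split_sequence(block, {0x15})
```

Statically, the five instructions form one block, but once the back edge to 0x15 is known, the initialization part and the loop body become two separate sequences.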