REVEN-Axion 2017v1.4.0
OS semantic information

To make traces easier to understand, REVEN can provide OS semantic information in the form of binary information and symbols. This page explains each kind of information and how it appears in AXION.

Binary information

Binary information is all the information related to a segment of memory that is mapped into a process address space. Most of the time, a memory segment is a binary loaded in memory, but it can also be a stack, a heap, or any other region of memory allocated by a process. A memory segment is valid for one process and is defined by a base address (its start address), a size and a name. The base address matters for the management of symbols.

Where does binary information come from?

This information is retrieved from the process map files generated by the dump_process tool during scenario generation. A process map file has the same format as a Linux maps file and is indexed by pid.

Example:

addresses         perms file offset Maj:Min inode  pathname
08048000-08129000 r-xp  00000000    08:01   265388 /bin/bash
  • base address = 0x08048000
  • size = 0x08129000 - 0x08048000 = 0xe1000
  • name = "/bin/bash"
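The extraction shown above can be sketched in a few lines of Python. This is only an illustration of how the three values follow from one map line, not REVEN's actual loader; `MemorySegment` and `parse_map_line` are hypothetical names that do not belong to the REVEN API.

```python
from typing import NamedTuple


class MemorySegment(NamedTuple):
    """Hypothetical container; not part of the REVEN API."""
    base_address: int
    size: int
    name: str


def parse_map_line(line: str) -> MemorySegment:
    """Parse one line written in the Linux maps format."""
    # Fields: addresses, perms, offset, Maj:Min, inode, optional pathname.
    fields = line.split(None, 5)
    start_text, end_text = fields[0].split("-")
    start, end = int(start_text, 16), int(end_text, 16)
    # Anonymous mappings have no pathname field.
    name = fields[5].strip() if len(fields) > 5 else ""
    return MemorySegment(base_address=start, size=end - start, name=name)


segment = parse_map_line("08048000-08129000 r-xp 00000000 08:01 265388 /bin/bash")
print(hex(segment.base_address), hex(segment.size), segment.name)
# prints: 0x8048000 0xe1000 /bin/bash
```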

What is displayed in AXION?

For now, only paths are accessible and displayed.

What is bin_not_mapped:0x078c0000?

If the binary information related to an address is not available, then a generated name is displayed. It has the following form:

bin_not_mapped:0x<process's cr3>
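As a rough sketch of this fallback, the lookup below returns the segment name when one covers the address and the generated name otherwise. The function `display_name` and the `segments_by_cr3` dictionary are hypothetical, and padding the cr3 to eight hexadecimal digits is an assumption drawn from the `bin_not_mapped:0x078c0000` example.

```python
def display_name(cr3, address, segments_by_cr3):
    """Return the name of the segment covering `address`, or the fallback.

    `segments_by_cr3` is a hypothetical dict: cr3 -> list of (base, size, name).
    """
    for base, size, name in segments_by_cr3.get(cr3, []):
        if base <= address < base + size:
            return name
    # No binary information: build the generated name from the process's cr3.
    # Assumption: the cr3 is zero-padded to 8 hexadecimal digits.
    return "bin_not_mapped:0x{:08x}".format(cr3)


segments = {0x078c0000: [(0x400000, 0x2000, "Example.exe")]}
print(display_name(0x078c0000, 0x400300, segments))  # prints: Example.exe
print(display_name(0x078c0000, 0x500000, segments))  # prints: bin_not_mapped:0x078c0000
```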

Why is binary information unavailable?

There are several possible reasons:

  • The dump_process tool was not launched during scenario generation.
  • The process was spawned, or the binary was loaded, after dump_process was executed.
  • The pid of the process could not be retrieved, so the related map file was not loaded.

How are memory segments that are mapped in multiple processes handled?

Binary information can be grouped into 2 categories:

  • information that depends on the process, such as the base address.
  • information that is independent of the process, such as the path/name or the symbols.

Process-independent information is stored once and linked to each process that maps the memory segment. This avoids duplication and, more importantly, makes it easy to propagate a modification to all the processes involved.
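One way to picture this split is a shared object referenced from per-process mappings. The sketch below is an assumption about the general shape of such a model, not a description of REVEN's internals; `BinaryInfo`, `Mapping`, the second cr3 value and the name `Shared.dll` are all made up for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class BinaryInfo:
    """Process-independent part: shared by every process mapping the segment."""
    name: str
    symbols: dict = field(default_factory=dict)  # rva -> symbol name


@dataclass
class Mapping:
    """Process-dependent part: where the shared binary sits in one process."""
    base_address: int
    size: int
    binary: BinaryInfo  # a reference to the shared object, not a copy


shared = BinaryInfo("Shared.dll")
# Two hypothetical processes (keyed by cr3) map the same binary at different bases.
address_spaces = {
    0x078c0000: [Mapping(0x10000000, 0x2000, shared)],
    0x0a1b0000: [Mapping(0x6f000000, 0x2000, shared)],
}

# A single modification of the shared object is visible from both processes.
shared.symbols[0x300] = "Sym1"
for cr3, mappings in address_spaces.items():
    print(hex(cr3), hex(mappings[0].base_address), mappings[0].binary.symbols[0x300])
```

Because each mapping holds a reference rather than a copy, renaming a symbol once updates what every involved process sees.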

How to add binary information?

With the Python API, it is possible to map a memory segment into a process address space.

Example:

         Process Address
               Space
        cr3 = 0x078c0000

         |             |
         |             |
         |             |
         |             |
         |             |
 0x400000|-------------|
         |             |
         |             |
         |             |
         | Example.exe |
         |             |
         |             |
         |             |
         |             |
         |             |
         |             |
         |             |
 0x402000|-------------|
         |             |
         |             |
         |             |
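         |             |

import reven

# Connect to the REVEN project at localhost:13370.
client = reven.Project('localhost', 13370)

# Segment arguments, as in the diagram: base address, size, name.
memory_segment = reven.AddressSpace.Segment(0x400000, 0x2000, 'Example.exe')
# Map the segment into the process identified by its cr3.
client.map_memory_segment_into_process(0x078c0000, memory_segment)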

Symbol

Symbols are part of binary information. A symbol is linked to a memory segment and is defined by a relative virtual address (RVA) and a name.

An RVA is an offset from the base address of the memory segment.

Why is RVA used?

Using an RVA instead of a virtual memory address makes a symbol independent of where the memory segment is mapped in the process address space.
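The arithmetic behind this is a single addition or subtraction. The helpers below are a minimal sketch with hypothetical names (`rva_to_va`, `va_to_rva`); the second base address is invented to show the same RVA resolving in another mapping.

```python
def rva_to_va(base_address: int, rva: int) -> int:
    """Virtual address of a symbol in one process address space."""
    return base_address + rva


def va_to_rva(base_address: int, va: int) -> int:
    """Offset of a virtual address from the base address of its segment."""
    return va - base_address


SYM1_RVA = 0x300  # stored once, whatever the mapping

# Example.exe mapped at 0x400000 (as on this page) and at a second, made-up base.
for base in (0x400000, 0x7f0000):
    print(hex(base), "->", hex(rva_to_va(base, SYM1_RVA)))
# prints: 0x400000 -> 0x400300
# prints: 0x7f0000 -> 0x7f0300
```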

Where do symbols come from?

There are four possible sources of symbols:

  • (1) Automatically added by REVEN: during execution, generic symbols of the form func_<rva> are added at each target of a call instruction with no pre-existing symbol.
  • (2) Extracted from the binary file.
  • (3) Extracted from the PDB file (see PDB symbols).
  • (4) Manually added by the user: symbols can be renamed through AXION and added/renamed through the Python API.

Can a symbol have multiple names?

No, a symbol has a unique name and there is only one symbol per RVA. When a symbol is modified, the following priorities are applied between the various symbol sources:

lowest –(1)–(2)–(3)–(4)–> highest
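The priority rule can be sketched as a guarded write: a name is stored only if no higher-priority source has already named that RVA. Everything here is hypothetical (`SymbolSource`, `set_symbol`, the `func_300` spelling), and letting an equal-priority source overwrite, for example a second user rename, is an assumption rather than documented behaviour.

```python
from enum import IntEnum


class SymbolSource(IntEnum):
    """The four sources listed above, ordered from lowest to highest priority."""
    AUTOMATIC = 1  # func_<rva> added by REVEN
    BINARY = 2     # extracted from the binary file
    PDB = 3        # extracted from the PDB file
    USER = 4       # added or renamed by the user


def set_symbol(symbols, rva, name, source):
    """Store `name` at `rva` unless a higher-priority source already named it.

    `symbols` is a hypothetical dict: rva -> (name, source).
    Returns True if the symbol was written.
    """
    current = symbols.get(rva)
    # Assumption: a source of equal priority may overwrite the current name.
    if current is not None and current[1] > source:
        return False
    symbols[rva] = (name, source)
    return True


symbols = {}
set_symbol(symbols, 0x300, "func_300", SymbolSource.AUTOMATIC)
set_symbol(symbols, 0x300, "Sym1", SymbolSource.PDB)
set_symbol(symbols, 0x300, "_sym1_export", SymbolSource.BINARY)  # ignored: lower priority
print(symbols[0x300][0])  # prints: Sym1
```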

How is a symbol's name displayed?

The following example explains what will be displayed in various situations.

         Process Address
             Space
        cr3 = 0x078c0000

         |             |
         |             |                     Example.exe
         |             |                 base address = 0x400000
         |             |
         |             |                                      rva     symbol
 0x400000|-------------|                    .-------------.   0x0      nil
         |             |                    |             |
         |             |                    |             |
         |             |                    |             |
         | Example.exe |                    |-------------|   0x300    Sym1
         |             |         =>         |             |
         |             |                    |             |
         |             |                    |-------------|   0x1200   Sym2
         |             |                    |             |
         |             |                    |             |
         |             |                    |             |
         |             |                    |             |
 0x402000|-------------|                    '-------------'   0x2000
         |             |
         |             |
         |             |
         |             |

Possible cases for the displayed symbol name:

  • [0x400000, 0x400300[ => Example.exe_<rva>.
  • 0x400300 => Sym1.
  • ]0x400300, 0x401200[ => Sym1+0x<offset from rva>.
  • 0x401200 => Sym2.
  • ]0x401200, 0x402000[ => Sym2+0x<offset from rva>.
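The cases above amount to finding the closest symbol at or below the address. The following sketch reproduces them with a binary search; `symbol_display_name` is a hypothetical helper, not an AXION or REVEN function, and rendering `<rva>` as bare lowercase hexadecimal is an assumption.

```python
from bisect import bisect_right


def symbol_display_name(address, base_address, size, binary_name, symbols):
    """Return the displayed name for `address`, or None if outside the segment.

    `symbols` is a list of (rva, name) tuples sorted by rva.
    """
    rva = address - base_address
    if not 0 <= rva < size:
        return None
    rvas = [symbol_rva for symbol_rva, _ in symbols]
    # Index of the last symbol whose rva is <= the requested rva.
    index = bisect_right(rvas, rva) - 1
    if index < 0:
        # Before the first symbol: fall back to the binary name.
        # Assumption: <rva> is rendered as bare lowercase hex.
        return "{}_{:x}".format(binary_name, rva)
    symbol_rva, name = symbols[index]
    if rva == symbol_rva:
        return name
    return "{}+0x{:x}".format(name, rva - symbol_rva)


symbols = [(0x300, "Sym1"), (0x1200, "Sym2")]
for address in (0x400010, 0x400300, 0x400304, 0x401200, 0x401ffc):
    print(hex(address), symbol_display_name(address, 0x400000, 0x2000, "Example.exe", symbols))
```

With the layout from the diagram, 0x400304 resolves to Sym1+0x4 and 0x401ffc to Sym2+0xdfc, while 0x402000 falls outside the segment and yields None.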

How to modify/add symbols?

Through AXION, it is only possible to rename a symbol: select it in the instruction view and use the default shortcut N.

Adding new symbols is only possible through the Python API.

Example:

import reven
client = reven.Project('localhost', 13370)

sym1 = reven.Symbol(0x300, 'Sym1')
sym2 = reven.Symbol(0x1200, 'Sym2')
client.add_symbols_to_binary('Example.exe', [sym1, sym2])