radare2 is a free multi-tool for exploring binary data from the console. Like the Unix-style console tools that inspired it (bash, vim, grep), it can feel tricky one moment and too simplistic the next. Like those tools, radare2 can also make the impossible possible. With extensive language bindings through r2pipe, you can use your favorite programming language to explore binary files.

Although radare2 can be used for reverse engineering and debugging, I want to show how to use it to explore a very small binary from the early days of computing. Corkami, Ange Albertini’s great project showing what the guts of binary file formats look like, has a walkthrough called COM101. You can buy a poster version of it here if you like.

I’m going to walk through a session from scratch, starting with the file from COM101 and showing a few radare2 commands. The file can be found here; it’s called “simple.com”. It’s a COM file from the MS-DOS days that prints “Hello World” and exits.

Let’s start by opening it in radare2 (note that you have to figure out how to install r2 yourself, which may or may not be the end of your reading).

First we open the binary with r2 simple.com and hit V [enter] to enter the hex viewer. You’ll see a hex dump, and if you hit p to change the print mode, you’ll see radare2’s attempt to disassemble the machine code bytes into assembly (asm). Disassemble is a weird word, since it seems like if you disassemble a chair you’d end up with pieces of a chair, but in the case of asm it makes sense. Asm is assembled into machine code, and machine code in turn can be disassembled into asm. The term disassembly is often used for assembly that was generated from machine code (as opposed to assembly written by a human or emitted by a compiler chain).

fig 1. The disassembly looks invalid from the start

Radare2 has attempted to disassemble the bytes in the simple.com file. But this filetype is from an earlier era when machines had far less processing power. In fact this file is 16-bit. Files like these existed in the early DOS era and persisted into Windows, but 64-bit versions of Windows can no longer run them. If you want to run this file you’d likely have to install DOSBox or run 32-bit Windows. But we don’t really need to run it, we’re just using it as a 30-byte intro to radare2. So how do we fix this disassembly?

There are two ways. Either type q to get out of visual mode and then set asm.bits to 16 with the following command (hit enter after):

e asm.bits=16

Or hit q, then q [enter], to leave radare2 and open the file again:

r2 -b16 simple.com

Either way you should see a prompt like:

[0000:0000]>

Now we can get back to business: hit V [enter], then p, to show the disassembly again. It should now look like this:

fig 2. Slightly better 16-bit disassembly
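To see why the bit width matters so much, here is a minimal sketch of how the length of a single instruction depends on the mode. It only handles the `mov reg, imm` opcodes (0xB8 to 0xBF) and ignores operand-size prefixes. The byte string is a hypothetical example, not the actual contents of simple.com.

```python
# Sketch: how many bytes one "mov reg, imm" instruction consumes
# in 16-bit versus 32-bit mode. Opcodes 0xB8..0xBF encode
# "mov r16, imm16" or "mov r32, imm32" depending on the mode.

def mov_imm_length(opcode: int, bits: int) -> int:
    """Return the total instruction length for opcodes 0xB8..0xBF."""
    if not 0xB8 <= opcode <= 0xBF:
        raise ValueError("not a mov reg, imm opcode")
    if bits not in (16, 32):
        raise ValueError("bits must be 16 or 32")
    return 1 + bits // 8  # one opcode byte plus the immediate


# Hypothetical bytes (NOT taken from simple.com):
#   mov dx, 0x010d ; int 0x21   as a 16-bit assembler would emit them
code = bytes([0xBA, 0x0D, 0x01, 0xCD, 0x21])

len16 = mov_imm_length(code[0], 16)
len32 = mov_imm_length(code[0], 32)

print("16-bit: mov takes", len16, "bytes, next bytes:", code[len16:].hex())
print("32-bit: mov takes", len32, "bytes, next bytes:", code[len32:].hex())
```

In 16-bit mode the `mov` stops after three bytes and the `int 0x21` that follows is decoded on its own. In 32-bit mode the same `mov` swallows the `int 0x21` bytes as part of its immediate, so everything after it is decoded from the wrong starting point.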

But if you press p a few more times, you’ll notice that there’s some text in there: “Hello World!…$”. Yet if you keep pressing p until you’re back at the disassembly, it’s turned all of the bytes into assembly. The code and data are all being treated as code. This is in part because COM files don’t have a header describing which parts of them are code and which parts are data.
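The trailing `$` is a useful hint, because the DOS print-string service (int 21h with AH=09h) expects a `$`-terminated string. Below is a rough sketch of a heuristic that scans a raw blob for runs of printable bytes ending in `$`. The function name and the sample bytes are made up for illustration and are not the real simple.com.

```python
import string

# Bytes we treat as "text": ASCII letters, digits, punctuation, whitespace.
PRINTABLE = set(string.printable.encode("ascii"))


def find_dollar_strings(blob: bytes, min_len: int = 4):
    """Yield (offset, run) for printable runs that end with '$'."""
    start = None
    for i, b in enumerate(blob):
        if b not in PRINTABLE:
            start = None          # run broken by a non-text byte
            continue
        if start is None:
            start = i             # a new printable run begins here
        if b == ord("$") and i - start >= min_len:
            yield start, blob[start:i + 1]
            start = None


# Hypothetical COM-like blob (NOT the real simple.com):
# mov ah, 9 ; mov dx, 0x0108 ; int 0x21 ; ret ; then the string
blob = bytes([0xB4, 0x09, 0xBA, 0x08, 0x01, 0xCD, 0x21, 0xC3])
blob += b"Hello World!\r\n$"

hits = list(find_dollar_strings(blob))
for offset, run in hits:
    # COM files are loaded at offset 0x100 within their segment.
    print(f"file offset {offset:#x}, in-memory offset {0x100 + offset:#06x}: {run!r}")
```

Code bytes can be printable too (0x21 is `!`, for example), so a heuristic like this can start a run too early. That is the same ambiguity radare2 faces without a header to guide it.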

So this means we have to tell radare2 which parts of the file are data and which are code. We can do this easily by entering cursor mode by pressing c, moving around using the vim-style hjkl keys and holding shift to select the segment with ‘Hello World!…$’ in it. If vim-style navigation scares you, learn that first then come back.

You should then have a selection that looks like this:

fig 3. Selected bytes

Now you can hit d and a little menu will come up:

    [Vd]- Define current block as:
     $    define flag size
     1    edit bits
     b    set as byte
     B    set as short word (2 bytes)
     c    set as code
     C    define flag color (fc)
     d    set as data
     e    end of function
     f    analyze function
     F    format
     i    immediate base (b(in), o(ct), d(ec), h(ex), s(tr))
     j    merge down (join this and next functions)
     k    merge up (join this and previous function)
     h    highlight word
     m    manpage for current call
     n    rename flag used at cursor
     r    rename function
     R    find references /r
     s    set string
     S    set strings in current block
     u    undefine metadata here
     x    find xrefs to current address (./r)
     w    set as 32bit word
     W    set as 64bit word
     q    quit menu
 

That’s a lot of options, but the one you want here sets the ‘Hello World!…$’ bytes to data. So press d again (to ‘set as data’). The menu will go away, as will the assembly that radare2 attempted to disassemble ‘Hello World!…$’ into. It’s replaced with essentially the raw bytes, which you can also see in the other views.
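If you would rather script these steps than press keys, r2pipe can send the same kind of commands from Python. This is only a sketch under some assumptions: the `Cd` command is used here as the command-line counterpart of the menu’s ‘set as data’, the helper names are made up, and the offset and length are hypothetical values that you would replace with whatever you selected in cursor mode. Running it needs the third-party r2pipe package and radare2 itself.

```python
def session_commands(data_offset: int, data_len: int, n_instr: int = 8):
    """Build r2 commands that mirror the interactive steps."""
    return [
        "e asm.bits=16",                      # same setting used earlier
        f"Cd {data_len} @ {data_offset:#x}",  # assumption: mark bytes as data
        f"pd {n_instr} @ 0",                  # print disassembly from the start
    ]


def run_session(path: str, data_offset: int, data_len: int):
    """Run the commands through r2pipe and return each command's output."""
    import r2pipe  # third-party; imported here so the sketch loads without it

    r2 = r2pipe.open(path)
    try:
        return [r2.cmd(c) for c in session_commands(data_offset, data_len)]
    finally:
        r2.quit()


# Hypothetical offset and length, not measured from simple.com.
commands = session_commands(data_offset=0x8, data_len=15)
print("\n".join(commands))
```

Calling `run_session("simple.com", 0x8, 15)` would then return the text radare2 prints for each command, with the disassembly as the last item.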

So now you have a way to go from compiled binary to the assembly that created it. This is what reverse engineers do every day, but often with more complex 32-bit and 64-bit executables. If you want to go deeper, start looking at the ebook Radare2 Explorations which has pointers to a lot of useful information and techniques.

Have fun!
