Friday, October 14, 2016

radare2 redux: Single-Step Debug a 64-bit Executable and Shared Object.


Introduction


In a previous post, I shared a debugging session of a 32-bit Linux executable (ELF 32-bit LSB) and a shared object (library) using radare2. This post builds on that information and presents the same general steps, but on a 64-bit driver program and a 64-bit shared library. I've also included installation steps for a recent version of radare2 (0.10.6). Recent versions have added improvements that are relevant to this exercise. However, it's important to point out that our exercise here takes advantage of a very small subset of radare2's capabilities.

Our sample program, cybtest, and its library, libcyb64, haven't changed since our last post on this topic. We're just going to use 64-bit versions of them. As the diagram below illustrates, the driver program, cybtest, wants to make a simple call to a library routine, SHA1Init. Since the shared object, libcyb64.so, is dynamically linked, the actual call in cybtest reaches the Procedure Linkage Table (PLT), where an indirect jump is made to an entry point address that corresponds to the desired routine in the shared object. When the routine ends, its simple "ret" instruction returns immediately to the next sequential instruction in the driver program. Within the shared object itself, subroutines are called directly.
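To make the indirection concrete, here is a toy Python model of a call that goes through the PLT. It is only a sketch under stated assumptions: the names "library", "got", "resolver" and "plt_call" and the address used are hypothetical and do not come from the cybtest binaries. A real PLT stub is a few machine instructions that jump through a Global Offset Table (GOT) slot, and with lazy binding that slot is filled in by the dynamic linker on the first call.

```python
# Toy model of a call through the PLT (all names and addresses are hypothetical).
library = {"SHA1Init": 0x7F0000000418}  # symbol -> address inside the loaded shared object

got = {}           # one slot per imported symbol, filled in on first use
resolved_log = []  # records every lookup performed by the stand-in dynamic linker


def resolver(symbol):
    """Stand-in for the dynamic linker: look the symbol up and patch the GOT slot."""
    resolved_log.append(symbol)
    got[symbol] = library[symbol]
    return got[symbol]


def plt_call(symbol):
    """Stand-in for a PLT stub: jump through the GOT slot, resolving it if still empty."""
    target = got.get(symbol)
    if target is None:
        target = resolver(symbol)
    return target


first = plt_call("SHA1Init")   # first call: the slot is empty, so the resolver runs
second = plt_call("SHA1Init")  # later calls: the slot already holds the address
print(hex(first), hex(second), resolved_log)
```

Both calls land on the same address, but the lookup happens only once. That is why single-stepping into the first call from the driver passes through extra code before reaching the library routine.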

Installing Radare2


Before we get into debugging, though, let's walk through an installation of radare2. In this post, we'll install from a compressed archive that we download using "wget", GNU's non-interactive network downloader. As with the previous post on this topic, we are installing radare2 as a user (not root). The host operating system is Ubuntu Linux 16.04.1 (Xenial Xerus) x86_64 on Linux kernel 4.4.
 

After the download completes, decompress and expand the archive.


Enter the folder created by the archive expansion and use "configure" to prepare for building radare2.




At the end of the configuration, a report is displayed showing the product version and other useful information.


Now we build radare2 by issuing the make command and install using "make install".



Building the Driver and Shared Object


With our installation complete, we can check the radare2 version with the "-v" argument.


Now, we'll rebuild our test driver program and shared object. The source code for both components is written in assembly language. We assemble using the Netwide Assembler (NASM) and link using gcc. The Makefile creates both 32-bit and 64-bit versions of the driver and the library. It also includes commands to copy the library to /usr/local/lib and set up a symbolic link to the shared object.


After building the program components, use the "file" command to verify that both files are executables of the proper size and are stripped of symbolic information.
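For readers who want to see where "file" gets the first part of its answer, here is a minimal Python sketch that decodes the leading fields of an ELF header. The function name "describe_elf" and the hand-built sample header are assumptions made for illustration. The sketch does not try to detect whether a file is stripped, which requires reading the section headers.

```python
import struct

ELF_CLASS = {1: "32-bit", 2: "64-bit"}
ELF_DATA = {1: "LSB", 2: "MSB"}
ELF_TYPE = {1: "relocatable", 2: "executable", 3: "shared object"}
ELF_MACHINE = {3: "Intel 80386", 62: "x86-64"}


def describe_elf(header):
    """Summarize the first 20 bytes of an ELF file (hypothetical helper)."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    ei_class, ei_data = header[4], header[5]
    endian = "<" if ei_data == 1 else ">"
    # e_type and e_machine are two 16-bit fields starting at byte offset 16.
    e_type, e_machine = struct.unpack_from(endian + "HH", header, 16)
    return "ELF {} {} {}, {}".format(
        ELF_CLASS.get(ei_class, "unknown class"),
        ELF_DATA.get(ei_data, "unknown encoding"),
        ELF_TYPE.get(e_type, "type {}".format(e_type)),
        ELF_MACHINE.get(e_machine, "machine {}".format(e_machine)),
    )


# Hand-built sample header: 64-bit, little-endian, ET_DYN (3), EM_X86_64 (62).
sample = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<HH", 3, 62)
print(describe_elf(sample))
```

To try it on a real file, pass the first 20 bytes read in binary mode, for example describe_elf(open("cybtest", "rb").read(20)).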



Starting the Debug Session


Now we start our radare2 session, indicating our driver program with the "-d" (debug) argument.


When radare2 starts, it reports the process ID (PID) of the instance of the driver program, cybtest, and attaches to it. The base address (baddr) of the driver program is reported here as 0x00400000.

Now in radare2, we use "dm" to display mapped memory areas and "is" to inspect symbolic names.


There is a lot of information displayed here and we should note the following:

1. Only the driver (cybtest) and the loader (ld-2.23.so) are loaded into memory. Our shared object, libcyb64, is not yet loaded.
2. Our driver code occupies a read-only segment at 0x40 0000. Our driver data occupies writable memory at 0x60 0000.
3. The loader code resides at 0x7f53 9a1a 5000. We can see that our program's current instruction pointer is displayed as part of our command prompt and is within the loader's code segment, at offset 0xcc0 from its base.
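The "which mapping am I in, and at what offset" reasoning from the list above can be sketched in Python. This is an illustration only: it parses text in the Linux /proc/<pid>/maps layout rather than radare2's own "dm" output, and the sample lines (sizes, device and inode fields, and file paths) are hypothetical. Only the three start addresses and the 0xcc0 offset are taken from the session described here.

```python
def parse_maps(text):
    """Parse /proc/<pid>/maps style lines into (start, end, perms, path) tuples."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        start, end = (int(part, 16) for part in fields[0].split("-"))
        path = fields[5] if len(fields) > 5 else ""
        regions.append((start, end, fields[1], path))
    return regions


def locate(regions, address):
    """Return (path, offset from that file's lowest mapping) for the region holding address."""
    for start, end, _perms, path in regions:
        if start <= address < end:
            base = min(r[0] for r in regions if r[3] == path)
            return path, address - base
    return None


# Hypothetical listing: only the start addresses match the walkthrough.
SAMPLE = """\
00400000-00401000 r-xp 00000000 08:01 1001 /home/user/cybtest
00600000-00601000 rw-p 00000000 08:01 1001 /home/user/cybtest
7f539a1a5000-7f539a1cb000 r-xp 00000000 08:01 2002 /lib/x86_64-linux-gnu/ld-2.23.so
"""

regions = parse_maps(SAMPLE)
path, offset = locate(regions, 0x7F539A1A5CC0)
print(path, hex(offset))
```

With the instruction pointer at 0x7f539a1a5cc0, the lookup reports the loader mapping and an offset of 0xcc0, matching the prompt address noted in item 3.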

Now we will enter "visual" mode using the "V" command and page using the "p" command until we see a disassembly of the loader code. We have not yet entered our driver program, but we'll see where the entry point will be.


In visual mode, we can compare our instruction pointer, rip, to our command prompt to verify that it displays the address of our next instruction. The disassembled source of the loader is displayed. The loader makes a call to a subroutine that will return the entry point of our driver program in rax. The loader preserves this value in r12 and will finally jump to that address at the bottom of the routine shown above. Using the "F8" key to single-step over this call, we would see the entry address of our driver, 0x400040, loaded into rax and then into r12.

While all this is fascinating, our point is to debug our own code, not the loader, so we will set a break point at the "jmp r12" instruction and execute the loader code up to that point. To do this, enter the ":" command to open a command prompt while still in visual mode. Enter the "db" command to set a break point and use the address of the "jmp" instruction as the command argument. Rather than type it, I used the mouse to select the address text from the screen and paste it after the "db" command. 















Entering the "db" command with no arguments displays all current break points.

Now, simply pressing Enter will close the command prompt and our break point is indicated now with a "b" on the "jmp" instruction line.


Using the "F9" key, which is mapped to the "dc" (debug|continue) command by default, advances execution to the "jmp" instruction.


Single-stepping from here, using the "F7" key, mapped to the "ds" (step into) command, will take us to our driver program's entry point.

Debugging the Driver Program



We'll start by setting some break points in our driver program. We'll set a break point after each "call" instruction. While in visual mode, we can use the "b" command to toggle a break point at the top-most instruction. We can also use the "j" and "k" keys (or the up and down arrow keys) to scroll through the instructions.


When we have set all our break points, we can use the "." (period) key to return the display to our current instruction.


To illustrate the single-step a little more closely, we press the "F7" key again now. Note how the updated display provides a "dynamic" comment indicating where our current instruction pointer (rip) address is.


Pressing "F7" again follows the call to "sym.imp.SHA1Init". While we want to end up in our shared object, this local call only takes us into code that is generated as part of the Procedure Linkage Table, located just above our driver code.


Here we will pause and exit visual mode for a short time while we set some break points in our shared object. There are two reasons we do this.

1. Hitting "F7" here will cause radare2 to erroneously branch to an invalid address.
2. Setting break points in a shared object using radare2 isn't as straightforward as it probably should be due to some interesting addressing anomalies we'll look at below.

So, first, we exit visual mode using the "q" (quit) key, returning us to command mode. Then we'll use the "dm" command to display mapped memory again.


Notice first that our shared object, libcyb64, now appears in our list of memory segments. It is shown as having its code segment loaded at 0x7f58 9e25 b000. Keep that address in mind.

Next we'd like to see what symbols are exported in our shared object and determine their addresses. We'll use the "dmi" command with our shared object name as an argument. This command displays rows for shared object symbols that are imported by our driver program.

But ... notice the virtual address (vaddr) values displayed for our shared object symbols. These are addresses in our driver program address space, NOT our shared library address space.


Next, look at the offset address (paddr). For functions, as it turns out, these offsets are correct for routines in the shared library. For example, the SHA1Init routine does indeed exist at offset 0x418 in the shared object's code segment. But it does NOT occur at memory address 0x00400418. 

What is at address 0x00400418? It's a garbage address in the driver's code segment, as we can see here. Here is the "dmi" command again, followed by a "pD" command displaying our driver code. Notice that 0x00400418, which "dmi" suggests is the address of a library routine, is actually in the middle of an instruction in our driver.


So, the vaddr values are meaningless, but the paddr values are valid offsets. So, to reach our shared object routine addresses, we'll need to combine the base address of the shared object code segment, that we noted above, with the offset values from our "dmi" command.

Here's an example of doing that to reach the SHA1Init routine in the shared object. We capture the base address of the shared object and use a "s" (seek) command to that address plus the offset of the routine. Yes, radare2 commands can do math on arguments. Now, when we seek to that address, our command prompt changes. Entering the "pD 38" command displays our SHA1Init routine in the library. 


Debugging the Shared Object


With our current seek address set to a valid address in our shared object, we can re-enter visual mode with the "V" command and set break points as we'd like using the "b" toggle.


Here is an expanded listing showing break points at the four shared object routines called by our driver: HashGetDigest (+0x3e7), SHA1Init (+0x418), SHA1Update (+0x43e) and SHA1End (+0x486).
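The base-plus-offset arithmetic used above is easy to check outside radare2. The Python sketch below combines the four offsets just listed with the shared object's code segment base and prints the matching "db <address>" commands. The base is an assumption: the page-aligned value 0x7f589e25b000 is used here, and it will normally differ from one debugging session to the next. The helper name "breakpoint_commands" is hypothetical.

```python
LIB_BASE = 0x7F589E25B000  # ASSUMED page-aligned base of libcyb64's code segment
DRIVER_BASE = 0x400000     # base address of the driver, as reported by radare2

# Offsets of the four routines called by the driver.
OFFSETS = {
    "HashGetDigest": 0x3E7,
    "SHA1Init": 0x418,
    "SHA1Update": 0x43E,
    "SHA1End": 0x486,
}


def breakpoint_commands(base, offsets):
    """Build one "db <address>" command per routine, in ascending address order."""
    ordered = sorted(offsets.values())
    return ["db 0x{:x}".format(base + offset) for offset in ordered]


commands = breakpoint_commands(LIB_BASE, OFFSETS)
print("\n".join(commands))

# The misleading vaddr shown by "dmi" is the same offset added to the wrong base.
misleading = DRIVER_BASE + OFFSETS["SHA1Init"]
print(hex(misleading))
```

The last line reproduces 0x400418, the address that falls in the middle of a driver instruction, which shows that the vaddr column is simply the library offset added to the driver's base.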


With our shared object break points set, we can return to our current instruction using the "." key.




















We can now debug into our shared object using the "F9" key (dc command) and we will stop at our break point in SHA1Init.


Now that we are in our shared object, "call" instructions will take us directly to other library routines. We can single-step into these routines using "F7", or we can step over the call using "F8", mapped to the "dso" command.

As an example, we single step to the call into HashInit.


"F7" takes us directly to HashInit.


When we reach a repeated command, such as "rep stosb", recent versions of radare2 allow a single "ds" command to execute all iterations and proceed to the next instruction immediately.


When we reach the end of HashInit, the "ret" instruction returns to "SHA1Init". That routine's "ret" instruction returns directly to our driver program.


With our break points set in both our driver and shared object, we can continue through to the end of our program this way. If we reach a point in a routine where we simply want to proceed to the next return instruction, the "dcr" command will let us do that.

Here, in SHA1End, we again use the ":" to open a command prompt in visual mode and enter the "dcr" command.


Notice something interesting about the "dcr" command. Radare2 reported "break points" at four addresses where break points weren't actually set. These addresses correspond to the end of repeated commands ("rep stosb") and the return addresses after "call" instructions that were actually executed. 

Press "Enter" to return to visual mode.

At the "ret" instruction, control again returns to our driver program.


Now, instead of stepping into the last routine we'll call, we'll simply use the "F9" key to continue execution. Doing this, the break point we set at HashGetDigest will stop us in the shared object routine.


Repeating our use of "dcr" and returning to the driver program, we single step until just before the "syscall" instruction.


Here, our driver program has loaded the first byte of the computed SHA1 hash of our test message, "A9", into AL and, subsequently, into the RDI register, to be the return value from our program.
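As a cross-check on that first digest byte, Python's hashlib can compute a SHA1 digest independently of our library. Note the assumption: this post does not state what the test message is. The standard three-character test vector "abc" is used below only because its SHA1 digest happens to begin with a9, so treat the choice of message as illustrative rather than as the driver's actual input.

```python
import hashlib

# ASSUMPTION: the post does not name the test message; "abc" is the standard
# SHA1 test vector, and its digest begins with 0xa9.
message = b"abc"

digest = hashlib.sha1(message).digest()
first_byte = digest[0]  # the byte the driver would place in AL, then RDI

print(digest.hex())
print(hex(first_byte), first_byte)
```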

Note that in the x64 application binary interface (ABI) on Linux, we use the "syscall" instruction to issue a system call, instead of the 32-bit ABI method of calling "int 0x80".

The function code that "syscall" expects in rax (0x3c) indicates a program exit request. The first call argument register, rdi, contains our program result code.

Finally, to exit radare2, use the "q" key to exit visual mode first. Then the "q" command again to exit radare2.


Once our code is debugged, we can observe the same result code returned using a simple shell. (0xa9 = 169).
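The same observation can be made from Python. In this sketch a child Python process stands in for cybtest (an assumption, so that the example runs without the driver binary) and exits with status 0xa9. The parent then reads the status back, just as the shell does.

```python
import subprocess
import sys

SYS_EXIT = 0x3C  # the function code the driver places in rax before "syscall"

# A child Python process stands in for cybtest and exits with status 0xa9.
child = subprocess.run([sys.executable, "-c", "import sys; sys.exit(0xa9)"])

print(SYS_EXIT)          # the exit request number in decimal
print(child.returncode)  # the status seen by the parent process
```

To check the real driver instead, replace the argument list with the path to cybtest.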


Conclusion


Well, that concludes this 2nd look at Radare2! We've demonstrated installation from an archive, a simple driver with shared object including the Makefile, stepping into the loader and locating our driver, using base plus offset seek, stepping through shared library routines, setting break points and some related debug commands. Future blog posts will compare this experience to similar tools and in other environments and will turn the focus to the implemented algorithms themselves.




Friday, October 7, 2016

Debugging a Linux Shared Library Using Radare (r2)


Radare (rada.re) is a portable reversing framework that can perform many powerful disassembly, debugging, analysis and exploitation functions. Here, we use a recent version, radare2 (or r2) to demonstrate single-step debugging into a shared library on Ubuntu Linux (Xenial LTS 16.04.01). 

Our shared library, libcyb, contains several simple message digest, or "hash" functions, implementing the SHA1 and SHA256 algorithms. These leverage common and reusable HASH functions. 

This library has been written in 32-bit assembly language using Intel mnemonics, assembled using NASM (Netwide Assembler) and linked using gcc, including the "-s" option to strip the library. The library exports a total of four global function entry points. 

A second program, cybtest, is a 32-bit assembly language driver also assembled with NASM and linked using gcc. The driver program is not stripped. Calls from the driver into the shared object (library) are coded using a simple call syntax, e.g. "call SHA1Update".

NASM assembles these calls as near calls, which the linker (ld) resolves to calls into the Procedure Linkage Table (PLT). At run time, the PLT entry resolves the address in the shared object and JMPs to the library routine.

If we use the "file" command, we can confirm the nature of these two executable files.



We start our debugging session by entering the "r2" command with a single argument pair, the "-d" to indicate we want to debug an object, and the name of our driver program, "cybtest".



Radare starts by reporting its process ID and presenting a command prompt. We will introduce several commands during this post, but these represent only a small fraction of the commands available in Radare.

Our first two commands help us understand the context of our debugging session and what objects are loaded into memory. The "dm" (show memory maps) command displays how program segments are currently mapped in memory. The "is" (info|symbols) command displays addressable program symbols found by Radare.



Notice here that the addresses (addr=) of all program symbols are within the mapped memory of the cybtest program. Also notice that our shared object, libcyb32.so, has not yet been mapped into memory at all.

To get our shared object loaded into memory, we need to let startup loader code in ld-2.23.so execute. But we want to pause at the beginning of our driver program logic. To do that, we set a break point at the beginning of our cybtest program, which was defined in our source code with the label "_start". This label is addressable as we can see from the output of "is" above. 

To set the break point, we use the "db" (debug|breakpoint) command. The "db" command without arguments displays current break points. At the start of the debug session, none are set. The "db sym._start" command sets a break point at the symbol "_start". Note Radare's command syntax here, using "sym." preceding the actual symbol name.



After setting the break point, the "db" command now reports the break point at the address of our cybtest program code. With the break point set, we now issue the "dc" (debug|continue execution) command to run our cybtest program until it reaches the break point we just set. 

Notice below, that after issuing "dc", the address "prompt" value changes to indicate we have reached the address of our break point - "0x080482c0", corresponding to what the "db" command reported above.



Now, when we issue the "dm" command again, we see additional mapped memory areas, including some for our shared object, libcyb32, in a wildly different area of memory than our driver program.

We'd like to see what new addressable symbols we have with our shared object loaded. Unfortunately, "is" won't show them. Instead, we use "dmi" (list symbols in target library) with the library name as an argument.



Note that both local and global symbols are shown, the last four in the list above corresponding to shared object symbols imported by our driver program.

We want to set break points now in our shared object. To do that we use the "s" (seek) command with the address of one of our imported library symbols as an argument, "s 0xf7773280". I seem to have success with this if I choose the first (lowest address) imported symbol, in this case, the address of "imp.HashGetDigest". 

Notice that the "s" command updates the prompt address to the address we just sought (seeked?).

Now, we use the "V" (enter visual mode) command.




In visual mode, we can toggle through several views using the "p" key. The initial view is a memory dump. Since we used the "s" (seek) command to seek to our shared object's HashGetDigest function, the memory area we are looking at is a code segment.

We can press "p" until the disassembly view appears as shown below. When the disassembly listing appears, press the "b" key. Notice the "b" indicator at address 0xf7773280, the start of the HashGetDigest routine. That "b" indicates that we have set a break point at that address.



Since the routines in our shared object are not imported by our driver program, they are not addressable using the "sym." notation which we used to set our break point at "_start". So, we can either set break points using the "db <address>" format or scroll using the up and down arrow keys to bring a line to the top of our disassembly view and press "b". Here, we scroll to the start of the next routine at 0xf7773295 and press "b" to set the break point. 



Now that we have set some break points in our shared object, we can return to our driver program. To do this, simply press the "." (period) key. Notice that our disassembly view changes the address to the _start label address, 0x080482c0. We press "p" again to toggle the view to include 16 double-word stack values and our 32-bit general purpose registers. The current instruction pointer address is highlighted and our break point at _start is indicated with the "b" indicator.



Notice that in our driver program, Radare recognizes the symbolic names of the target address of our call instructions. Remember that since these call instructions are intended to reach code in our shared object, the calls here simply call code generated in the Procedure Linkage Table (PLT).

We don't necessarily want to step through all the code between the driver and the shared object. This is why we set the break points in our shared object. Now, instead of stepping "into" the call using "ds" (debug|step into) command, we can step "over" a call and let the code execute until the break point using the "dso" (debug|step over) command.

It may be helpful to mention that, by default, several debug commands are mapped to keyboard function keys. To see these mappings, along with other configuration settings, you can issue the "e" (evaluable) command. To do this while in View mode, exit View mode first by pressing "q" (quit) to return to the Radare prompt. Then enter "e". 



Now, return to View mode using the "V" command and press "p" (page) until the stack and general purpose registers are visible. Try a single-step now using the "F7" key, which is mapped to the "ds" command.



The "eip" register value is updated and the disassembly view now shows our new current instruction pointer, highlighted. At this point we will use the "dc" (debug|continue) command to run until the break point we set in the shared object on the SHA1Init routine. Since "dc" is mapped to "F9", simply press "F9".

At first it looks like nothing happened, except that the highlight disappeared from our current instruction. In fact, our "eip" shows us now at a new location. Press "." to update the disassembly view.



In our case, we can now single step into ("ds" or F7) several shared object routines, or step over them with "dso" or F8. At times we reach a point where the repeat "rep" opcode prefix is used on a store or move instruction. Both "ds" and "dso" operate in a way that executes only one iteration. To execute all iterations of the repeated instruction, we can use a "conditional" debug execute command, "dsi".



At the source code line shown above, issuing the "dsi ecx=0" command will instruct the debugger to continue until the ecx general purpose register contains a value of zero. In this case, that will leave us at the next sequential instruction.
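The difference between one step and "run until ecx is zero" can be modelled in a few lines of Python. This is a toy model, not radare2 code: the register dictionary, the buffer and the helper names are hypothetical, and it assumes the direction flag is clear so that edi counts upward.

```python
def step_rep_stosb(memory, regs):
    """Execute one iteration of "rep stosb": store AL at [EDI], advance EDI, decrement ECX."""
    if regs["ecx"] == 0:
        return False  # nothing left to repeat; execution moves to the next instruction
    memory[regs["edi"]] = regs["eax"] & 0xFF
    regs["edi"] += 1  # assumes the direction flag is clear
    regs["ecx"] -= 1
    return True


def run_until_ecx_zero(memory, regs):
    """Keep single-stepping until ECX reaches zero, as "dsi ecx=0" is described as doing."""
    steps = 0
    while regs["ecx"] != 0:
        step_rep_stosb(memory, regs)
        steps += 1
    return steps


memory = bytearray(b"\xff" * 8)
regs = {"eax": 0, "ecx": 5, "edi": 1}

step_rep_stosb(memory, regs)  # a single step stores just one byte
after_one = (bytes(memory), regs["ecx"])

steps = run_until_ecx_zero(memory, regs)  # the remaining iterations
print(after_one, steps, bytes(memory), regs)
```

After the single step only one byte has been cleared and ecx has dropped from 5 to 4. The conditional run then performs the remaining four iterations and stops with ecx at zero.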



Issuing the "dsi ecx=0" command while in View mode requires us to exit View mode with "q", enter the "dsi ecx=0" command, and then re-enter View mode with "V". To avoid this, simply define a function key to issue "dsi ecx=0" if this is a condition that is often tested in your program logic loops.



At this point, we have introduced enough commands to manage stepping through our driver program and shared object. We have not looked at inspecting data areas, modifying data or register values. These may be covered in a later post. To summarize, here are the commands we introduced in this post:


b - toggle a break point at the current line in View mode
db - set a break point
db sym.<symbol> - set a break point at the address of a symbol
dc - continue program execution
dm - show mapped memory
dmi - show library symbols
dsi - execute until a condition is true
dso - step over instruction
e - display configuration settings, including function-key mappings
is - inspect (show) symbols in the current program
p - toggle through display pages in View mode
q - in View mode, exit View mode; in command mode, exit Radare
s - seek to symbol or address
V - enter View mode
. - in View mode, disassemble at current instruction pointer address