radare2 redux: Single-Step Debug a 64-bit Executable and Shared Object.
In a previous post, I shared a debugging session of a 32-bit Linux executable (ELF 32-bit LSB) and a shared object (library) using radare2. This post builds on that information and presents the same general steps, but on a 64-bit driver program and a 64-bit shared library. I've also included installation steps for a recent version of radare2 (0.10.6). Recent versions have added improvements that are relevant to this exercise. However, it's important to point out that our exercise here takes advantage of a very small subset of radare2's capabilities.
Our sample program, cybtest, and its library libcyb64, haven't changed since our last post on this topic. We're just going to use 64-bit versions of them. As the diagram below illustrates, the driver program, cybtest, wants to make a simple call to a library routine, SHA1Init. Since the shared object, libcyb64.so, is dynamically linked, the actual call in cybtest reaches the Procedure Linkage Table (PLT), where an indirect jump is made to an entry point address that corresponds to the desired routine in the shared object. When the routine ends, its simple "ret" instruction returns immediately to the next sequential instruction in the driver program. Within the shared object itself, subroutines are called directly.
Before we get into debugging, though, let's walk through an installation of radare2. In this post, we'll install from a compressed archive that we download using "wget", GNU's non-interactive network downloader. As with the previous post on this topic, we are installing radare2 as a user (not root). The host operating system is Ubuntu Linux 16.04.1 (Xenial Xerus) x86_64 on Linux kernel 4.4.
After the download completes, decompress and expand the archive.
Enter the folder created by the archive expansion and use "configure" to prepare for building radare2.
At the end of the configuration, a report is displayed showing the product version and other useful information.
Now we build radare2 by issuing the make command and install using "make install".
With our installation complete, we can check the radare2 version with the "-v" argument.
Building the Driver and Shared Object
Now, we'll rebuild our test driver program and shared object. The source code for both components is written in assembly language. We assemble using the Netwide Assembler (NASM) and link using gcc. The Makefile creates both 32-bit and 64-bit versions of the driver and the library. It also includes commands to copy the library to /usr/local/lib and set up a symbolic link to the shared object.
After building the program components, use the "file" command to verify that both files are 64-bit ELF objects and are stripped of symbolic information.
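As a rough cross-check of what "file" reports, the ELF class can also be read straight from the file header. The sketch below is only an illustration: the helper name elf_class is made up for this post, and it assumes the two build outputs are named cybtest and libcyb64.so in the current directory. It looks at the four magic bytes and at the EI_CLASS byte at offset 4 (1 means 32-bit, 2 means 64-bit); it does not check whether the file is stripped.

```shell
# elf_class FILE: print 32 or 64 according to the ELF header, else "unknown".
elf_class() {
    # The first four bytes of an ELF file are 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo unknown
        return 0
    fi
    # Byte 4 is EI_CLASS: 1 = 32-bit, 2 = 64-bit.
    case "$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')" in
        1) echo 32 ;;
        2) echo 64 ;;
        *) echo unknown ;;
    esac
}

# Assumed file names; files that are not present are skipped.
for f in cybtest libcyb64.so; do
    if [ -f "$f" ]; then
        echo "$f: $(elf_class "$f")-bit ELF"
    fi
done
```

For this exercise both files should report 64.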
Starting the Debug Session
Now we start our radare2 session, indicating our driver program with the "-d" (debug) argument.
When radare2 starts, it reports the process ID (PID) of the instance of the driver program, cybtest, and attaches to it. The base address (baddr) of the driver program is reported here as 0x00400000.
Now in radare2, we use "dm" to display mapped memory areas and "is" to inspect symbolic names.
There is a lot of information displayed here and we should note the following:
1. Only the driver (cybtest) and the loader (ld-2.23.so) are loaded into memory. Our shared object, libcyb64, is not yet loaded.
2. Our driver code occupies a read-only segment at 0x40 0000. Our driver data occupies writable memory at 0x60 0000.
3. The loader code resides at 0x7f53 9a1a 5000. We can see that our program's current instruction pointer is displayed as part of our command prompt and is within the loader's code segment, at offset 0xcc0 from its base.
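The offset mentioned in item 3 is just the instruction pointer minus the base of the mapping that contains it. Here is a minimal sketch of that arithmetic in the shell; the helper name map_offset is hypothetical, and the rip value is simply the loader base from this session plus the 0xcc0 offset noted above.

```shell
# map_offset ADDRESS BASE: print how far ADDRESS lies beyond BASE, in hex.
map_offset() {
    printf '0x%x\n' $(( $1 - $2 ))
}

# rip from the command prompt, then the loader's code segment base from "dm".
map_offset 0x7f539a1a5cc0 0x7f539a1a5000    # prints 0xcc0
```

Because the loader is mapped at a different address on each run, the offset is the part that stays recognizable from one session to the next.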
Now we will enter "visual" mode using the "V" command and page using the "p" command until we see a disassembly of the loader code. We have not yet entered our driver program, but we'll see where the entry point will be.
In visual mode, we can compare our instruction pointer, rip, to our command prompt to verify that it displays the address of our next instruction. The disassembled source of the loader is displayed. The loader makes a call to a subroutine that will return the entry point of our driver program in rax. The loader preserves this value in r12 and will finally jump to that address at the bottom of the routine shown above. Using the "F8" key to single step over this call, we would see the entry address of our driver, 0x400040, loaded into rax and then into r12.
While all this is fascinating, our point is to debug our own code, not the loader, so we will set a break point at the "jmp r12" instruction and execute the loader code up to that point. To do this, enter the ":" command to open a command prompt while still in visual mode. Enter the "db" command to set a break point and use the address of the "jmp" instruction as the command argument. Rather than type it, I used the mouse to select the address text from the screen and paste it after the "db" command.
Entering the "db" command with no arguments displays all current break points.
Now, simply pressing Enter will close the command prompt and our break point is indicated now with a "b" on the "jmp" instruction line.
Using the "F9" key, which is mapped to the "dc" (debug|continue) command by default, advances execution to the "jmp" instruction.
Single-stepping from here, using the "F7" key, mapped to the "ds" (step into) command, will take us to our driver program's entry point.
Debugging the Driver Program
We'll start by setting some break points in our driver program. We'll set a break point after each "call" instruction. While in visual mode, we can use the "b" command to toggle a break point at the top-most instruction. We can also use the "j" and "k" keys (or up and down arrow keys) to scroll up and down instructions.
When we have set all our break points, we can use the "." (period) key to return the display to our current instruction.
To illustrate the single-step a little more closely, we press the "F7" key again now. Note how the updated display provides a "dynamic" comment indicating where our current instruction pointer (rip) address is.
Pressing "F7" again follows the call to "sym.imp.SHA1Init". While we want to end up in our shared object, this local call only takes us into code that is generated as part of the Procedure Linkage Table, located just above our driver code.
Here we will pause and exit visual mode for a short time while we set some break points in our shared object. There are two reasons we do this.
1. Hitting "F7" here will cause radare2 to erroneously branch to an invalid address.
2. Setting break points in a shared object using radare2 isn't as straightforward as it probably should be due to some interesting addressing anomalies we'll look at below.
So, first, we exit visual mode using the "q" (quit) key, returning us to command mode. Then we'll use the "dm" command to display mapped memory again.
Notice first, that our shared object, libcyb64, now appears in our list of memory segments. It is shown as having its code segment loaded at 0x7f58 9e25 b0000. Keep that address in mind.
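If you'd like a second opinion on that base address from outside radare2, the kernel's own view of the process is in /proc/PID/maps, using the PID that radare2 reported at startup. The sketch below is an assumption-laden illustration rather than part of the original session: the function name lib_code_base is made up, and it simply prints the start address of the first executable mapping whose path contains the given library name.

```shell
# lib_code_base MAPS_FILE NAME: print the start address of the first
# executable mapping whose pathname contains NAME.
# Each maps line looks like: start-end perms offset dev inode pathname
lib_code_base() {
    awk -v lib="$2" '
        index($2, "x") && index($NF, lib) {
            split($1, range, "-")
            print "0x" range[1]
            exit
        }' "$1"
}

# Demo against the current process tree, looking for the C library.
# During a debug session this would be /proc/<pid of cybtest>/maps
# with "libcyb64" as the name.
if [ -r /proc/self/maps ]; then
    lib_code_base /proc/self/maps libc
fi
```

The value it prints for libcyb64 should agree with the code segment base shown by "dm".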
Next we'd like to see what symbols are exported in our shared object and determine their addresses. We'll use the "dmi" command with our shared object name as an argument. This command displays rows for shared object symbols that are imported by our driver program.
But ... notice the virtual address (vaddr) values displayed for our shared object symbols. These are addresses in our driver program address space, NOT our shared library address space.
Next, look at the offset address (paddr). For functions, as it turns out, these offsets are correct for routines in the shared library. For example, the SHA1Init routine does indeed exist at offset 0x418 in the shared object's code segment. But it does NOT occur at memory address 0x00400418.
What is at address 0x00400418? It's a garbage address in the driver's code segment, as we can see here. Here is the "dmi" command again, followed by a "pD" command displaying our driver code. Notice that 0x00400418, which "dmi" suggests is the address of a library routine, is actually in the middle of an instruction in our driver.
In short, the vaddr values are meaningless, but the paddr values are valid offsets. To reach our shared object routine addresses, we'll need to combine the base address of the shared object code segment, which we noted above, with the offset values from our "dmi" command.
Here's an example of doing that to reach the SHA1Init routine in the shared object. We capture the base address of the shared object and use an "s" (seek) command to that address plus the offset of the routine. Yes, radare2 commands can do math on arguments. Now, when we seek to that address, our command prompt changes. Entering the "pD 38" command displays our SHA1Init routine in the library.
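The same base-plus-offset sum is easy to double-check outside radare2. The shell sketch below uses a hypothetical base address, since the library lands somewhere different on every run; substitute the code segment base from your own "dm" output. The helper name sym_addr is made up for this illustration, and 0x418 is the SHA1Init offset that "dmi" reported.

```shell
# Hypothetical code segment base; take the real one from your "dm" output.
base=0x7f589e25b000
# Offset (paddr) of SHA1Init as reported by "dmi".
offset=0x418

# sym_addr BASE OFFSET: print the absolute address of a library routine.
sym_addr() {
    printf '0x%x\n' $(( $1 + $2 ))
}

sym_addr "$base" "$offset"                 # prints 0x7f589e25b418
echo "s $(sym_addr "$base" "$offset")"     # the seek command to type in radare2
```

Letting radare2 do the addition in the seek argument gives the same result; the script is only a sanity check on the number.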
Debugging the Shared Object
With our current seek address set to a valid address in our shared object, we can re-enter visual mode with the "V" command and set break points as we'd like using the "b" toggle.
Here is an expanded listing showing break points at the four shared object routines called by our driver: HashGetDigest (+0x3e7), SHA1Init (+0x418), SHA1Update (+0x43e) and SHA1End (+0x486).
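As an alternative to toggling each break point by hand, the four "db" commands can be generated from the offsets and typed at the radare2 prompt. This is only a sketch: the base address is a hypothetical placeholder for the code segment base in your own session, and bp_commands is a made-up helper name.

```shell
# Hypothetical code segment base; use the value from your own "dm" output.
base=0x7f589e25b000

# bp_commands BASE OFFSET...: print one "db <address>" line per offset.
bp_commands() {
    seg=$1
    shift
    for off in "$@"; do
        printf 'db 0x%x\n' $(( $seg + $off ))
    done
}

# Offsets of HashGetDigest, SHA1Init, SHA1Update and SHA1End.
bp_commands "$base" 0x3e7 0x418 0x43e 0x486
```

With the placeholder base, the first line printed is "db 0x7f589e25b3e7" and the last is "db 0x7f589e25b486".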
With our shared object break points set, we can return to our current instruction using the "." key.
We can now debug into our shared object using the "F9" key (dc command) and we will stop at our break point in SHA1Init.
Now that we are in our shared object, "call" instructions will take us directly to other library routines. We can single-step into those routines using "F7", or we can step over the call using "F8", mapped to the "dso" (step over) command.
As an example, we single step to the call into HashInit.
"F7" takes us directly to HashInit.
When we reach a repeated instruction, such as "rep stosb", recent versions of radare2 allow a single "ds" command to execute all iterations and proceed to the next instruction immediately.
When we reach the end of HashInit, the "ret" instruction returns to "SHA1Init". That routine's "ret" instruction returns directly to our driver program.
With our break points set in both our driver and shared object, we can continue through to the end of our program this way. If we reach a point in a routine where we simply want to proceed to the next return instruction, the "dcr" command will let us do that.
Here, in SHA1End, we again use the ":" to open a command prompt in visual mode and enter the "dcr" command.
Notice something interesting about the "dcr" command. radare2 reported "break points" at four addresses where break points weren't actually set. These addresses correspond to the end of repeated instructions ("rep stosb") and the return addresses after "call" instructions that were actually executed.
Press "Enter" to return to visual mode.
At the "ret" instruction, control again returns to our driver program.
Now, instead of stepping into the last routine we'll call, we'll simply use the "F9" key to continue execution. Doing this, the break point we set at HashGetDigest will stop us in the shared object routine.
Repeating our use of "dcr" and returning to the driver program, we single step until just before the "syscall" instruction.
Here, our driver program has loaded the first byte of the computed SHA1 hash of our test message, 0xA9, into AL and, subsequently, into the RDI register, to be the return value from our program.
Note that in the x64 application binary interface (ABI) on Linux, we use the "syscall" instruction to issue a system call, instead of the 32-bit ABI method of calling "int 0x80".
The function code that "syscall" takes in rax (0x3c) indicates a program exit request. The first call argument register, rdi, contains our program result code.
Finally, to exit radare2, use the "q" key to exit visual mode first. Then enter the "q" command again to exit radare2.
Once our code is debugged, we can observe the same result code from a simple shell command (0xa9 = 169).
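Here is a small sketch of that check. Since cybtest itself may not be at hand, a child shell that exits with 169 stands in for it; that stand-in is an assumption for illustration only, and in a real session the first command would be ./cybtest.

```shell
# Stand-in for ./cybtest: a child process that exits with status 0xa9.
status=0
sh -c 'exit 169' || status=$?

# The shell reports the exit status in decimal; show it in hex as well.
printf 'exit status: %d (0x%x)\n' "$status" "$status"   # prints: exit status: 169 (0xa9)
```

The exit status is the low byte of the value passed in rdi, which is why the first byte of the hash survives intact as the result code.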
Well, that concludes this second look at radare2! We've demonstrated installation from an archive, a simple driver with a shared object (including the Makefile), stepping into the loader and locating our driver, using a base-plus-offset seek, stepping through shared library routines, setting break points, and some related debug commands. Future blog posts will compare this experience to similar tools and other environments, and will turn the focus to the implemented algorithms themselves.