“Step-over” is a common feature of debuggers. It allows us to avoid stepping into a subroutine, which is especially useful if the subroutine is thousands of lines long, or an operating system API, etc. In some debuggers, it also allows the user to step out of a loop or skip a repeated string instruction. So what’s the downside? That depends on the debugger.
The most common attack against step-over involves self-modifying code, where the byte at the destination of the breakpoint is replaced by another instruction. By stepping over the replacing instruction, the breakpoint is removed, and uncontrolled execution results. The Obsidian Debugger and Titan Engine are vulnerable to the simplest implementation of that:
mov b [offset l1], 0b0h
l1: mov al, 1
This is of course unfair to Obsidian. Obsidian is a “non-intrusive” debugger (unlike Titan Engine), which means that it does not attach to the process. Instead, it allows stepping by reading and writing process memory, and placing “jmp $” instructions instead of breakpoints at the target address. Thus, it is not designed to handle self-modifying code.
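To make the overwrite concrete, here is a minimal Python sketch of a breakpoint-based step-over running against the two-instruction sequence above. It is a toy model, not the behaviour of any particular debugger: the interpreter understands only the opcodes it needs, and the names run, step_over and build, the flat memory layout, and the trailing hlt that stands in for "execution carried on" are all assumptions made for illustration.

```python
import struct

INT3, HLT = 0xCC, 0xF4

def run(mem, ip):
    """Toy CPU: run until an int3 is fetched or the hlt sentinel is reached."""
    al = 0
    while True:
        op = mem[ip]
        if op == INT3:
            return ("breakpoint", ip)          # debugger regains control
        if op == HLT:
            return ("ran_free", al)            # no breakpoint was ever hit
        if op == 0xC6 and mem[ip + 1] == 0x05:  # mov byte [abs32], imm8
            addr = struct.unpack_from("<I", mem, ip + 2)[0]
            mem[addr] = mem[ip + 6]
            ip += 7
        elif op == 0xB0:                        # mov al, imm8
            al = mem[ip + 1]
            ip += 2
        else:
            raise ValueError(f"unmodelled opcode {op:#x} at {ip}")

def step_over(mem, ip, next_ip):
    """Hypothetical step-over: plant int3 at the next instruction, then run."""
    saved = mem[next_ip]
    mem[next_ip] = INT3
    event = run(mem, ip)
    if event[0] == "breakpoint":
        mem[next_ip] = saved                    # restore the original byte
    return event

def build(target):
    """mov byte [target], 0b0h / l1: mov al, 1 / hlt / one spare byte."""
    return bytearray(b"\xC6\x05" + struct.pack("<I", target) + b"\xB0"
                     + b"\xB0\x01" + b"\xF4" + b"\x00")

L1 = 7                                          # offset of "l1: mov al, 1"
hostile = step_over(build(L1), 0, L1)           # write lands on the int3
benign = step_over(build(10), 0, L1)            # write lands on the spare byte
print(hostile, benign)
```

In the hostile case the write puts 0xB0 back over the planted 0xCC, so the toy CPU never fetches a breakpoint; in the benign case the same instruction writes elsewhere and the step-over completes as intended.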
Of course, most debuggers aren’t written that way, and they also don’t use breakpoints for stepping over ordinary instructions, so some knowledge of a given debugger is necessary. Unfortunately, Rock Debugger and FDBG are vulnerable to a simple trick: placing a “repeat” instruction in front of the replacing instruction, like this:
l1: mov b [offset l1], 90h
If a step-over is attempted at l1, then execution will resume freely from l2. However, a more subtle attack is possible. A debugger has no way of knowing if the breakpoint that it places at a location is the one that is executed. If the application removes the breakpoint, it can also restore the breakpoint afterwards, and then jump to the address to execute that breakpoint. The debugger will see the breakpoint exception that it was expecting to see, and behave as normal, like this:
mov al, 90h
l1: xor ecx, ecx
mov edi, offset l3
l2: rep stosb
cmp al, 0cch
l4: mov al, 0cch
In this example, stepping over the instruction at l2 will allow the code to reach l4. This will cause the breakpoint to be replaced by l2 and executed by l3. The debugger will then regain control. At that time, the only obvious difference will be that the AL register will hold the value 0xCC instead of 0x90, which will allow l5 to be reached in what appears to be one pass instead of two. Of course, much more subtle variations are possible, including the execution of entirely different code-paths.
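The remove-then-restore idea can also be sketched in Python. This is not a transcription of the listing above: the byte layout, the toy interpreter, and the names run and step_over are hypothetical, chosen only to show that the debugger receives exactly the breakpoint it expected even though extra instructions ran first.

```python
import struct

INT3, NOP, HLT = 0xCC, 0x90, 0xF4

def run(mem, ip):
    """Toy CPU: returns (event, ip, al, instructions_executed)."""
    al, steps = 0, 0
    while True:
        op = mem[ip]
        if op == INT3:
            return ("breakpoint", ip, al, steps)
        if op == HLT:
            return ("ran_free", ip, al, steps)
        steps += 1
        if op == 0xC6 and mem[ip + 1] == 0x05:  # mov byte [abs32], imm8
            addr = struct.unpack_from("<I", mem, ip + 2)[0]
            mem[addr] = mem[ip + 6]
            ip += 7
        elif op == 0xB0:                        # mov al, imm8
            al = mem[ip + 1]
            ip += 2
        elif op == 0xEB:                        # jmp short rel8
            rel = struct.unpack_from("<b", mem, ip + 1)[0]
            ip += 2 + rel
        elif op == NOP:
            ip += 1
        else:
            raise ValueError(f"unmodelled opcode {op:#x} at {ip}")

BP = 7  # address where the hypothetical debugger plants its breakpoint
program = bytearray(
    b"\xC6\x05" + struct.pack("<I", BP) + b"\x90"    # 0:  remove breakpoint
    + b"\x90"                                        # 7:  nop (int3 goes here)
    + b"\xB0\xCC"                                    # 8:  mov al, 0cch
    + b"\xC6\x05" + struct.pack("<I", BP) + b"\xCC"  # 10: restore breakpoint
    + b"\xEB\xF4"                                    # 17: jmp short back to 7
    + b"\xF4"                                        # 19: hlt (never reached)
)

def step_over(mem, ip, next_ip):
    """Plant int3 at next_ip, run, and restore the byte if the int3 is hit."""
    saved = mem[next_ip]
    mem[next_ip] = INT3
    event = run(mem, ip)
    if event[0] == "breakpoint":
        mem[next_ip] = saved
    return event

result = step_over(program, 0, BP)
print(result)
```

From the debugger's side, the breakpoint fires at the expected address and the original byte is put back afterwards. Only the register contents and the instruction count, which a real debugger does not see, reveal that more than one instruction was executed.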
On the other hand, an indirect call that points directly to a “return” instruction should be safe to step over, right? Not in OllyDbg. It’s possible to construct a sequence such that stepping over the call will cause uncontrolled execution! Turbo Debug32 has a similar problem, but it’s not the special call->ret sequence that’s at fault. Instead, it’s the fact that Turbo Debug32 miscalculates the length of certain call instructions (a variation of a bug that I described in a paper last year), allowing a jump instruction to be inserted after the call, which is not “seen” by the debugger. By stepping over the call, the jump is reached and executed instead of the breakpoint, resulting in uncontrolled execution.
WinDbg is not vulnerable to these attacks, but it is vulnerable to a detection method because of how it implements step-over for instructions with certain prefixes (or, in this case, redundant prefixes). The problem is that WinDbg uses the single-step method to step over instructions with redundant prefixes. If the instruction being stepped over is the pushfd instruction, then the T flag will be saved on the stack image, and WinDbg cannot prevent that from occurring, like this:
test ah, 1
Even debuggers have bugs.
– Peter Ferrie