A while back I needed to find out what an executable named HDSPOOF.EXE was doing to my system. (This article is based upon an early version of the program found in the hdspooferv2.0.rar WinRAR file. An updated version of the program is available at www.taurine.game-deception.com as hwspoofv2.1.rar; the points and code fragments noted throughout this discussion are the same, and only the addresses have changed in the newer version.) Starting the program from the command line produced the following display:
The program produced no other visible result than creating a configuration file named HDSPOOF.INI in the program’s installation directory. However, a proprietary hardware identification driver and test program I had written for a client now generated different results after executing this program. Clearly something on my system had changed. A little investigation revealed that the program had created and started a dynamic driver on the system and was trying to hide evidence of its presence. The driver was visible with a random name in my utility, NTDevices (available at my website, www.smidgeonsoft.com; look for an entry in the index minus the .SYS file extension), but the file for the driver had been deleted from my hard drive. Deleting the configuration file would not restore the expected results. There were still entries for the driver in the system registry, but under a key whose name differed from the display name. Furthermore, rebooting the system and rerunning the program would create a driver with a new random name and new registry entries, but would still “spoof” the hardware identification program. Time to fire up a static analyzer program and then the debugger!
Munged Program PE-File Structure
Running DUMPBIN with /HEADERS and /IMPORTS on the HDSPOOF executable produced the output found in Listing 1. (Alternatively, you can use my utility, PEBrowse Professional, available on my website, to examine the file.) I immediately saw that there are four unnamed sections, two containing code, and that there are four imports from KERNEL32: LoadLibraryA, GetProcAddress, VirtualAlloc, and VirtualFree. A closer inspection of the output, though, showed that some things are “not quite right” with this PE file. For starters, the loader flags field holds 0xABDBFFDE (normally 0x0) and the number of directories is 0xDFFFDDDE (normally 0x00000010), both values that are bizarre and seemingly incorrect. (In fact, starting SoftICE and then this program results in a complete lock-up of the system; but after hacking a version of HDSPOOF with more “normal” values, I was able to run both at the same time.) In three of the four sections, the virtual size differs from the size of raw data, which is sometimes zero or an astronomically high number, e.g., 0xD1EEBD1E. Finally, the entry point for the program lies in the second of the two sections that contain code. Disassembling this address generated a jumble of call and jump statements that make little or no sense. And yet the program was able to run and “complete” successfully! Perhaps a debugging session could shed more light on what is going on.
Getting Started — The 1st Exception
Launching the HDSPOOF program inside a debugger (I used my own debugger, PEBrowse Interactive V7.22.2, found at the SmidgeonSoft website), I immediately hit an access violation exception at 0x0041A418. (If you decide to use my debugger to follow along, I strongly suggest that you disable disassembly analysis mode, which can be done via Tools/Configure and then, on the Disassembly tabsheet, checking the “Disable analyze mode?” checkbox. Otherwise, single stepping will take a long time as my debugger attempts to analyze and list all of the jump and call targets from the current EIP. Also, I have included in the downloads a file named HDSpoof(RoadMap).txt that contains directions, notes, and important addresses for successfully stepping through this program.) Backing up a bit in the disassembly, I saw the following statements:
0x41A416: XOR EAX,EAX
0x41A418: MOVZX EAX,DWORD PTR [EAX]
Simple enough: the program is trying to access memory from a null pointer. Now, in the discussion that follows I am not going to explain all of the false leads and paths you can take (and I probably saw as I stepped through this beast). What you will read are the results of many hours of careful exploration and meticulous note taking of the techniques used by this program to stop any reverse engineering activity. I will, however, point out some false leads especially when they illustrate some of the deception designed to frustrate one’s debugging efforts.
Now what does this initial exception mean? The program when run from the command prompt ended normally, so there must be an exception handler in the code (which I find at 0x0041ADC9 using Debug/Exception Handlers in PEBrowse Interactive). Perhaps continuing from the exception will shed more light. I hit another access violation at 0x0042836C (with its exception handler set to 0x00428D1E) created by dereferencing a second null pointer:
0x42836C: XOR EBX,EBX
0x42836E: MOV EBX,DWORD PTR [EBX]
Well, the last exception was “handled”, so there seems to be no reason to be concerned; I continue. Another exception pops up at a very unusual address!
0xF9CC21EB: ENTER 0x4ABB,0x1
Oh, and there are no exception handlers for this one except for the default handler created when Windows starts a program. I am now in trouble as I am about to be booted unceremoniously out of the program. But the program completed successfully when run outside of the debugger, so I need to dig deeper into the code starting after the initial exception and inside the 1st exception handler.
I have included in Listing 2 a disassembly of the 64 statements one sees when single stepping through the code. The listing is broken into two sections: 1) the “normal” display (with portions of the object code highlighted); and 2) the actual statements that are executed (with the important lines underlined). If you carefully examine the object code for the statements that actually get executed, you will see that they are indeed “present” in the first half of the disassembly listing, but marbled throughout the odd series of jump, call, and other miscellaneous statements. You hardcore x86 assembly language junkies will note that, by a clever arrangement of short jump and call statements, the author of this code invokes statements that are for the most part “hidden” in the initial listing. Note the careful preservation of the stack, which hardly moves during all of the call statements (without return statements!). One very pleasing sequence occurs at 0x0041ADC9-0x0041ADD0, where execution jumps forward, then back, then forward again.
Among the 64 statements only a few do really useful work, and these are the underlined statements in the second half of Listing 2. Looking at these 8 statements I see two unusual RDTSC statements. The IA-32 Intel Architecture Software Developer’s Manual, Volume 2B names this instruction “Read Time-Stamp Counter” with the description “Loads the current value of the processor’s time-stamp counter into the EDX:EAX registers.” The EAX portion of the result of the 1st RDTSC statement is pushed onto the stack and then subtracted from the 2nd time-stamp counter read. A comparison statement a little later on then jumps to one of two locations depending on the size of the difference between the two reads. What does this mean? Well, if the program is executing normally outside of a debugger, the difference between the two RDTSC statements will be negligible, certainly less than the compare’s 0xFFF clock ticks (about 2 microseconds on a 2 GHz machine). But if I am single stepping through the code with a debugger, the difference will be greater, and the “normal” route that this condition takes me down leads to an INT 3 statement followed by the privileged OUT instruction (as shown in the listing). That raises a 2nd exception, which if you remember occurs inside of an exception handler, and that means curtains for the debugging session. So, in order to override this anti-single-stepping code I need to change the flags set by the subtraction statement to equality (or to less than). (In PEBrowse Interactive I select Edit/EFlags from the Registers window, uncheck the Carry and Overflow checkboxes if necessary, and check the Zero/Equals checkbox.) Furthermore, this must be done multiple times since the same pattern appears over and over again as the program creeps forward. Alternatively, one can locate the target of the first jump statement and force this as the next EIP (Edit/Set EIP here in the disassembly window in PEBrowse Interactive).
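To make the arithmetic concrete, here is a minimal Python model of the check just described. It is a sketch, not HDSPOOF's code: the function names and the 2 GHz clock rate are my own assumptions, and treating the branch as "below or equal" is borrowed from the JBE seen in a later fragment. The 32-bit mask reflects the fact that only the EAX half of the counter takes part in the subtraction.

```python
MASK32 = 0xFFFFFFFF

def ticks_to_seconds(ticks: int, hz: float = 2e9) -> float:
    """Convert a time-stamp counter delta to seconds at an assumed clock rate."""
    return ticks / hz

def passes_timing_check(first_eax: int, second_eax: int,
                        threshold: int = 0xFFF) -> bool:
    """Model RDTSC / PUSH EAX ... RDTSC / SUB EAX,[ESP] / CMP EAX,threshold.

    Returns True when the delta is small enough to stay on the 'real' path.
    """
    delta = (second_eax - first_eax) & MASK32   # 32-bit subtraction wraps
    return delta <= threshold
```

A free-running program sees a delta of a few hundred ticks and passes; a single-stepped one accumulates far more than 0xFFF ticks between the two reads and fails, which is why the flags have to be patched by hand each time.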
Removing and Creating Exception Handlers
If I continue single stepping through this maze of jumps and calls and carefully control the results of the comparisons (16 times), I reach the code found in Listing 3. The code takes several DWORDs from the stack and XORs them together in register ECX. These DWORDs are actually the debug registers Dr0, Dr1, Dr2, and Dr3 from the context structure passed as the 2nd parameter to RtlDispatchException inside KiUserExceptionDispatcher. For more information, see Matt Pietrek’s “A Crash Course on the Depths of Win32 Structured Exception Handling,” Microsoft Systems Journal, January 1997. The code then follows with:
0x41B782: MOV ESP,DWORD PTR [ESP+0x8]
0x41B786: POP DWORD PTR FS:[0x0]
Checking the list of exception handlers before and after the execution of these two statements I note that all handlers except for the default are now removed — this means that if the code raises any exception from here on the debugging session will be immediately terminated. I need to be very careful from now on! Also I will need to remember that value stored in the ECX register even though the current code does not do anything with it now.
Jumping ahead a little bit in the discussion, you will see that I will encounter a sequence of instructions like the following (see Listing 4):
0x41F52B: ADD DWORD PTR [ESP],0x9CA
0x41F532: PUSH DWORD PTR FS:[0x0]
0x41F539: MOV DWORD PTR FS:[0x0],ESP
This creates a new exception handler whose address is the value on top of the stack plus 0x9CA. The code is manipulating the exception handlers’ list to control both whether an exception is handled at all and where the handling code lives! This opens some interesting possibilities for controlling the flow of the code, for example, by diverting its path in the wrong direction depending on conditions one creates and sets in the program.
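The pointer arithmetic can be checked with a few lines of Python. This is only a model under stated assumptions: I assume the ADD at 0x41F52B is itself the return address of a preceding short CALL (as it visibly is in the 0x419A55 sequence shown later), and the SehChain class is a hypothetical stand-in for the FS:[0] list, not a Windows structure.

```python
def handler_address(return_address: int, offset: int) -> int:
    """CALL pushes the next instruction's address; ADD [ESP],offset
    turns that return address into the handler's address."""
    return (return_address + offset) & 0xFFFFFFFF

class SehChain:
    """Toy model of the FS:[0] list: newest handler first."""
    def __init__(self, default_handler: int):
        self.records = [default_handler]

    def install(self, handler: int) -> None:
        # PUSH DWORD PTR FS:[0x0] / MOV DWORD PTR FS:[0x0],ESP
        self.records.insert(0, handler)

    def first_handler(self) -> int:
        # The handler that gets first crack at the next exception.
        return self.records[0]
```

With the addresses from the article, handler_address(0x41F52B, 0x9CA) gives 0x41FEF5 and handler_address(0x419A5A, 0x136F) gives 0x41ADC9, the two handler addresses that show up in the debugger.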
1st Timing Check
Plowing ahead past 16 more comparisons I run into the handful of statements:
0x41C139: SUB EAX,DWORD PTR [ESP]
0x41C13C: ADD ESP,0x4
0x41C13F: CMP EAX,0x12345678
0x41C144: JBE 0x41C14B
0x41C146: JMP 0x42505E
Until this point I have grown accustomed to a comforting sequence: an RDTSC instruction followed by a PUSH, a second counter read, and then the inevitable subtraction. But what is the meaning of a single RDTSC instruction? It clearly looks like the many other timing checks I have circumvented so far, so where is the source of the value that was pushed onto the stack? If you remember, I launched the program and started single stepping from inside the exception handler. However, I could have started single stepping from the program’s entry point instead. Is this where the missing time-stamp value was set?
Restarting from the entry-point breakpoint, I see the familiar pattern of short jump and call statements with the anti-single stepping check and eventually reach the code:
0x419A54: PUSH EAX
0x419A55: CALL 0x419A5A
0x419A5A: ADD DWORD PTR [ESP],0x136F
0x419A61: PUSH DWORD PTR FS:[0x0]
0x419A68: MOV DWORD PTR FS:[0x0],ESP
0x419A6F: JMP 0x419A76
Here I see a lonely RDTSC instruction and its EAX result pushed onto the stack. This is followed by a sequence similar to the one in Listing 4 (note the hard-coded pointer arithmetic) where the program is setting up an exception handler. Stepping past another set of 16 time-stamp comparisons (am I noticing a pattern here?) I reach the null pointer dereference that caused the program’s initial access violation, the sequence beginning at address 0x0041A416! And now I can guess the reason behind the comparison using the threshold of 0x12345678 instead of the more common 0xFFF. If a debugger stops at the access violation and gives one a chance to “do things” like setting breakpoints, examining memory, etc., the number of elapsed clock ticks will easily surpass the 0x12345678 limit, forcing the comparison down the path to 0x0042505E (note the second, unconditional jump statement at 0x0041C146). This direction eventually leads to the 2nd access violation we saw when I first started debugging the program and to the death of our debugging session. The threshold is high enough that if HDSPOOF is running outside of a debugger the code will follow its “real” path. There is also a small element of randomness in the check (since the EDX result is discarded), so that sometimes the comparison will take us down the real path even under a debugger. So the purpose of this comparison is to check whether a debugger has trapped the omnipresent null pointer dereference and, if so, to send the debugging session in the wrong direction.
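Some rough numbers help here. The sketch below assumes a 2 GHz clock and, for the "randomness" estimate, assumes that after a long pause the low 32 bits of the tick difference are spread evenly over their range; both are my assumptions, and the function names are illustrative only.

```python
THRESHOLD = 0x12345678

def threshold_seconds(threshold: int = THRESHOLD, hz: float = 2e9) -> float:
    """How long the program may sit between the two RDTSC reads."""
    return threshold / hz

def takes_real_path(elapsed_ticks: int, threshold: int = THRESHOLD) -> bool:
    """Only EAX is compared, so the elapsed count is taken modulo 2**32."""
    return (elapsed_ticks & 0xFFFFFFFF) <= threshold

def chance_of_real_path(threshold: int = THRESHOLD) -> float:
    """Odds that a long, arbitrary pause still lands under the threshold."""
    return (threshold + 1) / 2**32
```

Under these assumptions the window is roughly 0.15 seconds, far longer than an unattended exception dispatch needs and far shorter than any human pause at a breakpoint, while the wrap-around of the 32-bit counter lets about 7% of long pauses slip through onto the real path.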
Unpacking Code – 1st Section
Now I seem to be getting somewhere, since with the debugger I can force execution down a path the original coder of HDSPOOF was trying to hide (see Listing 5). What is this sequence doing? First it preserves the contents of ECX on the stack. It then makes another of those short calls, immediately pops the return address, and adds a hard-coded offset of 0x9C4 to it, which neatly takes care of the return address on the stack. Note this sequence; heavy use of this trick is made later in the program. EDI now contains 0x0041CB17. If you examine a portion of Listing 5 you will see that this address seems to contain junk code (though with what I have already seen I cannot be too certain of this). The code then pops a value off of the stack and adds 0x15 to it. It next takes the byte at 0x0041CB17, XORs the two values together, and stores the result back into 0x0041CB17. Finally, in a loop lasting 147 (0x93) iterations, the bytes following 0x0041CB17 are modified with the same unpacking value, producing the results you see displayed in the last section of Listing 5. The program is unobfuscating itself as it runs, in this case revealing the next important piece of work to be performed by HDSPOOF! One obvious conclusion from observing this trick in action is that any static analysis performed ahead of time on HDSPOOF is likely to be incomplete (unless one is able to identify and modify ranges like these in the analysis tool).
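The unpacking loop itself is tiny. Here is a minimal Python equivalent, with made-up byte values standing in for the real code at 0x0041CB17 (I do not reproduce the program's bytes here, and xor_range is my own name for the loop):

```python
def xor_range(image: bytearray, start: int, length: int, key: int) -> None:
    """XOR `length` bytes of `image` in place, beginning at offset `start`."""
    for offset in range(start, start + length):
        image[offset] ^= key

# Hypothetical plaintext "code" and its packed form as stored in the file.
plain = bytes(range(0x40, 0x40 + 0x93))
packed = bytearray(b ^ 0x15 for b in plain)

# What the program does at run time: one pass with the key restores the code.
xor_range(packed, 0, 0x93, 0x15)
```

Because XOR is its own inverse, the same routine that unpacks the range would also repack it, which is what a static analysis tool has to replay before the disassembly of such a range makes sense.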
Another, not so obvious fact is that this activity effectively prevents setting breakpoints at known addresses in advance. Why is this? Think about how most debuggers create and respond to user breakpoints: the debugger replaces a byte at the target address with 0xCC, i.e., INT 3, and when EIP reaches this address and the interrupt is generated and handled, the debugger restores the original byte and forces the processor to re-execute the instruction at the address where the breakpoint exception happened. Now, if the breakpoint is added ahead of time, the above code will happily XOR our INT 3, effectively wiping out the breakpoint. Worse, the resulting instruction will more than likely lead to some exception that the program is not prepared to handle; remember that at this point in the code all except the default exception handler have been removed.
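A three-byte experiment shows what happens to a breakpoint planted too early. The original bytes below are hypothetical (a conventional function prologue, not taken from HDSPOOF), and the helper names are mine:

```python
INT3 = 0xCC
KEY = 0x15

def unpack(image: bytes, key: int = KEY) -> bytes:
    """Return the image with every byte XORed with the unpacking key."""
    return bytes(b ^ key for b in image)

original = bytes([0x55, 0x8B, 0xEC])      # hypothetical real instructions
packed = bytearray(unpack(original))      # what sits in memory before unpacking

packed[0] = INT3                          # debugger plants a breakpoint early
result = unpack(bytes(packed))            # program then unpacks over it
```

The first byte of result is 0xCC XOR 0x15 = 0xD9: neither the breakpoint nor the original instruction byte, so the debugger never gets its interrupt and the processor executes garbage.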
Anti-debugging Trick in Program Startup
Some of you may now be wondering about the value that was calculated and stored in ECX by the code in Listing 3 and then pushed onto and popped off of the stack at the beginning of Listing 5. It was mysteriously involved in the unobfuscation algorithm we have just examined. When I stepped through this code the value of this DWORD was zero, so in effect it added nothing to the algorithm’s calculation. But why was it calculated, preserved, and then referenced in the first place? When I first started to examine HDSPOOF I sometimes got confused after executing the code in Listing 5: sometimes the code beginning at 0x0041CB17 was correctly unobfuscated, other times it was junk. Then it dawned on me that I was starting my debug sessions using varying combinations of startup conditions. My debugger allows one to ignore the initial break on entry-point execution (see Tools/Configure and the Debug tabsheet under “Break on entry-point execution?”). It also allows one to ignore exceptions and let a program’s 1st chance exception handling take over. Coupled with the fact that the starting address in Listing 5 was known in advance and that it was not unobfuscated by the program, there are three similar paths that can be taken in order to reach this point in the code. First, there is breaking on entry-point execution and then single stepping from there. Second, there is single stepping after breaking on the 1st null pointer exception. Finally, one could break on the entry-point execution, continue, break on the null pointer exception, set a code breakpoint at 0x0041C14B inside the exception handler, and continue. In the first two cases I saw that the last byte of the calculated ECX value was 0x00, but in the third it was 0xF9! Now, I will be honest and say that I cannot explain why the system would hold different values on the stack for these three scenarios. But if the last byte in ECX is something other than 0x00, the unobfuscation value, whose default is 0x15, would be different and the code unpacking would generate corrupt instructions.
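The effect of that stray byte is easy to quantify. The sketch below assumes the key is formed by adding 0x15 to the low byte of the saved ECX value with 8-bit wrap-around, which is my reading of the description rather than a transcription of the code; the sample bytes are hypothetical.

```python
def unpack_key(saved_ecx: int, base: int = 0x15) -> int:
    """Key = low byte of the saved ECX value plus 0x15, kept to 8 bits."""
    return ((saved_ecx & 0xFF) + base) & 0xFF

def xor_bytes(data: bytes, key: int) -> bytes:
    return bytes(b ^ key for b in data)

clean_key = unpack_key(0x00)          # no debug-register residue
tainted_key = unpack_key(0xF9)        # the value seen in the third scenario

plain = bytes([0x55, 0x8B, 0xEC])     # hypothetical real instructions
packed = xor_bytes(plain, 0x15)       # stored form, packed with the default key
```

With a clean ECX the key is 0x15 and xor_bytes(packed, clean_key) restores the code; with 0xF9 the key becomes 0x0E and every unpacked byte is wrong.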
Finding and Checking For A Breakpoint on GetProcAddress
Recalling that the limited upfront analysis of HDSPOOF showed the program importing GetProcAddress from KERNEL32.DLL, I might have been tempted to set an advance breakpoint on this routine. This would fail because of the following sequence. Continuing through another batch of time-stamp checks (16 again), I now reach the code that was revealed by the last unpacking operation (see Listing 6). If you decide to tackle this beast yourself, you can step through this section of the code and determine how it is doing what it is doing. Here I will instead direct you to some of the more interesting points in this code fragment and offer a general explanation of what is going on. First, it takes the address of the default exception handler and zeroes out the low word, effectively yielding an address somewhere inside of KERNEL32’s .text section. It then examines the first two bytes at that address to see if they are equal to “MZ”; if the comparison fails, it subtracts 0x1000 from the address and repeats the test. It is looking for the base address, or HMODULE, of KERNEL32 without performing a GetModuleHandle call or equivalent! Once it has found the correct image base, it uses the optional header to find the virtual address of the export table and sets a register to hold this address. Using this pointer, it next locates the RVA of the table holding the names of all of the APIs exported from KERNEL32 and examines the first 8 characters of each entry for the character sequence “GetProcA”. What does this mean? The code is trying to find the index of the API GetProcAddress inside of the table! Armed with this value, it then reaches into the ordinal table to grab the correct index into the Export Address Table and hence the entry point for GetProcAddress. But what possible reason could there be for the program to obtain this address when it is already importing this function via its import table? The next five statements provide the answer:
0x41CB83: CMP BYTE PTR [EDI],0xCC
0x41CB86: JNE 0x41CB91
0x41CB88: XOR ECX,ECX
0x41CB8A: XOR EDI,EDI
0x41CB8C: JMP 0x419A52
It is checking if a breakpoint has been set at the start of GetProcAddress and, if it finds one, sending the debugging session off on another false scent. This trap is easily circumvented if instead of naively setting the breakpoint at the beginning of GetProcAddress, I set it somewhere in the middle — best near the end where EAX contains the found address.
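The whole search can be replayed in Python against a made-up memory image. Everything in build_fake_memory is invented for the demonstration (a tiny "module" at 0x10000 with two exports); only the header offsets (e_lfanew at 0x3C, the export directory RVA at 0x78 past the PE signature for a 32-bit image, and the export directory fields at 0x18 through 0x24) come from the PE format itself. The function names are mine.

```python
import struct

def u32(mem, off):
    """Read a little-endian DWORD."""
    return struct.unpack_from("<I", mem, off)[0]

def find_image_base(mem, addr):
    """Zero the low word, then step down 0x1000 at a time until 'MZ' appears."""
    addr &= 0xFFFF0000
    while addr >= 0:
        if mem[addr:addr + 2] == b"MZ":
            return addr
        addr -= 0x1000
    raise LookupError("no MZ header found")

def find_export(mem, base, prefix):
    """Entry point of the first export whose name starts with `prefix`."""
    pe = base + u32(mem, base + 0x3C)          # e_lfanew -> PE signature
    exp = base + u32(mem, pe + 0x78)           # export directory RVA (PE32)
    count = u32(mem, exp + 0x18)               # NumberOfNames
    funcs = base + u32(mem, exp + 0x1C)        # AddressOfFunctions
    names = base + u32(mem, exp + 0x20)        # AddressOfNames
    ords = base + u32(mem, exp + 0x24)         # AddressOfNameOrdinals
    for i in range(count):
        name = base + u32(mem, names + 4 * i)
        if mem[name:name + len(prefix)] == prefix:
            index = struct.unpack_from("<H", mem, ords + 2 * i)[0]
            return base + u32(mem, funcs + 4 * index)
    raise LookupError(prefix)

def entry_has_breakpoint(mem, entry):
    """Model CMP BYTE PTR [EDI],0xCC: only the first byte is inspected."""
    return mem[entry] == 0xCC

def build_fake_memory():
    """Invented 'address space' with one tiny module mapped at 0x10000."""
    mem, base = bytearray(0x30000), 0x10000
    def put(off, data):
        mem[base + off:base + off + len(data)] = data
    put(0x00, b"MZ")
    put(0x3C, struct.pack("<I", 0x80))                   # e_lfanew
    put(0x80, b"PE\0\0")
    put(0x80 + 0x78, struct.pack("<I", 0x200))           # export directory RVA
    put(0x218, struct.pack("<IIII", 2, 0x300, 0x320, 0x340))
    put(0x300, struct.pack("<II", 0x2000, 0x1000))       # function RVAs
    put(0x320, struct.pack("<II", 0x400, 0x420))         # name RVAs
    put(0x340, struct.pack("<HH", 1, 0))                 # name -> function index
    put(0x400, b"GetModuleHandleA\0")
    put(0x420, b"GetProcAddress\0")
    return mem, base
```

Starting from any address inside the fake module, find_image_base walks back to 0x10000 and find_export(mem, base, b"GetProcA") resolves to 0x12000. The last helper makes the weakness of the check plain: a 0xCC anywhere past the first byte of the routine goes unnoticed.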
2nd Code Unobfuscation
Since I have set the breakpoint in the manner just described, I continue to the final portion of the code in Listing 8: the initialization of an important internal data structure and another code unpacking loop. Observe that there is the same trick of a CALL followed by a POP and the addition of a hard-coded value. This provides the location of a data structure that will hold several critical pieces of information. The code stores away KERNEL32’s image base and the entry point for GetProcAddress. Then, after another short call in order to calculate an address, there follows more code unpacking. This time, though, it is a straightforward XOR of 3135 (0xC3F) bytes with the value 0x15, starting at 0x0041D575. At the end of Listing 8 you will find a small fragment of the code that is revealed, which not surprisingly contains the next major piece of work for HDSPOOF.
Some Misdirection From GetProcAddress Calls
Now, I will see what the breakpoints in GetProcAddress reveal. I let the debugger continue (instead of stepping through some more timing checks) and find these results (using PEBrowse Interactive’s register display):
Now, having set an additional breakpoint inside of LoadLibraryA, I can determine which DLLs in addition to KERNEL32 are of interest to the program:
Both of these HMODULE values are stored in that structure on return from the LoadLibraryA calls. There follow more calls to GetProcAddress:
Note that all of these entry points are neatly tucked away in the structure beginning at 0x0042B000. There are some promising leads in the saved addresses such as CreateFileA given what we know already about this program’s payload — a dynamically created and loaded device driver. If you are debugging this program yourself, you might want to set breakpoints on these routines as well.
Anti-Debugging From GetProcAddress Calls (2nd Timing Check)
Single stepping from the breakpoint inside of GetProcAddress for the GetModuleHandleA call, I now see the code starting at 0x0041D7C8 found in Listing 7. Note that by allowing the debugger to break inside of GetProcAddress I have bypassed more RDTSC torture but have also passed over several fragments of code found in Listing 7, in particular the companion RDTSC instruction to the one I now see in the debugger. Also, you will see that the comparison is done against another large value, 0xFFFFFF. Based on earlier observations of similar code we can guess that this is another check by the coder of HDSPOOF, guarding against breakpoints set inside of GetProcAddress. There is another possibility though: it prevents tools like BoundsChecker that hook APIs (in this case GetProcAddress) from revealing any important clues about the program’s operation. Presumably, the code in the hooked function is not fast enough to fall below the 0xFFFFFF threshold and will be misdirected by this second timing check.
Saving the IsDebuggerPresent Flag and More Unpacking
Another set of timing checks brings me to the code presented in Listing 8. A test against the CS register determines if its contents are less than 0xFF. If so, the code stores another piece of data in the structure beginning at 0x0042B000: the IsDebuggerPresent flag. The code locates the address of the Process Environment Block (PEB) and stores the contents of its 3rd byte, which indicates whether the process is being debugged, in the 4th DWORD inside the HDSPOOF data structure. It then reads the beginning of the load-order linked list found in the PEB and references the module item belonging to the main executable (which is always the first item in this list), modifying the image size field by adding 0xC8 and adding zero to the image base field. I do not understand the reason for these two operations, although they might confuse or defeat static analyzers or debuggers. The code now moves on to another unpacking operation, using the address calculation trick to start the unpacking at 0x0041EB7C with a value of 0x11 for 7506 (0x1D52) bytes. Our earlier experiences with unpacking activity suggest that the next piece of work in HDSPOOF begins somewhere around 0x0041EB7C. Let us see. Carefully single stepping through 16 more RDTSC checks, I receive a surprise as I check the EIP. The code that was so recently unpacked is more of the maze of short jumps and calls. Have I been led astray? No; by bulling ahead through another set of 16 timing checks I encounter the code previewed by Listing 4. The author of HDSPOOF is now setting up another exception handler, this time at 0x0041FEF5, and then invoking another access violation. Note that the null pointer reference and the exception handler are both within the range of bytes unpacked by the code examined in Listing 8. I let the debugger move on with a breakpoint set inside of the new exception handler.
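Keeping a small table of the unpacked ranges makes it easy to tell which stage an address belongs to. The sketch below uses the start addresses, lengths, and key values quoted in the article (the first key assumes the saved ECX byte was zero); the table and the stage_of helper are my own bookkeeping, not anything found in the program.

```python
STAGES = {
    "1st unpack (key 0x15)": range(0x41CB17, 0x41CB17 + 0x93),
    "2nd unpack (key 0x15)": range(0x41D575, 0x41D575 + 0xC3F),
    "3rd unpack (key 0x11)": range(0x41EB7C, 0x41EB7C + 0x1D52),
}

def stage_of(address: int):
    """Name of the unpacked range containing `address`, or None."""
    for name, span in STAGES.items():
        if address in span:
            return name
    return None
```

The third range ends at 0x4208CE, so both the handler setup at 0x41F52B and the handler itself at 0x41FEF5 fall inside it, while the dead-end access violation at 0x42836C lies outside every range listed here.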
The Correct Access Violation (3rd Timing Check)
Descending through more RDTSC madness I meet the code in Listing 9 and another timing check that spans an exception. There is some good news here if we note the EIP: this second exception is found in a new area of HDSPOOF. If you remember the inaugural debugging session and where the second access violation occurred, the debugger stopped at 0x0042836C; the exception handler for this error was 0x00428D1E and led fairly quickly to the demise of the debugging session. Now we are at 0x00420899, so it seems we are making slow but steady progress. If the code looks vaguely familiar, it is! Recall Listing 3 and you will see the same XOR’ing of values from the exception context structure into ECX. I suggested that the reason for this sequence was linked to how the debugging session treats breakpoints on program entry and the initial access violation. As someone suggested: “Yes, even virus writers practice good engineering principles of code reuse!” Apparently, we are seeing the same thing here, except that this time it is using the 2nd null pointer error. Forcing our way through the timing check by changing the results of the comparison to equals, we skip over some noteworthy code. If I allow this code to be executed, I see that one of the bytes in the HDSPOOF data area at 0x0042B000 will be assigned a value of 1. I am going to tentatively name this byte the exception handler flag. We are now entering a new phase of HDSPOOF activity.
Setting Anti-Debugger Flags (IsDebuggerPresent, NtQueryInformationProcess, Single Stepping, and Exception-Handling)
Two more RDTSC timing checks lead to:
0x420A2B: ADD ESP,0x4
0x420A2E: POP EAX
0x420A2F: POP ESI
0x420A30: MOV BYTE PTR [ESI+0xD],CL
0x420A33: JMP 0x420A3A
If we are not paying close attention to what is happening (easily imagined after so much mind-numbing repetition), we might just pass over this and not realize what is occurring. The second POP loads ESI with the address of the HDSPOOF structure, and a second byte in this area receives the least significant byte of ECX, which is zero. Do you remember the XOR sequence in the code immediately after the second access violation? Here, one byte of its result is squirreled away in the data structure. Another 16 time-stamp comparisons bring us to another code fragment that is also easily overlooked:
0x4213DA: CALL 0x4213E4
0x4213DF: AND EAX,0x732573
0x4213E4: CALL DWORD PTR [ESI+0x14]
0x4213E7: JMP 0x4213EE
After another CALL statement there is a call to an address stored in the data structure located at 0x0042B000 — the entry point for OutputDebugStringA. It looks like the code is starting to make use of all of those GetProcAddress calls performed a while back. Stepping over this CALL instruction adds the following line to the debugging log:
PID: 0x574 TID: 0x9CC %s%s
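The odd-looking AND instruction in that fragment deserves a second look. The CALL at 0x4213DA pushes 0x4213DF as its return address, which leaves a pointer to those five bytes on the stack as the argument for OutputDebugStringA. Opcode 0x25 is AND EAX with a 32-bit immediate, so the bytes can be rebuilt and read as text; the helper name below is mine.

```python
import struct

def inline_string_after_call(code: bytes) -> str:
    """Interpret the bytes that follow a CALL as a NUL-terminated ASCII string."""
    return code.split(b"\0", 1)[0].decode("ascii")

# AND EAX,0x732573 assembles to opcode 0x25 followed by a little-endian imm32.
and_eax = struct.pack("<BI", 0x25, 0x732573)
text = inline_string_after_call(and_eax)
```

The "instruction" is really the string that appears in the log line above: the disassembler is showing data embedded in the code stream.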
Not much interesting happening here except perhaps some leftover debugging code. Another series of anti-single stepping checks yields code found in Listing 10. There appears to be more substance here; examining the target of the two call statements through addresses in the data area, we find the following:
Now we are getting somewhere? We know that this program eventually (!) creates a dynamic driver. So perhaps this code is parsing the command line and using one of the parameters to create the file for the driver. But wait a minute. Running HDSPOOF from the command line did not require any input parameters. The README and DOS batch file that are part of the package (hdspooferv2.0.rar) did not mention anything about an input parameter. If we step through this code we will indeed see that the CreateFileA call returns 0xFFFFFFFF, i.e., INVALID_HANDLE_VALUE. This is merely another false lead in the seemingly random activity of this program.
Listing 11 contains what patient stepping through another round of timing checks rewards us with. ESI is still pointing at the data area, and the 17th DWORD holds the address of the NTDLL native API NtQueryInformationProcess. Looking this routine up in Gary Nebbett’s Windows NT/2000 Native API Reference, we see that a service code of 7 means that we wish to query “whether a debug port has been set or not.” Now this seems encouraging, enough so that we will now step carefully through the code to find out what is going on. In order to make this call, one first needs the process identifier, which GetCurrentProcessId supplies. Then a process handle must be opened with at least the PROCESS_QUERY_INFORMATION access right. Is HDSPOOF doing this? Yes, look at the statement at 0x0042276D and the call to OpenProcess. Finally, the Nebbett book states that this type of call returns a boolean, which is why the size of the output area is set to a DWORD. The result of the call is not actually returned in the EAX register, but as the 1st DWORD on the stack. Let us modify this value from 0xFFFFFFFF (meaning the debug port is set) to zero. Using PEBrowse Interactive, you can change the value via the menu item Edit/Modify.
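For readers who want to try the query themselves, here is a hedged Python sketch using ctypes. It departs from HDSPOOF in two ways that I should state plainly: it uses the GetCurrentProcess pseudo-handle instead of GetCurrentProcessId plus OpenProcess, and it uses a pointer-sized output buffer so that it also works in a 64-bit interpreter. The function names are mine, and off Windows the query simply returns None.

```python
import ctypes
import sys

PROCESS_DEBUG_PORT = 7   # the "service code of 7" from the text

def query_debug_port():
    """Return the debug port value for this process, or None off Windows."""
    if sys.platform != "win32":
        return None
    ntdll = ctypes.WinDLL("ntdll")
    kernel32 = ctypes.WinDLL("kernel32")
    kernel32.GetCurrentProcess.restype = ctypes.c_void_p
    handle = ctypes.c_void_p(kernel32.GetCurrentProcess())   # pseudo-handle
    port = ctypes.c_size_t(0)                                 # output area
    status = ntdll.NtQueryInformationProcess(
        handle, PROCESS_DEBUG_PORT,
        ctypes.byref(port), ctypes.sizeof(port), None)
    if status != 0:
        raise OSError(f"NtQueryInformationProcess failed: {status:#x}")
    return port.value

def looks_debugged(port_value: int) -> bool:
    """Any nonzero value means a debug port is attached."""
    return port_value != 0
```

Patching the returned DWORD to zero in the debugger, as described above, amounts to forcing looks_debugged to answer False.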
Checking the Anti-Debugger Flags
Mind you, it may seem that we have been subjected to endless random activity in our journey, but the end is in sight! We have before us (you guessed it) another set of 16 RDTSC checks that lands at the start of the code found in Listing 12. Now, when I step through this code with my debugger, the CS register contains 0x001B and I skip over the byte check in SetUnhandledFilterCheck. (As an experiment I tried forcing the execution down this alternate path through HDSPOOF. After another batch of time-stamp checks I reached the code fragment at 0x00423B0F, a division-by-zero exception, and, of course, there is only the default handler available at this point. But what would happen if the SetUnhandledFilterCheck were set up? Eventually the exception handling logic calls the UnhandledExceptionFilter API inside of KERNEL32, which would then pass the exception along to the code fragment starting at 0x004244BF in Listing 12. Since the return code from this handler is EXCEPTION_CONTINUE_EXECUTION, execution continues at 0x004244E0 and the tests we see coming up.)
Sixteen more time-stamp comparisons later we reach the code in Listing 13. While the next few statements feel as if they are fractured, the end result is once again filling the ESI register with a pointer to HDSPOOF’s data area at 0x0042B000. The test is against the IsDebuggerPresent byte that was set way back in Listing 8. Assuming we either zeroed out the flag in the PEB before it was read and stored or altered the contents of this byte after it was set, we continue on to the next test:
0x424EF2: CMP BYTE PTR [ESI+0xD],0x0
0x424EF6: JNE 0x4190AB
0x424EFC: JE 0x424F02
What — no intervening short jumps and calls? This test is against the values XOR’ed into ECX and saved at 0x00420A30 that, if we have been careful in our debugging startup conditions, is also zero. Single stepping a few more times delivers us to:
0x424F05: CMP BYTE PTR [ESI+0xE],0x0
0x424F09: JNE 0x4190AB
0x424F0F: CALL 0x424F1E
And a check against the results of the NtQueryInformationProcess call made in Listing 11. Finally, a couple more single stepping requests bring to light:
0x424F23: ADD ESP,0x8
0x424F26: CMP BYTE PTR [ESI+0xF],0x0
0x424F2A: JNE 0x4190AB
0x424F30: PUSH EAX
0x424F31: CALL 0x424F38
0x424F36: SUB DWORD PTR [EDX+0x58],EBX
0x424F39: IMUL EAX,EAX,0x3
0x424F3C: CALL 0x424F43
Here there is a test against the fourth and last flag byte in HDSPOOF’s data area, the byte saved in Listing 9 that records the result of the 3rd (and most recent) exception-handling timing check. Drum roll and trumpet fanfare please: we have now succeeded in convincing this program that it is not being debugged! Whew!
If you have been following along with me using my debugger or your own favorite, we are coming to the last leg of our journey. But just like in the Boston Marathon, Heartbreak Hill looms on the horizon. The good news is that we have successfully navigated through all of HDSPOOF’s tests and traps blocking us from seeing what is coming next. The bad news is that the program can now run to completion, installing the stealth driver (or whatever else it might be doing), without the debugger detection, inconvenient as it was, aborting the process first. Plus, there are a few more tricks standing in our way before we get to the real program.
1st Stage Unlocking of Memory Range (0x00401000-0x00408800)
If we examine the last five statements in Listing 19, there appears a sequence that seems to portend that more of the RDTSC maze is in our future. Indeed, if we step into the CALL instruction and start watching what is happening, there appear more gyrations and contortions as the code JMPs from one instruction to the next. In an effort to make more sense of what follows I have prepared Listing 14, which once again contains two differing assembly listings. The first half is the disassembly of the code as it appears in the debugger with portions of the object code highlighted. You guessed it: the author of HDSPOOF is employing the same technique we saw in the anti-single-stepping timing checks. Furthermore, if you examine the sequence carefully you will continually pass through a series of short jumps like the following:
0x424F4D: EB01 JMP 0x424F50
0x424F50: EB02 JMP 0x424F54
0x424F54: EB01 JMP 0x424F57
which add nothing to the code (except to make it harder to follow and understand!). Plus, this all happens as part of a loop! The object code bytes that are in boldface represent statements found in the next section of the listing. Those bytes in boldface but struck-through are those representing the meaningless short jump instructions.
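The targets of these filler jumps can be checked by hand: opcode EB is a short JMP whose single operand byte is a signed displacement counted from the end of the two-byte instruction. The following is a minimal sketch of that arithmetic; the function name is my own invention and is not anything found in HDSPOOF.

```python
def short_jmp_target(address: int, encoding: bytes) -> int:
    """Return the target of a two-byte short JMP (opcode 0xEB, rel8)."""
    if len(encoding) != 2 or encoding[0] != 0xEB:
        raise ValueError("not a short JMP encoding")
    # rel8 is signed and is relative to the address of the next instruction.
    displacement = int.from_bytes(encoding[1:], "little", signed=True)
    return (address + 2 + displacement) & 0xFFFFFFFF


# The three filler jumps quoted from the listing.
quoted = [(0x424F4D, bytes.fromhex("EB01")),
          (0x424F50, bytes.fromhex("EB02")),
          (0x424F54, bytes.fromhex("EB01"))]

for address, encoding in quoted:
    target = short_jmp_target(address, encoding)
    print(f"0x{address:06X}: JMP 0x{target:06X}")
```

Each jump hops over only one or two bytes, so the three together advance execution by a mere ten bytes.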
The second half of Listing 14 is the distilled essence of the next piece of hocus-pocus inside of HDSPOOF. After some meaningless activity, the code places the current EIP in ECX in order to calculate another hard-coded address, 0x004190AB. Then the contents of the byte at 0x004190AB are added to EBX (which was initialized to zero), EDI is incremented, and the code loops back to read the next byte that EDI points to. This loop is doing nothing more than producing a checksum of the bytes from 0x004190AB to 0x00424F47 (0x004190AB + 0xBE9C = 0x00424F47)! Once the loop has completed and the checksum has been calculated, EDI and ECX are assigned the values 0x00401000 and 0x7800, respectively, followed by a loop that XOR’s the bytes in the range from 0x00401000-0x00408800 with our calculated checksum. The code is performing another unpacking operation! Listing 15 contains disassemblies of a portion of this range as it passes through the stages of deobfuscation; the first and second portions of the listing apply here.
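The two loops just described can be modeled in a few lines. This is only a sketch on toy buffers: the function names are mine, and the width of the XOR key is an assumption (the low byte of the checksum is used here), since the text above does not say how much of EBX takes part in the XOR.

```python
def additive_checksum(data: bytes) -> int:
    """Sum every byte into a 32-bit accumulator, as the ADD-into-EBX loop does."""
    total = 0
    for value in data:
        total = (total + value) & 0xFFFFFFFF
    return total


def xor_with_key(data: bytes, key: int) -> bytes:
    """XOR each byte with the low byte of key (the key width is an assumption)."""
    low = key & 0xFF
    return bytes(value ^ low for value in data)


# Toy stand-ins: in HDSPOOF the summed range is 0x004190AB-0x00424F47
# and the XOR'ed range is 0x7800 bytes starting at 0x00401000.
loader = bytes(range(1, 50))
plain = b"\x55\x8B\xEC\x83\xEC\x10"        # arbitrary sample bytes
key = additive_checksum(loader)
packed = xor_with_key(plain, key)
assert xor_with_key(packed, key) == plain   # XOR is its own inverse

# Changing one byte of the summed range (for example by planting an INT3,
# 0xCC) changes the key, so the XOR pass no longer yields the original bytes.
patched = bytes([0xCC]) + loader[1:]
print(hex(key), hex(additive_checksum(patched)))
```

If the real loop behaves like this sketch, the checksum doubles as a tamper check: a software breakpoint left anywhere in the summed range would silently corrupt the unpacked code.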
Forging on with Listing 14, we find another CALL statement followed by a POP of the return address into the ECX register. This is followed by:
0x425024: SUB DWORD PTR [ECX+0x39],EBX
If we examine the instruction at 0x0042505B we observe that it contains before the subtraction:
0x42505B: PUSH 0x9B10B5
But after the SUB statement executes, it will now be transformed to:
0x42505B: PUSH 0x405B76
The disassembly for the address at 0x00405B76 is:
0x405B76: MOV DWORD PTR [EAX],ECX
Interesting. If we now inch along to the next few statements we see more meaningless arithmetic that essentially takes the hard-coded value, 0x00418EB0, and pops it off the stack as an exception handler, which is yet another way to set up an exception handling routine. Then, there follows:
0x425059: XOR EAX,EAX
Zeroing out the EAX register and then the modified PUSH instruction:
0x42505B: PUSH 0x405B76
And a return statement. What we have here is an unconditional jump to the statement at 0x00405B76, which, because EAX has been zeroed out, is an assignment through a null pointer and another access violation! Remember, the action of the return statement is to take the 1st DWORD on the stack and place it into the EIP register, hence the unconditional goto. At first glance, then, the unmodified code might suggest that we were heading towards a nonsensical address at 0x009B10B5 and the end of the program.
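The before and after operands of the PUSH let us work backwards to the value EBX must have held when the SUB executed, presumably the checksum calculated earlier. The sketch below emulates the SUB on the five bytes of the PUSH instruction (opcode 0x68 followed by a 32-bit immediate); the helper name is hypothetical and the only inputs are the two operands and the addresses quoted above.

```python
import struct

PUSH_IMM32 = 0x68                      # opcode for PUSH imm32

before = bytes([PUSH_IMM32]) + struct.pack("<I", 0x009B10B5)
wanted = 0x00405B76

# SUB [mem],EBX turned 0x009B10B5 into 0x00405B76, so EBX must have held:
ebx = (0x009B10B5 - wanted) & 0xFFFFFFFF

# The immediate starts one byte past the opcode at 0x0042505B, so
# ECX+0x39 == 0x0042505C, which fixes the popped return address in ECX.
ecx = 0x0042505B + 1 - 0x39


def sub_dword(code: bytes, offset: int, value: int) -> bytes:
    """Emulate SUB DWORD PTR [code+offset],value with 32-bit wraparound."""
    (old,) = struct.unpack_from("<I", code, offset)
    new = (old - value) & 0xFFFFFFFF
    return code[:offset] + struct.pack("<I", new) + code[offset + 4:]


after = sub_dword(before, 1, ebx)
(operand,) = struct.unpack_from("<I", after, 1)
print(f"EBX = 0x{ebx:08X}, ECX = 0x{ecx:08X}")
print(f"PUSH 0x{operand:08X}")
```

The derived ECX lands one byte before the SUB at 0x00425024, which is consistent with a one-byte POP ECX sitting at the CALL's return address.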
3rd Exception Handler and Transition to Uncompression Code Processing
The next code fragment contains the contents of the exception handler for our third (and last) access violation:
0x418EB0: MOV EAX,0xF0417D5A
0x418EB5: LEA ECX,DWORD PTR [EAX+0x10001179]
0x418EBB: MOV DWORD PTR [ECX+0x1],EAX
0x418EBE: MOV EDX,DWORD PTR [ESP+0x4]
0x418EC2: MOV EDX,DWORD PTR [EDX+0xC]
0x418EC5: MOV BYTE PTR [EDX],0xE9
0x418EC8: ADD EDX,0x5
0x418ECB: SUB ECX,EDX
0x418ECD: MOV DWORD PTR [EDX-0x4],ECX
0x418ED0: XOR EAX,EAX
EAX is assigned a funky value followed by an LEA instruction (the pointer arithmetic will “wrap” and assign 0x00418ED3 to ECX). The MOV that follows stores EAX at ECX+1, which effectively changes the instruction at 0x00418ED3 from:
0x418ED3: MOV EAX,0x12345678
To (my debugger does not display this):
0x418ED3: MOV EAX,0xF0417D5A
Next the code locates the address of the exception in the _EXCEPTION_RECORD structure (whose pointer is found at ESP+0x4) and changes the code from:
0x405B76: MOV DWORD PTR [EAX],ECX
To:
0x405B76: JMP 0x418ED3
It then overwrites the first four bytes of the string “PECompact2” with a junk value and sets the return value to zero by XOR’ing EAX (which tells the exception handling logic that the exception was “handled”) before returning to NTDLL’s ExecuteHandler2. Note how the exception was corrected: the code was modified from a null pointer assignment to an unconditional jump to the address 0x00418ED3. I guess this is one way to handle exceptions in one’s code: if you cannot modify the data, modify the code!
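Both pieces of arithmetic in this handler, the wrapping LEA and the construction of the JMP, are easy to verify. The sketch below uses only the constants quoted in the listing; the function name is my own, and the code models the handler's calculation rather than reproducing it.

```python
import struct

MASK32 = 0xFFFFFFFF

# LEA ECX,[EAX+0x10001179] with EAX = 0xF0417D5A wraps past 2**32.
eax = 0xF0417D5A
ecx = (eax + 0x10001179) & MASK32


def near_jmp(fault_address: int, target: int) -> bytes:
    """Encode JMP rel32 (opcode 0xE9) placed at fault_address."""
    # The displacement is relative to the end of the 5-byte instruction,
    # which is what ADD EDX,0x5 followed by SUB ECX,EDX computes.
    rel32 = (target - (fault_address + 5)) & MASK32
    return b"\xE9" + struct.pack("<I", rel32)


patch = near_jmp(0x00405B76, ecx)
print(f"ECX   = 0x{ecx:08X}")
print("patch =", patch.hex().upper())
```

The handler writes the same five bytes in two steps: the 0xE9 opcode at the faulting address, then the 32-bit displacement immediately after it.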
One last obstacle must be overcome before reaching the inner workings of HDSPOOF — the unpacking process seen in Listing 16. Before plunging into an analysis of this code, recollect that the code in the previous listing overwrote four bytes of a string containing “PECompact2”. Given this coder’s obsession with confusing and hiding his tracks, what could this attempt to partly obliterate a string mean? A search on the Internet reveals that there is a product called PECompact2, which according to its website, “is a next generation win32 executable/module compressor.” Among this product’s reasons for existing is: “compression offers an inherent degree of tamper resistance and obfuscation.” There is also an SDK that apparently allows one to modify the loader code for PECompact2. Hmmm. The internet search also reveals that PECompact2 was used in the network worm, Net-Worm.Win32.Sasser.d (see http://www.viruslist.com/en/viruses/encyclopedia?virusid=50554). Is it possible that HDSPOOF also contains a compressed image that somehow uses PECompact2?
There are many packers and compression utilities available on the Internet, e.g., yoda’s Crypter and Protector, UPX, PEPack, Petite, Morphine, MoleBox, ExeStealth, ASProtect, and ASPack. If you dump the 3rd unnamed section in HDSPOOF you will see embedded in the binary a string and a reference to another compressor:
NeoLite Executable File Compressor
Copyright (c) 1998,1999 NeoWorx Inc.
Portions Copyright (c) 1997-1999 Lee Hasiuk
All Rights Reserved
Is this another example of FUD that the author of HDSPOOF has generously sprinkled throughout our journey?
After downloading a trial version of PECompact2 I decided to use CALC as a guinea pig and see how the compression product works and learn what it does. I have included the compressed version of the calculator program as part of the downloads for this article. If you run the compressed version of CALC (its compressed size is 56K as opposed to the original’s 112K) you will see that our beloved calculator pops up normally. If you decide to debug this module though, you will see that an access violation is generated right at the start of execution — sound familiar? Indeed, if you examine Listing 17 you will find an exception handler that matches almost exactly what was seen at 0x00418EB0 and in Listing 16. The major difference is that the jump statement (at 0x01026537 in the CALC listing) is missing — this bypasses a call to VirtualFree that frees up memory allocated to assist in the unpacking of the executable. In a nutshell then, Listing 16 and Listing 17 both allocate a chunk of memory using VirtualAlloc (see the CALL statement at 0x00418F07), and next extract uncompression code from the main executable and populate the chunk of allocated memory (see the CALL statement at 0x00418F31). The code in the VirtualAlloc’ed region or thunk is then executed to uncompress HDSPOOF’s main executable (see the instruction at 0x00418F5B). You can see a portion of the results of this final transformation at the end of Listing 15. HDSPOOF then calls VirtualFree (at 0x00418F71) to remove all traces of this operation before reaching the unconditional jump statement:
0x418F7B: JMP EAX
EAX will contain the value, 0x00405B60, which is the REAL entry-point for the main executable of HDSPOOF!
We have finally reached the goal of all our hard work. I am not going into a detailed description of all that happens from this point on. Here is a brief outline of the activities in HDSPOOF:
Main Executable (0x00405B60)
Create Exception Handler (0x004808F28)
Polling Available Hard Drives (0x00402509 calls 0x004039C4-0x00403A8A)
Call CreateToolhelp32Snapshot, ModuleFirst, ModuleNext (0x00404E64-0x00404F98) to Find ADVAPI32 and KERNEL32
Generating “input” Value (around 0x00402A72 @ 0x0012E2C0)
Create Registry Key SYSTEM\CurrentControlSet\Services\RaNdOmNaMe (0x004040D8-0x004041E2)
Create “input” Value under Services Branch (0x00403CE8-0x00403DE6)
Create and Write Driver File (0x00401920-0x00401A9E) – 0x004019AC-0x004019D0 unobfuscates driver @ 0x0040F040 with 0x26
Create and Start Driver (0x00401000-0x004010CE)
Call OpenSCManager (0x004010FC-0x004011EE)
Call CreateService (0x00401640-0x0040175A)
Call OpenService (0x00401294-0x00401386)
Call StartService (0x00401828-0x0040191A)
Call DeleteFileA (0x0040355C-0x00403646) – deleting the driver file
Call RegDeleteKeyA (0x004043E4-0x004044D2) – deletes entries in system registry under Services
Exit Program (0x00405CE1)
One way to continue analysis of the now uncompressed code is to generate a crash dump (available in PEBrowse Interactive’s menu at Debug/Generate Minidump…). If you read Jim Finnegan’s classic column, Nerditorium, from the February 2000 issue of MSDN, you will learn how a driver can be embedded as a resource inside of a program, temporarily written out to a file, and then dynamically loaded through calls to the Service Control Manager.
It might seem anticlimactic to just skim over a list of activities that create and start the payload embedded inside of HDSPOOF. For static analysis a copy of the driver can be “saved” by forcing the code not to call DeleteFileA. Reversing the driver and its activities is more challenging. In addition to requiring a kernel mode debugger that does not hang the system, I suspect that all of the work of the stealth driver is performed in its startup code and not via the more usual DeviceIoControl mechanism. One way to approach an analysis of the driver would be to modify its startup code before it is written out, replacing one byte with an INT3 and then updating the checksum value to reflect the change. Remember, drivers hold a checksum, which the system checks to detect any tampering. Then, when the breakpoint fires, restore the byte and restart execution from the point where the replacement was made.
But recall that at the beginning of this article we already knew what the program was doing; we just were not sure how. Through patient perseverance and faith that any code could be reverse-engineered we were able to overcome a succession of exception handling traps, code obfuscation tricks, and anti-debugging logic puzzles that stood in the way of our enlightenment. For more information on this field you may wish to check out “Reversing: Secrets of Reverse Engineering”, by Eldad Eilam, Wiley Publishing, Inc., 2005.
One final thought: I would really like to see the development environment and source code the author of HDSPOOF used to create this program! The anti-single-stepping code suggests handcrafted x86 assembly, and the repetition suggests macros. For example, the source code for the anti-reversing portion of the program might look like the following:
__try
{
    int *p = NULL;
    *p = 0;    /* deliberate access violation */
}
__except( EXCEPTION_EXECUTE_HANDLER )
{
    ... do some work ...
}
... do some more work ...
It is probably more complicated than this simplistic representation though, since if you remember there were numerous hard coded offsets used throughout to set up things like exception handlers and calculate the address of the data area.