Unknown attackers recently hit the foreign exchange company Travelex with REvil ransomware. This trojan employs simple but effective obfuscation techniques to conceal its WinAPI calls. Let’s see how the encoder works.
As usual, I load a sample into DiE and review the output.
DiE believes that the file is not packed. But let’s check the entropy of its sections.
Judging by the section names, the file is packed with UPX, yet the entropy of these sections looks odd. Why hasn’t DiE recognized the packer? One possible reason: the UPX signature was deliberately altered to confuse analysis tools. In any event, the file is packed, so I load it into the x64dbg debugger, set a breakpoint at the VirtualAlloc function not far from the entry point, and launch the trojan.
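For readers who want to reproduce the entropy check outside DiE, here is a minimal sketch of the Shannon entropy calculation that such tools typically perform. The function name shannon_entropy is my own, and extracting the raw bytes of each section from the file is left out. Values close to 8 bits per byte usually indicate compressed or encrypted data.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # how many times each byte value occurs
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Illustrative inputs only: constant padding versus a uniform byte distribution.
print(shannon_entropy(b"\x00" * 64))
print(shannon_entropy(bytes(range(256))))
```

Feeding the function the raw contents of each section gives numbers comparable to the per-section entropy graph shown by DiE.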
INFO
Unpacking mechanisms are fairly standard, so whenever you encounter an unknown packer, set breakpoints at the following WinAPI functions:
- VirtualAlloc (the function allocates memory for the payload);
- VirtualProtect (the function specifies memory access attributes);
- CreateProcessInternalW (when a new process is created, the control is ultimately passed to this function); and
- ResumeThread (the function is used to resume the thread execution after an injection).
When execution reaches VirtualAlloc, the breakpoint triggers and I find myself inside the function. I let it run to its return, get back to the caller, and see the following picture:
008F9552 | FF55 B4 | call dword ptr ss:[ebp-4C] | VirtualAlloc
008F9555 | 8945 F0 | mov dword ptr ss:[ebp-10],eax | <---- I am here
008F9558 | 8365 DC 00 | and dword ptr ss:[ebp-24],0 |
008F955C | 8B85 58FFFFFF | mov eax,dword ptr ss:[ebp-A8] |
008F9562 | 0FB640 01 | movzx eax,byte ptr ds:[eax+1] |
Then I notice an interesting piece of code at the end of the routine that called VirtualAlloc:
00569C10 | 8985 5CFFFFFF | mov dword ptr ss:[ebp-A4],eax |
00569C16 | 8B85 5CFFFFFF | mov eax,dword ptr ss:[ebp-A4] |
00569C1C | 0385 68FFFFFF | add eax,dword ptr ss:[ebp-98] |
00569C22 | C9 | leave |
00569C23 | FFE0 | jmp eax | An interesting jump!
Keep in mind that after VirtualAlloc returns, the address of the allocated memory is stored in eax. So, I set a breakpoint at this jump, follow the address from eax in the dump, and watch what happens in the allocated memory. For that purpose, I set a one-time breakpoint on the first write to this memory, and the debugger stops in the loop that writes the data. A part of this loop looks as follows:
00279DA4 | 8A11 | mov dl,byte ptr ds:[ecx] |
00279DA6 | 8810 | mov byte ptr ds:[eax],dl |
00279DA8 | 40 | inc eax |
00279DA9 | 41 | inc ecx |
00279DAA | 4F | dec edi |
00279DAB | 75 F7 | jne 279DA4 |
I step through the loop manually, and a painfully familiar signature appears in memory:
003C0000 4D 5A 90 00 03 00 00 00 04 00 00 00 FF FF 00 00 MZ..........ÿÿ..
I resume execution in the debugger, stop at jmp eax, take one step forward, and finally get inside the unpacked file! Now I can dump it and load it into IDA Pro. After this simple procedure, I see the code of the start function:
public start
start proc near
push 0
call sub_40369D
push 0
call sub_403EEF
pop ecx
retn
start endp
Time to examine the functions and capabilities of the malware. I step into the first call and see the following code there:
sub_40369D proc near
call sub_406A4D // There is only one subprogram before the hash-based function call. Apparently all important stuff is hidden there:)
push 1
call dword_41CB64 // Hmm, what is it?
call sub_40489C
test eax, eax
jz short loc_4036BD
...
...
I note a call to the sub_406A4D subprogram followed by a call of the form call dword_41CB64. I realize that if everything were left "as is", the application would crash at this point because dword_41CB64 belongs to a table that looks as follows (this is just a part of it!):
.data:0041CB64 dword_41CB64 dd 40D32A7Dh ; DATA XREF: sub_40369D+7↑r
.data:0041CB68 dword_41CB68 dd 0C97676C4h ; DATA XREF: sub_403EE1+6↑r
.data:0041CB6C dword_41CB6C dd 0D69D6931h ; DATA XREF: sub_403BC0+15↑r
.data:0041CB70 dword_41CB70 dd 8AABE016h ; DATA XREF: sub_406299+C0↑r
...
...
In addition, the import table of this sample is empty: the WinAPI functions are resolved dynamically, function names are not stored in plain text, and the program apparently uses their hashes instead. In other words, WinAPI calls are obfuscated. So, I step into the sub_406A4D function in the debugger, see a single unconditional jump there, and proceed into sub_405BCD. At the beginning of this function, I notice some interesting code:
loc_405BD6:
// The table element that ESI points to is pushed onto the stack
push dword_41C9F8[esi]
// The value is processed; the result is returned in EAX
call sub_405DCF
// The result is written back into the same table slot
mov dword_41C9F8[esi], eax
// Move on to the next element (each one is four bytes in size)
add esi, 4
pop ecx
cmp esi, 230h
jb short loc_405BD6
The sub_405DCF function immediately attracts my attention. It contains plenty of code, so I switch to the decompiled pseudocode (I am not an assembly guru and feel more comfortable with IDA Pro pseudocode).
The function is too large to be listed here in full; however, some of its parts must be examined in detail. The execution of sub_405DCF can be divided into two phases. The first phase transforms the hash values stored in the program. The second phase retrieves function names from the export tables of system libraries, hashes them, and compares the results with the values taken from the table discussed above.
In pseudocode, the parsing of the system libraries export table looks as follows:
v17 = (IMAGE_EXPORT_DIRECTORY *)(v13 + *(_DWORD *)(*(_DWORD *)(v13 + 0x3C) + v13 + 0x78));
v21 = (int)v17->AddressOfNameOrdinals + v13;
v18 = (int)v17->AddressOfNames + v13;
v22 = (int)v17->AddressOfNames + v13;
v20 = (int)v17->AddressOfFunctions + v13;
v23 = v17->NumberOfNames;
if ( !v23 )
return 0;
while ( (sub_405BAE(v14 + *(_DWORD *)(v18 + 4 * v16)) & 0x1FFFFF) != v15 ){
v18 = v22;
if ( ++v16 >= v23 )
return 0;
}
Why did this piece of pseudocode attract my attention? First and foremost, because of the eye-catching offsets 0x3C and 0x78. In addition, the expressions built from the v13 variable and these values are cast to DWORD* and dereferenced, which suggests that v13 is a base address and the constants are offsets into a structure. Overall, everything indicates that this is the header of a PE file:
0x00 WORD e_magic Magic DOS signature MZ (0x4D 0x5A)
0x02 WORD e_cblp Bytes on last page of file
0x04 WORD e_cp Pages in file
0x06 WORD e_crlc Relocations
0x08 WORD e_cparhdr Size of header in paragraphs
0x0A WORD e_minalloc Minimum extra paragraphs needed
0x0C WORD e_maxalloc Maximum extra paragraphs needed
0x0E WORD e_ss Initial (relative) SS value
0x10 WORD e_sp Initial SP value
0x12 WORD e_csum Checksum
0x14 WORD e_ip Initial IP value
0x16 WORD e_cs Initial (relative) CS value
0x18 WORD e_lfarlc File address of relocation table
0x1A WORD e_ovno Overlay number
0x1C WORD e_res[4] Reserved words (4 WORDs)
0x24 WORD e_oemid OEM identifier (for e_oeminfo)
0x26 WORD e_oeminfo OEM information; e_oemid specific
0x28 WORD e_res2[10] Reserved words (10 WORDs)
0x3c DWORD e_lfanew Offset to start of PE header
The 0x3C offset corresponds to the e_lfanew field. Following e_lfanew to the PE header, I find this field at offset 0x78 from its start (see the pseudocode):
0x78 DWORD Export Table RVA of Export Directory
This means that the function is reading the export table, i.e. WinAPI functions are called dynamically.
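The two dereferences in the pseudocode can be reproduced in a few lines of Python. This is only a sketch under stated assumptions: export_directory_rva is a hypothetical helper name, the input is a 32-bit (PE32) image held in a bytes object, and the 0x78 offset is valid only for that format.

```python
import struct

E_LFANEW_OFFSET = 0x3C    # IMAGE_DOS_HEADER.e_lfanew
EXPORT_DIR_OFFSET = 0x78  # export directory RVA, counted from the PE signature (PE32 only)


def export_directory_rva(image: bytes) -> int:
    """Return the RVA of the export directory of a PE32 image."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # First dereference: *(DWORD *)(base + 0x3C) gives the offset of the PE header
    (e_lfanew,) = struct.unpack_from("<I", image, E_LFANEW_OFFSET)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # Second dereference: *(DWORD *)(base + e_lfanew + 0x78) gives the export directory RVA
    (rva,) = struct.unpack_from("<I", image, e_lfanew + EXPORT_DIR_OFFSET)
    return rva
```

Adding the returned RVA to the module base gives the address that the pseudocode stores in v17.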
To make IDA Pro understand the export table structure, I declare it in Local Types (Shift + F1). Then I select Convert to struct* in the context menu of the v17 variable. The export directory structure of a PE file looks as follows:
struct IMAGE_EXPORT_DIRECTORY {
long Characteristics;
long TimeDateStamp;
short MajorVersion;
short MinorVersion;
long Name;
long Base;
long NumberOfFunctions;
long NumberOfNames;
long *AddressOfFunctions;
long *AddressOfNames;
long *AddressOfNameOrdinals;
};
The AddressOfFunctions, AddressOfNames, and AddressOfNameOrdinals fields are in use. The pseudocode also shows that the final hashes are derived from the 'incomplete' hashes stored in the code:
int __cdecl sub_405DCF(int (*a1)(void)){ // Pass the argument
... // Many lines that can be skipped
v1 = (unsigned int)a1 ^ (((unsigned int)a1 ^ 0x76C7) << 16) ^ 0xAFB9;
... //
v15 = v1 & 0x1FFFFF;
... //
}
Yes, the hashes stored in the body of the sample are not 'complete' yet and must be converted into their 'proper' form. After stripping everything irrelevant, I get the following algorithm:
hash_api_true = (hash ^ ((hash ^ 0x76C7) << 16) ^ 0xAFB9) & 0x1FFFFF
where hash is the table entry passed as an argument; in the pseudocode, it is stored in the a1 variable, which is the function argument. A good thing is that IDA highlights identical variables; otherwise, the analysis of the sample would take forever.
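As a quick sanity check, the transformation can be written as a Python function and applied to the first entry of the table shown earlier. The names hash_api_true and MASK_21_BITS follow the formula above and are otherwise my own. Python integers are unbounded, so the shift does not wrap at 32 bits as it does in the sample, but the final mask discards the extra high bits and the result is the same.

```python
MASK_21_BITS = 0x1FFFFF


def hash_api_true(stored: int) -> int:
    """Convert a hash stored in the sample into the value used for comparison."""
    return (stored ^ ((stored ^ 0x76C7) << 16) ^ 0xAFB9) & MASK_21_BITS


# First entry of the table at dword_41CB64
print(hex(hash_api_true(0x40D32A7D)))  # 0x985c4
```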
If I apply this algorithm to the hashes specified in the code (remember the table?), I get the 'correct' hashes to be compared with the ones computed from the names of functions exported by the system library. In Python, the routine that hashes a function's symbolic name looks as follows:
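def hash_from_name(name):
    result = 0x2b
    for c in name:
        # Multiply the running value by 0x10f and add the next character code
        result = ord(c) + 0x10f * result
    # Only the lower 21 bits take part in the comparison
    return result & 0x1FFFFF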
Calling the function:
hash_from_name(name) # name is the variable containing the symbolic name of the function
So, all I have to do now is apply the hash_api_true algorithm to the entire table of pseudohashes in the sample and produce a table of 'correct' hashes. Then I apply the hash_from_name algorithm to a list of WinAPI function names to get their hashes. Finally, I compare the two lists, thus mapping each hash to a function name. To speed up this process, I use a dedicated Python script for IDA.
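The matching step can be sketched as follows. This is not the IDA script mentioned above, only a standalone illustration: the resolve helper and the short candidate list are hypothetical, and a real run would take the names from the export tables of the system DLLs. By my calculation, the first table entry, 0x40D32A7D, transforms to the same value as hash_from_name("SetErrorMode"), which would fit the push 1 preceding call dword_41CB64.

```python
def hash_api_true(stored: int) -> int:
    """Convert a hash stored in the sample into the value used for comparison."""
    return (stored ^ ((stored ^ 0x76C7) << 16) ^ 0xAFB9) & 0x1FFFFF


def hash_from_name(name: str) -> int:
    """Hash an exported function name the way the sample does."""
    result = 0x2B
    for c in name:
        result = ord(c) + 0x10F * result
    return result & 0x1FFFFF


def resolve(stored_hashes, candidate_names):
    """Map each stored hash to a matching candidate name, or None if nothing matches."""
    by_hash = {hash_from_name(name): name for name in candidate_names}
    return {h: by_hash.get(hash_api_true(h)) for h in stored_hashes}


# Illustrative candidate list (an assumption, not taken from the sample)
candidates = ["SetErrorMode", "VirtualAlloc", "ExitProcess"]
for stored, name in resolve([0x40D32A7D], candidates).items():
    print(hex(stored), name)
```

Since only 21 bits are compared, two different names can occasionally produce the same hash, so any match against a large name list is worth confirming in the debugger.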
Could it be done faster?
In this particular case, REvil resolves the entire table of functions at once. Therefore, after loading the sample into a debugger, you can execute the subprogram that retrieves and deobfuscates the WinAPI functions; the debugger will then automatically label the resolved addresses with the names of the API functions. After that, you can take a dump and continue working with it; the functions will be in place. But this method does not suit every situation. For instance, if the functions are resolved not all at once but one by one, at the time each of them is called, this technique won't work.
Conclusions
Now you know how to restore WinAPI calls that are obfuscated by replacing function names with their hashes. As you can see, this obfuscation technique is easily defeated with a debugger or a disassembler. Furthermore, arithmetic manipulations with the hash don't make reverse engineering impossible. Such tricks are easy to spot in the pseudocode; all they can do is slow down the examination of a sample by a few minutes. The sole purpose of such manipulations is to fool automatic detection systems; against manual analysis, they are useless.