In-depth malware: Unpacking the ‘lcmw’ Trojan

Hey folks,

Happy New Year, and welcome to 2014!

On a recent trip to Tyson’s Corner, VA, I had some time to kill, so I took a careful look at a malware sample that a friend of mine sent to me some time ago, which I believe he originally got off somebody else’s hosed system. The plan was for me to investigate it, and I promised him I would; it just took a while!

Anyways, the sample has a few layers of packing, and I thought it’d be fun/interesting to show you how to unwrap the entire thing to obtain the final payload. I am not going to discuss the payload itself in this post, largely because I haven’t spent much time reversing it. Perhaps in the future I’ll dig a little deeper, but for now we’ll focus on the packing.

I called this sample “lcmw”. It stood for something interesting, but I don’t really remember what—I may have been drinking when I named it. :)

You can find my IDA .idb files and my notes on Github, and samples on my downloads page. WARNING: This is malware! Only run it in a highly controlled environment!! I don’t know what the final payload does, so if you run it, it’s at your own risk! Use a virtual machine, don’t connect it to the Internet, etc. Be careful!

(The password is ‘infected’)

Fun fact: my ISP got an abuse@ email because I was originally hosting the malware file without putting it in a .zip file:

Dear abuse team,

please help to close these offending viruses sites(1) so far.

status: As of 2013-11-29 12:51:30 CET

(for full uri, please scroll to the right end ...

We detected many active cases dated back to 2007, so please look at the date
column below.
You may also subscribe to our MalwareWatch list

This information has been generated out of our comprehensive real time
tracking worldwide viruses URI's

If your review this list of offending site, please do this carefully, pay
attention for redirects also!
Also, please consider this particular machines may have a root kit installed !
So simply deleting some files or dirs or disabling cgi may not really solve
issue !

That was neat. Thanks to my ISP (Voi Network Solutions) for laughing about it with me rather than shutting off my connection!


Stage1: the time waster

All right, if you’re following along, this file is called “sample.bin”. The original name was “hlcumrpi.dat”, and this is what “file” says about it (on *nix or Cygwin):

$ file hlcumrpi.dat
hlcumrpi.dat: PE32 executable (DLL) (GUI) Intel 80386, for MS Windows

It’s technically a .dll library, and is typically run with a “rundll32.exe” command on startup. I renamed it to “sample.bin”, since it’s easier to remember that way. Fire up IDA, load the file, and have a look at the DllEntryPoint function!

Initial inspection

…are you done? Did you get lost in a deep rats’ nest of functions? Because that’s what this stage is all about—wasting your time. It turns out, there’s maybe 10 lines that do anything—the rest just burns CPU and, more importantly, reverse engineer cycles.

Take this sequence for example:

.text:004014AE mov ebx, 0FFFFFh
.text:004014B3 loc_4014B3: ; CODE XREF: DllEntryPoint+11j
.text:004014B3 call UselessComplicatedFunction ; <-- call this about a million times
.text:004014B3 ; Honestly, I suspect this is a whole lot of nothing that's done to waste time
.text:004014B8 dec ebx
.text:004014B9 cmp ebx, 43h
.text:004014BC jnz short loc_4014B3

I went through that entire function, and determined:

  • There are no inputs
  • Nothing is returned
  • No global variables are touched
  • No API calls are made

Basically, it does nothing. But it does it with a lot of code!

Then it gets the string length of the main function, which doesn’t even make any sense, and ignores the response:

.text:004014C3 call ds:lstrlenA ; <-- get the strlen() of main(), which should be ignored

Then it gets the current process id about sixteen million times:

.text:004014C9 mov ebx, 0FFFFFFh
.text:004014CE loc_4014CE: ; CODE XREF: DllEntryPoint+2Cj
.text:004014CE call GetCurrentProcessIdWrapper ; Call GetCurrentProcessId about sixteen million times
.text:004014D3 dec ebx
.text:004014D4 cmp ebx, 43h
.text:004014D7 jnz short loc_4014CE

This function has no side effects and the return value is ignored. Once again, wasting time.
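To put numbers on how much time those two loops burn, here is a quick sketch in Python. The helper names are made up for illustration; the only inputs are the two constants loaded into ebx and the 0x43 the loops compare against.

```python
def busy_loop_iterations(start, stop=0x43):
    """Passes through a 'call / dec ebx / cmp ebx, stop / jnz' loop.

    ebx starts at `start` and loses 1 per pass; the loop exits on the
    pass where ebx becomes equal to `stop`.
    """
    passes = (start - stop) & 0xFFFFFFFF   # 32-bit wraparound, like the CPU
    return passes or 2 ** 32               # start == stop wraps all the way around


def simulate(start, stop=0x43):
    """Brute-force version of the same loop, for checking small cases."""
    ebx, passes = start, 0
    while True:
        passes += 1                        # the call happens here
        ebx = (ebx - 1) & 0xFFFFFFFF       # dec ebx
        if ebx == stop:                    # cmp ebx, 43h / jnz
            return passes


first_loop = busy_loop_iterations(0xFFFFF)     # UselessComplicatedFunction loop
second_loop = busy_loop_iterations(0xFFFFFF)   # GetCurrentProcessIdWrapper loop
print(first_loop, second_loop)
```

So the first loop makes a bit over a million calls, and the second makes a bit under 16.8 million.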

Then, we finally get to something interesting! This code:

.text:004014D9 push PAGE_EXECUTE_READWRITE ; flProtect
.text:004014DB push 3000h ; flAllocationType = MEM_COMMIT | MEM_RESERVE
.text:004014E0 push 30AD8h ; dwSize
.text:004014E5 push 0 ; lpAddress
.text:004014E7 call ds:VirtualAlloc ; Allocate memory to store the decrypted code
.text:004014ED mov edx, eax
.text:004014EF mov edi, eax ; edi => destination memory

This allocates 0x30ad8 bytes of read/write/execute memory, and stores the pointer to that memory in both edx and edi.
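If the magic numbers pushed before VirtualAlloc() don't ring a bell, a small lookup helps. This is only a sketch: the function names are hypothetical, and the tables hold just a handful of the constants from the Windows headers, not all of them.

```python
# A few of the constants from winnt.h; not a complete list.
MEM_FLAGS = {0x1000: "MEM_COMMIT", 0x2000: "MEM_RESERVE"}
PAGE_FLAGS = {
    0x02: "PAGE_READONLY",
    0x04: "PAGE_READWRITE",
    0x20: "PAGE_EXECUTE_READ",
    0x40: "PAGE_EXECUTE_READWRITE",
}


def describe_allocation_type(value):
    """Turn an flAllocationType bitmask into its flag names."""
    names = [name for bit, name in sorted(MEM_FLAGS.items()) if value & bit]
    leftover = value & ~sum(MEM_FLAGS)     # bits this table doesn't know about
    if leftover:
        names.append(hex(leftover))
    return " | ".join(names) or "0"


def describe_protection(value):
    """Turn an flProtect value into its name, or hex if it isn't in the table."""
    return PAGE_FLAGS.get(value, hex(value))


print(describe_allocation_type(0x3000), "/", describe_protection(0x40))
```

The same two helpers work on any other VirtualAlloc() call you run into while unpacking.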

Even though malware is malware—and you can never realllllly be sure why it’s doing something—allocating a very specific amount of r/w/x memory almost always means one thing—it’s going to unpack something into that buffer and then execute it.

And sure enough, that piece of code is followed by a de-obfuscation loop (I originally called it a “decryption loop”, but that was technically wrong since it isn’t actually decrypting anything):

.text:004014F1 mov esi, offset start_encrypted_code
.text:004014F6 decrypt_loop: ; CODE XREF: DllEntryPoint+6Dj
.text:004014F6 call XorAndRotate ; eax = the next 4 non-NULL bytes in esi, each XORed by ebx (0x43?)
.text:004014FB shl al, 2
.text:004014FE shr ax, 2
.text:00401502 stosb
.text:00401503 shl ah, 4
.text:00401506 shr eax, 8
.text:00401509 shl ax, 2
.text:0040150D shr eax, 6
.text:00401510 stosw
.text:00401512 cmp esi, offset end_encrypted_data
.text:00401518 jbe short decrypt_loop

Feel free to look at the function I called XorAndRotate—it’s pretty boring. It just twiddles the current uint32 a bit.
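The shl/shr sequence after the call is easier to follow when emulated. The sketch below models only the instructions visible in the loop, starting from whatever value XorAndRotate leaves in eax; it does not model XorAndRotate itself, and the function names are made up. One thing it does show: if every byte of eax is below 0x40 (an assumption, not something the listing confirms), the three bytes written out are simply the four 6-bit values packed together, much like a base64 decoder does.

```python
def pack_dword(eax):
    """Emulate the shl/shr/stosb/stosw sequence; return the 3 bytes stored."""
    eax &= 0xFFFFFFFF
    al = (eax << 2) & 0xFF                       # shl al, 2
    eax = (eax & 0xFFFFFF00) | al
    ax = (eax & 0xFFFF) >> 2                     # shr ax, 2
    eax = (eax & 0xFFFF0000) | ax
    out = bytes([eax & 0xFF])                    # stosb
    ah = ((eax >> 8) << 4) & 0xFF                # shl ah, 4
    eax = (eax & 0xFFFF00FF) | (ah << 8)
    eax >>= 8                                    # shr eax, 8
    ax = ((eax & 0xFFFF) << 2) & 0xFFFF          # shl ax, 2
    eax = (eax & 0xFFFF0000) | ax
    eax >>= 6                                    # shr eax, 6
    out += (eax & 0xFFFF).to_bytes(2, "little")  # stosw
    return out


def six_bit_pack(b0, b1, b2, b3):
    """Reference: four 6-bit values packed into 24 bits, little-endian."""
    return (b0 | b1 << 6 | b2 << 12 | b3 << 18).to_bytes(3, "little")


print(pack_dword(0x3F3F3F3F).hex(), six_bit_pack(63, 63, 63, 63).hex())
```

That would also explain why the loop consumes four bytes of input for every three bytes it stores.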

You’ll notice that esi—typically the “source” pointer—is initialized to something I called start_encrypted_code. I’m using the term “encrypted” in only the most general sense; “obfuscated” would have been better, but it’s harder to type. If you dig deeper, this is what it looks like:

.text:00401549 start_encrypted_code db 0D6h, 0E6h, 1Bh, 0BAh, 0C0h, 0F1h, 0CDh, 0C2h, 0, 0D5h
.text:00401549 ; DATA XREF: DllEntryPoint+46o
.text:00401549 db 5Eh, 0B6h, 97h, 0EBh, 0C0h
.text:00401558 dd 2 dup(83008300h), 0C2D50033h, 53C02D2Dh, 0C7008300h
.text:00401558 dd 6A78300h, 83000F49h, 83008300h, 81C88300h, 0C8C0C700h
.text:00401558 dd 4C70081h, 810481C8h, 4E18300h, 8B6F1B04h, 0D6B68300h
.text:00401558 dd 8CB98Ch, 830083h, 1F830083h, 0C2B79A97h, 6C001FE6h
.text:00401558 dd 53D5004Eh, 1F2512A7h, 1FB44EE6h, 61B8300h, 681B9h, 830083h

And so on, for a long, long time. Almost certainly obfuscated code. And since it’s being sent through that de-obfuscation function, that pretty much confirms it.

The function ends with:

.text:0040151A pop ebx
.text:0040151B pop edi
.text:0040151C pop esi
.text:0040151D jmp edx

Where edx—at that point—contains a pointer to the decrypted buffer.

So, we see a function that:

  • Wastes a ton of time
  • Allocates executable memory
  • Populates the executable memory
  • Jumps to the executable memory

A classic decryption/deobfuscation loop!

Let’s look at the easiest possible way to own it!

Owning stage1

So, I’m lazy. Really lazy. I’m gonna find the easiest possible way to decrypt this bad boy.

WARNING: I run malware in this section. It’s de-clawed, but you never know what clever tricks are used (I used to have a cat that was de-clawed, and believe me: they still have sharp teeth!); only do this in a throw-away virtual machine! Never, NEVER run this on any important system!

All right, so we have a useless function at the top (I enabled the ‘code bytes’ now so we can see what the machine code looks like):

.text:004014B3 E8 C0 FE FF FF call UselessComplicatedFunction ; <-- call this about a million times
.text:004014B3 ; Honestly, I suspect this is a whole lot of nothing that's done to waste time
.text:004014B8 4B dec ebx
.text:004014B9 83 FB 43 cmp ebx, 43h

I’m paranoid; mayyybe it’s doing something important? So I’m gonna fire up a hex editor (like xvi32), search the sample.bin binary for the machine code “e8 c0 fe ff ff 4b 83 fb 43” (which should be at offset 0x8b3 in the file), and nop out the call (“e8 c0 fe ff ff” -> “90 90 90 90 90”).
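If you'd rather compute the file offset than search for it, the arithmetic is short. The three layout constants below are assumptions: they aren't read from the file, they're just the common layout (image base 0x400000, .text at RVA 0x1000, raw data at file offset 0x400) that happens to reproduce the offsets quoted in this post. For a different binary you'd take them from the section table.

```python
# Assumed layout, chosen because it matches the offsets quoted in the post.
IMAGE_BASE = 0x400000
TEXT_RVA = 0x1000          # where .text is mapped, relative to the image base
TEXT_RAW_OFFSET = 0x400    # where .text starts in the file on disk


def va_to_file_offset(va):
    """Convert an address shown by IDA into an offset in the file on disk."""
    return va - IMAGE_BASE - TEXT_RVA + TEXT_RAW_OFFSET


for va in (0x4014B3, 0x4014CE, 0x40151D):
    print(hex(va), "->", hex(va_to_file_offset(va)))
```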

That way, even if it is doing something sneaky, it’s never called anyways.

Next, I’m going to do the same to the GetCurrentProcessId wrapper:

.text:004014CE E8 D1 FF FF FF call GetCurrentProcessIdWrapper ; Call GetCurrentProcessId about sixteen million times
.text:004014D3 4B dec ebx
.text:004014D4 83 FB 43 cmp ebx, 43h

The ‘call’ instruction, which you can find at offset 0x8ce in the file, also needs to be replaced with “90 90 90 90 90”.

Finally, we don’t want the malware to actually run. That would defeat the entire purpose of de-clawing! So we find the code at the bottom:

.text:00401518 76 DC jbe short decrypt_loop
.text:0040151A 5B pop ebx
.text:0040151B 5F pop edi
.text:0040151C 5E pop esi
.text:0040151D FF E2 jmp edx
.text:0040151D DllEntryPoint endp
.text:0040151F ; ---------------------------------------------------------------------------
.text:0040151F C3 retn

We want to find the jmp instruction (“ff e2”)—which should be at 0x91d in the file—and replace it with “cd 03”.

Wait, what’s cd 03!?

It’s “int 3”. Besides being my license plate, it’s also the instruction that means “debug breakpoint”. In other words, if a running application hits that instruction, it’ll fire a debug interrupt. If the application is being debugged, the debugger gets control; if it’s not, the application will simply crash. Whatever the case: it will never run the malicious code!
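If you don't feel like doing this by hand in a hex editor, the three patches can be scripted. This is a sketch, not the tool used for this post: the function name is made up, and the byte patterns are the ones from the listings above (the last one includes the three pops in front of “ff e2” so it only matches the end of DllEntryPoint).

```python
# (bytes to find, bytes to write instead), taken from the listings above.
PATCHES = [
    # nop out "call UselessComplicatedFunction"
    (bytes.fromhex("e8c0feffff4b83fb43"), bytes.fromhex("90909090904b83fb43")),
    # nop out "call GetCurrentProcessIdWrapper"
    (bytes.fromhex("e8d1ffffff4b83fb43"), bytes.fromhex("90909090904b83fb43")),
    # pop ebx / pop edi / pop esi / jmp edx  ->  same pops, then int 3
    (bytes.fromhex("5b5f5effe2"), bytes.fromhex("5b5f5ecd03")),
]


def declaw(data):
    """Return a patched copy of `data`; refuse if a pattern isn't unique."""
    patched = bytearray(data)
    for old, new in PATCHES:
        if patched.count(old) != 1:
            raise ValueError("pattern %s found %d times, expected exactly once"
                             % (old.hex(), patched.count(old)))
        at = patched.find(old)
        patched[at:at + len(old)] = new
    return bytes(patched)
```

To use it, read sample.bin in binary mode, pass the bytes to declaw(), and write the result out as sample_safe.bin. Insisting that each pattern matches exactly once is a cheap guard against patching the wrong spot.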

Save the new .dll—you can find this in the .zip under the name “sample_safe.bin”—and load it in IDA just to make sure. It should now look like this—note that there’s only the one call left:

.text:004014AB DllEntryPoint proc near ; DATA XREF: DllEntryPoint+13o
.text:004014AB hinstDLL = dword ptr 4
.text:004014AB fdwReason = dword ptr 8
.text:004014AB lpReserved = dword ptr 0Ch
.text:004014AB push esi
.text:004014AC push edi
.text:004014AD push ebx
.text:004014AE mov ebx, 0FFFFFh
.text:004014B3 loc_4014B3: ; CODE XREF: DllEntryPoint+11j
.text:004014B3 nop
.text:004014B4 nop
.text:004014B5 nop
.text:004014B6 nop
.text:004014B7 nop
.text:004014B8 dec ebx
.text:004014B9 cmp ebx, 43h
.text:004014BC jnz short loc_4014B3
.text:004014BE push offset DllEntryPoint ; lpString
.text:004014C3 call ds:lstrlenA
.text:004014C9 mov ebx, 0FFFFFFh
.text:004014CE loc_4014CE: ; CODE XREF: DllEntryPoint+2Cj
.text:004014CE nop
.text:004014CF nop
.text:004014D0 nop
.text:004014D1 nop
.text:004014D2 nop
.text:004014D3 dec ebx
.text:004014D4 cmp ebx, 43h
.text:004014D7 jnz short loc_4014CE
.text:004014D9 push 40h ; flProtect
.text:004014DB push 3000h ; flAllocationType
.text:004014E0 push 30AD8h ; dwSize
.text:004014E5 push 0 ; lpAddress
.text:004014E7 call ds:VirtualAlloc
.text:004014ED mov edx, eax
.text:004014EF mov edi, eax
.text:004014F1 mov esi, offset byte_401549
.text:004014F6 loc_4014F6: ; CODE XREF: DllEntryPoint:loc_401518j
.text:004014F6 call sub_401520
.text:004014FB shl al, 2
.text:004014FE shr ax, 2
.text:00401502 stosb
.text:00401503 shl ah, 4
.text:00401506 shr eax, 8
.text:00401509 shl ax, 2
.text:0040150D shr eax, 6
.text:00401510 stosw
.text:00401512 cmp esi, offset byte_43201D
.text:00401518 loc_401518: ; CODE XREF: .text:00401556j
.text:00401518 jbe short loc_4014F6
.text:0040151A pop ebx
.text:0040151B pop edi
.text:0040151C pop esi
.text:0040151D int 3 ; - software interrupt to invoke the debugger
.text:0040151F retn
.text:0040151F DllEntryPoint endp

Awesome! Now let’s write a quick app to run it:

#include <windows.h>

int main(int argc, char *argv[])
{
        /* Loading the DLL runs its DllEntryPoint, which now ends in our int 3 */
        LoadLibraryA("C:\\Documents and Settings\\Administrator\\Desktop\\sample_safe.bin");

        return 0;
}

And compile it, then run it in a debugger (I’m going to use windbg, since that’s my favourite debugger):

C:\Program Files\Debugging Tools for Windows (x86)>windbg "c:\Documents and Settings\Administrator\My Documents\Visual Studio 2008\Projects\test_malware\Debug\test_malware.exe"

Executable search path is:
ModLoad: 00400000 0041b000 test_malware.exe
ModLoad: 7c800000 7c8c0000 ntdll.dll
ModLoad: 77e40000 77f42000 C:\WINDOWS\system32\kernel32.dll
ModLoad: 10200000 10323000 C:\WINDOWS\WinSxS\x86_Microsoft.VC90.DebugCRT_1fc8b3b9a1e18e3b_9.0.21022.8_x-ww_597C3456\MSVCR90D.dll
(1b8.2d4): Break instruction exception - code 80000003 (first chance)
eax=10400000 ebx=7ffda000 ecx=00000003 edx=00000008 esi=7c8877f4 edi=00151f38
eip=7c81a3e1 esp=0012fb70 ebp=0012fcb4 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
*** ERROR: Symbol file could not be found. Defaulted to export symbols for ntdll.dll -
7c81a3e1 cc int 3
0:000> g
ModLoad: 00350000 00389000 C:\Documents and Settings\Administrator\Desktop\sample_safe.bin
ModLoad: 77380000 77411000 C:\WINDOWS\system32\user32.dll
ModLoad: 77c00000 77c48000 C:\WINDOWS\system32\GDI32.dll
ModLoad: 77f50000 77feb000 C:\WINDOWS\system32\ADVAPI32.dll
ModLoad: 77c50000 77cef000 C:\WINDOWS\system32\RPCRT4.dll
ModLoad: 76f50000 76f63000 C:\WINDOWS\system32\Secur32.dll
(1b8.2d4): Break instruction exception - code 80000003 (first chance)
eax=00035000 ebx=003514ab ecx=77e64590 edx=003b0000 esi=0012f7d0 edi=00000001
eip=0035151e esp=0012f7c0 ebp=0012f7dc iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
*** ERROR: Module load completed but symbols could not be loaded for C:\Documents and Settings\Administrator\Desktop\sample_safe.bin
0035151e 03c3 add eax,ebx

Note that it hits a “break instruction”! Perfect!

We know that the original instruction was “jmp edx”, and therefore the code is pointed at by edx. Sure enough, if we dump edx, we get something that looks like code:

0:000> u edx
003b0000 55 push ebp
003b0001 89e5 mov ebp,esp
003b0003 83ec04 sub esp,4
003b0006 56 push esi
003b0007 57 push edi
003b0008 53 push ebx
003b0009 e800000000 call 003b000e
003b000e 5b pop ebx

Perfect! We also know the length of the buffer, from the VirtualAlloc() call, so dump that many bytes to a file:

0:000> .writemem c:\\stage2.bin 0x3b0000 L0x30AD8
Writing 30ad8 bytes..................................................................................................

And open the file in IDA—yup, that’s code!

And thus we’re finished stage1. Congrats!

Stage2: going raw

If you load Stage2 in IDA, it’s going to complain that it isn’t actually a PE file—it’s raw code. That’s fine—load it as raw 32-bit code.

If you scroll around (you may need to use ‘c’ to mark stuff as code), you’ll see a small amount of code (with interspersed strings), followed by another block that looks encrypted/obfuscated, including something that looks suspiciously—but not exactly—like part of a PE header:

seg000:0000040E                 db  54h ; T
seg000:0000040F                 db  68h ; h
seg000:00000410                 db  69h ; i
seg000:00000411                 db  73h ; s
seg000:00000412                 db  20h
seg000:00000413                 db  0Eh
seg000:00000414                 db  70h ; p
seg000:00000415                 db  72h ; r
seg000:00000416                 db  6Fh ; o
seg000:00000417                 db  67h ; g
seg000:00000418                 db  67h ; g
seg000:00000419                 db  61h ; a
seg000:0000041A                 db  6Dh ; m
seg000:0000041B                 db  87h ; ç
seg000:0000041C                 db  63h ; c
seg000:0000041D                 db  47h ; G
seg000:0000041E                 db  6Eh ; n

Well, time to start all over!

By now, you know that my usual strategy is to let the program own itself, rather than spending a lot of time owning it. As a result, I don’t really know how the obfuscation works; I just know how to bypass it!

If you’re following along, I’m not going to be a ton of help on how to get the function readable. It’s a mixture of “c” (for “code” sections), and “u” (to undefine non-code portions). After you see a short “call” that jumps over some weird looking code, that code probably needs to be undefined (or defined as an “a”scii string).

If you do everything right, it should wind up looking like this:

seg000:00000000                 push    ebp             ; Standard function prefix
seg000:00000001                 mov     ebp, esp
seg000:00000003                 sub     esp, 4          ; 4 bytes for local variables
seg000:00000006                 push    esi
seg000:00000007                 push    edi
seg000:00000008                 push    ebx
seg000:00000009                 call    $+5
seg000:0000000E                 pop     ebx
seg000:0000000F                 sub     ebx, 40100Eh
seg000:00000015                 mov     eax, dword ptr fs:loc_2C+4
seg000:0000001B                 mov     eax, [eax+0Ch]
seg000:0000001E                 mov     eax, [eax+1Ch]
seg000:00000021 loc_21:                                 ; CODE XREF: seg000:0000002Aj
seg000:00000021                 mov     esi, [eax+8]
seg000:00000024                 cmp     byte ptr [eax+1Ch], 18h
seg000:00000028                 mov     eax, [eax]
seg000:0000002A                 jnz     short loc_21
seg000:0000002C loc_2C:                                 ; DATA XREF: seg000:00000015r
seg000:0000002C                                         ; sub_2E5+4:r
seg000:0000002C                 call    loc_40
seg000:0000002C ; ---------------------------------------------------------------------------
seg000:00000031 aGetProcAddress db 'GetProcAddress',0
seg000:00000040 ; ---------------------------------------------------------------------------
seg000:00000040 loc_40:                                 ; CODE XREF: seg000:loc_2Cp
seg000:00000040                 push    esi
seg000:00000041                 call    sub_188
seg000:00000046                 mov     [ebx+4013BCh], eax
seg000:0000004C                 call    loc_5E
seg000:0000004C ; ---------------------------------------------------------------------------
seg000:00000051 aLoadlibrarya   db 'LoadLibraryA',0
seg000:0000005E ; ---------------------------------------------------------------------------
seg000:0000005E loc_5E:                                 ; CODE XREF: seg000:0000004Cp
seg000:0000005E                 push    esi
seg000:0000005F                 call    dword ptr [ebx+4013BCh]
seg000:00000065                 mov     [ebx+4013C0h], eax
seg000:0000006B                 call    loc_80
seg000:0000006B ; ---------------------------------------------------------------------------
seg000:00000070 aUnmapviewoffile db 'UnmapViewOfFile',0
seg000:00000080 ; ---------------------------------------------------------------------------
seg000:00000080 loc_80:                                 ; CODE XREF: seg000:0000006Bp
seg000:00000080                 push    esi
seg000:00000081                 call    dword ptr [ebx+4013BCh]
seg000:00000087                 mov     [ebx+4013C4h], eax
seg000:0000008D                 call    loc_9F
seg000:0000008D ; ---------------------------------------------------------------------------
seg000:00000092 aVirtualalloc   db 'VirtualAlloc',0
seg000:0000009F ; ---------------------------------------------------------------------------
seg000:0000009F loc_9F:                                 ; CODE XREF: seg000:0000008Dp
seg000:0000009F                 push    esi
seg000:000000A0                 call    dword ptr [ebx+4013BCh]
seg000:000000A6                 mov     [ebx+4013C8h], eax
seg000:000000AC                 call    loc_BD
seg000:000000AC ; ---------------------------------------------------------------------------
seg000:000000B1 aVirtualfree    db 'VirtualFree',0
seg000:000000BD ; ---------------------------------------------------------------------------
seg000:000000BD loc_BD:                                 ; CODE XREF: seg000:000000ACp
seg000:000000BD                 push    esi
seg000:000000BE                 call    dword ptr [ebx+4013BCh]
seg000:000000C4                 mov     [ebx+4013CCh], eax
seg000:000000CA loc_CA:                                 ; CODE XREF: seg000:000000E0j
seg000:000000CA                 push    4
seg000:000000CC                 push    3000h
seg000:000000D1                 push    0A00000h
seg000:000000D6                 push    0
seg000:000000D8                 call    dword ptr [ebx+4013C8h]
seg000:000000DE                 test    eax, eax
seg000:000000E0                 jz      short loc_CA
seg000:000000E2                 mov     [ebp-4], eax
seg000:000000E5                 push    eax
seg000:000000E6                 lea     eax, [ebx+4013D0h]
seg000:000000EC                 mov     ecx, [eax+4]
seg000:000000EF                 add     eax, ecx
seg000:000000F1                 push    eax
seg000:000000F2                 call    sub_313
seg000:000000F7                 pop     eax
seg000:000000F8                 pop     eax
seg000:000000F9                 mov     esi, [ebp-4]
seg000:000000FC                 add     esi, [esi+3Ch]
seg000:000000FF                 mov     edi, [esi+34h]
seg000:00000102                 mov     eax, [ebp+10h]
seg000:00000105                 test    eax, eax
seg000:00000107                 jnz     short loc_114
seg000:00000109                 mov     eax, [ebp+0Ch]
seg000:0000010C                 dec     eax
seg000:0000010D                 test    eax, eax
seg000:0000010F                 jnz     short loc_114
seg000:00000111                 mov     edi, [ebp+8]
seg000:00000114 loc_114:                                ; CODE XREF: seg000:00000107j
seg000:00000114                                         ; seg000:0000010Fj
seg000:00000114                 push    edi
seg000:00000115                 call    dword ptr [ebx+4013C4h]
seg000:0000011B                 mov     eax, [esi+50h]
seg000:0000011E                 push    40h ; '@'
seg000:00000120                 push    3000h
seg000:00000125                 push    eax
seg000:00000126                 push    edi
seg000:00000127                 call    dword ptr [ebx+4013C8h]
seg000:0000012D                 mov     ecx, [esi+54h]
seg000:00000130                 mov     esi, [ebp-4]
seg000:00000133                 rep movsb
seg000:00000135                 mov     edi, eax
seg000:00000137                 push    dword ptr [ebp-4]
seg000:0000013A                 push    edi
seg000:0000013B                 call    sub_1E9
seg000:00000140                 push    edi
seg000:00000141                 call    sub_219
seg000:00000146                 push    edi
seg000:00000147                 call    sub_28A
seg000:0000014C                 push    8000h
seg000:00000151                 push    0
seg000:00000153                 push    dword ptr [ebp-4]
seg000:00000156                 call    dword ptr [ebx+4013CCh]
seg000:0000015C                 mov     eax, [ebx+4013CCh]
seg000:00000162                 lea     ecx, [ebx+401000h]
seg000:00000168                 mov     edx, [edi+3Ch]
seg000:0000016B                 add     edx, edi
seg000:0000016D                 mov     edx, [edx+28h]
seg000:00000170                 add     edx, edi
seg000:00000172                 push    edx
seg000:00000173                 push    edi
seg000:00000174                 call    sub_2E5
seg000:00000179                 pop     ebx
seg000:0000017A                 pop     edi
seg000:0000017B                 pop     esi
seg000:0000017C                 leave
seg000:0000017D                 push    8000h
seg000:00000182                 push    0
seg000:00000184                 push    ecx
seg000:00000185                 push    edx
seg000:00000186                 jmp     eax

One of the first things I recommend doing is to re-base the program (using edit->segments->rebase or something like that). I re-based to 0x3b0000, because that’s the address that was allocated by VirtualAlloc() on my system, and therefore is where the in-memory version ended up.

Some reversing

The first part took me some time to figure out:

seg000:003B0015                 mov     eax, large fs:30h ; This section basically gets a handle to kernel32.dll
seg000:003B001B                 mov     eax, [eax+0Ch]
seg000:003B001E                 mov     eax, [eax+1Ch]
seg000:003B0021 loc_3B0021:                             ; CODE XREF: seg000:003B002Aj
seg000:003B0021                 mov     esi, [eax+8]
seg000:003B0024                 cmp     byte ptr [eax+1Ch], 18h
seg000:003B0028                 mov     eax, [eax]
seg000:003B002A                 jnz     short loc_3B0021 ; When this ends, esi = handle to kernel32.dll

I actually googled parts of this, and eventually found an identical function online. Its purpose was to get a handle to the in-memory version of kernel32.dll. Sweet!
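The 18h in that comparison stops looking arbitrary once you count bytes. As I read the 32-bit loader structures (treat the offsets as an assumption), the code walks the PEB's list of loaded modules, where [eax+8] is a module's base address and [eax+1Ch] is the length in bytes of its UTF-16 name. A quick check, with a made-up helper name:

```python
def unicode_name_length(name):
    """Length in bytes of a module name stored as UTF-16, without the terminator."""
    return len(name.encode("utf-16-le"))


for module in ("ntdll.dll", "kernel32.dll"):
    print(module, hex(unicode_name_length(module)))
```

So the loop keeps walking until it hits a module whose name is 0x18 bytes long, which is the length of “kernel32.dll”, without ever comparing the name itself.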

You’ll then see this code:

seg000:003B002C loc_3B002C:
seg000:003B002C                 call    loc_3B0040
seg000:003B002C ; ---------------------------------------------------------------------------
seg000:003B0031 aGetProcAddress db 'GetProcAddress',0
seg000:003B0040 ; ---------------------------------------------------------------------------
seg000:003B0040 loc_3B0040:                             ; CODE XREF: seg000:loc_3B002Cp
seg000:003B0040                 push    esi             ; addr of kernel32.dll
seg000:003B0041                 call    find_function
seg000:003B0046                 mov     [ebx+test.addr_GetProcAddress], eax

(Note that I defined a struct for test.addr_GetProcAddress—it involves generous use of the ‘structs’ tab in a way it was never intended to be used in IDA).

The find_function() function was actually a guess that turned out to be right. This sequence of code gets a handle to the GetProcAddress() function, and stores it on line 0x3b0046.
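The [ebx+...] operands in these listings only make sense because of the call $+5 / pop ebx / sub ebx, 40100Eh sequence at the top of stage2. The pop gives the code its own runtime address, and the subtraction turns that into a delta between where it was built to run and where it actually landed. Here is that arithmetic as a sketch, using the 0x3b0000 base from my system; the helper names are made up.

```python
MASK = 0xFFFFFFFF  # registers are 32 bits wide, so all arithmetic wraps


def relocation_delta(runtime_addr_of_pop, built_addr_of_pop=0x40100E):
    """Value left in ebx after 'call $+5 / pop ebx / sub ebx, 40100Eh'."""
    return (runtime_addr_of_pop - built_addr_of_pop) & MASK


def resolve(ebx, displacement):
    """Where an operand like [ebx+4013BCh] really points at runtime."""
    return (ebx + displacement) & MASK


ebx = relocation_delta(0x3B000E)     # the pop sits at offset 0x0E of the buffer
print(hex(ebx))                      # the delta, as a wrapped 32-bit value
print(hex(resolve(ebx, 0x4013BC)))   # slot where GetProcAddress is stored
print(hex(resolve(ebx, 0x4013D0)))   # start of the obfuscated data
```

With this base, [ebx+4013D0h] resolves to 0x3b03d0, so every address in the 0x401000 range is really just an offset into the buffer.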

Then there are a bunch of sequences that basically look like:

seg000:003B004C                 call    loc_3B005E
seg000:003B004C ; ---------------------------------------------------------------------------
seg000:003B0051 aLoadlibrarya   db 'LoadLibraryA',0
seg000:003B005E ; ---------------------------------------------------------------------------
seg000:003B005E loc_3B005E:                             ; CODE XREF: seg000:003B004Cp
seg000:003B005E                 push    esi
seg000:003B005F                 call    [ebx+test.addr_GetProcAddress]
seg000:003B0065                 mov     [ebx+test.addr_LoadLibraryA], eax

Basically, it calls GetProcAddress() with “LoadLibraryA” as a parameter, and stores the result. It does this for a bunch of functions—basically, get pointers to a host of useful functions:

  • GetProcAddress
  • LoadLibraryA
  • UnmapViewOfFile
  • VirtualAlloc
  • VirtualFree

VirtualAlloc(), as you'll recall, was used in the last section to allocate space for decrypted memory. At this point, we can guess that it does the exact same thing again! Sure enough, it allocates memory; but surprisingly, it's not executable! Here's the call:

seg000:003B00CA loc_3B00CA:                             ; CODE XREF: seg000:003B00E0j
seg000:003B00CA                 push    4               ; flProtect = PAGE_READWRITE
seg000:003B00CC                 push    3000h           ; flAllocationType = MEM_RESERVE | MEM_COMMIT
seg000:003B00D1                 push    0A00000h        ; dwSize = 10,485,760 bytes
seg000:003B00D6                 push    0               ; lpAddress
seg000:003B00D8                 call    [ebx+test.addr_VirtualAlloc]
seg000:003B00DE                 test    eax, eax
seg000:003B00E0                 jz      short loc_3B00CA

Note how it keeps attempting to allocate memory until it works. It's shit like this, malware... Anyway, the memory is allocated! Then a function is called:

seg000:003B00E2                 mov     [ebp-4], eax    ; ebp-4 = allocated memory
seg000:003B00E5                 push    eax             ; allocated memory
seg000:003B00E6                 lea     eax, [ebx+test.field_4013D0] ; eax = ptr to encrypted data (003b03d0)
seg000:003B00EC                 mov     ecx, [eax+4]
seg000:003B00EF                 add     eax, ecx
seg000:003B00F1                 push    eax             ; Looks like start of obfuscated PE file (003b03e8)
seg000:003B00F2                 call    sub_3B0313      ; Complicated but looks harmless

The "encrypted data" (which, as we saw earlier, looks suspiciously like a PE file) is passed in, along with the allocated memory. A fairly complex function is called, which I looked through but didn't reverse. It's complicated, but ultimately harmless.

Active analysis

With clever use of breakpoints and sweating bullets, I let that function run. If you're interested, this is how I did it:

  • I ran sample_safe.bin in windbg
  • When the breakpoint fired, I moved eip to where the jump would have gone using "r eip=edx"
  • I set a breakpoint on line 0x3b00f7 using "bp 0x3b00f7"
  • I used "g" to continue the program

And bob's your uncle. Running malware like this, once again, is *dangerous*! If you're following along, please be careful! Anyway, once that function finishes, I check out the allocated memory:

0:000> db 900000
00900000  4d 5a 90 00 03 00 00 00-04 00 00 00 ff ff 00 00  MZ..............
00900010  b8 00 00 00 00 00 00 00-40 00 00 00 00 00 00 00  ........@.......
00900020  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ................
00900030  00 00 00 00 00 00 00 00-00 00 00 00 e8 00 00 00  ................
00900040  0e 1f ba 0e 00 b4 09 cd-21 b8 01 4c cd 21 54 68  ........!..L.!Th
00900050  69 73 20 70 72 6f 67 72-61 6d 20 63 61 6e 6e 6f  is program canno
00900060  74 20 62 65 20 72 75 6e-20 69 6e 20 44 4f 53 20  t be run in DOS 
00900070  6d 6f 64 65 2e 0d 0d 0a-24 00 00 00 00 00 00 00  mode....$.......

It's a PE file! w00t! So thinking that's a length might just be right! I dump the file to see if IDA recognizes it:

0:000> .writemem "c:\\stage3.bin" 0x900000 L0x00020a00
Writing 20a00 bytes..................................................................

Read it with IDA, and confirm that it's a valid PE. But... there's more code after the PE is decrypted. What's going on?
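You can also sanity-check the dump without IDA. The sketch below uses only the first 0x40 bytes shown in the "db 900000" output: it checks the "MZ" magic and reads the 4-byte field at offset 0x3c, which is where a DOS header stores the offset of the PE header. The function names are made up.

```python
import struct

# First 0x40 bytes of the "db 900000" dump, copied from the windbg output.
DOS_HEADER = bytes.fromhex(
    "4d5a900003000000" "04000000ffff0000"
    "b800000000000000" "4000000000000000"
    "0000000000000000" "0000000000000000"
    "0000000000000000" "00000000e8000000"
)


def has_mz_magic(data):
    """True if the buffer starts with the DOS 'MZ' signature."""
    return data[:2] == b"MZ"


def pe_header_offset(data):
    """Read e_lfanew, the little-endian dword at offset 0x3c."""
    return struct.unpack_from("<I", data, 0x3C)[0]


print(has_mz_magic(DOS_HEADER), hex(pe_header_offset(DOS_HEADER)))
```

With the full stage3.bin loaded, the next check would be whether the four bytes at that offset are "PE\0\0"; the dump above stops before that point, so it isn't shown here.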

Going above and beyond in stage2

So, this part is strictly unnecessary to figuring out how the malware works. I was simply curious, and wanted to make sure that nothing weird was going on. Some variables are moved around after the decryption, but at this point I'm carefully stepping through with a debugger. I find this code:

seg000:003B0114                 push    edi
seg000:003B0115                 call    [ebx+test.addr_UnmapViewOfFile]

And determine that edi is pointing to stage1. So, stage1 is unloaded. Then, some executable memory is allocated:

seg000:003B011E                 push    40h ; '@'       ; flProtect = PAGE_EXECUTE_READWRITE
seg000:003B0120                 push    3000h           ; flAllocationType = MEM_COMMIT | MEM_RESERVE
seg000:003B0125                 push    eax             ; dwSize = a bit bigger than the decrypt function returned
seg000:003B0126                 push    edi             ; lpAddress = the same address that the .dll was unloaded from
seg000:003B0127                 call    [ebx+test.addr_VirtualAlloc]

Some incredibly complicated functions are called. I surmised (correctly) that these are taking the PE file in memory, the one we just decrypted, and preparing it to be run. Basically, they do the relocations and other stuff that is involved with making a PE file actually runnable. Then the decrypted PE file (the raw copy from before relocation, not the relocated image) is freed:

seg000:003B0153                 push    dword ptr [ebp-4]
seg000:003B0156                 call    [ebx+test.addr_VirtualFree]
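The loader code around these calls reads a handful of fixed offsets from the decrypted PE's header ([esi+34h], [esi+50h], [esi+54h], and later [edx+28h]). In the standard 32-bit PE layout, counted from the "PE\0\0" signature, those offsets are the fields named below. The sketch reads them out of a buffer; the function name is made up, and the demo image is synthetic with invented values, since the real header isn't reproduced in this post.

```python
import struct

# Offsets from the "PE\0\0" signature in a 32-bit (PE32) image.
NT_FIELDS = {
    "AddressOfEntryPoint": 0x28,
    "ImageBase": 0x34,
    "SizeOfImage": 0x50,
    "SizeOfHeaders": 0x54,
}


def read_nt_fields(image):
    """Follow e_lfanew (offset 0x3c) and read the four dwords the loader uses."""
    e_lfanew = struct.unpack_from("<I", image, 0x3C)[0]
    return {name: struct.unpack_from("<I", image, e_lfanew + offset)[0]
            for name, offset in NT_FIELDS.items()}


# Synthetic demo image: every value below is invented for illustration.
demo = bytearray(0x200)
demo[0:2] = b"MZ"
struct.pack_into("<I", demo, 0x3C, 0xE8)               # e_lfanew
demo[0xE8:0xEC] = b"PE\0\0"
struct.pack_into("<I", demo, 0xE8 + 0x28, 0x1234)      # AddressOfEntryPoint
struct.pack_into("<I", demo, 0xE8 + 0x34, 0x10000000)  # ImageBase
struct.pack_into("<I", demo, 0xE8 + 0x50, 0x39000)     # SizeOfImage
struct.pack_into("<I", demo, 0xE8 + 0x54, 0x400)       # SizeOfHeaders

for name, value in read_nt_fields(bytes(demo)).items():
    print(name, hex(value))
```

Read that way, the second VirtualAlloc() is sized by SizeOfImage, the rep movsb copies SizeOfHeaders bytes, and the value that ends up in edx is the image base plus AddressOfEntryPoint.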
    Then finally, this sequence is found:
    seg000:003B0174                 call    sub_3B02E5
    seg000:003B0179                 pop     ebx
    seg000:003B017A                 pop     edi
    seg000:003B017B                 pop     esi
    seg000:003B017C                 leave
    seg000:003B017D                 push    8000h
    seg000:003B0182                 push    0
    seg000:003B0184                 push    ecx
    seg000:003B0185                 push    edx
    seg000:003B0186                 jmp     eax
    This is actually really cool. It calls sub_3b02e5, which returns a pointer to VirtualFree(). At 0x3b0186, VirtualFree() is jumped to. That leads to two questions: why a jmp and not a call? And what's with all the pushes? Well, here's what's happening: a called function returns to whatever is on top of the stack when it starts running. If you jmp to a function that expects to be called, it returns to the last thing pushed onto the stack. Since that happens to be edx (pushed at 0x3b0185), that becomes the return address. (edx happens to be the entrypoint of the new .dll file, stage3.)
    The next value on the stack, ecx (pushed at 0x3b0184), is the first parameter to VirtualFree(), lpAddress. It's the starting address of the current code: 0x3b0000. The other two values, 0 and 0x8000, are the remaining VirtualFree() parameters: dwSize = 0 and dwFreeType = MEM_RELEASE. VirtualFree() is stdcall, so it pops all three arguments off the stack when it returns. To summarize: this piece of code frees itself, then returns into the loaded .dll file. That's really cool! Very little malware will actually clean up after itself like we see here. This tells me that the malware was written by somebody who actually cares about code quality. I'm impressed!
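    Here's a toy Python model of that jmp trick, purely as a sketch. It relies on VirtualFree() being documented as a three-argument stdcall function (lpAddress, dwSize, dwFreeType). The STAGE3_ENTRY value is a made-up placeholder for whatever edx holds; LOADER_BASE is the 0x3b0000 base from the listing. Nothing here touches real memory, it only tracks what ends up where on the stack.

```python
MEM_RELEASE = 0x8000          # dwFreeType value pushed at 0x3b017d
LOADER_BASE = 0x003B0000      # ecx: start of the currently running loader code
STAGE3_ENTRY = 0x10001000     # edx: HYPOTHETICAL placeholder for stage3's entrypoint


def simulate_jmp_to_stdcall(pushes, arg_count):
    """Model what a stdcall callee sees when it is reached via jmp.

    `pushes` lists the values in the order they were pushed, so the last
    element is the top of the stack.
    """
    stack = list(pushes)
    # The callee's `ret` pops the top of the stack as its return address.
    return_address = stack.pop()
    # stdcall: the callee also removes its own arguments (`ret N`).
    # The first argument is the one that was pushed last.
    args = [stack.pop() for _ in range(arg_count)]
    return return_address, args, stack


# Same order as the listing: push 8000h / push 0 / push ecx / push edx / jmp eax
ret_addr, args, leftover = simulate_jmp_to_stdcall(
    [MEM_RELEASE, 0, LOADER_BASE, STAGE3_ENTRY], arg_count=3
)
print(hex(ret_addr), [hex(a) for a in args], leftover)
```

    In this model, the "return address" is the stage3 entrypoint, the three arguments line up as VirtualFree(0x3b0000, 0, MEM_RELEASE), and nothing from the four pushes is left on the stack afterwards.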

    Stage3: The final frontier

    Stage3 is actually pretty straightforward, although it does a lot of stuff that I haven't actually reversed. I've also made a lot of educated guesses on how it works that I've validated. If you're following along, this is in stage3.bin. Essentially, it's a compressed payload stored in a PE resource. Let's look at what that means... First off, look at the 'strings' window (shift-f12 in IDA). Looking at the strings window is almost always the first thing I do, with malware and also legit software. In this case, you'll see some interesting strings:
    .rdata:100054A4 aAplibV1_01TheS db 'aPLib v1.01 - the smaller the better :)',0Dh,0Ah
    .rdata:100054A4 db 'Copyright (c) 1998-2009 by Joergen Ibsen, All Rights Reserved.',0Dh,0Ah
    .rdata:100054A4 db 0Dh,0Ah
    .rdata:100054A4 db 'More information:',0Dh,0Ah
    .rdata:100054A4 db 0Dh,0Ah,0
    Immediately, I know it's using compression. That's handy! If you follow the DllEntryPoint() function to its calls, you'll quickly find this:
    .text:10001C2E ; int __cdecl sub_10001C2E(HMODULE hModule, int)
    .text:10001C2E sub_10001C2E proc near ; CODE XREF: do_stuff+12p
    .text:10001C2E hModule = dword ptr 8
    .text:10001C2E arg_4 = dword ptr 0Ch
    .text:10001C2E push ebp
    .text:10001C2F mov ebp, esp
    .text:10001C31 push esi
    .text:10001C32 push 0Ah ; lpType
    .text:10001C34 push 65h ; lpName
    .text:10001C36 push [ebp+hModule] ; hModule
    .text:10001C39 call ds:FindResourceA
    .text:10001C3F mov esi, eax
    .text:10001C41 test esi, esi
    .text:10001C43 jz short loc_10001C9E
    .text:10001C45 push edi
    .text:10001C46 push esi ; hResInfo
    .text:10001C47 push [ebp+hModule] ; hModule
    .text:10001C4A call ds:SizeofResource
    .text:10001C50 mov edi, eax
    .text:10001C52 test edi, edi
    .text:10001C54 jz short loc_10001C9D
    .text:10001C56 push ebx
    .text:10001C57 push esi ; hResInfo
    .text:10001C58 push [ebp+hModule] ; hModule
    .text:10001C5B call ds:LoadResource
    .text:10001C61 mov ebx, eax
    .text:10001C63 test ebx, ebx
    .text:10001C65 jz short loc_10001C9C
    .text:10001C67 push ebx ; hResData
    .text:10001C68 call ds:LockResource
    .text:10001C6E mov esi, eax
    .text:10001C70 test esi, esi
    .text:10001C72 jnz short loc_10001C78
    Note the calls: FindResourceA(), SizeofResource(), LoadResource(), and LockResource(). If you're interested in what these are doing exactly, you can find plenty of info on MSDN. Suffice it to say, it loads a resource from the PE, identified by the values passed into FindResourceA(): resource 0x65 (101), of type 0x0A (RT_RCDATA, i.e. raw data). If you load the file in a resource viewer such as PEExplorer, you can view the resource section and dump resource 0x65 into a file. That file looks like:
    $ xxd -g1 stage4_compressed.bin | head
    0000000: 41 50 33 32 18 00 00 00 a1 c9 01 00 0b e4 d7 66 AP32...........f
    0000010: 0b 51 03 00 f2 8d 91 b3 0b 38 51 03 1c 49 01 38 .Q.......8Q..I.8
    0000020: 37 b7 0e 0f 8c 07 09 7b d0 1a 01 be bc 55 1c 8b 7......{.....U..
    The file starts with AP32, and earlier we saw a compression library called "aPLib" referenced. Compressed payload, anyone? As of this writing, you can download the official AP32 sample application here. You can unpack the file with the appack.exe utility:
    $ ./appack.exe d ./stage4_compressed.bin stage4.bin
    aPLib example Copyright (c) 1998-2009 by Joergen Ibsen / Jibz
                                                                All Rights Reserved
    decompressed 117177 -> 217355 bytes in 0.00 seconds
    (That tool is super buggy; you might have to move directories and stuff to get it to work. It's just a sample, after all.)
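    If you'd like to sanity-check those numbers without running appack.exe, here's a minimal Python sketch that decodes the first 24 bytes of the dump above. The field layout (tag, header size, packed size, packed CRC, original size, original CRC) is an assumption, based on the aPLib "safe" header format as I understand it. It does line up with the tool's output, though: the header says 24 + 117153 bytes packed and 217355 bytes unpacked.

```python
import struct

# First 24 bytes of stage4_compressed.bin, copied from the xxd dump above.
HEADER = bytes.fromhex(
    "41503332" "18000000" "a1c90100" "0be4d766"
    "0b510300" "f28d91b3"
)


def parse_ap32_header(data: bytes) -> dict:
    """Decode an AP32 header. The field names are assumptions, not
    something recovered from the sample itself."""
    tag, header_size, packed_size, packed_crc, orig_size, orig_crc = (
        struct.unpack_from("<4s5I", data, 0)  # little-endian: 4-byte tag + 5 uint32
    )
    if tag != b"AP32":
        raise ValueError("not an AP32 container")
    return {
        "header_size": header_size,
        "packed_size": packed_size,
        "packed_crc": packed_crc,
        "orig_size": orig_size,
        "orig_crc": orig_crc,
    }


header = parse_ap32_header(HEADER)
print(header["header_size"] + header["packed_size"], "->", header["orig_size"])
```

    The two printed sizes should match the "decompressed 117177 -> 217355 bytes" line from appack.exe.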

    Decompressed... now what?

    Once decompressed, it looks like:
    0000000: 0b 51 03 00 49 01 00 00 b7 03 00 00 0b 07 00 00 .Q..I...........
    0000010: 0b 7b 01 00 00 00 00 00 00 00 00 00 00 00 00 00 .{..............
    0000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    0000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    0000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    Hmm... it's clearly not compressed/encrypted, but it doesn't look like anything special. If you scroll around, you'll quickly find a PE header. If you scroll down a lot more, you'll find another PE header. After looking at the numbers in the first 20 bytes, I realized that they are a length followed by offsets into the file. Some of those offsets point to PE files! Note that I haven't even *started* reversing the code that processes this (and I never did); I determined all this by simple guessing and checking. Here are the first few int32 values (note: little endian), and what they seem to mean:
    • 0x0003510b - the length of the entire file
    • 0x00000149 - at this offset, there's some raw code: 55 8b ... (a raw binary function). I initially assumed a 16-bit version, but that doesn't seem likely
    • 0x000003b7 - at this offset, nothing special, but it appears to be code. Not sure what its deal is...
    • 0x0000070b - a proper PE file, which we're gonna call "stage4.bin"
    • 0x00017b0b - another PE file. I guessed that this is a 64-bit version of the same thing, which seems likely: upon inspection, it has the same imports/strings
    • 0x00000000
    I called the file at 0x70b the actual payload. There are also some loader functions and a 64-bit payload that I'm going to ignore.
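    To carve those pieces out, something like the following Python sketch should do. It rests on two assumptions that come from the same guess-and-check approach rather than from reversing the loader: that the header is exactly five little-endian int32 values (total length, then four offsets), and that each piece runs up to the next offset. The piece names are just labels I made up.

```python
import struct

# HYPOTHETICAL labels for the four offsets, in header order.
NAMES = ["loader_code_1", "loader_code_2", "payload_32bit_pe", "payload_64bit_pe"]


def split_container(blob: bytes) -> dict:
    """Split the decompressed blob into its pieces.

    Assumes each piece ends where the next one starts, and that the last
    piece ends at the total length stored in the first header field.
    """
    total_len, *offsets = struct.unpack_from("<5I", blob, 0)
    if total_len != len(blob):
        raise ValueError("length field does not match the blob size")
    bounds = offsets + [total_len]
    return {
        name: blob[start:end]
        for name, start, end in zip(NAMES, bounds, bounds[1:])
    }


# Usage, assuming the appack.exe output from earlier was saved as stage4.bin:
#   parts = split_container(open("stage4.bin", "rb").read())
#   open("stage4_payload32.bin", "wb").write(parts["payload_32bit_pe"])
```

    A quick check on the result is that both payload pieces should start with the "MZ" signature.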

    Odds and ends of stage3

    If you want to know more about stage3, keep reading! This section is very light: the code is complex, and doesn't really add much, so I'm going to give a quick high-level overview of it. It creates a .dll file whose name is based on the hard drive serial number and an implementation of a standard pseudo-random number generator. This means that, every time it's installed on the same machine, the .dll will get the same name. It injects the .dll file into every running process, by the looks of it. It puts a lot of effort into determining whether to use the 64- or 32-bit version for each running executable (including correctly detecting the use of Wow64). Once again, because of the cleanness and the fact that it handles 32- and 64-bit systems, as well as Wow64 processes, appropriately, I feel like this was written by somebody who clearly knows what they are doing.
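    To illustrate why a serial-number-seeded PRNG always gives the same filename on the same machine, here's a small Python sketch. To be clear, this is NOT the sample's actual algorithm: the generator (a linear congruential generator with commonly used constants), the name length, and the lowercase alphabet are all assumptions picked just to show the idea.

```python
import string


def derive_dll_name(volume_serial: int, length: int = 8) -> str:
    """Derive a deterministic filename from a 32-bit serial number.

    HYPOTHETICAL: the LCG constants, the name length, and the alphabet
    are illustrative choices, not values taken from the malware.
    """
    state = volume_serial & 0xFFFFFFFF
    letters = []
    for _ in range(length):
        # One LCG step; the seed fully determines every later state.
        state = (state * 214013 + 2531011) & 0xFFFFFFFF
        value = (state >> 16) & 0x7FFF
        letters.append(string.ascii_lowercase[value % 26])
    return "".join(letters) + ".dll"


# Same serial in, same name out, no matter how many times it runs.
print(derive_dll_name(0x1234ABCD))
print(derive_dll_name(0x1234ABCD))
```

    Since nothing but the serial number feeds the generator, a defender who knows the real algorithm could predict the filename for a given machine.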


    Once you extract the 32-bit .dll file from the decompressed data, you have what I called stage4.bin. This is the final stage, and it implements the actual malicious functionality. As I said initially, I haven't reversed it. But if you look at it in IDA, you'll see a ton of command-and-control-like functionality. It contacts servers over HTTPS, it modifies Web sites, and lots more interesting stuff. When I have more time, I'll look at it in more detail! Hope you enjoyed this!

