lab15

Lab 15-1

Since the book already describes the executable’s purpose, basic static and dynamic analysis add little here. We jump straight into IDA to find the command-line argument that prints “Good Job!”.

From the start, it seems like there’s an issue. IDA fails to load graph mode, and all the lines from main are red, meaning the code is not reachable in the normal control flow of main.

xor     eax, eax
jz      short near ptr loc_401010+1

The culprit is this jz instruction. Because xor eax, eax always sets the zero flag, the jump is always taken. Its target, loc_401010+1, falls in the middle of what IDA decoded as a single instruction. This happens because IDA first disassembles the fall-through path of the jz (a path that can never execute) and only afterwards processes the jump target, which by then is already covered by another instruction. The fix is easy: mark the instruction at 401010 as data, then select address 401011 and the bytes after it and define them as code so IDA reanalyzes them.

The rogue byte is in this case 0xE8, which corresponds to a call with relative offset.

This anti-disassembly technique appears multiple times here, so we just repeat the same steps until the listing is correct.

Next, once the junk bytes are defined as data, we NOP both the jump instructions and those bytes, then undefine and redefine main as a function, and we get our beautiful graph view back.

I wrote a plugin while solving this lab, since I wanted a shortcut like the one in the book to NOP out instructions and the code given there was outdated. Here is mine, bound to Ctrl-Alt-N; feel free to use it:

import idaapi
import ida_kernwin
import ida_bytes
import ida_ua

ACTION_NAME = "nop_next:action"

class NopNextHandler(ida_kernwin.action_handler_t):
    def activate(self, ctx):
        ea = ida_kernwin.get_screen_ea()

        size = ida_bytes.get_item_size(ea)
        if size <= 0:
            ida_kernwin.msg("[NOP] No item at cursor\n")
            return 0

        # Patch bytes
        for i in range(size):
            ida_bytes.patch_byte(ea + i, 0x90)

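        # Fix IDA database: undefine the patched range, then recreate
        # one NOP instruction per byte so no byte is left undefined
        ida_bytes.del_items(ea, ida_bytes.DELIT_SIMPLE, size)
        for i in range(size):
            ida_ua.create_insn(ea + i)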
        ida_kernwin.msg(f"[NOP] NOP'd {size} bytes at {hex(ea)}\n")
        ida_kernwin.jumpto(ea + size)
        return 1

    def update(self, ctx):
        return ida_kernwin.AST_ENABLE_ALWAYS


class NopNextPlugin(idaapi.plugin_t):
    flags = idaapi.PLUGIN_KEEP
    comment = "NOP instruction and go next"
    help = ""
    wanted_name = "NOP + Next"
    wanted_hotkey = ""

    def init(self):
        handler = NopNextHandler()

        action_desc = ida_kernwin.action_desc_t(
            ACTION_NAME,              # name
            "NOP and go next",        # label
            handler,                 # handler
            "Ctrl-Alt-N",             # shortcut
            "NOP instruction",       # tooltip
            -1                        # icon
        )

        ida_kernwin.register_action(action_desc)
        ida_kernwin.attach_action_to_menu(
            "Edit/Patch programs/",
            ACTION_NAME,
            ida_kernwin.SETMENU_APP
        )

        ida_kernwin.msg("[NOP+Next] Loaded (Ctrl+Alt+N)\n")
        return idaapi.PLUGIN_KEEP

    def run(self, arg):
        pass

    def term(self):
        ida_kernwin.unregister_action(ACTION_NAME)


def PLUGIN_ENTRY():
    return NopNextPlugin()

The password is easily deduced from the code. It is read from the argv array (the first command-line argument) and compared character by character (first, third, then second). The “Good Job!” message is printed for the string “pdq”.
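The comparison order can be sketched in a few lines of Python. This is only an illustration of the check described above, not the binary's actual code: the function name `check_password` is hypothetical, and the length guard is my own addition so the sketch does not raise on short input.

```python
def check_password(arg: str) -> bool:
    """Mimic the order of the checks: 1st, 3rd, then 2nd character."""
    # Assumption: guard against short input (added for the Python sketch only).
    if len(arg) < 3:
        return False
    if arg[0] != "p":   # first character
        return False
    if arg[2] != "q":   # third character
        return False
    if arg[1] != "d":   # second character
        return False
    return True


print(check_password("pdq"))
print(check_password("pqd"))
```

Checking the characters out of order does not make the password harder to find; it only makes the listing slightly less obvious to read.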

Lab 15-2

Similar to the previous lab, the main function is malformed and graph mode is not functional.

The first anti-disassembly technique appears in this jnz instruction. Under normal circumstances, ESP will never be 0 in a valid running program, so the jnz will always be taken. This confuses IDA’s flow analysis. By undefining the instruction at 40115E and defining 40115F as code, the issue is corrected and the control flow becomes visible.

Inside these newly discovered instructions, a function named sub_401386 is called just before InternetOpenUrlA. It reconstructs the C2 URL at runtime.

mov     [ebp+String], 68h ; 'h'
mov     [ebp+var_33], 74h ; 't'
mov     [ebp+var_32], 74h ; 't'
mov     [ebp+var_31], 70h ; 'p'
mov     [ebp+var_30], 3Ah ; ':'
mov     [ebp+var_2F], 2Fh ; '/'

This function contains almost 50 local variables. The reason becomes clear once we examine the logic. IDA mislabeled String as a single byte, so each character was treated as a separate local variable. In reality, all of these locals form one continuous string. We can redefine all the locals as part of a single buffer.
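To see why the separate locals are really one string, here is a small sketch that takes the six writes from the listing above and orders them by frame offset. The assumption (not shown in the listing) is that String sits at ebp-0x34, directly below var_33; the name `rebuild_stack_string` is just for illustration.

```python
# Frame offset -> byte value, deliberately listed out of order.
# Assumption: String is at ebp-0x34, right below var_33.
writes = {
    -0x31: 0x70,  # var_31 = 'p'
    -0x34: 0x68,  # String = 'h'
    -0x2F: 0x2F,  # var_2F = '/'
    -0x33: 0x74,  # var_33 = 't'
    -0x30: 0x3A,  # var_30 = ':'
    -0x32: 0x74,  # var_32 = 't'
}


def rebuild_stack_string(writes: dict) -> str:
    """Concatenate the bytes in ascending frame-offset order."""
    return "".join(chr(writes[offset]) for offset in sorted(writes))


print(rebuild_stack_string(writes))
```

Sorting by offset rather than by instruction order means the result stays correct even if the compiler or the malware author shuffles the mov instructions.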

Manually rebuilding the string character by character would work, but I wanted to automate it with Python:

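import ida_funcs, ida_ua, ida_kernwin

# Function under the cursor
ea = ida_kernwin.get_screen_ea()
f = ida_funcs.get_func(ea)
if f is None:
    raise RuntimeError("Cursor is not inside a function")

out = {}  # stack displacement -> byte value
cur = f.start_ea

while cur < f.end_ea:
    insn = ida_ua.insn_t()
    size = ida_ua.decode_insn(insn, cur)
    if size <= 0:
        # Not decodable as an instruction: skip one byte to avoid looping forever
        cur += 1
        continue

    if ida_ua.print_insn_mnem(cur) == "mov":
        op1, op2 = insn.ops[0], insn.ops[1]

        # mov [ebp+disp], imm -> one character written to the stack buffer
        if op1.type == ida_ua.o_displ and op2.type == ida_ua.o_imm:
            out[op1.addr] = op2.value & 0xFF

    cur += size

# Concatenate the characters in ascending stack order
s = "".join(chr(out[k]) for k in sorted(out))

print(s)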

The extracted C2 is: http://www.practicalmalwareanalysis.com/bamboo.html

.text:00401215                 jmp     short near ptr loc_401215+1
.text:00401215 ; ---------------------------------------------------------------------------
.text:00401217                 db 0C0h
.text:00401218                 db  48h ; H

Another odd construct appears here. The address 00401216 does not exist as a normal instruction target. This jump lands in the middle of itself: loc_401215+1 points to the second byte of the jmp instruction. This is a classic anti-disassembly trick.
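The target address can be checked by hand. A short jump is two bytes (opcode plus a signed 8-bit displacement), and the displacement is relative to the address of the next instruction. The sketch below assumes the jmp is encoded as EB FF, which is what a jump to loc_401215+1 implies; the helper name `short_jump_target` is hypothetical.

```python
def short_jump_target(addr: int, rel8: int) -> int:
    """Target of a 2-byte short jump (opcode + rel8) located at addr."""
    if rel8 >= 0x80:      # sign-extend the 8-bit displacement
        rel8 -= 0x100
    return addr + 2 + rel8


# Assumed encoding: EB FF at 0x401215, i.e. a displacement of -1.
target = short_jump_target(0x401215, 0xFF)
print(hex(target))
```

A displacement of -1 sends execution back to the jump's own second byte, so the 0xFF byte is used twice: once as the displacement and once as the first byte of the next instruction.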

To understand the real execution flow, we can undefine the jmp and define the target bytes as code.

.text:00401216                 inc     eax
.text:00401218                 dec     eax
.text:00401219                 call    sub_40130F

Here, inc and dec cancel each other out and serve no purpose. They are only there to confuse the disassembler. We can safely NOP this entire sequence starting from the jump.

The call to sub_40130F performs the same type of string construction as the C2 loader. I redefined the locals into a single string and reused the Python script to extract it. The resulting string is: Account Summary.xls.exe

.text:00401269                 jz      short near ptr loc_40126D+1
.text:0040126B                 jnz     short near ptr loc_40126D+1

The next trick uses both jz and jnz to jump to the same invalid offset. No matter what the flags are, execution always flows into the same obfuscated region. We can simply NOP both of these jumps and the bytes that follow.

.text:004012E6                 mov     ax, 5EBh
.text:004012EA                 xor     eax, eax
.text:004012EC                 jz      short near ptr loc_4012E6+2
.text:004012EE                 call    near ptr 0AA1D5Dh

This block is similar to the previous case, but now the jump lands inside the mov instruction rather than the jump itself. The CPU ends up executing bytes that IDA does not expect. If we undefine the area and reanalyze it, the real flow becomes clearer.

.text:004012E8                 jmp     short near ptr loc_4012EE+1
.text:004012EA ; ---------------------------------------------------------------------------
.text:004012EA                 xor     eax, eax
.text:004012EC                 jz      short loc_4012E8
.text:004012EE
.text:004012EE loc_4012EE:                             ; CODE XREF: .text:loc_4012E8↑j
.text:004012EE                 call    near ptr 0AA1D5Dh

The rewritten code shows another jump to loc_4012EE+1, which again lands inside the call instruction. By undefining the instruction at that address and defining 4012EF as code, we see what the CPU actually executes.

.text:004012E3 ; ---------------------------------------------------------------------------
.text:004012E6                 db 66h
.text:004012E7                 db 0B8h
.text:004012E8 ; ---------------------------------------------------------------------------
.text:004012E8
.text:004012E8 loc_4012E8:                             ; CODE XREF: .text:004012EC↓j
.text:004012E8                 jmp     short loc_4012EF
.text:004012EA ; ---------------------------------------------------------------------------
.text:004012EA                 xor     eax, eax
.text:004012EC                 jz      short loc_4012E8
.text:004012EC ; ---------------------------------------------------------------------------
.text:004012EE                 db 0E8h
.text:004012EF ; ---------------------------------------------------------------------------
.text:004012EF
.text:004012EF loc_4012EF:                             ; CODE XREF: .text:loc_4012E8↑j
.text:004012EF                 push    0Ah

This entire block is pure obfuscation and does not implement any meaningful logic. After NOPing it, redefining the function, and reanalyzing, the graph view becomes readable again.
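The chain of jumps in this block can be traced over the raw bytes. This is a sketch under a few assumptions: the xor eax, eax encoding (33 C0) and the jz displacement (FA) are reconstructed from the listings above rather than copied from a hex dump, and `rel8_target` is a hypothetical helper.

```python
BASE = 0x4012E6
CODE = bytes([
    0x66, 0xB8, 0xEB, 0x05,  # mov ax, 5EBh (the immediate hides EB 05)
    0x33, 0xC0,              # xor eax, eax (assumed encoding)
    0x74, 0xFA,              # jz short 4012E8 (displacement derived from target)
    0xE8,                    # rogue call opcode, never executed
])


def rel8_target(addr: int) -> int:
    """Follow the 2-byte short jump stored at addr inside CODE."""
    rel = CODE[addr - BASE + 1]
    if rel >= 0x80:          # sign-extend the 8-bit displacement
        rel -= 0x100
    return addr + 2 + rel


first_hop = rel8_target(0x4012EC)    # the visible jz
second_hop = rel8_target(first_hop)  # the jmp hidden inside the mov immediate
print(hex(first_hop), hex(second_hop))
```

The second hop lands one byte past 0x4012EE, which is why the 0xE8 byte can be treated as junk.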

What URL is initially requested by the program? How is the User-Agent generated?

As shown earlier, the C2 URL is reconstructed by sub_401386: http://www.practicalmalwareanalysis.com/bamboo.html

The User-Agent is generated dynamically. First, gethostname retrieves the system hostname. Then a loop processes each character (up to 256 iterations). Each character is compared against ‘Z’, ‘z’, and ‘9’.

If the character is none of those, it is incremented by 1.
‘Z’ wraps to ‘A’
‘z’ wraps to ‘a’
‘9’ wraps to ‘0’

This is essentially a Caesar-style shift of +1. For example: abz19 → bca20
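The transformation is simple enough to reproduce, which is handy for predicting the User-Agent a given host would send. The following is a minimal sketch of the rules listed above; the function name `transform_hostname` is my own.

```python
def transform_hostname(hostname: str) -> str:
    """Shift every character by +1, wrapping 'Z', 'z' and '9'."""
    wrap = {"Z": "A", "z": "a", "9": "0"}
    # The loop described above runs for at most 256 characters.
    return "".join(wrap.get(ch, chr(ord(ch) + 1)) for ch in hostname[:256])


print(transform_hostname("abz19"))
```

Note that, following the description, every other character is incremented as well, so punctuation in a hostname also changes.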

lea     edx, [ebp+hostname]
push    edx             ; lpszAgent
call    ds:InternetOpenA

The transformed hostname is then used as the User-Agent string for the HTTP request.

What does the program look for in the page it initially requests? What does the program do with the information it extracts from the page?

The program first searches the response page for the marker string “Bamboo::”. Once found, it searches again for the delimiter “::” that marks the end of the payload.

mov     byte ptr [eax], 0

This instruction null-terminates the extracted substring. After that, the filename Account Summary.xls.exe is constructed in preparation for dropping a file. The pointer into the page buffer is advanced by 8 bytes to skip “Bamboo::”, and the resulting substring is treated as a URL.

The malware then downloads the file, writes it to disk in the current directory, and executes it using ShellExecuteA.

As a POC, I hosted my own C2 with a bamboo.html file and debugged the malware so it would open my http://192.168.0.104/bamboo.html instead of the original link. I hosted a lightweight calculator app on the same server and placed the link between the indicators.

Interestingly, the malware confirmed my suspicions. When it called strstr with the target “::”, it did not find the closing delimiter as intended. With the payload Bamboo::http://192.168.0.104/bamboo.html::, the pointer returned by the first strstr points at Bamboo::…, so the second strstr, which starts from that same pointer, matches the “::” that is part of the marker itself. The malware therefore has a parsing bug and cannot properly extract the URL. If it had first added 8 to the pointer (to skip “Bamboo::”) and then searched again, it would have worked correctly.
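The behaviour I observed can be modelled in Python with a bytearray standing in for the C buffer. This is a sketch of my reading of the logic, not decompiled code; the function name `extract_url` and the `skip_marker_first` switch are hypothetical and only exist to compare the buggy order with the corrected one.

```python
MARKER = b"Bamboo::"
DELIM = b"::"
PAGE = b"Bamboo::http://192.168.0.104/bamboo.html::"


def extract_url(page: bytes, skip_marker_first: bool) -> bytes:
    """Model the two strstr calls, the null write and the +8 skip."""
    buf = bytearray(page)
    start = buf.find(MARKER)                      # first strstr
    if start < 0:
        return b""
    search_from = start + len(MARKER) if skip_marker_first else start
    end = buf.find(DELIM, search_from)            # second strstr
    if end < 0:
        return b""
    buf[end] = 0                                  # mov byte ptr [eax], 0
    url_start = start + len(MARKER)               # pointer advanced by 8
    nul = buf.find(b"\x00", url_start)            # C strings stop at the first NUL
    return bytes(buf[url_start:nul if nul >= 0 else len(buf)])


print(extract_url(PAGE, skip_marker_first=False))
print(extract_url(PAGE, skip_marker_first=True))
```

In the buggy order, the null byte is written inside the marker, before the point where the URL starts, so the extracted string keeps its trailing delimiter.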

For testing, I fixed it in the debugger and tried again.

Once I manually replaced the second “::” with a null terminator, the calculator app was properly installed and launched.

Lab 15-3

First, the program prints a banner and starts iterating over the running processes using the standard CreateToolhelp32Snapshot function. For each process, the process name is printed to the terminal, and OpenProcess is used to obtain a handle for further inspection.

Then, GetPriorityClass is called on each process. If the call fails, the error message is passed to a helper function. That function retrieves the last error code using GetLastError and formats it into a readable string for the user.

Next, additional process information is displayed, and two other routines are called to enumerate the process’s loaded modules and their information, as well as the associated threads.

At this point, if we trust IDA’s linear flow, the executable does not look malicious. However, when checking the Functions window, we notice that one function is never referenced normally: sub_401534. Inspecting its cross-references shows that it is called twice from an undefined area of code.

Following the normal control flow of main, that region should never execute, which means something is redirecting execution in a non-obvious way.

mov     eax, 400000h
or      eax, 148Ch
mov     [ebp+4], eax

It turns out we missed something at the very beginning of main. After the first instructions, eax ends up containing 0x40148C. This value is then written to [ebp+4]. On x86, [ebp+4] normally stores the return address of the current function. The stack layout is such that [ebp+0] holds the saved base pointer, and [ebp+4] holds the address used by ret.

By overwriting it, the program replaces the normal return address. Therefore, when main finishes and executes ret, execution continues at 0x40148C instead of returning to the caller. This address contains instructions that IDA did not initially recognize as part of a function, effectively hiding the real control flow in plain sight. It also contains several anti‑disassembly tricks, which we can clean up by patching the useless bytes as done earlier.
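The hidden address is split across two instructions so that 0x40148C never appears as a single constant in the listing. The arithmetic is easy to verify; the variable names below simply mirror the registers and are for illustration only.

```python
eax = 0x400000          # mov eax, 400000h
eax |= 0x148C           # or  eax, 148Ch
new_return_address = eax  # mov [ebp+4], eax

print(hex(new_return_address))
```

Because the two constants share no set bits, the or behaves like an addition here, and a search for the immediate 40148Ch in the disassembly finds nothing.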

The cleaned code implements a Structured Exception Handling (SEH) setup on 32‑bit Windows. First, the address 0x4014C0 is registered as an exception handler, meaning it will be executed if an exception occurs.

On x86 Windows, fs:[0] points to the head of the SEH chain for the current thread. When the code executes:

push offset dword_4014C0
push dword ptr fs:0
mov  fs:0, esp

it creates a new SEH record on the stack. The previous handler is saved, and 0x4014C0 becomes the new handler that Windows will invoke when an exception is raised.

Immediately after installing the handler, the code runs:

xor ecx, ecx
div ecx

which deliberately triggers a divide‑by‑zero exception. Windows catches this exception and transfers execution to the handler at 0x4014C0 instead of continuing normal execution. This technique hides the real jump from static analysis.

Cleaning the next stage, the code pushes two byte arrays and calls the same function on each. This subroutine XOR‑decrypts the arrays with the key 0xFF and stops when the decrypted byte equals 0, effectively null‑terminating the decoded string.

The first decoded string is a URL pointing to an HTML resource, and the second is the output filename, which has an .exe extension. The file at “http://www.practicalmalwareanalysis.com/tt.html” is downloaded and saved locally as “spoolsrv.exe”, and WinExec is then called on it, executing the final payload.
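The decoding routine described above can be sketched as follows. The encoded byte arrays are not reproduced in this write-up, so the example rebuilds them with a hypothetical `xor_encode` helper and then decodes them again; `xor_decode` is my own name for the subroutine's logic.

```python
def xor_decode(blob: bytes, key: int = 0xFF) -> str:
    """XOR each byte with the key and stop at the first decoded 0."""
    out = bytearray()
    for b in blob:
        d = b ^ key
        if d == 0:       # decoded null terminator ends the string
            break
        out.append(d)
    return out.decode("ascii")


def xor_encode(text: str, key: int = 0xFF) -> bytes:
    """Hypothetical helper: build a test blob, including the encoded terminator."""
    return bytes(c ^ key for c in text.encode("ascii")) + bytes([0 ^ key])


url_blob = xor_encode("http://www.practicalmalwareanalysis.com/tt.html")
name_blob = xor_encode("spoolsrv.exe")
print(xor_decode(url_blob))
print(xor_decode(name_blob))
```

With a key of 0xFF the terminator is stored as 0xFF, so the encoded data contains no zero byte, and every printable ASCII character becomes a non-printable byte that a strings scan will not pick up.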