lab6

Lab 6-1

What is the major code construct found in the only subroutine called by main?

sub_401000 is the only subroutine called by main. It is essentially an Internet connection checker: it calls InternetGetConnectedState and tests the return value in an if/else code construct. A nonzero (TRUE) return value means there is an active Internet connection; 0 (FALSE) means the host is offline. The source code probably looks like this:

int conn_check()
{
  BOOL ConnectedState;
  /* TRUE (nonzero) when the system has an active connection */
  ConnectedState = InternetGetConnectedState(0, 0);
  if (ConnectedState)
  {
    sub_40105F("Success: Internet Connection");
    return 1;
  }
  else
  {
    sub_40105F("Error 1.1: No Internet");
    return 0;
  }
}

What is the subroutine located at 0x40105F?

When examining the disassembly of sub_401282 and its wrapper sub_40105F, several characteristics immediately identify the routine as an internal implementation of a printf-family function. The most prominent sign is the presence of a format-string parsing loop: the code repeatedly loads a byte from a format string, increments the pointer, and branches on special characters. One of the input parameters is a pointer to a FILE struct. Inspecting it in memory, we find its fields hard-coded:

<0, 0, 0, 2, 1, 0, 0, 0>
 ^  ^  ^  ^  ^  ^  ^  ^
 |  |  |  |  |  |  |  └─ _tmpfname = NULL
 |  |  |  |  |  |  └──── _bufsiz   = 0
 |  |  |  |  |  └─────── _charbuf  = 0
 |  |  |  |  └────────── _file = 1   <- FILE HANDLE (1 = stdout)
 |  |  |  └───────────── _flag = 2   <- _IOWRT (write mode)
 |  |  └──────────────── _base = NULL
 |  └─────────────────── _cnt = 0
 └────────────────────── _ptr = NULL
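As a rough aid for reading the dump above, the field order can be mirrored in a small C struct. This is only a sketch: the names legacy_iobuf, LEGACY_IOWRT, observed, and is_stdout_writer are hypothetical, and the layout is assumed to follow the field order shown in the diagram.

```c
#include <stddef.h>

/* Hypothetical mirror of the FILE layout shown in the diagram above. */
struct legacy_iobuf {
    char *ptr;       /* _ptr      */
    int   cnt;       /* _cnt      */
    char *base;      /* _base     */
    int   flag;      /* _flag     */
    int   file;      /* _file     */
    int   charbuf;   /* _charbuf  */
    int   bufsiz;    /* _bufsiz   */
    char *tmpfname;  /* _tmpfname */
};

#define LEGACY_IOWRT 0x0002  /* write-mode flag, as annotated in the dump */

/* The eight values observed in memory: <0, 0, 0, 2, 1, 0, 0, 0>. */
static const struct legacy_iobuf observed = {
    NULL, 0, NULL, LEGACY_IOWRT, 1, 0, 0, NULL
};

/* True when the stream is open for writing on file handle 1 (stdout). */
static int is_stdout_writer(const struct legacy_iobuf *f)
{
    return (f->flag & LEGACY_IOWRT) != 0 && f->file == 1;
}
```

Read this way, the hard-coded struct describes a write-only stream on handle 1, which is consistent with the wrapper printing to stdout.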

The second parameter is a pointer to a format string. At the very start of sub_401282, the code loads the first character of this string into bl and immediately tests it for null:

mov     esi, [ebp+arg_4]    ; esi = format string
mov     bl, [esi]           ; load first char
inc     esi                 ; advance pointer
test    bl, bl              ; check if null
jz      loc_4019F8          ; if null, jump to function exit
mov     [ebp+arg_4], esi    ; save updated pointer

This is a classic early-exit optimization: if the format string is empty, the function can return immediately without performing any further processing. It avoids unnecessary work and prevents the function from entering the main parsing and formatting logic.

Inside sub_401282, each format character is range-checked and classified through a lookup table before being dispatched:

cmp     bl, 20h        ; check for space
jl      loc_4012E5
cmp     bl, 78h        ; 'x'
jg      loc_4012E5
movsx   eax, bl
mov     al, byte ptr ds:GetStringTypeW[eax]
and     eax, 0Fh

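This is a clear sign of format-string dispatching. Characters outside the range 0x20 (space) to 0x78 ('x') are handled separately; everything else is used as an index into a byte table, and the low four bits of the table entry select the character class. The operand labeled GetStringTypeW is an indexed byte read, not an API call; the label is likely just IDA naming the table base after the nearest known symbol. Conversion characters such as the d, s, and x in %d, %s, and %x each end up in their own case. Subsequent blocks handle width, precision, flags, length modifiers, and buffer allocation.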
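The classification step can be sketched in C as a table lookup guarded by the same range check. This is a minimal illustration, not the CRT's actual table: the class names, the table contents, and the classify function are all assumptions made for the example.

```c
/* Hypothetical character classes; the real table's values are not known here. */
enum char_class { CH_OTHER, CH_PERCENT, CH_FLAG, CH_DIGIT, CH_DOT, CH_SIZE, CH_TYPE };

/* One entry per character from ' ' (0x20) to 'x' (0x78); unlisted entries are CH_OTHER. */
static const unsigned char class_table['x' - ' ' + 1] = {
    ['%' - ' '] = CH_PERCENT,
    ['-' - ' '] = CH_FLAG, ['+' - ' '] = CH_FLAG, ['#' - ' '] = CH_FLAG,
    ['.' - ' '] = CH_DOT,
    ['0' - ' '] = CH_DIGIT, ['1' - ' '] = CH_DIGIT, ['2' - ' '] = CH_DIGIT,
    ['3' - ' '] = CH_DIGIT, ['4' - ' '] = CH_DIGIT, ['5' - ' '] = CH_DIGIT,
    ['6' - ' '] = CH_DIGIT, ['7' - ' '] = CH_DIGIT, ['8' - ' '] = CH_DIGIT,
    ['9' - ' '] = CH_DIGIT,
    ['l' - ' '] = CH_SIZE, ['h' - ' '] = CH_SIZE,
    ['c' - ' '] = CH_TYPE, ['d' - ' '] = CH_TYPE, ['s' - ' '] = CH_TYPE,
    ['u' - ' '] = CH_TYPE, ['x' - ' '] = CH_TYPE,
};

static int classify(char c)
{
    /* Mirrors cmp bl, 20h / cmp bl, 78h: out-of-range characters skip the table. */
    if (c < ' ' || c > 'x')
        return CH_OTHER;
    /* Mirrors the indexed byte load followed by and eax, 0Fh. */
    return class_table[c - ' '] & 0x0F;
}
```

A formatter built this way would switch on the returned class to decide whether it is reading flags, a width, a precision, or the final conversion character.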

Additionally, temporary buffers are allocated using constants like 0x200 and 0x800, which match known CRT patterns for intermediate storage of converted numeric or wide-character data.

What is the purpose of this program?

The program simply checks whether the host has an active Internet connection and prints the result to the console.

Lab 6-2

What operation does the first subroutine called by main perform?

It performs the same Internet connection check as the subroutine in Lab 6-1.

What is the subroutine located at 0x40117F?

To identify the purpose of the subroutine at address 0x40117F, we begin by examining its cross-references. In every case where sub_40117F is called, a string is pushed onto the stack immediately before the call. Many of these strings end with a newline (\n), and one contains the format specifier %c, which strongly suggests that this function is used for formatted output.

This observation is reinforced by the usage in main. In the snippet below, a value previously parsed (stored in ecx) is pushed onto the stack, followed by a format string:

movsx   ecx, [ebp+var_8]
push    ecx
push    offset aSuccessParsedC ; "Success: Parsed command is %c\n"
call    sub_40117F

This calling pattern matches that of the standard C library function printf under the cdecl convention, where arguments are pushed right to left: first the value to be formatted, then the format string.
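In C terms, the call site above most likely corresponds to a single printf call with the parsed byte as its only variadic argument. The sketch below formats into a buffer with snprintf so the result can be inspected; format_parsed_command is a hypothetical helper name, not something found in the binary.

```c
#include <stdio.h>

/* Hypothetical helper: same format string as the call site, written to a buffer. */
static int format_parsed_command(char *out, size_t out_size, char command)
{
    /* The char is promoted to int when passed, matching movsx ecx, [ebp+var_8]. */
    return snprintf(out, out_size, "Success: Parsed command is %c\n", command);
}
```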

What does the second subroutine called by main do? What type of code construct is used in this subroutine?

The second subroutine called by main (located at 0x401040) is responsible for retrieving and parsing a command from a remote web resource. It begins by initializing a WinINet session with the Windows API function InternetOpenA, specifying the user-agent string “Internet Explorer 7.5/pma”. This makes the network activity appear similar to that of a legitimate web browser.

After successfully opening an Internet session, the function attempts to load a remote file hosted at: http://www.practicalmalwareanalysis.com/cc.htm

This is accomplished using the InternetOpenUrlA API. If the URL cannot be opened, the function prints an error message and terminates early.

Once the URL is successfully opened, the function reads up to 512 (0x200) bytes from the remote resource using InternetReadFile. The data is stored in a local stack-based buffer named Buffer. If the read operation fails, an appropriate error message is printed, the Internet handles are closed, and the function returns failure.

After the data is read into memory, the function inspects the first four bytes of the buffer. These bytes are checked sequentially to determine whether they match the ASCII sequence: <!--

This sequence represents the beginning of an HTML comment. The checks are implemented as a series of nested conditional comparisons, effectively equivalent to the following C-style logic:

if (Buffer[0] == '<') {
    if (Buffer[1] == '!') {
        if (Buffer[2] == '-') {
            if (Buffer[3] == '-') {
                return Buffer[4];   /* command byte */
            }
        }
    }
}
/* any failed comparison ends up on the shared error path */
printf("Error 2.3: Fail to get command\n");
return 0;

If all four comparisons succeed, the function extracts and returns the fifth byte of the buffer (Buffer[4]). This byte acts as a command character, presumably interpreted elsewhere in the program.

If any of the comparisons fail, meaning the downloaded content does not begin with an HTML comment, the function prints the error message “Error 2.3: Fail to get command” and returns 0.
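For experimenting with sample responses outside the binary, the same check can be written as a small standalone function. This is a sketch under assumptions: parse_command is a hypothetical name, and the explicit length check is an addition for safety that the description above does not attribute to the malware.

```c
#include <stddef.h>
#include <string.h>

/* Returns the byte that follows "<!--", or 0 when the prefix is missing. */
static char parse_command(const char *buffer, size_t length)
{
    static const char prefix[] = "<!--";
    const size_t prefix_len = sizeof prefix - 1;

    /* Assumption: require the prefix plus at least one command byte. */
    if (length <= prefix_len)
        return 0;
    if (memcmp(buffer, prefix, prefix_len) != 0)
        return 0;
    return buffer[prefix_len];
}
```

A page starting with "<!--a" would yield the command 'a', while ordinary HTML such as "<html>" would fall through to the failure value.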

Are there any network-based indicators for this program?

This program exhibits clear network-based indicators that can be monitored. Specifically, it performs outbound HTTP requests using the user-agent string “Internet Explorer 7.5/pma” and connects to the URL: http://www.practicalmalwareanalysis.com/cc.htm

Monitoring network traffic for this uncommon user-agent string or repeated access to this domain would be effective for detection.

What is the purpose of this malware?

The purpose of this malware is to retrieve a remote command from a web server. It downloads a web page, checks whether the content begins with an HTML comment, and extracts a single-byte command embedded within that comment. If the expected format is not found, the program reports an error and exits cleanly.

When successful, the extracted command character is printed using a formatted output string. The program then sleeps for one minute before terminating. This behavior demonstrates a simple command-and-control (C2) mechanism, where commands are discreetly hidden in seemingly benign web content, making the network traffic less obvious and harder to detect by basic firewall rules.

Lab 6-3

Compare the calls in main to Lab 6-2’s main method. What is the new function called from main?

Compared to Lab 6-2, this version of main introduces a new function call to sub_401130 after successfully retrieving and parsing the command from cc.htm. In Lab 6-2, the command was only printed; in this lab, it is actively processed. This new function is responsible for executing different actions based on the parsed command value.

What parameters does this new function take?

The function sub_401130 takes two parameters:

A single character command (parsed from the HTML comment)
A file path argument (lpExistingFileName), which is provided to the malware via the command-line arguments

What major code construct does this function contain?

The function contains a switch-case construct, implemented via a jump table. The command character is normalized by subtracting 0x61 (‘a’) and then used as an index into the jump table. This allows the malware to efficiently dispatch execution to one of several distinct behaviors.

What can this function do?

Based on the command received the function can perform the following actions:

Command ‘a’: Create the directory C:\Temp
Command ‘b’: Copy an existing file (provided as an argument) to C:\Temp\cc.exe
Command ‘c’: Delete C:\Temp\cc.exe
Command ‘d’: Create a registry value to ensure persistence by running C:\Temp\cc.exe at startup
Command ‘e’: Sleep for 100 seconds

If an invalid command is received, the function prints an error message and exits gracefully.
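The dispatch described above (subtract 'a', bounds-check, index a table) can be sketched as follows. The function describe_command and the description strings are hypothetical; the sketch only reports which action would be selected and performs none of them.

```c
#include <stddef.h>

/* Returns a description of the selected action, or NULL for an invalid command. */
static const char *describe_command(char command)
{
    static const char *const actions[] = {
        "create directory C:\\Temp",                 /* 'a' */
        "copy file to C:\\Temp\\cc.exe",             /* 'b' */
        "delete C:\\Temp\\cc.exe",                   /* 'c' */
        "set Run key value for C:\\Temp\\cc.exe",    /* 'd' */
        "sleep for 100 seconds",                     /* 'e' */
    };

    /* Normalize so that 'a' maps to index 0; values below 'a' wrap to a large number. */
    unsigned index = (unsigned)((unsigned char)command - 'a');

    if (index > 4)
        return NULL;    /* default case: invalid command */
    return actions[index];
}
```

The single unsigned comparison covers both ends of the range, which is the usual shape of the bounds check that precedes a compiler-generated jump table.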

Are there any host-based indicators for this malware?

This malware exhibits several host-based indicators, including:

Creation of the directory C:\Temp
Creation, deletion, or modification of the file C:\Temp\cc.exe
Modification of the registry key:
HKLM\Software\Microsoft\Windows\CurrentVersion\Run with a value named “Malware”

What is the purpose of this malware?

The purpose of this malware is to act as a simple command-and-control backdoor. It retrieves a remotely hosted command hidden inside an HTML comment, parses that command, and conditionally executes filesystem or registry-based actions on the infected host.

Lab 6-4

What is the difference between the calls made from the main method in Labs 6-3 and 6-4?

In Lab 6-4, main introduces a loop that repeatedly calls the network retrieval function (sub_401040), whereas in Lab 6-3 the function was called only once. Specifically, Lab 6-4 uses a loop controlled by var_C that runs 1,440 times, passing the loop counter as an argument to sub_401040 on each iteration.

Additionally, unlike Labs 6-2 and 6-3 where a static User-Agent string was used, Lab 6-4 dynamically modifies the User-Agent for each request using the format string:

Internet Explorer 7.50/pma%d

where %d corresponds to the current loop iteration. The rotating User-Agent strings (pma0, pma1, pma2, ...) are not just for variety: they make each request unique, which defeats simple signature-based blocking that matches one exact string.
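A minimal sketch of how such a string could be built from the loop counter is shown here. build_user_agent is a hypothetical name, and snprintf is used as a bounds-checked stand-in; which formatting routine the malware itself calls is not stated above.

```c
#include <stdio.h>

/* Hypothetical helper: formats the per-iteration User-Agent into out. */
static int build_user_agent(char *out, size_t out_size, int iteration)
{
    return snprintf(out, out_size, "Internet Explorer 7.50/pma%d", iteration);
}
```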

What new code construct has been added to main?

A loop construct has been added to main. This loop controls repeated execution of the malware’s command-fetching, command-parsing, and command-execution logic. The loop terminates either after 1,440 iterations or earlier if an error occurs.

What is the difference between this lab’s parse HTML function and those of the previous labs?

The HTML parsing logic itself remains largely the same: it still checks for an HTML comment beginning with <!-- and extracts a single command byte from the response. However, in this lab the parsing function is now invoked repeatedly and operates in conjunction with a dynamic User-Agent string, making each request appear slightly different. This change increases stealth and reduces the likelihood of detection by simple signature-based network defenses.

How long will this program run? (Assume that it is connected to the Internet.)

The program runs for approximately 24 hours.

The loop executes 1,440 times
Each iteration includes a Sleep(0xEA60) call; 0xEA60 is 60,000 milliseconds, or 60 seconds
1,440 iterations x 60 seconds = 86,400 seconds = 24 hours
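The runtime estimate can be checked with a small simulation of the loop. This is a sketch only: simulate_run, step_fn, and always_ok are hypothetical names, the step callback stands in for the fetch, parse, and execute logic, and the sleep is accumulated instead of actually performed.

```c
#include <stdint.h>

#define ITERATIONS 1440u     /* loop bound from main */
#define SLEEP_MS   0xEA60u   /* argument passed to Sleep: 60,000 ms */

/* Hypothetical stand-in for one fetch/parse/execute step; returns 0 on error. */
typedef int (*step_fn)(unsigned iteration);

/* Returns the total simulated sleep time in ms and reports completed iterations. */
static uint64_t simulate_run(step_fn step, unsigned *completed)
{
    uint64_t total_ms = 0;
    unsigned i;

    for (i = 0; i < ITERATIONS; i++) {
        if (!step(i))
            break;              /* an error ends the loop early */
        total_ms += SLEEP_MS;   /* stands in for Sleep(0xEA60) */
    }
    *completed = i;
    return total_ms;
}

/* Models the assumption that the host stays connected for the whole run. */
static int always_ok(unsigned iteration)
{
    (void)iteration;
    return 1;
}
```

With every step succeeding, the accumulated sleep comes to 86,400,000 ms, which is the 24-hour figure; time spent on the network requests themselves would add slightly to that.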

Are there any new network-based indicators for this malware?

Yes. New network-based indicators include:

Repeated HTTP requests to: http://www.practicalmalwareanalysis.com/cc.htm
A pattern of rotating User-Agent strings:

Internet Explorer 7.50/pma0
Internet Explorer 7.50/pma1
...
Internet Explorer 7.50/pma1439
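Because the counter changes on every request, a detection rule has to match the fixed prefix plus a numeric suffix instead of one exact string. The following is a minimal sketch of such a check; matches_rotating_user_agent is a hypothetical helper, and limiting the suffix to 0 through 1439 is an assumption based on the loop bound.

```c
#include <ctype.h>
#include <string.h>

/* Returns 1 if ua is "Internet Explorer 7.50/pma" followed by a number 0..1439. */
static int matches_rotating_user_agent(const char *ua)
{
    static const char prefix[] = "Internet Explorer 7.50/pma";
    const size_t prefix_len = sizeof prefix - 1;
    const char *digits;
    long value = 0;

    if (strncmp(ua, prefix, prefix_len) != 0)
        return 0;

    digits = ua + prefix_len;
    if (!isdigit((unsigned char)*digits))
        return 0;               /* at least one digit is required */

    for (; isdigit((unsigned char)*digits); digits++) {
        value = value * 10 + (*digits - '0');
        if (value > 1439)
            return 0;           /* beyond the last loop iteration */
    }
    return *digits == '\0';     /* nothing may follow the number */
}
```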

What is the purpose of this malware?

The purpose of this malware is to function as a persistent command-and-control (C2) agent. It periodically contacts a remote server over the course of 24 hours, retrieves a hidden command embedded within an HTML comment, and executes that command locally.