Last reversing CTF was so much fun, I started directly with the next challenge. For me, it was a very hard one and it took me several weeks (although I recognized, after reading other peoples solutions my approach was far too complex ;). Nevertheless, it is still interesting and I want to describe the details in this blog article.
The task in general: You get the url of a service, the corresponding binary "3x17" for download and your mission is to spawn a shell.
belial@anubis:~/Dropbox/pwnable.tw/4$ objdump -T 3x17 3x17: file format elf64-x86-64 objdump: 3x17: not a dynamic object DYNAMIC SYMBOL TABLE: no symbols
It is statically linked and does not import any external functionality. This leads to a size of 700kb and lots of potential ROP gadgets :).
Next stop: Its security features.
$ checksec 3x17 [*] '/home/belial/3x17' Arch: amd64-64-little RELRO: Partial RELRO Stack: No canary found NX: NX enabled PIE: No PIE (0x400000)
Checksec claims there is no canary. Stack isn't executable and protected by ASLR.
Best guess: We have to find the stack address and overwrite it with a set of ROP gadgets to spawn a shell.
When you run the binary, you get something like this:
belial@anubis:~/Dropbox/pwnable.tw/4$ ./3x17 addr:123 data:456 belial@anubis:~/Dropbox/pwnable.tw/4$ 456
It asks for an address and for data. Presumably, we can use it to alter memory locations and dont have to look for a write exploit ourselves. Unfortunatly something is wrong, the data input is rejected and redirected to our shell. Therefore, we start Ghidra to check the binaries parsing functionality and to learn how to handle it correctly.
Both strings "addr:" and "data:" are hardcoded into the binaries data segment and have only one reference. This leads us to the following function:
Obvisiously, it contains a stack protection which terminates the program upon return address manipulation. So, our initial "checksec" command was wrong. Furthermore, the function has a counter and the main body is only executed if it equals to one. Luckily, it is based on a byte which would overflow quite fast ;). Besides this:
Unfortunatly, buffer size, read() and write() length are hardcoded. We can not exploit much here. Therefore, we take a look at FUN_0040ee70(). Maybe it is possible to leak a stack address somehow.
Besides the usual function prolog, several registers are initialized with static values which are not altered but still used for branching. A good example is rdx:
mov edx, 0Ah ;... cmp edx, 1 jz loc_40FEE8 cmp edx, 24h ja loc_40FEE8
This code would jump to loc_40FEE8 if rdx is not between 1h and 24h. Because of the registers static nature, this will never happen but makes reading more difficult. Therefore, we use IDA and its debugger to see directly which branch is taken.
The function starts with the following lookup table access:
test byte ptr [rcx+rax*2+1], 20h jz short loc_40FD92 loc_40FD80: add rbx, 1 movsx rax, byte ptr [rbx] mov rsi, rax test byte ptr [rcx+rax*2+1], 20h jnz short loc_40FD80 loc_40FD92: ;...
Rcx contains a pointer to a lookup table. Rax is the first character and Rbx a pointer to the input string. Taking a closer look at the lookup table leads to the following conclusion: This code skips the first bytes if they are equal to carriage return, new line, etc. Not very difficult to understand but hard to read when you take a look at the disassembly for the first time.
Afterwards, the algorithm checks whether the input string starts with a "+" or a "-". It seems, unary operators are supported for the specified memory address. Next, garbage code performs branches on static variables. Finally, the following opcodes are executed:
loc_40FE49: lea edi, [rsi-30h] cmp dil, 9 jbe short loc_40FE6E loc_40FE6E: movzx esi, dil cmp esi, edx jge short loc_40FE98 ;...
Rsi contains the current input character, Rdx equals to 0ah. It checks whether it is a digit between 30h and 39h. If it is not a digit, lookup tables are used to check whether it is a letter, upper case is transformed into lower case, etc. but in the end the function terminates. If it is a digit, the following code is executed:
loc_40FE29: imul rax, r15 movzx edi, dil mov rsi, r8 add rax, rdi
Rax contains the function return value and is initialized with 0x00. Dil is the current character-30h. R15 holds 0ah. The algorithm multiplicates the previously parsed value with 10 and adds the last character. This repeats until the input string ends with a 0x00.
The current situation: We can enter a decimal value. It is translated into an address and 24 bytes are written to it. The parsing algorithm seems secure. It rejects everything which is not a digit (although in a complicated way with garbage code, lookup tables, allows unary operators, etc.). So we face three problems here:
At this point, I was rather stuck. I debugged the whole binary to find other exploitable parts but without success. The only interesting aspect: Before termination, the application loaded function pointers from the data segment and called them. A friend gave me an important hint: Check the binaries section and read about "fini_array".
readelf -S 3x17 There are 29 section headers, starting at offset 0xb94e8: Section Headers: [Nr] Name Type Address Offset Size EntSize Flags Link Info Align ...  .fini_array FINI_ARRAY 00000000004b40f0 000b30f0 0000000000000010 0000000000000008 WA 0 0 8  .data.rel.ro PROGBITS 00000000004b4100 000b3100 0000000000002df4 0000000000000000 WA 0 0 32
This special section contains a list of function pointers which are called upon termination to clean things up. As you can see, the section is writable and includes two pointers with a total size of 16 bytes. We overwrite them the following way:
At this point the whole application runs in an infinite loop and gives us an important advantage: We can write a multiple of 24 bytes. Next open tasks:
A ROP chain is trivial. The tool ROPGadget finds a set of opcodes and their addresses in memory which can be used to invoke exec() for "/bin/sh". Behind .fini_array is a huge chunk of .data_rel.ro. It is writable as well and we use it to store our ROP chain addresses. Finally, we have to change Esp to execute the gadgets. Therefore, let's take a look at the code which executes the .fini_array function pointers:
sub_402960 proc near ;... lea rbp, off_4B40F0 mov rbx,1 ;... loc_402988: call qword ptr [rbp+rbx*8+0] sub rbx, 1 cmp rbx, 0FFFFFFFFFFFFFFFFh jnz short loc_402988 ;... ret
Obvisiously, Rbp contains the base address of .fini_array. Rbx is used as an index for the function pointers. It is initialized with 1 and decremented by each iteration. Luckily, 3x17 contains the following ROP gadget:
0x00000000004715ec : mov bl, 0x48 ; mov eax, edx ; ret
If we write 0x4715ec into the fini_array, Ebx will contain 0x48. The operation enlarges the fini_array to a total size of 71 pointers which start at address 0x4b4328. This gives us plenty of space for ROP gadgets which can be executed for the final task: Set a new Esp. To do so, the following two gadgets are neccessary:
0x000000000048a578 : mov rax, qword ptr [rsi + 8] ; ret 0x000000000044f62b : xchg eax, esp ; ret
Basically, these opcodes copy [rsi+8] into Rax and assign Eax to Esp. Luckily, when the fini_array loop starts, esi points to 0x004b40f8 which is the end of the fini_array section. Therefore, rsi+8 corresponds to the first bytes of the following .data.rel.ro section. As we previously mentioned, this section is writable and already contains the ROP chain which starts the shell. We write the address of the ROP chain to the beginning of the section, set the new Esp and a shell pops up :).
This is the corresponding Python exploit: