“Luke, I am your fuzzer”. Automating vulnerability management

Fuzzing is all the rage. It is widely used today by programmers testing their products, cybersecurity researchers, and, of course, hackers. Using fuzzers effectively requires a good understanding of how they work. These tools make it possible to identify previously unknown vulnerabilities in various applications. In this article, I will cover the different fuzzing types and show how to use one fuzzer, WinAFL.

Fuzzing feeds nonstandard data to a computer program (an executable, a dynamic library, or a driver) in an attempt to cause a failure. An attacker could use the same technique to deliver a malicious payload; this is a common way to discover and exploit vulnerabilities.

Fuzzing can identify a broad range of errors in the tested software, including buffer overflows, incorrect processing of user data, resource leaks (e.g. memory leaks), and synchronization failures that result in race conditions. The fuzzer records all such events for subsequent analysis.

Techniques

There are two main fuzzing techniques: mutation-based testing and generation-based testing. A mutation-based fuzzer generates input from preset data and templates that make up its initial seed. By altering the input data byte by byte and observing the results, the fuzzer can judge how effective these ‘mutations’ are and generate even more effective data sequences in the next round.

As you can see, the basic concept is simple. However, because the number of cycles may reach hundreds or even thousands of millions (the testing takes a few days even on powerful computers), fuzzers are able to identify most nontrivial errors in the tested software.
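The mutation step described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the implementation of any particular fuzzer: the helper names flip_bit and mutate_walking_bit are hypothetical, and the feedback step (deciding which mutants are worth keeping) is left out.

```c
#include <stddef.h>
#include <string.h>

/* Flip a single bit of the buffer; bit_index counts from the most
   significant bit of byte 0. Hypothetical helper for illustration. */
static void flip_bit(unsigned char *buf, size_t bit_index) {
    buf[bit_index >> 3] ^= (unsigned char)(128u >> (bit_index & 7));
}

/* Produce the n-th "walking bit" mutant of a seed: copy the seed into
   out and flip exactly one bit. Returns 0 on success, -1 if n is out
   of range for the seed length. */
static int mutate_walking_bit(const unsigned char *seed, size_t len,
                              size_t n, unsigned char *out) {
    if (n >= len * 8) {
        return -1;
    }
    memcpy(out, seed, len);
    flip_bit(out, n);
    return 0;
}
```

Calling mutate_walking_bit for every n from 0 to len * 8 - 1 yields every single-bit variation of the seed, which is the simplest deterministic stage a mutation-based fuzzer can run before moving on to random changes.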

Generation-based testing is a more advanced technique: the fuzzer generates input data from grammars based on the program specifications, such as file formats and the packet layouts of network protocols. In this case, the generated inputs must be consistent with certain preset rules. Generation-based testing is more complicated to implement than mutation-based testing, but the odds of success are much higher as well.
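To make the idea of rule-driven generation more concrete, here is a minimal sketch in C. The record format (a "FUZZ" magic, a one-byte declared length, then the payload) is invented purely for this example, and the function name generate_record is hypothetical; a real generator would be driven by the actual specification of the tested format.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical toy format used only for this sketch:
   4-byte magic "FUZZ", 1-byte declared payload length, payload bytes. */
#define MAGIC "FUZZ"
#define HEADER_SIZE 5

/* Build one test case that obeys the format rules but lets the caller
   choose the declared length independently of the real payload size,
   which is how a generator can probe length handling.
   Returns the number of bytes written, or 0 if out is too small. */
static size_t generate_record(unsigned char *out, size_t out_size,
                              unsigned char declared_len,
                              const unsigned char *payload,
                              size_t payload_len) {
    if (out_size < HEADER_SIZE || payload_len > out_size - HEADER_SIZE) {
        return 0;
    }
    memcpy(out, MAGIC, 4);
    out[4] = declared_len;
    if (payload_len > 0) {
        memcpy(out + HEADER_SIZE, payload, payload_len);
    }
    return HEADER_SIZE + payload_len;
}
```

Because the magic value is always valid, such inputs pass the parser's first checks and reach the deeper code, while the mismatch between the declared and the real length exercises exactly the field the generator is targeting.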

Of course, there are even more advanced techniques. Take, for instance, fuzzing that involves tracing and producing equations for SMT solvers. In theory, this makes it possible to cover even hard-to-reach code branches. To achieve this, a trace is created within the OS kernel, while known code sections are excluded (there is no point in fuzzing WinAPI functions, etc.). However, it is very difficult to set such things up correctly; as a result, this technique is still more ‘dark magic’ than a commonly used method.

There is also a simple testing technique that feeds random data as input, but we are not going to address it in detail due to its very low efficiency. This technique resembles brute force because the history and success of previous attempts are not recorded.

INFO

The Monkey program created back in 1983 is considered one of the first fuzzer prototypes. Its name alludes to the infinite monkey theorem. Despite its uselessness in real life, this theorem is popular in mass culture: it is mentioned in The Hitchhiker’s Guide to the Galaxy and The Simpsons. Furthermore, the theorem even has its own RFC, RFC 2795.

Fuzzer types

Now that you know the basic fuzzing techniques, let’s proceed to the fuzzer types.

File formats

In this article, any file of any format to be processed by the tested application is treated as input data. With that in mind, one can feed an app a file in the wrong format and see how the program deals with it. The first thing that comes to mind is an antivirus. The antivirus scanner must determine the file format and deal with it accordingly: attempt to unpack it, launch heuristic analysis, etc.

What happens if an antivirus scanner decides that this is a portable executable (PE) file packed with the UPX executable packer, while in the course of unpacking, it turns out that it was not UPX but something pretending to be UPX (and possibly carrying a malicious payload)? Of course, the unpacking algorithm will be different, but it is impossible to predict the scanner’s behavior in advance. For instance, it may crash or tag the file as ‘damaged’ and skip it; there are simply too many possible scenarios. File format fuzzers are used to test such situations.

Command line arguments and environment variables

Many utilities take command line parameters: a file path, an option, etc. So, what happens if we feed the program data it does not expect? For instance, if a program requests a path, it probably does not expect the path to consist of a thousand characters or include illegal characters. Such an argument may overflow a buffer on the stack and crash the app.

The same applies to environment variables. Such fuzzers work like command line fuzzers, except that the inputs are supplied through one or several environment variables rather than arguments. The subsequent scenario is roughly the same as with an overly long or illegal command line argument: an overflow followed by a crash.
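The thousand-character path mentioned above is easy to reproduce. The sketch below, written under stated assumptions, shows both sides: a helper that builds an oversized argument of the kind a fuzzer would try, and a handler that rejects it instead of copying it blindly into a fixed-size buffer. The names make_long_argument and store_path and the 260-byte buffer size are hypothetical choices made for illustration.

```c
#include <stddef.h>
#include <string.h>

#define PATH_BUF_SIZE 260  /* assumed buffer size, chosen for illustration */

/* Fill out with len copies of 'A' plus a terminator: the kind of
   oversized argument a command line fuzzer would try.
   out must have room for len + 1 bytes. */
static void make_long_argument(char *out, size_t len) {
    memset(out, 'A', len);
    out[len] = '\0';
}

/* A handler that survives such input: it checks the length before
   copying instead of calling strcpy() on a fixed-size stack buffer.
   Returns 0 on success, -1 if the argument does not fit. */
static int store_path(char dst[PATH_BUF_SIZE], const char *arg) {
    size_t len = strlen(arg);
    if (len >= PATH_BUF_SIZE) {
        return -1;
    }
    memcpy(dst, arg, len + 1);
    return 0;
}
```

A program that replaces the length check with a plain strcpy() into a local array is exactly the kind of target on which a command line or environment variable fuzzer produces a crash within the first few attempts.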

IOCTL requests

IOCTL fuzzers are used to find out how various kernel drivers react to IOCTL requests. In addition to devices and peripherals, some programs use such drivers to interact with the system. Of course, the structure of the IRP request is unknown in most cases, but intercepted packets can be used as the basis for a seed template.

Network protocols

Such fuzzers are usually designed for specific protocols, although universal ones exist as well. For instance, the OWASP JBroFuzz fuzzer tests implementations of known protocols for vulnerabilities such as cross-site scripting, buffer overflows, and SQL injection. On the other hand, the SPIKE utility can test unknown protocols for many vulnerabilities.

Browser engines

Yes, there are special fuzzers that look for vulnerabilities in browsers. Modern browsers are very sophisticated and include plenty of engines to process various document versions, protocols, CSS, COM, DOM, etc. Accordingly, such fuzzers are popular with bug bounty hunters.

RAM

This category includes highly specialized fuzzers that modify a program’s data in RAM. They are useful for testing dynamic antidumping programs and utilities with built-in protection.

Coverage problem

Of course, sometimes fuzzers encounter problems. Due to the complexity of some programs, fuzzers may not be able to reach certain sections of their code. This may be due to the nesting depth or other specific implementation features. Fuzzer developers use various tricks to deal with the insufficient code coverage.

For instance, they employ feedback: the fuzzer receives information on the program behavior thanks to signaling instructions inserted into the executable file. Inserting them is called “instrumentation”; it helps the fuzzer adjust its inputs for the next round in order to improve the coverage.
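What such signaling instructions do can be illustrated with the scheme described in the AFL documentation: every instrumented block has an ID, and a shared map counts transitions between pairs of blocks. The C sketch below follows that description but is not AFL's actual code; the function names are hypothetical, and block IDs are passed in by hand instead of being assigned at instrumentation time.

```c
#include <stddef.h>
#include <string.h>

#define MAP_SIZE 65536  /* 64 KB map, as described in the AFL documentation */

static unsigned char coverage_map[MAP_SIZE];
static unsigned int prev_location;

/* Called at the start of every instrumented basic block.
   block_id stands in for the random ID assigned at instrumentation time. */
static void trace_block(unsigned int block_id) {
    coverage_map[(block_id ^ prev_location) % MAP_SIZE]++;
    /* The shift makes the transition A -> B differ from B -> A. */
    prev_location = block_id >> 1;
}

/* Reset the state before a new run of the target. */
static void reset_coverage(void) {
    memset(coverage_map, 0, sizeof coverage_map);
    prev_location = 0;
}

/* Count how many map entries were hit; a new nonzero entry means the
   input reached a transition the fuzzer has not seen before. */
static size_t count_hit_edges(void) {
    size_t hits = 0;
    for (size_t i = 0; i < MAP_SIZE; i++) {
        if (coverage_map[i] != 0) {
            hits++;
        }
    }
    return hits;
}
```

After each run, the fuzzer compares the map with what it has seen so far: an input that lights up a new entry is kept as a seed for further mutation, while an input that only repeats known transitions is discarded.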

At this point, new problems arise. For example, the software may be available only as a binary without the source code; or the performance drops due to the dynamic instrumentation of the tested application; or the trace processing takes too much CPU time, or…

In addition, there is the problem of compatibility. Some fuzzers lack support for certain versions of Windows or *nix. The more sophisticated the fuzzer is and the better it monitors the application execution flow, the more tightly it is bound to version-specific OS features.

As you can see, a wide variety of fuzzers is available to the hacker these days, and a suitable one can be found for each particular task. Many fuzzers are designed for *nix-like operating systems, while others work on Windows.

Two fuzzers of different types, WinAFL and MiniFuzz, are examined below. Let’s check them out.

WinAFL

WinAFL is a fork of the popular AFL (American Fuzzy Lop) fuzzer, ported to Windows by Google. It instruments the tested program either statically (when the application source code is available) or dynamically (instrumentation ‘on the fly’). The DynamoRIO dynamic binary instrumentation framework assists WinAFL with the latter.

WinAFL

As soon as something goes wrong in the tested application (e.g. it crashes), the input that caused the failure is saved in a separate folder for subsequent analysis. The fuzzer has plenty of options; the most important ones are listed below.

  • -i [directory] specifies the input directory that stores the test cases. Note that the fuzzer ships with several simple test cases for various file types.
  • -o [directory] specifies the output directory for fuzzer findings.
  • -D [directory] instructs the fuzzer to use dynamic instrumentation based on DynamoRIO; the argument is the directory that contains the DynamoRIO binaries.
  • -Y enables the static instrumentation mode.

WinAFL handles dynamic instrumentation more or less satisfactorily; however, problems may arise when it comes to static instrumentation. For instance, instrument.exe, the tool that instruments files on Windows-based systems, does not support the latest versions of the Visual Studio SDK and does not yet work with programs built in Visual Studio 2019.

When everything is ready and the file has been instrumented, all you have to do is run afl-fuzz.exe -Y -i input -o output -- test.exe. The command starts fuzzing in the static instrumentation mode.

MiniFuzz

Another fuzzer to be examined today is MiniFuzz. It was developed by Microsoft and has a user-friendly graphical (!) user interface. Furthermore, MiniFuzz can be integrated with Visual Studio.

MiniFuzz

The developers recommend generating at least 100,000 files for each file format. Each input file is a separate fuzzing iteration. Fuzzing therefore requires a set of template files. For instance, if you want to test how the program processes *.zip archives, you have to place about 100 such reference files in the ‘templates’ folder. You may place more files there as well; the fuzzer will appreciate it! However, if you place fewer files, the fuzzing efficiency will drop significantly.

Then the fuzzer randomly selects a file from the reference set, alters it, and sends it to the tested application. If this causes a crash, the file is copied to the crashes folder for subsequent research aimed at identifying the direct cause of the failure.

All fuzzer settings are stored in the minifuzz.cfg file in XML format. Below is a list of the most interesting options (I omitted the obvious ones for the sake of brevity):

  • Command line args. In this field, you can add missing command line parameters if they are required during the fuzzing.
  • Allow process to run for specifies the maximum time in seconds to allow the process to run before killing it. Never set this value too low; otherwise, the fuzzer may not be able to complete its job.
  • Shutdown method specifies the application shutdown method. The following methods are supported: ExitProcess (graceful shutdown), WM_CLOSE (graceful shutdown for graphical, i.e. non-console, applications), and TerminateProcess (emergency shutdown that can leave some resources unfreed).
  • Aggressiveness determines how much the template files are altered before they are sent to the tested application. If the fuzzer has been working for a while with no results, you may want to increase the value of this parameter.
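The effect of an aggressiveness setting can be pictured with a short C sketch. How MiniFuzz actually maps this parameter to changes in the file is not described here, so the percentage scale and the function name mutate_template below are assumptions made purely for illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Copy a template into out and perform len * percent / 100 random
   single-byte overwrites. The percentage scale is an assumption made
   for this sketch, not MiniFuzz's documented behavior.
   Returns the number of overwrite operations performed. */
static size_t mutate_template(const unsigned char *tmpl, size_t len,
                              unsigned int percent, unsigned char *out) {
    size_t to_change;
    if (percent > 100) {
        percent = 100;
    }
    to_change = len * percent / 100;
    memcpy(out, tmpl, len);
    for (size_t i = 0; i < to_change; i++) {
        size_t pos = (size_t)rand() % len;
        out[pos] = (unsigned char)(rand() & 0xFF);
    }
    return to_change;
}
```

With a low value, most of the template survives and the file still passes the parser's format checks; with a high value, the file is damaged so heavily that it is likely to be rejected early, which is why the parameter is worth raising only gradually.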

Practice

To give you an idea of how WinAFL searches for errors, I will write a short test program containing a function that dereferences a null pointer. This is a common error that anyone can make. The code that is going to crash looks as follows:

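#include <stdio.h>

/* Deliberately dereferences a null pointer so that the fuzzer has a crash to find. */
int crash() {
  int *x = NULL;
  int y = *x;  /* the access violation happens here */

  printf("%d", y);  /* y is an int, so the format is %d, not %s */

  return 0;
}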

It is up to the developer how to call this function. I am going to pass a ‘magic’ command line argument: the function is called when the condition if (argc == 2 && !strcmp(argv[1], "key")) is satisfied. In addition, one can ‘wrap’ the tested function in a loop to speed up the fuzzing:

while (__afl_persistent_loop()) {
  ...
}

The loop control function is declared in the file winafl-master\afl-staticinstr.h, which has to be added to the project (this step also adds diagnostic messages).

File preparation for further instrumentation

As you can see, the file is not ready for fuzzing yet, and WinAFL reminds us to instrument it. To do so, enter the following command:

$ instrument.exe --mode=afl --input-image=tst.exe --output-image=instr_crash.exe

Instrumentation

Finally, add two options to the linker properties: /PROFILE (enables profiling support) and /SAFESEH (safe exception handling). Now everything is ready, and the fuzzer can be launched:

$ afl-fuzz.exe -Y -i in -o out -t 500+ -- -fuzz_iterations 10000 -- instr_crash.exe

This command tells the fuzzer that the file has been statically instrumented and specifies the in directory containing the test cases and the out directory that will store the findings. In addition, it sets the timeout for each run (in milliseconds) and the number of iterations of the target function to perform before the target process is restarted. The screenshot below shows how the fuzzer works:

Fuzzing with WinAFL

Keep in mind that in the real world, fuzzing can take weeks. Fortunately, in our case, the process is much faster. The application crashes, and the out directory now contains files with names structured like id:000003,src:000001,op:flip1,pos:1. They contain diagnostic information with explanations like the one below:

Program received signal SIGSEGV, Segmentation fault.
0x0802f36a in crash at tst.c:32
32 int y = *x;
#0 0x0802f36a in crash at tst.c:32
 
crash dump #:1

As you can see, the log is pretty detailed: it specifies both the function that crashed and the error type, SIGSEGV. This indicates that the fuzzer has generated the ‘magic’ parameter correctly and everything worked fine.
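The names of the files in the out directory are worth a closer look, because they record how each finding was produced. The following C sketch splits a name of the exact shape shown earlier into its fields. The struct and function names are hypothetical, and the field meanings in the comments follow the way AFL-style names are commonly read; names with additional fields would need a more flexible parser.

```c
#include <stdio.h>

/* Fields encoded in an AFL-style finding name such as
   "id:000003,src:000001,op:flip1,pos:1". */
struct finding_name {
    unsigned int id;   /* sequence number of the finding */
    unsigned int src;  /* queue entry the input was derived from */
    char op[16];       /* mutation operator, e.g. "flip1" */
    unsigned int pos;  /* offset at which the operator was applied */
};

/* Parse a name of the exact shape shown above.
   Returns 0 on success, -1 if the name does not match. */
static int parse_finding_name(const char *name, struct finding_name *out) {
    int fields = sscanf(name, "id:%u,src:%u,op:%15[^,],pos:%u",
                        &out->id, &out->src, out->op, &out->pos);
    return fields == 4 ? 0 : -1;
}
```

For the name above, the parser reports finding number 3, derived from queue entry 1 by the flip1 operator at position 1. In other words, a single flipped bit in an existing test case was enough to reach the crashing function.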

Conclusions

This article provides only a brief overview of fuzzers and how they work. Of course, it is not a comprehensive guide but rather a starting point for further research and self-study. Still, the basic information provided here is enough to start your own experiments and raise your level of expertise in this area.

