Dissecting Viber. How to analyze Android apps

Once, while job hunting, I received an interesting test assignment: analyze Viber for Android, find vulnerabilities in it, and exploit them. Using this episode as an example, I will show you an efficient approach that can be used to analyze real apps and obtain results in a short time. Joking aside: if you go through all the steps described below, you have a good chance of finding a 0-day vulnerability in Viber. 😉

info

Due to space constraints, some analytical approaches are mentioned only in passing. However, I have purposely described all the mistakes I made during the analysis. After all, only fools learn from their own mistakes; the wise learn from the mistakes of others.

Target

Errors in an Android app can be identified in different ways, and you can select various targets for your search. This article examines a low-level target, shared libraries, and therefore focuses primarily on memory corruption bugs. Java code is analyzed only where necessary to identify its relationships with the shared libraries.

APK analysis was performed using standard tools that can be taken from any public awesome list; so, I won’t specifically focus on them unless the context requires this.

The target APK was taken from one of the mirror sites. The APK source must be selected carefully since such sites often contain:

outdated versions;
versions for unsuitable platforms and architectures; or
modified apps (potentially containing malware).

My first step was to conduct reconnaissance of the target: examine known CVEs and then perform binary diffing to identify 1-day vulnerabilities.

Shared libraries

After unpacking the APK, you can find library files in the lib directory: they are neither packed, nor encrypted, nor obfuscated, nor protected, which, in theory, makes my task much easier. Note that messengers usually don’t use passive defense mechanisms to protect their libraries from analysis; WhatsApp only uses a custom packer, but not for the sake of protection. I decided to analyze library versions for the x86_64 architecture due to the following reasons:

more tools are available for this architecture;
better decompilation (this, of course, is debatable since much depends on the tool you are using);
emulation at higher speeds (my host has the x86_64 architecture);
possibility to perform analysis partially on the host PC (i.e. not on the emulator); and
at this stage, my goal is not to write a specific exploit for multiple targets; accordingly, ARM architectures can be omitted for now.

For initial shallow analysis, I used such tools as IDA Pro, Binary Ninja, and Rizin (I didn’t use Ghidra because the problem had to be solved quickly). The algorithm is simple: you load the library, look at the exported symbols, find strings, read the code a little… But then I switched to a one-line command containing everything I needed: readelf -W --demangle --symbols "$LIBRARY_SO" | tail -n +4 | sort -k 7 | less.

To identify the JNI functions contained in the libraries, I filtered the same readelf output by the Java_ prefix and then applied rg/grep to the decompiled code to find the files containing declarations of the native functions:

$ readelf -W --demangle --symbols libnativehttp.so | tail -n +4 | sort -k 7 | rg "FUNC.*Java_.*" | less
33: 0000000000001bb5 55 FUNC GLOBAL DEFAULT 13 Java_com_viber_libnativehttp_HttpEngine_nativeCreateHttp
34: 0000000000001bec 15 FUNC GLOBAL DEFAULT 13 Java_com_viber_libnativehttp_HttpEngine_nativeDelete
38: 0000000000001bfb 622 FUNC GLOBAL DEFAULT 13 Java_com_viber_libnativehttp_HttpEngine_nativeTest
44: 00000000000018c3 109 FUNC GLOBAL DEFAULT 13 Java_com_viber_libnativehttp_NativeDownloader_nativeOnConnected
39: 00000000000015e8 366 FUNC GLOBAL DEFAULT 13 Java_com_viber_libnativehttp_NativeDownloader_nativeOnData
35: 0000000000001b0c 40 FUNC GLOBAL DEFAULT 13 Java_com_viber_libnativehttp_NativeDownloader_nativeOnDisconnected
40: 0000000000001930 476 FUNC GLOBAL DEFAULT 13 Java_com_viber_libnativehttp_NativeDownloader_nativeOnHead

$ rg "native.*nativeCreateHttp"
app/src/main/java/com/viber/libnativehttp/HttpEngine.java
9: public static native long nativeCreateHttp();
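The step from an exported symbol to the Java declaration can also be scripted. The sketch below is only an illustration and not part of the original workflow: the helper name jniSymbolToJava is hypothetical, and it handles just the common JNI name-mangling cases (the Java_ prefix, _ as a separator, _1 as an escaped underscore, and a trailing __signature for overloaded methods). The _0xxxx, _2, and _3 escapes are not decoded.

```javascript
// Convert an exported JNI symbol into a Java class name and method name.
// Returns null for symbols that are not JNI entry points.
function jniSymbolToJava(symbol) {
  if (!symbol.startsWith("Java_")) return null;
  // Overloaded natives append "__" plus a mangled signature; drop it.
  const mangled = symbol.slice("Java_".length).split("__")[0];
  // "_" separates name components, while "_1" is an escaped literal underscore.
  const parts = mangled.split(/_(?!\d)/).map((p) => p.replace(/_1/g, "_"));
  const method = parts.pop();
  return { className: parts.join("."), method };
}

// Build the same kind of search pattern that is passed to rg above.
const info = jniSymbolToJava("Java_com_viber_libnativehttp_HttpEngine_nativeCreateHttp");
console.log(`${info.className}: rg "native.*${info.method}"`);
```

Feeding it every Java_* line from the readelf output gives a list of classes to open in jadx first.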

Shallow analysis is performed to identify: (1) the target library; and (2) open-source components to be examined later. Analysis can be conducted using the above-mentioned readelf, or Rizin, or Binary Ninja: you just Google the name of an exported symbol and perform superficial reverse engineering to recreate the overall picture of the library’s functions.

Preliminary results

Below is the list of shared libraries with brief descriptions of their functions or links to respective open-source projects:

libc++_shared.so – C++ standard library;
libcrashlytics-common.so, libcrashlytics-handler.so, libcrashlytics-trampoline.so, libcrashlytics.so – Firebase Crashlytics;
libreactnativeblob.so, libreactnativejni.so, libglog_init.so, libjscexecutor.so, libjsijniprofiler.so, libjsinspector.so – React Native;
libfb.so, libfbjni.so – fbjni;
libfolly_futures.so – Folly: Facebook Open-source Library;
libfolly_json.so – Folly: Facebook Open-source Library, double-conversion;
libglog.so – Google Logging Library;
libhermes-executor-release.so, libhermes.so – Hermes JS Engine;
libicuBinder.so – ICU extension for SQLite;
libimage_processing_util_jni.so – androidx.camera.core;
libimagepipeline.so, libnative-filters.so, libnative-imagetranscoder.so – Fresco;
libgifimage.so – The GIFLIB project, Fresco;
libjingle_peerconnection_so.so – an old component (libjingle) from WebRTC;
libmux.so – FFmpeg from fftools;
libpl_droidsonroids_gif.so – android-gif-drawable;
librenderscript-toolkit.so – RenderScript;
libsigner.so – Adjust SDK for Android;
libspeexjni.so – Speex;
libsqliteX.so – SQLite for Android;
libtensorflowlite_gpu_jni.so, libtensorflowlite_jni.so – TensorFlow;
libyoga.so – Yoga; and
libCrossUnblocker.so, libFlatBuffersParser.so, liblinkparser.so, libnativehttp.so, libsvg.so, libViberRTC.so, libvideoconvert.so, libVoipEngineNative.so – hand-crafted libraries that contain open-source code.

Based on this list, a short list of ‘interesting’ libraries can be produced. The selection condition is simple: maximum volume of hand-crafted code and minimum amount of open-source components. Here it is:

libCrossUnblocker.so;
libFlatBuffersParser.so;
liblinkparser.so;
libnativehttp.so;
libsvg.so;
libViberRTC.so;
libvideoconvert.so; and
libVoipEngineNative.so.

Functions

Before you start analyzing any function, you have to figure out whether an attacker can reach it. For this purpose, let’s sort all libraries and functions by priority and then check where the functions are called from. To analyze Java code, I use the following combination: jadx (decompilation) + Android Studio (refactoring) + Understand (analysis of graphs showing relationships between variables, functions, and data).

In addition, each library has to be checked for the presence of JNI functions. It cannot be ruled out that some of them register native methods dynamically using the RegisterNatives method; in this case, no Java_* exports appear in the symbol table.

libFlatBuffersParser.so

More detailed examination shows that this is an open-source library called FlatBuffers. I leave its analysis for later.

libsvg.so

To extract strings from a binary file, I use such utilities as strings and Ghidra. The extracted strings indicate that the code is written in C++:

[...]
_ZTVN10__cxxabiv121__vmi_class_type_infoE
_ZNSt6__ndk119__shared_weak_countD2Ev
_ZTINSt6__ndk119__shared_weak_countE
__android_log_print
_ZNKSt6__ndk16locale9has_facetERNS0_2idE
_ZNKSt6__ndk16locale9use_facetERNS0_2idE
[...]

In addition, I check for the presence of RTTI information: fortunately, in this particular case, it’s available. Using a plugin for Ghidra called Ghidra C++ Class and Run Time Type Information Analyzer, you can restore structures of classes in C++ code.

Recovered RTTI information: C++ classes and their methods

At first glance, a hand-crafted SVG library is used, which is good. But further analysis of the Java code (call tree analysis) shows that the library functions are used mainly to load application assets (the ./assets/svg directory in the APK file), which doesn’t suit my purposes. However, the call chain of the native function nativeParseFd → parseFile is used to parse stickers. This looks interesting! But I decided to leave this function for later since my intuition told me that it makes no sense to write an SVG parser from scratch; accordingly, the library likely uses some open-source code.

Call graph of SVG native functions in Understand

libnativehttp.so

In the course of analysis, an inevitable question arises: what was this library created for? Its capabilities appear modest: wrappers for functions that process network data. When events occur, the handlers call back into the same Java code, not something native. Perhaps some functionality was there in the past (or at least was planned)? I’ve already seen something similar in the Telegram messenger: it contains plenty of legacy code; such a vast attack surface makes your hands itch at first, but thorough analysis puts everything into place. So, be aware that sometimes you might encounter not only dead code, but entire dead libraries whose examination would waste plenty of valuable time.

However, in this particular case, everything becomes clear after a thorough analysis. This library is used by other libraries, which means that functions processing network data must be implemented elsewhere. In fact, this is an interceptor that can be used, for instance, to measure network traffic or collect analytical data.

Other libraries

The text below describes my work with just one library. I found it to be of interest since it can be used to search for low-hanging fruit, and I left the analysis of other libraries for later. Their list is as follows:

libCrossUnblocker.so;
libvideoconvert.so;
libViberRTC.so; and
libVoipEngineNative.so.

Note that these libraries might also contain interesting vulnerabilities since they implement a set of VoIP functions that have enough room for errors.

Target analysis

For the above reason, I chose the liblinkparser.so library as a target.

Analysis of this library began with detailed reverse engineering. It was so detailed that I spent two days on it and forgot to check whether the functions contained in this library are used at all and whether I (i.e. the attacker) can call them using malicious input data. Reverse engineering took so long because the library’s functionality seemed very promising from the vulnerability identification perspective.

I compared all the main reverse engineering tools and chose Binary Ninja for my research. A long time ago, I performed a comparative analysis of Binary Ninja and IDA Pro, and that article is still relevant (furthermore, BN has got more advantages since then). I suggest performing such comparisons every time because different targets require different tools.

But in some situations I had to use IDA Pro: debugging libraries only with GDB/LLDB is inconvenient, to put it mildly; while BN’s debugger is still under active development, and wrappers over other debuggers don’t always work properly. If this were a real-life case, not a test assignment, I would probably use another program as the main tool. Later, I will have to automate reverse engineering processes, write an exploit, transfer existing data between library versions, debug several modules at once, etc… I’m more used to performing all these operations in Ghidra, but as I said, in this particular case, speed was of utmost importance.

You might want to take a closer look at Binary Ninja

In the course of reverse engineering, I found out the purpose of the library that has attracted my attention: it parses website URLs and metadata and generates a preview for a link sent in a private message to another user or to myself. The next step was to prove that a specific function is responsible for this operation in the app.

Function accessibility

I used the QEMU emulator (Android Virtual Device) for the x86_64 architecture as a test system. A test run of Viber on the emulator showed the absence of protection mechanisms, which was good for my purposes. I deployed a Frida server; JS scripts were taken directly from jadx, which ultimately played a bad joke on me (see below).

The first (spoiler: and the last) function selected for analysis was nativeGeneratePreview:

let LinkParser = Java.use("com.viber.liblinkparser.LinkParser");
LinkParser["nativeGeneratePreview"].implementation = function (url, http) {
    console.log(`LinkParser.nativeGeneratePreview is called: url=${url}, http=${http}`);
    let result = this["nativeGeneratePreview"](url, http);
    console.log(`LinkParser.nativeGeneratePreview result=${result}`);
    return result;
};

Too bad, the script could not be processed: Frida claimed that this class could not be found. I tried different Frida launch configurations, traced native functions using frida-trace – and everything worked fine. At this point, I could have stopped since it had been proven that the function can be called – but I was wondering why Frida didn’t work and couldn’t find any apparent reason for that. I checked which classes were loaded:

Java.enumerateLoadedClasses({
    onMatch: function(className) {
        console.log(className);
    },
    onComplete: function() {}  // invoked once enumeration finishes
});

The class wasn’t loaded (which, generally speaking, was logical since I hadn’t called app functions yet). Then I tried to load the class manually (perhaps, there is a custom loader inside?):

Java.enumerateClassLoaders({
    onMatch: function(loader){
        Java.classFactory.loader = loader;
        var LinkParserClass;
        try{
            LinkParserClass = Java.use("com.viber.liblinkparser.LinkParser");
            LinkParserClass.nativeGeneratePreview.implementation = function(){
                console.log("[+] Inside nativeGeneratePreview method");
            }
        }catch(error){
            if(error.message.includes("ClassNotFoundException")){
                console.log("Class not found");
            }
        }
    },
    onComplete: function(){}  // invoked once enumeration finishes
});

This gave no result, either positive or negative. And then it dawned on me! I had forgotten about Java.perform, and the code doesn’t work without it. In the past, I never used Frida to analyze Android apps and completely missed this important detail. Needless to say, jadx didn’t remind me of it. But eventually, all the above-mentioned scripts were executed successfully.
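For reference, here is how the first hook might look once it is wrapped in Java.perform, which runs the callback on a thread attached to the app's VM. This is a sketch under assumptions: it presumes the Frida Java bridge is exposed as the global Java object (as in the scripts above), and the helper name hookGeneratePreview as well as the typeof guard are additions made only so that the file can also be loaded outside Frida.

```javascript
// Install a logging hook on LinkParser.nativeGeneratePreview.
// `Java` is Frida's Java bridge object, passed in explicitly.
function hookGeneratePreview(Java) {
  const LinkParser = Java.use("com.viber.liblinkparser.LinkParser");
  const method = LinkParser["nativeGeneratePreview"];
  method.implementation = function (url, http) {
    console.log(`LinkParser.nativeGeneratePreview is called: url=${url}, http=${http}`);
    // Forward the call to the original native method.
    const result = this["nativeGeneratePreview"](url, http);
    console.log(`LinkParser.nativeGeneratePreview result=${result}`);
    return result;
  };
}

// Inside Frida, Java.perform attaches the current thread to the VM first.
// The guard only keeps this file loadable where no global Java exists.
if (typeof Java !== "undefined") {
  Java.perform(() => hookGeneratePreview(Java));
}
```

The same wrapping applies to the class enumeration and class loader scripts: their bodies stay unchanged and simply move inside the Java.perform callback.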

Intercepting the call of the native function nativeGeneratePreview

I continued my research and found out that some URLs are blacklisted, i.e. no previews are generated for them:

rakuten-viber.atlassian.net;
jira.vibelab.net.

$ timeout 5 curl jira.vibelab.net ; echo $?

124

Out of curiosity, I allowed previews for these sites using Frida:

Java.perform(function () {
    let LinkParser = Java.use("com.viber.liblinkparser.LinkParser");
    const moduleName = "liblinkparser.so";
    const moduleBaseAddress = Module.findBaseAddress(moduleName);
    const functionRealAddress = moduleBaseAddress.add(0x0000000000013560);
    Interceptor.attach(functionRealAddress, {
        onEnter: function(args) {
            console.log(`check_link_in_black_list = ${args[0]}`);
        },
        onLeave: function(retval) {
            console.log(`return = ${retval}`);
            retval.replace(0);
        }
    });
})

However, nothing interesting happened: the sites were still inaccessible. This is strange; why were they blocked then? I left this question for later; perhaps, these sites can be reached from Viber servers via the client?..

Fuzzing

After spending some time on reverse engineering, I realized that my eyes and intuition aren’t sufficient to quickly find a vulnerability, and other methods are required.

Since I was limited in time, I had to choose only one vulnerability identification technique, and it had to be really effective. My choice was fuzzing. Of course, I could have tried to use several methods at once. For instance, apply static analysis to decompiled code using Joern or other tools, or perform taint analysis and test the symbolic and concolic execution of dangerous functions used in the program (I normally use built-in Ghidra functions based on PCode, as well as angr, Miasm, BAP, and other tools). In addition, I could search for 1-day vulnerabilities in open-source code using binary diffing (I use Ghidra Version Tracking and BinDiff, as well as manual search if the code was successfully reverse-engineered).

Several fuzzing options came to my mind:

AFL++ on an emulator with Frida-based instrumentation;
AFL++ on an Android device with QEMU-based instrumentation;
AFL++ on a host with QEMU-based instrumentation;
Honggfuzz/AFL++ and QBDI; or
LibAFL on an emulator with Frida-based instrumentation.

The first option seemed to be unfeasible because it could take plenty of time to build AFL++ for Android (I mean time required to get an insight into the process, not to build, let’s say, AOSP).

How about the second option? I had a test device with the AArch64 architecture. However, the number of build-related problems in this case could be even greater. In addition, I would have to build a harness for QEMU for Android AArch64, and my experience shows that building a harness for ARM can take a while.

The third option could be used only as a backup. As for the fourth option, I have no experience with QBDI (although I really want to try it every time… but apparently not this time).

And finally, the fifth option. It seemed ideal to me for a number of reasons:

I have already worked with LibAFL;
I am a Rust developer; so, any potential problems will be solved faster;
it seemed to me that I can build Frida without significant problems; and
Rust is better suited for cross-compilation.

First of all, I decided to build a test fuzzer that comes with LibAFL. Problems arose right away, but, to the credit of the developers, their number wasn’t too large. All these problems were fixed with a simple patch:

diff --git a/libafl/src/bolts/minibsod.rs b/libafl/src/bolts/minibsod.rs
index 59f6ae6b..40d8e3d5 100644
--- a/libafl/src/bolts/minibsod.rs
+++ b/libafl/src/bolts/minibsod.rs
@@ -10,7 +10,10 @@ use libc::siginfo_t;
 use crate::bolts::os::unix_signals::{ucontext_t, Signal};
 /// Write the content of all important registers
-#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
+#[cfg(all(
+    any(target_os = "linux", target_os = "android"),
+    target_arch = "x86_64"
+))]
 #[allow(clippy::similar_names)]
 pub fn dump_registers<W: Write>(
     writer: &mut BufWriter<W>,
@@ -408,7 +411,10 @@ fn dump_registers<W: Write>(
     Ok(())
 }
-#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
+#[cfg(all(
+    any(target_os = "linux", target_os = "android"),
+    target_arch = "x86_64"
+))]
 fn write_crash<W: Write>(
     writer: &mut BufWriter<W>,
     signal: Signal,

Such fixes indicate that users either rarely use LibAFL on Android x86_64 or don’t use it at all.

To compile LibAFL, I used Android NDK: you can either download a ready-made version or build it yourself. Then I built the fuzzer frida_libpng and successfully tested it on the emulator.

Harness

The situation with the harness wasn’t as smooth as I expected; so, I dedicate a separate section to it.

I spent a lot of time on reverse engineering to figure out which function could be called without side effects (i.e. doesn’t require an execution context) and at the same time is of interest from the fuzzing perspective. I omit here all the details of reverse engineering of C++ code: plenty of such information is available on the Internet. To simplify the interaction with tools, I use my own scripts for Ghidra (also don’t forget the powerful Ghidra repository containing useful scripts) and snippets for Binary Ninja.

Most of the interaction with the network occurs in Java code. But there are also functions that process data received from Java code (i.e. load websites for preview). Unfortunately, these functions cannot be analyzed in a short period of time since their initialization is mixed with Java code execution. Plenty of time is required to implement all stubs during initialization, and only then you can start fuzzing, for instance, data parsing functions. By general initialization, I mean the initialization of parsers: a separate parser is used for each type of data (embedded sites, images, website metadata, etc.).

There is also a finite-state machine for HTML parsing. The website body is read in a stream and fed in chunks as input to the finite-state machine that determines the data type contained in HTML. Then, depending on the data type, one or another information extractor is called:

BareTitleExtractor;
ImgTagExtractor;
LinkTagExtractor;
MetaTagExtractor;
etc.

All these targets are of utmost interest, but their complexity and dependence on Java code kept me from an attempt to analyze them.
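To illustrate why such a design is a tempting target, below is a deliberately tiny sketch of a chunk-fed state machine. Everything in it is an assumption made for illustration: the two-state scanner, the class and function names, and the mapping from tag names to extractors are guesses based only on the extractor names listed above, not a reconstruction of the code in liblinkparser.so.

```javascript
// Guessed mapping from HTML tag names to the extractor names seen in the binary.
const EXTRACTORS = new Map([
  ["title", "BareTitleExtractor"],
  ["img", "ImgTagExtractor"],
  ["link", "LinkTagExtractor"],
  ["meta", "MetaTagExtractor"],
]);

// Two-state scanner whose state survives chunk boundaries.
class ChunkedTagScanner {
  constructor(onTag) {
    this.state = "TEXT"; // either "TEXT" or "TAG"
    this.tag = "";       // characters collected since the last "<"
    this.onTag = onTag;
  }
  feed(chunk) {
    for (const ch of chunk) {
      if (this.state === "TEXT") {
        if (ch === "<") { this.state = "TAG"; this.tag = ""; }
      } else if (ch === ">") {
        this.state = "TEXT";
        this.onTag(this.tag);
      } else {
        this.tag += ch;
      }
    }
  }
}

// Pick an extractor by tag name; closing tags and unknown tags give null.
function extractorFor(tag) {
  const name = tag.trim().split(/[\s\/]/)[0].toLowerCase();
  return EXTRACTORS.get(name) ?? null;
}

// Tags are split across chunks on purpose.
const seen = [];
const scanner = new ChunkedTagScanner((tag) => {
  const extractor = extractorFor(tag);
  if (extractor) seen.push(extractor);
});
for (const chunk of ["<html><me", "ta name=x><im", "g src=y>"]) scanner.feed(chunk);
console.log(seen);
```

State carried across chunk boundaries (here, the partially collected tag) is exactly the kind of bookkeeping where native parsers tend to make mistakes, which is why chunk sizes are worth varying when such code is eventually fuzzed.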

Ultimately, a function that parses URL strings (parse_link) was found. At first glance, it was perfectly suited for analysis and fuzzing.

The next step was to write a wrapper for the target function. I started computing offsets of the functions I needed and then performed additional reverse engineering of the required structures:

const ptrdiff_t ADDR_JNI_ONLOAD = 0x0000000000011640;
const ptrdiff_t ADDR_PARSE_LINK = 0x000000000002F870;
const ptrdiff_t ADDR_COPY_JNI_STRING_FROM_STR = 0x0000000000011160;
[...]
typedef struct ParserResult
{
    struct String user_agent_string;
    struct String user_agent_info_string;
    struct String accept_string;
    struct String mime_type_string;
} ParserResult;
[...]

My analysis showed that data from another library, libicuBinder.so, are initialized prior to calling the target function. At this point, I encountered a problem that took me almost two days to solve. Inside the parse_link function, the uidna_nameToASCII_UTF8 function is called from the libicuBinder.so library. And, of course, in my harness, I used functions in their ‘raw’ form:

[...]
Functions *load_functions()
{
  LIBC_SHARED = dlopen("/data/local/tmp/libc++_shared.so", RTLD_NOW | RTLD_GLOBAL);
  LIBICU_BINDER = dlopen("/data/local/tmp/libicuBinder.so", RTLD_NOW | RTLD_GLOBAL);
  LIBLINKPARSER = dlopen("/data/local/tmp/liblinkparser.so", RTLD_NOW | RTLD_GLOBAL);
  if (LIBLINKPARSER != NULL && LIBC_SHARED != NULL && LIBICU_BINDER != NULL)
  {
    int (*JNI_OnLoad)(void *, void *) = dlsym(LIBLINKPARSER, "JNI_OnLoad");
    void (*binder_init)() = dlsym(LIBICU_BINDER, "_ZN22IcuSqliteAndroidBinder4initEv");
    if (JNI_OnLoad != NULL && binder_init != NULL /* && binder_getInstance != NULL */)
    {
      Dl_info jni_on_load_info;
      dladdr(JNI_OnLoad, &jni_on_load_info);
      size_t jni_on_load_addr = (size_t)jni_on_load_info.dli_saddr;
      Dl_info binder_init_info;
      dladdr(binder_init, &binder_init_info);
      size_t binder_init_addr = (size_t)binder_init_info.dli_saddr;
      int diff_parse_link = ADDR_PARSE_LINK - ADDR_JNI_ONLOAD;
      int diff_copy_jni_string_from_str = ADDR_COPY_JNI_STRING_FROM_STR - ADDR_JNI_ONLOAD;
      size_t parse_link_addr = jni_on_load_addr + diff_parse_link;
      size_t copy_jni_string_from_str_addr = jni_on_load_addr + diff_copy_jni_string_from_str;
      printf("[i] parse_link_addr: %zX\n", parse_link_addr);
      printf("[i] copy_jni_string_from_str_addr: %zX\n", copy_jni_string_from_str_addr);
      void (*parse_link)(ParserResult *, String *) = (void (*)(ParserResult *, String *))(parse_link_addr);
      void (*copy_jni_string_from_str)(String *, const char *) = (void (*)(String *, const char *))(copy_jni_string_from_str_addr);
      if (parse_link != NULL && copy_jni_string_from_str != NULL)
      {
        Functions *functions = (Functions *)malloc(sizeof(Functions));
        functions->parse_link = parse_link;
        functions->copy_jni_string_from_str = copy_jni_string_from_str;
        return functions;
      }
[...]
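The address arithmetic in this fragment boils down to one formula: the runtime address of the target equals the runtime address of a known export plus the difference between the two static addresses. A minimal sketch in JavaScript, using the two offsets listed earlier; the runtime address of JNI_OnLoad is a made-up value, and rebase is a hypothetical helper name:

```javascript
// Static addresses taken from the disassembler (same values as in the harness).
const ADDR_JNI_ONLOAD = 0x11640n;
const ADDR_PARSE_LINK = 0x2f870n;

// BigInt keeps 64-bit addresses exact.
function rebase(runtimeAnchor, staticAnchor, staticTarget) {
  return runtimeAnchor + (staticTarget - staticAnchor);
}

// Hypothetical address returned by dlsym() for JNI_OnLoad in one particular run.
const jniOnLoadRuntime = 0x7f0000011640n;
const parseLink = rebase(jniOnLoadRuntime, ADDR_JNI_ONLOAD, ADDR_PARSE_LINK);
console.log(`parse_link is at 0x${parseLink.toString(16)}`);
```

This works only as long as both functions live in the same loaded module and the static addresses come from exactly the library build that is being loaded.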

However, every time I tried to start the harness, I got a segmentation fault. Using strace for initial analysis, I found out that libicuBinder.so performs initialization when the function uidna_nameToASCII_UTF8 is called, but something goes wrong. As a result, a call to address 0 (i.e. through a null function pointer) is made inside that function. Ultimately, I had to reverse the libicuBinder.so library to understand what happens during initialization and how the uidna_nameToASCII_UTF8 function is called.

Initially, I tried to debug this library in GDB, but then switched to IDA (IDA android_x64_server) since, by that time, I had already recovered some of the functions, and it would have been stupid not to use this information for debugging purposes.

Debugging in IDA the libicuBinder.so library loaded into memory

Eventually, I realized what was going on. First, the Android system searches for libraries that implement ICU; then it looks up the exported function symbols that will be used later (this is why the library is called Binder). The library takes the base symbol names from its own data, while the ICU version is obtained via a Java call. My harness never runs that Java code, which is why the required function symbols could not be resolved. To work around this, I added code to the harness that patches the version value in memory without calling Java code (for each system image, I had to change the version to make everything work; of course, this process could be automated in the future).

void set_icu_version(ptrdiff_t binder_init_addr)
{
  ptrdiff_t diff = g_ICU_VERSION - ICU_SQLITE_ANDROID_BINDER__INIT;
  ptrdiff_t version_addr = binder_init_addr + diff;
  printf("[i] original ICU_VERSION = %X\n", *(uint32_t *)version_addr);
  *(uint32_t *)version_addr = ICU_VERSION;
  return;
}

Then I started the fuzzer with the updated harness and encountered numerous crashes. It was clear that something was going wrong again. This time, the function std::codecvt<InternT,ExternT,StateT>::do_in in the liblinkparser.so library was throwing an exception because it couldn’t create a wide character string from bytes. At the time when I was doing this test assignment, I wasn’t 100% sure whether the attacker can send raw bytes as a message or not (now I know this for certain) and simply fixed the fuzzer so that it generates valid UTF-8 data.
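The idea of keeping only well-formed UTF-8 inputs can be sketched as follows. This is an illustration of the technique in JavaScript, not the actual change made to the fuzzer; the function names are hypothetical, while TextDecoder and TextEncoder are the standard encoding APIs available in Node.js and browsers.

```javascript
// Return true if the byte array is well-formed UTF-8.
function isValidUtf8(bytes) {
  try {
    // With fatal: true, decode() throws a TypeError on malformed input.
    new TextDecoder("utf-8", { fatal: true }).decode(bytes);
    return true;
  } catch (e) {
    return false;
  }
}

// Lossy repair: malformed sequences become U+FFFD, then the text is re-encoded.
// ignoreBOM: true keeps a leading byte order mark instead of stripping it.
function toValidUtf8(bytes) {
  const text = new TextDecoder("utf-8", { ignoreBOM: true }).decode(bytes);
  return new TextEncoder().encode(text);
}

const candidate = new Uint8Array([0x68, 0x74, 0x74, 0x70, 0xff]);
const input = isValidUtf8(candidate) ? candidate : toValidUtf8(candidate);
console.log(input.length);
```

Rejecting invalid inputs outright keeps the corpus clean but wastes executions; repairing them, as in toValidUtf8, keeps every mutation usable at the cost of changing the bytes the mutator produced.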

Experiments and enhancements

Overall, the code coverage was very low, and no new input was generated. Therefore, I decided to capture the execution trace for analysis. This could be done in several ways, and the above-mentioned LibAFL seemed to be the easiest one. Unfortunately, the Frida coverage collection method works only on the AArch64 architecture:

$ ./frida_fuzzer --help
[...]
--drcov Enable DrCov (AArch64 only)
[...]

Then I decided to launch my fuzzer on an AArch64 platform. This way, I could use a separate physical device instead of the emulator.

Again, I encountered multiple problems, including fuzzer building. As a result, I had to play a little with the toolchain for AArch64 and with the Android NDK since its latest versions refuse to work with Rust. Various dirty tricks and patches didn’t help; so, I had no choice but to use an old NDK version.

Then an error occurred while building the frida-gum-sys crate: Frida Gum was built against header files from the host system (in my case, x86_64), which are incompatible with AArch64 (a problem with pthread.h). I fixed this by cloning the dependency repository (the entire frida-rust) and manually patching the build.rs file to add a directive to use the sysroot from the Android NDK. It worked. But another problem appeared: the Android NDK doesn’t contain the required header file frida-gum.h (which is quite understandable). So, I added another directive telling the build where to get this file.

diff --git a/frida-gum-sys/build.rs b/frida-gum-sys/build.rs
index 6afbb737..adcb2c02 100644
--- a/frida-gum-sys/build.rs
+++ b/frida-gum-sys/build.rs
@@ -65,9 +65,11 @@ fn main() {
         bindings.clang_arg("-Iinclude")
     } else {
         bindings
+        bindings.clang_arg("-Iinclude")
     };
     let bindings = bindings
+        .clang_arg("--sysroot=./ndk/22.1.7171670/toolchains/llvm/prebuilt/linux-x86_64/sysroot/")
         .header_contents("gum.h", "#include \"frida-gum.h\"")
         .header("event_sink.h")
         .header("invocation_listener.h")

Then another problem arose: the new version of frida-gum simply couldn’t be built. Apparently, the developer had recently broken something, or a new version of Frida had been released and the API had changed. I fixed this too: the culprit turned out to be an old workaround for some other error, because of which the function didn’t work on newer versions of Frida.

diff --git a/frida-gum-sys/src/lib.rs b/frida-gum-sys/src/lib.rs
index d689106a..f8d5cbed 100644
--- a/frida-gum-sys/src/lib.rs
+++ b/frida-gum-sys/src/lib.rs
@@ -16,10 +16,4 @@ mod bindings {
 pub use bindings::*;
-#[cfg(not(any(
-    target_os = "macos",
-    target_os = "ios",
-    target_os = "windows",
-    target_os = "android"
-)))]
 pub use _frida_g_object_unref as g_object_unref;

Then another series of errors occurred (although of a different type): dependency conflicts. Several crates used different versions of dependencies. It turned out that libafl_frida uses old versions of frida-gum and frida-gum-sys. At this point I stopped because, with the updated dependencies, a bunch of errors popped up in libafl_frida, and I had neither desire nor time to fix them. Currently, I am trying to fix and build libafl_frida for AArch64; so, I’ll discuss the LibAFL + Frida combination on the AArch64 architecture in the next article.

Ultimately, I decided to take a different approach: try to improve the fuzzer blindly and without coverage. Mutations and raw input data used in the fuzzer were unsuitable for my purposes since I was checking for a valid UTF-8 string. I decided to rewrite the fuzzer using a tokenizer. To do this properly, I needed time that I didn’t have… So, I implemented a primitive tokenizer described in the study Tartiflette: Snapshot fuzzing with KVM and libAFL.

As a result, the fuzzer started behaving much better and produced the desired result.
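The general idea of token-level mutation can be sketched as follows. This is neither the tokenizer from the Tartiflette write-up nor the one from the actual fuzzer: the delimiter set, the dictionary, and the four mutation operators are assumptions picked for URL-like input.

```javascript
// Split a URL-like string into delimiter tokens and runs of other characters.
// Joining the tokens back always reproduces the input.
function tokenize(input) {
  return input.match(/[:\/?#@&=.%]|[^:\/?#@&=.%]+/g) || [];
}

// Hypothetical dictionary of tokens that are interesting for a URL parser.
const DICTIONARY = ["http", "viber", "%00", "xn--", "@", "//", "é", "A".repeat(64)];

function pick(list, rand) {
  return list[Math.floor(rand() * list.length)];
}

// Apply one random token-level mutation; `rand` returns a number in [0, 1).
function mutate(input, rand = Math.random) {
  const tokens = tokenize(input);
  if (tokens.length === 0) return pick(DICTIONARY, rand);
  const i = Math.floor(rand() * tokens.length);
  switch (Math.floor(rand() * 4)) {
    case 0: tokens[i] = pick(DICTIONARY, rand); break;           // replace a token
    case 1: tokens.splice(i, 0, pick(DICTIONARY, rand)); break;  // insert a token
    case 2: tokens.splice(i, 1); break;                          // delete a token
    default: tokens.splice(i, 0, tokens[i]); break;              // duplicate a token
  }
  return tokens.join("");
}

console.log(mutate("https://example.com/path?x=1"));
```

Because whole tokens are moved around instead of individual bytes, the output stays a well-formed string, so it can be encoded as valid UTF-8 before being handed to the harness, and the structure of the URL changes in ways a byte-level mutator rarely reaches without coverage feedback.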

Conclusions

The above-described path led me from the very beginning to the potential discovery of a vulnerability. You can travel this path yourself and improve a lot in some places since you aren’t as limited in time as I was. In addition, I had to leave many promising points of interest for later – but you are welcome to explore them! Both the fuzzer and the harness are available on GitHub.

Most importantly, the presented approach can be used to identify vulnerabilities in many real apps that contain shared libraries.

Good luck!
