Creating events
First of all, it’s necessary to prepare a dataset to be used in this research.
Info: All experiments described below were conducted on VMware Workstation 14 Pro, so the hypervisor-related commands are given for that product.
To prepare test events, you'll need two virtual machines: one running Windows 10 and the other running Kali Linux 2021.2. If you want to build a test environment on your own, I don't recommend creating a large disk for the Windows VM: 40 GB will be enough, and all processes will run faster. In addition, I recommend installing Sysmon on the experimental machine to make the analysis clearer. When you start pursuing the 'malefactor', you'll understand the value of this utility.
I won't say anything about the infection and post-exploitation of the test system; this way, it will be more interesting for you to extract evidence from the test image.
So, after performing some malicious operations in the test system, you stop pretending to be a cybercriminal and return to your normal status: forensic cybersecurity specialist.
Info: You don't have to repeat all the steps described below exactly. This article describes a general approach to such situations, including those involving real hardware. The Plaso tool that I use analyzes VM images effectively, provided that each image is stored in a single .vmdk file.
After collecting evidence from the virtual machine, I will run a shell on the attacked system once again, clear the logs using the clearev command from Meterpreter, and collect the evidence again. At the end of the article, you'll see how effective Meterpreter is in covering up your tracks and whether you can rely on it.
Turn off both VMs, mount the virtual disk from the Windows VM to the Kali VM as the second hard drive, and configure the Shared Folder for Kali on the hard drive of your host computer (important: it must have enough free space to make a byte copy of the Windows disk). If you have allocated 40 GB for it, then your disk should have at least the same amount of free space.
Run Kali, create a mounting point, and mount the Shared Folder:
$ sudo mkdir /mnt/hgfs
$ sudo /usr/bin/vmhgfs-fuse .host:/ /mnt/hgfs -o subtype=vmhgfs-fuse,allow_other
Find the target hard drive:
$ sudo fdisk -l
In this particular case, it's sda. To create a byte copy, run the following command:
$ sudo dd if=/dev/sda of=/mnt/hgfs/dd/disk1.dd bs=8M
Wait for the copy to finish and turn the VM off. Now, by the book, you have to calculate the checksum of the resulting image, and then it's time for the fluegergeheimer.
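The checksum step mentioned above can be sketched in Python. This is a minimal illustration, not part of Plaso: the function name `file_sha256` and the choice of SHA-256 are my assumptions (the article doesn't name a hash algorithm), and the file is read in chunks so that a 40 GB image doesn't have to fit in memory.

```python
import hashlib


def file_sha256(path, chunk_size=8 * 1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks.

    The 8 MiB default mirrors the bs=8M block size used with dd;
    any reasonable chunk size gives the same digest.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as image:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: image.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `file_sha256` on the image before and after the analysis and comparing the two digests is one way to show that the evidence hasn't been modified in between.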
Plaso
Plaso (a recursive acronym for the Icelandic "Plaso Langar Að Safna Öllu", which means "Plaso wants to collect everything") is a tool written in Python whose primary purpose is to produce a super timeline of all events that have occurred in the system and export it into a single, gigabyte-sized CSV file.
Version 1.0.0 was released on December 5, 2012; however, the log2timeline utility written in Perl was mentioned for the first time on The Forensics Wiki back on August 28, 2009.
Installation
No special skills are required to install Plaso (at the time of writing, the current release was 20210606); all you need on Linux is a terminal and Internet access. If you want to examine the source code, clone the repository; if you don't, use pip.
$ sudo apt install python3-pip
$ pip install plaso elasticsearch
Then install all the required dependencies:
$ pip install -r requirements.txt
The requirements.txt file lacks some optional dependencies: chardet, fakeredis, and mock; so, you need one more command to make things work:
$ pip install chardet fakeredis mock
On Windows, the situation is more complicated because you have to install Build Tools for Visual Studio. But the problem can be solved easily if you have the Visual Studio 2017 installation package (I used it, and everything worked just fine). Run the installation and select the VC++ component in the Individual section. After that, all dependencies will be installed via pip.
Required tools
Plaso includes a number of useful utilities located in the tools directory.
image_export extracts files from a device or its image based on various criteria: from extension and path to signatures and creation/modification time. After extracting the required files, the utility generates a hashes. file that contains hash values of all extracted files for further analysis (e.g. on VirusTotal).
The utility can be launched in several ways:
py image_export.py disk1.dd
Such parameters as --names, --extensions, and --date-filter shouldn't raise any questions; so, let's examine the extraction by signatures in more detail.
py image_export.py --signatures list
With list as the argument, this command prints the signatures described in data\; with one or more signature identifiers instead, the utility extracts the files that match them. Of course, the config contains only basic signatures suitable for all occasions, but you can always add a string with the specific file signature you are interested in. Just don't forget to choose a unique identifier for it.
For instance, all files with the Windows PE Binary signature can be extracted using the ready-made exe_mz signature:
py image_export.py --signatures exe_mz disk1.dd
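To make the idea of signature-based selection concrete, here is a simplified Python sketch. It is not how image_export is implemented: it only checks the two-byte "MZ" magic at offset 0 (the well-known start of DOS/PE executables) for files in an ordinary directory tree, and the names `has_mz_signature` and `find_mz_files` are hypothetical.

```python
from pathlib import Path

# DOS/PE executables start with the two ASCII bytes "MZ".
MZ_MAGIC = b"MZ"


def has_mz_signature(path):
    """Return True if the file starts with the MZ magic bytes."""
    try:
        with open(path, "rb") as handle:
            return handle.read(len(MZ_MAGIC)) == MZ_MAGIC
    except OSError:
        # Unreadable files (permissions, broken links) are skipped.
        return False


def find_mz_files(root):
    """Return a sorted list of files under root that carry the MZ signature."""
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and has_mz_signature(path)
    )
```

The check relies on the file content rather than the extension, which is why a renamed executable (say, one disguised as a .txt file) would still be found.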
log2timeline is the main tool created 12 years ago. It’s used to extract various events from files, directories (e.g. mount points), devices, or their images. The utility generates a file in the Plaso format for subsequent analysis.
The standard launch command looks as follows:
py log2timeline.py --storage-file output_file disk1.dd
All file types that may contain information essential for the generated timeline are located in the test_data folder, while the parsers are located in the plaso/ folder. So, if you need to modify an existing parser or develop a new one for some special file format, you know where to look.
The --process-archives option deserves a special mention. It can significantly increase the log collection time, but archives may contain important pieces of evidence forgotten by the attacker.
A great advantage of log2timeline is its integration with YARA rules. To add information about YARA rule triggers to the timeline results, you have to run the tool with the --yara_rules key.
py log2timeline.py --storage-file output_file --process-archives --yara_rules rules.yar disk1.dd
Here, the file passed to --yara_rules is a previously created file containing YARA rules.
If you obtain open-source rules (e.g. from ClamAV) and convert them into the YARA format, log2timeline may become a truly powerful tool in your hands.
pinfo displays information about the contents of a Plaso file (e.g. versions, parsers, types of events recorded in the report, their number, and errors).
A useful option is -v: it displays detailed information about the computer name, exact OS version, and users.
psort performs additional processing and converts the previously generated Plaso file into a format required for its subsequent analysis. When you run it, don't forget the --output-time-zone option; otherwise, the timestamps are output in UTC by default, and you'll have to apply the time zone offset yourself.
If you run the psort utility with the --analysis option, the selected analysis plugin gives you high-level information based on the processed events (for instance, whether any of the investigated files have ever been uploaded to VirusTotal and identified as malware). Furthermore, you can run this plugin without the risk of leaking the customer's sensitive information to the cloud because the check is performed exclusively on the hash values calculated at the data collection stage. Of course, you will need an API key for the service to use this plugin. In addition, psort supports the Viper framework, where you can create your own malware collections.
The --slice and --filter options let you select specific events within a certain interval (usually a few minutes) around the time of the event you are interested in, but their detailed descriptions are beyond the scope of this article.
Events can be converted to various formats, including CSV, XLSX, and JSON; use the -o option to select the format.
psteal combines the functionality of log2timeline and psort. Personally, I used it only once out of curiosity. The list of its options is short. Still, in certain situations this utility can be used as an express tool.
Usage
Now let’s put the above-described utilities to use.
First, collecting logs:
py log2timeline.py --storage-file "d:\Work\!Article\test_case_1.plaso" "d:\Work\!Article\dd\disk1.dd"
If Plaso finds shadow copies of volumes, it reports that it’s ready to extract all available events from them as well. Agree with all its suggestions and go have some coffee. The required time depends on the performance of your PC and the amount of data to be extracted.
If a system was compromised for a long time, and the attacker has cleared all logs on exit, sometimes you can extract plenty of information from shadow copies, even without Sysmon.
Info: According to statistics dated a couple of years ago, the average time an attacker stays in a compromised system is almost nine months!
This is why malefactors normally launch a cryptolocker when they leave an infected host. However, in most cases, this isn’t as effective as it seems at first glance.
When the utility completes its task (in my case, the log collection took about an hour), you can proceed to the analysis.
Run pinfo and see what it says about the investigated image.
py pinfo.py -v d:\Work\!Article\test_case_1.plaso
In addition to the parameters used to collect data from the image, you can see a list of all parsers and plugins that were used. If, for some reason, you don’t see in the image a certain event type that must be present there, you have to restart log2timeline and forcibly specify the parser you are interested in (such things happen, for instance, if you have to collect logs from a system in the evt format).
Also, pinfo determines the version of the installed OS and its hostname and displays information about users and paths to their home directories. This information can be useful if somebody’s home directory is stored on a remote server. In such situations, you have to separately copy the folder containing the user directory from the server and feed it to Plaso.
Overall, it took Plaso an hour to pull 1,116,040 events from the image. Looking at the pinfo output, you can already guess how many events from what categories you have to analyze and what automatic analysis modules will bring the most interesting results.
For comparison, look at the diagram of events extracted from the image where I had erased the logs using clearev. As you can see, the number of extracted winevtx events is only 3,000 lower. Judging by the difference between the two diagrams, the picture has not changed much, and the amount of evidence indicating that the system was compromised has even increased.
Let’s run psort and try to assess the results of its analysis without any extensions.
py psort.py --output-time-zone Europe/Moscow --output-format l2tcsv -w d:\Work\!Article\test_case_1.csv d:\Work\!Article\test_case_1.plaso
A CSV more than 300 MB in size may terrify an inexperienced user… But let’s see what else can be done to simplify your forensic study.
Imagine that the user has suddenly remembered and told you about a tempting offer (e.g. to get a bitcoin from Elon Musk) with a link to it that arrived by e-mail on the 25th of this month. Let's see how this information and the image_export script can help in your investigation:
$ py image_export.py --date-filter "atime,2021-08-25 09:00:00,2021-08-25 18:30:00" --signatures exe_mz -w d:\Work\!Article\extracted d:\Work\!Article\dd\disk1.dd
All of a sudden, you see an executable file with an extremely suspicious name free_bitcoins_from_Musk. in the user's Downloads directory. My advice is to start searching the timeline from this particular event using grep.
$ grep free_bitc test_case_1.csv > free_bitc.csv
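The other commands in this walkthrough are run on Windows, where grep may not be available. The same substring filter can be sketched in Python as follows; this is a minimal stand-in under my own assumptions (the function name `filter_lines` is hypothetical, and the timeline is assumed to be UTF-8 text), not a Plaso feature.

```python
def filter_lines(source_path, target_path, needle):
    """Copy every line containing needle from source_path to target_path.

    Returns the number of matching lines. The file is streamed line by
    line, so a timeline of several hundred megabytes is not a problem.
    """
    matches = 0
    # errors="replace" keeps the scan going if a line has undecodable bytes.
    with open(source_path, "r", encoding="utf-8", errors="replace") as source, \
            open(target_path, "w", encoding="utf-8") as target:
        for line in source:
            if needle in line:
                target.write(line)
                matches += 1
    return matches
```

For example, `filter_lines("test_case_1.csv", "free_bitc.csv", "free_bitc")` would produce the same selection as the grep command above. Like grep, it drops the CSV header line unless the header itself contains the substring.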
Now things look much better. The output contains 55 events, and I highlighted the 6 most interesting ones.
The list of events shows that at 16:07:44, the user has indeed downloaded a suspicious binary using Edge and launched it after a few seconds. The next event is the establishment of a network connection from the running process to the remote address 192.168.79.131:7788. And in 30 seconds, you can see an interaction between the app and the vmtoolsd.exe process.
This sequence of events is very distinct; it indicates that Meterpreter launched in the context of the process has successfully migrated into the context of the vmtoolsd.exe process in the guest OS. The last entry on the screenshot indicates that the shell was restarted at 17:08:59 at the next login attempt. In other words, the attacker managed to gain a foothold in the system using some malicious techniques, while the user didn’t notice anything suspicious.
Out of curiosity, let's look for these six events in the timeline with 'erased traces': surprisingly, all of them are in place. It's a trap!
So, now you have specific time intervals during which the system was compromised and can select events specifically for this period.
Run psort again and add the time intervals that are of interest to you using the filter. Don’t forget that the default timezone is UTC.
py psort.py --output-time-zone Europe/Moscow --output-format dynamic -w d:\Work\!Article\test_case_1_attack_time.csv -q d:\Work\!Article\test_case_1.plaso "date < '2021-08-25 14:20:00' and date > '2021-08-25 13:07:00'"
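The filter bounds are in UTC, while the events in the CSV were shown in Moscow time, so the 16:07 download corresponds to 13:07 in the filter. The conversion can be sketched as follows. The helper names are hypothetical, the local window of 16:07 to 17:20 is derived from the filter above, and a fixed UTC+3 offset is assumed for Moscow (which holds for 2021) so that the sketch doesn't depend on a time zone database.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Moscow time is a fixed UTC+3 offset (true for 2021).
MOSCOW = timezone(timedelta(hours=3), "MSK")
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def local_to_utc(local_time, local_zone=MOSCOW):
    """Convert a 'YYYY-MM-DD HH:MM:SS' local timestamp to a UTC string."""
    aware = datetime.strptime(local_time, TIME_FORMAT).replace(tzinfo=local_zone)
    return aware.astimezone(timezone.utc).strftime(TIME_FORMAT)


def build_date_filter(start_local, end_local):
    """Build a date-range filter expression with UTC bounds."""
    start = local_to_utc(start_local)
    end = local_to_utc(end_local)
    return f"date > '{start}' and date < '{end}'"


print(build_date_filter("2021-08-25 16:07:00", "2021-08-25 17:20:00"))
# prints: date > '2021-08-25 13:07:00' and date < '2021-08-25 14:20:00'
```

Generating the expression this way avoids the easy mistake of pasting local times from the CSV straight into the filter.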
After a few minutes, you can see the result.
Not bad: the file size has shrunk by a factor of 16, and the number of events by a factor of 20 (57 thousand events versus 1.1 million).
Let’s scroll this file a little bit.
Starting approximately from record 1,500 and up to record 30,000, you see the same kind of event over and over. This indicates that your adversary was traversing the disk contents using an automation tool in search of something valuable (e.g. files with the passw substring). Understanding the nature of such events, you can exclude them from the analysis, thus reducing the volume of information you have to process.
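Long runs of near-identical records like these can also be spotted without scrolling. The sketch below counts the most frequent values in one column of a CSV timeline; it is only an illustration, the function name `most_common_values` is hypothetical, and the column name you pass in (for example, a description or source column) is an assumption that has to match the header of your actual export.

```python
import csv
from collections import Counter


def most_common_values(csv_file, column, top=5):
    """Return the top most frequent values of column as (value, count) pairs.

    csv_file is any open text file (or file-like object) whose first
    line is the CSV header. Rows with an empty or missing value in the
    chosen column are ignored.
    """
    reader = csv.DictReader(csv_file)
    counts = Counter(row[column] for row in reader if row.get(column))
    return counts.most_common(top)
```

A value that accounts for tens of thousands of rows within a short time window is a good candidate for bulk automated activity, and filtering it out leaves a much smaller set of events to read by hand.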
Further analysis involves filtering events by category and carefully examining them in chains. As a homework assignment, you can play around with the tagging plugin for psort.
Conclusions
As you can see, the utility described in this article is very efficient; it enables you to get on the track and describe the attacker’s adventures in the system under investigation. Even if the malefactor is cautious and tries to cover all tracks, you may still find some evidence in logs and archives.
In the next article, I will discuss more high-level tools used to analyze timelines generated by Plaso.