How to fool an MSI installer: Instructions for lazy hackers

To run a program, you must install it first. But what if the installer doesn’t want to start, or even worse, refuses to install the app? In that situation, you have no choice but to hack it. Today, I will show how to do this easily, quickly, and effectively.

Some time ago, I was asked to reinstall an old program. It was a 100% legitimate piece of software, officially purchased a long time ago. The program had worked fine, but then the need to reinstall it arose. In the course of the reinstallation, it turned out that: (1) it could not be installed on modern versions of Windows; and (2) one of the serial numbers required for the installation (to be specific, the license code for a small auxiliary library critically important for the program) was invalid. After so many years, it was physically impossible to contact the developer. So, I decided to examine the installer and find out what the hell it needed for installation.

The main principle of a lazy hacker is to follow the path of least resistance. In strict adherence to it, I opened the installation package (a separate .exe module) in a Hex editor (yes, not in a disassembler) to read the error message. I wasn’t surprised or disappointed to find out that such a string does not exist – neither in the Unicode nor in the standard format.
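The same search can be done without a Hex editor. Below is a minimal sketch, assuming that "Unicode" means UTF-16LE and that the "standard format" is plain single-byte ASCII; the function name and the sample buffer are made up for illustration.

```python
def find_text(blob: bytes, text: str) -> dict:
    """Return the offsets of `text` in `blob` for each candidate encoding."""
    hits = {}
    for label, codec in (("ascii", "ascii"), ("utf-16le", "utf-16-le")):
        needle = text.encode(codec)
        offsets = []
        pos = blob.find(needle)
        while pos != -1:
            offsets.append(pos)
            pos = blob.find(needle, pos + 1)
        hits[label] = offsets
    return hits


# Synthetic buffer: an ASCII signature followed by a UTF-16LE word.
sample = b"\x00\x01InstallShield\x00" + "invalid".encode("utf-16-le")
print(find_text(sample, "InstallShield"))
print(find_text(sample, "invalid"))
```

For a real installer, read the file with `open(path, "rb").read()` and pass a fragment of the error message. Empty lists for both encodings mean what they meant for me: the text is stored somewhere else, or in a packed or encoded form.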

But what I found in the code was the [0] resource with a characteristic signature: InstallShield. Parsing InstallShield packages is a pretty trivial task, even though, for obvious reasons, it’s not documented officially. If necessary, all the required information can be found on special forums.

The window with the error message claiming that the serial number is invalid remained open, and I decided to attach to the running installation process using OllyDbg. To be honest, I totally forgot how to debug the InstallShield interpreter; the only thing I remembered was that it’s a very boring and time-consuming task. In fact, I didn’t even attach after seeing an unpronounceable mess of hexadecimal letters and numbers in the list of processes available for attaching (under the window title with the warning MSIEXEC -Embedded).

Of course, I could decrypt this mess to find out the name of the subdirectory in the Temp folder from where the installation script is interpreted, but why overcomplicate simple things? I opened the Temp folder and saw three items in it:

  1. The MSI package created by the installer in the course of the installation;
  2. A temporary folder containing a number of DLLs typical for InstallShield; and 
  3. Most importantly: the setup.inx installation script.

Time to share a big secret: in this particular case, the debugger is not needed at all. I ran it just in case and didn’t really use it (see below). A much faster and, importantly, more accurate way is to track the file I/O of the installation process. This can be done with a simple and popular utility called Process Monitor (FileMon on older versions of Windows). Analyzing its pretty heavy log makes it possible to determine the folder, the script, and the time when the MSI package was created.
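Scrolling through such a log by hand is tedious, so here is a minimal sketch of how it could be filtered. It assumes the log was exported to CSV with the columns Time of Day, Process Name, PID, Operation, Path, Result and Detail; the sample rows, paths and process names are hypothetical.

```python
import csv
import io


def first_writes(csv_text, suffixes=(".msi", ".inx")):
    """Return {path: row} for the first WriteFile event on each matching path."""
    firsts = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        path = row["Path"]
        if row["Operation"] == "WriteFile" and path.lower().endswith(suffixes):
            firsts.setdefault(path, row)  # keep only the earliest write
    return firsts


# Hypothetical log excerpt; real paths and process names will differ.
SAMPLE_LOG = r'''"Time of Day","Process Name","PID","Operation","Path","Result","Detail"
"10:00:01","setup.exe","100","ReadFile","C:\Temp\pkg.msi","SUCCESS","Offset: 0"
"10:00:02","setup.exe","100","WriteFile","C:\Temp\pkg.msi","SUCCESS","Offset: 0"
"10:00:05","msiexec.exe","200","WriteFile","C:\Temp\x\setup.inx","SUCCESS","Offset: 0"
"10:00:06","msiexec.exe","200","WriteFile","C:\Temp\x\setup.inx","SUCCESS","Offset: 4096"
'''

for path, row in first_writes(SAMPLE_LOG).items():
    print(row["Time of Day"], row["Process Name"], path)
```

The first write to each file marks the moment it is created, and the process name next to it shows which stage of the installation produced it.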

A more detailed analysis showed that the process, for some reason, consists of three stages: the EXE gives birth to an MSI, which, in turn, gives birth to another MSI, which actually produces the contents of the target subdirectory containing the installation script. To summarize: I now had a clean MSI package equivalent to the installation EXE, and the installation script setup.inx that contains all the installation logic.

My task became clear. After making sure that MSI is 100% equivalent to the original installer (which acts just as a wrapper), I started examining it. The MSI format is well-documented (including Microsoft manuals), but, as said before, my main principle is laziness. I have no time to read obscure technical descriptions full of inaccuracies and incomprehensible terms – I need the result immediately and with minimal effort.

Therefore, I used the 7z archiver to open the MSI and saw plenty of interesting stuff there, including a .cab file that contains all the files and libraries. If, by some coincidence, the serial numbers and license codes are correct, there is a chance to unpack them. If my task were just to find the missing file (i.e. not install the program 100% correctly), I could stop examining the package at this point.
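An archiver can open an MSI because the package is a compound (structured storage) file, and the .cab inside it is an ordinary Microsoft cabinet. A minimal sketch that checks both signatures is shown below; the function name and the synthetic buffer are mine, and a signature hit only tells where a cabinet starts, because a stream inside a compound file is not guaranteed to be stored contiguously.

```python
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # compound file header signature
CAB_MAGIC = b"MSCF"                            # Microsoft cabinet signature


def describe_container(blob: bytes) -> dict:
    """Report whether `blob` looks like a compound file and where cabinets begin."""
    return {
        "is_compound_file": blob.startswith(OLE_MAGIC),
        "cab_offsets": [i for i in range(len(blob)) if blob.startswith(CAB_MAGIC, i)],
    }


# Synthetic stand-in for a package: header, padding, then a cabinet signature.
fake_msi = OLE_MAGIC + b"\x00" * 8 + CAB_MAGIC + b"\x00" * 4
print(describe_container(fake_msi))
```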

But despite my laziness, I always get the job done: the package must be legitimately installed, including all the paths, registry keys, registration of controls, and other operations performed by the installer. Therefore, I had no choice but to get through the most unpleasant stage of the process. The problem was that I couldn’t find the installation script setup.inx in the list of unpacked files; accordingly, it was impossible to edit it directly in the installation package. With a heavy heart, I started examining the installation script.

After a quick glance at it, my mood got even worse: no readable code, just a mess of pseudo-random characters. No setup messages, no text lines, and not a single readable word at all. The script was definitely encoded, and I needed special utilities to decode and decompile it.

Thanks to almighty Google: some good people have already taken care of this and created a wonderful utility called IsDcc (its source code is available on GitHub). Not only does it decrypt encoded scripts (this is called scramble/unscramble), but it also decompiles a decrypted script into readable source code! Too bad the program cannot compile it back, but another principle of mine is to deal with problems as they come. After decoding and decompiling the installation script, I got slightly more than a megabyte of BASIC-like code and found in it the section responsible for checking the serial number:

00E8EE:000D: lNumber18 = lString9 == lString17;
00E8FB:000D: lNumber18 = lNumber18 == 0;
00E90A:0004: if lNumber18 == false then goto label198 ;
00E916:0021: call function651("Sorry, but the license key that you entered is invalid.");

As you can see, the script computes the valid key and compares it with the one entered by the user. My first idea was to analyze the generation algorithm and write a key generator. However, after taking a look at the algorithm (several pages of obscure unstructured code), I dismissed this idea as contradicting the lazy hacking concept. Instead, I decided to edit one of the three checks by inverting its condition.
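To see why inverting a single condition is enough, here is a minimal model of the fragment in Python. It is a sketch, not the interpreter's real semantics: the function and parameter names are made up, and I only assume that the two strings being compared are the computed key and the entered one.

```python
def key_rejected(entered: str, expected: str, patched: bool = False) -> bool:
    """Model of the four decompiled lines; True means the error message is shown."""
    # 00E8EE: lNumber18 = lString9 == lString17
    matches = entered == expected
    # 00E8FB: lNumber18 = lNumber18 == 0   (after the patch: lNumber18 != 0)
    flag = (matches != 0) if patched else (matches == 0)
    # 00E90A: if lNumber18 == false then goto label198  (jumps over the error)
    # 00E916: call function651("Sorry, but the license key ...")
    return bool(flag)


print(key_rejected("random", "valid-key"))                # original, wrong key
print(key_rejected("random", "valid-key", patched=True))  # patched, wrong key
```

In this model the patch has a side effect: a correct key would now be rejected. That didn't bother me, since I had no correct key anyway.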

But how to do this with minimal effort? I don’t know the command system of the interpreter. Of course, I could review the IsDcc documentation (the source code is well-commented), but this approach is too boring for me, especially since I noticed an interesting feature of the decompiled code: a pair of hexadecimal numbers on the left.

Decompiled installation script code

This is not a segmented address as one might think, but an offset:opcode pair. A quick review of several adjacent commands made it possible to assume that D is the equality comparison opcode, while its antagonist (i.e. the inequality comparison) has the E opcode. Important: such a conclusion didn't require me to painstakingly research the command system. So, all I had to do was open the compiled script in a Hex editor and change D to E at the desired offset: after that, any incorrect code entered by the user would be accepted as correct, which is exactly what I needed! I scrambled the resulting setup.inx back and moved on to the next problem.
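The Hex editor step can be scripted as well. The sketch below assumes that the opcode byte sits exactly at the offset printed in the listing; the helper name and the six-byte buffer are made up, and the check against the expected value is there so that a wrong guess about the offset fails loudly instead of corrupting the script.

```python
def patch_byte(data: bytes, offset: int, expected: int, replacement: int) -> bytes:
    """Return a copy of `data` with one byte replaced, refusing to patch blindly."""
    if data[offset] != expected:
        raise ValueError(
            f"byte at {offset:#x} is {data[offset]:#04x}, not {expected:#04x}"
        )
    patched = bytearray(data)
    patched[offset] = replacement
    return bytes(patched)


# Synthetic stand-in for the unscrambled script: 0x0D sits at offset 2.
script = bytes([0x21, 0x00, 0x0D, 0x00, 0x04, 0x00])
patched = patch_byte(script, offset=2, expected=0x0D, replacement=0x0E)
print(patched.hex(" "))
```

For the real file, the call would take the contents of the unscrambled setup.inx and the offset from the listing, in my case 0xE8FB.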

As you remember, there is nothing resembling setup.inx among the files unpacked from the MSI package. Furthermore, searches for fragments of the scrambled code find nothing. But the script is definitely present in the MSI file: it cannot come out of thin air! A search inside the MSI installation package turns up its name: setup.inx. This indicates that the script is there, perhaps in an encrypted or packed form.

Because I don’t want to spend time reviewing the MSI specifications, I try a ‘lazy’ approach again. This time, I am going to use a miraculous utility called FileMon. I take the log of the installation package’s file system calls and find in it the ‘birthplace’ of the file setup.inx (i.e. its creation in a subdirectory inside the Temp folder and the writing of the first data block to it). I am lucky: right before this entry, I see the reading of a data block of the same size from the parent file (i.e. the MSI package) and even the offset of this block.

I check the data located at this offset in the parent MSI package: they have nothing to do with the content of setup.inx. However, the name setup.inx is right before this data block, and I also see there four bytes setting the size of the file setup.inx: apparently, this is the header of an encrypted block. This indicates that another milestone has been passed: I found the source of the script data.
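Such a header can also be located without the file system log, if the size of the extracted setup.inx is known. The sketch below rests on assumptions I didn't verify against any specification: the name is stored as plain bytes, the size is a 4-byte little-endian integer, and it follows the name within a few bytes. All names and the sample buffer are hypothetical.

```python
import struct


def find_named_block(container: bytes, name: bytes, size: int, window: int = 16):
    """Return (name_offset, size_offset) pairs where the size field follows the name."""
    size_field = struct.pack("<I", size)  # assumed: 32-bit little-endian
    results = []
    pos = container.find(name)
    while pos != -1:
        tail = pos + len(name)
        size_pos = container.find(size_field, tail, tail + window)
        if size_pos != -1:
            results.append((pos, size_pos))
        pos = container.find(name, pos + 1)
    return results


# Synthetic container: junk, the name, one padding byte, the size, then data.
container = b"junk" + b"setup.inx" + b"\x00" + struct.pack("<I", 1234) + b"payload"
print(find_named_block(container, b"setup.inx", 1234))
```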

The source of the script data has been found!

One might think that I am deadlocked: the data are encrypted, I don’t know the encryption algorithm, and accordingly, cannot edit them. Time to start writing a key generator or reviewing the MSI specifications… If the data were encrypted with a serious ‘adult’ cryptoalgorithm, I would do so, because it’s impossible to identify such an algorithm and its key by trial and error. But I don’t give up: I assume that the developers were as lazy as I am and encrypted the data with a trivial XOR cipher.

A quick examination of adjacent encrypted data in other similar files indirectly confirms my assumption: they contain blocks filled with the same repeating patterns, sometimes broken into several bytes, sometimes interrupted, as if the file had simply been XORed in short blocks of 8 to 12 bytes. Interestingly, the length of such a pattern is equal to the length of the encrypted file’s name, but right now, this is irrelevant to my purposes.
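The eyeballing of repeating patterns can be turned into numbers. The sketch below scores every candidate period by how often a byte equals the byte that many positions later; a repeating XOR key over runs of identical plaintext bytes produces a sharp peak at the key length. The function name, the 16-byte limit, and the demo key are my assumptions; the real key, if there is one, is unknown to me.

```python
def score_periods(blob: bytes, max_period: int = 16):
    """Return (period, share of positions where blob[i] == blob[i + period]), best first."""
    scores = []
    for period in range(1, max_period + 1):
        pairs = len(blob) - period
        if pairs <= 0:
            break
        same = sum(blob[i] == blob[i + period] for i in range(pairs))
        scores.append((period, same / pairs))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Demo: a run of zero bytes XORed with a hypothetical 9-byte key.
key = b"setup.inx"
plain = bytes(len(key) * 20)
cipher = bytes(b ^ key[i % len(key)] for i, b in enumerate(plain))
print(score_periods(cipher)[0])
```

On real data the peak will be less clean, because the plaintext isn't all zeros, but a period that clearly stands out from its neighbors is a good hint.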

If my assumption is correct, then, thanks to the commutativity and associativity of the XOR operation, I can patch a byte in the encrypted data by XORing it with the exclusive difference between the original and the patched byte. To find this difference, I compare the two scrambled setup.inx files: the original and the patched one. The exclusive difference is equal to four, and it is located in the edited byte at offset E8FB.
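The reasoning is easy to check on a toy example: if c = p XOR k, then c XOR (p XOR p') = p' XOR k, so the key is never needed. The sketch below uses a made-up three-byte key and made-up data; it demonstrates the property for a pure XOR cipher and says nothing about what the installer really does.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR `data` with a repeating `key` (toy cipher, for illustration only)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def xor_diff(original: bytes, patched: bytes):
    """List (offset, original_byte XOR patched_byte) for every differing byte."""
    if len(original) != len(patched):
        raise ValueError("files must have the same length")
    return [(i, a ^ b) for i, (a, b) in enumerate(zip(original, patched)) if a != b]


key = b"\xAA\xBB\xCC"                      # hypothetical key
plain = bytes([0x10, 0x0D, 0x20])
plain_patched = bytes([0x10, 0x09, 0x20])  # 0x0D XOR 0x04

diff = xor_diff(plain, plain_patched)      # a single entry: offset 1, mask 4
cipher = bytearray(xor_bytes(plain, key))
for offset, mask in diff:
    cipher[offset] ^= mask                 # patch the ciphertext directly

print(diff, bytes(cipher) == xor_bytes(plain_patched, key))
```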

Searching for the exclusive difference

I apply the offset E8FB to the supposed beginning of data pertaining to the file setup.inx inside the MSI package and then XOR the byte at this offset with the value of four. Then I launch MSI… Oops, the trick fails, and the installation freezes. It looks like my lazy hypothesis was wrong, and the file is encrypted with something serious, not the trivial XOR.

Fiasco: installation failed

If I weren’t that lazy, I would abandon further attempts at this point and start writing a key generator. But I am determined to find out what my mistake was. I extract the newly created setup.inx and compare it with the original one: what has changed in the file after the editing? It turns out that not all is lost: the difference is in a single byte. However, for some reason, the offset is not E8FB, and it is the opposite nibble that was XORed with the value of four. So, I adjust the offset and swap the nibbles: instead of 0x04, I XOR the byte with 0x40. And finally, my efforts are crowned with success: the altered installation package gladly accepts a randomly entered code and smoothly installs the libraries on the computer. If I were a perfectionist, I could go even further and find the ‘secondary’ MSI package in the original .exe installer… but I’m too lazy for that. After all, the problem is solved, and the experiment results in a complete and unconditional victory.
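The correction itself is a one-liner. Below is a minimal sketch of the two helpers involved; the names and the three-byte buffer are made up, and the adjusted offset has to come from comparing the extracted files, as described above.

```python
def swap_nibbles(value: int) -> int:
    """Exchange the high and low 4-bit halves of a byte."""
    return ((value & 0x0F) << 4) | ((value >> 4) & 0x0F)


def xor_at(data: bytes, offset: int, mask: int) -> bytes:
    """Return a copy of `data` with the byte at `offset` XORed with `mask`."""
    out = bytearray(data)
    out[offset] ^= mask
    return bytes(out)


mask = swap_nibbles(0x04)          # the difference found earlier, nibbles swapped
blob = bytes([0x12, 0x34, 0x56])   # synthetic stand-in for the MSI contents
print(hex(mask), xor_at(blob, 1, mask).hex())
```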

One might say: “Dude, you are teaching us bad things: the above-described process is full of weak points, and your assumptions won’t work in serious cases.” I don’t argue, but you’ll be surprised to see how many situations in real life can be resolved using simple and available means (i.e. without employing a debugger and a disassembler). Which, of course, does not relieve you from the need to study the subject. Good luck!

