I strongly recommend reading “The IDA Pro Book” by Chris Eagle. Its API coverage is outdated, but it still answers most questions.
Plugin in C++
The first way to extend IDA’s capabilities involves compiled DLLs. To build a plugin, you need the C++ SDK. To download it legally, you have to purchase an IDA Pro license (illegal ways are beyond the scope of this article).
Let’s build and run a test plugin in Visual Studio 2019 (or newer). Create an empty Windows project (not a console one) and set the configuration type to dynamic library. In Additional Include Directories, specify the path to the folder containing the SDK headers: idasdk90\. You can simply drag the included *. files from idasdk90\ to the list of project files, and the compiler will pick them up from there.
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include <ida.hpp>
#include <idp.hpp>
#include <loader.hpp>
#include <funcs.hpp>

class MyPlugmod : public plugmod_t
{
public:
    MyPlugmod() { msg("MyPlugmod: Constructor called.\n"); }
    virtual ~MyPlugmod() { msg("MyPlugmod: Destructor called.\n"); }

    // Called every time the plugin is invoked from the menu or by the hotkey
    virtual bool idaapi run(size_t arg) override
    {
        // Cast explicitly so that the argument matches the %d specifier
        msg("MyPlugmod.run() called with arg: %d\n", (int)arg);
        size_t n = get_func_qty();
        for (size_t i = 0; i < n; i++)
        {
            func_t* pfn = getn_func(i);
            if (pfn == nullptr)
                continue;
            qstring name;
            get_func_name(&name, pfn->start_ea);
            msg("Function %s at address 0x%llX\n",
                name.length() ? name.c_str() : "-UNK-",
                (unsigned long long)pfn->start_ea);
        }
        return true;
    }
};

static plugmod_t* idaapi init(void)
{
    return new MyPlugmod();
}

static const char comment[] = "The plugin displays a list of functions along with the addresses.";
static const char help[] = "Information is not provided";
static const char wanted_name[] = "List functions";
static const char wanted_hotkey[] = "Ctrl+Q";

plugin_t PLUGIN =
{
    IDP_INTERFACE_VERSION,
    PLUGIN_MULTI,   // flags
    init,           // initialize
    nullptr,        // terminate
    nullptr,        // invoke the plugin
    comment,
    help,
    wanted_name,
    wanted_hotkey
};

The architecture is standard. The DLL exports the PLUGIN structure, which contains the plugin description, the desired hotkey, and pointers to callbacks. The init function returns a pointer to an instance of the MyPlugmod class. This class implements run, which is called when the plugin is invoked. The callback gets the number of functions recognized by IDA using get_func_qty, retrieves each function with getn_func, and obtains its name with get_func_name along with its start address. After that, it prints the collected information to the output window.
Copy the built DLL to the plugins folder inside the IDA installation directory. A new item, “List functions”, will appear in the Edit → Plugins menu. Alternatively, the plugin can be invoked using the Ctrl+Q hotkey specified in it.
Compatibility issues
The key problem with plugins is the extremely poor compatibility between older and newer SDK versions. A plugin compiled against one version most probably won’t load in even a slightly newer IDA release, and significant effort may be required to rebuild it with the new SDK. In other words, you have to modify and rebuild your plugin for each IDA Pro release. Of course, hardly anyone does this. As a result, many plugins have become unusable over time.
This approach is commercially motivated. The SDK provides extremely scant documentation (basically only the names of functions and their arguments). To find out how a specific API function works, you have to search for someone else’s code or contact support, which costs money. It can be said that poor compatibility is a kind of protection against piracy. Unfortunately, the main victim of this approach is the end user.
IDC
Support for scripts written in the IDC language was introduced with the second version of IDA Pro in 1994.
IDC is a C-like language without strong typing. But unlike C, it doesn’t have references: all arguments are passed by value. The language supports most C expressions except for +=. User-defined functions are declared using the static keyword. Library functions are documented, but provide less control compared to the C++ SDK. There are exceptions, classes, and syntactic sugar (e.g. concatenating two strings with + and getting a substring using str[ slicing).

#include <idc.idc>

static main()
{
    auto ea, x;
    // Walk through all functions until the API reports that there are no more
    for ( ea = get_next_func(0); ea != BADADDR; ea = get_next_func(ea) )
    {
        msg("Function at %08lX: %s", ea, get_func_name(ea));
        x = get_func_flags(ea);
        if ( x & FUNC_NORET )
            msg(" Noret");
        if ( x & FUNC_FAR )
            msg(" Far");
        msg("\n");
    }
}

The code is similar to the example provided in the previous section. It gets the ea (effective address) of the first function via get_next_func and keeps requesting the addresses of subsequent functions in a loop until the API returns the BADADDR constant. The function name is returned by get_func_name; flags with metadata are received using get_func_flags. The same msg function prints data to the output window.
The main advantage of IDC scripts is their out-of-the-box support. In all other respects, they are hopelessly outdated. Previously, all noteworthy solutions were written with the C++ SDK, but today most plugins and one-off scripts available in public repositories are written in Python.
IDAPython
This plugin integrates Python into IDA Pro. Its first version was released in 2004. Its author Dyce (Gergely Erdelyi) developed IDAPython at the expense of his then-employer, the Finnish company F-Secure that produces an antivirus of the same name. In the mid-2010s, due to lack of time, the project was transferred to Hex-Rays. IDAPython is still maintained by the Hex-Rays employee 0xeb (Elias Bachaalany). Starting with version 5.4, the plugin became part of IDA Pro.
Today, IDAPython is the main language used to write new plugins, and old ones are being ported to it. The above-mentioned backward compatibility issues have also contributed to its popularity. In addition, IDAPython is well-documented.
IDAPython code makes it possible to perform the same operations as the C++ SDK. It’s easy to notice that the function names are the same. In fact, IDAPython is a thin and handy wrapper for low-level APIs. Therefore, in IDAPython you can write loaders for unknown file formats or plugins that create their own windows in the GUI.
import idaapi
import idautils
import idc

class ListFunctionsPlugin(idaapi.plugin_t):
    flags = idaapi.PLUGIN_UNL
    comment = "The plugin displays a list of functions along with the addresses."
    help = "Information is not provided"
    wanted_name = "List Functions v2"
    wanted_hotkey = "Alt-F8"

    def init(self):
        print("[ListFunctionsPlugin] Constructor called.")
        return idaapi.PLUGIN_OK

    def run(self, arg):
        # idautils.Functions() yields the start address of every function
        for func_ea in idautils.Functions():
            func_name = idc.get_func_name(func_ea)
            print(f"0x{func_ea:08X}: {func_name}")

    def term(self):
        print("[ListFunctionsPlugin] Destructor called.")

def PLUGIN_ENTRY():
    return ListFunctionsPlugin()

As before, the example is a plugin that displays the list of functions. Save the code as a .py file in the plugins folder of the IDA installation, restart IDA, and you’ll get a new menu item: List Functions v2. Alternatively, the plugin can be started by pressing Alt-F8.
This is a complete analogue of the C++ plugin; the only difference is that Python features handy wrappers from the idautils library that provide the list of function addresses. The name get_func_name is familiar from the previous example. Instead of the PLUGIN structure, class fields inherited from idaapi. are used. The class has three predefined methods: init (the constructor), term (the destructor), and run (the code that is executed when the plugin is invoked).
Writing scripts in IDAPython
Below are a few examples illustrating how you can make your life easier with IDAPython. Scripts from disk are executed via File → Script File; short scripts from the clipboard can be executed via File → Script Command.
Highlighting CALL
import idautils
import idc

CALL_COLOR = 0xFFDDCC  # BBGGRR: pale blue

for seg_ea in idautils.Segments():
    for head in idautils.Heads(seg_ea, idc.get_segm_end(seg_ea)):
        # Skip data items; only instructions have a mnemonic
        if idc.is_code(idc.get_full_flags(head)):
            mnem = idc.print_insn_mnem(head)
            if mnem.lower() == "call":
                idc.set_color(head, idc.CIC_ITEM, CALL_COLOR)

Run the script, and you’ll see that all lines containing the CALL instruction are highlighted in pale blue.
The idautils. function returns a list of segments: addresses of the beginning of each section in PE32. Then idautils. returns all elements inside the designated addresses. This can be either code or data. The script checks whether the element is a piece of code using idc.. Then it gets the mnemonic of the instruction using idc. and compares it with the desired CALL. And finally, idc. highlights the element in the specified color. The colors are specified in the BBGGRR format (e.g. to make it pure blue, you have to specify FF0000).
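Since the BBGGRR byte order is the reverse of the familiar web notation, a small conversion helper can prevent mistakes. The sketch below is plain Python that runs outside IDA; the names rgb_to_ida_color and web_hex_to_ida_color are made up for this illustration and are not part of any IDA API.

```python
def rgb_to_ida_color(red: int, green: int, blue: int) -> int:
    """Pack 8-bit RGB components into the 0xBBGGRR layout."""
    for component in (red, green, blue):
        if not 0 <= component <= 0xFF:
            raise ValueError("each component must be in the range 0..255")
    return (blue << 16) | (green << 8) | red


def web_hex_to_ida_color(rrggbb: str) -> int:
    """Convert a web-style 'RRGGBB' string (hypothetical input format)."""
    value = int(rrggbb.lstrip("#"), 16)
    return rgb_to_ida_color((value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF)


print(hex(rgb_to_ida_color(0, 0, 255)))      # pure blue -> 0xff0000
print(hex(web_hex_to_ida_color("#CCDDFF")))  # the script's pale blue -> 0xffddcc
```

The resulting integer can be used as the value of a constant such as CALL_COLOR.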
Substituting WinAPI results
Let’s try to bypass simple anti-debugging. For example, the program under examination compares the IsDebuggerPresent result with 1. An obvious solution is to erase the BeingDebugged flag from the PEB (Process Environment Block), but to make this example more interesting, let’s try to substitute the WinAPI result on the fly.
import idc
import idaapi

IAT_NAME = "__imp__IsDebuggerPresent@0"
RETURN_VALUE = 0
global_hook = None  # must stay global, otherwise the hook object is destroyed

class ExitHook(idaapi.DBG_Hooks):
    def __init__(self, target_addr):
        super().__init__()
        self.target_addr = target_addr
        self.ret_addr = None

    def dbg_bpt(self, tid, ea):
        print(f"[+] Breakpoint hit at 0x{ea:X}")
        if ea == self.target_addr:
            # At the function entry, the return address is on top of the stack
            esp = idc.get_reg_value("esp")
            print(f"[+] ESP: 0x{esp:X}")
            self.ret_addr = idc.get_wide_dword(esp)
            print(f"[+] Captured return address: 0x{self.ret_addr:X}")
            idc.add_bpt(self.ret_addr)
        elif self.ret_addr and ea == self.ret_addr:
            print(f"[+] Return point hit. Overwriting EAX with {RETURN_VALUE}")
            idc.set_reg_value(RETURN_VALUE, "EAX")
            idc.del_bpt(self.ret_addr)
            self.ret_addr = None
        return 0

def setup_hook():
    global global_hook
    imp_ptr = idc.get_name_ea_simple(IAT_NAME)
    if imp_ptr == idc.BADADDR:
        print(f"[-] Import {IAT_NAME} not found.")
        return
    # The IAT slot holds the address of the real WinAPI function
    target_addr = idc.get_wide_dword(imp_ptr)
    print(f"[+] Real {IAT_NAME} address: 0x{target_addr:X}")
    idc.add_bpt(target_addr)
    global_hook = ExitHook(target_addr)
    global_hook.hook()
    print("[+] Hook installed. Start or resume process (F9).")

setup_hook()

The easiest way is to use idc. to get a reference to IAT_NAME: four bytes in the import section where the loader will write the target address of the called WinAPI function. Next, the script reads the current address of the function using idc. and sets a breakpoint on it using idc..
The ExitHook class that inherits idaapi. is used to create a hook. This is a modern way to set hooks on various debug events, including breakpoints. The trick is that the class instance must be global; if you create it locally in setup_hook, the garbage collector will delete it at the moment the function terminates. IDA won’t say anything, but the set hook won’t work.
In the dbg_bpt breakpoint handler, the script checks whether it has stopped at the address of the intercepted WinAPI function, and reads the return address from the top of the stack. Then it sets a second breakpoint on that address. When this breakpoint is triggered, the script replaces the value of the EAX register with zero and deletes the breakpoint that is no longer required.

#include <windows.h>

int APIENTRY wWinMain(_In_ HINSTANCE hInstance, _In_opt_ HINSTANCE hPrevInstance, _In_ LPWSTR lpCmdLine, _In_ int nCmdShow)
{
    if (IsDebuggerPresent())
    {
        MessageBoxA(0, "debugger", "!", 0);
    }
    return 0;
}

Now you can run the script at the start of the test application, and voilà! IsDebuggerPresent returns zero, and the message isn’t displayed.
Searching for paths to unsafe functions
If the import table contains functions that can potentially cause a buffer overflow, it’s worth checking whether user data can be passed to them. To do this, you have to build a graph by traversing from a given function to all functions that call it, then to all functions that call them, and so on up to the topmost level.
import idautils
import idc
import idaapi
import ida_funcs
from collections import defaultdict

IMPORT_NAME = "lstrcpyW"
callers_map = defaultdict(set)  # callee address -> set of caller function addresses
calls_to_func = []

def get_func_name(ea):
    return idc.get_func_name(ea) or f"sub_{ea:X}"

def get_func_start_ea(ea):
    f = ida_funcs.get_func(ea)
    return f.start_ea if f else ea

def find_import_address():
    ea = idc.get_name_ea_simple(IMPORT_NAME)
    if ea == idc.BADADDR:
        print(f"[-] Import {IMPORT_NAME} not found.")
        return None
    print(f"[+] Found {IMPORT_NAME} import at: 0x{ea:X}")
    return ea

def find_calls_to_import(imp_addr):
    print("[*] Scanning for calls to imported function...")
    for func_ea in idautils.Functions():
        for insn_ea in idautils.FuncItems(func_ea):
            if idc.print_insn_mnem(insn_ea).lower() != "call":
                continue
            # Calls to imports go through a memory operand (the IAT slot)
            op_type = idc.get_operand_type(insn_ea, 0)
            if op_type in [idc.o_mem, idc.o_displ]:
                target = idc.get_operand_value(insn_ea, 0)
                if target == imp_addr:
                    calls_to_func.append(func_ea)
                    callers_map[imp_addr].add(func_ea)
    print(f"[+] Found {len(calls_to_func)} direct callers of {IMPORT_NAME}")

def build_call_graph():
    print("[*] Building global call graph...")
    for func_ea in idautils.Functions():
        for insn_ea in idautils.FuncItems(func_ea):
            if idc.print_insn_mnem(insn_ea).lower() != "call":
                continue
            target = idc.get_operand_value(insn_ea, 0)
            if ida_funcs.get_func(target):
                callers_map[target].add(func_ea)

def build_paths(target_ea, path=None, visited=None):
    if path is None:
        path = []
    if visited is None:
        visited = set()
    path = [target_ea] + path
    visited.add(target_ea)
    if target_ea not in callers_map or not callers_map[target_ea]:
        # Nobody calls this function: the path is complete
        yield path
    else:
        for caller in callers_map[target_ea]:
            if caller not in visited:
                yield from build_paths(caller, path, visited.copy())

def print_path_tree(path):
    for depth, ea in enumerate(path):
        print(" " * depth + f"- {get_func_name(ea)}")

def main():
    imp_addr = find_import_address()
    if not imp_addr:
        return
    find_calls_to_import(imp_addr)
    build_call_graph()
    print(f"\n[+] All unique call paths to {IMPORT_NAME}:\n")
    seen_paths = set()
    for caller in calls_to_func:
        for path in build_paths(caller):
            norm_path = tuple(get_func_start_ea(ea) for ea in path)
            if norm_path not in seen_paths:
                seen_paths.add(norm_path)
                print_path_tree(path)
    if not seen_paths:
        print("[-] No paths found.")

main()

Let me briefly explain what’s going on. The find_calls_to_import function starts the call map by recording every function that calls lstrcpyW. Then build_call_graph completes this map by going through all functions and collecting all calls. Finally, build_paths builds all possible paths to the desired address based on the collected data. The top-level code filters the resulting paths to output only unique combinations.
The script prints its progress messages and then each unique call path as an indented tree, with one “- function name” line per call level.
There are thirteen unique calls, but all of them are made from three functions.
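To see how the path enumeration behaves without a loaded database, the same idea can be tried on a toy graph. The sketch below is plain Python; the callee-to-callers map and all function names in it are invented for illustration, and the traversal follows the logic of build_paths but uses immutable tuples and frozensets instead of copying a set.

```python
# Hypothetical callee -> callers map; the names are made up for illustration.
callers_map = {
    "lstrcpyW": {"copy_name", "copy_path"},
    "copy_name": {"parse_input"},
    "copy_path": {"parse_input"},
    "parse_input": {"main"},
}


def build_paths(target, path=(), visited=frozenset()):
    """Yield every caller chain that ends at target, topmost caller first."""
    path = (target,) + path
    visited = visited | {target}
    callers = callers_map.get(target, ())
    if not callers:
        # Nobody calls this function: the chain is complete
        yield path
    else:
        for caller in sorted(callers):
            if caller not in visited:  # do not walk around a cycle twice
                yield from build_paths(caller, path, visited)


for chain in build_paths("lstrcpyW"):
    print(" -> ".join(chain))
```

Here two distinct chains lead from main to lstrcpyW, one through copy_name and one through copy_path. Sorting the callers only makes the output order deterministic.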
Applying scripts to multiple targets
IDA Pro can be controlled from the command line. All you have to do is run it in headless mode (i.e. without displaying graphics) by executing idat64. instead of ida64.:
idat64.
The -A switch starts IDA in autonomous mode without dialog boxes. The -S switch specifies the path to the script. The path to the analyzed file comes last.

import ida_auto
import idc

# Wait until the initial auto-analysis is finished
ida_auto.auto_wait()

with open("result.txt", "w") as f:
    f.write("something")

# Close IDA with exit code 0
idc.qexit(0)

The executed script must wait for the analysis to complete; ida_auto. is used for this purpose. The script output is written to an external file. After that, IDA terminates.
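To process many files in one go, the command line can be assembled and launched from an ordinary Python script outside IDA. The sketch below is a minimal illustration under stated assumptions: the helper names, the "*.exe" pattern, and the sample file names are hypothetical, and the path to the headless IDA executable must be supplied by the caller. Only the -A and -S switches described above are used.

```python
import subprocess
from pathlib import Path


def build_idat_command(idat_path, script_path, target_path):
    """Compose the headless command line: -A, then -S<script>, then the target."""
    return [str(idat_path), "-A", f"-S{script_path}", str(target_path)]


def run_batch(idat_path, script_path, targets_dir, pattern="*.exe"):
    """Run the script against every matching file and collect exit codes."""
    exit_codes = {}
    for target in sorted(Path(targets_dir).glob(pattern)):
        command = build_idat_command(idat_path, script_path, target)
        exit_codes[target.name] = subprocess.run(command).returncode
    return exit_codes


# Only the command construction is exercised here; run_batch needs a real IDA.
print(build_idat_command("idat", "collect.py", "sample.exe"))
```

The exit code collected for each target is whatever the IDA script passes when it terminates the process, so a script can use it to signal success or failure to the batch driver.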
Conclusions
Scripts with access to internal APIs significantly expand IDA’s capabilities and effectively transform this disassembler into a universal tool suitable for a wide range of tasks. IDA can act as a code analyzer, a debugger enhanced for a specific situation, and so on. Furthermore, if you run IDA from the command line, it will become a weapon of mass destruction for enemy code!
Good luck!
