A primer on Windows PE files and doing API calls without knowledge of memory layout
This blog post started as a ridiculously long comment on a GitHub issue. It’s long enough that it should be a blog post, as someone on Twitter pointed out to me, so now I’m replicating it here with some tweaks to make it read a bit better in continuous prose.
A caveat: I very quickly slapped this together and have not 100% validated everything. There might be some mistakes. Shout at me on Twitter (I’m over on Mastodon now) if you find issues.
The issue at hand here is this: you’ve got an x86_64 Windows PE and you want to change its behaviour by executing a stub or some shellcode in the process when it runs. That stub or shellcode needs to make API calls, but you can’t guarantee that the PE you’re injecting into actually imports the APIs you want to use, and you don’t know anything about the memory layout ahead of time. So how do you make this work?
To make sure we’re all on the same page, I’m going to start with the PE format.
All screenshots here are from CFF Explorer, which is a PE editor tool. It’s kinda old but it gets the job done. I’m also just looking at a 64-bit executable since 32-bit structures are slightly different.
PE files start with an old 16-bit DOS header. This header is almost entirely ignored on modern Windows, so the only fields that typically matter are e_magic (which must be ‘MZ’) and e_lfanew, which holds the file offset of the NT header.
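As a rough sketch of that first step, here is how you might locate the NT header in a file you have read into a buffer. The helper names (read_u32, pe_find_nt_header) are made up for illustration. The code relies on e_lfanew being the last field of the 0x40-byte DOS header (offset 0x3C) and on the NT header starting with the "PE\0\0" signature.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t read_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Returns the file offset of the NT header, or 0 if the buffer does not
 * look like a PE file. Hypothetical helper, not part of any Windows API.
 */
static uint32_t pe_find_nt_header(const uint8_t *file, size_t size)
{
    /* The DOS header is 0x40 bytes long and starts with e_magic = "MZ". */
    if (size < 0x40 || file[0] != 'M' || file[1] != 'Z')
        return 0;

    /* e_lfanew is the last DOS header field, at offset 0x3C. */
    uint32_t e_lfanew = read_u32(file + 0x3C);

    /* The NT header begins with the 4-byte signature "PE\0\0". */
    if (e_lfanew > size || size - e_lfanew < 4)
        return 0;
    if (memcmp(file + e_lfanew, "PE\0\0", 4) != 0)
        return 0;

    return e_lfanew;
}
```

Reading the fields byte by byte rather than casting to a struct keeps the sketch independent of the Windows SDK headers and of the host's alignment rules.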

You can see that its offset is at 0x108, which is where e_lfanew said it was. You might notice that there’s a bit of a gap between the end of the DOS header at 0x40 and the start of the NT header at 0x108.
What sits in that space is the DOS stub. You know the old “This program cannot be run in DOS mode”? That’s actually a 16-bit x86 DOS program, stored in the file immediately after the e_lfanew field, but before the NT header. If you try to run a modern Windows PE under DOS, it runs that program instead of the PE. Since the e_lfanew field is 32-bit, you can actually embed a complete 16-bit DOS program in there for cross-compatibility!
For fun, here’s the stub disassembled:

The first is the Machine field, which tells you what machine this was built for. 0x8664 means x86_64, and 0x14C means x86_32. There are a bunch more defined values but unless you’re planning on working with Itanium or ARM PEs I wouldn’t worry about it.
Next is the NumberOfSections field. This tells you how many sections there are in the section table. We’ll come to that later.
TimeDateStamp and the symbol table fields can be ignored. SizeOfOptionalHeader is the next of importance - it tells us how big the next structure is going to be. It should always be 0xF0 on a 64-bit executable.
Finally there’s Characteristics. This is a bitfield that specifies various flags. The flags in here should be irrelevant for your use-case, but flag 0x20 is “image can handle >2GB address space” which, if you ever do 32-bit stuff, will be important because it marks the image as large-address aware, i.e. able to use a virtual address space of up to 3GB (or sometimes 4GB) per 32-bit process. If you’re just doing 64-bit, ignore this.
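To make those fields concrete, here is a minimal sketch that pulls them out of a buffer. It assumes nt points at the 4-byte "PE\0\0" signature, with the 20-byte file header directly after it. The struct and function names are my own inventions for illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* Read a little-endian 16-bit value from a byte buffer. */
static uint16_t read_u16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Only the file header fields discussed in the text (hypothetical type). */
typedef struct {
    uint16_t machine;
    uint16_t number_of_sections;
    uint16_t size_of_optional_header;
    uint16_t characteristics;
} pe_file_header_info;

/* nt points at the "PE\0\0" signature. Returns 0 on success, -1 if short. */
static int pe_read_file_header(const uint8_t *nt, size_t avail,
                               pe_file_header_info *out)
{
    if (avail < 4 + 20)
        return -1;

    const uint8_t *fh = nt + 4;                       /* skip the signature */
    out->machine                 = read_u16(fh + 0);  /* 0x8664 = x86_64 */
    out->number_of_sections      = read_u16(fh + 2);
    /* TimeDateStamp and the symbol table fields (offsets 4..15) are skipped */
    out->size_of_optional_header = read_u16(fh + 16); /* 0xF0 on 64-bit */
    out->characteristics         = read_u16(fh + 18);
    return 0;
}

/* Flag 0x20: "image can handle >2GB address space". */
static int pe_is_large_address_aware(const pe_file_header_info *h)
{
    return (h->characteristics & 0x0020) != 0;
}
```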
The optional header is where most of the magic happens. It’s different between 32-bit and 64-bit programs. I’ll focus only on 64-bit.

The data directories table is an array of up to 15 entries, each containing an RVA and size field. The NumberOfRvaAndSizes field is the number of valid entries in the table, plus one null entry on the end. So for a full table (the norm) it’s 16, or 0x10. Normally you won’t see any other value than 0x10 in a non-packed executable.
The meaning of each directory is hard-coded by its index, i.e. export = 0, import = 1, resource = 2, exception = 3, etc.
The RVA is the virtual address, relative to the image base, of the location of the data for that directory. These match up with sections, i.e. every directory points to some address in a section, rather than to an offset in the file.
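Here is a small sketch of looking up a directory by its hard-coded index. It assumes the standard 64-bit optional header layout: a magic value of 0x20B at offset 0, NumberOfRvaAndSizes at offset 108, and the directory array starting at offset 112 with 8 bytes per entry. The function and type names are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

static uint16_t read_u16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

typedef struct {
    uint32_t rva;   /* address relative to the image base */
    uint32_t size;  /* size of the directory data in bytes */
} pe_directory;

/*
 * opt points at the start of a 64-bit optional header and opt_size is the
 * SizeOfOptionalHeader value. index is the directory number, e.g.
 * 0 = export, 1 = import. Returns 0 on success, -1 on failure.
 */
static int pe64_get_directory(const uint8_t *opt, size_t opt_size,
                              uint32_t index, pe_directory *out)
{
    if (opt_size < 112 || read_u16(opt) != 0x20B)
        return -1;                      /* not a 64-bit optional header */

    uint32_t count = read_u32(opt + 108);   /* NumberOfRvaAndSizes */
    if (index >= count || index >= 16)
        return -1;

    size_t entry = 112 + (size_t)index * 8;
    if (entry + 8 > opt_size)
        return -1;

    out->rva  = read_u32(opt + entry);
    out->size = read_u32(opt + entry + 4);
    return 0;
}
```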
The ones you care about are the Export Directory, Import Directory and the Import Address Table (IAT) Directory. I’ll describe these later since it makes more sense to look at sections first.
Immediately after the data directories you have the sections table.
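Since directory RVAs point into sections, working on a file on disk means translating an RVA to a file offset via the section table. A minimal sketch, assuming the standard 40-byte section header with VirtualAddress at offset 12, SizeOfRawData at offset 16 and PointerToRawData at offset 20 (pe_rva_to_offset is a hypothetical name):

```c
#include <stdint.h>
#include <stddef.h>

static uint32_t read_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * sections points at the first section header and count is the
 * NumberOfSections value from the file header. Returns the file offset
 * that backs the given RVA, or -1 if no section's raw data covers it.
 */
static int64_t pe_rva_to_offset(const uint8_t *sections, uint16_t count,
                                uint32_t rva)
{
    for (uint16_t i = 0; i < count; i++) {
        const uint8_t *s = sections + (size_t)i * 40;
        uint32_t virtual_address = read_u32(s + 12);
        uint32_t raw_size        = read_u32(s + 16);
        uint32_t raw_pointer     = read_u32(s + 20);

        /* Only the first raw_size bytes of a section exist in the file. */
        if (rva >= virtual_address && rva - virtual_address < raw_size)
            return (int64_t)raw_pointer + (rva - virtual_address);
    }
    return -1;
}
```

Once the image has been loaded into memory you don't need this step: the loader has already placed each section at its RVA, so base address plus RVA is enough.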

Name is the RVA of a null-terminated string that specifies the name of the DLL. If you convert the RVA to an offset you’ll find the string in the file there. In this case I’m using kernel32.dll as an example:

So basically you’ve got:
void* Functions[NumberOfFunctions];
char* Names[NumberOfNames];
uint16_t NameOrdinals[NumberOfNames];
Each function can be accessed by those indices. Functions with no name are importable by their ordinal (index into the array) - not to be confused with a name ordinal, which is different. Don’t worry about ordinals too much, they don’t come up very often and you don’t really care about them here.
So to find a function by its name in the export table, you look at the AddressOfNames field to get the RVA of the names array, then use that to loop through each of the name RVAs to find the one that matches the name of the API you want. The index of the matching name is then used to look up an entry in the NameOrdinals array, and that entry is the index into the Functions array where the function is.
For example:
void* getFunction(const char* functionName)
{
    for (uint32_t n = 0; n < exportDirectory->NumberOfNames; n++)
    {
        // SizeOfImage is just a generous upper bound on the string length
        if (strncmp(functionName, exportDirectory->Names[n], peHeader->SizeOfImage) == 0)
        {
            // the name index maps to a function index via the name ordinals array
            return exportDirectory->Functions[exportDirectory->NameOrdinals[n]];
        }
    }
    return NULL;
}
Keep in mind that this gives you the RVA, so if you want the virtual address you need to add the base address of the module.
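The example above glosses over the fact that the three arrays hold RVAs rather than pointers. Here is a slightly more literal sketch that works on an image mapped in memory, using the standard export directory offsets (NumberOfFunctions at 20, NumberOfNames at 24, AddressOfFunctions at 28, AddressOfNames at 32, AddressOfNameOrdinals at 36). pe_find_export_rva is a made-up name, and the code does no bounds checking, so it assumes a well-formed image that the loader has already accepted.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

static uint16_t read_u16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * image is the module base in memory; export_dir_rva comes from data
 * directory 0. Returns the RVA of the named export, or 0 if not found.
 */
static uint32_t pe_find_export_rva(const uint8_t *image,
                                   uint32_t export_dir_rva,
                                   const char *name)
{
    const uint8_t *dir = image + export_dir_rva;
    uint32_t number_of_functions = read_u32(dir + 20);
    uint32_t number_of_names     = read_u32(dir + 24);
    const uint8_t *functions = image + read_u32(dir + 28);
    const uint8_t *names     = image + read_u32(dir + 32);
    const uint8_t *ordinals  = image + read_u32(dir + 36);

    for (uint32_t n = 0; n < number_of_names; n++) {
        /* Each entry in the names array is the RVA of a C string. */
        const char *candidate = (const char *)(image + read_u32(names + 4 * n));
        if (strcmp(candidate, name) != 0)
            continue;

        /* Name index -> name ordinal -> index into the functions array. */
        uint16_t index = read_u16(ordinals + 2 * n);
        if (index >= number_of_functions)
            return 0;
        return read_u32(functions + 4 * index);
    }
    return 0;
}
```

One thing this sketch ignores is forwarded exports, where the function RVA points back inside the export directory at a string naming another DLL's export.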
So now you know how to find an API export in a PE file, as long as you know its base address. So how do you find its base address? All Windows processes have a structure called the Process Environment Block (PEB) in memory. The structure is undocumented, but extremely stable. You can access the PEB via the Thread Environment Block (TEB), which has a ProcessEnvironmentBlock field at offset 0x60 on 64-bit processes. The TEB is accessible via the GS segment register, so reading the PEB pointer is just a case of doing mov rax, gs:[0x60] or using an intrinsic such as __readgsqword(0x60).
The field at offset 0x10 of the PEB is ImageBaseAddress. This tells you the base address of the main executable module for the current process. So if your process is Task Manager, this is the image base address for the taskmgr.exe module.
The field at offset 0x18 is Ldr, also known as the loader data. It is a pointer to a PEB_LDR_DATA structure that includes information about modules loaded into the process. The InLoadOrderModuleList and InMemoryOrderModuleList fields of that structure are the heads of doubly-linked lists that describe the modules that are loaded into the process. Their offsets are 0x10 and 0x20 respectively, and this hasn’t changed since the Windows 3.x days.
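Each of these linked lists uses a LIST_ENTRY struct as a header. Every other node in the list (apart from the head inside the PEB_LDR_DATA struct itself) is a LIST_ENTRY embedded in an LDR_DATA_TABLE_ENTRY struct. The InLoadOrderLinks field sits at offset 0 of that struct, so a node in the load-order list is also a pointer to its entry. The InMemoryOrderLinks field sits at offset 0x10, so you subtract 0x10 from a memory-order node to get its entry. This struct describes a module that has been loaded into the process. Its fields of key interest include DllBase, EntryPoint, FullDllName, and BaseDllName. Ignore the use of “DLL” here - it really means any executable module.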
The DllBase field tells you the base address of the module after it was loaded into memory. The EntryPoint field tells you the address of the entry point for that module, which should match the AddressOfEntryPoint field from that module’s PE (albeit as a virtual address, not an RVA). The FullDllName and BaseDllName fields are UNICODE_STRING structures that contain the full path to the module file and the name of the module respectively. You can use these to find a module by name.
In short, the process to find a module in memory, by name, is:
- Read the ProcessEnvironmentBlock pointer from the TEB at gs:[0x60].
- Read the Ldr pointer from the PEB to get the loader data.
- Iterate through either InLoadOrderModuleList or InMemoryOrderModuleList using the Flink field (forward link).
- Find the LDR_DATA_TABLE_ENTRY struct that contains each LIST_ENTRY struct (at the same address for the load-order list, 0x10 bytes earlier for the memory-order list).
- Read the Buffer field of BaseDllName and check it against the name that you want, e.g. kernel32.dll
- If it matches, read the DllBase field to get its base address in memory.
The base address of the module points to the DOS header (MZ …) of the module in memory. You can then apply the techniques previously discussed to find the export table and figure out where APIs are.
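The list walk itself is easy to get subtly wrong, so here is a portable sketch of it that you can compile anywhere. The structs are hand-written stand-ins that mirror the start of the 64-bit Windows layouts rather than the real definitions, and every name in it is hypothetical. Two details worth noting: the Length of a UNICODE_STRING counts bytes rather than characters, and the buffer holds 16-bit characters, so the sketch uses uint16_t instead of wchar_t (which is 32 bits wide on some platforms).

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins mirroring LIST_ENTRY, UNICODE_STRING and LDR_DATA_TABLE_ENTRY. */
typedef struct list_entry {
    struct list_entry *flink;   /* forward link */
    struct list_entry *blink;   /* backward link */
} list_entry;

typedef struct {
    uint16_t  length;           /* in bytes, not characters */
    uint16_t  maximum_length;
    uint16_t *buffer;           /* UTF-16, not necessarily terminated */
} unicode_string;

typedef struct {
    list_entry     in_load_order_links;     /* offset 0x00 on 64-bit */
    list_entry     in_memory_order_links;   /* offset 0x10 */
    list_entry     in_initialization_order_links;
    void          *dll_base;
    void          *entry_point;
    uint32_t       size_of_image;
    unicode_string full_dll_name;
    unicode_string base_dll_name;
} ldr_entry;

/* Case-insensitive comparison of a counted UTF-16 string with an ASCII name. */
static int name_equals(const unicode_string *s, const char *ascii)
{
    size_t chars = s->length / 2;
    if (strlen(ascii) != chars)
        return 0;
    for (size_t i = 0; i < chars; i++) {
        uint16_t a = s->buffer[i];
        uint16_t b = (unsigned char)ascii[i];
        if (a >= 'A' && a <= 'Z') a += 32;
        if (b >= 'A' && b <= 'Z') b += 32;
        if (a != b)
            return 0;
    }
    return 1;
}

/* head is the list head inside the loader data; returns NULL if not found. */
static void *find_module_base(list_entry *head, const char *name)
{
    /* The list is circular: stop when we arrive back at the head. */
    for (list_entry *node = head->flink; node != head; node = node->flink) {
        /* Step back from the embedded link to the start of its entry. */
        ldr_entry *entry = (ldr_entry *)(
            (uint8_t *)node - offsetof(ldr_entry, in_load_order_links));
        if (name_equals(&entry->base_dll_name, name))
            return entry->dll_base;
    }
    return NULL;
}
```

Walking the memory-order list instead would only change the offsetof argument to in_memory_order_links.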
So let’s say you want to find LoadLibrary and GetProcAddress from kernel32.dll at runtime (LoadLibrary is actually exported as LoadLibraryA and LoadLibraryW) - here are the steps in pseudocode:
// NOTE: assumes full definitions of the (partly undocumented) PEB and loader structs
// get PEB from TEB at gs:[0x60]
PEB* peb = (PEB*)__readgsqword(0x60);
PEB_LDR_DATA* ldr = peb->Ldr;
// start at the first node
// the first LIST_ENTRY is in the PEB_LDR_DATA struct, so not valid
LIST_ENTRY* head = &ldr->InLoadOrderModuleList;
LIST_ENTRY* currentNode = head->Flink;
IMAGE_DOS_HEADER* kernel32_dos = NULL;
while (currentNode != NULL && currentNode != head)
{
    // InLoadOrderLinks is the first field of LDR_DATA_TABLE_ENTRY,
    // so the node pointer is also the entry pointer
    LDR_DATA_TABLE_ENTRY* entry = (LDR_DATA_TABLE_ENTRY*)currentNode;
    // Length is in bytes, not characters
    USHORT length = entry->BaseDllName.Length / sizeof(wchar_t);
    wchar_t* dllNameStr = entry->BaseDllName.Buffer;
    // case-insensitive wide string comparison, with length limit
    if (length == 12 && _wcsnicmp(L"kernel32.dll", dllNameStr, length) == 0)
    {
        // this is kernel32
        kernel32_dos = (IMAGE_DOS_HEADER*)entry->DllBase;
        break;
    }
    // not kernel32, try the next module
    currentNode = currentNode->Flink;
}
// did we find kernel32?
if (!kernel32_dos)
    return -1;
uint8_t* kernel32_base = (uint8_t*)kernel32_dos;
// find the NT header at the offset specified by e_lfanew
IMAGE_NT_HEADERS64* ntHeader = (IMAGE_NT_HEADERS64*)(
    kernel32_base + kernel32_dos->e_lfanew
);
// get the file & PE (optional) headers
IMAGE_FILE_HEADER* fileHeader = &ntHeader->FileHeader;
IMAGE_OPTIONAL_HEADER64* peHeader = &ntHeader->OptionalHeader;
uint8_t* peHeaderBase = (uint8_t*)peHeader;
// in the SDK's IMAGE_OPTIONAL_HEADER64 the data directories are the last member of the struct
IMAGE_DATA_DIRECTORY* directories = peHeader->DataDirectory;
// find the sections, which follow the optional header (data directories included)
IMAGE_SECTION_HEADER* sections = (IMAGE_SECTION_HEADER*)(
    peHeaderBase + fileHeader->SizeOfOptionalHeader
);
// get the virtual address of the export directory
IMAGE_EXPORT_DIRECTORY* exportDir = (IMAGE_EXPORT_DIRECTORY*)(
    kernel32_base + directories[IMAGE_DIRECTORY_ENTRY_EXPORT].VirtualAddress
);
// get the export arrays
DWORD* nameRVAs = (DWORD*)(kernel32_base + exportDir->AddressOfNames);
WORD* nameOrdinals = (WORD*)(kernel32_base + exportDir->AddressOfNameOrdinals);
DWORD* functionRVAs = (DWORD*)(kernel32_base + exportDir->AddressOfFunctions);

void* fnLoadLibrary = NULL;
void* fnGetProcAddress = NULL;
for (DWORD n = 0; n < exportDir->NumberOfNames; n++)
{
    char* name = (char*)(kernel32_base + nameRVAs[n]);
    // the name index maps to a function index via the name ordinals array
    void* func = (void*)(kernel32_base + functionRVAs[nameOrdinals[n]]);
    // kernel32 exports LoadLibraryA (ANSI) and LoadLibraryW (wide), not a bare LoadLibrary
    if (strcmp("LoadLibraryA", name) == 0)
        fnLoadLibrary = func;
    if (strcmp("GetProcAddress", name) == 0)
        fnGetProcAddress = func;

    if (fnLoadLibrary != NULL && fnGetProcAddress != NULL)
        break;
}
// did we find the APIs?
if (fnLoadLibrary == NULL || fnGetProcAddress == NULL)
    return -2;
// ok, now you've got the address of LoadLibrary and GetProcAddress and you can call them!
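Once you’ve got LoadLibrary and GetProcAddress you can just get any API you like, or load any DLL, by calling through the pointers you found:
typedef HMODULE (WINAPI *LoadLibraryA_t)(LPCSTR);
typedef FARPROC (WINAPI *GetProcAddress_t)(HMODULE, LPCSTR);
HMODULE hKernel32 = ((LoadLibraryA_t)fnLoadLibrary)("kernel32.dll");
FARPROC fnOpenProcess = ((GetProcAddress_t)fnGetProcAddress)(hKernel32, "OpenProcess");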
Congrats, you’re done.