The PEB Walk Anatomy

Intro & Motivation

The Process Environment Block (PEB) is a user-mode data structure created by the Windows kernel for each process in the NT family of operating systems. It stores process-wide state used by ntdll/user32/loader internals — things like a pointer to the list of loaded modules, process startup parameters, and a BeingDebugged flag. The PEB is not a public, stable Windows API — Microsoft documents only a few fields (and warns that the layout may change). Still, the structure has been studied extensively by researchers and reverse engineers because it is accessible from user mode and contains the module lists that shellcode and malware often rely on.

Because shellcode runs in a constrained environment (no import table, position independence, small size), authors commonly locate module bases and resolve API addresses by walking the PEB and then parsing the PE export table of a DLL (e.g., kernel32 or ntdll) instead of calling LoadLibrary/GetProcAddress. This technique is often called PEB walking and is widely used in shellcode and advanced loaders.

Many current C2 frameworks, such as CobaltStrike, Metasploit, Brute Ratel C4, and Havoc, ship PEB Walking implementations. Understanding this technique saves you time and makes your future analyses and code more efficient.

Understanding the PEB Structure

Memory Layout Visualization

The layout visualization below helps demystify the PEB Walk concept and keeps the focus on the offsets. When you analyze something like this, you will mostly see assembly code and raw offsets, so before digging into them, let me show you the big picture of how the walk works internally.

Figure: WinDbg !peb output

PEB Walk - Detailed Offset Navigation (x86 / x64)

Step 1: Get PEB from TEB

x86 (32-bit):
  MOV EAX, FS:[0x30]  ; TEB.ProcessEnvironmentBlock
; EAX = PEB

x64 (64-bit):
  MOV RAX, GS:[0x60]  ; TEB.ProcessEnvironmentBlock
; RAX = PEB

Step 2: PEB Structure Offsets

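These core fields have kept the same offsets across the Windows versions commonly analyzed, which is why loader walks rely on them. Since the PEB is not a stable public API, treat the values as observed rather than guaranteed.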

PEB:
  +0x02  BYTE  BeingDebugged
  +0x03  BYTE  BitField

x86:
  +0x0C  PPEB_LDR_DATA Ldr
  +0x10  PRTL_USER_PROCESS_PARAMETERS ProcessParameters
  +0x14  PVOID SubSystemData
  +0x18  PVOID ProcessHeap

x64:
  +0x18  PPEB_LDR_DATA Ldr
  +0x20  PRTL_USER_PROCESS_PARAMETERS ProcessParameters
  +0x28  PVOID SubSystemData
  +0x30  PVOID ProcessHeap
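
The x86 and x64 columns differ only because of pointer size and natural alignment. As a minimal sketch, the helper below (an illustrative function, not a Windows API) recomputes the offsets from the field order. The fields Mutant and ImageBaseAddress are not listed above; they are taken from public symbol listings and are included only because they sit between BitField and Ldr.

```python
# Natural-alignment layout calculator (illustrative helper, not a Windows API).
def layout(fields, ptr_size):
    """Return {name: offset} for (name, size) pairs; size "ptr" means pointer-sized."""
    offsets, cursor = {}, 0
    for name, size in fields:
        size = ptr_size if size == "ptr" else size
        cursor = (cursor + size - 1) // size * size  # align to the field's own size
        offsets[name] = cursor
        cursor += size
    return offsets

# Leading PEB fields in declaration order (Mutant/ImageBaseAddress: assumption
# based on public symbols, not documented API).
PEB_FIELDS = [
    ("InheritedAddressSpace", 1),
    ("ReadImageFileExecOptions", 1),
    ("BeingDebugged", 1),
    ("BitField", 1),
    ("Mutant", "ptr"),
    ("ImageBaseAddress", "ptr"),
    ("Ldr", "ptr"),
    ("ProcessParameters", "ptr"),
    ("SubSystemData", "ptr"),
    ("ProcessHeap", "ptr"),
]

peb32 = layout(PEB_FIELDS, ptr_size=4)
peb64 = layout(PEB_FIELDS, ptr_size=8)

for name in ("BeingDebugged", "Ldr", "ProcessParameters", "SubSystemData", "ProcessHeap"):
    print(f"{name:<20} x86 +0x{peb32[name]:02X}   x64 +0x{peb64[name]:02X}")
```

On x64 the four leading BYTE fields are followed by four bytes of padding so that the first pointer lands on an 8-byte boundary, which is why Ldr moves from +0x0C to +0x18 rather than simply doubling.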

Step 3: PEB_LDR_DATA Structure

PEB_LDR_DATA:
  +0x00  ULONG  Length
  +0x04  BOOLEAN Initialized
  +0x08  HANDLE SsHandle

x86:
  +0x0C  LIST_ENTRY InLoadOrderModuleList
  +0x14  LIST_ENTRY InMemoryOrderModuleList
  +0x1C  LIST_ENTRY InInitializationOrderModuleList

x64:
  +0x10  LIST_ENTRY InLoadOrderModuleList
  +0x20  LIST_ENTRY InMemoryOrderModuleList
  +0x30  LIST_ENTRY InInitializationOrderModuleList

Step 4: LDR_DATA_TABLE_ENTRY offsets (the essentials)

These are the fields you actually need for a PEB walk. Many other fields exist and can shift between versions; avoid depending on them.

x86:
  +0x00  LIST_ENTRY InLoadOrderLinks      ; 8 bytes
  +0x08  LIST_ENTRY InMemoryOrderLinks    ; 8 bytes
  +0x10  LIST_ENTRY InInitializationOrderLinks ; 8 bytes
  +0x18  PVOID       DllBase
  +0x1C  PVOID       EntryPoint
  +0x20  ULONG       SizeOfImage
  +0x24  UNICODE_STRING FullDllName      ; 8 bytes (x86 UNICODE_STRING)
  +0x2C  UNICODE_STRING BaseDllName      ; 8 bytes

x64:
  +0x00  LIST_ENTRY InLoadOrderLinks      ; 16 bytes
  +0x10  LIST_ENTRY InMemoryOrderLinks    ; 16 bytes
  +0x20  LIST_ENTRY InInitializationOrderLinks ; 16 bytes
  +0x30  PVOID       DllBase
  +0x38  PVOID       EntryPoint
  +0x40  ULONG       SizeOfImage
  +0x48  UNICODE_STRING FullDllName      ; 16 bytes (x64 UNICODE_STRING)
  +0x58  UNICODE_STRING BaseDllName      ; 16 bytes
  ; Following fields (Flags, LoadCount, TlsIndex, etc.) are version-sensitive.
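
One detail that trips people up: a list pointer points at the LIST_ENTRY it was read from, not necessarily at the start of the entry. The offsets above are measured from the start of LDR_DATA_TABLE_ENTRY, so they only apply directly when walking InLoadOrderModuleList. The sketch below (the helper name is made up for illustration) computes the displacement you should expect to see in disassembly for each of the three lists.

```python
# Field offsets inside LDR_DATA_TABLE_ENTRY, as (x86, x64) pairs.
ENTRY = {
    "InLoadOrderLinks":           (0x00, 0x00),
    "InMemoryOrderLinks":         (0x08, 0x10),
    "InInitializationOrderLinks": (0x10, 0x20),
    "DllBase":                    (0x18, 0x30),
    "BaseDllName":                (0x2C, 0x58),
}

def relative_offset(field, link, x64=True):
    """Displacement of `field` from a pointer to the `link` LIST_ENTRY."""
    arch = 1 if x64 else 0
    return ENTRY[field][arch] - ENTRY[link][arch]

for link in ("InLoadOrderLinks", "InMemoryOrderLinks", "InInitializationOrderLinks"):
    x86 = relative_offset("DllBase", link, x64=False)
    x64 = relative_offset("DllBase", link, x64=True)
    print(f"{link:<28} DllBase at link+0x{x86:02X} (x86), link+0x{x64:02X} (x64)")
```

This is why code that walks InMemoryOrderModuleList reads DllBase at +0x20 on x64 instead of +0x30: the pointer already sits 0x10 bytes into the entry.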

Step 5: UNICODE_STRING Structure

x86 (8 bytes total):
  +0x00 USHORT Length
  +0x02 USHORT MaximumLength
  +0x04 PWSTR  Buffer

x64 (16 bytes total):
  +0x00 USHORT Length
  +0x02 USHORT MaximumLength
  +0x08 PWSTR  Buffer   ; 4 bytes of padding between 0x02 and 0x08
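
To make the layout concrete, here is a small sketch that decodes a UNICODE_STRING from a synthetic memory buffer. The buffer, the "addresses" (plain offsets into a bytearray), and the function name are all assumptions made for the example; nothing here touches a real process.

```python
import struct

def read_unicode_string(mem, addr, x64=True):
    """Decode the UNICODE_STRING stored at offset `addr` of the bytes-like `mem`."""
    fmt = "<HH4xQ" if x64 else "<HHI"      # 4x = the x64 padding before Buffer
    length, max_length, buffer = struct.unpack_from(fmt, mem, addr)
    # Length counts bytes, not characters, and excludes any terminator.
    return mem[buffer:buffer + length].decode("utf-16-le")

# Synthetic memory: string data at 0x40, an x64 UNICODE_STRING at 0x00
# and an x86 UNICODE_STRING at 0x20, both pointing at the same data.
name = "ntdll.dll".encode("utf-16-le")
mem = bytearray(0x80)
mem[0x40:0x40 + len(name)] = name
struct.pack_into("<HH4xQ", mem, 0x00, len(name), len(name) + 2, 0x40)
struct.pack_into("<HHI", mem, 0x20, len(name), len(name) + 2, 0x40)

print(read_unicode_string(mem, 0x00, x64=True))
print(read_unicode_string(mem, 0x20, x64=False))
```

Note that Length is a byte count: "ntdll.dll" is 9 characters but Length is 18, which matters when shellcode hashes or compares the name.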

Example Navigation Code

x86 Assembly

; EAX = PEB
mov     eax, fs:[0x30]

; EAX = PEB->Ldr
mov     eax, [eax + 0x0C]

; EAX = &PEB_LDR_DATA.InLoadOrderModuleList (list head)
lea     eax, [eax + 0x0C]

; ECX = Flink (first entry)
mov     ecx, [eax]                 ; ECX -> InLoadOrderLinks of first module

; Get DllBase of that entry:
; InLoadOrderLinks is at +0x00 of LDR_DATA_TABLE_ENTRY,
; so entry_base = curr - offsetof(InLoadOrderLinks) == curr
; DllBase at +0x18
mov     edx, [ecx + 0x18]          ; EDX = DllBase

; Iterate: ECX = ECX->Flink
mov     ecx, [ecx]                 ; next

x64 Assembly

; RAX = PEB
mov     rax, gs:[0x60]

; RAX = PEB->Ldr
mov     rax, [rax + 0x18]

; RDX = &PEB_LDR_DATA.InLoadOrderModuleList (list head)
lea     rdx, [rax + 0x10]

; RCX = Flink (first entry)
mov     rcx, [rdx]                 ; RCX -> InLoadOrderLinks of first module

; RBX = DllBase (at +0x30)
mov     rbx, [rcx + 0x30]          ; RBX = DllBase

; Iterate: RCX = RCX->Flink
mov     rcx, [rcx]
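
The snippets above show a single step of the iteration. The list is circular, so a full walk has to stop when Flink points back at the list head inside PEB_LDR_DATA (the head is not a module entry). The sketch below simulates the x64 walk over a synthetic memory image; the addresses and module bases are made up for illustration.

```python
import struct

LDR_LIST_HEAD = 0x10   # PEB_LDR_DATA.InLoadOrderModuleList (x64)
DLLBASE = 0x30         # LDR_DATA_TABLE_ENTRY.DllBase (x64)

def walk_load_order(mem, ldr):
    """Yield DllBase for each entry of the circular InLoadOrderModuleList."""
    head = ldr + LDR_LIST_HEAD
    (link,) = struct.unpack_from("<Q", mem, head)        # head.Flink
    while link != head:                                   # back at the head: done
        (dll_base,) = struct.unpack_from("<Q", mem, link + DLLBASE)
        yield dll_base
        (link,) = struct.unpack_from("<Q", mem, link)    # link = link->Flink

# Synthetic image: PEB_LDR_DATA at 0x000, three entries at 0x100/0x200/0x300.
mem = bytearray(0x400)
ldr, entries = 0x000, [0x100, 0x200, 0x300]
bases = [0x140000000, 0x7FFA00000000, 0x7FF900000000]    # made-up module bases
chain = [ldr + LDR_LIST_HEAD] + entries
for i, node in enumerate(chain):
    struct.pack_into("<Q", mem, node, chain[(i + 1) % len(chain)])   # Flink
for entry, base in zip(entries, bases):
    struct.pack_into("<Q", mem, entry + DLLBASE, base)

print([hex(b) for b in walk_load_order(mem, ldr)])
```

Comparing against the head address is the robust termination test. Some real-world code instead stops when it reads a NULL DllBase, which depends on what happens to sit in memory after the list head.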

PE Headers & Export Directory (offset map)

Step 1: DOS Header

Base + 0x00  IMAGE_DOS_HEADER
Base + 0x3C  DWORD e_lfanew  (offset of the NT headers from Base)

Step 2: NT Headers

nt = Base + *(DWORD*)(Base + 0x3C)
nt + 0x00  DWORD Signature "PE\0\0"
nt + 0x04  IMAGE_FILE_HEADER (20 bytes)
nt + 0x18  IMAGE_OPTIONAL_HEADER (x86: 224 bytes, x64: 240 bytes)

Step 3: Optional Header & Data Directories

opt = nt + 0x18

x86: opt + 0x60  -> DataDirectory[16]
x64: opt + 0x70  -> DataDirectory[16]

Export Directory entry is index 0 (8 bytes per entry):
  +0x00  DWORD VirtualAddress (RVA)
  +0x04  DWORD Size
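
Steps 1 to 3 can be sketched as a short parser. This is a minimal example over a synthetic header, not a full PE parser: it assumes a mapped image (offsets equal RVAs), and it picks between the x86 and x64 DataDirectory offsets using the Optional Header Magic field (0x10B for PE32, 0x20B for PE32+), which is not covered in the offset map above.

```python
import struct

def export_directory(image):
    """Return (rva, size) of DataDirectory[0] for a mapped PE image (bytes-like)."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    opt = e_lfanew + 0x18                       # 4-byte signature + 20-byte file header
    (magic,) = struct.unpack_from("<H", image, opt)
    if magic == 0x10B:                          # PE32
        data_dir = opt + 0x60
    elif magic == 0x20B:                        # PE32+
        data_dir = opt + 0x70
    else:
        raise ValueError(f"unexpected optional header magic {magic:#x}")
    return struct.unpack_from("<II", image, data_dir)

# Synthetic PE32+ header: only the fields the function reads are filled in.
image = bytearray(0x200)
image[0:2] = b"MZ"
struct.pack_into("<I", image, 0x3C, 0xE8)                    # e_lfanew
image[0xE8:0xEC] = b"PE\0\0"
struct.pack_into("<H", image, 0xE8 + 0x18, 0x20B)            # OptionalHeader.Magic
struct.pack_into("<II", image, 0xE8 + 0x88, 0x1000, 0x80)    # made-up export RVA/size

print([hex(v) for v in export_directory(image)])
```

Shellcode usually skips the signature and Magic checks and hardcodes the offset for the architecture it was built for; the checks are included here only to keep the example safe to run on arbitrary input.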

Step 4: Export Directory

export_rva = DataDirectory[0].VirtualAddress
exp = Base + export_rva   ; IMAGE_EXPORT_DIRECTORY (40 bytes)

exp + 0x1C  DWORD AddressOfFunctions   (RVA to EAT)
exp + 0x20  DWORD AddressOfNames       (RVA to name RVAs)
exp + 0x24  DWORD AddressOfNameOrdinals(RVA to WORD ords)

Step 5: Export Arrays

funcs = Base + *(DWORD*)(exp + 0x1C)         ; DWORD RVAs
names = Base + *(DWORD*)(exp + 0x20)         ; DWORD RVAs (to ASCII)
ords  = Base + *(DWORD*)(exp + 0x24)         ; WORD ordinals

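name_rva  = *(DWORD*)(names + i*4)           ; i = index into the name table
ordinal   = *(WORD*)(ords  + i*2)            ; same i; yields an index into funcs
func_rva  = *(DWORD*)(funcs + ordinal*4)     ; EAT is indexed by ordinal, not by i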
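
The three arrays are easy to mix up, so here is a minimal sketch of a by-name lookup over a synthetic export directory. The image layout, the names "Alpha" and "Beta", and the RVAs are invented for the example; the image base is treated as offset 0, so RVAs are plain offsets. NumberOfNames lives at exp + 0x18.

```python
import struct

def u32(mem, off):
    return struct.unpack_from("<I", mem, off)[0]

def cstring(mem, off):
    return bytes(mem[off:mem.index(0, off)])

def resolve_export(image, exp_rva, wanted):
    """Return the RVA of the export named `wanted`, or None if it is absent."""
    number_of_names = u32(image, exp_rva + 0x18)
    funcs = u32(image, exp_rva + 0x1C)
    names = u32(image, exp_rva + 0x20)
    ords = u32(image, exp_rva + 0x24)
    for i in range(number_of_names):
        if cstring(image, u32(image, names + i * 4)) == wanted:
            ordinal = struct.unpack_from("<H", image, ords + i * 2)[0]
            return u32(image, funcs + ordinal * 4)   # indexed by ordinal, not by i
    return None

# Synthetic export directory at 0x40 with two named exports.
image = bytearray(0x200)
EXP = 0x40
struct.pack_into("<I", image, EXP + 0x18, 2)                    # NumberOfNames
struct.pack_into("<III", image, EXP + 0x1C, 0xC0, 0x80, 0xA0)   # funcs, names, ords
struct.pack_into("<II", image, 0x80, 0x100, 0x110)              # name RVAs
struct.pack_into("<HH", image, 0xA0, 1, 0)                      # name index -> funcs index
struct.pack_into("<II", image, 0xC0, 0x2222, 0x1111)            # function RVAs
image[0x100:0x106] = b"Alpha\0"
image[0x110:0x115] = b"Beta\0"

print(hex(resolve_export(image, EXP, b"Alpha")))
print(hex(resolve_export(image, EXP, b"Beta")))
```

The ordinal table is deliberately set to [1, 0] so that using the name index directly on the function table would return the wrong RVA. Real shellcode typically compares a hash of each name instead of the string itself, but the index chain is the same.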

Compact “Find kernel32 base” examples

x86 assembly (skip EXE and ntdll)

mov     eax, fs:[0x30]         ; PEB
mov     eax, [eax + 0x0C]      ; PEB->Ldr
mov     esi, [eax + 0x0C]      ; Ldr.InLoadOrderModuleList.Flink (EXE)
mov     eax, [esi]             ; ntdll (2nd)
mov     eax, [eax]             ; kernel32 (3rd)
; eax -> InLoadOrderLinks for kernel32's LDR_DATA_TABLE_ENTRY
; (assumes the usual load order EXE, ntdll, kernel32; robust code compares BaseDllName)
mov     ebx, [eax + 0x18]      ; EBX = kernel32.DllBase

x64 assembly (same idea)

mov     rax, gs:[0x60]         ; PEB
mov     rax, [rax + 0x18]      ; PEB->Ldr
mov     rsi, [rax + 0x10]      ; Ldr.InLoadOrderModuleList.Flink (EXE)
mov     rax, [rsi]             ; ntdll (2nd)
mov     rax, [rax]             ; kernel32 (3rd)
mov     rbx, [rax + 0x30]      ; RBX = kernel32.DllBase

The Offsets

The tables below summarize the offsets used in the walk. The PEB offsets and the PE offsets are used together, so it is worth knowing what each field means.

TEB (Thread Environment Block)

Field       Description                             x86    x64
PPEB Peb    Pointer to Process Environment Block    0x30   0x60

PEB (Process Environment Block)

Field                 Description    x86    x64
BYTE BeingDebugged    Debug flag     0x02   0x02
PPEB_LDR_DATA Ldr     Loader data    0x0C   0x18

PEB_LDR_DATA

Field                                         Description                x86    x64
LIST_ENTRY InLoadOrderModuleList              Modules in load order      0x0C   0x10
LIST_ENTRY InMemoryOrderModuleList            Modules in memory order    0x14   0x20
LIST_ENTRY InInitializationOrderModuleList    Modules in init order      0x1C   0x30

LDR_DATA_TABLE_ENTRY (essentials)

Field                                    Description                   x86    x64
LIST_ENTRY InLoadOrderLinks              Load-order links (anchor)     0x00   0x00
LIST_ENTRY InMemoryOrderLinks            Memory-order links            0x08   0x10
LIST_ENTRY InInitializationOrderLinks    Initialization-order links    0x10   0x20
PVOID DllBase                            Module base                   0x18   0x30
PVOID EntryPoint                         Entry point                   0x1C   0x38
ULONG SizeOfImage                        Image size                    0x20   0x40
UNICODE_STRING FullDllName               Full path                     0x24   0x48
UNICODE_STRING BaseDllName               File name                     0x2C   0x58

LIST_ENTRY

Field    Description     x86    x64
Flink    Forward link    0x00   0x00
Blink    Back link       0x04   0x08

The Anatomy of a CobaltStrike sample

I grabbed a sample from MalwareBazaar to analyze a real case of malicious code that uses a PEB Walk, and then to rename and understand the fields based on the offsets above. Let’s do it! ;)

Malcat

For anyone interested in an amazing (I might say indispensable) reversing tool, let me introduce Malcat. According to its website:

Malcat is a feature-rich hexadecimal editor / disassembler for Windows and Linux targeted to IT-security professionals. Inspect more than 50 binary file formats, disassemble and decompile different CPU architectures, extract embedded files and scan for Yara signatures or anomalies in a fast and easy-to-use graphical interface. Don’t like what you get? Malcat is also heavily customizable and scriptable using python.

Among all the delightful features found in Malcat, we only need to look at the left panel, shown in the image below, to see PEBx64 highlighted.

Figure: Malcat left data panel

With the code in hand, we can comment it to make it easier to understand before moving to x64dbg to watch everything happen at runtime.

Figure: Malcat disassembly

PEB_Walking commented func

peb_walking() {
    sub          rsp, 0x28                           ; Allocate stack space
    
    ; Check if already initialized
    cmp          dword ptr [0x140005740], 0x00       ; Check initialization flag
    jnz          .18                                 ; Skip if already done
    
    ; === PEB TRAVERSAL TO FIND NTDLL.DLL ===
    mov          rax, gs:[0x60]                      ; Get PEB pointer from TEB (Thread Environment Block)
    xor          r8d, r8d                            ; Clear r8
    mov          rcx, [rax+0x18]                     ; Get PEB_LDR_DATA pointer
    mov          rdx, [rcx+0x10]                     ; Get first entry in InLoadOrderModuleList
    mov          rcx, [rdx+0x30]                     ; Get DllBase from LDR_DATA_TABLE_ENTRY
    test         rcx, rcx                            ; Check if DLL base is valid
    jz           .19                                 ; Exit if null
    
.1: ; === LOOP THROUGH LOADED MODULES ===
    movsxd       rax, dword ptr [rcx+0x3C]           ; Get e_lfanew (PE header offset)
    mov          r10, rcx                            ; Save module base in r10
    mov          r9d, [rax+rcx*1+0x88]               ; Get Export Directory RVA from Optional Header
    test         r9d, r9d                            ; Check if exports exist
    jz           .2                                  ; Skip if no exports
    
    lea          r8, [rcx+r9*1]                      ; r8 = Export Directory address
    mov          ecx, [rcx+r9*1+0x0C]                ; Get Name RVA from Export Directory
    mov          eax, [rcx+r10*1]                    ; Read first 4 bytes of DLL name
    or           eax, 0x20202020                     ; Convert to lowercase
    cmp          eax, 0x6C64746E                     ; Compare with 'ntdl' (little-endian)
    jnz          .2                                  ; Not NTDLL, continue searching
    
    mov          eax, [rcx+r10*1+0x04]               ; Read next 4 bytes of DLL name
    or           eax, 0x20202020                     ; Convert to lowercase
    cmp          eax, 0x6C642E6C                     ; Compare with 'l.dl' (little-endian)
    jz           .3                                  ; Found NTDLL.DLL!
    
.2: ; === MOVE TO NEXT MODULE ===
    mov          rdx, [rdx]                          ; Get next LIST_ENTRY
    mov          rcx, [rdx+0x30]                     ; Get next DllBase
    test         rcx, rcx                            ; Check if valid
    jnz          .1                                  ; Continue loop
    
.3: ; === PROCESS NTDLL EXPORTS ===
    test         r8, r8                              ; Check if Export Directory valid
    jz           .19                                 ; Exit if invalid
    
    ; Save registers and setup export processing
    mov          r11d, [r8+0x18]                     ; NumberOfNames
    mov          [rsp+0x38], rbx                     ; Save rbx
    mov          [rsp+0x40], rbp                     ; Save rbp
    mov          ebp, [r8+0x24]                      ; AddressOfNameOrdinals RVA
    mov          [rsp+0x20], rsi                     ; Save rsi
    add          rbp, r10                            ; Convert to VA
    mov          esi, [r8+0x1C]                      ; AddressOfFunctions RVA
    mov          [rsp+0x18], rdi                     ; Save rdi
    add          rsi, r10                            ; Convert to VA
    mov          edi, [r8+0x20]                      ; AddressOfNames RVA
    mov          [rsp+0x10], r12                     ; Save r12
    add          rdi, r10                            ; Convert to VA
    mov          [rsp+0x08], r14                     ; Save r14
    xor          ebx, ebx                            ; Clear ebx
    mov          [rsp], r15                          ; Save r15
    lea          r14, [0x140005740]                  ; Load table base address
    mov          r12d, 0x775A                        ; 'Zw' prefix check
    
.4: ; === ENUMERATE EXPORTED FUNCTIONS ===
    lea          eax, [r11-0x01]                     ; Index = NumberOfNames - 1
    mov          r9d, [rdi+rax*4]                    ; Get function name RVA
    add          r9, r10                             ; Convert to VA
    cmp          [r9], r12w                          ; Check if starts with 'Zw'
    jnz          .11                                 ; Skip if not Zw* function
    
    ; === HASH FUNCTION NAME ===
    xor          edx, edx                            ; Clear edx
    mov          r8d, 0x23C08AE1                     ; Initial hash value
    mov          rax, r9                             ; Function name pointer
    
.5: ; === HASH CALCULATION LOOP ===
    movzx        eax, word ptr [rax]                 ; Read 2 bytes at the current position
    mov          ecx, r8d                            ; Copy hash
    ror          ecx, 0x08                           ; Rotate right 8 bits
    inc          edx                                 ; Advance the byte index by 1
    add          ecx, eax                            ; Add character to hash
    mov          eax, edx                            ; Get offset
    add          rax, r9                             ; Next character position
    xor          r8d, ecx                            ; XOR into hash
    cmp          byte ptr [rax], 0x00                ; Check for null terminator
    jnz          .5                                  ; Continue hashing
    
    ; === STORE FUNCTION INFO ===
    mov          edx, ebx                            ; Copy counter
    lea          eax, [r11-0x01]                     ; Get index
    add          rdx, rdx                            ; Double for 16-byte entries
    mov          word ptr [rsp+0x30], 0x50F          ; Store 0x50F (syscall pattern)
    movzx        r9d, word ptr [rsp+0x30]            ; Load pattern
    mov          r15b, 0xC3                          ; RET instruction
    mov          [r14+rdx*8+0x08], r8d               ; Store hash
    movzx        eax, word ptr [rbp+rax*2]           ; Get ordinal
    mov          ecx, [rsi+rax*4]                    ; Get function RVA
    mov          [r14+rdx*8+0x0C], ecx               ; Store function RVA
    lea          r8, [r10+rcx*1]                     ; Get function address
    
    ; === FIND SYSCALL STUB ===
    cmp          r9w, [r8+0x12]                      ; Check for syscall pattern at +0x12
    lea          rax, [r8+0x12]                      ; Point to potential syscall
    jnz          .6                                  ; Not found, search more
    cmp          r15b, [rax+0x02]                    ; Check for RET after syscall
    jz           .10                                 ; Found syscall stub
    
.6: ; === SEARCH FOR SYSCALL PATTERN ===
    mov          ecx, 0x01                           ; Start offset
    
.7: ; === SYSCALL SEARCH LOOP ===
    mov          eax, ecx
    shl          eax, 0x05                           ; Multiply by 32
    mov          edx, eax
    add          rax, 0x12                           ; Offset to check
    add          rax, r8                             ; Add to function base
    cmp          r9w, [rax]                          ; Check for syscall pattern
    jnz          .8                                  ; Not found
    cmp          r15b, [rax+0x02]                    ; Check for RET
    jz           .10                                 ; Found!
    
.8: ; === CHECK NEGATIVE OFFSET ===
    mov          rax, r8
    sub          rax, rdx                            ; Try negative offset
    add          rax, 0x12
    cmp          r9w, [rax]                          ; Check pattern
    jnz          .9
    cmp          r15b, [rax+0x02]                    ; Check RET
    jz           .10                                 ; Found!
    
.9:
    inc          ecx                                 ; Next offset
    cmp          ecx, 0x200                          ; Max search limit
    jb           .7                                  ; Continue searching
    xor          eax, eax                            ; Not found, store NULL
    
.10: ; === STORE SYSCALL ADDRESS ===
    mov          ecx, ebx                            ; Get index
    inc          ebx                                 ; Increment counter
    add          rcx, rcx                            ; Double for entry size
    mov          [r14+rcx*8+0x10], rax               ; Store syscall stub address
    cmp          ebx, 0x258                          ; Check max entries (600)
    jz           .12                                 ; Done if at limit
    
.11: ; === PROCESS NEXT FUNCTION ===
    add          r11d, 0xFFFFFFFF                    ; Decrement NumberOfNames
    jnz          .4                                  ; Continue if more functions
    
.12: ; === SORT THE TABLE ===
    mov          r15, [rsp]                          ; Restore r15
    lea          eax, [rbx-0x01]                     ; Get count - 1
    mov          r12, [rsp+0x10]                     ; Restore r12
    xor          r10d, r10d                          ; Outer loop counter = 0
    mov          rbp, [rsp+0x40]                     ; Restore rbp
    mov          [0x140005740], ebx                  ; Store entry count
    test         eax, eax                            ; Is count - 1 zero?
    jz           .17                                 ; Skip sort if there is only one entry
    
.13: ; === BUBBLE SORT OUTER LOOP ===
    mov          eax, ebx                            ; Get total count
    xor          ecx, ecx                            ; Inner counter = 0
    sub          eax, r10d                           ; Subtract outer counter
    cmp          eax, 0x01                           ; Check if only 1 element left
    jz           .16                                 ; Skip inner loop
    
.14: ; === BUBBLE SORT INNER LOOP ===
    lea          r11d, [rcx+0x01]                    ; Next index
    mov          r9d, ecx                            ; Current index
    add          r9, r9                              ; Double for entry size
    mov          r8d, r11d                           ; Next index
    add          r8, r8                              ; Double for entry size
    
    ; Compare RVAs
    mov          edi, [r14+r9*8+0x0C]                ; Get current RVA
    mov          esi, [r14+r8*8+0x0C]                ; Get next RVA
    cmp          edi, esi                            ; Compare
    jbe          .15                                 ; Skip swap if in order
    
    ; === SWAP ENTRIES ===
    mov          eax, [r14+r8*8+0x08]                ; Load next hash
    mov          edx, [r14+r9*8+0x08]                ; Load current hash
    mov          rcx, [r14+r9*8+0x10]                ; Load current syscall address
    mov          [r14+r9*8+0x08], eax                ; Store next hash in current
    mov          rax, [r14+r8*8+0x10]                ; Load next syscall address
    mov          [r14+r9*8+0x10], rax                ; Store next syscall in current
    mov          [r14+r9*8+0x0C], esi                ; Store next RVA in current
    mov          [r14+r8*8+0x08], edx                ; Store current hash in next
    mov          [r14+r8*8+0x0C], edi                ; Store current RVA in next
    mov          [r14+r8*8+0x10], rcx                ; Store current syscall in next
    mov          ebx, [0x140005740]                  ; Reload count
    
.15: ; === CONTINUE INNER LOOP ===
    mov          eax, ebx                            ; Get count
    mov          ecx, r11d                           ; Move to next
    sub          eax, r10d                           ; Subtract outer counter
    dec          eax                                 ; Decrement
    cmp          r11d, eax                           ; Check if done with inner loop
    jb           .14                                 ; Continue inner loop
    
.16: ; === CONTINUE OUTER LOOP ===
    inc          r10d                                ; Increment outer counter
    lea          eax, [rbx-0x01]                     ; Get count - 1
    cmp          r10d, eax                           ; Check if done
    jb           .13                                 ; Continue outer loop
    
.17: ; === RESTORE REGISTERS ===
    mov          rdi, [rsp+0x18]                     ; Restore rdi
    mov          rsi, [rsp+0x20]                     ; Restore rsi
    mov          rbx, [rsp+0x38]                     ; Restore rbx
    mov          r14, [rsp+0x08]                     ; Restore r14
    
.18: ; === SUCCESS RETURN ===
    mov          eax, 0x01                           ; Return 1 (success)
    add          rsp, 0x28                           ; Clean up stack
    ret
    
.19: ; === FAILURE RETURN ===
    xor          eax, eax                            ; Return 0 (failure)
    add          rsp, 0x28                           ; Clean up stack
    ret
}
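
The hashing loop at .5 is the least obvious part of the listing, so here is a minimal Python sketch of what it appears to compute, transcribed from the instructions above. The function names are mine. Note that the loop advances one byte at a time but reads two bytes on each step, so the last step also picks up the null terminator.

```python
def ror32(value, bits):
    """Rotate a 32-bit value right by `bits`."""
    return ((value >> bits) | (value << (32 - bits))) & 0xFFFFFFFF

def hash_name(name, seed=0x23C08AE1):
    """Hash a non-empty ASCII export name the way the .5 loop does."""
    data = name + b"\0"                 # the loop also reads the terminator
    h, i = seed, 0
    while True:
        word = data[i] | (data[i + 1] << 8)          # movzx eax, word ptr [rax]
        h ^= (ror32(h, 8) + word) & 0xFFFFFFFF       # ror ecx, 8 / add / xor r8d, ecx
        i += 1                                       # inc edx
        if data[i] == 0:                             # cmp byte ptr [rax], 0
            return h

print(hex(hash_name(b"ZwClose")))
```

With a sketch like this you can precompute hashes for every Zw* export of a clean ntdll and match them against the constants the sample looks up later, instead of single-stepping the loop in a debugger. The seed 0x23C08AE1 is the one from this sample; other samples are likely to use different values.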

The Runtime Validation

The first iteration returns the entry for the executable itself. To reach ntdll.dll, we follow the LIST_ENTRY to the next entry and read its DllBase. After moving to that entry, we can confirm that ntdll.dll is mapped into the process’s memory, see the e_lfanew value, and follow the rest of the code.

Figure: x64dbg PEB Walk ntdll

Figure: x64dbg ntdll Memory Map

This image makes the math behind the Export Directory lookup clearer. rax holds e_lfanew, which is 0xE8 here, as shown in the previous image. Adding rcx (the ntdll base in memory) and then 0x88 gives offset 0x170 from the base, shown below.

Figure: CFF Explorer - Export Directory offset from ntdll

But wait… WTH is this 0x88 if we just saw that the PE32+ offset is 0x70?

Because 0x88 is measured from the start of the NT headers, not from the start of the Optional Header (see Step 3: Optional Header & Data Directories), it already includes:

  • the 4 bytes of the PE signature (“PE\0\0”)
  • the 0x14 bytes (20) of the IMAGE_FILE_HEADER
  • the 0x70-byte distance inside the IMAGE_OPTIONAL_HEADER to DataDirectory[0]
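
The breakdown is easy to check with a few lines of arithmetic. In this sketch the IMAGE_FILE_HEADER size is derived from its field widths (two WORDs, three DWORDs, two WORDs) rather than typed in, and the constant names are mine.

```python
import struct

SIGNATURE_SIZE = 4                                   # "PE\0\0"
FILE_HEADER_SIZE = struct.calcsize("<HHIIIHH")       # IMAGE_FILE_HEADER fields
DATA_DIRECTORY_IN_OPT = {"PE32": 0x60, "PE32+": 0x70}

def export_entry_offset(kind):
    """Offset of DataDirectory[0] measured from the start of the NT headers."""
    return SIGNATURE_SIZE + FILE_HEADER_SIZE + DATA_DIRECTORY_IN_OPT[kind]

e_lfanew = 0xE8                                      # value seen in the ntdll walkthrough
print(hex(FILE_HEADER_SIZE))                         # 0x14
print(hex(export_entry_offset("PE32+")))             # 0x88
print(hex(e_lfanew + export_entry_offset("PE32+")))  # 0x170
print(hex(export_entry_offset("PE32")))              # 0x78
```

So when you see [base + e_lfanew + 0x88] in x64 code, or [base + e_lfanew + 0x78] in x86 code, it is almost certainly a read of the Export Directory RVA.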

And then we can start to see the abstract thing taking on a well-known form.

Figure: x64dbg ntdll functions loaded

Just Some Final Words

There is not much mystery left in the code, since we can read the commented assembly above. The idea was to show how important it is to understand offsets in order to understand what is going on when looking at assembly code. Of course, you can use IDA, Ghidra, or Binary Ninja to help with the analysis. However, it is also good to understand what is supposed to happen based on the raw assembly alone. To be honest, it is much more comfortable to look at shellcode or PEB Walking code now, right? Just remember to understand what is happening, and everything will go well.

Thanks for reading this article. If you have any comments, feel free to reach out to me!
