
Ghidra Basics: Reverse-Engineering Cobalt Strike Shellcode and Extracting the C2 Server

Source & attribution. This post is an original English rewrite of “How to Use Ghidra to Analyse Shellcode and Extract Cobalt Strike Command & Control Servers” by Matthew, published on Dec 08, 2023 at Embee Research (embeeresearch.io). All original screenshots are reproduced with attribution; the prose is paraphrased for core-jmp.org readers. For the canonical walkthrough, please read the original.

Executive Summary

Cobalt Strike beacons rarely ship as friendly PE files. The interesting payload is a chunk of position-independent shellcode that resolves the Windows APIs it needs at runtime, calls them via a tiny dispatcher, and reaches out to a hard-coded command-and-control server. This walkthrough — following the methodology of Matthew at Embee Research — shows how to take a raw Cobalt Strike shellcode blob, load it into Ghidra as bare bytes, coax the decompiler into producing a readable view, and then pivot to a live debugger to resolve the API hashes that the static view leaves opaque. The end goal is concrete: pull out the network APIs being called and the C2 server they are pointed at.

Along the way the post walks through the canonical “PUSH <hash> / CALL EBP” pattern that Cobalt Strike (like much other shellcode descended from Metasploit) uses to invoke kernel32 and wininet functions, and identifies the underlying ROR13 hashing algorithm by spotting ROR edi, 0xd in the resolver loop. It then shows how to run the resolver under x32dbg via Blobrunner so the resolved function pointers appear in registers, and finishes with the small but high-leverage Ghidra trick of retyping local variables to TEB32 * and PEB * so the DLL-walk code reads like normal C. The methodology generalises to many API-hashing loaders, not just Cobalt Strike.

The Sample

  • SHA256: 26f9955137d96222533b01d3985c0b1943a7586c167eceeaa4be808373f7dd30
  • Source: Malware Bazaar (bazaar.abuse.ch); archive password: infected
  • Family: Cobalt Strike beacon shellcode (raw, position-independent)
  • Goal: Identify API calls and extract the embedded C2 server

Loading the Sample into Ghidra

Because the file is raw shellcode rather than a PE, Ghidra will not auto-detect a format. Drag the file into a new project and Ghidra prompts you to choose the architecture manually. For a 32-bit Cobalt Strike beacon, pick x86 / 32-bit / little endian; the “default” compiler is fine.

Initial Ghidra interface before loading the shellcode sample
The empty Ghidra workspace before the shellcode is imported. Source: original article.
Ghidra Language/Compiler dialog selecting x86, 32-bit, little endian
Selecting the x86, 32-bit, little-endian architecture for a raw-bytes import. Source: original article.
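Before importing, it can help to confirm that the file really is a headerless blob and not a PE. The snippet below is a minimal sketch of that check; the function name looks_like_pe is illustrative and not part of Ghidra or the original article.

```python
def looks_like_pe(data: bytes) -> bool:
    """Return True if the buffer has a DOS 'MZ' header that points at a 'PE\\0\\0' signature."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew (offset 0x3C) holds the file offset of the PE signature.
    e_lfanew = int.from_bytes(data[0x3C:0x40], "little")
    return data[e_lfanew:e_lfanew + 4] == b"PE\x00\x00"
```

If this returns False for the sample, importing it as raw bytes and choosing the language manually is the expected path.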

Disassembling the Shellcode

After import, the listing pane shows raw bytes; Ghidra has not yet treated them as code. Place the cursor at offset 0 and either right-click → Disassemble or press D. Ghidra walks forward from that address and turns the bytes into instructions.

Ghidra right-click context menu showing the Disassemble option
The right-click Disassemble menu entry. Source: original article.
Ghidra disassembly listing of the Cobalt Strike shellcode after pressing D
Listing view after disassembly — raw bytes resolved into x86 instructions. Source: original article.

Defining a Function and Getting the Decompiler View

The decompiler will not produce output until Ghidra knows where a function begins. Ghidra has not inferred one because there is no PE entry point. Right-click on the first instruction and pick Create Function (hotkey F); the decompiler pane on the right immediately populates with pseudo-C for the newly-defined function.

Ghidra main window showing the code browser and the empty decompiler pane side by side
Listing and decompiler panes side by side; the decompiler is still empty. Source: original article.
Ghidra right-click Create Function context menu entry
The Create Function right-click action. Source: original article.
Ghidra decompiler pane populated with pseudo-C after defining a function
After F — the decompiler now has output to render. Source: original article.

Locating Function Calls and the PUSH/CALL EBP Pattern

Scrolling through the decompiler output, two things stand out: the shellcode does most of its work through a single dispatcher (here named by Ghidra as FUN_0000008f) and the values passed to that dispatcher are 32-bit constants. Those constants are the API hashes — the shellcode never references the literal string “LoadLibraryA”, it references a precomputed integer fingerprint of it.

Ghidra decompiler highlighting the call to FUN_0000008f
The shellcode funnels most calls through a single resolver function (FUN_0000008f). Source: original article.
Ghidra decompiler view showing 32-bit hash values pushed before API calls
Constant 32-bit values pushed before each call — the API hashes. Source: original article.
Ghidra decompiler showing the unaff_retaddr and code* references in the hash pattern
Ghidra surfaces unaff_retaddr and code * references in the dispatcher — a structural fingerprint of the pattern. Source: original article.
Disassembly listing showing the PUSH hash / CALL EBP pattern used by Cobalt Strike
In the listing view the pattern is unmistakable: PUSH <hash> immediately followed by CALL EBP. Source: original article.
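The PUSH <hash> / CALL EBP pair can also be located outside Ghidra with a plain byte search. The sketch below assumes the standard 32-bit encodings (68 followed by a 4-byte little-endian immediate for PUSH imm32, FF D5 for CALL EBP) and only matches calls where the hash is the last value pushed. The helper name find_hash_calls is hypothetical.

```python
import re

# PUSH imm32 = 68 <4 bytes, little endian>; CALL EBP = FF D5.
PUSH_CALL_EBP = re.compile(rb"\x68(.{4})\xff\xd5", re.DOTALL)

def find_hash_calls(blob: bytes):
    """Return (offset_of_push, hash_value) for each PUSH imm32 directly followed by CALL EBP."""
    return [
        (m.start(), int.from_bytes(m.group(1), "little"))
        for m in PUSH_CALL_EBP.finditer(blob)
    ]
```

Because this is a byte match and not a disassembly, a hit can occasionally land inside unrelated data, so each offset is worth confirming in the listing view.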

Resolving the First Hashes via Google

The first two hashes can often be resolved with a search engine: there are public hash lists (gists, references, prior analyses) that map common ROR13 outputs back to kernel32! / wininet! function names. Searching for 0x726774c immediately returns LoadLibraryA; searching for 0xa779563a returns InternetOpenA. Add Ghidra inline comments so the resolved names stay attached to the disassembly.

Google search resolving hash 0x726774c to LoadLibraryA
Google search resolving the hash 0x726774c to LoadLibraryA. Source: original article.
Ghidra inline comment annotating the hash as LoadLibraryA
Inline comment in Ghidra to keep the resolved name attached to the disassembly. Source: original article.
Google search resolving hash 0xa779563a to InternetOpenA
Searching the next hash resolves to InternetOpenA. Source: original article.
Ghidra inline comment annotating the hash as InternetOpenA
Second comment added in Ghidra. Source: original article.
SpeakEasy emulation output cross-referencing the same Cobalt Strike API call sequence
SpeakEasy emulation of the same shellcode shows the matching API call sequence, an independent corroboration of the hash resolution. Source: original article.
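Once a hash has been resolved, it is worth recording it somewhere reusable. The following is a minimal sketch of such a lookup table seeded with the two hashes resolved so far; the module prefixes and the annotate helper are illustrative choices, not something the original article prescribes.

```python
# Hash -> API name, seeded with the two values resolved in this section.
KNOWN_HASHES = {
    0x0726774C: "kernel32.dll!LoadLibraryA",
    0xA779563A: "wininet.dll!InternetOpenA",
}

def annotate(hash_value: int) -> str:
    """Format a hash as a comment string, marking unknown values as unresolved."""
    hash_value &= 0xFFFFFFFF
    name = KNOWN_HASHES.get(hash_value, "(unresolved)")
    return f"0x{hash_value:08x} -> {name}"
```

The resulting strings can be pasted into Ghidra as inline comments, and unresolved entries mark the hashes that still need the debugger.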

A Note on the Loading of wininet

Before InternetOpenA can be called, the shellcode has to LoadLibraryA("wininet"). The library name itself is not stored as plain ASCII at rest — the string is built in stack memory just before the call. In Ghidra the bytes appear as a small block being initialised to the values of the characters ‘w’, ‘i’, ‘n’, ‘i’, ‘n’, ‘e’, ‘t’.

Hex view of the wininet DLL name pushed in stack before LoadLibraryA
The DLL name being built on the stack one byte at a time. Source: original article.
Decoded wininet ASCII reference inside Ghidra
Decoded as ASCII the string reads “wininet”. Source: original article.
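A stack-built string like this can be decoded by hand or with a few lines of Python. The sketch below assumes the name is pushed as two 32-bit immediates, 0x0074656e then 0x696e6977, which is typical of Metasploit-lineage stagers but is not stated in the original article; check the values against your own listing.

```python
import struct

def decode_stack_string(pushed_dwords):
    """Rebuild a string from dwords given in push order.

    The stack grows downwards, so the last value pushed sits at the lowest
    address and therefore holds the first characters of the string.
    """
    raw = b"".join(struct.pack("<I", d) for d in reversed(pushed_dwords))
    return raw.split(b"\x00", 1)[0].decode("ascii")

# Assumed push order: "net\0" first, then "wini".
dll_name = decode_stack_string([0x0074656E, 0x696E6977])
```

With those two assumed constants, dll_name comes out as "wininet", matching the decoded view in Ghidra.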

Why a Debugger Is Needed

For the remaining hashes, search-engine lookups dry up: either the hash matches nothing public, or it belongs to the long tail of wininet!Internet* functions that has not been catalogued. The dispatcher itself ends with a JMP EAX, so whatever address the resolver stored in EAX is where execution lands. The cleanest way to recover the API name is therefore to break right before the JMP and look at EAX.

JMP EAX instruction inside the hash-resolver function
The terminating JMP EAX inside the resolver. Source: original article.
Ghidra graph view of the hash-resolver function with JMP/CALL pattern
Graph view of the resolver with the JMP / CALL structure laid out. Source: original article.
Ghidra graph zoomed in on the JMP EAX hand-off basic block
Zoomed in on the JMP EAX basic block — the ideal breakpoint target. Source: original article.

Loading the Shellcode With Blobrunner

OALabs Blobrunner is a small loader designed for this exact problem: it allocates a memory region, copies the shellcode into it with RWX permissions, prints the base address, and pauses so a debugger can attach before execution starts. The advantage is predictability: the load address is known before the shellcode runs, so breakpoints can be planned in advance as base address plus file offset.

OALabs Blobrunner v0.0.5 release page
Blobrunner release page. Source: original article.
Blobrunner command line loading the shellcode blob
Blobrunner invoked against the unpacked shellcode blob. Source: original article.
Blobrunner reporting the shellcode mapped at address 0x001e0000
Blobrunner reports the shellcode mapped at 0x001e0000. Source: original article.

Attaching x32dbg and Setting Breakpoints

While Blobrunner is paused, attach x32dbg via File → Attach, then set two breakpoints: one at the shellcode entry (0x001e0000) and one at the JMP EAX inside the resolver. The JMP EAX sits at file offset 0x86 in this sample, so its absolute address is 0x001e0000 + 0x86 = 0x001e0086.

x32dbg File menu showing the Attach option used to attach to Blobrunner
Attaching x32dbg to the paused Blobrunner process. Source: original article.
x32dbg command bar entering bp 0x001e0000 to break at the shellcode entry
Setting a breakpoint at the shellcode entry. Source: original article.
x32dbg breakpoint at 0x001e0000 + 0x86 to break on the JMP EAX dispatch
Second breakpoint — on the JMP EAX inside the resolver. Source: original article.
Blobrunner waiting for the user to press a key before continuing execution
Blobrunner waiting for a keypress — this is the “debugger is attached, go!” gate. Source: original article.
x32dbg paused at the initial shellcode entry breakpoint
x32dbg paused at the first instruction of the shellcode. Source: original article.
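The breakpoint arithmetic is simply base address plus file offset. The sketch below turns the two offsets used in this section into bp commands of the form shown in the x32dbg command bar; the dictionary keys and the helper name are illustrative.

```python
BASE = 0x001E0000  # address reported by Blobrunner for this run

# File offsets taken from the walkthrough: shellcode entry and the JMP EAX.
OFFSETS = {"entry": 0x00, "jmp_eax": 0x86}

def breakpoint_commands(base: int, offsets: dict) -> list:
    """Build one 'bp <address>' command per file offset."""
    return [f"bp 0x{base + off:08x}" for off in offsets.values()]

commands = breakpoint_commands(BASE, OFFSETS)
```

If Blobrunner reports a different base on another run, only BASE needs to change.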

A Note on CALL EBP

The POP EBP at the start of the helper function is a small piece of position-independent-code (PIC) bootstrapping: it pops the return address that the preceding CALL pushed, which places the absolute address of the instruction following that CALL into EBP. That is why subsequent calls into the resolver use CALL EBP rather than a literal address: the shellcode reuses the value popped at the start as the call target. To resolve every API in one go, set breakpoints on every CALL EBP in the listing. x32dbg pauses at each one so you can read the pushed hash off the stack, and continuing to the JMP EAX breakpoint then shows the resolved API address in EAX.

POP EBP instruction at the start of the Cobalt Strike helper function
The very first POP EBP — the position-independent bootstrap. Source: original article.
Ghidra showing the instruction that follows the call to FUN_0000008f
The instruction immediately after the resolver call — where EAX still holds the resolved API. Source: original article.
x32dbg register window tracking the EBP value used during CALL EBP operations
Tracking the value held in EBP as the calls progress. Source: original article.
x32dbg stepping into the function with breakpoints on CALL EBP instructions
Stepping through the shellcode with breakpoints on every CALL EBP. Source: original article.

Observing Hash Values in Memory

At each pause, the topmost value on the stack is the API hash being requested. The first one matches what we already resolved statically: 0x726774c → LoadLibraryA.

x32dbg stack window showing the hash value 0x726774c at the top of the stack
Hash 0x726774c sitting on the top of the stack. Source: original article.

Viewing Decoded APIs in the Register Window

Step through the resolver (or continue to the JMP EAX breakpoint) until the JMP EAX is the next instruction. At that point EAX holds the resolved function address, and x32dbg shows the symbol next to it. The first call is confirmed as kernel32!LoadLibraryA and its argument on the stack is the just-built "wininet" string.

x32dbg register pane showing EAX resolved to the LoadLibraryA address
EAX resolved to LoadLibraryA. Source: original article.
x32dbg register window labeling the resolved API as LoadLibraryA
x32dbg annotates the resolved API directly in the register pane. Source: original article.
x32dbg stack window with the wininet string argument pushed for LoadLibraryA
The argument on the stack — the string "wininet". Source: original article.

Decoding Additional API Hashes

Continuing past the second CALL EBP, the hash on the stack is 0xa779563a and EAX resolves to InternetOpenA. The third call (hash 0xC69F8957 at file offset 0xCA) resolves to InternetConnectA; the arguments on the stack include the C2 IP 195.211.98[.]91. With three calls in hand, the shellcode’s networking intent is already visible: prepare a WinINet session, then connect outbound to a hard-coded host.

x32dbg second breakpoint stopping on the InternetOpenA hash 0xa779563a
Second breakpoint hit — hash 0xa779563a. Source: original article.
x32dbg stack showing the InternetOpenA hash pushed before CALL EBP
Hash 0xa779563a at the top of the stack. Source: original article.
x32dbg JMP EAX dispatch confirming the resolved API is InternetOpenA
EAX confirmed as InternetOpenA. Source: original article.
x32dbg breaking on CALL EBP at offset 0xCA where hash 0xC69F8957 is pushed
Third CALL EBP at offset 0xCA — hash 0xC69F8957. Source: original article.
InternetConnectA resolved with the C2 IP address 195.211.98[.]91 visible in arguments
InternetConnectA with the C2 IP 195.211.98[.]91 visible in the arguments. Source: original article.
Ghidra Go To dialog navigating to offset 0xCA inside the shellcode
Jumping back into Ghidra at the same offset. Source: original article.
Ghidra showing the hash 0xC69F8957 located at offset 0xCA
Hash 0xC69F8957 located at offset 0xCA. Source: original article.
Ghidra inline comment annotating the hash as InternetConnectA
Comment added in Ghidra so the static view now reads InternetConnectA. Source: original article.
Ghidra view of additional API hash values and their arguments
The remaining hash values and their pushed arguments. Source: original article.
x32dbg conditional breakpoint configuration to automate hash logging
x32dbg conditional breakpoints can automate the hash-then-API logging across the whole resolver. Source: original article.
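As a static cross-check on the InternetConnectA argument, the blob can be searched for dotted-quad strings. This is a sketch under the assumption that the host is embedded as plain ASCII, which the original article does not state explicitly; the function name is hypothetical.

```python
import ipaddress
import re

# Four dot-separated groups of 1-3 ASCII digits, not embedded in a longer run.
IPV4_RE = re.compile(rb"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?![\d.])")

def find_ipv4_strings(blob: bytes):
    """Return (offset, text) for every ASCII substring that parses as an IPv4 address."""
    hits = []
    for m in IPV4_RE.finditer(blob):
        text = m.group().decode("ascii")
        try:
            ipaddress.IPv4Address(text)  # rejects out-of-range octets such as 999
        except ValueError:
            continue
        hits.append((m.start(), text))
    return hits
```

A hit whose offset lies inside the shellcode and matches the debugger view is good corroboration. An empty result only means the host is not stored as a literal dotted quad (it may be a domain name or encoded).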

Identifying the Hashing Algorithm: ROR13

Inside the resolver function, the graph view shows a tight loop — that is the hashing inner loop. The decisive instruction is ROR edi, 0xd (rotate-right by 13). That single instruction is enough to identify the algorithm as the well-known ROR13 hashing used by the Metasploit lineage (and inherited by Cobalt Strike, Sliver, and several others). Mandiant published the canonical pseudocode for it years ago; matching the in-memory bytes against that pseudocode confirms the algorithm.

Ghidra graph view showing the loop block that performs the hashing routine
The tight loop inside the resolver — the hashing inner loop. Source: original article.
Disassembly showing ROR edi, 0xd characteristic of ROR13 API hashing
ROR edi, 0xd — the fingerprint of ROR13. Source: original article.
Mandiant blog pseudocode reference for ROR13 API hashing
Mandiant’s canonical ROR13 pseudocode for cross-reference. Source: original article.
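The loop is small enough to reproduce in Python. The sketch below follows the publicly documented Metasploit-lineage convention, which is an assumption here since the article only shows the ROR instruction: the module name is hashed as upper-cased UTF-16LE including its terminator, the function name as ASCII including its terminator, and the two 32-bit sums are added together.

```python
def ror32(value: int, bits: int) -> int:
    """Rotate a 32-bit value right by the given number of bits."""
    value &= 0xFFFFFFFF
    return ((value >> bits) | (value << (32 - bits))) & 0xFFFFFFFF

def ror13_sum(data: bytes, uppercase: bool = False) -> int:
    """Rotate the accumulator right by 13, then add the next byte, for every byte."""
    h = 0
    for b in data:
        if uppercase and b >= 0x61:  # mirrors the loop's "if >= 'a' subtract 0x20" step
            b -= 0x20
        h = (ror32(h, 13) + b) & 0xFFFFFFFF
    return h

def api_hash(module: str, function: str) -> int:
    """Combine module-name and function-name hashes into the value the shellcode pushes."""
    module_bytes = (module + "\x00").encode("utf-16-le")
    function_bytes = (function + "\x00").encode("ascii")
    return (ror13_sum(module_bytes, uppercase=True) + ror13_sum(function_bytes)) & 0xFFFFFFFF
```

Under these assumptions, api_hash("kernel32.dll", "LoadLibraryA") should reproduce 0x0726774c, and the wininet.dll entries should reproduce the other two hashes seen in this sample. A loader that changes the rotation count or skips the module component will produce different values, so treat a mismatch as a cue to re-read the loop.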

Advanced Notes: Retyping Windows Structures in Ghidra

The resolver does not magically know where kernel32.dll is loaded. It walks the loaded-module list via the TEB → PEB → PEB_LDR_DATA → InMemoryOrderModuleList chain (offsets +0x30, +0xc, +0x14 in the 32-bit case). Out of the box, Ghidra shows those offsets as raw arithmetic on a generic void *. With Ghidra’s pre-shipped TEB32 * and PEB * types (or the third-party AllsafeCyberSecurity data-type repository), right-clicking the local variable and choosing Retype Variable immediately rewrites those references in the decompiler view to the named fields.

Ghidra view of TEB access pattern used to enumerate loaded DLLs
TEB access pattern used to enumerate the loaded DLLs. Source: original article.
Nviso diagram showing TEB and PEB offsets used for DLL list walking
Nviso’s reference diagram of the TEB / PEB / LDR offsets used by Metasploit-lineage shellcode. Source: original article.
Ghidra example of retyping a variable to the TEB32 structure
Retyping a local variable to TEB32 *. Source: original article.
Ghidra right-click menu retyping a variable to TEB32 *
Right-click → Retype Variable → TEB32 *. Source: original article.
Ghidra retyping the ProcessEnvironmentBlock variable to PEB *
Same pattern for the PEB pointer. Source: original article.
Ghidra decompiler view after retyping with named TEB and PEB structure offsets
The final cleaned-up decompiler view — offsets are now named fields. Source: original article.
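To see why retyping helps, it is useful to model just the three fields the walk touches. The ctypes sketch below is deliberately truncated: padding stands in for every field before the one of interest, pointers are modelled as 32-bit integers, and the class names with a 32 suffix are labels chosen here, not Ghidra's type names.

```python
import ctypes

class TEB32(ctypes.Structure):
    _fields_ = [
        ("_pad", ctypes.c_uint8 * 0x30),
        ("ProcessEnvironmentBlock", ctypes.c_uint32),  # TEB+0x30 -> PEB
    ]

class PEB32(ctypes.Structure):
    _fields_ = [
        ("_pad", ctypes.c_uint8 * 0x0C),
        ("Ldr", ctypes.c_uint32),  # PEB+0x0C -> PEB_LDR_DATA
    ]

class PEB_LDR_DATA32(ctypes.Structure):
    _fields_ = [
        ("_pad", ctypes.c_uint8 * 0x14),
        ("InMemoryOrderModuleList_Flink", ctypes.c_uint32),  # +0x14, list head
        ("InMemoryOrderModuleList_Blink", ctypes.c_uint32),  # +0x18
    ]

offsets = (
    TEB32.ProcessEnvironmentBlock.offset,
    PEB32.Ldr.offset,
    PEB_LDR_DATA32.InMemoryOrderModuleList_Flink.offset,
)
```

The offsets tuple evaluates to (0x30, 0x0C, 0x14), which is the same chain of constants the decompiler shows as raw arithmetic before the variables are retyped.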

Key Takeaways

  • Raw shellcode in Ghidra needs three nudges before it is readable: pick the architecture, press D to disassemble, press F to create a function. None of these are automatic for non-PE input.
  • The PUSH <hash> / CALL EBP / JMP EAX pattern is the canonical Cobalt Strike (and Metasploit lineage) API-resolution shape; once you can recognise it, the rest of the shellcode reads itself.
  • The first two or three hashes can usually be resolved offline via public hash lists; the long tail needs a live debugger.
  • Blobrunner + x32dbg gives you a known load address and clean breakpoint placement, which is far easier than trying to attach to a real implant under EDR.
  • The instruction ROR edi, 0xd is a reliable single-instruction signature for the ROR13 hashing algorithm.
  • Ghidra’s Retype Variable → TEB32 * / PEB * trick turns the cryptic offset arithmetic of DLL-walks into named field accesses; it is one of the highest-leverage Ghidra UI features for shellcode work.
  • For this sample, the C2 server falls out of the third CALL EBP argument as 195.211.98[.]91 — the static + dynamic flow yields the IOC directly.

Defensive Recommendations

  • Hunt for the PUSH/CALL EBP pattern at memory-scan time. Scanning committed RWX regions for the PUSH imm32 / CALL EBP opcode pair (68 xx xx xx xx FF D5) can catch Metasploit-lineage shellcode once it has been decoded in memory, regardless of which encoder was used on disk.
  • Hunt for the ROR13 byte pattern. The encoded form of ROR edi, 0xd (C1 CF 0D) is a useful signature; combined with the ADD EDI, EAX (01 C7) that immediately follows it in the Metasploit-lineage hashing loop, it produces few false positives.
  • Block outbound traffic to known Cobalt Strike C2s. For the sample analysed here, that means egress blocks and DNS sinkholing for 195.211.98[.]91 and related infrastructure.
  • Apply WinINet ETW or Frida-style API monitoring. Most beacons use the WinINet API surface (InternetOpenA, InternetConnectA, HttpOpenRequestA); telemetry on the per-process WinINet handle table catches them even without payload analysis.
  • Decode hash lists in advance. Maintain an internal mapping of known ROR13 hashes to API names; sharing one across the team turns “run x32dbg” into “grep”.
  • Use Ghidra retyping standards. Adopt a team convention for retyping shellcode locals to TEB32 *, PEB *, and LDR_DATA_TABLE_ENTRY * — uniform analyses let one analyst pick up another’s work midstream.
  • Treat “unknown PE-less .bin” samples as suspect by default. Almost every modern loader unwraps to a position-independent blob; the absence of a PE header is itself an investigative signal.
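The byte-pattern hunts in the list above reduce to a substring search over a memory dump or file. The sketch below looks only for the ROR edi, 0xd encoding (C1 CF 0D); the helper name find_all is illustrative, and a production rule would normally live in a dedicated signature engine.

```python
ROR_EDI_13 = bytes.fromhex("c1cf0d")  # encoding of ROR edi, 0xd

def find_all(blob: bytes, needle: bytes) -> list:
    """Return every offset at which needle occurs in blob, including overlapping hits."""
    offsets = []
    start = 0
    while (idx := blob.find(needle, start)) != -1:
        offsets.append(idx)
        start = idx + 1
    return offsets
```

A call such as find_all(dump, ROR_EDI_13) returns candidate offsets only. Three bytes can occur by chance in large dumps, so each hit still needs the surrounding instructions checked.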

Conclusion

The workflow above is short on novelty and long on transferable craft. Cobalt Strike shellcode looks scary at first because Ghidra refuses to render anything until it has been hand-bootstrapped, but the underlying loader is small, structured, and stereotyped. Once the resolver function is identified, the rest of the analysis is mechanical: enumerate CALL EBP sites, resolve hashes (statically when possible, dynamically when not), retype TEB/PEB locals, and pull the C2 out of InternetConnectA’s arguments. The methodology is courtesy of Matthew at Embee Research; for the canonical narrative and additional sister tutorials on the same site, please head to the original.

References

Credit and thanks to the original author, Matthew at Embee Research, whose research and screenshots this writeup is built on. Read the canonical source for the full narrative.
