On May 12, 2017, there was a major outbreak of WannaCrypt ransomware. WannaCrypt directly borrowed exploit code from the ETERNALBLUE exploit and the DoublePulsar backdoor module leaked in April by a group calling itself Shadow Brokers.
Using ETERNALBLUE, WannaCrypt propagated as a worm on older platforms, particularly Windows 7 and Windows Server 2008 systems that had not been patched against the SMB1 vulnerability CVE-2017-0145. The resulting ransomware outbreak reached a large number of computers, even though Microsoft released security bulletin MS17-010 to address the vulnerability on March 14, almost two months before the outbreak.
This post—complementary to our earlier post about the ETERNALBLUE and ETERNALROMANCE exploits released by Shadow Brokers—takes us through the WannaCrypt infection routine, providing even more detail about post-exploitation phases. It also describes other existing mitigations as well as new and upcoming mitigation and detection techniques provided by Microsoft to address similar threats.
The following diagram summarizes the WannaCrypt infection cycle: initial shellcode execution, backdoor implantation and package upload, kernel and userland shellcode execution, and payload launch.
Figure 1. WannaCrypt infection cycle overview
The file mssecsvc.exe contains the main exploit code, which launches a network-level exploit and spawns the ransomware package. The exploit code targets a kernel-space vulnerability and involves multi-stage shellcode in both kernel and userland processes. Once the exploit succeeds, communication between the DoublePulsar backdoor module and mssecsvc.exe is encoded using a pre-shared XOR key, allowing transmission of the main payload package and eventual execution of ransomware code.
Exploit and initial shellcodes
In an earlier blog post, Viktor Brange provided a detailed analysis of the vulnerability trigger and the instruction pointer control mechanism used by ETERNALBLUE. After the code achieves instruction pointer control, it focuses on acquiring persistence in kernel space using kernel shellcode and the DoublePulsar implant. It then executes the ransomware payload in user space.
The exploit code sprays memory on a target computer to lay out space for the first-stage shellcode. It uses non-standard SMB packet segments to make the allocated memory persistent in hardware abstraction layer (HAL) memory space. It sends 18 instances of heap-spraying packets, which contain direct binary representations of the first-stage shellcode.
Figure 2. Shellcode heap-spraying packet
Initial shellcode execution: first and second stages
The exploit uses a function-pointer overwrite technique to direct control flow to the first-stage shellcode. This shellcode installs a second-stage shellcode as a SYSENTER or SYSCALL routine hook by overwriting model-specific registers (MSRs). If the target system is x86-based, it hooks the SYSENTER routine by overwriting IA32_SYSENTER_EIP. On x64-based systems, it overwrites the IA32_LSTAR MSR to hook the SYSCALL routine. More information about these MSRs can be found in Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 3C.
Figure 3. First-stage shellcode for x86 systems
Originally, IA32_SYSENTER_EIP contains the address of nt!KiFastCallEntry, the SYSENTER routine.
Figure 4. Original IA32_SYSENTER_EIP value pointing to KiFastCallEntry
After modification by the first-stage shellcode, IA32_SYSENTER_EIP now points to the second-stage shellcode.
Figure 5. Modified IA32_SYSENTER_EIP value points to the main shellcode
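The MSR overwrite described above can be modeled in a few lines of Python. This is only a conceptual sketch: the MSR indices are the architectural values documented in the Intel manual, while the function names and both addresses are hypothetical and chosen purely for illustration.

```python
# Architectural MSR indices as documented in the Intel SDM.
IA32_SYSENTER_EIP = 0x176       # SYSENTER entry point (x86)
IA32_LSTAR = 0xC0000082         # SYSCALL entry point (x64)


def hooked_msr(arch: str) -> int:
    """Return the index of the MSR that is overwritten for the given architecture."""
    if arch == "x86":
        return IA32_SYSENTER_EIP
    if arch == "x64":
        return IA32_LSTAR
    raise ValueError(f"unsupported architecture: {arch}")


def install_hook(msrs: dict, arch: str, shellcode_addr: int) -> int:
    """Model a wrmsr: point the entry MSR at the shellcode and return the old value."""
    index = hooked_msr(arch)
    original = msrs[index]
    msrs[index] = shellcode_addr
    return original


# Hypothetical addresses, for illustration only.
KI_FAST_CALL_ENTRY = 0x82A4B0C0
SECOND_STAGE = 0xFFDF0100

msrs = {IA32_SYSENTER_EIP: KI_FAST_CALL_ENTRY}
saved = install_hook(msrs, "x86", SECOND_STAGE)
```

After the call, `msrs[IA32_SYSENTER_EIP]` holds the second-stage address and `saved` holds the original entry point, mirroring the before and after states in Figures 4 and 5.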
The first-stage shellcode itself runs at DISPATCH_LEVEL. By running the second-stage shellcode as the SYSENTER routine, the first-stage code guarantees that the second-stage shellcode runs at PASSIVE_LEVEL, giving it access to a broader range of kernel APIs and paged-out memory. Although the second-stage shellcode delivered with this malware doesn't actually access any paged pools or call APIs that require running at PASSIVE_LEVEL, this approach allows attackers to reuse the same module for more complicated shellcode.
The second-stage shellcode, now running on the targeted computer, generates a master XOR key for uploading the payload and other communications. It uses system-specific references, like addresses of certain APIs and structures, to randomize the key.
Figure 6. Master XOR key generation
The second-stage shellcode implants DoublePulsar by patching the SMB1 Transaction2 dispatch table. It overwrites one of the reserved command handlers for the SESSION_SETUP (0xe) subcommand of the Transaction2 request. This subcommand is reserved and not commonly used in regular code.
Figure 7. Copying packet-handler shellcode and overwriting the dispatch table
The following code shows the dispatch table after the subcommand backdoor is installed.
Figure 8. Substitution of 0xe command handler
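The effect of the dispatch-table patch can be illustrated with a toy model in which handlers are Python functions stored in a list. The table size, handler names, and return values below are hypothetical; only the 0xe slot number comes from the text.

```python
from typing import Callable, List

SESSION_SETUP = 0x0E  # Transaction2 subcommand slot named in the text

Handler = Callable[[bytes], str]


def legitimate_handler(packet: bytes) -> str:
    """Stand-in for every original Transaction2 handler."""
    return "original"


def backdoor_handler(packet: bytes) -> str:
    """Stand-in for the implanted packet-handler shellcode."""
    return "backdoor"


def make_dispatch_table(size: int = 0x12) -> List[Handler]:
    """Build a table of handlers; the size is an assumption for this sketch."""
    return [legitimate_handler] * size


def install_backdoor(table: List[Handler]) -> Handler:
    """Overwrite the SESSION_SETUP slot and return the handler it replaced."""
    previous = table[SESSION_SETUP]
    table[SESSION_SETUP] = backdoor_handler
    return previous


def dispatch(table: List[Handler], subcommand: int, packet: bytes) -> str:
    """Route a request to whichever handler currently sits in the slot."""
    return table[subcommand](packet)


table = make_dispatch_table()
install_backdoor(table)
```

Once the slot is replaced, `dispatch(table, SESSION_SETUP, b"")` reaches the backdoor while every other subcommand still reaches its original handler, which is why the implant does not disturb normal SMB traffic in this model.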
Main package upload
To start uploading its main package, WannaCrypt sends multiple ping packets to the target, testing if its server hook has been installed. Remember that the second-stage shellcode runs as a SYSENTER hook—there is a slight delay before it runs and installs the dispatch-table backdoor. The response to the ping packet contains the randomly generated XOR master key to be used for communication between the client and the targeted server.
Figure 9. Code that returns original XOR key
This XOR key value is used only after some bit shuffling. The shuffling algorithm basically looks like the following Python code.
Figure 10. XOR bit-shuffling code
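The figure is not reproduced in this text. Publicly available DoublePulsar detection scripts derive the working key from the value returned in the ping response with the expression below. It is shown here on the assumption that it corresponds to the code in Figure 10. The function name is hypothetical.

```python
def shuffle_xor_key(s: int) -> int:
    """Derive the working XOR key from the 32-bit value returned by the ping.

    Assumption: this mirrors the expression used in public DoublePulsar
    detection tooling, not necessarily the exact code shown in the figure.
    """
    x = 2 * s ^ (((s & 0xFF00 | (s << 16)) << 8) | (((s >> 16) | s & 0xFF0000) >> 8))
    # Python integers are unbounded, so truncate the result to 32 bits.
    return x & 0xFFFFFFFF


example_key = shuffle_xor_key(0x00000001)
```

The final mask matters only in languages with arbitrary-precision integers, where the left shifts would otherwise grow the value past 32 bits.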
The upload of the encoded payload consists of multiple packets as shown below.
Figure 11. SMB Transaction2 packet showing payload upload operation
The hooked handler code for the unimplemented subcommand processes the packet bytes, decoding them using the pre-shared XOR key. The picture above shows that the SESSION_SETUP parameter fields are used to indicate the offset and total length of the payload bytes. The data is 12 bytes long: the first four bytes indicate the total length, the next four bytes are reserved, and the last four bytes are the current offset of the payload bytes, all in little-endian. These fields are encoded with the master XOR key.
Because the reserved field is supposed to be 0, its encoded value is the same as the master XOR key (0 XOR key = key). Going back to the packet capture above, the reserved field value is 0x38a9dbb6, which is the master XOR key. The total length is encoded as 0x38f9b8be. When this length is XORed with the master XOR key, it is 0x506308, which is the actual length of the payload bytes being uploaded. The last field is 0x38a9dbb6. When XORed with the master key, this last field becomes 0, meaning this packet is the first packet of the payload upload.
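The arithmetic above can be checked with a short Python sketch. It assumes the 12-byte layout described in the text (total length, reserved, offset, each a little-endian 32-bit value) and recovers the key from the reserved field, whose plaintext is 0. The function name is hypothetical, and the sample packet is built from the key and total-length values quoted from the capture.

```python
import struct


def decode_params(params: bytes):
    """Decode the 12-byte parameter block and return (key, total_length, offset)."""
    if len(params) != 12:
        raise ValueError("expected exactly 12 parameter bytes")
    enc_total, enc_reserved, enc_offset = struct.unpack("<III", params)
    # The reserved plaintext is 0, so its encoded value equals the key itself.
    key = enc_reserved
    return key, enc_total ^ key, enc_offset ^ key


MASTER_KEY = 0x38A9DBB6
# First packet of an upload: an offset of 0 encodes to the key itself.
params = struct.pack("<III", 0x38F9B8BE, MASTER_KEY, 0 ^ MASTER_KEY)
key, total_length, offset = decode_params(params)
```

With these inputs, `total_length` is 0x506308 and `offset` is 0. A defender inspecting a capture can use the same property: an observer who sees the reserved field learns the key without any other knowledge of the session.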
When all the packets are received, the packet handler in the second-stage shellcode jumps to the start of the decoded bytes.
Figure 12. Decoding and executing shellcode
The transferred and decoded bytes are of size 0x50730c. As a whole, these packet bytes include kernel shellcode, userland shellcode, and the main WannaCrypt PE packages.
Executing the kernel shellcode
The kernel shellcode looks for a kernel image base and resolves essential functions by parsing PE structures. The following figure shows the APIs resolved by the shellcode:
Figure 13. Resolved kernel functions
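The text does not state how the image base is located. A common approach, assumed here, is to walk backward one page at a time from a known address inside the image until a valid MZ/PE header is found. The sketch below applies that idea to an in-memory byte buffer. The function names and the synthetic buffer are hypothetical.

```python
import struct
from typing import Optional

PAGE_SIZE = 0x1000


def is_pe_header(mem: bytes, base: int) -> bool:
    """Check for an MZ header whose e_lfanew field points at a PE signature."""
    if base + 0x40 > len(mem) or mem[base:base + 2] != b"MZ":
        return False
    (e_lfanew,) = struct.unpack_from("<I", mem, base + 0x3C)
    sig = base + e_lfanew
    return mem[sig:sig + 4] == b"PE\0\0"


def find_image_base(mem: bytes, addr: int) -> Optional[int]:
    """Scan page-aligned offsets downward from addr until a PE header is found."""
    base = addr & ~(PAGE_SIZE - 1)
    while base >= 0:
        if is_pe_header(mem, base):
            return base
        base -= PAGE_SIZE
    return None


# Synthetic four-page buffer with a minimal header at the second page.
memory = bytearray(4 * PAGE_SIZE)
memory[PAGE_SIZE:PAGE_SIZE + 2] = b"MZ"
struct.pack_into("<I", memory, PAGE_SIZE + 0x3C, 0x80)
memory[PAGE_SIZE + 0x80:PAGE_SIZE + 0x84] = b"PE\0\0"

image_base = find_image_base(bytes(memory), 3 * PAGE_SIZE + 0x123)
```

Once the base is known, the export directory reachable from the PE headers can be walked to resolve function addresses by name, which is the "parsing PE structures" step the text refers to.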
The kernel shellcode uses ZwAllocateVirtualMemory to allocate a large chunk of RWX memory (0x506d70 in this case). This memory holds the userland shellcode and the main PE packages.
Figure 14. RWX memory allocation through ZwAllocateVirtualMemory
The kernel shellcode goes through processes on the system and injects userland shellcode into the lsass.exe process using an asynchronous procedure call (APC).
Figure 15. APC routines for injecting shellcode to a thread in a userland process
Userland shellcode—the start of a new infection cycle
After multiple calls to VirtualProtect and PE layout operations, the shellcode loads a bootstrap DLL using a reflective DLL loading method. The WannaCrypt user-mode component contains this bootstrap DLL for both 64- and 32-bit Windows.
Figure 16. Bootstrap DLL functions
This bootstrap DLL reads the main WannaCrypt payload from the resource section and writes it to the file C:\WINDOWS\mssecsvc.exe. It then launches the file using the CreateProcess API. At this stage, a new infection cycle is started on the newly infected computer.
Figure 17. Dropping main payload to file system
Figure 18. Creating the main payload process
Mitigating and detecting WannaCrypt
WannaCrypt borrowed most of its attack code from the code leaked by Shadow Brokers, specifically the ETERNALBLUE kernel exploit code and the DoublePulsar kernel-level backdoor. It leverages DoublePulsar's code execution mechanisms and asynchronous procedure calls (APCs) in the kernel to deliver its main infection package and ransomware payload. It also uses the system process lsass.exe as its injection target.
Mitigation on newer platforms and upcoming SMB updates
The ETERNALBLUE exploit code worked only on older OSes like Windows 7 and Windows Server 2008, particularly those that have not applied security updates released with security bulletin MS17-010. The exploit was limited to these platforms because it depended on executable memory allocated in kernel HAL space. Since Windows 8 and Windows Server 2012, HAL memory has stopped being executable. Also, for additional protection, predictable addresses in HAL memory space have been randomized since Windows 10 Creators Update.
With the upcoming Windows 10 Fall Creators Update (also known as RS3), many dispatch tables in legacy SMB1 drivers, including the Transaction2 dispatch table (SrvTransaction2DispatchTable) memory area, will be set to read-only as a defense-in-depth measure. The backdoor mechanism described here will be much less attractive to attackers because the mechanism will require additional exploit techniques for unlocking the memory area and overwriting function pointers. Furthermore, SMB1 has already been deprecated for years. With the RS3 releases for Windows 10 and Windows Server 2016, SMB1 will be disabled.
Hyper Guard virtualization-based security
WannaCrypt employs multiple techniques to achieve full code execution on target systems. The IA32_SYSENTER_EIP modification technique used by WannaCrypt to run the main shellcode is actually commonly observed when kernel rootkits try to hook system calls. Kernel Patch Protection (or PatchGuard) typically detects this technique by periodically checking for modifications of MSR values. WannaCrypt hooking, however, is too brief for PatchGuard to fire. Windows 10, armed with virtualization-based security (VBS) technologies such as Hyper Guard, can detect and mitigate this technique because Hyper Guard fires as soon as the malicious wrmsr instruction that modifies the MSR is executed.
To enable Hyper Guard on systems with supported processors, use Secure Boot and enable Device Guard. Use the hardware readiness tool to check if your hardware system supports Device Guard. Device Guard runs on the Enterprise and Education editions of Windows 10.
Post-breach detection with Windows Defender ATP
In addition to VBS mitigation provided with Hyper Guard, Windows Defender Advanced Threat Protection (Windows Defender ATP) can detect injection of code into userland processes, including the method used by WannaCrypt. Our researchers have also added new detection logic so that Windows Defender ATP flags highly unusual events that involve spawning of processes from lsass.exe.
Figure 19. Windows Defender ATP detection of an anomalous process spawned from a system process
While the detection mechanism for process spawning was pushed out in response to WannaCrypt, this mechanism and detection of code injection activities also enable Windows Defender ATP customers to uncover sophisticated breaches that leverage similar attack methods.
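The idea behind flagging processes spawned from lsass.exe can be sketched as a simple filter over process-creation events. This is an illustration of the concept only, not the detection logic used by Windows Defender ATP. The event structure, field names, and sample events are hypothetical.

```python
import ntpath
from dataclasses import dataclass
from typing import List


@dataclass
class ProcessCreation:
    """Hypothetical process-creation event with parent and child image paths."""
    parent_image: str
    child_image: str


def spawned_from_lsass(event: ProcessCreation) -> bool:
    """Return True when the parent image is lsass.exe (case-insensitive)."""
    return ntpath.basename(event.parent_image).lower() == "lsass.exe"


def flag_events(events: List[ProcessCreation]) -> List[ProcessCreation]:
    """Keep only the events whose parent process is lsass.exe."""
    return [e for e in events if spawned_from_lsass(e)]


events = [
    ProcessCreation(r"C:\Windows\explorer.exe", r"C:\Windows\System32\notepad.exe"),
    ProcessCreation(r"C:\Windows\System32\lsass.exe", r"C:\WINDOWS\mssecsvc.exe"),
]
flagged = flag_events(events)
```

Because lsass.exe rarely creates child processes in normal operation, even a rule this simple would be expected to produce few alerts. A production rule would also need to validate the parent's full path and identity rather than its file name alone.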
Windows Defender ATP Research Team