Return-oriented programming, or ROP, is a clever technique used to get around the NX (No-eXecute) and DEP (Data Execution Prevention) mitigations in modern CPUs and operating systems. Traditionally, exploiting a vulnerable program has consisted of the following:
Find a programming error in the application (for example, Adobe Flash) that allows specially crafted input to overflow or otherwise corrupt an allocated region in memory.
Inject executable code (shellcode) into the program’s memory.
Transfer control to this new code by overwriting control information such as a return address on the call stack.
With this technique, simply opening a document or media file results in arbitrary code being executed on the target’s system. The attacker has control and can download additional malware, exfiltrate information, install spyware or hooks for persistence, etc.
Here's a simple example of a program vulnerable to a memory corruption attack. The attack takes advantage of the insecure gets() function, which reads a string from the console and writes it to a specified location in memory without checking whether there is enough space available.
When the function buggy() is called, the computer stores the return address (the address of the instruction in main() that follows the function call) on the program's call stack in memory. After buggy() finishes, execution should return to the address 0x00401047 (seen byte-reversed here in the program's memory because x86 is a little-endian architecture).
The name entered by the user is also stored on the stack in the 8 characters allocated for 'str'.
But if we enter more than 8 characters, the gets() function will happily overwrite adjacent memory:
Note that the final two characters -- the final 'n' in Langton and the null character that marks the end of the string -- have clobbered half of the return address. The result is a crash as control jumps to the garbled address 0x0040006E. But an attacker can go further by including executable shellcode in the input string and overwriting the original return address so that program control jumps to the attacker's own code.
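The byte-level effect described above can be reproduced with a small simulation. This is only a sketch, not the vulnerable program itself: it assumes the 8-byte buffer sits directly below the 4-byte return address (real compilers may place a saved frame pointer or padding in between), and the 9-character input "C Langton" is a hypothetical name chosen so that exactly two bytes spill over. The function names are invented for illustration.

```python
import struct

BUF_SIZE = 8                 # the 8 characters allocated for 'str'
RETURN_ADDRESS = 0x00401047  # return address into main(), from the text


def make_frame():
    """Model the stack frame: 8-byte buffer, then the saved return address."""
    # "<I" packs a 32-bit value little-endian: 0x00401047 -> 47 10 40 00.
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<I", RETURN_ADDRESS))


def unchecked_gets(frame, user_input):
    """Copy the input plus a terminating null byte, with no bounds check."""
    data = user_input + b"\x00"
    for offset, byte in enumerate(data):
        frame[offset] = byte  # nothing stops offset from passing BUF_SIZE


def saved_return_address(frame):
    """Read back the 32-bit little-endian value stored after the buffer."""
    return struct.unpack_from("<I", frame, BUF_SIZE)[0]


safe = make_frame()
unchecked_gets(safe, b"Langton")           # 7 characters + null fit in 8 bytes
print(hex(saved_return_address(safe)))     # 0x401047

smashed = make_frame()
unchecked_gets(smashed, b"C Langton")      # hypothetical 9-character input
print(smashed[BUF_SIZE:].hex(" "))         # 6e 00 40 00
print(hex(saved_return_address(smashed)))  # 0x40006e
```

The model covers only these 12 bytes, so inputs longer than 11 characters raise an IndexError here, whereas a real gets() call would keep writing into whatever memory follows.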
NX (No-eXecute) and DEP (Data Execution Prevention) counter this type of exploit by ensuring at the hardware level that a given section of memory is writable or executable, but not both. Even if the attacker finds a memory vulnerability and injects shellcode, the CPU will refuse to execute those instructions. Return-oriented programming gets past this limitation by (mis)using the existing code in the application in ways that were not intended.
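The writable-or-executable rule can be pictured with a toy model of memory permissions. The region names, address ranges, and the AccessViolation exception below are assumptions made for this sketch; real systems enforce the rule through page-table bits, not a lookup table like this one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    start: int        # first address in the region
    end: int          # one past the last address
    writable: bool
    executable: bool


# Hypothetical layout: code is executable but read-only, the stack is the reverse.
REGIONS = [
    Region(".text", 0x00401000, 0x0040B000, writable=False, executable=True),
    Region("stack", 0x0012F000, 0x00130000, writable=True, executable=False),
]


class AccessViolation(Exception):
    """Raised where real hardware would fault."""


def region_for(address):
    """Find the region containing an address, or fault if it is unmapped."""
    for region in REGIONS:
        if region.start <= address < region.end:
            return region
    raise AccessViolation(f"unmapped address {address:#010x}")


def fetch_instruction(address):
    """Return the region name if the CPU may execute from this address."""
    region = region_for(address)
    if not region.executable:
        raise AccessViolation(f"{region.name} at {address:#010x} is not executable")
    return region.name


print(fetch_instruction(0x0040A4BB))  # .text: existing program code may run

try:
    fetch_instruction(0x0012FF80)     # shellcode injected onto the stack
except AccessViolation as fault:
    print("fault:", fault)
```

In this model a jump into the stack faults, while a jump to any address inside the program's own code is allowed. That second case is the opening that return-oriented programming relies on.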
To understand how this works, we’ll start with an analogy. In Wisconsin, there is an executive power known as the Frankenstein veto, which allows a governor to selectively reject individual words of a proposed bill. Here’s an example:
By selectively vetoing single words and phrases, the proposed law was changed from:
[...] the secretary of administration shall lapse to the general fund or transfer to the general fund from the unencumbered balances of the appropriations to state agencies, as defined in subsection (1w) (a), other than sum sufficient appropriations and appropriations of federal revenues, an amount equal to $724,900 during the 2006−07 fiscal year [...]
to:
[...], the secretary of administration shall transfer from the balances of the general fund an amount equal to $330,000,000 during the 2005−06 fiscal year and the 2006−07 fiscal year [...]
By selectively repurposing the existing text, the governor changed an appropriation by nearly three orders of magnitude. (An earlier incarnation of this executive power allowed governors to “veto” individual letters in a bill!)
Similarly, return-oriented programming reuses existing snippets of the vulnerable application’s instructions for purposes that were not intended. By overwriting portions of the call stack, the ROP exploit jumps around the program, each time executing only the small number of instructions that precede a function’s 'ret' instruction (hence the name).
As we saw above, we can exploit a programming error to overwrite a function's return address on the stack. This allows us to transfer program control to an arbitrary location. Consider the following function fragment:
By setting the return address to 0x0040A4BB while overwriting the stack, we jump into the end of this function, setting the register eax to 0 by XORing it with itself. The return instruction at 0x0040A4BD expects to find another return address on the stack, but this too can be overwritten along with the previous address. A chunk of code (mis)used in this fashion is called a ROP gadget, and a ROP exploit is a chain of such gadgets executed in sequence by way of the intentionally corrupted stack memory. Because it is difficult to construct a byte sequence that overwrites the stack and steers program flow through a series of function fragments, this technique is used only as long as necessary; a common approach is to use ROP to bypass DEP, and after that to use more traditional techniques to complete the exploit.
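The chaining mechanism can be sketched with a toy interpreter. Only the xor gadget at 0x0040A4BB comes from the text; the placement of its 'ret' two bytes later (the usual length of a register-to-register xor on x86), the second "inc eax" gadget at 0x00401234, the HALT sentinel, and the starting eax value are all assumptions for illustration.

```python
# Hypothetical code section: address -> (instruction, size in bytes).
CODE = {
    0x0040A4BB: ("xor eax, eax", 2),  # gadget from the text
    0x0040A4BD: ("ret", 1),
    0x00401234: ("inc eax", 1),       # hypothetical second gadget
    0x00401235: ("ret", 1),
}
HALT = 0xDEADBEEF  # sentinel address that stops the simulation


def run_chain(stack, eax=0x12345678):
    """Simulate returning from the vulnerable function onto a corrupted stack."""
    stack = list(stack)
    trace = []
    eip = stack.pop(0)  # the vulnerable function's own 'ret' pops the first address
    while eip != HALT:
        op, size = CODE[eip]
        trace.append((eip, op))
        if op == "xor eax, eax":
            eax = 0
            eip += size
        elif op == "inc eax":
            eax = (eax + 1) & 0xFFFFFFFF
            eip += size
        elif op == "ret":
            eip = stack.pop(0)  # each 'ret' consumes the next attacker-supplied address
    return eax, trace


# Attacker-controlled stack contents: one gadget address per 'ret'.
chain = [0x0040A4BB, 0x00401234, 0x00401234, HALT]
eax, trace = run_chain(chain)

for address, op in trace:
    print(f"{address:#010x}  {op}")
print("eax =", eax)  # eax = 2
```

No new instructions are introduced at any point. The attacker supplies only addresses, and the existing 'ret' instructions do the sequencing.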
With an understanding of the mechanism behind ROP exploits, we get to the question that prompted this post: does Sky ATP attempt to directly detect ROP exploits? The answer is no, for the following reasons:
ROP detection is redundant in an anti-malware sandbox. ROP is only used as a stepping stone to run malicious code on a device. The purpose of malware is to do something malicious: ransomware, a backdoor to add the victim's computer to a botnet, data exfiltration, etc. Sky ATP's dynamic analysis engine detects a rich set of malicious indicators, whether or not the malware's initial foothold came from a ROP exploit.
ROP exploits are system-specific and fragile, so detecting them in a sandbox is highly unlikely. As we saw previously, ROP exploits target programming errors in an installed application, so the sandbox must be configured with the same vulnerable version. Beyond this, because ROP exploits jump around raw executable code, small changes to a sandbox's configuration (different versions of system libraries, hardware variations, etc.) frequently render an attack inert, resulting in -- at most -- an application crash, not a successful exploit. One should ask anti-malware vendors touting their ROP detection how many times they have detected an actual ROP exploit in the wild.
ROP detection is very resource-intensive. The hardware-level monitoring needed to observe ROP patterns in program flow is both very slow and much easier for evasive malware to detect.
The low probability of actually detecting a live ROP exploit must be balanced against the high cost of CPU-level emulation. Multiply this by the number of systems a sample must be run on to ensure some likelihood of an exploit/vulnerability match, and the cost-benefit ratio is abysmal at best. And because a successful exploit would also be detected by behavioral indicators, most anti-malware solutions -- including Sky ATP -- do not perform direct ROP detection.