Juniper Sky Advanced Threat Prevention vs. Locky Malware

By Erdem posted 04-14-2016 14:00


Introduction

“Locky” is a new strain of ransomware that emerged on February 16, 2016. Ransomware is a type of malware that infects a computer and blocks access to the computer or to files on it. The most common technique is to encrypt documents and other important files so that their contents are inaccessible until a ransom is paid, typically in Bitcoin. With Locky, the ransom was 0.5 or 1 BTC for most people (about $200 to $400 USD).

The “Locky” name was given to this malware because it renames all of those encrypted files with a “.locky” extension.

Two malicious files are distributed as part of Locky:

The Microsoft Word document used to infect systems:

SHA-256: 97b13680d6c6e5d8fff655fe99700486cbdd097cfa9250a066d247609f85b9b9
Length: 66048 bytes

The dropped ransomware executable:

SHA-256: 17c3d74e3c0645edb4b5145335b342d2929c92dff856cca1a5e79fa5d935fec2
Length: 184320 bytes
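
As a rough illustration of how the two indicators above could be used, the following Python sketch checks a file against the listed lengths and SHA-256 values. The function names are ours, chosen for illustration; only the hashes and lengths come from the list above.

```python
import hashlib
from pathlib import Path

# SHA-256 -> length in bytes, copied from the two samples listed above.
LOCKY_INDICATORS = {
    "97b13680d6c6e5d8fff655fe99700486cbdd097cfa9250a066d247609f85b9b9": 66048,   # Word document
    "17c3d74e3c0645edb4b5145335b342d2929c92dff856cca1a5e79fa5d935fec2": 184320,  # dropped executable
}


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_locky_indicator(path):
    """Return True if the file's length and SHA-256 match one of the listed samples."""
    size = Path(path).stat().st_size
    if size not in LOCKY_INDICATORS.values():
        return False  # cheap length check avoids hashing most files
    return LOCKY_INDICATORS.get(sha256_of(path)) == size
```

A match on an exact hash only catches these two specific files; as the next section shows, that is the weakness of signature-style detection.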

Anti-Virus vs. Locky

How did traditional security systems do on the Locky Word document?

It’s well known that signature-based security solutions often fail to detect new threats, and virtually all anti-virus solutions missed the Word document when it was first distributed. Even a full day after the initial distribution, only 3 of the 54 AV vendors available on VirusTotal (about 5.6%) detected the threat.

Sky ATP vs. Locky

The Locky Word Document

What did Sky ATP see with the Locky Word document?

Sky ATP uses a series of analysis engines to determine whether a file object is malicious. Two technologies developed at Juniper successfully identified Locky as a threat, and we assign both Locky files a score of 7 out of 10 (high threat level).

Specifically, for the Word document:

Our document analysis system determined that the Word document was malicious (and this was the most significant factor in our system’s decision that the document was malicious).
Our dynamic analysis system determined that the Word document was malicious as well.

We have a variety of new techniques to extract information from potentially malicious file objects. Without revealing the attributes of malicious documents that we use to determine that a document or executable is malicious, we wanted to show how suspicious Locky is and give a sense of the richness of data available to determine the malicious nature of this threat.

We took a wide variety of good documents and malicious documents from our malware database and examined the traits of Locky. Here are some of the traits specifically seen in Locky and how often those traits are seen in good Word documents and how often they are seen in malicious Word documents:

Good documents	Malicious documents	Trait
0.9%	84.4%	Document has macros
6.6%	50.2%	No title
7.5%	45.3%	Single paragraph document
< 0.1%	39.6%	Obfuscation function calls found
varies	27.6%	Code Page 1251 Windows Cyrillic (Slavic)

It’s impossible to simply block all documents that contain macros, because macros are often used in legitimate documents (most often in complex documents and spreadsheets), but their presence is a strong warning sign in Locky’s case. Similarly, you wouldn’t want to block all documents that use Code Page 1251, but it suggests that Locky originated in an Eastern European country (or at least with someone who had configured their system to use Cyrillic).
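
To make the table easier to read, the sketch below computes, for each trait, how many times more often it appears in malicious documents than in good ones. This is simple arithmetic on the published percentages, not a description of how Sky ATP weighs these traits. The "< 0.1%" entry is treated as 0.1% (an assumed upper bound), and the "varies" row is left out.

```python
# (good %, malicious %) copied from the table above.
# "< 0.1%" is approximated as 0.1 (assumed upper bound); "varies" is omitted.
TRAITS = {
    "Document has macros": (0.9, 84.4),
    "No title": (6.6, 50.2),
    "Single paragraph document": (7.5, 45.3),
    "Obfuscation function calls found": (0.1, 39.6),
}


def likelihood_ratio(good_pct, malicious_pct):
    """How many times more common a trait is among malicious documents."""
    return malicious_pct / good_pct


# Print traits from most to least discriminating.
for trait, (good, bad) in sorted(
    TRAITS.items(), key=lambda item: -likelihood_ratio(*item[1])
):
    print(f"{trait:35s} {likelihood_ratio(good, bad):7.1f}x")
```

Macros, for example, are roughly 94 times more common in the malicious set, while a missing title is only about 8 times more common, which is why no single trait is treated as decisive.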

Let’s dive into a bit more detail on what the macros did by running a code trace.

The Locky ransomware infection begins with a Microsoft Word document containing Visual Basic macros. By default, such macros are often disabled, so the user is presented with a blank or misformatted page with an instruction to enable macros if the page does not render properly.

When the user enables macros for this document, the infection process begins via the embedded Visual Basic scripts. These scripts use a variety of obfuscation techniques to hide the actual intent of the macros, but these obfuscation attempts are detected by Sky ATP’s static and dynamic analysis engines.

This calls the first VBA routine:

The value of UserForm1.Label1.Caption is a string of values separated by slash characters. (We’ll soon see how this is used to obfuscate the functionality of these scripts.)

After the Split function executes, we have an array DrinkSun with the following elements:

The execution continues by jumping to the “ErrExit” label:

This creates a Microsoft.XMLHTTP object KogdaGe_1 that can make HTTP requests, and then an Adodb.Stream object KogdaGe_2. The GoTo statement jumps over a large section of unnecessary (and invalid!) code which exists only to hinder automated analysis.

The script then creates Shell.Application and WScript.Shell objects, and stores the “Process” environment variable for the WScript.Shell object, which contains a value of the form:

PROCESS: TEMP=C:\path\to\temp\dir

The TEMP directory referenced here will later be used to store a malicious binary downloaded from a remote server.

Here we find another use of obfuscation: the array KogdaGe_7 contains an encoded URL. (The actual URL and obfuscation algorithm varies between Locky samples.) Running this computation in Python, we get:

So the array KogdaGe_7 actually represents the URL of a malicious binary, encoded to hinder static analysis.
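
The exact arithmetic differs from sample to sample, so the following Python sketch uses a made-up scheme purely to illustrate the idea: each array element is a character code shifted by a fixed offset. The offset, the function names, and the placeholder URL are all assumptions for illustration and are not taken from the Locky sample.

```python
# Hypothetical key; real samples use their own (varying) arithmetic.
HYPOTHETICAL_OFFSET = 13


def encode_url(url, offset=HYPOTHETICAL_OFFSET):
    """Turn a URL into a list of integers, as an obfuscating macro might store it."""
    return [ord(ch) + offset for ch in url]


def decode_url(values, offset=HYPOTHETICAL_OFFSET):
    """Reverse the encoding: subtract the offset and rebuild the string."""
    return "".join(chr(value - offset) for value in values)


# Placeholder URL (reserved .invalid TLD), not the real download location.
encoded = encode_url("http://example.invalid/payload.exe")
print(encoded[:8])          # a run of integers; no URL visible to a string scan
print(decode_url(encoded))  # the original URL, recovered at run time
```

Because the URL never appears as a string in the macro source, a scanner that searches for "http" finds nothing, while dynamic analysis sees the decoded value as soon as the script uses it.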

Now the script uses the Microsoft.XMLHTTP object created previously to prepare an HTTP connection ("GET") to the malicious URL we just decoded.

The HTTP connection is used to send a request for the binary at "http://www [DOT] jesusdenazaret [DOT] com [DOT] ve/34gf5y/r34f3345g [DOT] exe".

The value of the TEMP directory from the Process environment variable is extracted and stored as KogdaGe_4.

Now the full path for storing the downloaded binary is constructed by concatenating the TEMP directory with a filename. Note that the innocuous-looking filename "ladybi.txt" is now transformed into "ladybi.exe".
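
In Python terms, this step amounts to something like the sketch below. It assumes the environment entry has the "TEMP=..." form shown earlier; the helper names are ours, and the way the macro actually swaps the extension is not shown in this article, so the extension handling here is only one plausible approach.

```python
import ntpath  # Windows path rules, usable on any platform


def temp_dir_from_entry(entry):
    """Extract the directory from an environment entry of the form 'TEMP=<dir>'."""
    name, _, value = entry.partition("=")
    if name.upper() != "TEMP":
        raise ValueError(f"not a TEMP entry: {entry!r}")
    return value


def payload_path(entry, decoy_name="ladybi.txt"):
    """Join the TEMP directory with the decoy name, swapping its extension to .exe."""
    stem, _extension = ntpath.splitext(decoy_name)
    return ntpath.join(temp_dir_from_entry(entry), stem + ".exe")


print(payload_path(r"TEMP=C:\path\to\temp\dir"))
# C:\path\to\temp\dir\ladybi.exe
```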

Here we see the first use of the "CallByName" routine being used for obfuscation. This call is equivalent to: KogdaGe_2.Type = 1.

The Adodb.Stream object created above is opened.

This CallByName function call is equivalent to rbp = KogdaGe_1.responseBody.

This is equivalent to KogdaGe_2.write(KogdaGe_1.responseBody), which saves the downloaded binary to the Adodb.Stream:

Equivalent to KogdaGe_2.savetofile("<temppath>\ladybi.exe", 2).
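
To see why CallByName makes static analysis harder, here is a rough Python analogue built on getattr and setattr. The FakeStream class is a stand-in for illustration and does not reproduce the real Adodb.Stream API; the call-type constants mirror VBA's VbMethod, VbGet, and VbLet values. Building member names from string fragments is one way a macro could hide them, shown here as an assumption rather than as Locky's exact technique.

```python
VB_METHOD, VB_GET, VB_LET = 1, 2, 4  # values of VBA's VbCallType constants


class FakeStream:
    """Stand-in object for illustration; NOT the real Adodb.Stream interface."""

    def __init__(self):
        self.Type = None
        self.data = b""

    def write(self, payload):
        self.data += payload


def call_by_name(obj, name, call_type, *args):
    """Dispatch on a member name supplied as a string, like VBA's CallByName."""
    if call_type == VB_LET:
        setattr(obj, name, args[0])  # property assignment
        return None
    if call_type == VB_GET:
        return getattr(obj, name)  # property read
    if call_type == VB_METHOD:
        return getattr(obj, name)(*args)  # method call
    raise ValueError(f"unsupported call type: {call_type}")


stream = FakeStream()
# The member names are assembled at run time, so they never appear whole in the source.
call_by_name(stream, "Ty" + "pe", VB_LET, 1)           # same effect as stream.Type = 1
call_by_name(stream, "wr" + "ite", VB_METHOD, b"MZ")   # same effect as stream.write(b"MZ")
```

A text search for ".write" or ".Type" finds nothing in such code, but a code trace records the resolved member name at each call, which is how the equivalences above were recovered.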

Finally, we reach the end of the script’s execution, and the malicious program now stored as ladybi.exe is executed.

The Locky Executable

Sky ATP detects the executable dropped by the Word macro primarily through the use of behavioral analysis. This determination is a little bit more complex, but Locky behaves similarly to a lot of malware.

Good applications	Malware	Trait
21.8%	49.5%	Accesses hosts file
27.4%	50.4%	DNS resolution
43.6%	67.1%	Excessive sleep calls
0.2%	12.2%	DNS resolution of many domain names with many failures
2.4%	9.7%	Generates new code (typically unpacking or expanding shellcode)
1.7%	3.9%	Posts data to a webserver
< 0.1%	1.9%	Creates PE files with a name already existing in Windows
0.2%	1.1%	System process connects to network

The difficult part is differentiating Locky from good software. As you can see, a lot of good software has similar behaviors. Even DNS lookups will fail sometimes from good software so it’s important to have a robust decision-making system to stitch together all of this information.

After the Visual Basic macros in the document have downloaded the Locky executable file, the encryption and ransom process begins. First, the malware copies itself to a temporary directory under the name “svchost.exe” and relaunches. “svchost.exe” is also the name of an executable distributed as part of Windows that hosts services run from dynamic-link libraries, so the malware is already behaving very suspiciously.
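
A simple way to express this particular red flag is to compare where a process named svchost.exe is running from against where Windows normally keeps it. The sketch below is a minimal illustration, not Sky ATP's logic; it assumes a default installation on the C: drive, and the list of expected directories is our assumption.

```python
import ntpath  # Windows path rules, usable on any platform

# Assumed legitimate locations on a default Windows install on C:.
EXPECTED_DIRS = {r"c:\windows\system32", r"c:\windows\syswow64"}


def is_suspicious_svchost(image_path):
    """Flag an executable named svchost.exe that runs from an unexpected directory."""
    directory, filename = ntpath.split(ntpath.normpath(image_path))
    if filename.lower() != "svchost.exe":
        return False  # different name; this check does not apply
    return directory.lower() not in EXPECTED_DIRS


print(is_suspicious_svchost(r"C:\Users\victim\AppData\Local\Temp\svchost.exe"))  # True
print(is_suspicious_svchost(r"C:\Windows\System32\svchost.exe"))                 # False
```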

Then it contacts a command and control (C&C) server to retrieve the key (specific to each computer) that will be used to encrypt the user’s files (as well as files on network drives) and writes these values (and an autostart key) to the registry.

As part of contacting the C&C server, Locky does DNS lookups on a variety of domain names, some of which didn’t exist at the time it was distributed.
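
This "many domain names with many failures" pattern can be summarized from a sandbox trace with a few lines of Python. The sketch below is illustrative only: the input format, the function name, and both thresholds are assumptions, and a real verdict would combine this signal with many others rather than act on it alone.

```python
def looks_like_domain_probing(lookups, min_domains=10, min_failure_ratio=0.5):
    """Return True if many distinct domains were queried and most never resolved.

    lookups: iterable of (domain, resolved) pairs, e.g. from a sandbox DNS log.
    Thresholds are illustrative defaults, not tuned values.
    """
    outcomes = {}
    for domain, resolved in lookups:
        # A domain counts as resolved if any attempt for it succeeded.
        outcomes[domain] = outcomes.get(domain, False) or resolved
    if len(outcomes) < min_domains:
        return False  # too few distinct domains to call it probing
    failures = sum(1 for ok in outcomes.values() if not ok)
    return failures / len(outcomes) >= min_failure_ratio


# Hypothetical trace: 12 distinct domains, only 3 of which resolve.
trace = [(f"domain{i}.invalid", i < 3) for i in range(12)]
print(looks_like_domain_probing(trace))  # True
```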

Once this is completed, the user’s files are replaced with encrypted versions.

Inside Sky ATP, all of this information is fed into the machine learning verdict engine, which compares the behavior of the code being analyzed with the behavior of known good and known malicious code.

After analyzing the malware executable, our verdict engine assigns a threat score of 7 out of 10 (any score 7 or higher is considered to be a high threat level).

A text file containing ransom information pops up to give the victim the payment instructions.

In case that isn’t enough, Locky helpfully opens an image file containing the same instructions, and also changes the user’s desktop wallpaper to display this image.

Machine Learning

It’s hard to show exactly what our machine learning verdict engine is doing. After all, for just these two files, this is how many different traits were examined to reach our malware classifications:

File	Features Examined
Locky Word Document	~216,000
Locky Executable	~20,000,000

Obviously, it’s impossible to represent all of these features, but to illustrate roughly how the machine learning in Sky ATP is working, we took the Word Document feature set and reduced it to two dimensions.

On the X axis we have the distance from the separating hyperplane, scaled from 0 to 1, which shows how far the features of the document are from the hyperplane that separates the good from the bad.

If we project these features onto the hyperplane, decompose them into two components, take the first component, and scale it to between 0 and 1, we get the Y axis.

The first component has a strong correlation with the separating hyperplane itself: if we split the good from the bad horizontally where Y=0, it would be almost identical to splitting them horizontally where X=0.
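
For readers who want to see the geometry, the following Python sketch computes the X-axis quantity for a linear separator: the signed distance of each feature vector from the hyperplane, rescaled to the range 0 to 1. It also shows the projection onto the hyperplane that precedes the Y-axis step; the decomposition into components is not reproduced here. The weights, bias, and three-feature toy documents are invented for illustration and have nothing to do with the real model.

```python
import math


def signed_distance(x, w, b):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def project_onto_hyperplane(x, w, b):
    """Remove the component of x along the hyperplane's normal vector."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    distance = signed_distance(x, w, b)
    return [xi - distance * wi / norm for xi, wi in zip(x, w)]


def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    low, high = min(values), max(values)
    span = (high - low) or 1.0  # avoid dividing by zero when all values are equal
    return [(value - low) / span for value in values]


# Toy example: hypothetical weights and three made-up documents.
w, b = [3.0, 4.0, 0.0], -5.0
docs = [[0.0, 0.0, 1.0], [1.0, 1.0, 2.0], [3.0, 4.0, 0.5]]

x_axis = min_max_scale([signed_distance(doc, w, b) for doc in docs])
on_plane = [project_onto_hyperplane(doc, w, b) for doc in docs]
print(x_axis)
```

After projection every point has zero distance to the hyperplane, so whatever variation remains lies within the hyperplane; that remaining variation is what gets decomposed to produce the second axis.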

In this chart, red dots are examples of good documents and blue dots are examples of malicious documents.

The shading represents the probability of a document being malware given our model, so anything that falls in the blue-shaded area has a probability close to 0 of being classified as malware, while documents that fall in the red-shaded area have a probability close to 1 of being classified as malware.

Locky, in yellow, is firmly on the right side of the hyperplane after the algorithm learns how to separate good documents from malicious ones based on the traits of those documents.

It is important to note that the algorithm hasn’t seen Locky ahead of time, but was able to determine that it is malicious based on the characteristics of the document.

Thanks

I want to thank Asher Langton for helping with the traces and behavioral data, and Peter Gael and Roman Sinayev for additional information used in this article.
