This article will delve into the concept of entropy and its significance in malware analysis. For this exploration, we'll utilize two tools, PESTUDIO (https://www.winitor.com/) and DiE (http://ntinfo.biz/index.html).
Understanding the importance of entropy in malware analysis
Entropy analysis is crucial in malware investigation, as it enables us to determine rapidly if an executable file (.exe; .dll; etc.) is packed or encrypted.
Why does this matter?
Most legitimate software is neither packed nor encrypted. Based on this fact, we can use EDR tools to pre-screen files for high entropy, which is an initial indicator of possible malware. However, high entropy alone does not prove that a file is malicious.
Reasons for encrypting or packing a program
Legitimate programs might be encrypted to protect intellectual property. Malicious software, however, often uses encryption to frustrate and delay malware analysts, creating a lengthy, intricate process of reverse engineering to understand the malware's operation. We will discuss the significant differences between these two later in this article.
You can find a formal definition of entropy on Wikipedia (https://en.wikipedia.org/wiki/Entropy_(computing)), but it may be somewhat abstract for those unfamiliar with its practical application. Simply put, entropy is a measure of how random the distribution of characters is in a given data set, such as a text or a file. For file contents it is usually expressed in bits per byte, on a scale from 0 (a single repeated value) to 8 (every byte value equally likely). High values (around 6.8 or higher) indicate highly random data, which is typical of packed or encrypted content.
Lyda, R.; Hamrock, J., "Using Entropy Analysis to Find Encrypted and Packed Malware," IEEE Security & Privacy, vol. 5, no. 2, pp. 40-45, March-April 2007.
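To make the definition concrete, here is a minimal Python sketch of byte-level Shannon entropy, H = sum over all byte values b of p(b) * log2(1 / p(b)), where p(b) is the fraction of bytes with value b. It assumes this is the kind of figure the tools report (bits per byte, 0 to 8); the function name shannon_entropy is illustrative only and is not taken from PESTUDIO or DiE.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    # Sum p * log2(1/p) over every byte value that actually occurs.
    return sum(
        (count / total) * math.log2(total / count)
        for count in Counter(data).values()
    )
```

A buffer made of one repeated byte scores 0, while a buffer in which all 256 byte values appear equally often scores 8. Real executables fall somewhere in between.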
Does entropy help with threat hunting?
Absolutely! Enterprise EDR security solutions like Fidelis Endpoint, FireEye HX, or RSA ECAT can filter for executable files with entropy levels of 6.8 or higher. This allows you to quickly narrow down a large number of executable files to the ones that may be malware. Having a database of known good MD5/SHA256 hashes also helps in swiftly eliminating benign files, leaving you to focus on potential threats.
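The same triage idea can be sketched in a few lines of Python. This is not how any of the EDR products above work internally; it is a rough illustration that assumes the samples are already in memory and that a set of known good SHA256 hashes is available. The names triage, samples, and known_good_sha256 are hypothetical.

```python
import hashlib
import math
from collections import Counter

ENTROPY_THRESHOLD = 6.8  # cut-off quoted in this article


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return sum(
        (count / total) * math.log2(total / count)
        for count in Counter(data).values()
    )


def triage(samples, known_good_sha256):
    """Return (name, entropy) pairs that deserve a closer look.

    samples: mapping of file name -> file content (bytes).
    known_good_sha256: set of lowercase hex SHA256 digests to skip.
    """
    suspects = []
    for name, content in samples.items():
        digest = hashlib.sha256(content).hexdigest()
        if digest in known_good_sha256:
            continue  # known good file, no need to look further
        entropy = shannon_entropy(content)
        if entropy >= ENTROPY_THRESHOLD:
            suspects.append((name, entropy))
    # Highest entropy first, so the most suspicious files come up on top.
    return sorted(suspects, key=lambda item: item[1], reverse=True)
```

In practice the mapping could be built by reading files from disk, for example with pathlib.Path.read_bytes(). Anything the function returns is only a candidate for analysis, not a verdict.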
To illustrate the concept, we'll use cmd.exe from c:\windows\system32\ with the MD5 hash of E08FE2DE3DDD22123247D49A11B4F53D.
Using PESTUDIO and DiE, we can quickly check the entropy level of this standard OS command, which is neither packed nor encrypted. As we can see, both programs give consistent values, in line with the Native Executable value (table above).
The entropy values of the individual sections also look normal, showing no sign of encryption.
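For readers who want to reproduce per-section values without a GUI tool, the sketch below walks the PE section table by hand and computes the entropy of each section's raw data. It assumes a well-formed PE file and does no bounds checking; section_entropies is a hypothetical helper, and the numbers it produces may differ slightly from what PESTUDIO or DiE display.

```python
import math
import struct
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return sum(
        (count / total) * math.log2(total / count)
        for count in Counter(data).values()
    )


def section_entropies(pe: bytes):
    """Return a list of (section name, entropy) for a PE image in memory."""
    if pe[:2] != b"MZ":
        raise ValueError("missing MZ header")
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", pe, 0x3C)
    if pe[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # COFF header follows the signature: NumberOfSections at +6,
    # SizeOfOptionalHeader at +20 (both relative to the signature).
    (section_count,) = struct.unpack_from("<H", pe, pe_offset + 6)
    (optional_size,) = struct.unpack_from("<H", pe, pe_offset + 20)
    table = pe_offset + 24 + optional_size
    results = []
    for index in range(section_count):
        entry = table + 40 * index  # each section header is 40 bytes
        name = pe[entry:entry + 8].rstrip(b"\0").decode("ascii", "replace")
        # SizeOfRawData at +16, PointerToRawData at +20 in the entry.
        raw_size, raw_offset = struct.unpack_from("<II", pe, entry + 16)
        raw = pe[raw_offset:raw_offset + raw_size]
        results.append((name, shannon_entropy(raw)))
    return results
```

Calling section_entropies(open(path, "rb").read()) on an unpacked binary should show moderate values for code and data sections, while a packed sample typically shows one or more sections close to 8.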
Without encryption or packing (indicated by a low entropy value), a reverse engineer or threat hunter can quickly inspect the imports section and strings to ascertain the program's true purpose.
Function names, along with their libraries and purposes, are easy to read and investigate further. For example, searching for "NtOpenFile win32 API" leads to the Microsoft documentation for that function.
Of course, this is just one example, but by looking up the other functions in the same way, we can understand the functionality behind the EXE file.
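The strings part of this inspection is easy to approximate in Python. The sketch below pulls runs of printable ASCII characters out of a binary, similar in spirit to a strings utility. It is a simplification: it ignores UTF-16 strings, which are common in Windows binaries, and ascii_strings is a hypothetical helper name.

```python
import re


def ascii_strings(data: bytes, min_length: int = 4) -> list:
    """Return runs of printable ASCII at least min_length characters long."""
    # \x20-\x7e covers the space character through the tilde.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]
```

On an unpacked file the output includes readable API and DLL names such as NtOpenFile. On a packed file most of those names disappear, because the original content is stored in compressed form.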
The goal of packing or encrypting malware is to make imports/strings unreadable to an analyst.
The same file was then packed with the UPX packer (upx -1 -o cmd.ent.exe cmd.exe). Surprisingly, 11 out of 71 AV engines marked the UPX-packed cmd.exe as malicious.
I will look into this in another article 😊, although my guess is that some specific imported functions in the packed executable are triggering these alerts.
As expected, the packed program has higher entropy, which gives us an indicator (IOC) of possible malicious intent. If we look at the function calls, we see only a few, so we have no idea what the program might actually do. The functions that remain are typically the ones responsible for loading functions dynamically and unpacking the executable file.
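The rise in entropy after packing can be reproduced with a small experiment. The sketch below uses zlib from the Python standard library as a stand-in for a packer; it is not UPX, and a real packed executable also contains headers and an uncompressed unpacking stub, so actual values will differ. The sample data, word list, and function names are made up for illustration.

```python
import math
import random
import zlib
from collections import Counter

# Hypothetical vocabulary, used only to build low-entropy sample data.
WORDS = [
    "open", "file", "read", "write", "close", "create", "process", "thread",
    "memory", "handle", "module", "load", "library", "registry", "key", "value",
]


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return sum(
        (count / total) * math.log2(total / count)
        for count in Counter(data).values()
    )


def make_sample(word_count: int = 20000, seed: int = 0) -> bytes:
    """Build readable, repetitive text as a stand-in for unpacked content."""
    rng = random.Random(seed)
    return " ".join(rng.choice(WORDS) for _ in range(word_count)).encode("ascii")


def entropy_before_and_after(data: bytes):
    """Return (entropy of data, entropy of its zlib-compressed form)."""
    return shannon_entropy(data), shannon_entropy(zlib.compress(data))
```

Running entropy_before_and_after(make_sample()) gives a low value for the readable text and a value close to 8 for the compressed bytes. This is the same effect that pushes a packed executable above the 6.8 threshold.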