: Offensive Cyber Security Software for Embedding Shellcode in Images”

Abstract


INTRODUCTION
Although steganography has been around for thousands of years, it still has plenty of practical uses in cyber security. Simply put, steganography is the practice of concealing data in an object. Throughout history, steganography has played a significant role in transmitting hidden messages, though other use cases exist, which may include copyright protection, private communications, and data exfiltration. This project focuses on utilizing various steganography techniques to embed machine code (shellcode) in images. Shellcode is used in offensive cyber security engagements to execute low-level functions on a computer, which may enable an operator to gain remote access or exploit obscure bugs in the host computer's operating system, among other possibilities. Most shellcode is flagged as malicious by antivirus software and other threat detection systems like firewalls because they often contain malicious payloads. To remain effective, it is important for offensive security operators to develop new ways of evading these defenses.
Evading detection by firewalls and antivirus is important to ensure reliable execution of the shellcode by the victim computer. Steganography with portable network graphic (PNG) images and malicious shellcode provides a new way to avoid detection while ensuring reliable transmission. PNG images are almost never filtered out by network firewalls or flagged by antivirus software because they are an integral part of web browsing and computer applications. Blocking these files would prevent most websites and computer applications from properly functioning. Because PNG images are (currently) immune to nearly all defensive mechanisms and are saved with lossless compression, they were chosen as the cover object for the steganography methods used in this project.
The purpose of this paper is to present the StegArmory software and describe benchmarking and detection testing results for a cyber security practitioner audience. As such, a complete literature review is beyond the scope of this paper but we refer the reader to Hussain et al. [2] for a comprehensive review of spatial stenography. The StegArmory software is open source and available for download [1] from a GitHub repository.

SOFTWARE DEVELOPMENT
StegArmory, the software developed for this project, enables users to embed and extract payloads from images. StegArmory supports two image-based steganography techniques: least significant bit (LSB), and pixel value differencing (PVD). LSB steganography involves manipulating the least significant bit for the red-green-blue (RGB) values that make up every pixel in the cover image, until the entire payload is embedded. In other words, this method embeds one bit of the payload at a time, where each pixel can store 3 bits in total. According to Sinha [3], LSB steganography is the easiest and most adaptable image-based steganography technique. PVD steganography is more involved and was originally identified by Wu and Tsai [5]. The PVD method iterates over the RGB values that make up each pixel and determines how many bits of the payload to embed in these values based on their visual similarity. Lower similarity allows for more bits to be embedded (up to 5 bits at a time), whereas higher similarity allows fewer bits to be embedded. Embedding 1 bit of data in the RGB values results in the least amount of change to a pixel, whereas embedding 5 bits of data results in the largest distortion. Also, if the addition of the bits results in an invalid pixel value, the pixel pair is skipped. RGB values can range from 0 to 255, and values outside this range are discarded. In theory, this method allows for a higher density of data to be embedded, though it is dependent on the composition of the cover image. The PVD method is also much slower to embed data due to all the checks that take place when determining how much data to embed in a pair of pixels. Between the two methods, LSB is more performant, but is easier to detect (as all pixels are modified). Similarly, PVD is less performant, but is harder to detect, and has the potential to be more space efficient compared to the LSB method. Since one of the objectives of this project was to identify the most efficient and least detectable image-based steganography technique, these results will be discussed in greater detail below. However, it is important to introduce two metrics that were used to evaluate the quality of the output (processed) image. The first metric, structural similarity index measure (SSIM) is used to quantify the perceived quality of the image. Ranging from 0-1, higher SSIM values represent better image quality. The second metric, peak signal-to-noise ratio (PSNR) is similar in that it represents the amount of distortion between images. Ranging from 0-100, a lower PSNR value represents a less distorted image. PSNR however, is more sensitive to even the smallest changes in an image, whereas SSIM only accounts for visually detectable changes. Together, these two measures make it possible to evaluate the quality of images processed using either of the steganography methods in StegArmory.
Developing the LSB and PVD steganography methods in StegArmory presented several challenges that were not obvious in the planning stages. Accessing the RGB values that comprise the pixels in an image was difficult due to the encoding used in the PNG file format. Using the OpenCV2 library in Python made it possible to overcome this challenge without abandoning too much of the original codebase. Corner-case bugs that manifested when embedding and extracting data using the PVD method also presented many challenges. These obscure bugs were difficult to identify and remediate because they only appeared when using StegArmory with certain payloads and cover images. These bugs were eventually addressed by referring to other implementations of the PVD method developed after the original publication by Wu and Tsai [5], as discussed by Tseng and Leng [4]. Performance was also an issue with both the LSB and PVD methods. Since the LSB method was developed first, many simple data structures were used to store and manipulate pixel data. Switching these structures to arrays provided by the NumPy library reduced the code complexity and increased performance of the LSB processing method. These structures were later used in the PVD processing method, though the PVD method was still less performant than desired. The PVD algorithm requires multiple iterations over pixel pairs along with a series of conditional checks, which greatly slows down the processing compared to the LSB method. The performance challenges with the PVD method could be solved by implementing a threaded approach to the data processing, though such modifications would have added unnecessary complexity for a security-focused utility. One of the other major challenges faced in the development of StegArmory was staying organized. A virtual Kanban Board was created on Atlassian's Trello to keep track of the backlog of items, in progress items, in test items, and completed items. This helped keep the project organized, and prevented additional, unnecessary features from being added to the project -something that is easy to do on an open-ended project. All major challenges were overcome by the end of the project, which enabled the first stable release of StegArmory to be published to GitHub on December 1 st , 2020 [1].

PERFORMANCE BENCHMARKS
Three images of varying sizes and one malicious payload were used to benchmark the performance of StegArmory. The benchmark code tested the images and payload against the different steganography methods to identify the pros and cons of each steganography method. The test images are displayed in Figure 1 and test results are listed in Figure 2.  The embed and extract times in Figure 2 make it clear that the LSB method is much faster than PVD. In most scenarios, more data can be embedded in the cover image using the PVD method. The SSIM and PSNR metrics for the images introduce the idea that an image can see a lot of change over fewer pixels (PVD method), but still look very similar (SSIM). In most cases, the PVD processed images have a higher (better) SSIM value than images processed with the LSB method, and lower (better) PSNR values. These results indicate that to embed the largest payloads with the lowest detection rates, PVD steganography should be used. To process data the images as quickly as possible with decent output quality, LSB steganography should be used.

THREAT DETECTION
Another objective of this project was to evade detection by the top antivirus (AV) software. To verify that the test payload embedded with StegArmory could evade popular AV engines, the malicious payload was uploaded to VirusTotal for scanning by nearly 70 different AV programs. The results of the scan are displayed in Figure 3.

Image
Method Of the 69 virus engines available for scanning, 53 of them (76%) correctly identified the malicious file. The malicious file was then embedded in each of the cover images using StegArmory, and the processed images were uploaded to VirusTotal to see how well the payload was hidden. The results for the processed images are tabulated in Figure 4.  The detection rate for all processed images was found to be 0%. This indicates that regardless of the method used, no AV software was able to detect the malicious payload embedded in the cover image.

VISUAL COMPARISON
Next, in an effort to visualize the miniscule changes made by embedding the data, the Online Image Difference tool was used to highlight the difference in pixel values between the original and processed images. The green overlay within the images in Figure 5 highlight the pixels that have changed between the original and StegArmory processed image. A lighter shade of green (or absence of color) represents minimal or no change, and a darker shade represents more change.

CONCLUSION
StegArmory provides an effective method for concealing a malicious file in an image and using the image to make the hidden data available inside a protected network. Overall, the benchmark, threat detection and visual similarity overlay results indicate that both the PVD and LSB methods may be useful under various circumstances. The LSB method is much quicker to embed and extract data but resulting image differences are easier to detect. In contrast, the PVD method is slower to process data but it facilitates larger payload capacity and the resulting image differences are more difficult to detect. Finally, as an open source project, StegArmory enables security professionals to duplicate, modify, and redistribute the software for their own purposes.