Introducing the Yara Rule Generator

    A couple of months ago we started to work on a new feature for Joe Sandbox, which we call the Yara Rule Generator. Yara is a well-known pattern matching engine built for the purpose of writing simple malware detection rules:


    Yara's main use is to detect APTs and advanced threats that AV does not detect quickly. A large share of Joe Security's customers use Yara on a daily basis. Because of that, we received many requests to add a feature to Joe Sandbox that generates Yara rules automatically, and we finally decided to take up that challenge.

    Today we release a new free service, which you can find at Yara Rule Generator. It creates Yara rules automatically based on behavior data such as files and memory captured by Joe Sandbox.

    How does the Joe Sandbox Yara Rule Generator work and what kind of rules does it generate? The generator creates three different rules per submitted sample:

    File rules make it possible to search for the submitted sample. Dropped rules are generated from files that the initial sample created or downloaded during dynamic analysis. Memory opcode rules, finally, are generated from memory dumps. File and dropped rules let you scan for the particular sample on the file system. Memory opcode rules, on the other hand, let you find malware in the process memory of a target system (you can specify a process ID as the target when you launch Yara, or use our batch file to scan all processes).

    Furthermore, a rule can be a simple rule or a super rule. Simple rules are specific to the submitted sample and its behavior, so they do not match variants of the same malware. Super rules are generic and are built over a set of uploaded samples and their behavior. Since they only capture common behavior, they often find malware variants:

    To generate rules, the Joe Sandbox Yara Rule Generator extracts different kinds of behavior data such as:

    • PE structure data (e.g. section names)
    • Strings (Unicode and ASCII)
    • Code sequences (e.g. entrypoint)
    • Opcode sequences from HCA (Hybrid Code Analysis)
    All the extracted artifacts are then rated based on knowledge, entropy and location information. After artifact selection, a test rule is generated and its false positive rate is measured against a reference goodware set. Finally, the rule is accepted if the false positive rate is acceptable.
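
    The rating and false positive check described above can be illustrated with a minimal sketch. It is not the generator's actual implementation: the function names, the entropy threshold and the rule that a candidate must hit no goodware file at all are assumptions made for this example, and the "knowledge" and "location" ratings are left out.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 for empty input)."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def false_positive_rate(pattern: bytes, goodware: list[bytes]) -> float:
    """Fraction of goodware blobs that contain the candidate pattern."""
    if not goodware:
        return 0.0
    hits = sum(1 for blob in goodware if pattern in blob)
    return hits / len(goodware)


def select_artifacts(candidates, goodware, min_entropy=2.0, max_fp_rate=0.0):
    """Keep candidates that are not trivially repetitive and stay within the
    allowed false positive rate. Both thresholds are hypothetical."""
    return [
        c for c in candidates
        if shannon_entropy(c) >= min_entropy
        and false_positive_rate(c, goodware) <= max_fp_rate
    ]


# Hypothetical candidates: a padding run, a common import name, a rare string.
candidates = [b"AAAAAAAA", b"kernel32.dll", b"x9$Qz!7pLm"]
goodware = [b"MZ...kernel32.dll...", b"hello world"]
selected = select_artifacts(candidates, goodware)
```

    In this toy run only the rare string survives: the padding run has zero entropy and the import name also occurs in the goodware set.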

    For super rules Joe Sandbox Yara Rule Generator uses an efficient clustering algorithm to find common opcode sequences.
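
    The clustering algorithm itself is not described here. As a much simpler stand-in, the following sketch finds the opcode n-grams shared by all samples of a set using plain set intersection. The mnemonics, the n-gram length and the function names are assumptions for illustration only.

```python
def ngrams(opcodes, n):
    """Return the set of all length-n opcode windows of one sample."""
    return {tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)}


def common_sequences(samples, n=3):
    """Return the opcode n-grams that occur in every sample."""
    if not samples:
        return set()
    shared = ngrams(samples[0], n)
    for opcodes in samples[1:]:
        shared &= ngrams(opcodes, n)
    return shared


# Three hypothetical samples of one family, as mnemonic lists.
family = [
    ["push", "mov", "xor", "call", "ret"],
    ["nop", "mov", "xor", "call", "jmp"],
    ["mov", "xor", "call", "pop"],
]
shared = common_sequences(family)
```

    A sequence that survives the intersection is a candidate for a super rule, because it is present in every known sample and may therefore also be present in an unseen variant.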

    The results look very promising. To test super rules, we generated rules from malware family sets. We took three samples out of a set and generated super rules. We then infected a test system with a fourth sample of the same family and searched for it with our rules:

    Of course, the file and dropped rules work well too:

    However, please note that the Yara Rule Generator is no silver bullet. Creating simple and super rules is tricky and far from perfect. During the development of version 1.0.0 we spotted many areas for improvement. All the rules are well commented and documented, so it is simple to extend or change them.

    The Yara Rule Generator has already been deeply integrated into the Joe Sandbox platform and will be shipped with the next major release.

    Happy Rule Creation!

    Update 1:

    We were inspired by yaraGen from Florian Roth as well as

    Happy New Year!

          The Joe Security team wishes you success, satisfaction and many pleasant moments in 2015!

    New Sandbox Evasion Tricks Spotted with Joe Sandbox 10.5

    Recently we came across an interesting sample equipped with new tricks to evade sandboxes and other dynamic analysis systems:

    In pseudo code:

    The sample sleeps until both the mouse and the foreground window change. Since most malware analysis systems only simulate mouse changes, they fail to analyze the real malicious payload. With the Cookbook technology of Joe Security one can easily simulate any activities:
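
    The wake-up condition can be modelled from the analysis side with a small sketch. It replays a list of recorded (cursor position, foreground window) observations and reports when the sample would stop sleeping. Comparing against the first observation is an assumption, since the sample's exact comparison logic is not reproduced here, and all names are hypothetical.

```python
def wake_index(observations):
    """Return the index of the first observation in which both the cursor
    position and the foreground window differ from the initial observation,
    or None if that never happens."""
    if not observations:
        return None
    start_cursor, start_window = observations[0]
    for i, (cursor, window) in enumerate(observations[1:], start=1):
        if cursor != start_cursor and window != start_window:
            return i
    return None


# An analysis system that only moves the mouse never triggers the payload.
mouse_only = [((0, 0), 1), ((5, 5), 1), ((9, 9), 1)]

# Moving the mouse and switching the foreground window does.
mouse_and_window = [((0, 0), 1), ((5, 5), 1), ((5, 5), 2)]
```

    With these inputs, `wake_index(mouse_only)` returns `None`, while `wake_index(mouse_and_window)` returns `2`.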

    However this is not enough. The sample includes an additional evasion trick:

    Basically, the sample queries the disk with IOCTL_DISK_GET_DRIVE_GEOMETRY_EX. The returned structure contains information such as the media type, the sectors per track and the number of cylinders of the hard disk. After the query, the number of cylinders is compared to the value 5000. If there are fewer than 5000 cylinders, the sample simply terminates. Since Joe Sandbox runs on any device, including virtual, simulated and native systems, one can quickly analyze the malware on a real system:
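
    To get a feeling for what the threshold of 5000 cylinders means, the disk size can be computed from the geometry fields. The sketch below assumes the commonly reported logical geometry of 255 tracks per cylinder, 63 sectors per track and 512 bytes per sector. These values are an assumption and are not taken from the sample or the report.

```python
def disk_size_bytes(cylinders, tracks_per_cylinder=255,
                    sectors_per_track=63, bytes_per_sector=512):
    """Disk capacity implied by a cylinder/track/sector geometry."""
    return cylinders * tracks_per_cylinder * sectors_per_track * bytes_per_sector


def passes_cylinder_check(cylinders, threshold=5000):
    """Mirror of the check described above: fewer cylinders than the
    threshold means the sample terminates."""
    return cylinders >= threshold


threshold_size = disk_size_bytes(5000)      # 41,126,400,000 bytes
threshold_size_gb = threshold_size / 10**9  # about 41.1 GB
```

    Under that assumption, any disk smaller than roughly 41 GB fails the check, which is a plausible size for many default virtual machine images.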

    Finding a DGA in less than one Minute

    Recently, we stumbled upon a malware sample (MD5: 177b75910ae8c0091bafef4950c0b224) that obviously employs a domain generation algorithm (DGA). We analyzed the sample with Joe Sandbox 10.5 which will be released soon.

    As the signature overview highlights, Joe Sandbox has detected that the malware generates random DNS queries:

    Massive injections and system behavior have been detected as well:

    Also the network behavior is quite extensive:

    One of the cool new features of Joe Sandbox 10 is a context-based search integrated into the behavior analysis reports. With it you can search any data Joe Sandbox has captured:

    In order to find the DGA, search for the term "DNSQuery":

    It seems explorer.exe is doing some DNS queries. Clicking on the search hits lets one navigate easily to the corresponding data:

    As the excerpt shows, DnsQuery_A has been called 244 times, which matches the extensive network behavior. By clicking on the source address one can jump to the function where this DnsQuery API has been called:

    The instructions before the DnsQuery API call show that the domain name is generated by the function 12AE200. With the help of the IDA Bridge plugin, one can easily load memory dumps extracted by Joe Sandbox:

    And pull in dynamic behavior data:

    Full Analysis Report available at (use Firefox to open it):

    Analysis of Code4HK with Joe Sandbox Mobile

    As the media and several tech companies have already outlined, a fake smartphone app is being used to remotely monitor pro-democracy protesters in Hong Kong. We came across the corresponding malware via our APK Analyzer (

    A full analysis can be downloaded here:

    The Joe Sandbox Mobile analysis is very nice and shows all spying and control payloads of the trojan. Rather than explaining all the details, we just add some interesting report excerpts here:

    Generic Keylogger Detection with Joe Sandbox X

    In our last blog post we demonstrated some of the features of our new product Joe Sandbox X by analyzing the recent malware "xslcmd" (MD5: 60242ad3e1b6c4d417d4dfeb8fb464a1). We showed in detail how the malware installs itself and that one of its core payloads is a keylogger.

    In this post, two new cool features are presented. In combination they allow the payload detection of the xslcmd malware:

    As the signature summary outlines, we have added a signature to detect keyloggers generically. Let's have a look at how this works.
    Besides the installer (PID 236, sample-cmd) and the launch agent process (PID 241, clipboardd), the startup section of the report also lists the process (PID 253):

    This is actually a process that was started by a Cookbook. As you might already know, Cookbooks are a powerful technology that enables the customization of the analysis procedure in order to influence and change the malware's behaviour. Here is the Cookbook used for the current analysis:

    After loading the sample with _JBLoadProvidedBin, the text editor is opened with _JBRunCmd. Then the Cookbook simulates some low-level keyboard strokes via _JBSimulateKeyboardStrokes. In this case, the keyboard numbers/letters "0deconinput0" are typed in. The screenshot reveals the launched text editor and the simulated user input:

    By having a closer look at the launch agent process clipboardd (PID 241) running in the background, it can be observed that the simulated keyboard strokes are written to a log file residing in the user's home directory:

    So, to generically detect keyloggers, Joe Sandbox X uses a Cookbook to simulate keystrokes and then uses behaviour signatures to look for the typed key sequence in data written to files. If such a sequence is found, it is obvious that the malware captures and stores keys:
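
    The idea behind such a signature can be sketched in a few lines. The event format (path, written data) and the file paths below are hypothetical and do not reflect the real signature engine or the paths in the report. The sketch concatenates all writes per file, because a keylogger may flush keys in several small chunks.

```python
from collections import defaultdict


def find_keylog_files(write_events, typed_sequence):
    """Return the sorted paths of files whose accumulated written data
    contains the simulated key sequence."""
    written = defaultdict(str)
    for path, data in write_events:
        written[path] += data
    return sorted(path for path, content in written.items()
                  if typed_sequence in content)


# Hypothetical file write events captured during analysis.
events = [
    ("/tmp/example/keys.log", "##Terminal##\n0dec"),
    ("/tmp/example/other.txt", "hello"),
    ("/tmp/example/keys.log", "oninput0"),
]
suspicious = find_keylog_files(events, "0deconinput0")
```

    Using an unusual marker such as "0deconinput0" keeps the chance low that a benign file contains the same sequence by accident.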

    We are aware that the signature can be evaded. However, due to the agility of Joe Sandbox X it is easy to quickly spot and detect new behaviours. The detection of keyloggers is just one of many use cases of _JB Cookbook commands. _JBRunCmd allows the analyst to execute arbitrary (shell) commands, which often helps to combat evasive malware.

    Full analysis report for xslcmd:

    Joe Sandbox X: Automated Dynamic Malware Analysis on Mac OS X

    We are proud to present today Joe Sandbox X - the first automated dynamic malware analysis system for Mac OS X. As with all of our products, Joe Sandbox X executes files in a controlled environment and monitors the behavior of applications for suspicious activities. All activities are compiled into a comprehensive and extensive analysis report.

    There is currently only a moderate number of known malware samples targeting Mac OS X systems. However, we at Joe Security think that the number of unknown Mac threats is high, and since many companies are moving to the Mac world, Mac OS X will increasingly become a hot target.

    To show some of the features of Joe Sandbox X, we have analyzed a recent malware named "xslcmd" (MD5: 60242ad3e1b6c4d417d4dfeb8fb464a1) that was detected by FireEye about two weeks ago (VirusTotal score 0/55 at detection time, 12/55 at the date of this blog post).

    Joe Sandbox X uses the same report format as Joe Sandbox Desktop and Joe Sandbox Mobile, so it is likely that you are familiar with the report structure. Let us walk through the report. From the behavior analysis report we see that the sample has opened a terminal window:

    Currently, Joe Sandbox X includes about 100 behavior signatures which rate and classify the behavior. The signature summary already gives a nice overview concerning some of the key functionalities of the malware:

    Static analysis, which includes a Mach-O parser, shows other interesting facts:

    The sample is able to run on three different architectures (PowerPC, i386 and x86_64). Usually, Mach-O files only run on one or two architectures. It is likely that three architectures are supported to extend the number of potential target systems.
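
    The architecture list of such a multi-architecture (fat) Mach-O file can be read directly from its header. The following sketch is not the parser used by Joe Sandbox X. It only handles the big-endian 32-bit fat header with the magic CAFEBABE and the three CPU types mentioned above. Note that Java class files start with the same magic bytes, so a real parser needs additional sanity checks.

```python
import struct

FAT_MAGIC = 0xCAFEBABE
# CPU type constants as defined in the Mach-O headers.
CPU_NAMES = {18: "PowerPC", 7: "i386", 0x01000007: "x86_64"}


def fat_architectures(data: bytes):
    """Return the architecture names listed in a fat Mach-O header,
    or an empty list if the data does not start with the fat magic."""
    if len(data) < 8:
        return []
    magic, count = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        return []
    names = []
    for i in range(count):
        offset = 8 + i * 20  # each fat_arch entry holds five 32-bit fields
        if offset + 20 > len(data):
            break  # truncated header, stop instead of reading past the end
        cputype, _subtype, _file_offset, _size, _align = struct.unpack_from(
            ">5I", data, offset)
        names.append(CPU_NAMES.get(cputype, f"unknown({cputype})"))
    return names


# Synthetic header with three entries, built only for this example.
header = struct.pack(">II", FAT_MAGIC, 3)
for cpu in (18, 7, 0x01000007):
    header += struct.pack(">5I", cpu, 0, 0, 0, 0)
architectures = fat_architectures(header)
```

    For the synthetic header, the result is the list PowerPC, i386 and x86_64, which corresponds to the three architectures named above.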

    Next, the comprehensive process overview shows the child and overlaid (forked then execve'd) processes of the initial sample as well as other processes started during analysis time:

    After startup the sample spawns the launchctl command which is often used to load a launch agent or launch daemon (service processes). The sample also starts itself again (see PID 275). Looking at the behavior of this process reveals its purpose:

    As the report excerpt outlines, the sample deletes itself. Looking further down the startup section shows that a launch agent process named "clipboardd" is started:

    Clipboardd starts the command "sw_vers" twice. Sw_vers prints version information about the operating system:

    Looking at the "sample-xslcmd" process (see PID 271) in more detail reveals that one of the created files is a plist file in the user's launch agent directory. This file is required when creating a launch agent and controls some of the launch agent's settings.

    The content of the plist file shows that the launch agent is not started on system boot (XML tag false), while the other setting ensures that the launch agent keeps running. The dropped "clipboardd" file is also executable (check out the Mach-O magic header bytes CAFEBABE):
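
    Such a plist file can be inspected with a few lines of code. The sketch below assumes that the two settings mentioned above correspond to launchd's RunAtLoad and KeepAlive keys. The plist content, the label and the program path are made up for this example and are not copied from the report.

```python
import plistlib

# Hypothetical launch agent definition, not the file dropped by the sample.
EXAMPLE_PLIST = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.example.agent</string>
    <key>ProgramArguments</key>
    <array>
        <string>/path/to/agent</string>
    </array>
    <key>RunAtLoad</key>
    <false/>
    <key>KeepAlive</key>
    <true/>
</dict>
</plist>
"""


def summarize_launch_agent(plist_bytes):
    """Extract the settings that matter for persistence from a launchd plist."""
    settings = plistlib.loads(plist_bytes)
    arguments = settings.get("ProgramArguments") or [None]
    return {
        "label": settings.get("Label"),
        "program": arguments[0],
        "run_at_load": bool(settings.get("RunAtLoad", False)),
        "keep_alive": bool(settings.get("KeepAlive", False)),
    }


summary = summarize_launch_agent(EXAMPLE_PLIST)
```

    A summary with run_at_load set to False and keep_alive set to True matches the combination described above: the agent is not started automatically but is kept running once it has been loaded.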

    The launch agent is explicitly started with "launchctl load":

    Looking at the "clipboardd" process (see PID 274) in more detail shows that, besides other activities, two directories are created: ".fontset" and "BackupData". ".fontset" is hidden on Mac OS X systems due to the leading dot. In the "BackupData" directory a log file is created containing the term "##Terminal##".

    This is likely a log file of a keylogger (due to the suspicious name). Joe Sandbox X also lets the analyst interact with the analysis machine, so we were able to run the sample again, simulate some user behavior and check the file content in order to verify our assumption:

    The service also tries to open the three files "pxupdate.ini", "chkdiska.dat", and "chkdiskc.dat". These files did not exist so the open calls were not successful.

    It is likely that these are configuration files and that the sample checks if the files exist to prevent reinfection. Finally, two hidden files named ".got" are created in the "Desktop" and "Documents" directory of the user.

    As some of the signatures already detected, the sample is actively communicating with the Internet. The included world map shows at a glance which countries have been contacted:

    When looking at the HTTP traffic, one can see the HTTP POST requests performed on a non-standard port to the fake host "":

    As this blog post outlines, Joe Sandbox X makes it possible to quickly understand and detect threats that target Mac systems. We are continuing our development to increase the number of signatures and to capture more behavior. Joe Sandbox X will also be available in Joe Sandbox Cloud, Joe Sandbox Complete, and Joe Sandbox Ultimate.

    Full analysis report for xslcmd: