Adaptive Internet Simulation

    Nearly all malware today uses the Internet for communication, often to download second-stage malware, to register with its command and control server, or to spread and propagate. By capturing and analyzing suspicious Internet traffic during execution, a malware analysis system can detect various interesting artifacts, such as domains or IPs used to host malicious content. However, allowing Internet access for a malware analysis sandbox also has a big drawback: fingerprinting.

    How does fingerprinting work? Check out the chart below:

    The bad guys start by submitting an info collector program to the target sandboxes. The info collector enumerates all system settings, hardware IDs, serial numbers, tokens, etc., and reports all the data back via the Internet.
    A new malware sample is then created which embeds the previously collected unique hardware IDs and checks for them early during its execution. If some of the IDs match, the sample will not exhibit any malicious behavior. This approach is very simple to implement and does not require many resources. Even if the bad guys do not have direct access to a particular sandbox, the fingerprinting can still succeed due to the global sharing of malware samples.
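
    To make the mechanism concrete for defenders, the check described above can be sketched in a few lines. This is only an illustration: the ID values and the function name are invented, and real samples collect and compare many more attributes.

```python
# Hypothetical values: these IDs are invented for illustration only.
KNOWN_SANDBOX_IDS = {
    "disk-serial:VB1234-ABCD",
    "mac:08:00:27:aa:bb:cc",
}


def looks_like_known_sandbox(local_ids, known_ids=KNOWN_SANDBOX_IDS):
    """Return True if any locally observed ID was previously collected."""
    return not known_ids.isdisjoint(local_ids)


# A machine whose disk serial was harvested earlier is recognized ...
print(looks_like_known_sandbox({"disk-serial:VB1234-ABCD", "mac:00:11:22:33:44:55"}))
# ... while a machine with unknown IDs is not.
print(looks_like_known_sandbox({"disk-serial:XYZ", "mac:00:11:22:33:44:55"}))
```

    A single matching value is enough for the sample to stay dormant, which is why one leaked ID can already defeat an analysis.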

    Fingerprinting is widely used today and poses a big problem for malware analysis systems. There was even a public online tracking page showing IP addresses, user names, etc., of online sandboxes:

    We also found various info collectors (e.g. 26b79a7370720b0822bb786043b86448) over the last months:


    Adaptive Internet Simulation

    To solve the problem of fingerprinting, sandbox vendors have recently introduced randomization, which generates random values for all IDs and serial numbers. However, randomization has several shortcomings. First, not all serials and IDs can be randomized. Many IDs are used by the license verification of the operating system, and changing them will trigger the verification check. Next, the number of unique IDs, names and settings is enormous, and various tokens influence the system. Finally, randomization today is done during the installation of the sandbox. This means that a system is randomized once and then stays the same for months: more than enough time to do fingerprinting.

    We have come up with a different idea to solve the problem which we call Adaptive Internet Simulation (AIS). AIS is a full network proxy which sits between the sandbox and the Internet:

    The network proxy has two main goals:

    1. Prevent leaks such as hardware ids and serials.
    2. Simulate where appropriate. 
    To achieve both goals, we developed a port-independent protocol identification engine, a flexible configuration syntax to define what traffic is considered a leak, and a generic simulation framework.
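
    The leak-prevention goal can be pictured with a small sketch. It is not the AIS configuration syntax, which is not shown in this post; the sensitive values, function names and the "block"/"forward" decisions are assumptions made purely for illustration.

```python
# Hypothetical list of values that must never leave the sandbox.
SENSITIVE_VALUES = [b"VB1234-ABCD", b"SANDBOX-PC-01"]


def is_leak(payload: bytes, sensitive_values=SENSITIVE_VALUES) -> bool:
    """Return True if an outgoing payload contains a sensitive value."""
    lowered = payload.lower()
    return any(value.lower() in lowered for value in sensitive_values)


def handle_outgoing(payload: bytes) -> str:
    """Decide what a proxy could do with an outgoing payload."""
    return "block" if is_leak(payload) else "forward"


print(handle_outgoing(b"GET /?id=vb1234-abcd HTTP/1.1"))
print(handle_outgoing(b"GET /index.html HTTP/1.1"))
```

    Because the check looks at the payload rather than the port, it fits the port-independent approach described above; a real proxy would additionally have to parse the identified protocol and deal with encoded or encrypted data.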

    An info collector running on Joe Sandbox with AIS will no longer be able to leak the collected sensitive data, yet it still runs as if it were connected directly to the Internet. AIS is therefore nearly transparent.

    Extensive tests have shown that AIS works very well without any impact on the behavior of the malware.

    Besides preventing fingerprinting, AIS has a number of additional nice features:

    • Block "noise traffic" from the OS, such as updates or notifications.
    • Simulation in cases where an IP or a domain is not available anymore.
    • Simulation in cases where resources (e.g. files) are no longer available.
    AIS is currently available as a plugin for various Joe Sandbox products.

    Example reports (sample source Threatwave):

    Nymaim - evading Sandboxes with API hammering

    Recently we were investigating an interesting piece of malware that was generating quite a huge workload in the sandboxed environment. To introduce proper countermeasures we had to fully reverse it. It turned out that the file belongs to the Nymaim family, which has been active since at least 2013 [1]. This particular file consists of a few layers: the first one is meant to slow down or time out various sandboxes, replicators and emulators, the last layer hinders static analysis and debugging, and the layers in the middle are just responsible for decompression and decryption.

    There are many different methods of introducing slowdowns in sandboxed environments; some of them are more effective, some are not effective at all. Nymaim uses Win32 API hammering, which means that it constantly calls benign Win32 API functions in a loop. This is clearly visible on the WinMain function graph (glued side by side, since the function is way too big):

    Readers familiar with IDA function graphs should notice the unusual length of the code. There are a lot of loops and a lot of various API calls:

    API Name    Number of calls

    Without any monitoring tools, execution of WinMain takes around 46 seconds. Now, if any of the above APIs triggers an event that is (or should be) monitored in the sandboxed environment, you can imagine what would happen to those 46 seconds. So far we have not seen any sandbox able to analyze the malware successfully. During the analysis we were able to identify one unwelcome side effect of the USER32.dll.EnumDisplaySettingsA function call: it loads and unloads the vga.dll kernel library during the call:

    ChildEBP RetAddr
    890136f8 828563ef nt!DbgLoadImageSymbols+0x47
    89013714 82a05b21 nt!DbgLoadImageSymbolsUnicode+0x23
    89013750 82a02531 nt!MiDriverLoadSucceeded+0x183
    890137d0 82a8ccf8 nt!MmLoadSystemImage+0x720
    8901391c 8287a8c6 nt!NtSetSystemInformation+0x967
    8901391c 82879969 nt!KiSystemServicePostCall
    890139a0 907a895a nt!ZwSetSystemInformation+0x11
    89013b1c 907a858b win32k!ldevLoadImage+0x215
    89013b54 907a2b9a win32k!ldevLoadDriver+0x78
    89013b70 907aaec4 win32k!ldevGetDriverModes+0x1c
    89013b9c 90806eb6 win32k!DrvBuildDevmodeList+0x134
    89013bfc 90806aea win32k!DrvEnumDisplaySettings+0x3b9
    89013c1c 8287a8c6 win32k!NtUserEnumDisplaySettings+0x27
    89013c1c 772270f4 nt!KiSystemServicePostCall
    001241d0 762f13c4 ntdll!KiFastSystemCallRet
    001241d4 763065c1 USER32!NtUserEnumDisplaySettings+0xc
    00124214 76306502 USER32!EnumDisplaySettingsExA+0xbc
    0012422c 004023d4 USER32!EnumDisplaySettingsA+0x23
    0012fef8 0040698f 885+0x23d4
    0012ff88 760eee1c 885+0x698f
    0012ff94 772437eb kernel32!BaseThreadInitThunk+0xe
    0012ffd4 772437be ntdll!__RtlUserThreadStart+0x70
    0012ffec 00000000 ntdll!_RtlUserThreadStart+0x1b

    This of course triggers driver analysis (in case the sandbox offers it)... 7652 times, and the only thing that separates a good analysis from total failure is proper filtering of the collected data.
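
    One simple form of such filtering is to collapse identical events into a single entry with a counter. The sketch below assumes events are plain tuples; the event names and the "sample.exe" entry are hypothetical, and this is not a description of how Joe Sandbox filters its data.

```python
from collections import Counter


def collapse_events(events):
    """Collapse identical events into (event, count) pairs, keeping first-seen order."""
    counts = Counter(events)
    return list(counts.items())


# 7652 identical driver loads, as in the Nymaim sample, plus one other event.
events = [("driver_load", "vga.dll")] * 7652 + [("process_start", "sample.exe")]
for event, count in collapse_events(events):
    print(event, count)
```

    With this kind of aggregation the report contains two entries instead of 7653, while the repetition count is preserved as a useful indicator of hammering.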

    The second stage of the malware is executed through the callback from EnumResourceTypesA. It decompresses and decrypts the final stage, which is heavily obfuscated. The entry point of the final stage suggests that it can be run from within both x86 and x64 processes. It uses a simple trick to detect the bitness of the process [2] and it contains both x86 and x64 payloads.

    After deobfuscation we were able to identify another simple check to evade analysis: the sample uses the GetSystemTime API to verify an expiration date and does not execute after the 8th of April 2016. We can easily handle such cases in Joe Sandbox through our Cookbooks system [3]; the whole operation boils down to adding the line below to the Cookbook:

    _SetDate(06, 04, 2016)

    It's always good to re-run the analysis with different dates to verify whether the sample expires or whether it only activates in the near future. Usually it's safe to use the day the sample was received as the initial date; the timestamp from the PE header should work as well (in this case it is GMT Tue Apr 05 22:31:47 2016).
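
    The re-run strategy can be sketched as follows. The helper names and the chosen day offsets are hypothetical. The argument order of _SetDate (day, month, year) is inferred from the single Cookbook line above and is an assumption, as is the choice that the expiry day itself still counts as valid.

```python
from datetime import date, timedelta

EXPIRY = date(2016, 4, 8)  # expiration date found in the sample


def sample_would_run(analysis_date, expiry=EXPIRY):
    """Model of the expiry check (assumption: the expiry day itself still runs)."""
    return analysis_date <= expiry


def candidate_dates(received, offsets=(0, -7, 7, 30)):
    """Dates worth trying, as day offsets from the day the sample was received."""
    return [received + timedelta(days=d) for d in offsets]


def cookbook_line(d):
    """Format a date like the Cookbook line above (assumed order: day, month, year)."""
    return f"_SetDate({d.day:02d}, {d.month:02d}, {d.year})"


received = date(2016, 4, 5)  # date taken from the PE header timestamp
for d in candidate_dates(received):
    print(cookbook_line(d), "runs" if sample_would_run(d) else "expired")
```

    Trying dates both before and after the reference date covers samples that expire as well as samples that only activate later.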

    Last but not least, link to the full Joe Sandbox report (click the picture to open):

     Joe Sandbox Report

    Nymaim proves that it is very important to have a flexible malware analysis system which enables analysts to easily change settings on the analysis machine. Joe Sandbox features an extensive technology called Cookbooks. Cookbooks make it possible to completely define the analysis and to change OS settings, simulate user behavior and more. Furthermore, Joe Sandbox analyzes malware on physical machines (bare metal), defeating any VM evasions.


    HydraCrypt the badass Ransomware

    2015 was definitely the year of ransomware and it seems 2016 is no different. Yesterday we came across a new ransomware called HydraCrypt:

    Hydra is no different from other ransomware like Cryptowall or TeslaCrypt, with one big exception. Ransomware seen so far encrypts your documents (PDF and Office) and pictures. Hydra also encrypts your application settings and databases. In detail, Hydra encrypts a huge number of additional file types:

     .3dm .3ds .3g2 .3gp .7z .ab4 .accdb .accde .accdr .accdt .ach .act .adb .ads .ai .ait .al .apj .arw .asf .asm .asp .asx .avi .back .bank .bay .bgt .bik .bkf .bk .blend .bpw .c .cdb .cdf .cdr .cdx .ce1 .ce2 .cer .cfp .cgm .class .cls .cmt .cnv .cpi .cpp .cr2 .craw .crt .crw .cs .csh .csl .csv .dac .db .db3 .dbf .dbr .dbs .c2 .dcr .dcs .dcx .ddd .ddoc .dds .der .des .design .dgc .djvu .dng .doc .docm .docx .dot .dotm .dotx .drf .drw .dtd .dwg .dxb .dxf .dxg .ebd .edb .eml .eps .er .exf .fdb .ffd .fff .fh .fhd .fla .flac .flv .fm .fp7 .fpx .fxg .gdb .gray .grey .grw .gry .h .hbk .hpp .ibd .idx .iif .indd .java .jpe .jpeg .jpg .kdbx .kdc .ke .laccdb .lua .m .m4v .maf .mam .maq .mar .maw .max .mdb .mdc .mde .mdf .mdt .mef .mfw .mmw .mos .mov .mp3 .mp4 .mpg .mpp .mrw .mso .myd .ndd .nef .nk2 .nrw .ns2 .s3 .ns4 .nsd .nsf .nsg .nsh .nwb .nx1 .nx2 .nyf .obj .odb .odc . .odf .odg .odm .odp .ods .odt .oil .one .orf .otg .oth .otp .ots .ott .p12 .p7b .p7c .pages .pas .at .pbo .pcd .pct .pdd .pdf .pef .pem .pfx .php .pip .pl .plc .pot .potm .potx .ppam .pps .ppsm .ppsx .ppt .pptm .pptx .prf .ps .psafe3 .psd .pspimage .ptx .pu .puz .py .qba .qbw .r3d .raf .rar .rat .raw .rdb .rm .rtf .rwz .sas7bdat .say .sd0 .sda .sdf .snp .sql .sr2 .srf .srt .srw .st4 .st5 .st6 .st7 .st8 .stc .std .st .stw .stx .svg .swf .sxc .sxd .sxg .sxi .sxm .sxw .tex .tga .thm .txt .vob .vsd .vsx .vtx .wav .wb2 .wdb .wll .wmv .wpd .wps .x11 .x3f .xla .xlam .xlb .xlc .xll .lm .xlr .xls .xlsb .xlsm .xlsx .xlt .xltm .xltx .m4a .wma .d3dbsp .xlw .xpp .xsn .yuv .zip .zip .sie .unrec .scan .sum .t13 .t12 .qdf .tax .pkpass .bc6 .bc7 .sdn .sidd .mddata .itl .itdb .icxs .hvpl .hplg .hkdb .mdbackup .syncdb .gho .cas .map .wmo .itm .sb .fos .mov .vdf .ztmp .sis .sid .ncf .menu .layout .dmp .blb .esm .vcf .vtf .dazip .fpk .mlx .kf .iwd .vpk .tor .psk .rim .w3x .fsh .ntl .arch00 .lvl .snx .cfr .ff .vpp_pc .lrf .m2 .mcmeta .vfs0 .mpqge .kdb .db0 .dba .rfl .hkx .bar .upk .das .iwi .litemod .asset .forge .ltx .bsa 
.apk .re4 .lbf .slm .epk .rgss3a .pak .big .wallet .wotreplay .xxx .desc .m3u .js .css .rb .png .w2 .rwl .mrwref .3fr .xf .pst .dx .tiff .bd .tar .gz .mkv .bmp .dot .xml .xmlx .dat .html .gif .mcl .ini .mte .cfg .mp3 .qbi .qbr .cnt .v30 .qbo .lgb .qwc .qbp .af .qby .1pa .qpd .set .nd .rtp .qbwin .log .qbbackup .tmp .temp1234 .qbt .qbsdk .syncmanagerlogger .ecml .qsm .qss .qst .fx0 .fx1 .mx0 .fpx .fxr .fim .3DM .3DS .3G2 .3GP .7Z .AB4 .ACCDB .ACCDE .ACCDR .ACCDT .ACH .ACT .ADB .ADS .AI .AIT .AL .APJ .ARW .ASF .ASM .ASP .ASX .AVI .BACK .BANK .BAY .BGT .BIK .BKF .BK .BLEND .BPW .C .CDB .CDF .CDR .CDX .CE1 .CE2 .CER .CFP .CGM .CLASS .CLS .CMT .CNV .CPI .CPP .CR2 .CRAW .CRT .CRW .CS .CSH .CSL .CSV .DAC .DB .DB3 .DBF .DBR .DBS .C2 .DCR .DCS .DCX .DDD .DDOC .DDS .DER .DES .DESIGN .DGC .DJVU .DNG .DOC .DOCM .DOCX .DOT .DOTM .DOTX .DRF .DRW .DTD .DWG .DXB .DXF .DXG .EBD .EDB .EML .EPS .ER .EXF .FDB .FFD .FFF .FH .FHD .FLA .FLAC .FLV .FM .FP7 .FPX .FXG .GDB .GRAY .GREY .GRW .GRY .H .HBK .HPP .IBD .IDX .IIF .INDD .JAVA .JPE .JPEG .JPG .KDBX .KDC .KE .LACCDB .LUA .M .M4V .MAF .MAM .MAQ .MAR .MAW .MAX .MDB .MDC .MDE .MDF .MDT .MEF .MFW .MMW .MOS .MOV .MP3 .MP4 .MPG .MPP .MRW .MSO .MYD .NDD .NEF .NK2 .NRW .NS2 .S3 .NS4 .NSD .NSF .NSG .NSH .NWB .NX1 .NX2 .NYF .OBJ .ODB .ODC . 
.ODF .ODG .ODM .ODP .ODS .ODT .OIL .ONE .ORF .OTG .OTH .OTP .OTS .OTT .P12 .P7B .P7C .PAGES .PAS .AT .PBO .PCD .PCT .PDD .PDF .PEF .PEM .PFX .PHP .PIP .PL .PLC .POT .POTM .POTX .PPAM .PPS .PPSM .PPSX .PPT .PPTM .PPTX .PRF .PS .PSAFE3 .PSD .PSPIMAGE .PTX .PU .PUZ .PY .QBA .QBW .R3D .RAF .RAR .RAT .RAW .RDB .RM .RTF .RWZ .SAS7BDAT .SAY .SD0 .SDA .SDF .SNP .SQL .SR2 .SRF .SRT .SRW .ST4 .ST5 .ST6 .ST7 .ST8 .STC .STD .ST .STW .STX .SVG .SWF .SXC .SXD .SXG .SXI .SXM .SXW .TEX .TGA .THM .TXT .VOB .VSD .VSX .VTX .WAV .WB2 .WDB .WLL .WMV .WPD .WPS .X11 .X3F .XLA .XLAM .XLB .XLC .XLL .LM .XLR .XLS .XLSB .XLSM .XLSX .XLT .XLTM .XLTX .M4A .WMA .D3DBSP .XLW .XPP .XSN .YUV .ZIP .ZIP .SIE .UNREC .SCAN .SUM .T13 .T12 .QDF .TAX .PKPASS .BC6 .BC7 .SDN .SIDD .MDDATA .ITL .ITDB .ICXS .HVPL .HPLG .HKDB .MDBACKUP .SYNCDB .GHO .CAS .MAP .WMO .ITM .SB .FOS .MOV .VDF .ZTMP .SIS .SID .NCF .MENU .LAYOUT .DMP .BLB .ESM .VCF .VTF .DAZIP .FPK .MLX .KF .IWD .VPK .TOR .PSK .RIM .W3X .FSH .NTL .ARCH00 .LVL .SNX .CFR .FF .VPP_PC .LRF .M2 .MCMETA .VFS0 .MPQGE .KDB .DB0 .DBA .RFL .HKX .BAR .UPK .DAS .IWI .LITEMOD .ASSET .FORGE .LTX .BSA .APK .RE4 .LBF .SLM .EPK .RGSS3A .PAK .BIG .WALLET .WOTREPLAY .XXX .DESC .M3U .JS .CSS .RB .PNG .W2 .RWL .MRWREF .3FR .XF .PST .DX .TIFF .BD .TAR .GZ .MKV .BMP .DOT .XML .XMLX .DAT .HTML .GIF .MCL .INI .MTE .CFG .MP3 .QBI .QBR .CNT .V30 .QBO .LGB .QWC .QBP .AF .QBY .1PA .QPD .SET .ND .RTP .QBWIN .LOG .QBBACKUP .TMP .TEMP1234 .QBT .QBSDK .SYNCMANAGERLOGGER .ECML .QSM .QSS .QST .FX0 .FX1 .MX0 .FPX .FXR .FIM .$$$ .$DB .001 .002 .003 .113 .73B .__A .__B .AB .ABA .ABBU .ABF .ABK .ACP .ACR .ADI .AEA .AFI .ARC .AS4 .ASD .ASHBAK .ASV .ASVX .ATE .ATI .BAC .BACKUP .BACKUPB .BAK2 .BAK3 .BAKX .BAK~ .BBB .BBZ .BCK .BCKP .BCM .BDB .BFF .BIF .BIFX .BK1 .BKC .BKUP .BKZ .BLEND1 .BLEND2 .BM3 .BMK .BPA .BPB .BPM .BPN .BPS .BUP .CAA .CBKCBS .CBU .CK9 .CMF .CRDS .CSD .CSM .DA0 .DASH .DBK .DIM .DIY .DNA .DOV .DPB .DSB .FBC .FBF .FBK .FBU .FBW .FH .FHF .FLKA .FLKB .FPSX .FTMB .FUL .FWBACKUP 
.FZAFZB .GB1 .GB2 .GBP .GHS .IBK .ICBU .ICF .INPROGRESS .IPD .IV2I .JBK .JDC .KB2 .LCB .LLX .MBF .MBK .MBW .MDINFO .MEM .MIG .MPB .MV_ .NB7 .NBA .NBAK .NBD .NBF .NI .NBK .NBS .NBU .NCO .NDA .NFB .NFC .NPF .NPS .NRBAK .NRS .NWBAK .OBK .OEB .OLD .ONEPKG .ORI .ORIG .OYX .PAQ .PBA .PBB .PBD .PBF .PBJ .PBX5SCRIPT .PBXSCRIPTPDB .PQB .PQB-BACKUP .PRV .PSA .PTB .PVC .PVHD .QBB .QBK .QBM .QBMB .QBMD .QBX .QIC .QSF .QUALSOFTCODE .QUICKEN2015BACKUP .QUICKENBACKUP .QV~ .RBC .RBFRBK .RBS .RDB .RGMB .RMBAK .RRR .SAV .SBB .SBS .SBU .SDC .SIM .SKB .SME .SN1 .SN2 .SNA .SNS .SPF .SPG .SPI .SPS .SQB .SRR .STG .SV$ .SV2I .TBK .TDB .TIBKP .TIG .IS .TLG .TMP .TMR .TRN .TTBK .UCI .V2I .VBK .VBM .VBOX-PREV .VPCBACKUP .VRB .WBB .WBCAT .WBK .WIN .WJF .WPB .WSPAK .XBK .XLK .YRCBCK .~CW .QBI .QBR .CNT .DESv30 .QBO .LGB .QWC .QBP .AIF .QBA .TLG .QBY .1PA .QPD .SET .IIF .ND .RTP .TLG .WAV .Qbwin .log .QBBackup .tmp .Temp1234 .qbt .QBSDK .log .QWC .log .SyncManagrLogger .log .ECML .QSM .QSS .QST .Fx0 .Fx1 .Mx0 .FPx .FXR .FIM .$$$ .$db .001 .002 .003 .113 .73b .__a .__b .ab .aba .abbu .abf .abk .acp .acr .adi .aea .afi .arc .as4 .asd .ashbak .asv .asvx .ate .ati .bac .backup .backupb .bak2 .bak3 .bakx .bak~ .bbb .bbz .bck .bckp .bcm .bdb .bff .bif .bifx .bk1 .bkc .bkup .bkz .blend1 .blend2 .bm3 .bmk .bpa .bpb .bpm .bpn .bps .bup .caa .cbkcbs .cbu .ck9 .cmf .crds .csd .csm .da0 .dash .dbk .dim .diy .dna .dov .dpb .dsb .fbc .fbf .fbk .fbu .fbw .fh .fhf .flka .flkb .fpsx .ftmb .ful .fwbackup .fzafzb .gb1 .gb2 .gbp .ghs .ibk .icbu .icf .inprogress .ipd .iv2i .jbk .jdc .kb2 .lcb .llx .mbf .mbk .mbw .mdinfo .mem .mig .mpb .mv_ .nb7 .nba .nbak .nbd .nbf .ni .nbk .nbs .nbu .nco .nda .nfb .nfc .npf .nps .nrbak .nrs .nwbak .obk .oeb .old .onepkg .ori .orig .oyx .paq .pba .pbb .pbd .pbf .pbj .pbx5script .pbxscriptpdb .pqb .pqb-backup .prv .psa .ptb .pvc .pvhd .qbb .qbk .qbm .qbmb .qbmd .qbx .qic .qsf .qualsoftcode .quicken2015backup .quickenbackup .qv~ .rbc .rbfrbk .rbs .rdb .rgmb .rmbak .rrr .sav .sbb .sbs 
.sbu .sdc .sim .skb .sme .sn1 .sn2 .sna .sns .spf .spg .spi .sps .sqb .srr .stg .sv$ .sv2i .tbk .tdb .tibkp .tig .is .tlg .tmp .tmr .trn .ttbk .uci .v2i .vbk .vbm .vbox-prev .vpcbackup .vrb .wbb .wbcat .wbk .win .wjf .wpb .wspak .xbk .xlk .yrcbck .~cw
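
    To estimate which files on a test system would be affected by an extension list like this one, a case-insensitive lookup is sufficient. The sketch below uses only a small excerpt of the list; the function name and the example file names are hypothetical.

```python
import os

# A small excerpt of the extensions listed above.
TARGET_EXTENSIONS = {".db", ".ini", ".dat", ".pdf", ".docx"}


def is_targeted(filename, extensions=TARGET_EXTENSIONS):
    """Return True if the file's extension is in the target set (case-insensitive)."""
    _, ext = os.path.splitext(filename)
    return ext.lower() in extensions


for name in ("prefs.ini", "REPORT.PDF", "setup.exe"):
    print(name, is_targeted(name))
```

    Lower-casing the extension before the lookup makes the upper-case duplicates in the original list unnecessary.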

      The list includes e.g. *.db, *.ini or *.dat. So what does that mean? It means that all your application settings are gone. The same goes for stored passwords and logins, e.g. for Firefox:

      Besides, it also encrypts files in your Recycle Bin and your System Restore folder. This is badass and makes your computer nearly useless. Of course, it also comes with all the other functionality of traditional ransomware:

      Full Joe Sandbox 13 analysis:

      Update 1:

      Malware Traffic Analysis has a nice analysis of the threat at the network level. Kudos!

      Spider charts, Deep OLE, 950+ and more

      Over the last couple of weeks, we have been very busy adding new features to Joe Sandbox. In this post, we are going to show you our favorites. These features span the complete space of malware analysis and include new visualizations, new analysis capabilities and more.

      Generic Classification

      In order to quickly determine the malicious payload we have added a spider chart visualization to the analysis report:

      Joe Sandbox also generates a new classification label:

      All classification figures are available in raw form in the Joe Sandbox reports (XML, JSON). The complete classification algorithm is open and therefore enables customized tuning. Our spider charts help to quickly determine the type of the malware without requiring any in-depth technical understanding of it. By clicking on the malware icons you can get a detailed description. Besides the spider chart, we also introduced new pie charts for many kinds of analysis data as well as for the famous behavior graphs:

      Example report:

      Deep Static Analysis of OLE files

      The static analysis includes code analysis and deobfuscation for VB macros. Documents with VB macros have become a common way to deliver payloads to the end user's system. With the new static code analysis, deep analysis and detection of such malware is possible.

      WMI Analysis

      WMI is an extensive interface on Windows to query information about the system. It is also difficult to intercept and analyze. Therefore, it is often used by malware to detect and fingerprint the analysis system. With the latest release of Joe Sandbox, all WMI activity is captured.

      Inspection of HTTPS Traffic

      Full HTTPS traffic inspection has been added to Joe Sandbox Cloud. HTTPS analysis is also possible for URL analysis with IE.

      950+ Behavior Signatures 

      We have developed many new behavior signatures. Our complete set currently contains over 950 signatures. Many of the new signatures are highly advanced:

      USB Fake Drive

      Want to see if malware infects USB drives? Want to see if malware spreads via network shares? No problem! We have functionality to create a USB fake drive:

      Network shares are simulated with our Adaptive Internet Simulation Technology:

      The features outlined are just a selection. There are various other extensions and improvements which were developed. We have also planned some great new features for 2016! So watch out!

      Happy New Year!

      The Joe Security team wishes you success, satisfaction and many pleasant moments in 2016!

      Introducing Behavior Graphs in Joe Sandbox 13

      Today we are proud to release Joe Sandbox 13! This release comes with a number of very cool new features, including:

      • Support for Windows 10
      • 70 new behavior signatures
      • Analysis advice signatures
      • Static unpackers for VBE and SWF
      • Live system performance statistics in the web interface
      • COM Analysis
      • String analysis in compressed files
      • Static file analysis for Flash
      • Static PE file analysis for dropped / downloaded files
      • New tricks to prevent VM-detection
      • ASN detection for IPs 
      • Code obfuscation detection for Hybrid Code Analysis (HCA)
      • Behavior graphs
      • Hybrid Decompilation (HDC) Plugin
      • Big performance improvements

      Besides the Hybrid Decompilation (HDC) technology, we have also developed a new feature called behavior graphs. Behavior graphs display the behavior of a sample. They show processes, IPs, domains, dropped files as well as behavior signatures in a connected graph. The graph coloring is very simple and intuitive while the format is clean and well structured.

      We also invested a lot of brain power into shrinking and compressing the graph so that it stays small and clear.
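
      As a rough picture of such a graph and of one possible way to keep it small, the sketch below stores typed nodes in an adjacency map and merges large groups of same-kind children into one summary node. The node kinds, names, the limit and the merging rule are all assumptions; the actual compression used by Joe Sandbox is not described here.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an adjacency map from (source, target) pairs of (kind, name) nodes."""
    graph = defaultdict(list)
    for source, target in edges:
        graph[source].append(target)
    return dict(graph)


def compress(graph, limit=2):
    """Replace more than `limit` children of one kind with a single summary node."""
    compact = {}
    for node, children in graph.items():
        by_kind = defaultdict(list)
        for kind, name in children:
            by_kind[kind].append((kind, name))
        merged = []
        for kind, group in by_kind.items():
            if len(group) > limit:
                merged.append((kind, f"{len(group)} items"))
            else:
                merged.extend(group)
        compact[node] = merged
    return compact


process = ("process", "sample.exe")
edges = [(process, ("file", f"{i}.tmp")) for i in range(3)] + [(process, ("ip", "10.0.0.1"))]
print(compress(build_graph(edges)))
```

      Grouping by node kind keeps distinct artifact types such as IPs visible even when a process drops hundreds of files.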

      Below you find several behavior graphs with the corresponding Joe Sandbox analysis report:

      Pure Innovation: Hybrid Decompilation with Joe Sandbox DEC

      Joe Security is proud to announce its latest innovative technology - Hybrid Decompilation (HDC). This unique new feature builds upon Hybrid Code Analysis (HCA) to empower the malware analyst with extensive code analysis capabilities. Existing Joe Sandbox reports already include a hybrid low-level disassembly for each relevant function found during the analysis, which combines information from both static and dynamic analyses. Thanks to the Joe Sandbox DEC plugin implementing HDC, Joe Sandbox reports can now also display an equivalent C high-level source code representation for each function, which constitutes a huge boost to the process of reverse engineering.

      Here is a very simple report extract to illustrate the purpose of this Hybrid Decompilation feature.
      Suppose we have the following disassembly for a function in a Joe Sandbox report:

      Joe Sandbox DEC will generate the following corresponding C source code:

      E00406D20(CHAR* _a4) {
       long _t4;
       void* _t6;
       _t6 = CreateMutexA(0, 1, _a4);
       _t4 = GetLastError();
       if(_t4 == 0xb7) {
       if(_t6 != 0) {
        return ReleaseMutex(_t6);
       return _t4;

      As seen in this example, the source code highlights the function parameters and local variables, and makes its control structures and function calls much more explicit.


      Decompilation 101

      The process of translating machine or assembly code to source code is called decompilation. From a high-level perspective, decompilation is the reverse of compilation: it starts with low-level machine code and builds a higher-level representation in several incremental stages. Decompilation is usually much more efficient and gives better results if it can use symbolic information found in the binary file or in associated debug files.
      Decompilation can also be seen as a natural extension to disassembly, and indeed the first stage of a decompilation engine is a disassembler. But besides the usual difficulty of code and data separation during disassembly, the decompilation process must also solve the following issues:

      • Rebuild function prototypes and infer local variables while getting rid of register and stack references.
      • Generate high-level control structures (if, switch/case, do/while/for loops) from basic jumps and compares, discovering calls to known APIs and libraries (such as PE file imports).
      • Retrieve high-level type information (including compound types such as structures and unions). 
      • Assign the correct arguments to function calls.

      This schema presents the global architecture of a generic decompilation engine:

      Decompilation builds on techniques developed initially for compilation, such as control and data-flow analyses, register allocation, loop transformation and alias analyses. But decompilation has its own challenges and it is usually considered extremely difficult to automatically decompile an arbitrary machine code, and even more so for obfuscated malware code for which no symbolic information is available. Keeping this in mind, the goal of Joe Sandbox DEC is to provide the user with a fast decompilation of the most relevant functions found in the analyzed sample, together with a measurement of the quality of the decompilation.

      Hybrid Decompilation

      Compared to a generic decompilation engine, Hybrid Decompilation introduces three powerful features:

      • Instead of running on the initial PE file, which may be packed or contain hidden code, HDC runs on PE files generated from dynamic memory snapshots which give an accurate picture of the code which is actually executed.
      • HCA provides input information to HDC such as known Windows API function calls, discovered used string values and statement execution status. This is akin to retrieving symbolic information and is very useful for achieving better decompilation results.
      • HDC has an extensive knowledge of Windows API types and function prototypes, thus enabling the use of high-level types in the output source code files.

      These features make HDC a big improvement over a purely static decompilation engine:

      • "Better decompilation code coverage": all function entry points discovered by the powerful heuristics of HCA are made available as decompilation entry points.
      • "Better decompilation quality": in particular, knowledge of indirect call targets as provided by HCA makes decompilation both faster and more precise.
      • "Decompiled source code commenting": observed runtime information such as statement execution status and variable value can be added to the decompiled source code in the form of comments.
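
      The commenting step can be illustrated with a minimal sketch that appends the "// executed" marker, as seen in the listings in this post, to the source lines that were observed at runtime. The function name and the index-based input format are assumptions made for this example.

```python
def annotate(lines, executed):
    """Append '// executed' to each source line whose index is in `executed`."""
    return [
        f"{line} // executed" if index in executed else line
        for index, line in enumerate(lines)
    ]


source = ["GetSystemTime( &_v32);", "ExitProcess(0);", "_t13 = E004070C0();"]
for line in annotate(source, executed={1}):
    print(line)
```

      Keeping the runtime information in comments leaves the decompiled code itself unchanged, so covered and uncovered paths can be read side by side.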

      Some Hybrid Decompilation Source Code Outputs

      Let us now have a look at some actual examples of HDC-generated C source codes to get a taste of the power of Hybrid Decompilation.
      The first decompiled source code is extracted from the sample studied in blog post

      E0040912A(void* __edi, void* __eflags, long _a4) {
       long _v8;
       long _v12;
       long _v16;
       struct _SYSTEMTIME _v32;
       void* _t13;
       void* _t17;
       void* _t28;
       signed int _t29;
       void* _t30;
       void* _t32;
       CHAR* _t35;
       _t32 = __edi;
       _t35 = _a4;
       GetSystemTime( &_v32);
       if(_v32.wMonth >= 0xb && _v32.wYear >= 0x7da) {
        ExitProcess(0); // executed
       _t13 = E004070C0();
       _t40 = _t13;
       if(_t13 != 0) {
        E00408A06(_t29, __eflags, _t35);
       } else {
       E004084F7(_t29, _t40, _t35);
       _t41 =  *0x4011e8 - 1;
       if( *0x4011e8 == 1) {
       E00409029(_t28, _t30, _t32, _t41);
       _t17 = E00408220();
       if(_t17 != 0) {
        return _t17;
       } else {
        if(_v32.wMonth >= 7 && _v32.wYear >= 0x7da) {
         CreateThread(0, 0, E004098A0, 0, 0,  &_a4);
         CreateThread(0, 0, E00407180, 0, 0,  &_v8);
         CreateThread(0, 0, E00407230, 0, 0,  &_v12);
         if( *0x4011dc == 1) {
          CreateThread(0, 0, E00407A80, 0, 0,  &_v16);
        goto L14;

      The source code really highlights the condition on the system time under which the sample immediately terminates (see lines 18 to 20). Thanks to HDC, the comment of line 20 gives us the information that the evasive behavior has been triggered for the analyzed sample’s run. Still, the static component of Hybrid Decompilation gives information about what occurs in the non-evasive case. In particular, several thread creations may occur at lines 44-48, and the corresponding calls to CreateThread have explicit call arguments including the reference to the function executed by the new thread.
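
      The hexadecimal constants in the first condition are easier to read when re-expressed. The following sketch mirrors only the check guarding ExitProcess in the listing above; the function name is hypothetical.

```python
def exits_immediately(year, month):
    """Mirror of the decompiled check: wMonth >= 0xb and wYear >= 0x7da."""
    return month >= 0x0B and year >= 0x7DA


print(0x0B, 0x7DA)  # 11 2010
print(exits_immediately(2015, 11))
print(exits_immediately(2015, 6))
```

      Read this way, the sample terminates whenever the system clock reports November or December of 2010 or any later year, so changing the analysis date is enough to reach the non-evasive path.
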
      Our second decompiled source sample is a function belonging to a PE file dropped by the Rombertik malware (see analysis

      E00401960() {
       void _v5;
       void _v6;
       void _v7;
       int _v12;
       int _v16;
       char _v280;
       long _v292;
       long _v308;
       void* _v316;
       char _v572;
       char _v828;
       char* _t36;
       char* _t39;
       void* _t42;
       intOrPtr* _t44;
       int _t46;
       intOrPtr* _t47;
       intOrPtr* _t51;
       int _t54;
       void* _t59;
       _Unknown_base(*)()* _t62;
       void* _t69;
       void* _t71;
       void* _t74;
       int _t77;
       int _t79;
       long _t82;
       void* _t94;
       void* _t98;
       void* _t99;
       void* _t100;
       _v316 = 0x128;
       _t77 = 0x100;
       _t36 =  &_v828;
       goto L1;
       _t79 = 0x100;
       _t39 =  &_v572;
       do {
         *_t39 = 0;
        _t39 = _t39 + 1;
        _t79 = _t79 - 1;
       } while (_t79 != 0);
       _v12 = 0x100;
       CryptStringToBinaryA("aWV4cGxvcmUuZXhl", 0x10, 1,  &_v572,  &_v12, 0, 0);
       while(1) {
        _t42 = CreateToolhelp32Snapshot(2, 0); // executed
        _v12 = _t42;
        Process32First(_t42,  &_v316); // executed
        do {
         _t44 = "chrome.exe";
         do {
          _t44 = _t44 + 1;
         } while ( *_t44 != 0);
         _t46 = StrCmpNA( &_v280, "chrome.exe", _t44 - "chrome.exe"); // executed
         if(_t46 != 0) {
          _t47 =  &_v572;
          if(_v572 == 0) {
           if(StrCmpNA( &_v280,  &_v572, _t47 -  &_v572) != 0) {
            _t51 = "firefox.exe";
            do {
             _t51 = _t51 + 1;
            } while ( *_t51 != 0);
            if(StrCmpNA( &_v280, "firefox.exe", _t51 - "firefox.exe") != 0) {
             goto L39;
            _t99 = OpenProcess(0x1fffff, 0, _v308);
            if(_t99 == 0) {
             goto L39;
            _t59 = GetProcAddress(GetModuleHandleA("kernel32.dll"), "CreateFileW");
            if(_t59 == 0) {
             goto L39;
            _v6 = 0;
            if(ReadProcessMemory(_t99, _t59,  &_v6, 1, 0) != 0 && _v6 != 0xe9) {
             _t62 = E00402690();
             _t100 = _t100 + 8;
             if(_t62 != 0) {
              _t94 = CreateRemoteThread(_t99, 0, 0, _t62, 0, 0, 0);
              if(_t94 != 0) {
               WaitForSingleObject(_t94, 0xffffffff);
            goto L38;
           if(E00402AE0( &_v828) == _v292) {
            goto L39;
           _t99 = OpenProcess(0x1fffff, 0, _v308);
           if(_t99 == 0) {
            goto L39;
           _t69 = GetProcAddress(LoadLibraryA("Wininet.dll"), "HttpSendRequestW");
           if(_t69 == 0) {
            goto L38;
           _v5 = 0;
           if(ReadProcessMemory(_t99, _t69,  &_v5, 1, 0) == 0 || _v5 == 0xe9) {
            goto L38;
           } else {
            goto L35;
          do {
           _t47 = _t47 + 1;
          } while ( *_t47 != 0);
          goto L20;
         _t71 = E00402AE0( &_v828);
         _t82 = _v292;
         if(_t71 == _t82) {
          goto L39;
         _t99 = OpenProcess(0x1fffff, 0, _t82);
         if(_t99 == 0) {
          goto L39;
         _t74 = GetProcAddress(LoadLibraryA("Ws2_32.dll"), "WSASend");
         if(_t74 == 0) {
          goto L38;
         _v7 = 0;
         if(ReadProcessMemory(_t99, _t74,  &_v7, 1, 0) == 0 || _v7 == 0xe9) {
          goto L38;
         } else {
          goto L35;
         _t98 = _v12;
         _t54 = Process32Next(_t98,  &_v316); // executed
        } while (_t54 != 0);
        if(_t98 != 0) {
         CloseHandle(_t98); // executed
        Sleep(0x1388); // executed
        *_t36 = 0;
       _t36 = _t36 + 1;
       _t77 = _t77 - 1;
       if(_t77 != 0) {
        goto L1;
       } else {
        _v16 = 0x100;
        if(CryptStringToBinaryA("ZXhwbG9yZXIuZXhl", 0x10, 1,  &_v828,  &_v16, 0, 0) == 0) {
         _v16 = 0;
        goto L4;

      The decompiled source code makes it clear that this function is in charge of enumerating all processes (infinite while loop starting at line 48) and of looking for browser names such as “iexplore.exe” (call to StrCmpNA at line 62; the browser name is stored Base64 encoded and is decoded by the call to CryptStringToBinaryA on "aWV4cGxvcmUuZXhl" at line 47), “chrome.exe” (line 57) and “firefox.exe” (line 67). Once a process corresponding to a particular browser is found, the function tries to create a hook in the DLLs loaded in the browser's memory. Different function start addresses are used for that purpose (CreateFileW for Firefox at line 74, HttpSendRequestW for Internet Explorer at line 104, and WSASend for Chrome at line 131). Once a suitable address has been found for the hook (calls to ReadProcessMemory at lines 81, 109 and 136), the actual hook injection is performed with a call to CreateRemoteThread at line 88.
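
      The two Base64 strings passed to CryptStringToBinaryA in the listing can be decoded offline to confirm what the function is looking for. The third argument (1) in those calls corresponds to CRYPT_STRING_BASE64; the helper name below is hypothetical.

```python
import base64


def decode_name(encoded):
    """Decode a Base64 string literal taken from the decompiled listing."""
    return base64.b64decode(encoded).decode("ascii")


for encoded in ("aWV4cGxvcmUuZXhl", "ZXhwbG9yZXIuZXhl"):
    print(encoded, "->", decode_name(encoded))
```

      The first string decodes to "iexplore.exe" and the second to "explorer.exe".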

      Our last decompiled source code example is extracted from the Dyre Banking Trojan. This malware achieves persistence by registering as the “Google Update” system service using the following function:

      E00402900(short* _a4) {
       signed int _t2;
       void* _t5;
       int _t13;
       void* _t20;
       void* _t25;
       _t2 = OpenSCManagerW(0, 0, 2);
       _t20 = _t2;
       if(_t20 != 0) {
        WriteConsoleW(0, 0, 0, 0, 0);
        while(1) {
         _t5 = CreateServiceW(_t20, L"googleupdate", L"Google Update Service", 0xf01ff, 0x10, 
                              2, 1, _a4, 0, 0, 0, 0, 0);
         if(_t5 != 0) {
         if(RtlGetLastWin32Error() != 0x431) {
          return CloseServiceHandle(_t20) | 0xffffffff;
         } else {
          _t25 = OpenServiceW(_t20, L"googleupdate", 0xf01ff);
          if(_t25 == 0) {
           goto L7;
          } else {
           _t13 = DeleteService(_t25);
           if(_t13 != 0) {
           } else {
            goto L7;
         goto L9;
        return 0;
       } else {
        return _t2 | 0xffffffff;

      Once a handle to the service manager is obtained (lines 8-10), the sample tries to create a “Google Update Service” (line 13) in a loop starting at line 12. If it manages to do so, it exits the loop (line 16); otherwise it checks whether the service creation of line 13 failed with the ERROR_SERVICE_EXISTS error code 0x431 (line 18). If this is the case, it tries to delete the existing service (lines 22 to 27) and then loops to restart the malicious service creation (line 29).
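
      The control flow of this retry loop can be modeled without touching a real system. The sketch below uses a hypothetical in-memory stand-in for the service manager; the class, its methods and the return values 0 and -1 are simplifications and do not reproduce the Windows API.

```python
ERROR_SERVICE_EXISTS = 0x431  # 1073


class FakeServiceManager:
    """Hypothetical in-memory stand-in for the Windows service manager."""

    def __init__(self, existing=()):
        self.services = set(existing)
        self.last_error = 0

    def create(self, name):
        if name in self.services:
            self.last_error = ERROR_SERVICE_EXISTS
            return False
        self.services.add(name)
        return True

    def delete(self, name):
        if name in self.services:
            self.services.remove(name)
            return True
        return False


def register_service(manager, name="googleupdate"):
    """Create the service, deleting an existing one first if needed.

    Returns 0 on success and -1 on failure.
    """
    while True:
        if manager.create(name):
            return 0
        if manager.last_error != ERROR_SERVICE_EXISTS:
            return -1
        if not manager.delete(name):
            return -1


manager = FakeServiceManager(existing={"googleupdate"})
print(register_service(manager), sorted(manager.services))
```

      The loop runs at most twice in this model: the first attempt fails because the service exists, the existing service is deleted, and the second attempt succeeds.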


      Thanks to its Hybrid Decompilation technology, Joe Sandbox DEC outputs a decompiled function which is much more readable than the associated disassembly, and thus gives quick and precise insight into the function's behavior. As a whole, the process of reverse engineering complex malware is made more efficient by pinpointing hard-to-decompile functions and letting the analyst concentrate on studying them, falling back on the still available disassembly code only when necessary.