December 11, 2020

Mozitools - Mozi Botnet related tools

Introduction

Mozi is the name of a new malware family that was first seen in September 2019. According to 360Netlab, Mozi is a new IoT botnet that uses P2P (peer-to-peer) communication based on a DHT (Distributed Hash Table).
“New” may not be the right word because Mozi has a lot of similarities with Gafgyt, IoT Reaper and Mirai.

As you may know, more and more IoT devices are connected to the internet. Most of them aren’t secure because they contain unpatched vulnerabilities or default passwords.
In Mozi and other botnets, a compromised host will in turn try to compromise new hosts. So the more hosts are part of the botnet, the more hosts will be compromised.

Thus, it’s important to be able to track a botnet: tracking active nodes gives an estimate of the botnet size and the associated danger, and tracking botnet commands makes it possible to detect attacks as soon as possible.

When learning about Mozi, I wasn’t able to find any tools available publicly to study the samples and the botnet.

I wanted to develop a few tools that could be used to do the following actions:

Unpack a sample;
Decode the configuration;
Track the botnet.

You can find the tools on GitHub: Kn0wl3dge/mozitools

Implementation

Unpacking the sample

Simply running strings on the sample shows that it is packed using UPX. UPX is the best-known packer. It’s an open source tool to compress executables.

However, UPX doesn’t seem to be able to uncompress the sample.

When packing a binary, UPX adds two interesting headers:

struct l_info
{
  uint32_t l_checksum;
  uint32_t l_magic; // "UPX!"
  uint16_t l_lsize;
  uint8_t l_version;
  uint8_t l_format;
};

struct p_info // This one seems corrupted from the previous UPX error
{
  uint32_t p_progid;
  uint32_t p_filesize;
  uint32_t p_blocksize;
};

So, it seems that the Mozi developer overwrote p_filesize and p_blocksize with null bytes to prevent unpacking with UPX.
The good news is that the p_filesize value is also stored at the end of the binary, and p_blocksize should be equal to p_filesize.

So, we can recover p_filesize by reading four bytes at the offset end - 12 of the file. Then we look for the first UPX! string, knowing that l_info_offset = find("UPX!") - 4. Now we can overwrite p_filesize and p_blocksize, at l_info_offset + 16 and l_info_offset + 20, with the four recovered bytes.
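The repair described above can be sketched in a few lines of Python. This is only an illustration of the offsets given in this section, not the exact code from the mozitools repository; the function name repair_upx_header is made up for the example.

```python
UPX_MAGIC = b"UPX!"


def repair_upx_header(data: bytes) -> bytes:
    """Return a copy of `data` with p_filesize and p_blocksize restored.

    Hypothetical helper: it only applies the offsets described in the text.
    """
    magic_offset = data.find(UPX_MAGIC)
    if magic_offset < 4:
        raise ValueError("no usable UPX! magic found")

    # l_checksum (4 bytes) sits just before l_magic.
    l_info_offset = magic_offset - 4
    if len(data) < l_info_offset + 24 + 12:
        raise ValueError("file too short to contain the UPX headers")

    # A copy of the original file size is stored 12 bytes before the end.
    recovered = data[len(data) - 12:len(data) - 8]

    patched = bytearray(data)
    patched[l_info_offset + 16:l_info_offset + 20] = recovered  # p_filesize
    patched[l_info_offset + 20:l_info_offset + 24] = recovered  # p_blocksize
    return bytes(patched)
```

The patched bytes can then be written to a new file and passed to the regular upx -d command.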

Then, we can successfully unpack the Mozi.a file.

If you want more information on this part, you should definitely check out the following articles.

https://blag.nullteilerfrei.de/2019/12/26/upx-packed-elf-binaries-of-the-peer-to-peer-botnet-family-mozi/ by Born
https://cujo.com/upx-anti-unpacking-techniques-in-iot-malware/ by Albert Zsigovits

Decoding the configuration

The Mozi configuration looks like this:

[ss]botv2[/ss][dip]192.168.2.100:80[/dip][hp]88888888[/hp][count]http://ia.51.la/go1?id=17675125&pu=http%3a%2f%2fv.baidu.com/[idp][/count]

This configuration is encrypted using a simple XOR and hardcoded in the unpacked sample. The XOR key is also hardcoded in the unpacked sample and its value is 4E665A8F80C8AC238DAC4706D54F6F7E in hexadecimal.

Because the configuration seems to always start with [ss], we can look for 151529D2 = "[ss]" ^ XOR_KEY[:4] in the unpacked sample. The config size is fixed at 428 bytes and is padded with null bytes, which means we’ll see repetitions of the XOR key in the unpacked sample because x ^ 0 = x.
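Here is a minimal sketch of that search and decryption. It assumes that the key is applied as a repeating 16-byte key starting at the first byte of the configuration, which is consistent with the key repetitions visible in the padding. The function names are illustrative only.

```python
XOR_KEY = bytes.fromhex("4E665A8F80C8AC238DAC4706D54F6F7E")
CONFIG_SIZE = 428  # fixed size, padded with null bytes


def xor_bytes(data: bytes, key: bytes = XOR_KEY) -> bytes:
    """XOR `data` with `key`, repeating the key as needed."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def extract_config(unpacked: bytes) -> bytes:
    """Locate the encrypted configuration and return it decrypted."""
    marker = xor_bytes(b"[ss]")  # the encrypted form of "[ss]"
    start = unpacked.find(marker)
    if start == -1:
        raise ValueError("encrypted [ss] marker not found")
    encrypted = unpacked[start:start + CONFIG_SIZE]
    # Decrypted padding turns back into null bytes, which we strip.
    return xor_bytes(encrypted).rstrip(b"\x00")
```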

According to 360Netlab, the configuration could contain the following tags:

ss: Bot role
ssx: enable/disable tag [ss]
cpu: CPU architecture
cpux: enable/disable tag [cpu]
nd: new DHT node
hp: DHT node hash prefix
atk: DDoS attack type
ver: Value in V section in DHT protocol
sv: Update config
ud: Update bot
dr: Download and execute payload from the specified URL
rn: Execute specified command
dip: ip:port to download Mozi bot
idp: report bot
count: URL that used to report bot
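Once decrypted, the configuration can be turned into a dictionary with a small regular expression. This is a sketch that assumes every tag of interest is closed by a matching [/tag], as in the example configuration above; a bare tag such as [idp] inside [count] is simply kept as part of the value.

```python
import re

# "[name]value[/name]", where the closing tag must match the opening one.
TAG_PATTERN = re.compile(r"\[(\w+)\](.*?)\[/\1\]", re.DOTALL)


def parse_config(config: str) -> dict:
    """Return a {tag: value} mapping for every closed tag in `config`."""
    return {name: value for name, value in TAG_PATTERN.findall(config)}


example = (
    "[ss]botv2[/ss][dip]192.168.2.100:80[/dip][hp]88888888[/hp]"
    "[count]http://ia.51.la/go1?id=17675125"
    "&pu=http%3a%2f%2fv.baidu.com/[idp][/count]"
)
parsed = parse_config(example)
print(parsed["ss"], parsed["dip"], parsed["hp"])
```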

A 96-byte digital signature is appended to that configuration, followed by a 4-byte version. Two ECDSA384 public keys are used to verify the legitimacy of the encrypted and decrypted configuration.
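The layout described above (428 bytes of configuration, 96 bytes of signature, 4 bytes of version) can be split as follows. This is only a sketch of the slicing: the names are hypothetical, the version is left as raw bytes because its byte order is not described here, and no signature verification is attempted.

```python
from typing import NamedTuple

CONFIG_SIZE = 428
SIGNATURE_SIZE = 96
VERSION_SIZE = 4
TOTAL_SIZE = CONFIG_SIZE + SIGNATURE_SIZE + VERSION_SIZE


class SignedConfig(NamedTuple):
    config: bytes     # padded configuration
    signature: bytes  # digital signature over the configuration
    version: bytes    # raw version field, byte order not assumed


def split_signed_config(blob: bytes) -> SignedConfig:
    """Split a raw blob into its three fixed-size fields."""
    if len(blob) < TOTAL_SIZE:
        raise ValueError(f"expected at least {TOTAL_SIZE} bytes, got {len(blob)}")
    sig_end = CONFIG_SIZE + SIGNATURE_SIZE
    return SignedConfig(
        config=blob[:CONFIG_SIZE],
        signature=blob[CONFIG_SIZE:sig_end],
        version=blob[sig_end:TOTAL_SIZE],
    )
```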

Mozi first decodes its hardcoded configuration and then queries other nodes through the DHT protocol to obtain a newer configuration.

That’s where I had the idea to track the Mozi configuration.

Tracking the Mozi botnet

I first wanted to query the URLhaus API for all Mozi links, then download the samples and extract the configuration.

But after reading more articles about this botnet, I understood that I could do something better.

Mozi uses the BT-DHT protocol (BitTorrent Distributed Hash Table), part of the P2P system. The goal of a DHT is to provide a lookup table that, given a key, finds a node able to serve the corresponding data. Mozi implements part of this protocol and puts the latest known configuration in the responses to some requests if the sender “looks like” a Mozi node.

To be more precise, each Mozi node queries the P2P network with a slightly modified find_node request. Basically, Mozi uses a 4-byte flag in the version field, where the first byte is randomly generated, the second is hardcoded in the configuration, and the last two are computed from the first two. When a Mozi node receives a find_node query, it applies the same computation to the first two bytes of the flag and compares the result with the last two. If the flag is correct, the Mozi node responds with the config in the nodes field instead of the list of the nearest nodes.
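To make the query format concrete, here is a sketch that builds a find_node message with a 4-byte value in the v (version) field, using a tiny hand-written bencode encoder. The computation of the last two flag bytes is not described in this post, so the flag is simply passed in as a parameter. The helper names are made up for the example.

```python
def bencode(value) -> bytes:
    """Minimal bencode encoder for ints, bytes, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys must be byte strings, emitted in sorted order.
        items = sorted(value.items())
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def build_find_node(node_id: bytes, target: bytes, flag: bytes,
                    transaction_id: bytes = b"aa") -> bytes:
    """Build a find_node query carrying `flag` in the version field."""
    if len(node_id) != 20 or len(target) != 20:
        raise ValueError("node IDs must be 20 bytes long")
    if len(flag) != 4:
        raise ValueError("the flag must be 4 bytes long")
    return bencode({
        b"t": transaction_id,
        b"y": b"q",
        b"q": b"find_node",
        b"a": {b"id": node_id, b"target": target},
        b"v": flag,  # a Mozi node checks these 4 bytes before answering
    })
```

The resulting bytes are sent over UDP to the target node, as with any other DHT query.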

Mozi nodes are easy to find in a DHT network since their 20-byte node ID (node hash) begins with 88888888, as described in the hardcoded configuration. So I just have to generate the 12 remaining bytes, which gives me an ID near a Mozi node.
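Generating such an ID is straightforward. The sketch below assumes the prefix is the 8 ASCII characters of the hp value and that the remaining 12 bytes can be purely random. The function name is illustrative.

```python
import os

MOZI_PREFIX = b"88888888"  # value of the [hp] tag, as ASCII bytes
NODE_ID_SIZE = 20


def random_mozi_like_id(prefix: bytes = MOZI_PREFIX) -> bytes:
    """Return a 20-byte node ID made of `prefix` plus random bytes."""
    if len(prefix) > NODE_ID_SIZE:
        raise ValueError("prefix is longer than a node ID")
    return prefix + os.urandom(NODE_ID_SIZE - len(prefix))


node_id = random_mozi_like_id()
print(node_id.hex())
```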

The best way to see this is through Wireshark. Here’s some traditional DHT traffic:

This kind of response to a find_node query happens at the beginning of a search, when a new node asks known trackers (bootstrapping nodes) for the nodes nearest to a Mozi one. Then I query the nodes in the responses to ask for the nearest nodes they know to a Mozi node. After a few iterations, I began to obtain responses like this one:

This is really interesting since the IDs of all nodes begin with 3838383838383838, which is the hexadecimal ASCII encoding of the node hash prefix we saw earlier in the hp field of the static configuration (88888888).
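The nodes field of a regular find_node response uses the compact node format of the BitTorrent DHT: 26 bytes per node, made of a 20-byte ID, a 4-byte IPv4 address and a 2-byte big-endian port. A sketch of how to parse it and keep only the nodes whose ID carries the Mozi prefix could look like this (the helper names are hypothetical):

```python
import socket
import struct

COMPACT_NODE_SIZE = 26  # 20-byte ID + 4-byte IPv4 + 2-byte port
MOZI_PREFIX = b"88888888"


def parse_compact_nodes(nodes: bytes) -> list:
    """Decode a compact `nodes` string into (node_id, ip, port) tuples."""
    usable = len(nodes) - len(nodes) % COMPACT_NODE_SIZE  # drop trailing junk
    result = []
    for offset in range(0, usable, COMPACT_NODE_SIZE):
        node_id = nodes[offset:offset + 20]
        ip = socket.inet_ntoa(nodes[offset + 20:offset + 24])
        (port,) = struct.unpack("!H", nodes[offset + 24:offset + 26])
        result.append((node_id, ip, port))
    return result


def mozi_candidates(nodes: bytes) -> list:
    """Keep only the nodes whose ID starts with the Mozi prefix."""
    return [n for n in parse_compact_nodes(nodes) if n[0].startswith(MOZI_PREFIX)]
```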

The next iterations will directly query the Mozi nodes and the responses will look like this:

Here, you can clearly recognize the pattern you saw earlier when looking at the encrypted configuration. So a Mozi configuration has been successfully recovered using a fake node.

Next, I automated the Mozi tracker to find new nodes, decrypt the configuration and import the results into an Elastic database.

Results

I have been running the tracking script in a Kubernetes cluster for about two weeks without any scaling. The network consumption was about 10MB/s.

I have been able to identify about 75,000 IP addresses that responded with a Mozi configuration when queried, which means that they must be Mozi nodes that are part of the botnet.

Three node types have been identified, listed here by count: botv2, sk and bot. Four slightly different configurations have been identified among the more than 10 million configurations imported into the database.

No attack orders were detected during the two weeks of tracking.

Then I made some modifications to the tracker to be able to extract the port and the node ID.
I also extended the node caching system to reduce the number of stored configs, since I have limited space.

Again, after 2 weeks I obtained the following results.

This is interesting because the number of unique IPs is not the same as the number of unique node IDs.
I’ll explain this later in this report.

The histogram view is also interesting, as we can see a pattern where fewer nodes are identified between 6:00pm and 3:00am.
I wasn’t able to understand why this happens, as I am not looking at IP addresses in a specific location.

The pattern at 4:00am is caused by the reboot of my cluster for the daily transactional update.

Finally, I am also wondering what the sk node type means. Between the two tests, the same three configurations have been identified. It’s interesting to see multiple configurations persist on the P2P network.

Limitations

Earlier we saw that the number of unique IP addresses is different from the number of node IDs. When playing with Kibana, the first thing we see is that some IPs have multiple open ports that send back a Mozi configuration.

So, is Mozi randomly changing its port because it got restarted, or is this some kind of NAT? Honestly, I am not sure about that, because a NAT system shouldn’t forward random ports. But we’re seeing the same group of ports over time, which could indicate multiple hosts or Mozi instances.

The same thing applies to the Mozi ID. Because it’s randomly generated when Mozi starts, it should be different at each reboot. The ID is composed of an 8-byte prefix and a 12-byte random suffix, which means 256^12 possibilities. However, it is possible to see the same ID under a different IP address. This could be explained by a compromised host behind a dynamic IP.
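To get a feeling for how unlikely an accidental ID collision is, here is a rough back-of-the-envelope calculation. It assumes the 12-byte suffix is uniformly random and independent between nodes, which I have not verified in a sample, and it uses the usual birthday approximation.

```python
ID_SPACE = 256 ** 12  # number of possible 12-byte suffixes (2^96)


def collision_probability(n_nodes: int, space: int = ID_SPACE) -> float:
    """Birthday approximation p ~ n(n-1) / (2 * space), valid for small p."""
    return n_nodes * (n_nodes - 1) / (2 * space)


# With about 75,000 nodes, the probability that any two share an ID by chance:
print(f"{collision_probability(75_000):.2e}")
```

Under these assumptions, the same ID showing up under two IP addresses is far more likely to be the same bot than two bots that happened to draw the same suffix.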

What’s next?

I definitely need to look deeper by doing some reverse engineering on a sample. It could give me a better understanding of the bot types and the persistence of the main configurations.

It could also be interesting to look at the EC signature. If someone is able to recover the private key or find a vulnerability in the implementation, then it would be possible to generate a valid configuration and spread it on the DHT network which could kill the botnet.

I am also wondering what to do with all this data. Since compromised hosts then attack other live hosts to increase the botnet size, it could be interesting to add the IP addresses to a bad reputation list or to an IOC list.

Conclusion

This wasn’t a huge or complex Python project, but I learned a lot about botnets, malware packing, and P2P / DHT protocols.
However, a lot of questions are left unanswered. I definitely need to look deeper to find the answers.

Do not hesitate to check out the source code if you’re interested in extracting Mozi configuration or creating your own tracker. Everything is available on GitHub : Kn0wl3dge/mozitools