We’ll tell you a cool story about how “third parties” tried to interfere with the work of our customers, and how this problem was solved.
How it all started
It all began on the morning of October 31, the last day of the month, when many people are rushing to close urgent and important matters.
One of our partners, who hosts several virtual machines for his clients in our cloud, reported that between 9:10 and 9:20 several Windows servers running at our Ukrainian site stopped accepting connections to the remote access service. Users could not reach their desktops, but after a few minutes the problem seemed to resolve itself.
We pulled up the statistics for the communication channels but found no traffic bursts or failures. We looked at the load on computing resources: no anomalies. So what was that?
Then another partner, who hosts another hundred servers in our cloud, reported the same problems from some of their clients. It turned out that the servers themselves were reachable (they responded correctly to ping and other requests), but the remote access service on them alternately accepted and rejected new connections. Moreover, the affected servers were at different sites, with traffic arriving over different data channels.
So let’s look at this traffic. A packet requesting a new connection arrives at the server:
xx:xx:xx.xxxxxx IP xxx.xxx.xxx.xxx.58355 > 192.168.xxx.xxx.3389: Flags [S], seq 467744439, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
The server receives this packet but rejects the connection, replying with a reset (the R flag):
xx:xx:xx.xxxxxx IP 192.168.xxx.xxx.3389 > xxx.xxx.xxx.xxx.58355: Flags [R.], seq 0, ack 467744440, win 0, length 0
This means the problem is clearly caused not by some malfunction in the infrastructure but by something else. Maybe all the users have licensing problems with Remote Desktop? Maybe some malware infiltrated their systems earlier and was activated today, as happened with XData and Petya a couple of years ago?
While we were sorting this out, we received similar reports from several more clients and partners.
And what was happening on these machines?
The event logs were full of messages about password-guessing attempts.
Typically, such attempts are logged on every server where the standard port (3389) is used for the remote access service and access is allowed from anywhere. The Internet is full of bots that constantly scan all available connection points and try to guess a password (for this reason, we strongly recommend using complex passwords instead of “123”). That day, however, the intensity of these attempts was unusually high.
What to do?
Recommend that clients spend a lot of time reconfiguring a huge number of end users to switch to another port? Not a good idea; customers would not be happy. Recommend allowing access only through a VPN? Setting up IPSec connections in a hurry and a panic, for those who do not have them yet, would hardly be convenient for clients either. That said, we always recommend hiding servers on a private network and are ready to help with the setup. For those who like to figure things out on their own, we will share instructions for setting up IPSec/L2TP in our cloud in site-to-site or road-warrior mode, and if anyone wants to run a VPN service on their own Windows server, we are always ready to share tips on setting up standard RAS or OpenVPN. But however cool we are, this was not the best time for educational work among customers: the problem had to be eliminated as quickly as possible and with minimal strain on users.
The solution we implemented was as follows. We set up analysis of passing traffic to monitor all attempts to establish a TCP connection to port 3389 and to pick out the addresses that, within 150 seconds, try to establish connections with more than 16 different servers on our network. These are the sources of the attack. (Of course, if a client or partner has a real need to connect to that many servers from a single source, such sources can always be added to the “white list”.) Moreover, if more than 32 such addresses are detected in one class C network within those 150 seconds, it makes sense to block the entire network. A block is set for 3 days, and if no attacks come from the source during that time, it is automatically removed from the “black list”. The list of blocked sources is updated every 300 seconds.
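As a rough illustration only (this is not the production scripts), the detection rule described above could be sketched in Python as follows. The thresholds come from the text; the class name ScanDetector, the helper net_of, the method names, and the in-memory bookkeeping are all assumptions made for the example, and capturing the SYN packets and pushing the list to a firewall are left out.

```python
import ipaddress
from collections import defaultdict, deque

WINDOW = 150           # seconds in which connection attempts are counted
MAX_TARGETS = 16       # more distinct servers than this marks a source as an attacker
MAX_PER_NET = 32       # more attacking addresses than this in one /24 blocks the network
BLOCK_TTL = 3 * 86400  # a block is lifted after 3 days without new attacks


def net_of(addr):
    """Return the /24 ("class C") network containing an IPv4 address."""
    return str(ipaddress.ip_network(f"{addr}/24", strict=False))


class ScanDetector:
    """Hypothetical in-memory detector; all names here are illustrative."""

    def __init__(self, whitelist=()):
        self.whitelist = set(whitelist)
        self.recent = defaultdict(deque)  # source -> (timestamp, destination) pairs
        self.hosts = {}                   # blocked address -> time of last attack
        self.nets = {}                    # blocked /24 network -> time of last attack

    def observe(self, ts, src, dst):
        """Feed one SYN to port 3389 (src -> dst) seen at time ts, in seconds."""
        if src in self.whitelist:
            return
        seen = self.recent[src]
        seen.append((ts, dst))
        while ts - seen[0][0] > WINDOW:   # forget attempts older than the window
            seen.popleft()
        if len({d for _, d in seen}) <= MAX_TARGETS:
            return
        self.hosts[src] = ts
        net = net_of(src)
        # Count attackers from the same /24 that were active within the window.
        active = [h for h, t in self.hosts.items()
                  if ts - t <= WINDOW and net_of(h) == net]
        if len(active) > MAX_PER_NET:
            self.nets[net] = ts

    def blacklist(self, now):
        """Drop expired blocks and return the current list (call every 300 s)."""
        self.hosts = {h: t for h, t in self.hosts.items() if now - t < BLOCK_TTL}
        self.nets = {n: t for n, t in self.nets.items() if now - t < BLOCK_TTL}
        return sorted(self.nets) + sorted(self.hosts)
```

In this sketch every new attempt from an already blocked source refreshes its timestamp, so a block only expires after three quiet days, and a slow scanner that touches fewer than 17 servers in any 150-second window is not caught.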
We are ready to share the source code of this system. There is nothing super complicated in it (just a few simple scripts written in literally a couple of hours), and it can be adapted not only to protect against this kind of attack but also to detect and block any network scanning attempts.
In addition, we changed the settings of our monitoring system, which now closely watches how a control group of virtual servers in our cloud reacts to an attempt to establish an RDP connection: if there is no response within a second, that is a reason to pay attention.
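A check of this kind can be approximated with a plain TCP connection attempt and a one-second timeout. The sketch below is an assumption about how such a probe might look, not the actual monitoring configuration: the function name rdp_port_responds is hypothetical, and it only verifies that the TCP handshake on port 3389 completes, without performing any RDP negotiation.

```python
import socket
import time


def rdp_port_responds(host, port=3389, timeout=1.0):
    """Try to open a TCP connection to host:port.

    Returns (ok, elapsed_seconds); ok is False if the connection is
    refused, reset, or not established within the timeout.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            ok = True
    except OSError:  # covers refusals, resets and timeouts
        ok = False
    return ok, time.monotonic() - start
```

Run against each server in the control group, a False result (or an elapsed time close to the timeout) would be the signal to look closer.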
The solution turned out to be quite effective: there have been no more complaints from customers and partners, or from the monitoring system. New addresses and entire networks regularly land on the “black list”, which shows that the attack continues but no longer affects our customers’ work.
Safety in numbers
Today we learned that other operators have faced a similar problem. Some still believe that Microsoft made changes to the remote access service code (if you remember, we suspected the same thing on the first day but rejected that theory very quickly) and promise to do everything possible to find a solution soon. Others simply ignore the problem and advise clients to protect themselves (change the connection port, hide the server on a private network, and so on). We, on the very first day, not only solved the problem but also laid some groundwork for a broader threat detection system, which we plan to develop.