In our previous post on cloud security, we discussed how to protect IT resources in a public cloud and why traditional antivirus software is poorly suited to the task. In this post we continue the topic and look at how WAFs have evolved and which form is the better choice: hardware, software, or cloud.
What is WAF?
More than 75% of hacker attacks target vulnerabilities in web applications and websites, and such attacks usually go unnoticed by the information security infrastructure and security teams. Vulnerabilities in web applications in turn carry the risk of compromised user accounts, fraud, and leaked personal data, passwords, and credit card numbers. In addition, a vulnerable website can serve cybercriminals as an entry point into the corporate network.
A Web Application Firewall (WAF) is a firewall that blocks attacks on web applications: SQL injection, cross-site scripting, remote code execution, brute force, and authentication bypass, including attacks that exploit zero-day vulnerabilities. Application firewalls provide protection by monitoring the content of web pages, including HTML, DHTML, and CSS, and by filtering potentially malicious HTTP/HTTPS requests.
What were the first solutions?
The first attempts to create a Web Application Firewall were made in the early 1990s. At least three engineers are known to have worked in this area. The first is Gene Spafford, professor of computer science at Purdue University. He described the architecture of a proxy-based application firewall and published it in 1991 in the book “Practical UNIX Security”.
The second and third were information security specialists William Cheswick and Marcus Ranum of Bell Labs. They developed one of the first application firewall prototypes. DEC handled its distribution: the product was released under the name SEAL (Secure External Access Link).
But SEAL was not a full-fledged WAF. It was a classic network firewall with extended functionality: the ability to block attacks on FTP and RSH. For this reason, the first WAF is now considered to be a product of Perfecto Technologies (later Sanctum). In 1999 the company introduced the AppShield system. At that time, Perfecto Technologies was developing information security solutions for e-commerce, and online stores were the target audience for the new product. AppShield could analyze HTTP requests and block attacks based on dynamic security policies.
Shortly after AppShield, in 2002, the first open source WAF appeared: ModSecurity. It was created to popularize WAF technology and is still supported by the IT community. ModSecurity blocks attacks on applications using a standard set of regular expressions (signatures, that is, patterns against which requests are checked): the OWASP Core Rule Set.
As a result, the developers managed to achieve their goal – new WAF solutions began to appear on the market, including those built on the basis of ModSecurity.
A history of three generations
It is customary to distinguish three generations of WAF systems, which evolved as the technology developed.
First generation. Works with regular expressions (or grammars). This includes ModSecurity. The vendor studies the types of attacks on applications and writes patterns that describe legitimate and potentially malicious requests. The WAF checks requests against these lists and decides what to do in each case: block the traffic or let it through.
One example of regular-expression-based detection is the open source Core Rule Set project already mentioned. Another is Naxsi, which is also open source. Systems built on regular expressions have several drawbacks. In particular, when a new vulnerability is discovered, the administrator has to create additional rules by hand. In a large IT infrastructure there may be several thousand rules. Managing that many regular expressions is difficult, and evaluating them can reduce network performance.
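To make the first-generation approach concrete, here is a minimal sketch of signature matching. The three patterns and the function names are illustrative assumptions, not rules taken from the Core Rule Set or Naxsi, whose real rule sets are far larger and more careful.

```python
import re
from typing import List
from urllib.parse import unquote_plus

# Hypothetical, heavily simplified signatures for illustration only.
SIGNATURES = {
    "sqli_union": re.compile(r"\bunion\s+select\b", re.IGNORECASE),
    "xss_script": re.compile(r"<\s*script\b", re.IGNORECASE),
    "path_traversal": re.compile(r"\.\./"),
}


def match_signatures(raw_value: str) -> List[str]:
    """Return the names of all signatures that match a raw request value."""
    # Decode %xx escapes and '+' first, otherwise trivial encoding hides the payload.
    value = unquote_plus(raw_value)
    return [name for name, pattern in SIGNATURES.items() if pattern.search(value)]


def should_block(raw_value: str) -> bool:
    """Block the request if at least one signature matches."""
    return bool(match_signatures(raw_value))


print(match_signatures("id=1+UNION+SELECT+password+FROM+users"))
print(should_block("q=student+union+elections"))
```

Every new attack class needs another hand-written entry in the table, which is exactly the maintenance burden described above.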
Regular expressions also have a fairly high false-positive rate. The well-known linguist Noam Chomsky proposed a classification of grammars that divides them into four levels of complexity. Under this classification, regular expressions can describe only firewall rules that allow no deviation from the pattern. This means that attackers can easily trick a first-generation WAF. One way to bypass it is to add special characters to application requests that do not change the logic of the malicious payload but break the signature match.
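The bypass technique can be shown with a small sketch. The signature below is a hypothetical first-generation rule, and the payload uses an inline SQL comment, which many SQL dialects treat like whitespace, in place of the space the rule expects.

```python
import re

# Hypothetical first-generation signature for "UNION SELECT".
NAIVE_SIGNATURE = re.compile(r"\bunion\s+select\b", re.IGNORECASE)


def strip_sql_comments(value: str) -> str:
    """Replace inline /* ... */ comments with a space before matching."""
    return re.sub(r"/\*.*?\*/", " ", value, flags=re.DOTALL)


plain = "1 UNION SELECT login, password FROM users"
evasive = "1 UNION/**/SELECT login, password FROM users"

print(bool(NAIVE_SIGNATURE.search(plain)))    # True: the signature fires
print(bool(NAIVE_SIGNATURE.search(evasive)))  # False: the comment breaks the match
print(bool(NAIVE_SIGNATURE.search(strip_sql_comments(evasive))))  # True after normalization
```

Normalizing the input closes this particular gap, but each new trick needs another normalization step or another rule, which is why the approach does not scale.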
Second generation. To get around the performance and accuracy problems of first-generation WAFs, second-generation application firewalls were developed. They introduced parsers that are responsible for identifying strictly defined types of attacks (HTML, JS, etc.). These parsers work with special tokens that describe requests (for example: variable, string, unknown, number). Potentially malicious token sequences are kept in a separate list, against which the WAF checks incoming requests. This approach was first shown at the Black Hat 2012 conference in the form of libinjection, a C/C++ library for detecting SQL injection.
Compared to first-generation WAFs, specialized parsers can be faster. However, they did not remove the need to reconfigure the system by hand when new kinds of attacks appear.
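The token-based idea can be sketched as follows. This is a toy tokenizer written for this post, not libinjection's actual algorithm or API: the token codes, the short keyword list, the five-token limit, and the two blocked fingerprints are all assumptions chosen for illustration.

```python
import re

# (group name, one-letter code, pattern); order matters: comments before operators,
# keywords before generic identifiers.
TOKEN_SPEC = [
    ("comment", "c", r"--[^\n]*|#[^\n]*|/\*.*?\*/"),
    ("string", "s", r"'(?:[^'\\]|\\.)*'?"),
    ("number", "n", r"\d+(?:\.\d+)?"),
    ("keyword", "k", r"\b(?:union|select|from|where|or|and)\b"),
    ("variable", "v", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("operator", "o", r"[=<>!+\-*/,();]"),
]
CODES = {name: code for name, code, _ in TOKEN_SPEC}
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, _, pattern in TOKEN_SPEC),
    re.IGNORECASE | re.DOTALL,
)

# Hypothetical list of token sequences considered malicious.
BAD_FINGERPRINTS = {"nknon", "nkkvk"}


def fingerprint(value: str, max_tokens: int = 5) -> str:
    """Turn input into a short string of token codes, ignoring spaces and comments."""
    codes = []
    pos = 0
    while pos < len(value):
        if value[pos].isspace():
            pos += 1
            continue
        match = MASTER.match(value, pos)
        if match is None:
            codes.append("?")  # unknown token
            pos += 1
            continue
        if match.lastgroup != "comment":
            codes.append(CODES[match.lastgroup])
        pos = match.end()
    return "".join(codes)[:max_tokens]


def is_suspicious(value: str) -> bool:
    return fingerprint(value) in BAD_FINGERPRINTS


print(fingerprint("1 OR 1=1 -- "))
print(fingerprint("1 UNION/**/SELECT password FROM users"))
print(is_suspicious("blue shoes"))
```

Because the check runs on token sequences rather than raw text, the comment trick that defeats a plain regular expression no longer changes the result. A new attack class such as NoSQL injection would still need its own tokenizer, which is the limitation noted above.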
Third generation. The evolution in third-generation detection logic lies in applying machine learning methods, which make it possible to bring the detection grammar as close as possible to the real SQL/HTML/JS grammar of the protected systems. This detection logic is able to adapt a Turing machine to cover recursively enumerable grammars. Moreover, the task of building an adaptable Turing machine was considered unsolvable until the first studies of neural Turing machines were published.
Machine learning makes it possible to adapt any grammar to cover any type of attack without creating signature lists by hand, as first-generation detection requires, and without developing new tokenizers and parsers for new attack types, such as Memcached, Redis, and Cassandra injections or SSRF, as the second-generation methodology requires.
Combining all three generations of detection logic, we can draw a new diagram in which the third generation of detection is represented by a red outline. This generation includes one of the solutions that we implement in the cloud together with Onsec, the developer of Wallarm, an adaptive protection platform for web applications and APIs.
Now the detection logic uses feedback from the application to tune itself. In machine learning, this feedback loop is called reinforcement. Typically, one or more of the following types of reinforcement are used:
- Analysis of application response behavior (passive)
- Scanner / fuzzer (active)
- Report files / interceptors / traps (post factum)
- Manual (determined by a supervisor)
As a result, third-generation detection logic also addresses the important problem of accuracy. It becomes possible not only to reduce false positives and false negatives, but also to recognize legitimate true negatives, such as the use of SQL command elements in a control panel, the loading of web page templates, AJAX requests related to JavaScript errors, and others.
Next, we consider the technological capabilities of various WAF implementation options.
Hardware, software or cloud – what to choose?
One option for implementing an application firewall is a hardware solution. Such systems are specialized computing devices that a company installs locally in its own data center. In this case, though, you have to purchase the equipment and pay integrators to set it up and debug it (if the company has no IT department of its own). In addition, any equipment becomes obsolete and wears out, so customers have to budget for hardware upgrades.
Another WAF deployment option is a software implementation. The solution is installed as an add-on to other software (for example, ModSecurity is configured on top of Apache) and runs on the same server. Such solutions can typically be deployed both on a physical server and in the cloud. Their drawbacks are limited scalability and limited vendor support.
The third option is to use a WAF from the cloud. Such solutions are offered by cloud providers as a subscription service. The company does not need to purchase and set up specialized hardware, since these tasks fall to the service provider. An important point: a modern cloud WAF does not require migrating resources to the provider’s platform. A site can be deployed anywhere, even on-premise.
Below we explain why companies are increasingly looking toward cloud WAFs.
What a cloud WAF can do
In terms of technological capabilities:
- The provider is responsible for updates. The WAF is provided by subscription, so the service provider keeps updates and licenses current. Updates cover not only software but also hardware: the provider upgrades and maintains the server fleet. It is also responsible for load balancing and redundancy. If a WAF server fails, traffic is immediately redirected to another machine. Sensible traffic distribution helps avoid situations where the firewall enters fail-open mode, that is, cannot cope with the load and stops filtering requests.
- Virtual patch. Virtual patches restrict access to compromised parts of the application until the developer closes the vulnerability. As a result, the cloud provider’s customer can calmly wait until the supplier of the affected software publishes official patches. Doing this as quickly as possible is a priority for the software provider. For example, in the Wallarm platform, a separate software module is responsible for virtual patching. An administrator can add custom regular expressions to block malicious requests. The system can also mark some requests with the Confidential Data flag. Their parameters are then masked, and the requests themselves are never passed outside the firewall’s working area.
- Built-in perimeter and vulnerability scanner. This lets you determine the network boundaries of the IT infrastructure on your own, using data from DNS queries and the WHOIS protocol. The WAF then automatically analyzes the services running inside the perimeter (performs port scanning). The firewall can detect all common types of vulnerabilities (SQLi, XSS, XXE, etc.) as well as software configuration errors, for example unauthorized access to Git and BitBucket repositories and anonymous access to Elasticsearch, Redis, and MongoDB.
- Attacks are monitored using cloud resources. As a rule, cloud providers have large amounts of computing power, which makes it possible to analyze threats quickly and accurately. A cluster of filtering nodes is deployed in the cloud, and all traffic passes through it. These nodes block attacks on web applications and send statistics to the Analytics Center, which uses machine learning algorithms to update blocking rules for all protected applications. Security rules adapted in this way minimize the firewall’s false positives.
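The virtual patching and confidential-data masking described in the list above can be sketched as follows. This is not how any particular platform implements them: the `VirtualPatch` structure, the `/reports/export` endpoint, the `sort` parameter, and the list of confidential parameter names are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


@dataclass(frozen=True)
class VirtualPatch:
    """Temporary rule: on one endpoint, let a parameter through only if it looks safe."""
    name: str
    path: str            # exact path of the vulnerable endpoint
    param: str           # parameter the exploit travels in
    allowed: re.Pattern  # values must fully match this pattern


# Hypothetical patch for an imaginary endpoint with an injectable "sort" parameter.
PATCHES = [
    VirtualPatch(
        name="export-sort-injection",
        path="/reports/export",
        param="sort",
        allowed=re.compile(r"[A-Za-z0-9_]+"),
    ),
]

# Hypothetical parameters flagged as confidential data.
CONFIDENTIAL_PARAMS = {"password", "card_number"}


def blocked_by(url: str) -> Optional[str]:
    """Return the name of the virtual patch that blocks this URL, or None."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    for patch in PATCHES:
        if parts.path != patch.path:
            continue
        for key, value in params:
            if key == patch.param and not patch.allowed.fullmatch(value):
                return patch.name
    return None


def mask_for_log(url: str) -> str:
    """Replace confidential parameter values before the URL leaves the firewall."""
    parts = urlsplit(url)
    masked = [
        (key, "MASKED" if key in CONFIDENTIAL_PARAMS else value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(masked)))


print(blocked_by("/reports/export?sort=name%3BDROP+TABLE+users"))
print(mask_for_log("/login?user=bob&password=hunter2"))
```

The patch is deliberately narrow: it touches one endpoint and one parameter, so it can be removed as soon as the official fix is deployed.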
Now a little about the organizational and management aspects of a cloud WAF:
- Transition to OpEx. With a cloud WAF there is no upfront implementation cost, since the provider has already paid for all the hardware and licenses, and the service is paid for by subscription.
- Different tariff plans. A user of the cloud service can quickly enable or disable additional options. Features are managed from a single control panel, which is itself protected: access is over HTTPS, and there is two-factor authentication based on TOTP (Time-based One-Time Password Algorithm).
- Connection via DNS. You can change the DNS records yourself and configure routing on the network. There is no need to hire and train dedicated specialists for this. As a rule, the provider’s technical support can help with the configuration.
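For readers unfamiliar with the TOTP mechanism mentioned in the list above, here is a minimal sketch following RFC 6238 with HMAC-SHA1. It shows the algorithm only; the function names and the one-step tolerance window are assumptions, and it says nothing about how a specific provider's control panel implements two-factor authentication.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at_time: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Compute a time-based one-time password (RFC 6238, HMAC-SHA1)."""
    if at_time is None:
        at_time = time.time()
    counter = int(at_time // step)  # number of time steps since the Unix epoch
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F      # dynamic truncation, as in RFC 4226
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify(secret: bytes, submitted: str, at_time: Optional[float] = None,
           step: int = 30, digits: int = 6, window: int = 1) -> bool:
    """Accept codes from the current step and `window` steps either side (clock drift)."""
    if at_time is None:
        at_time = time.time()
    for drift in range(-window, window + 1):
        expected = totp(secret, at_time + drift * step, step, digits)
        # Constant-time comparison avoids leaking how many digits matched.
        if hmac.compare_digest(expected.encode(), submitted.encode()):
            return True
    return False


# Test vector from RFC 6238, Appendix B (SHA-1, 8 digits, T = 59 s).
print(totp(b"12345678901234567890", at_time=59, digits=8))  # 94287082
```

The server and the user's authenticator app share only the secret and a roughly synchronized clock, so a code intercepted once stops being valid after the window passes.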
WAF technologies have evolved from simple firewalls with empirical rules to complex security systems with machine learning algorithms. Application firewalls now offer a wide range of functions that were difficult to implement in the 1990s. In many ways, this new functionality was made possible by cloud technology. WAF solutions and their components continue to evolve, as do other areas of information security.