The holiday season is here, and so is bad bot traffic, which rose by 50 percent. The surge in year-end online shopping unsurprisingly attracts bad actors seeking out the unprepared and poorly defended. Many businesses struggle with cybersecurity just as they see huge increases in transactions.

One of the most common largely bot-driven attacks businesses have to deal with is the distributed denial-of-service (DDoS) attack. It may be used by cybercriminals running DDoS-for-ransom schemes, by state-sponsored threat actors seeking to disrupt economic activity, by anti-capitalism activists, and by unscrupulous businesses attempting to cut off their competitors' access to customers.

One of the best defenses against DDoS is rate limiting, a technique that regulates network traffic to prevent certain users from overwhelming system resources. However, rate limiting is not as simple as it sounds, so a DIY implementation may not work for everyone.

The nature of rate limiting

What is rate limiting in its more complex aspects? For one, rate limiting is designed to operate within applications, not on the web server. It differs from similar server-based functions that involve toggles and thresholds for automatically suspending or throttling traffic.

Rate limiting solutions are designed to automatically determine whether the requests made by specific users (identified by their IP address) are normal or malicious. The determination is based mostly on the time elapsed between requests. If a user associated with a specific IP address makes an abnormal number of requests within a given timeframe, the rate-limiting system refuses to serve subsequent requests or throttles the traffic.

In rate-limited applications, a message may be displayed to inform users that they are making too many requests. Often, it is through these messages that users discover their device has been infected by malware or a bot that generates automated requests to flood a specific target server.

Another aspect of rate limiting that may not be as straightforward as many would expect is the choice of algorithm. Several types of algorithms are in use; three of the most common are the fixed-window, sliding-window, and leaky-bucket algorithms.

The fixed-window algorithm restricts the number of requests a user or a specific IP address can make within a specified timeframe. A user may be blocked from connecting to the server after exceeding the allowable number of requests but will be able to make additional requests once the window (for example, the full minute within which the limit was imposed) elapses.
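A minimal sketch of the fixed-window idea in Python follows; the class name, parameters, and defaults are illustrative, not taken from any particular product:

```python
import time

class FixedWindowLimiter:
    """Allow at most `limit` requests per client in each fixed window."""

    def __init__(self, limit=60, window_seconds=60):
        self.limit = limit
        self.window = window_seconds
        self.counters = {}  # (client_id, window index) -> request count

    def allow(self, client_id, now=None):
        now = time.time() if now is None else now
        # Every timestamp inside the same window maps to the same index,
        # so counts reset automatically when a new window begins.
        key = (client_id, int(now // self.window))
        self.counters[key] = self.counters.get(key, 0) + 1
        return self.counters[key] <= self.limit
```

Note the characteristic fixed-window behavior: all requests within one interval share a counter, and the counter resets at the interval boundary regardless of when the first request arrived.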

Sliding-window rate limiting is comparable to fixed-window rate limiting but differs in how the timeframe for a restriction is determined. In fixed-window rate limiting, requests are counted and limits are enforced in fixed intervals (e.g., 1:00-1:01, 1:01-1:02, and so on) regardless of when a request arrives. In the sliding-window algorithm, the timeframe starts when a request is made. Hence, a request made at 1:00:33 falls within a full-minute window of 1:00:33 to 1:01:33, not the fixed 1:00-to-1:01 interval.
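One common way to implement this is to keep a log of recent request timestamps per client and count only those inside the rolling window. The sketch below assumes that approach; names and limits are illustrative:

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Allow at most `limit` requests in any rolling `window`-second span."""

    def __init__(self, limit=60, window_seconds=60):
        self.limit = limit
        self.window = window_seconds
        self.events = {}  # client_id -> deque of request timestamps

    def allow(self, client_id, now=None):
        now = time.time() if now is None else now
        q = self.events.setdefault(client_id, deque())
        # Drop timestamps that have slid out of the window.
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False  # window already full; reject without recording
        q.append(now)
        return True
```

Unlike the fixed window, there is no boundary reset here: a burst is throttled for exactly one window length measured from the requests themselves.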

Leaky-bucket rate limiting is not timeframe-based. Instead, it follows a first-come, first-served protocol in which requests are placed in a queue. The administrator sets a specific queue size, and requests that arrive when the queue is already full are automatically dropped or refused.
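The queue-based mechanic can be sketched as follows (a simplified illustration; in a real system the "leak" would run on a timer so the queue drains at a steady rate):

```python
from collections import deque

class LeakyBucket:
    """Queue requests up to `capacity`; drop any that arrive when full."""

    def __init__(self, capacity=100):
        self.capacity = capacity
        self.queue = deque()

    def submit(self, request):
        if len(self.queue) >= self.capacity:
            return False  # bucket is full: the request is dropped
        self.queue.append(request)
        return True

    def leak(self):
        """Called at a fixed rate to process the next queued request."""
        return self.queue.popleft() if self.queue else None
```

However bursty the incoming traffic is, the server only ever sees requests at the steady rate at which `leak` is called, which is the point of the algorithm.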

Organizations will have to decide which algorithm to use based on their specific needs and preferences. The timeframe-based algorithms are often good enough, but there are instances when going "leaky bucket" makes more sense. For example, multiple connections from a single IP address may be allowed to achieve faster multi-segment downloads of large files.

Multi-segment downloads deliver a better user experience, but they can quickly exhaust resources. Hence, a fixed queue of requests is a good compromise. It may reduce the number of users served per timeframe, but it provides a significantly better experience because of the faster download speeds every user gets.

Types of rate limiting controls

Rate limits can be imposed on a per-user, geographic, or per-server basis. Administrators can limit network traffic from a specific user, IP address, or device. The restriction may also be determined by the location of the device making a request. Additionally, it can be based on the server to which the request will be sent and processed.
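In practice this means a request is checked against several keys at once, each with its own threshold. A minimal sketch of keyed limits (the dimensions, field names, and per-minute thresholds here are all hypothetical):

```python
from collections import defaultdict

# Illustrative per-minute thresholds for each dimension.
LIMITS = {"ip": 100, "user": 300, "geo": 10000}

counters = defaultdict(int)  # (dimension, value, minute) -> count

def allow(request, minute):
    """Admit the request only if every applicable limit still has room."""
    keys = [("ip", request["ip"]),
            ("user", request.get("user", "anonymous")),
            ("geo", request.get("country", "unknown"))]
    if any(counters[(dim, val, minute)] >= LIMITS[dim] for dim, val in keys):
        return False
    for dim, val in keys:
        counters[(dim, val, minute)] += 1
    return True
```

A single aggressive IP address trips the "ip" limit long before the broader "geo" limit is reached, so legitimate traffic from the same region continues to be served.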

This may sound confusing, since it was mentioned earlier that rate limiting works within applications, not on web servers. Here, however, rate limiting may actually be implemented through a server. To clarify, rate limiting through servers is simply an option meant to achieve certain goals.

Ultimately, the rate-limiting system still runs within applications and obtains most of the information it needs to trigger restrictions from them. Rate limiting by server is usually a way to optimize server use, setting higher limits on more active servers and lower limits on less active (or sometimes less reliable) ones. It can also be a way to avoid unpleasant customer experiences by reducing or avoiding the routing of requests to servers that are currently unreliable or malfunctioning.

More than DDoS protection

Aside from DDoS, rate limiting may also be used to address credential stuffing. This is a form of cyberattack wherein bots inject stolen login credentials into login forms in an attempt to gain access. The stolen usernames and passwords are not necessarily intended for a specific website or login form, but bots try them anyway, knowing that many people reuse the same credentials across multiple online accounts.

Rate limiting is also used as protection against brute-force attacks. These are similar to credential stuffing, except that they rely on guesswork: the login credentials are randomly generated by bots. Websites or apps that do not limit the number of unsuccessful login attempts provide an opening for this attack to work.
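Applied to logins, rate limiting usually means counting recent failed attempts per account and refusing further attempts past a threshold. A sketch of that idea, with hypothetical threshold and lockout values:

```python
from collections import defaultdict

MAX_FAILURES = 5       # failed attempts tolerated per account (illustrative)
LOCKOUT_SECONDS = 900  # 15-minute lockout window (illustrative)

failures = defaultdict(list)  # username -> timestamps of failed logins

def login_allowed(username, now):
    """Permit a login attempt only if recent failures are under the cap."""
    recent = [t for t in failures[username] if now - t < LOCKOUT_SECONDS]
    failures[username] = recent  # forget failures older than the window
    return len(recent) < MAX_FAILURES

def record_failure(username, now):
    failures[username].append(now)
```

Because only failures are counted, legitimate users who mistype a password once or twice are unaffected, while a bot cycling through guesses is locked out almost immediately.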

Additionally, rate limiting helps prevent business sabotage such as inventory denial. Some bots are designed to simulate customers and hoard inventory by adding items to shopping carts (without checking out) or by initiating transactions and never completing them. Rate limiting can help reduce the adverse impact of this scheme.

Moreover, rate limiting helps prevent attempts to scrape or steal sensitive data. For example, some unscrupulous ecommerce businesses collect pricing information from their competitors to help them set prices that are more attractive to bargain-hunting customers or those who frequent comparison websites. Rate limiting curtails their ability to obtain that information frequently or in real time.

In conclusion

Rate limiting is a highly useful method for addressing the problem of bad bots. However, it is important to see it as a tool that can involve multiple features, functions, and configurations. Some rate-limiting solutions are better than others; some offer advanced bot protection or sophisticated bot management that includes defenses for applications, microservices, and APIs. It is advisable to understand the technique thoroughly and evaluate rate-limiting solutions carefully.