Load balancing client connections to application servers is crucial for the application experience. Over the years that load balancers have been deployed, a number of algorithms have been invented and optimized to deliver improved performance.
Round robin is the most straightforward load balancing algorithm. Client requests get distributed to application servers in simple rotation. For example, if you have three application servers: the first client request goes to the first application server in the list, the second client request to the second application server, the third client request to the third application server, the fourth to the first application server, and so on.
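The rotation described above can be sketched in a few lines of Python. The server names are placeholders chosen for illustration, and a real load balancer would also skip servers that fail health checks.

```python
from itertools import cycle


def round_robin(servers):
    """Return an endless iterator that hands out servers in simple rotation."""
    return cycle(servers)


# Hypothetical three-server pool.
picker = round_robin(["app1", "app2", "app3"])

# The fourth request wraps around to the first server again.
first_four = [next(picker) for _ in range(4)]
print(first_four)  # ['app1', 'app2', 'app3', 'app1']
```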
Weighted round robin is similar to the round robin algorithm, but it adds the ability to spread incoming client requests across the server pool according to the relative capacity of each server. It is most appropriate when the servers have varying capabilities or available resources. The administrator assigns each application server a weight, based on criteria of their choosing, that indicates the relative traffic-handling capability of that server in the pool.
Servers assigned a higher weight are allocated a higher percentage of the incoming requests. In a three-server pool, if Server 1 is twice as powerful as Servers 2 and 3, then Server 1 could be given twice the weight of each of the other two servers.
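As a minimal sketch of the three-server example, assume weights of 2, 1 and 1 (illustrative values, not taken from any product). The simplest approach repeats each server in the rotation once per unit of weight. Production implementations often interleave the picks more smoothly, but the long-run proportions are the same.

```python
from collections import Counter
from itertools import cycle, islice


def weighted_round_robin(weights):
    """Rotate through servers, repeating each one `weight` times per cycle."""
    expanded = [
        server
        for server, weight in weights.items()
        for _ in range(weight)
    ]
    return cycle(expanded)


# Assumed weights: Server 1 is twice as powerful as Servers 2 and 3.
weights = {"server1": 2, "server2": 1, "server3": 1}
picker = weighted_round_robin(weights)

# Over 400 requests, server1 receives half and the others a quarter each.
counts = Counter(islice(picker, 400))
print(counts["server1"], counts["server2"], counts["server3"])  # 200 100 100
```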
Least connections load balancing is a dynamic algorithm in which each client request is sent to the application server with the fewest active connections at the moment the request is received. Even when application servers have similar specifications, one server can become overloaded because some of its connections are longer-lived. This algorithm takes the current connection load into account and steers new requests away from servers that are already busy.
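A minimal sketch of the selection step, assuming the load balancer keeps a per-server count of active connections (the dictionary and server names here are hypothetical):

```python
def least_connections(active):
    """Return the server that currently has the fewest active connections."""
    return min(active, key=active.get)


# Hypothetical snapshot of active connections per server.
active = {"app1": 12, "app2": 4, "app3": 9}

chosen = least_connections(active)
active[chosen] += 1  # the new request now counts against that server
print(chosen)  # app2
```

The count would be decremented again when the connection closes, so long-lived connections keep a server's count high and push new requests elsewhere.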
Weighted least connections extends the least connections algorithm to account for differing application server characteristics. The administrator assigns a weight to each application server based on its relative processing power and available resources. Load balancing decisions are based on both the active connections and the assigned server weights (e.g., if two servers are tied for the lowest number of connections, the server with the higher weight is preferred).
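One common way to combine the two inputs, shown here as an assumption rather than as any particular product's formula, is to compare active connections divided by weight. With equal connection counts, the higher-weight server has the lower ratio and wins, which matches the tie-break described above.

```python
def weighted_least_connections(servers):
    """Pick the server with the lowest connections-per-weight ratio.

    `servers` maps a name to (active_connections, weight); weights must be > 0.
    """
    def load(name):
        active, weight = servers[name]
        # Secondary key prefers the higher weight if ratios are equal.
        return (active / weight, -weight)

    return min(servers, key=load)


# Hypothetical pool: app1 has the most capacity (weight 2).
pool = {"app1": (10, 2), "app2": (6, 1), "app3": (10, 1)}
print(weighted_least_connections(pool))  # app1 (ratio 5.0 beats 6.0 and 10.0)

# Tie on connections: the higher weight is preferred.
tied = {"small": (4, 1), "large": (4, 3)}
print(weighted_least_connections(tied))  # large
```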
Resource-based (or adaptive) load balancing makes decisions based on status indicators that the load balancer retrieves from the application servers. The status is determined by an agent running on each server. LoadMaster queries each server regularly for this status information and then sets a dynamic weight for each server accordingly. In effect, this load balancing method performs a detailed health check on the application servers.
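The idea of turning agent-reported status into a dynamic weight can be sketched as follows. The report fields (`cpu_pct`, `memory_pct`) and the headroom formula are hypothetical and chosen only for illustration. They are not the format or calculation used by LoadMaster or any specific agent.

```python
def dynamic_weight(cpu_pct, memory_pct, max_weight=100):
    """Derive a weight from the scarcer of two resources (0-100 percent used)."""
    busiest = max(cpu_pct, memory_pct)
    return max(0, max_weight - busiest)


# Hypothetical status reports collected from an agent on each server.
reports = {
    "app1": {"cpu_pct": 90, "memory_pct": 40},
    "app2": {"cpu_pct": 25, "memory_pct": 50},
    "app3": {"cpu_pct": 100, "memory_pct": 30},
}

dynamic_weights = {
    name: dynamic_weight(**report) for name, report in reports.items()
}
print(dynamic_weights)  # {'app1': 10, 'app2': 50, 'app3': 0}

# A weight of 0 takes a saturated server out of rotation.
eligible = [name for name, weight in dynamic_weights.items() if weight > 0]
print(eligible)  # ['app1', 'app2']
```

The resulting weights could then feed a weighted algorithm such as weighted round robin, recalculated each time the servers are polled.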
In the fixed weighting load balancing algorithm, a weight gets assigned to each application server based on criteria representing each server's relative traffic-handling capability. The application server with the highest weight will receive all of the traffic. If the application server with the highest weight isn't available to handle more connections, the load balancer will direct all traffic to the next highest-weight application server.
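The failover behaviour described above can be sketched like this, assuming the load balancer already knows which servers are currently available (server names and weights are illustrative):

```python
def fixed_weighting(weights, available):
    """Send all traffic to the highest-weight server that is available."""
    candidates = [server for server in weights if server in available]
    if not candidates:
        return None  # nothing can take traffic
    return max(candidates, key=weights.get)


# Hypothetical weights: higher means more preferred.
weights = {"primary": 100, "standby": 50, "last_resort": 10}

all_up = fixed_weighting(weights, {"primary", "standby", "last_resort"})
print(all_up)  # primary

primary_down = fixed_weighting(weights, {"standby", "last_resort"})
print(primary_down)  # standby
```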
The weighted response time algorithm uses an application server's response time to calculate a server weight. The application server that is responding the fastest receives the next request. A use case for weighted response time load balancing is where rapid application response time is the paramount concern.
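A minimal sketch, assuming the load balancer records recent response times per server (for example from health checks) and favours the lowest average. The sample values and the use of a plain mean are illustrative assumptions. Real implementations may use other smoothing methods.

```python
from statistics import mean


def fastest_server(samples):
    """Return the server with the lowest mean response time (in seconds)."""
    return min(samples, key=lambda server: mean(samples[server]))


# Hypothetical recent response times, in seconds.
samples = {
    "app1": [0.120, 0.110],
    "app2": [0.045, 0.055],
    "app3": [0.300],
}

next_target = fastest_server(samples)
print(next_target)  # app2
```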
The source IP hash load balancing algorithm uses the client's source and destination IP addresses to generate a hash key that ties the client to a particular server. Because the same key is regenerated if the session disconnects, reconnection requests are directed to the same server that was used previously. This is called server affinity. This load balancing method is most appropriate when a client must always return to the same server on each successive connection, such as in shopping cart scenarios where items placed in a cart on one server should still be there when the user connects later.
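The mapping from an address pair to a server can be sketched as below. The choice of SHA-256 and the modulo step are assumptions made for illustration, and the IP addresses come from the reserved documentation ranges. The key property is that the same inputs always select the same server.

```python
import hashlib


def source_ip_hash(client_ip, destination_ip, servers):
    """Map a (source, destination) address pair to one server, repeatably."""
    key = f"{client_ip}-{destination_ip}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an integer index into the pool.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["app1", "app2", "app3"]

first_visit = source_ip_hash("203.0.113.7", "198.51.100.10", servers)
reconnect = source_ip_hash("203.0.113.7", "198.51.100.10", servers)
print(first_visit == reconnect)  # True: the client returns to the same server
```

Note that with a plain modulo, adding or removing a server changes the mapping for many clients.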
The URL hash load balancing algorithm is similar to the source IP hashing, except that the hash created is based on the URL in the client request. This ensures that any client requests to a particular URL always go to the same back-end server. A typical use case would be to direct traffic to an optimized media server that can play video or an optimized server for a particular task.
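The same idea applied to URLs might look like the sketch below. Hashing only the path (and ignoring the query string) is an assumption made here for illustration, as are the example URL and server names.

```python
import hashlib
from urllib.parse import urlsplit


def url_hash(url, servers):
    """Map a request URL to one back-end server based on its path."""
    path = urlsplit(url).path or "/"
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


backends = ["media1", "media2", "web1"]

a = url_hash("https://example.com/video/intro.mp4", backends)
b = url_hash("https://example.com/video/intro.mp4?t=10", backends)
print(a == b)  # True: both requests for this path reach the same server
```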
DNS load balancing is commonly used in simple scenarios and also for distributing traffic across multiple data centers, possibly in different geographic regions.
Unsurprisingly, the simple static round robin algorithm is the most common. Because it is so simple to set up, it is often used to test whether a load balancer and server pool are communicating. In many simple deployments, it then remains the default after this initial setup until more dynamic algorithms are required. The weighted version of round robin is often used when the back-end servers are not identical but needs are still simple.
Many factors influence the choice of load balancing algorithm. The starting point should be to deploy dedicated load balancers rather than rely on DNS-based load balancing.
Use the algorithm that delivers the needed application experience with the lowest overhead. If the traffic patterns and load on the infrastructure are well known, then using simple algorithms with a properly sized application server pool may be sufficient. If load levels are not predictable, or if available infrastructure dictates a server pool with varying resource levels, then the more dynamic algorithms that take into account the state of the servers and the network would be more appropriate and deliver the best application experience.
When deciding which algorithm to use, remember that the load balancer may host other functionality such as proxy services, TLS/SSL encryption offloading, or Global Server Load Balancing (GSLB). The resources available to the load balancer should be able to support the core load balancing algorithms in use, plus any other services running on that instance.
The Kemp LoadMaster consultancy team is always available to assist you in making the right choices and getting the best user application experience from your available infrastructure.