The load balancer market is expected to grow to £4.7 billion by 2023, fuelled by trends such as mobile broadband, multi-cloud and hybrid cloud, virtualisation, remote working, and bring your own device (BYOD). As a result, IT departments are under tremendous pressure to ensure high availability for mission-critical applications such as ERP, communication and collaboration systems, and virtual desktop infrastructure (VDI).
The need for high availability
High availability is the ability of a system or system component to remain continuously operational for a desirably long period of time. It helps IT departments implement an architecture that uses redundancy and fault tolerance to enable continuous operation and fast disaster recovery. This applies to every element of the data centre, from high availability for applications to high availability for the load balancer or application delivery controller (ADC) that manages network traffic within and across the data centres in an environment.
High availability begins with identifying and eliminating single points of failure in the infrastructure that might trigger a service interruption—for example, by deploying redundant components to provide fault tolerance in the event that one of the devices fails. Load balancing, whether provided through a standalone device or as a feature of an ADC, facilitates this process by performing health checks on servers, detecting potential failures, and redirecting traffic as needed to ensure uninterrupted service.
While ensuring fault tolerance for servers is obviously critical, a high availability architecture must also consider the load balancing layer itself. If this layer becomes unable to perform its function effectively, the servers behind it risk being overloaded, potentially compromising their own health as well as application performance and availability. This makes redundancy just as important for the load balancer or ADC as for any other component in the data centre.
As with a high availability server cluster, there are several ways in which load balancers or ADCs can be deployed to provide high availability, including:
- Active-standby – The most common configuration, the active-standby model includes a fully redundant instance of each ADC which is brought online only in the event that its primary node fails. Each active ADC can be configured differently, though each active-standby pair will share the same configuration.
- Active-active – In this model, multiple similarly configured ADCs are deployed for routine use. In the event that one node fails, its traffic is taken over by one or more of the remaining nodes and load balanced as needed to ensure consistent service. This approach assumes that there will be sufficient capacity available across the cluster for it to function even when one ADC is unavailable.
- N+1 – Providing redundancy at a lower cost than active-standby, an N+1 configuration includes one or more extra ADCs that can be brought online in the event that any of the primary ADCs fails.
In each case, rapid failover enables fault tolerance and disaster recovery for the load balancing function so that application performance and application availability are not affected by the failure. Failover and traffic management are typically handled through a version of the Virtual Router Redundancy Protocol (VRRP) standard.
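The failover behaviour described above can be illustrated with a minimal Python sketch. It is not an implementation of VRRP: the `AdcNode` class, its `priority` and `capacity_rps` fields, and both function names are hypothetical, chosen only to show priority-based takeover in an active-standby pair and the spare-capacity assumption behind active-active.

```python
from dataclasses import dataclass


@dataclass
class AdcNode:
    name: str
    priority: int          # higher value wins the election (hypothetical field)
    healthy: bool = True
    capacity_rps: int = 0  # requests per second the node can handle (assumed unit)


def elect_active(nodes):
    """Return the healthy node with the highest priority, or None if all are down."""
    candidates = [n for n in nodes if n.healthy]
    if not candidates:
        return None
    # Equal priorities are resolved by name so the result is deterministic.
    return max(candidates, key=lambda n: (n.priority, n.name))


def survives_single_failure(nodes, peak_rps):
    """Active-active check: can the cluster carry peak_rps with any one node lost?"""
    if len(nodes) < 2:
        return False
    total = sum(n.capacity_rps for n in nodes)
    # The worst case is losing the node with the largest capacity.
    worst_case = total - max(n.capacity_rps for n in nodes)
    return worst_case >= peak_rps


pair = [AdcNode("adc-a", priority=200), AdcNode("adc-b", priority=100)]
print(elect_active(pair).name)  # adc-a
pair[0].healthy = False         # simulate failure of the primary
print(elect_active(pair).name)  # adc-b
```

In a real deployment the election runs between the devices themselves over the network; the sketch only captures the decision rule, not the advertisement or timing mechanics.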
Key high availability features for load balancing or ADC
In addition to ensuring high availability for your ADC, you should also make sure that your ADC provides high availability for the applications whose traffic it manages. In the event that a server fails, the ADC can reroute traffic to another available server in the cluster. Key features that enable this function include:
- Load balancing methods – There are several methods that can be used for server selection, including round robin, least connections, weighted round robin, weighted least connections, fastest response, and more. Your ADC should offer all these options to allow the most suitable configuration for your environment and priorities.
- Health monitoring – To ensure rapid failover with little or no downtime, server health should be continuously assessed based on a number of indicators, including:
o Time series of total bytes in and out from each server
o Time series of traffic rates (in Mbps) in and out from each server
o Percentage of error traffic over a given range
o Number of good SSL connections
o Average application server latency by service
o Client-side latency (smoothed round-trip time, SRTT): maximum, minimum, and average as a time series
o Custom health checks such as measuring the response time for specific SQL database queries
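To make the interaction between server selection and health monitoring concrete, here is a minimal Python sketch of three of the methods named above (round robin, least connections, weighted least connections), each skipping servers that have failed a health check. The `Server` class, the function names, and the latency and error-rate thresholds are illustrative assumptions, not the API or defaults of any particular ADC.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    weight: int = 1              # relative capacity (hypothetical field)
    active_connections: int = 0
    healthy: bool = True


def apply_health_check(server, latency_ms, error_rate,
                       max_latency_ms=500.0, max_error_rate=0.05):
    """Mark the server healthy only if both indicators are within the
    (assumed) thresholds, and return the new state."""
    server.healthy = latency_ms <= max_latency_ms and error_rate <= max_error_rate
    return server.healthy


class RoundRobin:
    """Rotate through the pool, skipping unhealthy servers."""

    def __init__(self, servers):
        self.servers = servers
        self._next = 0

    def pick(self):
        # Try each server at most once, starting where the last pick left off.
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server.healthy:
                return server
        return None  # no healthy server available


def least_connections(servers):
    """Pick the healthy server with the fewest active connections."""
    healthy = [s for s in servers if s.healthy]
    return min(healthy, key=lambda s: s.active_connections) if healthy else None


def weighted_least_connections(servers):
    """Pick the healthy server with the fewest connections per unit of weight."""
    healthy = [s for s in servers if s.healthy and s.weight > 0]
    if not healthy:
        return None
    return min(healthy, key=lambda s: s.active_connections / s.weight)


pool = [
    Server("app-1", weight=2, active_connections=10),
    Server("app-2", weight=1, active_connections=4),
    Server("app-3", weight=1, active_connections=7),
]
apply_health_check(pool[1], latency_ms=900.0, error_rate=0.0)  # app-2 fails the check
rr = RoundRobin(pool)
print([rr.pick().name for _ in range(4)])     # ['app-1', 'app-3', 'app-1', 'app-3']
print(least_connections(pool).name)           # app-3
print(weighted_least_connections(pool).name)  # app-1
```

Note how the weighted variant prefers app-1 despite its higher connection count, because its weight of 2 signals that it can absorb twice the load of the others.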
Why is this so critical?
As enterprises become more dependent on the Internet to get business done, downtime can become a competitive disadvantage. With downtime estimated to cause losses of around £780,000 per week for a company with roughly 10,000 employees, the direct losses are substantial and are a primary reason why businesses need to establish a high availability solution. Apart from the direct cost of downtime, business continuity, in terms of reputation and data loss, is another factor encouraging businesses to implement high availability. Firstly, reputation improves when the business and brand are known for reliability relative to competitors. Secondly, reducing the risk of data loss is essential because of the severe penalties that can be incurred under the GDPR. A highly available infrastructure also mitigates the negative impact of outages on revenue and productivity.