Load Balancers – bringing zen to your holistic SEO strategy
In the 1990s, over two thousand search engines competed for a share of the Internet search market. Google, which launched in 1998, disrupted that market and went on to capture a share of between 65% and 95%. Its influence on the search market is so pervasive that the company name is now a verb meaning to search.
Google struggled to find a business model until it launched its AdWords programme in 2000 and moved it to Pay Per Click (PPC) pricing in 2002. Since then, its PPC advertising model has grown into a $52B per year business. Google discovered that searches for specific items often led to purchases from the vendors that the search results linked to. So instead of charging for impressions, as was the default at the time, Google charged its customers for links that were clicked. Google search is credited with one of the highest conversion rates and one of the lowest Cost Per Acquisition (CPA) figures among advertising programmes.
How did Google come to dominate?
Google became the dominant search engine thanks to its famous PageRank algorithm, which introduced third-party site validation via a link-authority system. The number of sites linking to a page, weighted by their own authority, was taken as an indicator of how important the page was, and more important pages were returned higher up in search results. This changed the way search engines worked. It made finding relevant results much faster and gave Google a huge competitive advantage. Other search engine vendors were forced to adopt similar techniques.
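The link-authority idea can be illustrated with a small sketch. This is the simplified textbook form of PageRank, not Google's production algorithm, and the four-page site, the damping factor of 0.85 and the iteration count are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Return an approximate PageRank score for every page.

    links maps a page name to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its authority on, split between its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical site: three pages link to "shop", nothing links to "about".
web = {
    "home": ["shop", "blog"],
    "blog": ["shop"],
    "about": ["shop"],
    "shop": ["home"],
}
scores = pagerank(web)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy graph the most linked-to page, "shop", ends up first in the ranking, and "about", which nothing links to, ends up last.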
How did this affect the Web?
Google iterated on the original PageRank algorithm and regularly introduced additional algorithms to complement it. Commercial website owners, who depended on organic visitors and paid traffic, had to adjust their sites continuously to accommodate the changing algorithms, as did non-commercial website operators who had information to share. Google’s goal was to provide its search users with the best and most relevant results. This, in turn, provided its AdWords clients with the best set of customers. The algorithms were continuously adjusted with this goal in mind.
Webmasters have it tough!
In addition to keeping their sites optimised for the evolving Google algorithms, website owners were also faced with the following challenges:
- Spreading traffic between multiple servers for availability and performance
- Connecting sessions to the appropriate servers for larger global websites
- Maximising site or page speed – the time it takes a website to load in a browser
- Delivering SSL-secured pages – of growing importance to Google as it tries to ensure a more secure Internet
- Integrating third-party components – websites are generally made up of various components, and integrating them on a single platform has become difficult
- Securing content and guarding against DDoS attacks
- Implementing and maintaining PCI DSS Compliance for financial transactions
- Managing content duplication from Content Delivery Networks, which replicate content and servers for failover to guard against downtime
Why do Webmasters have it tough?
The basic design of the Internet is, surprisingly, a major stumbling block for website owners. One of the core Internet protocols is the Domain Name System (DNS). It was designed early in the life of the Internet, and its function is to link human-readable names, like www.google.com, with machine-friendly numeric addresses, like 22.214.171.124. Every device, browser, Internet-enabled application and Internet service, such as email, relies on DNS to function. Because of its architecture, DNS is often a weak point in website provision and can present a single point of failure. Whilst it’s true that each service registered with DNS can be assigned several separate addresses, this basic redundancy doesn’t provide much of a safety net for popular and busy websites, because DNS keeps handing out an address whether or not the server behind it is responding. Additionally, standard DNS has no provision for routing access requests to globally distributed server farms, nor does it provide any defence against malicious denial of service attacks, or against non-malicious traffic spikes that can occur if Stephen Fry or Lady Gaga mentions a website on Twitter. To handle these issues, website providers need more robust tools.
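To see why several DNS addresses are a thin safety net, consider the simulation below. It is only a sketch: the three addresses come from the range reserved for documentation, the failed server is an assumption, and real resolvers and caches behave in more varied ways than a strict rotation.

```python
from itertools import cycle

# Hypothetical address records for one hostname (documentation-range IPs).
RECORDS = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]

# Assume the second server has failed. Plain DNS has no way of knowing this.
DOWN = {"192.0.2.11"}


def plain_dns_answers(records, lookups):
    """Hand out the records in strict rotation, ignoring server health."""
    rotation = cycle(records)
    return [next(rotation) for _ in range(lookups)]


answers = plain_dns_answers(RECORDS, 9)
failed = [ip for ip in answers if ip in DOWN]
failure_rate = len(failed) / len(answers)
```

With one of three servers down, a third of the simulated lookups still point visitors at the dead address, which is the gap that health-aware tools are meant to close.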
Application Delivery Controller (ADC) – the silver bullet?
ADCs (also known as load balancers) are specialised servers that distribute the load between web servers and provide a number of critical support features that lessen the load placed on them. ADCs sit in front of web server pools and take processor-intensive tasks off the web servers so that they can focus on efficient content delivery. This makes ADCs a valuable adjunct to ecommerce and web application servers, as they can be used to enhance the service provided.
An ADC load balancer is made up of a suite of software tools working in concert. It can be deployed as a dedicated device with the software preinstalled, installed on existing server hardware, or run as a virtual server on VMware vSphere, Microsoft Hyper-V, Amazon Web Services, Microsoft Azure, or other cloud services. Think of the ADC load balancer as a complex system of network optimisation tools that includes:
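One common way of distributing load is to send each new request to the server with the fewest active connections. The sketch below assumes that approach and uses hypothetical server names. It does not describe how any particular ADC product is implemented.

```python
class LeastConnectionsBalancer:
    """Send each new request to the server with the fewest open connections."""

    def __init__(self, servers):
        # Track how many connections each server is currently handling.
        self.active = {server: 0 for server in servers}

    def acquire(self):
        """Pick a server for a new connection and count it as busy."""
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        """Record that a connection to the server has finished."""
        self.active[server] -= 1


balancer = LeastConnectionsBalancer(["web1", "web2", "web3"])
first_three = [balancer.acquire() for _ in range(3)]
balancer.release("web2")          # web2 finishes its request first
next_server = balancer.acquire()  # so web2 is given the next one
```

The first three requests are spread across all three servers, and the fourth goes to whichever server has freed up, rather than blindly to the next one in line.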
- A web server component
- A number of traffic and routing optimisation algorithms
- Image caching (reducing web server load)
- Content caching
- Content Switching and Rewriting
- SSL Encryption/Decryption (further load reducing)
- Single Sign On (preventing users having to log in again when being switched between servers)
- GEO and DNS failover
- Cookie Persistence
See the KEMP Glossary for more information on these terms.
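Of the features listed, cookie persistence is perhaps the easiest to illustrate. In the sketch below, the cookie name lb_server and the server names are hypothetical. A real ADC would typically also sign or encode the cookie value, which is omitted here.

```python
from http.cookies import SimpleCookie
from itertools import cycle

SERVERS = ["web1", "web2", "web3"]   # hypothetical backend pool
COOKIE_NAME = "lb_server"            # hypothetical cookie name
_rotation = cycle(SERVERS)


def route(cookie_header):
    """Return (server, set_cookie_value) for one incoming request.

    set_cookie_value is None when the visitor already has a valid cookie.
    """
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get(COOKIE_NAME)
    if morsel is not None and morsel.value in SERVERS:
        # Returning visitor: keep the session on the same server.
        return morsel.value, None
    # New visitor: choose the next server and remember it in a cookie.
    server = next(_rotation)
    reply = SimpleCookie()
    reply[COOKIE_NAME] = server
    reply[COOKIE_NAME]["path"] = "/"
    return server, reply[COOKIE_NAME].OutputString()


first_server, set_cookie = route(None)          # first visit, no cookie yet
repeat_server, no_cookie = route("lb_server=" + first_server)
```

A returning visitor who presents the cookie is sent back to the same server, so a shopping basket or login held on that server is not lost between requests.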
Why you should deploy an ADC load balancer
ADC load balancers, such as KEMP’s LoadMaster, improve the availability, speed and security of your web provision. They are usually deployed in highly available, redundant configurations, which ensures that the ADC itself is not a single point of failure on the network. Uptime matters both for giving users a better service and for making sure that the site is available when Google’s web crawlers attempt to index it. ADCs can group hundreds of backend servers and present them as an opaque pool to clients making service requests. This works whether the server pool is made up solely of local servers or also includes servers from cloud providers in a hybrid cloud. Indeed, in a hybrid cloud scenario, ADC load balancers can allocate sessions to web servers based on the geolocation of the client and the server. The chosen server may or may not be the one closest to the client, because factors such as the state of the network connection between client and servers are also taken into account.
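The way geolocation and connection state can combine is sketched below. The server names, regions, latency figures and the 50 ms penalty for leaving the client's region are all assumptions for illustration. Real products use their own health checks and scoring rules.

```python
from dataclasses import dataclass

# Assumed cost, in milliseconds, of sending a client outside its own region.
REMOTE_PENALTY_MS = 50.0


@dataclass
class Server:
    name: str
    region: str
    healthy: bool       # result of the most recent health check
    latency_ms: float   # measured round-trip time to the server


def pick_server(client_region, servers):
    """Choose the healthy server with the best combined score."""
    candidates = [s for s in servers if s.healthy]
    if not candidates:
        raise RuntimeError("no healthy servers in the pool")

    def score(server):
        penalty = 0.0 if server.region == client_region else REMOTE_PENALTY_MS
        return server.latency_ms + penalty

    return min(candidates, key=score)


pool = [
    Server("eu-1", "eu", healthy=True, latency_ms=180.0),   # congested
    Server("eu-2", "eu", healthy=False, latency_ms=20.0),   # failed its check
    Server("us-1", "us", healthy=True, latency_ms=90.0),
]
chosen = pick_server("eu", pool)
```

Here a European client is sent to us-1: the nearest server has failed its health check and the other local server is congested, so the closest server is not the best choice.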
ADC load balancers also cache content and will return data from the cache when they can. This reduces the load on the actual content servers. In addition to caching, the ADC load balancers can also compress data before sending it over the network.
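A minimal sketch of caching combined with compression follows. The origin function stands in for a real content server and is hypothetical. The cache never expires entries, which a real deployment would need to handle.

```python
import gzip


class CachingProxy:
    """Fetch each path from the origin once, then serve the compressed copy."""

    def __init__(self, fetch):
        self.fetch = fetch      # function that asks the content server
        self.cache = {}         # path -> gzip-compressed response body
        self.origin_hits = 0    # how often the content server was used

    def get(self, path):
        if path not in self.cache:
            self.origin_hits += 1
            self.cache[path] = gzip.compress(self.fetch(path))
        return self.cache[path]


def origin(path):
    """Hypothetical content server returning a repetitive HTML page."""
    page = "<html><body>" + "Hello " * 200 + path + "</body></html>"
    return page.encode("utf-8")


proxy = CachingProxy(origin)
first = proxy.get("/index.html")
second = proxy.get("/index.html")   # answered from the cache
```

The second request never reaches the content server, and because the page is repetitive text, the compressed copy sent over the network is much smaller than the original.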
Google is attaching more weight in its algorithms to websites that are secure and served over SSL. But adding SSL to a website can add about 25% to the processing requirement, just to encrypt and decrypt the data as it flows through the web server. An ADC load balancer allows for SSL offloading, which hands the task of SSL encryption and decryption to the ADC. This enables the web servers to do what they do best and serve content. Some load balancers in the KEMP LoadMaster family have dedicated hardware for SSL processing.
But perhaps content switching is an ADC load balancer’s biggest contribution to improved website provision. When it comes to content management systems, web servers, eCommerce systems, blogs and content delivery networks, website providers have to pick components that work together or run them on different servers. For example, WordPress, the world’s most popular blogging platform, is built with PHP, but many hosting companies using Microsoft IIS don’t provide PHP libraries. Many site owners are therefore forced either to pick blog CMS platforms that aren’t as rich or to run the blog on a sub-domain, which isn’t always optimal for Search Engine Optimisation (SEO). ADC load balancers provide content rewriting, which allows users to publish blogs as mywebshop.com/blog instead of blog.mywebshop.com, even if the blogging system is on a different server. The ADC load balancer is the front-end handler of website requests, and it can pull the blog from a server on the same or a remote network. This means site owners can present a suite of web servers under a single domain name, whilst running a myriad of different content delivery systems on separate servers, which is good for optimising a domain’s search engine results.
Clearly, there are many factors that have to be considered and monitored in order to deliver a robust website. Optimising the site for how Google’s algorithms work is key, as is ensuring that the site is responsive, content rich and always available. Putting an ADC load balancer in front of your content servers will go a long way to delivering on these latter points, and free your web team to concentrate on delivering great content.
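The mywebshop.com/blog arrangement can be sketched as a simple routing table. The backend host names below are hypothetical, and the rule of stripping the /blog prefix before forwarding is an assumption. Some set-ups pass the path through unchanged.

```python
# Hypothetical routing table: public path prefix -> internal backend server.
ROUTES = [
    ("/blog", "http://blog.internal.example"),
    ("/", "http://shop.internal.example"),   # catch-all, must come last
]


def resolve_backend(path):
    """Return the internal URL that should serve a public request path."""
    for prefix, backend in ROUTES:
        if prefix == "/":
            return backend + path
        if path == prefix or path.startswith(prefix + "/"):
            # Strip the public prefix so the blog server sees its own paths.
            return backend + (path[len(prefix):] or "/")
    raise LookupError("no route for " + path)


blog_url = resolve_backend("/blog/hello-world")
shop_url = resolve_backend("/products/42")
```

Visitors only ever see paths under the one public domain, while the blog and the shop are served by separate machines behind it.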