Request a load balancer for your managed server to distribute the workload among servers, provide redundancy, or allow for a sorry page.
This article applies to: Managed Servers
A load balancer directs traffic to one or more servers behind it. A load balancer is useful in three ways:
- Distributes the workload among servers. This increases the amount of traffic a service can accommodate.
- Provides redundancy between servers and data centers. This allows for higher availability.
- Allows web "sorry" services. If all web servers are down, a sorry page can be displayed to users.
The server farm currently has two load balancers in a redundant configuration: if one load balancer is lost, traffic fails over to the other. The load balancers are housed in different buildings. There are two sets of load balancers, and one set will be behind a firewall. The new load balancers will "share state," which means that in the event of a failure, clients will still be directed to the same server behind the load balancer.
The following load balancer features require configuration choices. The options are described below.
Determine if a Server is Up: Probe
How the load balancer determines if the server is "alive" and able to receive traffic.
Divide Traffic Between Servers: Balance Method
How traffic is divided between the servers.
Client Session Persistence: Stickiness
Whether and how the load balancer sends a specific client's requests to the server that client has been using.
All Servers are Down: Sorry Server
A server assigned to deliver a "Sorry" message if all servers are down.
Before sending users to a server, the load balancer tests to be sure that the server is up. This test is called the probe. The default probe checks (or pings) only to be sure that the server is powered up. Using the default probe means that a client may be sent to a server that is up, even if the web server software is down. In that case, the client will get no response.
A more robust probe is recommended. Other probes can test whether the web server software is running by requesting a page such as index.html. A custom page can go further and test both the web server software and any underlying database software.
The current probe options are:
- TCP: A TCP connection is attempted. If the server refuses the connection or the connection times out, the test is considered to have failed.
- HTTP: A page is requested over HTTP, and the service is considered up if the request returns an HTTP 200 (OK) status code. The page must not require authentication and must not redirect, because a redirect does not return a 200.
- SSL (Appropriate only for services using SSL between load balancer and server): An SSL connection is opened and a simple SSL operation (called the "HELLO") is attempted. If both succeed, the SSL session is terminated and the service is recorded as "up."
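The three probe types above can be illustrated with a minimal Python sketch. This is not the load balancer's actual implementation; the function names, the default path, and the 2-second timeout are assumptions chosen for illustration. The SSL probe skips certificate verification because it only checks that the handshake completes.

```python
import http.client
import socket
import ssl


def tcp_probe(host: str, port: int, timeout: float = 2.0) -> bool:
    """Pass if a TCP connection opens; a refusal or timeout is a failure."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_probe(host: str, port: int, path: str = "/index.html",
               timeout: float = 2.0) -> bool:
    """Pass only on HTTP 200. http.client never follows redirects,
    so a page that redirects (301/302) counts as a failure."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return conn.getresponse().status == 200
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()


def ssl_probe(host: str, port: int, timeout: float = 2.0) -> bool:
    """Pass if the SSL/TLS handshake (which starts with the HELLO) completes."""
    context = ssl.create_default_context()
    # Assumption: the probe checks liveness only, not certificate validity.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host):
                return True
    except OSError:
        return False
```

A custom health page that also queries the database could be checked with the same `http_probe` by passing its path instead of `/index.html`.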
The balance method determines how traffic is allocated to each server.
- Round Robin: By default, the load balancing method used is round robin. This means that traffic is directed to each server in turn. (First hit to first server, second hit to second server, etc.)
- Source IP (hash): The client is assigned to a server based on an algorithm using the client's IP address.
- Least Connections: The client is assigned to the backend server that has the fewest client connections.
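The differences between the three balance methods can be sketched in a few lines of Python. The class names, the server names, and the use of SHA-256 for the source IP hash are assumptions for illustration; the load balancer's actual hashing algorithm is not documented here.

```python
import hashlib
from itertools import cycle


class RoundRobin:
    """Hand out servers in turn: first hit to the first server, and so on."""

    def __init__(self, servers):
        self._cycle = cycle(list(servers))

    def pick(self, client_ip=None):
        return next(self._cycle)


class SourceIPHash:
    """Map a client IP to a server with a stable hash (SHA-256 is assumed)."""

    def __init__(self, servers):
        self._servers = list(servers)

    def pick(self, client_ip):
        digest = hashlib.sha256(client_ip.encode("ascii")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._servers)
        return self._servers[index]


class LeastConnections:
    """Pick the server that currently has the fewest open client connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self, client_ip=None):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1  # the new connection now counts against it
        return server

    def release(self, server):
        self.active[server] -= 1  # call when the client connection closes
```

Because the hash depends only on the client IP and the server list, `SourceIPHash` sends a given client to the same server as long as the list of servers does not change.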
Persistence, or stickiness, determines whether the load balancer sends a particular client's requests to the server that client has been using. Without stickiness, a client may start a session on one server but then get switched to a different server.
By default, there is no stickiness. However, the source IP hash balance method has some inherent stickiness. Other stickiness options are:
- By client IP: Client IPs are saved in a table, and each client is sent back to the same server. There is an optional idle timeout. By default, the sticky table entry is cleared after the session has been idle for 24 hours (1440 minutes). The idle timeout can be set for a particular group of application servers. A good rule of thumb is the application timeout plus 5 minutes, so for an application with a 30 minute timeout, the sticky table would clear in 35 minutes.
- By cookie: The load balancer inserts a cookie in the HTTP response sent to the client, recording which server this client was sent to. When the load balancer receives a subsequent request from that client, it uses the cookie to determine which server to send the client to.
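The client IP option and its idle timeout can be sketched as follows. This is a simplified model, not the load balancer's implementation: the `StickyTable` class, the `suggested_idle_timeout` helper, and the injectable clock are hypothetical names introduced for illustration. The sketch assumes that each lookup by an active client resets that client's idle timer.

```python
import time


def suggested_idle_timeout(app_timeout_minutes: int) -> int:
    """Rule of thumb from the text: application timeout plus 5 minutes."""
    return app_timeout_minutes + 5


class StickyTable:
    """Remember which server each client IP used until the entry goes idle."""

    def __init__(self, idle_timeout_minutes: int = 1440, clock=time.monotonic):
        self.idle_seconds = idle_timeout_minutes * 60
        self._clock = clock
        self._entries = {}  # client_ip -> (server, time last seen)

    def assign(self, client_ip: str, server: str) -> None:
        self._entries[client_ip] = (server, self._clock())

    def lookup(self, client_ip: str):
        """Return the remembered server, or None if unknown or idle too long."""
        entry = self._entries.get(client_ip)
        if entry is None:
            return None
        server, last_seen = entry
        if self._clock() - last_seen > self.idle_seconds:
            del self._entries[client_ip]  # idle timeout reached: clear entry
            return None
        self._entries[client_ip] = (server, self._clock())  # reset idle timer
        return server
```

For an application with a 30 minute timeout, `StickyTable(suggested_idle_timeout(30))` would clear an entry after 35 idle minutes, matching the rule of thumb above.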
For web servers only, if the load balancer determines that all of the servers are down, clients are directed to a sorry server, which displays a message explaining that all servers are down. The sorry server is assigned in advance.
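The sorry server fallback amounts to a simple rule, sketched here under the assumption that a probe result is available for each server. The function name `eligible_targets` and the server names are hypothetical.

```python
def eligible_targets(servers, is_up, sorry_server):
    """Return the servers that pass the probe, or only the sorry server
    if every server is down. `is_up` is a callable returning True or False."""
    healthy = [server for server in servers if is_up(server)]
    return healthy if healthy else [sorry_server]


# Example with a hard-coded probe result for illustration.
status = {"web1": False, "web2": False}
targets = eligible_targets(["web1", "web2"], status.get, "sorry1")
```

In this example both web servers are down, so `targets` contains only the sorry server. As soon as one server passes its probe again, the sorry server drops out of the list.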