The proliferation of servers for various tasks in the enterprise, e-commerce and Internet service provider domains has created serious scalability and manageability challenges. Today most e-businesses deploy multiple servers devoted to Web, FTP, DNS, e-mail, Secure Sockets Layer and other such applications. Although clustering has long held the promise of scalability and availability, it remains a distant dream and is very complex to configure and manage.
There are many Internet traffic-management products available to solve that problem. In general, they fall into three categories: load-balancing switches, also referred to as Layer 4 to 7 switches; PC/server-based products; and server-side agents.
Load-balancing switches extend the functionality of a traditional Layer 2 or 3 switch into higher layers by examining the content beyond the packet header. In general, switch-based products offer better reliability and performance because they run less code, rely on ASICs for forwarding and contain no moving parts such as disk drives. Server-based products are basically applications prepackaged with an operating system and server. A load-balancing application residing on a server can itself become a bottleneck in concurrent connection capacity and throughput, which is the very problem that leads one to deploy multiple servers for an application in the first place. Server-side agents approach the problem by running special software on each server in the farm, but they can interfere with the application and are platform-dependent. They also require an additional switch to connect all the servers, and the extra software on each server can hurt reliability and performance.
Load-balancing switches offer a low-cost way to manage server farms. They operate on industry standards and do not require any special hardware or software on servers. The load balancer is the front end for the server farm and is assigned a virtual Internet Protocol (VIP) address. The VIP is the address published to the external world, and is the only one the external world needs to know.
Suppose the switch is configured to send all HTTP traffic to servers A, B and D; FTP traffic to servers A, C and E; and DNS traffic to server B. When a client request arrives on the HTTP port of the VIP, the switch picks one of the HTTP servers using the load-balancing algorithm the administrator has selected, and rewrites the destination IP and MAC addresses to those of that server.
When the server replies, the switch replaces the real server's IP and MAC addresses with the VIP and its own MAC address before forwarding the reply. System administrators can now move services or applications from one server to another without a service outage, gracefully shut down a server for maintenance or add more servers on the fly.
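The two address rewrites described above can be sketched in a few lines of Python. This is an illustration only, not how any particular switch implements it: the addresses, MAC values, port numbers, the `Packet` type and the function names are all assumptions made for the example, and the server letters simply mirror the HTTP/FTP/DNS assignment in the text.

```python
from dataclasses import dataclass, replace

# All addresses below are hypothetical values chosen for illustration.
VIP = "192.0.2.10"
SWITCH_MAC = "00:00:5e:00:53:01"

# Real servers: name -> (private IP, MAC).
SERVERS = {
    "A": ("10.0.0.1", "00:00:5e:00:53:0a"),
    "B": ("10.0.0.2", "00:00:5e:00:53:0b"),
    "C": ("10.0.0.3", "00:00:5e:00:53:0c"),
    "D": ("10.0.0.4", "00:00:5e:00:53:0d"),
    "E": ("10.0.0.5", "00:00:5e:00:53:0e"),
}

# Service map from the text: HTTP -> A, B, D; FTP -> A, C, E; DNS -> B.
SERVICES = {80: ["A", "B", "D"], 21: ["A", "C", "E"], 53: ["B"]}


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_mac: str
    dst_ip: str
    dst_mac: str
    dst_port: int


def translate_request(pkt: Packet, server_name: str) -> Packet:
    """Rewrite the destination from the VIP to the chosen real server."""
    if pkt.dst_ip != VIP or server_name not in SERVICES.get(pkt.dst_port, []):
        raise ValueError("server does not offer this service on the VIP")
    ip, mac = SERVERS[server_name]
    return replace(pkt, dst_ip=ip, dst_mac=mac)


def translate_reply(pkt: Packet) -> Packet:
    """Rewrite the source from the real server back to the VIP and switch MAC."""
    return replace(pkt, src_ip=VIP, src_mac=SWITCH_MAC)
```

Because clients only ever see the VIP, the entries in `SERVERS` and `SERVICES` can change without the outside world noticing, which is what makes moving a service between servers possible without an outage.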
Load balancers support different load-balancing techniques, which can be simple to set up and configure, yet are powerful enough to use the full potential of all servers. Several load-balancing methods are available, using common algorithms developed in studies of buffering and traffic distribution:
- Round robin. Assigns connections sequentially among servers in a logical community.
- Least connections. The server with the fewest active connections gets the next connection.
- Weighted distribution. Divides the load among the servers based on a user-supplied percentage or weight.
Weighted methods can be used to ensure that high-performance servers receive more of the traffic load. That provides investment protection by leveraging existing servers along with powerful new servers.
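To make the three methods concrete, here is a minimal Python sketch of each. The class and method names are assumptions for this example, and the weighted variant uses the simplest possible scheme (repeat each server in the rotation as many times as its integer weight); real products may distribute weighted load differently.

```python
import itertools


class RoundRobin:
    """Hand out servers one after another, wrapping around at the end."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Pick the server that currently has the fewest active connections."""

    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    def pick(self):
        # Ties go to the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when a connection to `server` closes.
        self.active[server] -= 1


class Weighted:
    """Rotate through servers, repeating each one `weight` times per cycle."""

    def __init__(self, weights):
        expanded = [s for s, w in weights.items() for _ in range(w)]
        self._cycle = itertools.cycle(expanded)

    def pick(self):
        return next(self._cycle)
```

With `Weighted({"new": 3, "old": 1})`, three of every four connections go to the faster server, while the older one still carries a share of the load.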
Each server can also be configured with a maximum number of concurrent connections so that it never becomes overloaded. In the case of overflow or a complete outage of the local server farm, the load balancer can send the requests transparently to remote backup servers or issue an HTTP redirect to a remote server.
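The overflow rule can be expressed as a small selection function. This is a sketch under stated assumptions: the function name, the dictionaries and the `backup` value are hypothetical, and it picks the least-loaded eligible server, although any of the selection methods could be used in its place.

```python
def choose_server(active, limits, backup):
    """Return a local server that is below its connection cap, else the backup.

    active: server -> current connection count (only servers that are up)
    limits: server -> configured maximum connections
    backup: name of the remote backup server or redirect target
    """
    candidates = [s for s in active if active[s] < limits[s]]
    if not candidates:
        # Every local server is full, or the whole local farm is down.
        return backup
    return min(candidates, key=lambda s: active[s])
```

An empty `active` dictionary models a complete local outage, so the same code path covers both the overflow and the outage case described above.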
Load-balancing switches can perform server-level and application-level health checks. For example, users can configure Foundry Networks' ServerIron switch to retrieve a specific URL successfully to determine the availability of the application. The HTTP success and return codes can be customized as well. Malfunctioning Web servers returning errors such as "Object not found" are automatically detected. If a server or an application goes down, traffic is transparently redirected to other servers.
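An application-level check of this kind amounts to fetching a URL and comparing the status code against a configurable set of acceptable values. The sketch below uses Python's standard `urllib`; the function names, the default timeout and the default success set are assumptions for illustration and do not reflect ServerIron's actual configuration syntax.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=2.0):
    """Return the HTTP status code for `url`, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout and similar failures.
        return None


def is_healthy(url, ok_codes=frozenset({200}), fetch=http_status):
    """Treat the application as up only if the status code is in `ok_codes`."""
    return fetch(url) in ok_codes
```

A server that is reachable but returns 404 for the probe URL is marked down just like one that does not answer at all, which matches the "Object not found" case in the text. The `fetch` parameter exists only so the check can be exercised without a live server.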
Servers can be assigned internal IP addresses that are nonroutable on the Internet, with the load balancer performing network address translation. That makes servers more secure, and makes IP address management much easier and more cost-effective. Load balancers can also support access control lists (ACLs) and the ability to filter traffic based on TCP or UDP ports. For example, the load balancer can be set up to restrict access to specific applications or services from a given address or subnet. Foundry's ServerIron also supports Cisco-syntax access control lists and extended ACLs in addition to extensive filtering capabilities. Load balancers can protect the server farm against malicious TCP SYN attacks because they intercept and process all traffic.
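A filter that restricts a service by source subnet and port might look like the following sketch. The `Rule` type and `evaluate` function are hypothetical, and the semantics (first matching rule wins, unmatched traffic is denied) are an assumption modeled on common ACL behavior rather than a description of any vendor's implementation.

```python
import ipaddress
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    action: str           # "permit" or "deny"
    source: str           # source network in CIDR notation
    protocol: str         # "tcp" or "udp"
    port: Optional[int]   # destination port; None matches any port


def evaluate(rules, src_ip, protocol, port, default="deny"):
    """Return the action of the first matching rule, or `default` if none match."""
    addr = ipaddress.ip_address(src_ip)
    for rule in rules:
        if (addr in ipaddress.ip_network(rule.source)
                and rule.protocol == protocol
                and rule.port in (None, port)):
            return rule.action
    return default


# Example policy: block FTP from one subnet, allow all other TCP traffic.
RULES = [
    Rule("deny", "203.0.113.0/24", "tcp", 21),
    Rule("permit", "0.0.0.0/0", "tcp", None),
]
```

Rule order matters here: placing the broad permit first would let the FTP traffic through, because evaluation stops at the first match.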
Ever wanted to give your shoppers priority over the browsers? Or to prioritize order booking over FTP traffic? Load-balancing switches can prioritize traffic based on TCP or UDP ports. They can also prioritize packets based on a combination of destination address and destination port number, ensuring that traffic destined for critical applications gets priority over the rest.
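As a rough software model of that idea, the sketch below classifies packets by (destination address, destination port) and always forwards the highest-priority packet first. Switches typically do this with hardware queues; the class name, the priority numbers (lower means more urgent) and the addresses are assumptions for the example.

```python
import heapq
import itertools


class PriorityForwarder:
    """Forward queued packets in priority order, first-in first-out within a class."""

    def __init__(self, priorities, default=9):
        self.priorities = priorities   # (dst_ip, dst_port) -> priority
        self.default = default         # priority for unclassified traffic
        self._heap = []
        self._seq = itertools.count()  # tie-breaker that preserves arrival order

    def enqueue(self, dst_ip, dst_port, payload):
        prio = self.priorities.get((dst_ip, dst_port), self.default)
        heapq.heappush(self._heap, (prio, next(self._seq), payload))

    def dequeue(self):
        return heapq.heappop(self._heap)[2]
```

Strict priority of this kind can starve low-priority traffic during a sustained burst, so a real deployment would normally pair it with some form of rate limiting or weighted queuing.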
Web hosting companies can use load-balancing switches that support large numbers of virtual IP addresses and real servers to host a number of sites on a small set of real servers for scalability and availability. Statistics for each virtual IP address can be tracked.
Some load-balancing switches support a unique feature called SwitchBack to benefit throughput-intensive applications such as streaming video. SwitchBack allows the servers to respond to clients directly on the return path. The switch continues to inspect the incoming traffic, but changes only the destination MAC address to that of the selected real server.
The VIP is defined on the loop-back interface of the real servers. The real servers will not respond to Address Resolution Protocol requests for the VIP, but will accept traffic addressed to it on the loop-back interface. The replies, typically much larger than the requests, go back directly through the wire-speed, nonblocking gigabit switch. Because Foundry's ServerIron does not process the reply traffic when SwitchBack is used, it conserves memory that can be devoted to tracking more connections. That can double the concurrent connection capacity to 1,000,000 and provides up to 64 Gbits/second of throughput.
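The difference from full address translation is easiest to see side by side: on the way in, only the destination MAC changes, and the reply is built by the server itself with the VIP as its source. The following sketch models that flow; the `Frame` type, the function names and every address are hypothetical, and the model ignores ARP handling and port numbers.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Hypothetical values for illustration.
VIP = "192.0.2.10"
ROUTER_MAC = "00:00:5e:00:53:fe"


@dataclass(frozen=True)
class Frame:
    src_ip: str
    dst_ip: str
    dst_mac: str


def switch_forward(frame: Frame, server_mac: str) -> Frame:
    """Inbound path: rewrite only the destination MAC; the destination IP stays the VIP."""
    return replace(frame, dst_mac=server_mac)


def server_reply(request: Frame, loopback_ips) -> Optional[Frame]:
    """Server side: accept the request only if its destination IP is on the loop-back."""
    if request.dst_ip not in loopback_ips:
        return None
    # The reply is sourced from the VIP, so no translation is needed on the way out.
    return Frame(src_ip=request.dst_ip, dst_ip=request.src_ip, dst_mac=ROUTER_MAC)
```

Since the reply already carries the VIP as its source address, the switch has nothing to rewrite on the return path, which is where the savings in processing and memory come from.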
Load-balancing switches are built in several different ways: with a central processor and memory, or with distributed processors on each port. The distributed architecture will still have a central processor for management and coordination, and may lead to per-port restrictions if the memory per port is limited.
Central-processor architecture avoids such overhead and can use the entire memory and processing resources, no matter how many ports are used. In general, the load balancer is connected to routers on one or two ports, and to servers from all other ports. A distributed architecture with limited memory per port encounters a bottleneck that is determined by the number of ports connected to the routers.
Load-balancing switches vary in how many concurrent connections they can handle. In general, they must be able to handle the sum of the maximum connections for each server in the farm. During a traffic surge, the load-balancing switch will need to process a large number of requests to open connections and only a few requests to close them. The difference between the two results in a large number of open connections that the switch must be able to track. If a switch is limited by any per-port constraint on maximum open connections, the router port again becomes the bottleneck for the whole server farm.
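The sizing argument is simple arithmetic, shown here as a short sketch. The function names and all the figures in the usage note are made-up illustrations, not measurements of any product.

```python
def required_capacity(max_conns_per_server):
    """Connections the switch must track: the sum of every server's maximum."""
    return sum(max_conns_per_server.values())


def open_connections(opens_per_sec, closes_per_sec, seconds, start=0):
    """Open connections after a surge in which opens outpace closes."""
    return max(0, start + (opens_per_sec - closes_per_sec) * seconds)
```

For example, a farm of three servers capped at 20,000, 20,000 and 50,000 connections needs a switch that can track 90,000. A surge of 5,000 opens and 1,000 closes per second leaves 40,000 connections open after ten seconds, all of which pass through the router-facing port and count against any per-port limit.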
Load-balancing switches can be used for a variety of applications. The fundamental approach remains the same: make a smart decision based on the information in the IP packet. There are at least three other major applications beyond server load balancing. Global server load balancing can be used to balance load across multiple sites and to provide transparent backup against complete site outages. Load balancing can be used to enhance firewalls, by placing load balancers on both sides of a set of firewalls so that no single firewall is a point of failure or a performance bottleneck. Finally, a load balancer can switch Internet traffic transparently to a group of caching servers for client or Web acceleration.