Blog campur-campur

Load Balancing

Fahmi Rizwansyah says:

From Wikipedia, the free encyclopedia

In computer networking, load balancing is a technique to spread work between two or more computers, network links, CPUs, hard drives, or other resources, in order to get optimal resource utilization, maximize throughput, and minimize response time. Using multiple components with load balancing, instead of a single component, may increase reliability through redundancy. The balancing service is usually provided by a dedicated program or hardware device (such as a multilayer switch).
It is commonly used to mediate internal communications in computer clusters, especially high-availability clusters.


For Internet services
One of the most common applications of load balancing is to provide a single Internet service from multiple servers, sometimes known as a server farm. Commonly load-balanced systems include popular web sites, large Internet Relay Chat networks, high-bandwidth File Transfer Protocol sites, NNTP servers and DNS servers.
For Internet services, the load balancer is usually a software program which is listening on the port where external clients connect to access services. The load balancer forwards requests to one of the "backend" servers, which usually replies to the load balancer. This allows the load balancer to reply to the client without the client ever knowing about the internal separation of functions. It also prevents clients from contacting backend servers directly, which may have security benefits by hiding the structure of the internal network and preventing attacks on the kernel's network stack or unrelated services running on other ports.
Some load balancers provide a mechanism for doing something special in the event that all backend servers are unavailable. This might include forwarding to a backup load balancer, or displaying a message regarding the outage.
An alternate method of load balancing which does not necessarily require a dedicated software or hardware node, is called round robin DNS. In this technique, multiple IP addresses are associated with a single domain name (i.e. www.example.org); clients themselves are expected to choose which server to connect. Unlike the use of a dedicated load balancer, this technique is not "transparent" to clients, because it exposes the existence of multiple backend servers. The technique has other advantages and disadvantages, depending on the degree of control over the DNS server and the granularity of load balancing which is desired.
A variety of scheduling algorithms are used by load balancers to determine which backend server to send a request to. Simple algorithms include random choice or round robin. More sophisticated load balancers may take into account additional factors, such as a server's reported load, recent response times, up/down status (determined by a monitoring poll of some kind), number of active connections, geographic location, capabilities, or how much traffic it has recently been assigned. High-performance systems may use multiple layers of load balancing.
In addition to using dedicated hardware load balancers, software-only solutions are available, including open source options. Examples of the latter include the Apache web server's mod_proxy_balancer extension and the Pound reverse proxy and load balancer.


Persistence
An important issue when operating a load-balanced service is how to handle information that must be kept across the multiple requests in a user's session. If this information is stored locally on one back end server, then subsequent requests going to different back end servers would not be able to find it. This might be cached information that can be recomputed, in which case load-balancing a request to a different back end server just introduces a performance issue.
One solution to the session data issue is to send all requests in a user session consistently to the same back end server. This is known as "persistence" or "stickiness". A large downside to this technique is its lack of automatic failover: if a backend server goes down, its per-session information becomes inaccessible, and sessions depending on it are lost.
Assignment to a particular server might be based on a username, client IP address, or random assignment. Due to DHCP, Network Address Translation, and web proxies, the client's IP address may change across requests, and so this method can be somewhat unreliable. Random assignments must be remembered by the load balancer, which creates a storage burden. If the load balancer is replaced or fails, this information can be lost, and assignments may need to be deleted after a timeout period or during periods of high load, to avoid exceeding the space available for the assignment table. The random assignment method also requires that clients maintain some state, which can be a problem, for example when a web browser has disabled storage of cookies. Sophisticated load balancers use multiple persistence techniques to avoid some of the shortcomings of any one method.
Another solution is to keep the per-session data in a database. Generally this is bad for performance since it increases the load on the database: the database is best used to store information less transient than per-session data. (Interestingly, to prevent a database from becoming a single point of failure, and to improve scalability, the database is often replicated across multiple machines, and load balancing is used to spread the query load across those replicas.)
Fortunately there are more efficient approaches. In the very common case where the client is a web browser, per-session data can be stored in the browser itself. One technique is to use a browser cookie, suitably time-stamped and encrypted. Another is URL rewriting. Storing session data on the client is generally the preferred solution: then the load balancer is free to pick any backend server to handle a request.

Load balancer features
Hardware and software load balancers can come with a variety of special features.

* Asymmetric load: A ratio can be manually assigned to cause some backend servers to get a greater share of the workload than others. This is sometimes used as a crude way to account for some servers being faster than others.
* Priority activation: When the number of available servers drops below a certain number, or load gets too high, standby servers can be brought online
* SSL Offload and Acceleration: SSL applications can be a heavy burden on the resources of a Web Server, especially on the CPU and the end users may see a slow response (or at the very least the servers are spending a lot of cycles doing things they weren't designed to do). To resolve these kinds of issues, a Load Balancer capable of handling SSL Offloading in specialized hardware may be used. When Load Balancers are taking the SSL connections, the burden on the Web Servers is reduced and performance will not degrade for the end users.
* Distributed Denial of Service (DDoS) attack protection: load balancers can provide features such as SYN cookies and delayed-binding (the back-end servers don't see the client until it finishes its TCP handshake) to mitigate SYN flood attacks and generally offload work from the servers to a more efficient platform.
* HTTP compression: reduces amount of data to be transferred for HTTP objects by utilizing gzip compression available in all modern web browsers
* TCP offload: different vendors use different terms for this, but the idea is that normally each HTTP request from each client is a different TCP connection. This feature utilizes HTTP/1.1 to consolidate multiple HTTP requests from multiple clients into a single TCP socket to the back-end servers.
* TCP buffering: the load balancer can buffer responses from the server and spoon-feed the data out to slow clients, allowing the server to move on to other tasks.
* HTTP caching: the load balancer can store static content so that some requests can be handled without contacting the web servers.
* Content Filtering: some load balancers can arbitrarily modify traffic on the way through.
* HTTP security: some load balancers can hide HTTP error pages, remove server identification headers from HTTP responses, and encrypt cookies so end users can't manipulate them.
* Priority queuing: also known as rate shaping, the ability to give different priority to different traffic.
* Content aware switching: most load balancers can send requests to different servers based on the URL being requested.
* Client authentication: authenticate users against a variety of authentication sources before allowing them access to a website.
* Programmatic traffic manipulation: at least one load balancer allows the use of a scripting language to allow custom load balancing methods, arbitrary traffic manipulations, and more.
* Firewall: Direct connections to backend servers are prevented, for security reasons

In telecommunications
Load balancing can be useful when dealing with redundant communications links. For example, a company may have multiple Internet connections ensuring network access even if one of the connections should fail.
A failover arrangement would mean that one link is designated for normal use, while the second link is used only if the first one fails.
With load balancing, both links can be in use all the time. A device or program decides which of the available links to send packets along, being careful not to send packets along any link if it has failed. The ability to use multiple links simultaneously increases the available bandwidth.
Major telecommunications companies have multiple routes through their networks or to external networks. They use more sophisticated load balancing to shift traffic from one path to another to avoid network congestion on any particular link, and sometimes to minimize the cost of transit across external networks or improve network reliability.

Relationship with failover
Load balancing is often used to implement failover — the continuation of a service after the failure of one or more of its components. The components are monitored continually (e.g., web servers may be monitored by fetching known pages), and when one becomes non-responsive, the load balancer is informed and no longer sends traffic to it. And when a component comes back on line, the load balancer begins to route traffic to it again. For this to work, there must be at least one component in excess of the service's capacity. This is much less expensive and more flexible than failover approaches where a single "live" component is paired with a single "backup" component that takes over in the event of a failure. In a RAID disk controller, using RAID1 (mirroring) is analogous to the "live/backup" approach to failover, where RAID5 is analogous to load balancing failover.

Network Load Balancing Services (NLBS)
is a proprietary Microsoft implementation of clustering and load balancing that is intended to provide high availability and high reliability, as well as high scalability. NLBS is intended for applications with relatively small data sets that rarely change (one example would be web pages), and do not have long-running-in-memory states. These types of applications are called stateless applications, and typically include Web, File Transfer Protocol (FTP), and virtual private networking (VPN) servers. Every client request to a stateless application is a separate transaction, so it is possible to distribute the requests among multiple servers to balance the load. One attractive feature of NLBS is that all servers in a cluster monitor each other with a heartbeat signal, so there is no single point of failure.

Configuration Tips:
* The network load balancing service requires for all the machines to have the correct local time. Ensure the Windows Time Service is properly configured on all hosts to keep clocks synchronized. Unsyncronized times will cause a network login screen to pop up which doesn't accept valid login credentials.
* The server console can't have any network card dialogue boxes open when you are configuring the "Network Load Balancing Manager" from your client machine.
* You have to manually add each load balancing server individually to the load balancing cluster after you've created a cluster host.
* To allow communication between servers in the same NLB cluster, each server requires the following registry entry: a DWORD key named "UnicastInterHostCommSupport" and set to 1, for each network interface card's GUID (HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\WLBS\Parameters\Interface\{GUID})
* NLBS may conflict with some Cisco routers, which are not able to resolve the IP address of the server and must be configured with a static ARP entry.

Cheers, frizzy2008.