A web application server cluster is a group of machines running the same web application server component that appears to the outside world as a single server. To optimize system performance, a load-balanced server cluster distributes the large number of incoming requests across the different nodes of the system for processing. This delivers higher efficiency and stability, features that any Web-based enterprise application must have.
High reliability can be thought of as redundancy in the system configuration. If one application server cannot process a given request, can another server in the cluster handle it effectively? In an efficient system, when a Web server fails, another server immediately takes its place and processes the request, and the switch is as transparent as possible, so the user does not notice it.
Stability here means the application's own ability to support a growing number of user requests, that is, its scalability. It is measured through a range of factors that affect system performance, including the maximum number of users the cluster can serve at the same time and the time needed to process a request.
Of the many existing server load-balancing methods, the following two are the most widely studied and used:
• DNS load balancing with RR-DNS (Round-Robin Domain Name System)
• Load balancing through a virtual IP address (a load balancer)
Both methods are discussed below.
DNS round-robin scheduling, RR-DNS (Round-Robin Domain Name System)
A DNS (Domain Name System) server keeps data files that map host names to their IP addresses. When you type a URL into the browser (for example, www.loadbalancedsite.com), the browser sends a request to the DNS server asking for the IP address of that site; this is known as a DNS query. Once the browser has the IP address, it uses it to connect to the site and displays the page to the user.
A domain name server usually holds a list that maps each site name to a single IP address. In our hypothetical example above, the site www.loadbalancedsite.com maps to the IP address 22.214.171.124.
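As a minimal sketch of the DNS query described above, the snippet below asks the operating system's resolver for the addresses behind a host name. The helper name `lookup` is hypothetical, and `localhost` is used only so the example does not depend on the article's example site actually existing.

```python
import socket

def lookup(name):
    """Return the distinct IP addresses the system resolver reports for a name."""
    # getaddrinfo returns (family, type, proto, canonname, sockaddr) tuples;
    # the address string is the first element of sockaddr.
    infos = socket.getaddrinfo(name, None)
    return sorted({info[4][0] for info in infos})

# "localhost" is a stand-in host name chosen for this illustration.
print(lookup("localhost"))
```

A browser performs the equivalent lookup before it opens a connection to the site.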
To balance load through DNS, the DNS server holds several different IP addresses for the same site. These IP addresses belong to different machines in a cluster and are logically mapped to the same site name. Our example makes this easier to see: www.loadbalancedsite.com publishes three IP addresses, one for each of the three machines in a cluster.
In this case, the DNS server holds a mapping table that associates the site name with each of the three addresses.
When the first request reaches the DNS server, it returns the first machine's IP address, 126.96.36.199; when the second request arrives, it returns the second machine's IP address, 188.8.131.52, and so on. When the fourth request arrives, the first machine's IP address is returned again, and the cycle repeats.
With DNS round-robin, all requests for a site are distributed evenly across the machines in the cluster. With this technique, therefore, every node in the cluster is visible to the network.
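The rotation described above can be sketched in a few lines. This is an illustration only: the class name `RoundRobinResolver` is hypothetical, and the three addresses are placeholders from the documentation range, not the addresses of the example cluster.

```python
from itertools import cycle

# Placeholder addresses (documentation range); they stand in for the
# three cluster machines and are assumptions made for this sketch.
ZONE = {
    "www.loadbalancedsite.com": ["192.0.2.1", "192.0.2.2", "192.0.2.3"],
}

class RoundRobinResolver:
    """Toy resolver that hands out a site's addresses in rotation."""

    def __init__(self, zone):
        # One endless iterator per host name.
        self._rotations = {name: cycle(addrs) for name, addrs in zone.items()}

    def resolve(self, name):
        return next(self._rotations[name])

resolver = RoundRobinResolver(ZONE)
answers = [resolver.resolve("www.loadbalancedsite.com") for _ in range(4)]
print(answers)
```

The fourth answer repeats the first, which is the cycle the text describes.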
Advantages of DNS round-robin scheduling
The biggest advantages of DNS round-robin are that it is cheap and easy to implement:
• Inexpensive and easy to set up. To support round-robin scheduling, the system administrator only needs to make a few changes on the DNS server, and many newer DNS server versions already include this feature. The Web application needs no code changes; in fact, the application is not even aware of the load-balancing configuration in front of it.
• Simple. No networking expert is needed to set it up, or to maintain it when a problem occurs.
Disadvantages of DNS round-robin scheduling
This software-based load-balancing method has two shortcomings: it does not support real-time server consistency (affinity), and it does not provide high reliability.
• No support for server consistency. Consistency is the ability of a load-balancing system to direct a user's requests to the appropriate server, using session information held on the server side or at the underlying database level. DNS round-robin has no such intelligence. Instead it relies on one of three mechanisms (cookies, hidden fields, or URL rewriting) to achieve a similar effect. Once a user has established a connection with a server through such a text-based token, all of that user's follow-up requests go to the same server. The problem is that the server's IP address is only held temporarily in the browser's cache; once the record expires, the connection must be re-established, and the user's next request may well be handled by a different server, in which case all earlier session information is lost.
• No support for high reliability. Consider a cluster of N nodes. If one node fails, every request routed to that node goes unanswered, which nobody wants. More advanced routers address this by checking the nodes at regular intervals and removing any failed node from the list. However, ISPs across the Internet keep large DNS caches to save lookup time, so DNS updates propagate very slowly; some users may still be directed to nodes that no longer exist, while some newly added nodes remain unreachable. So although DNS round-robin solves the load-balancing problem to a certain extent, it does not cope well or efficiently with such changes.
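The caching problem in the last point can be illustrated with a toy model. Everything here is an assumption made for the sketch: the class `CachingResolver`, the one-hour time-to-live, and the placeholder addresses. Time is passed in explicitly so the example needs no real waiting.

```python
class CachingResolver:
    """Toy model of an ISP-side DNS cache with a fixed time-to-live."""

    def __init__(self, authoritative, ttl_seconds):
        self.authoritative = authoritative   # name -> list of live addresses
        self.ttl = ttl_seconds
        self._cache = {}                     # name -> (address, expiry time)

    def resolve(self, name, now):
        cached = self._cache.get(name)
        if cached and now < cached[1]:
            return cached[0]                 # served from cache, possibly stale
        address = self.authoritative[name][0]
        self._cache[name] = (address, now + self.ttl)
        return address

# Placeholder addresses and TTL, chosen only for illustration.
live = {"www.loadbalancedsite.com": ["192.0.2.1", "192.0.2.2"]}
isp = CachingResolver(live, ttl_seconds=3600)

first = isp.resolve("www.loadbalancedsite.com", now=0)      # caches 192.0.2.1
live["www.loadbalancedsite.com"].remove("192.0.2.1")        # node fails, is delisted
stale = isp.resolve("www.loadbalancedsite.com", now=600)    # cache still answers
fresh = isp.resolve("www.loadbalancedsite.com", now=3600)   # cache entry expired
print(first, stale, fresh)
```

Until the cached entry expires, users behind this cache are still sent to the failed node, even though the authoritative list was corrected immediately.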
Besides the round-robin scheduling described above, there are three other methods for distributing load. All four are listed below:
• Round robin (RRS): distributes work evenly across the servers (suited to real servers with identical performance).
• Least-connections (LCS): assigns more work to the servers with fewer connections (the IPVS table stores all active connections; suited to real servers with identical performance).
• Weighted round robin (WRRS): assigns more work to servers with greater capacity. The weights can be adjusted up or down based on dynamic load information (suited to real servers with differing performance).
• Weighted least-connections (WLC): assigns more work to the servers with fewer connections, taking their capacity into account. Capacity is expressed by a user-specified weight, which can be adjusted up or down based on dynamic load information (suited to real servers with differing performance).
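The four policies listed above can be compared side by side in a short sketch. The `Server` type and the function names are hypothetical, and the weighted round robin shown is the naive variant that simply repeats each server `weight` times per cycle; production schedulers typically interleave the picks more smoothly.

```python
from dataclasses import dataclass
from itertools import cycle

@dataclass
class Server:
    name: str
    weight: int = 1        # user-assigned capacity
    connections: int = 0   # currently active connections

def round_robin(servers):
    """Endless fixed rotation over the servers."""
    return cycle(servers)

def weighted_round_robin(servers):
    """Rotation in which each server appears `weight` times per cycle."""
    return cycle([s for s in servers for _ in range(s.weight)])

def least_connections(servers):
    """Pick the server with the fewest active connections."""
    return min(servers, key=lambda s: s.connections)

def weighted_least_connections(servers):
    """Pick the server with the lowest connections-to-weight ratio."""
    return min(servers, key=lambda s: s.connections / s.weight)

# Two illustrative servers: "b" has three times the capacity of "a".
servers = [Server("a", weight=1, connections=2),
           Server("b", weight=3, connections=4)]

rr = round_robin(servers)
wrr = weighted_round_robin(servers)
rr_order = [next(rr).name for _ in range(4)]
wrr_order = [next(wrr).name for _ in range(4)]
lc_pick = least_connections(servers).name
wlc_pick = weighted_least_connections(servers).name
print(rr_order, wrr_order, lc_pick, wlc_pick)
```

With these numbers, plain least-connections prefers "a" (2 connections against 4), while the weighted version prefers "b" because its load relative to capacity (4/3) is lower than that of "a" (2/1).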
Load balancing via a virtual IP address
A load balancer with a virtual IP address solves many of the problems that round-robin scheduling faces. Seen from outside, a cluster behind a load balancer looks like a single server with a single IP address. That address is virtual: the load balancer maps it to the addresses of the individual machines in the cluster. In effect, the load balancer exposes one IP address for the entire cluster to the outside network.
When a request reaches the load balancer, it rewrites the request header and assigns the request to one of the machines in the cluster. If a machine is removed from the cluster, requests are never sent to the server that no longer exists: all the machines appear under the same IP address, and that address does not change when a node is removed. Stale DNS entries cached on the Internet are therefore no longer a problem. When a response is returned, the client sees it as coming from the load balancer. In other words, the client deals only with the load balancer, and everything that happens behind it is completely transparent to the client.
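A minimal sketch of this behaviour, under stated assumptions: the class `LoadBalancer`, the dictionary shape of a request, and all addresses are hypothetical, and simple rotation stands in for whatever scheduling policy a real device uses.

```python
class LoadBalancer:
    """Toy balancer: one public (virtual) address, many private nodes."""

    def __init__(self, virtual_ip, nodes):
        self.virtual_ip = virtual_ip
        self.nodes = list(nodes)
        self.log = []          # which node served each request (internal only)
        self._next = 0

    def remove_node(self, node):
        # The public address is unaffected when a node leaves the cluster.
        self.nodes.remove(node)

    def handle(self, request):
        # Pick the next remaining node and rewrite the destination header.
        node = self.nodes[self._next % len(self.nodes)]
        self._next += 1
        forwarded = dict(request, dest=node)
        self.log.append(forwarded["dest"])
        # The reply is labelled with the virtual IP, so the client never
        # learns which node did the work.
        return {"from": self.virtual_ip, "body": "ok " + request["path"]}

# Illustrative addresses only.
lb = LoadBalancer("203.0.113.10", ["10.0.0.1", "10.0.0.2", "10.0.0.3"])
lb.remove_node("10.0.0.2")
replies = [lb.handle({"dest": lb.virtual_ip, "path": "/"}) for _ in range(4)]
print(lb.log)
```

After the node is removed, clients keep using the same virtual address and no request is forwarded to the missing machine.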
Advantages of the load balancer
• Server consistency. The load balancer reads the cookies or the URL contained in each request the client sends. Based on this information, it can rewrite the header and send the request to the appropriate cluster node, the one that holds the session information for that client. Load balancers can provide server consistency for HTTP traffic, but not over a secure channel such as HTTPS. When a message is encrypted (SSL), the load balancer cannot read the session information hidden inside it.
• High reliability through failover. Failover occurs when a cluster node cannot process a request and the request has to be redirected to another node. There are two main kinds of failover:
• Request-level failover. When a node in the cluster cannot handle a request (usually because the machine is down), the request is sent to another node. Of course, when the request is redirected, the session information stored on the original node is lost.
• Transparent session failover. When a request fails, the load balancer sends it to another node in the cluster to complete the operation, and this is transparent to the user. Because the node taking over needs the corresponding session information, all nodes in the cluster must share a common storage area or a common database for session data, so that any node can obtain the information it needs when it recovers a session.
• Statistical measurement. Because all Web application requests pass through the load-balancing system, it can determine the number of active sessions, the number of active sessions connected to any instance, response times, peak load times, the number of sessions at peaks and troughs, and more. All of these statistics can be put to good use in tuning overall system performance.
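The consistency, failover, and measurement points in the list above can be sketched together. This is an illustration under assumptions, not a real device's behaviour: the `Cluster` class, the cookie names `node` and `sid`, and the use of a plain dictionary as the shared session store are all hypothetical.

```python
class Cluster:
    """Toy balancer with cookie-based affinity and two failover modes."""

    def __init__(self, nodes, shared_store=None):
        self.alive = {n: True for n in nodes}
        self.local = {n: {} for n in nodes}   # per-node session data
        self.shared = shared_store            # None: request-level failover only
        self.hits = {n: 0 for n in nodes}     # simple per-node statistics

    def _sessions(self, node):
        # With a shared store, every node reads and writes the same sessions.
        return self.shared if self.shared is not None else self.local[node]

    def handle(self, cookies, item):
        node = cookies.get("node")
        if node is None or not self.alive[node]:
            # First visit or failover: pick a live node and re-pin the client.
            node = next(n for n, up in self.alive.items() if up)
            cookies["node"] = node
        self.hits[node] += 1
        cart = self._sessions(node).setdefault(cookies["sid"], [])
        cart.append(item)
        return node, list(cart)

def run(shared_store):
    cluster = Cluster(["n1", "n2"], shared_store)
    cookies = {"sid": "user-42"}
    cluster.handle(cookies, "book")   # client is pinned to n1
    cluster.alive["n1"] = False       # n1 goes down
    return cluster.handle(cookies, "pen")

request_level = run(None)   # session data stayed on the failed node
transparent = run({})       # session data lives in the shared store
print(request_level, transparent)
```

In both runs the second request lands on `n2`, but only the run with a shared store still sees the item added before the failure. The `hits` counter shows how the same choke point can collect per-node statistics.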
Disadvantages of the load balancer
The drawbacks of hardware routing are its cost, its complexity, and the fact that it is a single point of failure. Because all requests pass through a single hardware load balancer, any failure of that device brings down the entire site.
HTTPS request load balancing
As mentioned above, load balancing and session maintenance are difficult for HTTPS requests, because the information in these requests is encrypted and the load balancer cannot read it. However, there are two ways to solve this problem:
• Web server proxy. A proxy server placed in front of the server cluster first accepts all requests and decrypts them, then forwards them to the appropriate nodes based on their header information. This approach needs no additional hardware, but it places an extra load on the proxy server.
• Hardware SSL decoder. Requests are decrypted by the decoder before they reach the load balancer. This is somewhat faster than the proxy approach, but it costs more and is more complex to implement.