Load Balancing Overview
In this tutorial, we discuss load balancing, an important system design concept and a common topic in system design interviews for tech roles.
Load balancing refers to efficiently distributing incoming network traffic across a group of backend servers, also known as a server farm or server pool.
Modern high‑traffic websites must serve hundreds of thousands, if not millions, of concurrent requests from users or clients and return the correct text, images, video, or application data, all in a fast and reliable manner.
A load balancer acts as the “traffic cop” sitting in front of your servers and routing client requests across all servers capable of fulfilling those requests in a manner that maximizes speed and capacity utilization and ensures that no one server is overworked, which could degrade performance. A load balancer can be a hardware or software system, and it has implications for security, user sessions, and caching. If a single server goes down, the load balancer redirects traffic to the remaining online servers. When a new server is added to the server group, the load balancer automatically starts to send requests to it.
Load balancing is an essential component of system design, as it helps distribute incoming requests and traffic evenly across multiple servers. The primary objective of load balancing is to ensure high availability, reliability, and performance by avoiding overloading any single server and preventing downtime.
A load balancer is usually positioned between the client and the servers, accepting incoming network and application traffic and distributing it across multiple backend servers using various algorithms. By balancing application requests across multiple servers, a load balancer reduces the load on individual servers and prevents any one server from becoming a single point of failure, thus improving overall application availability and responsiveness.
There are many algorithms for deciding exactly how to distribute requests. The most commonly used classes of algorithms are Round Robin, Least Load (for example, fewest active connections), and hashing on a user or resource identifier.
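These three classes of algorithms can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation; the server addresses are placeholders:

```python
import itertools
import zlib

# Hypothetical backend pool; these addresses are placeholders.
servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

# Round Robin: cycle through the servers in order.
_rr = itertools.cycle(servers)

def round_robin():
    return next(_rr)

# Least Load: pick the server with the fewest active connections.
active_connections = {s: 0 for s in servers}

def least_connections():
    return min(active_connections, key=active_connections.get)

# Hashing: deterministically map a client (here, by IP) to a server,
# so the same client always lands on the same backend.
def hash_route(client_ip):
    return servers[zlib.crc32(client_ip.encode()) % len(servers)]
```

Note that the hashing approach also gives a simple form of session persistence, since a given client IP always maps to the same server (until the pool changes).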
To take advantage of full scalability and redundancy, we can try to balance the load at each layer of the system.
As shown in the image above, load balancers can be added at three locations:
- Between the user and the web server
- Between web servers and an internal platform layer, like application servers or cache servers
- Between the internal platform layer and the database.
Important terminology and concepts
Load Balancer: A device or software that distributes network traffic across multiple servers using predefined rules or algorithms.
Load Balancing Algorithm: The method used by the load balancer to determine how to distribute incoming traffic among the backend servers.
Backend Servers: The servers that receive and process requests forwarded by the load balancer. These are also known as the server pool or server farm.
SSL/TLS Termination: The process of decrypting SSL/TLS-encrypted traffic at the load balancer level, which relieves the decryption workload on backend servers and allows for centralized SSL/TLS management.
Session Persistence: A mechanism for ensuring that successive requests from the same client are sent to the same backend server, hence keeping session state and giving a consistent user experience.
Health Checks: The load balancer runs periodic checks to assess the availability and performance of the backend servers. Unhealthy servers are removed from the server pool until they have recovered.
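A basic health check can be as simple as attempting a TCP connection to each backend. The sketch below, with placeholder hosts and a hypothetical `refresh_pool` helper, shows how unhealthy servers are dropped from the active pool:

```python
import socket

def is_healthy(host, port, timeout=2.0):
    """A server is considered healthy if it accepts a TCP
    connection within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def refresh_pool(all_servers):
    """Keep only the healthy servers in the active pool;
    unhealthy ones are re-added once they pass a later check."""
    return [(host, port) for (host, port) in all_servers
            if is_healthy(host, port)]
```

Real load balancers typically use richer checks (for example, an HTTP request to a `/health` endpoint) and run them on a fixed interval, but the principle is the same.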
How does a load balancer work?
Load balancers distribute incoming network traffic among different servers or resources to ensure optimal use of computing resources and prevent overload. Here are the general steps a load balancer follows to distribute traffic:
- The load balancer receives a request from a client or user.
- The load balancer evaluates the incoming request and determines which server or resource should handle the request. This is done based on a predefined load-balancing algorithm that takes into account factors such as server capacity, server response time, number of active connections, and geographic location.
- The load balancer forwards the incoming traffic to the selected server or resource.
- The server or resource processes the request and sends a response back to the load balancer.
- The load balancer receives the response from the server or resource and sends it to the client or user who made the request.
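The steps above can be sketched as a single request/response cycle. The backends here are stand-in functions rather than a real HTTP stack, and the least-connections choice in step 2 is just one possible algorithm:

```python
def pick_server(servers, counts):
    # Step 2: choose a backend (fewest active connections here).
    return min(servers, key=lambda s: counts[s])

def handle_request(request, servers, counts, backends):
    server = pick_server(servers, counts)        # step 2: select
    counts[server] += 1
    try:
        response = backends[server](request)     # steps 3-4: forward, process
    finally:
        counts[server] -= 1                      # connection closed
    return response                              # step 5: relay to the client

# Example usage with two fake backends that echo the server name:
servers = ["server-a", "server-b"]
counts = {s: 0 for s in servers}
backends = {s: (lambda req, s=s: f"{s} handled {req}") for s in servers}
reply = handle_request("GET /", servers, counts, backends)
```

The `try`/`finally` ensures the connection count is decremented even if a backend fails, so one bad request does not permanently skew the balancing decision.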
That’s all about the Load Balancing Overview in system design. If you have any queries or feedback, please write to us at contact@waytoeasylearn.com. Enjoy learning, enjoy system design!