Understanding Load Balancing Through Chocolates and Kids

Imagine you are hosting a party with a bunch of kids, and you have a bowl full of chocolates. You notice that every kid is rushing to grab as many chocolates as they can—some get more, while others get none. You realize this isn’t fair and decide to organize the distribution properly so that every child gets chocolates evenly.

To solve this, you introduce Uncle Bob, whose responsibility is to ensure that chocolates are distributed fairly among the kids. But how exactly should Uncle Bob do this? Let’s explore different strategies:

1. Round Robin Algorithm

The first strategy is simple: Uncle Bob distributes chocolates one by one in a circular fashion.

The first chocolate goes to the first child, the second to the second child, the third to the third, and so on.
After reaching the last child, he starts over from the first child again.

This approach ensures chocolates are distributed evenly among all the kids. Since chocolates are given out in a circular motion, this method is called the Round Robin Algorithm.

However, there's a problem—some kids are bigger and need more chocolates, while others are smaller and don’t need as much. This strategy doesn’t consider individual needs.

2. Sticky Round Robin Algorithm

Another approach is to assign a specific type of chocolate to each child.

For example, Snickers is assigned to the first child, Cadbury to the second, Mars to the third, and so on.
Whenever a specific chocolate appears in the bowl, it is always given to the same child.

This is called the Sticky Round Robin Algorithm because a particular child is "stuck" to a specific type of chocolate.

However, this method has a flaw: if there are more Mars chocolates in the bowl than Cadbury, the distribution becomes uneven. Similarly, in real-world applications, if all users from a specific country (e.g., India) are assigned to one server, that server may become overloaded while others remain underutilized.

3. Weighted Round Robin Algorithm

To address this issue, we introduce a weighting system.

Suppose we distribute chocolates based on the marks children get in school.
The higher the marks, the more chocolates a child receives.

In the real world, weighted load balancing assigns different weights to servers based on their capacity. A server with a higher weight receives more requests, just as a child with higher marks gets more chocolates. However, the drawback is that the weights must be manually assigned and updated, which can be tedious and error-prone.

4. Hashing-Based Allocation

Another strategy is to use hashing to distribute chocolates evenly.

Each candy is assigned a number.
To determine which child receives a candy, we take the candy's number and apply the modulus operation (remainder).
For example, if we have four children, we do candy_number % 4, which will always result in a number between 0 and 3 (corresponding to one of the four kids).

This ensures an even distribution of chocolates. Similarly, in real-world applications, consistent hashing is used to distribute user requests to servers efficiently.

5. Least Connections Algorithm (Dynamic Load Balancing)

Now, let’s consider a dynamic approach.

Before Uncle Bob starts distributing chocolates, some kids may have already taken chocolates on their own.
If he follows the round-robin method, kids who already have chocolates will end up getting even more.

To solve this, Uncle Bob first identifies the child with the least chocolates and gives them the next candy. This ensures fairness before distributing chocolates evenly.

This approach is known as the Least Connections Algorithm, which ensures that servers with fewer active connections receive more requests first. Since real-world servers do not start with zero requests, this method prevents overloading one server while others remain idle.

6. Least Response Time Algorithm

Now, let’s refine our approach further by considering how quickly kids finish their chocolates.

Suppose we observe that bigger kids eat chocolates faster than smaller kids.
Our assumption is that if a child finishes their chocolates quickly, they probably received fewer chocolates in the first place.
So, we prioritize giving chocolates to the kids who finish faster.

This method is called the Least Response Time Algorithm, where requests are sent to the server that processes them the fastest. While this assumption generally works, it may not always be accurate, just as some kids may eat faster simply because they like chocolates more!

Conclusion

By applying both static and dynamic load balancing algorithms, we ensure that chocolates (or requests) are distributed evenly among children (or servers).

Static Algorithms (Round Robin, Sticky Round Robin, Weighted Round Robin, Hashing) follow predefined rules and do not adapt in real-time.
Dynamic Algorithms (Least Connections, Least Response Time) adjust based on current conditions, ensuring a more efficient distribution.

Understanding these concepts will help you design efficient systems that distribute loads effectively, just as Uncle Bob ensures every child at the party gets their fair share of chocolates!