How Sharing Candies 🍬at a Party Teaches Us About Consistent Hashing
Photo by Adi Goldstein on Unsplash
Introduction
Imagine you're excited to throw a party. At the party, you plan to give candies to everyone who comes. You and three friends are arranging this party, and each person should bring some candy. The question is: Who should bring what?
The Idea: Modulus
You come up with a brilliant idea: use modulus (remainder when dividing numbers) to decide the candy assignments.
What is Modulus?
Modulus is the remainder when you divide one number by another.
Assigning Candies Using Modulus
Let's say there are 3 friends, and each candy needs to be assigned to a friend:
Take the first candy and divide it by 3. If the remainder is
0
, assign it to the first friend.Take the second candy, divide it by 3. If the remainder is
1
, assign it to the second friend.Repeat the same process for all candies.
Now, if you want to know who should bring the 56th candy, just divide 56 by 3
. The remainder tells you which friend is responsible.
What Happens If a Friend Leaves?
If one friend decides not to come, you have to reassign the candies. Now there are only two friends left, so you divide the candies by 2
instead of 3
.
This means recalculating everything—a tedious process.
The Game of Circles
To make things easier, you introduce a circle-based system.
How It Works:
Create a Circle: Divide the circle into 10 parts.
Assign Numbers to Friends: Based on their names, assign numbers (e.g., John = 2, Rebecca = 7).
Assign Numbers to Candies: Use the same method to give each candy a number.
Now, you can easily assign candies to friends by going clockwise around the circle:
- Start with a candy, find the nearest friend on the circle, and assign it.
Handling a Missing Friend
If a friend like Tom leaves, you don’t need to recalculate everything. You just reassign Tom’s candies to the next nearest friend on the circle.
The Problem of Unequal Distribution
Sometimes, candies might cluster around one friend, like John, making him bring more candies than others. To solve this, you introduce virtual nodes.
Introducing Virtual Nodes
Add virtual balls for each friend (e.g., "Ball John" and "Ball Rebecca").
Spread these virtual balls randomly around the circle.
When assigning a candy, you check which ball it lands near:
If it’s "Ball John," assign the candy to John.
If it’s "Ball Rebecca," assign the candy to Rebecca.
This ensures a more even distribution of candies among friends.
Consistent Hashing
This entire concept is called consistent hashing.
Why "Consistent"?
In consistent hashing:
When a friend (node) leaves or joins, you don’t need to reassign all candies (data).
Only the candies belonging to the leaving friend are reassigned.
Hashing Algorithm and Hash Space
Hashing Algorithm: Converts a name (e.g., "John") or candy to a number.
Hash Space: The range of numbers produced by the hashing algorithm (e.g., 1 to 100).
The circle is divided based on this hash space. All friends (nodes) and candies (data) are placed on the circle according to their hashed numbers.
Conclusion
Using consistent hashing:
You can organize the party efficiently.
You ensure candies (data) are distributed evenly and reassigned with minimal effort when someone leaves or joins.
The system becomes reliable, consistent, and fault-tolerant.
With this, your party is a success, and so is your distributed data system!