Let’s start by understanding about few terminologies and some facts related to this article and come on common ground.
Hashing is the practice of taking a string or input key, a variable created for storing narrative data, and representing it with a hash value, which is typically determined by an algorithm and constitutes a much shorter string than the original.
Web caching is the activity of storing data for reuse, such as a copy of a web page served by a web server. It is cached or stored the first time a user visits the page and the next time a user requests the same page, a cache will serve the copy, which helps keep the origin server from getting overloaded.
Imagine visiting e-commerce, where the page is being requested over and over again, it’s wasteful to repeatedly download it from the server. An obvious idea is to use a Web cache, which stores a local copy of recently visited pages.
If there is a local copy already present there is no need to hit the server and the response will be faster from a local copy creating a win-win situation for all. Distributed System as said by Andrew Tanenbaum is
A collection of independent computers that appear to its users as one computer.
Distributed systems consist of multiple computers that operate concurrently, fail independently, and do not share a common clock. They must be synchronized to be consistent.
The original motivation for consistent hashing (in 1997) was Web caching. The idea has real applications. Consistent hashing gave birth to Akamai, which to this day is a major player in the Internet, managing the Web presence of tons of major companies. (Quantitatively, Akamai serves 10–30% of all internet traffic).
~CS168 Stanford University
Teradata used this technique in their distributed database, released in 1986, although they did not use this term. Teradata still use the concept of a hash table to fulfill exactly this purpose.
Let’s suppose in an organization there is a single server to render all the requests. All the read and write operations are being performed over the same. Let’s say the product is read intensive. After some time, the user base grows and the first thing that comes in mind to keep rendering the read requests is to create Read Replicas the single Master server. Now the write operations are being addressed to the Master and Read requests to the replicas.
Now let’s say over the time traffic increases for write operations. Maybe it’s festival time and everybody is busy writing wishes to friends and family. The write operations are overflowing. One idea that comes into mind is to create shards. As a result, five shards are created. If this were a Relational database it could get messy here as it will be difficult to maintain join operations. Referential integrity, the parent/child relationship between tables usually maintained by an RDBMS, won’t be automatically maintained if the parent and child rows are on separate shards. One approach can be to denormalize but that can get terrible too.
Similarly, let suppose in an organization we plan to cache the data for people working in that organization. Now, this data can be huge to be rendered by a single machine. Suppose we maintain 5 machines M0, M1, M2, M3, M4 to render results. What we do is hash the request and generate a unique number. Generally, a unique number is not generated. We use algorithms like MD5 to generate let’s say 32 bit hashes. But here let’s consider we generate a unique number after hashing. A request comes in with RequestId 1, we hash this to get the unique number 21. Now let’s say we use a modulo operator to find the server number address.
requestId = 1 h(requestId) = h(1) = 21 //let's suppose h(requestId) % 5 = 21 % 5 = 1
This will work perfectly fine and will distribute the load evenly across machines. But this situation is something very ideal. The load can increase with time and there can be failures with any existing machine. In such a scenario there will always be an issue. Suppose the load increases and the organization decides to add one more resource. In such a case,
requestId = 1 h(requestId) = h(1) = 21 //let's suppose h(requestId) % 6 = 21 % 6 = 3
The resource will be looked upon at M3 but at the time of storage we had 5 machines and thus we stored it at M1. One bad solution can be to redistribute the resources across all 6 machines but that can be a very costly operation. Also what if a node goes off. There will be 4 nodes now and the request will be pointed on the wrong machine. As here we are considering the case of creating caches, if resources are not found the request will go to the original resource server and create a local copy at the machine too. This is something that will create inconsistency in data. As the same data will now be present across two nodes and if the user updates it on one there will be a data mismatch for the same key across nodes. So this solution is something not preferred.
To the savior comes consistent hashing into the picture. Both the cases discussed above were to provide a proper understanding of the problem in a wide frame.
How consistent hashing helps?
What we need are the objects to remain at the same place whatever be the number of machines or servers.
Here we hash both key and server. Let’s say we do a 32 bit MD5 of key or id of an object and similar 32 bit MD5 of a server’s IP. This gives us numbers that can be chosen from 0 to 2³²-1. So we have machines m0 → m2³²-1. Let’s consider a small subset and create a flat array to understand well.
Consider objects x0, x1, x2, x4 and machines m1, m2 and m3. Each object is assigned to the next machine to its right. x0 →m2, x2 →m1, x1 & x4 → m3. Here we have hashed all the machines and objects and pushed respectively to this flat array-like structure.
We get an object, hash it, find and start lookup to its right to find the next machine hash h(m), and assign this object to this machine. The first element of an array is considered to be at the right of the last element of an array.
So this can be visualized as a circular node connection something like below in Figure: 3. The positioning of machines and objects is by the hashed output. This is just being said to create a visualization of how it works.
Caches and objects both hash to points on this circle, an object is stored on the cache server that is closest in the clockwise direction. Assuming reasonable hash functions, by symmetry, the expected load on each of the n cache servers is exactly a 1/n fraction of the objects.
Now here let’s say there is a situation where we have to add one more cache machine server as the load increases, we can add it easily and it will require movement of the only 1/n of the total number of objects as machines are distributed evenly.
Here in Figure 4, machine m4 is added between m1 and x2. Previously, the object x2 was referenced at m1, now the lookup will change. All the data of m1 will be moved to m4 and when x2 will look up to the right or in a clockwise direction, it will query server m4 for results.
In this example of web caching it will be easy as requests will go to m4 and it will not find the resource. This will be a cache miss and data will be fetched from the original server of resource and a copy will be created at m4. Also, the pending requests at m1 will timeout in some time.
If you observe, currently servers do not look very uniformly distributed. It can be the case that most of the traffic is being rendered by a single machine due to non-uniform distribution. Like here, there is a lot of gap from m3 to m1. This can keep m1 at more load than m2 and m3 as most of the object hashes will occupy this space between m3 and m1. So what do we do?
The idea here is to create virtual nodes. Now, what does this mean? This is something algorithmic not hardware-oriented.
What we do is hash a server multiple times with different hash functions. What does this mean? Observe in Figure 4, m1 is present at around 15°, m4 at around 25°, m3 at around 220° and m2 at around 315°. Our current hash function h(x) is responsible to keep these servers at this position, what we do now is hash them 3 times separately with different hash functions to get a different value each time and place them respectively over this circular arrangement.
Here, in Figure 5, multiple virtual copies of m1, m2, m3, and m4 are created. This makes the servers’ distribution more uniform. A single hash function was used to define the initial position of these servers and now multiple different hash functions are used to create more virtual positions of these servers.
This increases the randomness of servers and thus makes object storage uniform across servers. Consider the case where we add a high capacity machine to this ring. We can create more virtual copies of it by using more number of hash functions and that high capacity server will be more present in this ring than others.
This is how we can utilize the capacity of different servers. The sensible approach is to make the number of virtual copies of a cache server proportional to the server’s capacity.
References: CS168 Stanford University