Multi-leader database replication. Is it all worth it?
We see multi-leader-like scenarios whenever we work across multiple devices that need to stay in sync with each other, such as Google Drive, wiki pages, and many others.
Here, every device that accepts updates acts as a leader, while the other devices act as followers. So you might have a picture of it by now.
But yes, let’s talk about why we require multi-leader databases.
- Supporting writes in multiple data centers improves write performance across data centers.
- Tolerance of network problems between data centers.
- Independent operation of the remaining data centers even if one data center goes down.
But there are always tradeoffs.
How should we handle write conflicts across leaders? What about collisions of auto-incrementing keys? How do we fix messed-up data across leaders caused by a bug? And more.
The simplest way to solve a problem is to not have the pain in the first place.
How can we not have conflicts in multi-leader?
If we ensure that all writes for a particular record always go through the same leader, then conflicts won’t occur. This still improves write performance across data centers, but if one data center fails, writes get rerouted to another leader and we ultimately have to deal with conflicts :(
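This routing idea can be sketched in a few lines. It's a minimal illustration with assumed names (`LEADERS`, `home_leader`, `route_write` are made up here); a real system would route through a configuration service rather than a client-side hash.

```python
import zlib

# Illustrative data-center leaders; these names are made up for the sketch.
LEADERS = ["dc-us", "dc-eu", "dc-asia"]

def home_leader(record_id: str, leaders=LEADERS) -> str:
    # Stable hash (crc32) so every client agrees on the same home leader;
    # Python's built-in hash() is salted per process, so it won't work here.
    return leaders[zlib.crc32(record_id.encode()) % len(leaders)]

def route_write(record_id: str, available: list) -> str:
    # Prefer the record's home leader; if its data center is down, fall
    # back to another leader -- and from that moment conflicts are possible.
    preferred = home_leader(record_id)
    return preferred if preferred in available else available[0]
```

As long as every client routes through `home_leader`, two leaders never accept concurrent writes to the same record; the fallback in `route_write` is exactly where that guarantee breaks.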
Ways to deal with conflicts
Now there are two ways to resolve conflicts:
- Application resolves conflicts: the application (or the database itself) detects the conflict, and a snippet of code resolves it. One common technique is last-write-wins (LWW).
- User resolves conflicts: store all the conflicting versions, and the next time the user reads the data, prompt them to resolve the conflict and write the result back to the database.

Let me know in the comments if you want a blog discussing the various techniques. I would be happy to do that :)
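Last-write-wins can be sketched in a few lines. This assumes each write carries a timestamp; the function name and the `(timestamp, value)` shape are illustrative, not from any real library.

```python
def lww_resolve(version_a, version_b):
    """Each version is a (timestamp, value) pair; the later write wins."""
    return version_a if version_a[0] >= version_b[0] else version_b

# Two leaders accepted concurrent writes to the same key:
winner = lww_resolve((1710000005, "blue"), (1710000001, "red"))
# winner is (1710000005, "blue")
```

Note the trade-off: LWW is simple and guarantees convergence, but it silently discards the losing write, so it is only safe when losing data is acceptable.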
How do multiple leaders communicate among themselves?
- Circular topology: each node receives writes from one node and forwards those writes (plus any writes of its own) to the next node.
- Star topology: one designated root node forwards writes to all the other nodes.
In such architectures, writes may need to pass through several nodes before reaching all replicas, which opens the possibility of infinite replication loops: a write keeps circulating across the nodes forever.
How to avoid infinite replication loops?
Each node is given a unique identifier, and in the replication log each write is tagged with the identifiers of all the nodes it has passed through.
Whenever a node receives a change that is already tagged with its own identifier, it ignores that change, since it knows it has already been processed, and infinite looping is prevented.
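Here is a minimal sketch of that loop-prevention rule, simulating a write traveling around a three-node ring. The node names and the `seen` tag are assumptions for illustration:

```python
applied = []  # (node, value) pairs, to show each node applies the write once

def simulate_ring(nodes, write):
    """Pass a write around a circular topology until a node sees its own tag."""
    i = 0
    while True:
        node = nodes[i % len(nodes)]
        if node in write["seen"]:
            break  # this node already processed the write -- stop the loop
        applied.append((node, write["value"]))  # apply the change locally
        write["seen"].append(node)              # tag the log with our id
        i += 1                                  # forward to the next node

simulate_ring(["n1", "n2", "n3"], {"value": "x", "seen": []})
# every node applies the write exactly once, then the loop stops
```

Without the `seen` check, the `while True` loop would never terminate, which is exactly the infinite replication loop described above.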
But there is one more issue here: a single point of failure.
In circular and star topologies, if one node fails, it interrupts the flow of replication messages between the other nodes as well.
How to prevent a single point of failure?
Use a denser topology, such as all-to-all, where every leader sends its writes to every other leader. Here fault tolerance is better because messages can travel along different paths, avoiding a single point of failure.
But it definitely comes with its own complexities: one network link may be faster than others, so some messages may overtake others and lead to data inconsistencies in a concurrent environment. We can use version vectors for this.
Version vector is a whole different blog, hence not discussed here.
This should have given you a fair idea of multi-leader architecture. So, do you think it is all worth it? Or should we avoid multi-leader?
Reference: Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems, by Martin Kleppmann.