Distributed Systems 6.2: Raft
- Added 27. 07. 2024
- NOTE: There are some mistakes in this video. Please watch this one instead, in which the bugs are fixed: • Distributed Systems 6....
This video is part of an 8-lecture series on distributed systems, given as part of the undergraduate computer science course at the University of Cambridge.
Accompanying lecture notes: www.cl.cam.ac.uk/teaching/202...
Full lecture series: • Distributed Systems le...
Thank you, Martin, for such an amazing trip through a topic as complex as consensus algorithms.
This part was omitted from the "Designing Data-Intensive Applications" book, and I am really happy that you continue sharing the knowledge via the CZcams channel.
Good luck with your research!
Node state transitions in Raft [0:01]
Follower, Candidate, Leader
Raft(1/9): initialisation [3:42]
currentTerm, votedFor, log, commitLength -> on disk
1st leader election [7:11]
Raft(2/9): voting on a new leader [9:28]
Raft(3/9): collecting votes [12:56]
Raft(4/9): broadcasting messages [16:31]
Raft(5/9): replicating from leader to followers [19:23]
Raft(6/9): followers receiving messages [21:31]
Raft(7/9): updating followers' logs [26:24]
Raft(8/9): leader receiving log acknowledgements [30:14]
Raft(9/9): leader committing log entries [33:28]
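To make the outline above a little more concrete, here is a minimal Python sketch of the three node roles and the four variables listed as being kept on disk. The class and method names (`RaftNode`, `on_election_timeout`, `on_quorum_of_votes`, `on_higher_term`) are my own assumptions for illustration, not the lecture's pseudocode, and actual disk persistence and messaging are left out.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Tuple


class Role(Enum):
    FOLLOWER = "follower"
    CANDIDATE = "candidate"
    LEADER = "leader"


@dataclass
class RaftNode:
    # The four variables the outline lists as stored on disk.
    # (Writing them to stable storage is not modelled in this sketch.)
    current_term: int = 0
    voted_for: Optional[str] = None
    log: List[Tuple[int, str]] = field(default_factory=list)  # (term, message)
    commit_length: int = 0
    # Volatile state: every node starts out as a follower.
    role: Role = Role.FOLLOWER

    def on_election_timeout(self, node_id: str) -> None:
        """Follower or candidate suspects the leader has failed: start an election."""
        if self.role is Role.LEADER:
            return
        self.current_term += 1
        self.role = Role.CANDIDATE
        self.voted_for = node_id  # a candidate votes for itself

    def on_quorum_of_votes(self) -> None:
        """Candidate has collected votes from a quorum: become leader."""
        if self.role is Role.CANDIDATE:
            self.role = Role.LEADER

    def on_higher_term(self, term: int) -> None:
        """Any node that sees a higher term steps down to follower."""
        if term > self.current_term:
            self.current_term = term
            self.role = Role.FOLLOWER
            self.voted_for = None
```

A node would go follower -> candidate -> leader by calling `on_election_timeout` and then `on_quorum_of_votes`, and drop back to follower whenever `on_higher_term` sees a newer term.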
Thank you for posting all of these!
Amazing! super clear!
You are so good .. I love you. Namastey from Bangalore 🙏🌷
very well explained thanks for this
en.wikipedia.org/wiki/Raft_(algorithm) also explains some parts of Raft quite well.
How do you deal with quorums in the case of a dynamic number of nodes? I.e. in a peer-to-peer application where peers can suddenly disconnect (possibly without sending a leave message, etc.) and later reconnect. If a lot of nodes suddenly disconnected, we could end up with fewer than a quorum of nodes remaining unless the required quorum size is updated.
Perhaps lowering the quorum limit in the case of repeated failed elections?
There is an extension of Raft that adds a reconfiguration protocol, which can be used to add or remove nodes. But that still requires some amount of central control over the consensus system, which is difficult to achieve in peer-to-peer systems. The whole world of blockchains essentially explores ways of achieving consensus in peer-to-peer systems (especially where peers might be untrusted).
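A small sketch of why simply lowering the quorum limit is risky, assuming majority quorums over a fixed set of `n` nodes. The helper names `quorum_size` and `disjoint_quorums_possible` are hypothetical. The point is that two quorums must always overlap in at least one node; once the threshold drops to half the nodes or fewer, two disconnected groups could each reach it and each elect its own leader.

```python
def quorum_size(n: int) -> int:
    """Smallest number of nodes that forms a strict majority of n."""
    return n // 2 + 1


def disjoint_quorums_possible(n: int, q: int) -> bool:
    """True if two groups of q nodes can exist without sharing any node."""
    return 2 * q <= n


n = 5
print(quorum_size(n))                                 # majority of 5 nodes
print(disjoint_quorums_possible(n, quorum_size(n)))   # majority quorums always overlap
print(disjoint_quorums_possible(n, 2))                # a lowered threshold of 2 does not
```

With 5 nodes, a threshold of 2 would let nodes {A, B} and {C, D} each form a "quorum" on opposite sides of a network partition, which is exactly the split-brain situation that majority quorums rule out.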
Will the term number overflow if it is an integer or long number? How can that be solved?
Most systems just choose a number type with sufficiently many bits that it will never overflow within the lifetime of the system. 64 bits would allow the system to run for millions or even billions of years without overflowing.
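As a rough back-of-the-envelope check of that claim, the sketch below estimates how long a signed counter lasts. The election rates are made-up assumptions (real systems hold elections far less often than once per millisecond), and `years_until_overflow` is a hypothetical helper.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def years_until_overflow(bits: int, terms_per_second: float) -> float:
    """Years until a signed counter of the given width overflows,
    if the term is incremented at a constant (assumed) rate."""
    max_term = 2 ** (bits - 1) - 1
    return max_term / terms_per_second / SECONDS_PER_YEAR


# Deliberately pessimistic assumption: 1000 new terms every second.
print(f"64-bit: {years_until_overflow(64, 1000):.3g} years")
print(f"32-bit: {years_until_overflow(32, 1000):.3g} years")
```

Even at that unrealistic rate a 64-bit term lasts on the order of hundreds of millions of years, whereas a 32-bit term would run out in under a year, so the width of the counter matters more than any overflow-handling logic.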
Can you share the implementation code?
What happens if the leader crashes during the total order broadcast?
Eventually the remaining nodes will elect a new leader.
This one is really hard 🤕
What if the candidate is not able to get a quorum of votes for multiple terms? The probability of this happening is super low, but I just wanted to know whether this would be some sort of "stalemate". 🤔
Using randomized election timeouts for each node mitigates this problem.
Martin explained the Raft algorithm well. IMO this is another great introduction to Raft: czcams.com/video/vYp4LYbnnW8/video.html
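A minimal sketch of that idea, assuming each node independently draws its timeout from a fixed range. The 150 to 300 ms range and the names `election_timeout_ms` and `first_candidate` are illustrative assumptions, not something stated in the video. Because the timeouts differ, one node usually times out first and can win the election before the others even become candidates; if votes do split, every node draws a fresh timeout for the next term.

```python
import random
from typing import Dict, List


def election_timeout_ms(rng: random.Random,
                        base_ms: float = 150.0,
                        spread_ms: float = 150.0) -> float:
    """Pick a timeout uniformly between base_ms and base_ms + spread_ms."""
    return base_ms + rng.uniform(0.0, spread_ms)


def first_candidate(node_ids: List[str], rng: random.Random) -> str:
    """Return the node whose randomly chosen timeout expires first."""
    timeouts: Dict[str, float] = {n: election_timeout_ms(rng) for n in node_ids}
    return min(timeouts, key=timeouts.get)


rng = random.Random(42)  # seeded only so the sketch is repeatable
nodes = ["A", "B", "C", "D", "E"]
# Each simulated round re-draws all timeouts, as nodes would after a failed election.
for round_number in range(3):
    print(round_number, first_candidate(nodes, rng))
```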