CSE138 (Distributed Systems) L9: primary-backup replication, chain replication, latency & throughput
- Added April 27, 2021
- UC Santa Cruz CSE138 (Distributed Systems) Lecture 9: reasons to do replication; strong consistency (informally); primary-backup replication; chain replication; latency and throughput; midterm review
Recorded April 27, 2021
Professor Lindsey Kuper users.soe.ucsc.edu/~lkuper/
Course website: decomposition.al/CSE138-2021-03
Schedule of topics: decomposition.al/CSE138-2021-0... - Science and Technology
Thanks for the revision of PO, FIFO, causal, and TO.
Thanks a lot for such valuable lessons! I've heard about state machine replication but am not sure what it is or how it relates to replication. Do you mind explaining it a bit? Thanks in advance!
I discuss state machine replication a few lectures later in the course: czcams.com/video/wHpB44jtS4g/video.html
While going over the primary-backup replication protocol, the primary sent a broadcast message to all the backups. But each backup delivered the message only to itself and did not forward it to the other backups. Why is this the case? How do I distinguish a case where a message has to be sent on to all replicas as well as delivered locally from one where it is only delivered locally?
Not sure I understand what you're proposing. Are you concerned about message loss between the primary and the backups?
@lindseykuperwithasharpie No, I am not concerned about that. When we went over reliable broadcast, every process that received a broadcast message had to broadcast it to the other processes as well. I was just wondering why the replicas didn't do the same in the primary-backup replication scenario. Hence the question: when is the recipient of a broadcast message supposed to re-broadcast it, versus only deliver it to itself?
Ah, so it sounds like you're wondering if the broadcast to the backups should be a reliable broadcast -- in other words, one that either every correct process delivers or no one does. Well, it could be! But ask yourself what you'd gain from doing that, and what the tradeoff would be.
@lindseykuperwithasharpie I see what you're saying. In primary-backup replication, there is not much to be gained if the backups broadcast the update to each other, since in the normal case all the backups have already received it. But if, for whatever reason, one of the updates from the primary to a backup failed, a reliable broadcast would have helped that backup receive the update and acknowledge it to the primary sooner.
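To make the tradeoff in this thread concrete, here is a minimal Python sketch that counts messages for the two options. It is an illustration under stated assumptions, not the protocol from the lecture: the function name `simulate`, the `forward` flag, and the `dropped` set of lost (sender, receiver) pairs are all hypothetical, backups forward only to other backups, and acknowledgments are not modeled.

```python
from collections import deque


def simulate(num_backups, forward, dropped=frozenset()):
    """Count messages sent and report which backups deliver the update.

    forward=False: the primary sends the update once to each backup.
    forward=True:  each backup, on first delivery, also forwards the
                   update to every other backup (eager re-broadcast).
    dropped:       hypothetical set of (sender, receiver) pairs whose
                   message is lost in transit.
    """
    delivered = set()
    sent = 0
    queue = deque(("primary", b) for b in range(num_backups))
    while queue:
        sender, receiver = queue.popleft()
        sent += 1  # a lost message still costs a send
        if (sender, receiver) in dropped or receiver in delivered:
            continue  # lost, or a duplicate that is ignored
        delivered.add(receiver)
        if forward:
            queue.extend(
                (receiver, other)
                for other in range(num_backups)
                if other != receiver
            )
    return sent, delivered


lost = frozenset({("primary", 2)})
plain = simulate(3, forward=False, dropped=lost)
eager = simulate(3, forward=True, dropped=lost)
print(plain)  # backup 2 never gets the update
print(eager)  # backup 2 gets it from another backup, at a higher message cost
```

With 3 backups, the plain broadcast sends 3 messages and the forwarding variant sends 9, so under these assumptions the extra fault tolerance is paid for with a message count that grows quadratically in the number of backups.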
Nice lecture. Thank you. Does chain replication solve consensus?
I begin to talk about consensus a little at 53:20 in this video (and you can watch the next lecture for a lot more on this topic), but the short answer to your question is no. In fact, chain replication (and any technique for strongly consistent replication) ultimately requires consensus.
In primary-backup replication, do clients block until they get a response back from the primary?
No, they can make other requests to the primary. But if they've made a write and it hasn't been acknowledged yet, then all bets are off for being able to read the write.
@lindseykuperwithasharpie Thanks for your lectures! I have a follow-up: if the primary can process simultaneous writes, how do we ensure strong consistency or total order? For example, if C1 makes a (Write, X, 4) and C2 makes a (Write, X, 5) immediately after, and the primary sends RPCs to the backup to replicate them, the backup may receive the calls in the order [(Write, X, 5), (Write, X, 4)]. Doesn't this violate total order? Are we assuming a FIFO channel between primary and backup?
Great question! Yep, for both primary-backup and chain replication, the storage nodes should communicate with each other in FIFO fashion.
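One common way to get FIFO behavior between two nodes is for the sender to number its messages and for the receiver to buffer anything that arrives early. The sketch below applies that idea to the (Write, X, 4) / (Write, X, 5) example from the question. The `Primary` and `Backup` classes and their methods are hypothetical names for illustration; the reply above only says the nodes should communicate in FIFO fashion, not how that is implemented.

```python
class Primary:
    """Applies writes locally and stamps each with a sequence number."""

    def __init__(self):
        self.store = {}
        self.next_seq = 0

    def write(self, key, value):
        seq = self.next_seq
        self.next_seq += 1
        self.store[key] = value
        # The returned tuple stands in for the message sent to the backup.
        return (seq, key, value)


class Backup:
    """Applies writes strictly in sequence-number order."""

    def __init__(self):
        self.store = {}
        self.next_seq = 0   # sequence number of the next write to apply
        self.pending = {}   # writes that arrived ahead of their turn

    def receive(self, seq, key, value):
        self.pending[seq] = (key, value)
        # Apply every buffered write that is now next in line.
        while self.next_seq in self.pending:
            k, v = self.pending.pop(self.next_seq)
            self.store[k] = v
            self.next_seq += 1


primary = Primary()
backup = Backup()
first = primary.write("X", 4)    # from client C1
second = primary.write("X", 5)   # from client C2

# The network reorders the two messages.
backup.receive(*second)          # buffered, not applied yet
backup.receive(*first)           # applies X=4, then the buffered X=5

print(primary.store, backup.store)
```

Even though the backup receives the writes in the wrong order, it applies them in the order the primary assigned, so both stores end with the same value for X. In practice a single TCP connection between the two nodes already provides this ordering, which is another way to satisfy the assumption.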