CSE138 (Distributed Systems) L12: Paxos and consensus wrap-up, passive/active replication

Sdílet
Vložit
  • čas přidán 10. 05. 2021
  • UC Santa Cruz CSE138 (Distributed Systems) Lecture 12: Paxos wrap-up: nontermination, Multi-Paxos, fault tolerance; other consensus protocols: Viewstamped Replication, Zab, Raft; passive vs. active (state machine) replication; implementing read-your-writes and causal consistency
    Recorded May 11, 2021
    Professor Lindsey Kuper users.soe.ucsc.edu/~lkuper/
    Course website: decomposition.al/CSE138-2021-03
    Schedule of topics: decomposition.al/CSE138-2021-0...
  • Věda a technologie

Komentáře • 26

  • @theSquashSH
    @theSquashSH Před 3 lety +5

    Awesome course! I've only had a superficial understanding of Raft and Paxos but always wanted to learn one level deeper without getting into the formal proof weeds and this is perfect. Thanks for teaching this!

  • @420_gunna
    @420_gunna Před rokem

    Is there anything interesting to say on the difference/relationship between Active(StateMachine)/Passive replication in this lecture (~1:26:00) and the concepts of Statement-Based/Row-Based replication as I've heard them described in database literature? Are these pretty much the same thing to you? Thank you! :)

    • @lindseykuperwithasharpie
      @lindseykuperwithasharpie  Před rokem

      I wasn't familiar with the terms "statement-based replication" and "row-based replication", but from a quick look, they do seem kinda analogous to active replication and passive replication, respectively. See also op-based vs. state-based CRDTs for another point of comparison.

  • @oz5219
    @oz5219 Před 2 lety +1

    Dr Kuper, in the case of multi paxos, if an acceptor do accept(n ,val2, 2 as sequence number), and a new proposer come along, do propose(n+1), does the acceptor return promise(n+1, val2, 2), or does it return something like promise(n+1, val1,1,val2, 2) (I mean does it return the whole history of its phase 2)?

    • @lindseykuperwithasharpie
      @lindseykuperwithasharpie  Před 2 lety

      Good question!

    • @klindatv
      @klindatv Před 2 lety

      @@lindseykuperwithasharpie is possibile that will it take the most frequent vote in the max last propose (round) ?

    • @lindseykuperwithasharpie
      @lindseykuperwithasharpie  Před 2 lety

      @@klindatv Not sure I understand your question. Can you rephrase or add some more context?

  • @yoshitoki23
    @yoshitoki23 Před 2 lety

    When you say the other consensus algorithms "bake in" leader election, I'm confused about what you mean by leader. Does you mean leader as in the primary node in primary-backup replication, or leader as in a distinguished role within a consensus algorithm that gets to propose values?

  • @bharathram3977
    @bharathram3977 Před 10 měsíci

    @18:50 Regarding the stalemate problem of how to choose a leader, why can't we have randomised timeouts between the proposers? That way, there'd be a very low possibility of indefinite stalemate coz eventually one proposer's Prepare(n) + Accept(n,foo) are going to be accepted before the other proposer's Prepare(n) reaches. Also, with this idea, we can more/less gaurantee termination & get away with the restriction imposed by FLP right?

  • @yoshitoki23
    @yoshitoki23 Před 2 lety

    So it seems like there's 2 ways to implement total order broadcast: have all writes go through a single leader node that implicitly orders all the writes, or allow writes to go to any node, and run a consensus algorithm to explicitly decide its order. Why do one or the other? I imagine running a consensus algorithm on every write is very expensive, but on the other hand, it allows clients to write to any node, and doesn't have to deal with leader failover and election.

    • @lindseykuperwithasharpie
      @lindseykuperwithasharpie  Před 2 lety

      Those are indeed two ways to implement total-order broadcast. Actually, many TOB algorithms have been proposed. Check out "Total order broadcast and multicast algorithms: Taxonomy and survey" ( dl.acm.org/doi/abs/10.1145/1041680.1041682 ) for an attempt to taxonomize them. However, all options ultimately boil down to consensus, since the two problems are equivalent. Having all writes go through a single leader requires consensus to establish the leader.
      One practical reason to have the consensus cluster be separate from the nodes you want to be broadcasting to would be if you want consensus to be an independent service that you would expect lots of other systems to want to use, too.

  • @agent_smith
    @agent_smith Před 2 lety

    I just want to confirm that by definition, Causal Consistency does not include Read Your Writes.. You included this for the assignment where students are supposed to design a distributed system that provides causal consistency.. Is this true for all implementations of a Distributed System that provides causal consistency or would it be possible to design a distributed system where we provide Causal consistency but not RYW consistency?

    • @lindseykuperwithasharpie
      @lindseykuperwithasharpie  Před 2 lety

      No, causal consistency *does* include RYW, in the sense that all executions that observe causal consistency observe RYW too. It's just that the way that we *informally* talk about causal consistency is a bit fuzzy. If we say things like "writes that have a causal order are seen by all processes in the same (causal) order", that could make us think that an execution that contains just one write and one read and violates RYW is vacuously causally consistent, because it only has one write. However, this problem is made moot if we imagine that every replica has an initializing write.

    • @agent_smith
      @agent_smith Před 2 lety

      @@lindseykuperwithasharpie I see what you're trying to say, but if was just about informally talking about this then RYW Consistency should not have subsumed Causal Consistency when we went over Consistency Models in Lecture 10.
      Also, going by your last point if this is the case then wouldn't RYW consistency be a special case of of Causal Consistency where we have just one write?

    • @lindseykuperwithasharpie
      @lindseykuperwithasharpie  Před 2 lety

      What do you mean by subsumed? If you mean that all executions that observe causal consistency observe RYW too, then that's correct. The set of executions that observe CC is a subset of the set of executions that observe RYW.

    • @lindseykuperwithasharpie
      @lindseykuperwithasharpie  Před 2 lety +1

      I think it's confusing to say that RYW is "a special case of causal consistency", since there are lots of executions that observe RYW but violate causal consistency. I'd say instead that RYW is one aspect of causal consistency. RYW is one of four "session guarantees" originally proposed by Doug Terry and collaborators; interestingly, with the four session guarantees taken together, we get causal consistency.

    • @agent_smith
      @agent_smith Před 2 lety +1

      @@lindseykuperwithasharpie Yes, that's what I meant.. Thank you for clearing this up!

  • @meetme2meat
    @meetme2meat Před 4 měsíci

    Why are we not talking about gossip protocol ..

  • @agent_smith
    @agent_smith Před 2 lety +1

    What is Causal metadata and could you provide some pointers/reading materials on what it looks like? Also how does a client know which replicas to send causal metadata to and for what type of requests?

    • @lindseykuperwithasharpie
      @lindseykuperwithasharpie  Před 2 lety

      Have a look at the earlier lecture in the series that covers implementing causal broadcast. In the causal broadcast example, vector clocks are causal metadata. czcams.com/video/buBGyECx69c/video.html
      In the setup I'm discussing here, clients include causal metadata in all their requests except for the first one they make. They get causal metadata back from the replica in the response to their first request and thread it through on all subsequent requests.

    • @agent_smith
      @agent_smith Před 2 lety +1

      @@lindseykuperwithasharpie Silly me, didn't realize vector clocks were actually causal metadata! Thank you so much for patiently answering my questions, I sincerely appreciate it 😊