CSE138 (Distributed Systems) L14: Dynamo: Merkle trees, quorum consistency, tail latency

Sdílet
Vložit
  • čas přidán 18. 05. 2021
  • UC Santa Cruz CSE138 (Distributed Systems) Lecture 14: Dynamo: review of old ideas (availability, network partitions, eventual consistency, application-specific conflict resolution); intro to: anti-entropy with Merkle trees, gossip, quorum consistency, tail latency
    Recorded May 18, 2021
    Professor Lindsey Kuper users.soe.ucsc.edu/~lkuper/
    Course website: decomposition.al/CSE138-2021-03
    Schedule of topics: decomposition.al/CSE138-2021-0...
  • Věda a technologie

Komentáře • 15

  • @abhishekkumarsrivastava3480

    Thanks for sharing the Quality-content

  • @bharathram3977
    @bharathram3977 Před 9 měsíci

    @1:15:55 You mentioned that if N=W and if there failures (of nodes, networks etc.), then writes can enever happen. Just wanted to point out that this might not be the case always, due to the idea of "sloppy quorum" and "preference list". The number of nodes in preference list is > N and from this list, "top N healthy" nodes are choosen for read/write operations. Then, they use the idea of Hinted-Handoff to replicate data back once nodes have recovered

  • @AkhilNairjedi18
    @AkhilNairjedi18 Před 3 lety

    Could you please share the blog post you mentioned in the video at around 53 minutes? The one with Merkle trees having different number of elements.

    • @lindseykuperwithasharpie
      @lindseykuperwithasharpie  Před 3 lety +5

      Here you go: decomposition.al/blog/2019/05/31/how-i-learned-about-merklix-trees-without-having-to-become-a-cryptocurrency-enthusiast/

    • @AkhilNairjedi18
      @AkhilNairjedi18 Před 3 lety

      Thank you!

    • @chrisho6836
      @chrisho6836 Před 3 měsíci

      ​@@lindseykuperwithasharpie I think the original question should be answered by pointing to the partitioning of data (and therefore keys).
      "The coordinator is in charge of the replication of the data items that fall within its range. In addition to locally storing each key within its range, the coordinator replicates these keys at the N-1 clockwise successor nodes in the ring."
      It actually only ever makes sense that two replicas in charge of the same range compare Merkle trees.
      The more interesting conversation that seems to have been omitted from the paper is, with Dynamo's choice of eventual consistency, how do we ensure comparing 2 Merkle trees that represent the exact same point-in-time? Or perhaps always expect discrepancies and accept the redundancy of anti-entropy recovery (triggered by Merkle tree discrepancies) and normal replication.

  • @yoshitoki23
    @yoshitoki23 Před 2 lety

    Could you share your tips for how to read CS papers?

  • @ashokbyahatti
    @ashokbyahatti Před 2 lety

    Thanks a lot for sharing these awesome lectures! I have one question - Is quorum consistency with R=2,W=2 any better than R=1,W=3 in in terms of consistency when there are concurrent writes. From what I understand, we could have replicas disagreeing with each other in both the scenarios. Please confirm if my understanding is correct.

    • @lindseykuperwithasharpie
      @lindseykuperwithasharpie  Před 2 lety

      Are we talking about a setting where N=3? I'm not sure why writing to 2 would be "better in terms of consistency" than writing to 3.

    • @chrisho6836
      @chrisho6836 Před 3 měsíci

      What may have been confusing was that the lecture had W going from N (3) to N-1 (2) as an answer to solving the consistency problem (with concurrent writes). To answer your question, no. Adjusting R and W for all R+W>N does not affect our consistency nor does it address the concurrent write problem. The dynamo paper mentions the utilization of vector clocks and maintaining all values that are causally unrelated (for manual conflict resolution).

  • @thachnnguyen
    @thachnnguyen Před 4 měsíci

    It's very unsettling that the concurrent write scenario (x=3 and x=4) is not satisfactorily addressed. What is the resolution of what x is? LWW based on what? If R1 goes by x=3, x=4 sequence, it records x=4, whereas the other node records x=3. How is that resolved? Merkle tree tells a node the diff, then what? Which node should make the change so that the system is consistent (though not necessarily correct). How does the banking system resolve this issue?

    • @lindseykuperwithasharpie
      @lindseykuperwithasharpie  Před 4 měsíci

      If the updates are concurrent, are a few options that Dynamo-like systems support. You could store both concurrent updates and let the client sort it out. Or you could break the tie with physical timestamps, meaning that someone will lose their write. If neither of these options is workable for a given application, then a Dynamo-like system may be the wrong choice for that application.

    • @chrisho6836
      @chrisho6836 Před 3 měsíci

      The dynamo paper mentions the utilization of vector clocks and maintaining all values that are causally unrelated - to be resolved manually. For the DynamoDB offering: "If applications update the same item in different Regions at about the same time, conflicts can arise. To help ensure eventual consistency, DynamoDB global tables use a last writer wins reconciliation between concurrent updates, in which DynamoDB makes a best effort to determine the last writer. This is performed at the item level. With this conflict resolution mechanism, all the replicas will agree on the latest update and converge toward a state in which they all have identical data."
      Note that DynamoDB, undoubtedly named after the original Dynamo service, is not the exact same service - but I thought it would be helpful to include how DynamoDB (what the public gets to use) resolves this.