Distributed Systems 5.2: Quorums

Sdílet
Vložit
  • čas přidán 27. 07. 2024
  • Accompanying lecture notes: www.cl.cam.ac.uk/teaching/212...
    Full lecture series: • Distributed Systems le...
    This video is part of an 8-lecture series on distributed systems, given as part of the undergraduate computer science course at the University of Cambridge. It is preceded by an 8-lecture course on concurrent systems for which videos are not publicly available, but slides can be found on the course web page: www.cl.cam.ac.uk/teaching/212...

Komentáře • 47

  • @elvis0305
    @elvis0305 Před 2 lety +20

    absolutely gold, got to understand the whole quorum thing in less than 10 minutes ! thanks Martin!

  • @dwyanewan3087
    @dwyanewan3087 Před 3 lety +10

    This is the author of the famous book Designing Data-Intensive Applications (DDIA)!

  • @andrewsouthpaw
    @andrewsouthpaw Před 3 lety +17

    7:18 that diagram is so simple yet helped me understand the r + w > n formula *way* better. 👏

  • @zhou7yuan
    @zhou7yuan Před 2 lety +12

    Probability of faults
    Read-after-write consistency [2:06]
    Quorum (2 out of 3) [3:58]
    Read and write quorums [5:41]
    r + w > n
    Read repair [8:19]

  • @RKelly694
    @RKelly694 Před 5 měsíci

    Your are such a legend, thank you so much for this video. I went through my lecture notes continuously, but none of it even remotely came close to how well you explained it. Honestly, couldn't thank you enough and that example was brilliant.

  • @debasishmallick4639
    @debasishmallick4639 Před 3 lety +1

    Absolutely pleasure watching it. Nice way of presenting it. Thank you Martin !!

  • @xxbighotshotxx
    @xxbighotshotxx Před 2 lety +1

    Much appreciated. Thank you! You made this incredibly simple to understand

  • @Endredos
    @Endredos Před rokem

    Pleasantly clear and simple explanation. Thank you!

  • @nasirahatiq3381
    @nasirahatiq3381 Před 2 lety +1

    Beautiful explanation, thank you so much!

  • @thomasenblom9579
    @thomasenblom9579 Před 2 lety

    Beautiful and simple explanation!

  • @yuvalboker4892
    @yuvalboker4892 Před 10 měsíci

    Perfect! We love you Martin!

  • @sarkersunzidmahmud2875

    thanks a lot, Martin for the clear explanation.

  • @Best-Regions
    @Best-Regions Před 9 měsíci

    Excellent explanation

  • @mrmaker101
    @mrmaker101 Před 2 lety +1

    very good and well defined explanation. Thanks!

  • @leminhquan7598
    @leminhquan7598 Před 6 měsíci

    The explanation is so good!

  • @L4ller0
    @L4ller0 Před 2 lety

    Thank you so much for the video, I found it very helpful in understanding the Sloppy Quorum!

  • @hankim1664
    @hankim1664 Před 2 lety

    Very good explanation! 🙌

  • @jacobwhitaker9477
    @jacobwhitaker9477 Před 3 lety +1

    great explanation thank you

  • @mortex9947
    @mortex9947 Před rokem

    Thank you. Very good explained!

  • @netrabhatta
    @netrabhatta Před 3 lety

    Thank you, Professor

  • @welltv1142
    @welltv1142 Před rokem

    Vielen Dank fuer die tolle Erklaerung!

  • @AnantChowdhary
    @AnantChowdhary Před rokem

    Great explanation

  • @manojsai6571
    @manojsai6571 Před rokem

    Thanks Martin!! 🙏🏼

  • @sujeetmailer
    @sujeetmailer Před 2 lety

    Well done !

  • @AnuragKumar-mw6es
    @AnuragKumar-mw6es Před rokem

    great content !! thanks a lot man!!

  • @srikrishna6595
    @srikrishna6595 Před 2 měsíci

    bro you rocked it

  • @SJM66
    @SJM66 Před 2 lety

    Good job Martin, thanks for shedding light on this topic!

  • @m.e10150
    @m.e10150 Před rokem

    thank you

  • @omartarek48
    @omartarek48 Před 3 lety

    Great

  • @abcdef-fo1tf
    @abcdef-fo1tf Před rokem

    at 4:30 what if one of the acknowledgement also fails. We could have another node read the responses and use it, but the initial write doesn't know it's successful, would it write again? it could potentially overwrite the other nodes' work if it does that

  • @AdityaPriyam
    @AdityaPriyam Před rokem

  • @ronakjacob6133
    @ronakjacob6133 Před 2 lety +1

    dude i died, but thnx for explaining i might pass my exams now

  • @nealdi5352
    @nealdi5352 Před 2 lety

    Question: what if when the quorum write 1 out of 3 succeeded, but 2 failed to write, how can we rollback the one that write ok? Because we already return the client that write if failed.

    • @murali1790able
      @murali1790able Před 2 lety

      good question, can someone ans this question?

    • @kleppmann
      @kleppmann  Před 2 lety +6

      In the simple case, there is no rollback - you end up with an incomplete write. If a write succeeds, you know that the value is stored on a quorum of replicas, but if a write fails, you don't know how many replicas it's stored on (it could be one or even more). If this is unacceptable, and you want to ensure that a write is either written to several replicas or not written at all, you need a more sophisticated protocol such as 2-phase commit, which is discussed in lecture 7.1 of this series.

    • @cobrab1978
      @cobrab1978 Před rokem

      @@kleppmann If there is no rollback and for example I have 3 replicas, when reading from 2 replicas I can get an erroneous version (the one that did not reach the write quorum) and assume it is correct. In the case of using 2PC, is it possible to use a version without a coordinator to keep the system interacting directly between clients and replicas?

  • @iyadelwy1500
    @iyadelwy1500 Před 2 lety +2

    Thank you.
    - 10 egyptian procrastinators the night before the exam

  • @adityadeo7043
    @adityadeo7043 Před 2 lety

    How following problems are solved in replication?
    Question 1. Suppose there are 3 nodes.
    Variable 'x' has initial value of 0 on all the nodes.
    Client triggers operation 'x = 1'.
    The operation is successfull on node 1 but fails on nodes 2 and 3 (due to unrelated issue e.g. broken network).
    Hence the update operation as a whole has failed but the data on node 1 has changed.
    This will lead to inconsist reads - If the quorum includes node 1 then the read(x) will return 1, otherwise read(x) will return 0.


    Question 2. In case of multiple variables that form an invariant, one variable might be updated successfully while the other one encounters an issue.

    • @egasimov
      @egasimov Před rokem

      Answer to Question 1:
      In leaderless replications - just use Quorums for reading and writing.
      formula: r + w > n
      r - denotes the minimum number of replicas needed when a client makes a “read request” from n replicas(parallel)
      and waits for at least r replicas successfully responds.
      w - denotes the minimum number of replicas needed when a client makes a “write request” to n replicas(parallel)
      and waits for at least w replicas successfully responds.
      Let's take r=w=2
      n = 3
      w = 2
      r = 2
      “Client triggers operation 'x = 1'.
      The operation is successful on node 1 but fails on nodes 2 and 3 (due to unrelated issues e.g. broken network).”

      From the above statement, the successful write request count is 1 (on node 1), so 1 is not greater or equal to w(which is 2).
      According to the write quorum number(w), at least 2 nodes where the write operation is successfully executed are required. So the client's write request will return an error.
      General rule:
      'If fewer than the required w or r nodes are available, writes or reads return an error. A node could be unavailable for many reasons: because the node is down (crashed, powered down), due to an error executing the operation (can’t write because the disk is full), due to a network interruption between the client and the node, or for any number of other reasons. We only care whether the node returned a successful response and don’t need to distinguish between different kinds of fault.'

  • @good-vibes-inc
    @good-vibes-inc Před 3 lety

    Are there any real world examples of such data replication exist? It multiplies number of your datastores that you should pay for :c

    • @4am4i
      @4am4i Před 3 lety +3

      Cassandra, for example. My guess is that any distributed database with leaderless replication

    • @TruongHoang-du9if
      @TruongHoang-du9if Před 3 lety

      @@4am4i exactly, Cassandra is a typical example. Why did you call it leaderless replication? By that you mean multiple-leader distributed database?

    • @skdangi
      @skdangi Před 3 lety +2

      @@TruongHoang-du9if Leaderless means write can be done on any replica

  • @anrikezeroti4680
    @anrikezeroti4680 Před 2 lety

    Quorum is Trustful therefore incompatible with public blockchain right.
    I came up with that conclusion open to discussions.

    • @lfm3585
      @lfm3585 Před 2 lety +1

      Yes if you have a bad actor as a node in these examples it can take out the entire system, by generating bad timestamps for example.

    • @kleppmann
      @kleppmann  Před 2 lety +4

      Some blockchains are built on the idea of quorums. In order to tolerate some nodes being malicious, they typically use bigger quorums - for example, they might require acknowledgements from more than two thirds of replicas, rather than more than half.

  • @tranminhhaifet
    @tranminhhaifet Před rokem

    thank you