Eventual Consistency vs. Strong Consistency | How to decide between the two in System Design

  • Added on 27. 07. 2024
  • This video discusses the reasons why we prefer eventual consistency over strong consistency in certain scenarios, walking through several real-life examples (TinyURL, Twitter likes, Amazon inventory, and Dropbox).
    00:00 - Introduction
    01:00 - Criteria for using eventual consistency vs strong consistency
    01:50 - Strong consistency in TinyURL Datastore vs eventual consistency
    03:38 - Eventual Consistency in calculating likes count on a Tweet in Twitter Design
    13:00 - Eventual Consistency vs Strong Consistency for Inventory count in Amazon Marketplace design
    16:00 - Eventual Consistency vs Strong Consistency in Dropbox Design
    20:05 - How to Ace your System Design Interview for Senior/Principal Software Engineering Positions
    Distributed System Design Interviews Bible | Best online resource for System Design Interview Preparation is now online. Please visit: www.thinksoftwarelearning.com?CZcams-eventual-consistency
    Please follow me at / think.software.community if you would like to get notified about new course chapters being added or any other updates. I will also take your suggestions about the course and the channel there.
    Check out our following articles:
    - How to Ace Object-Oriented Design Interviews: / how-to-ace-object-orie...
    - Elevator System Design - A tricky technical interview question: / elevator-system-design...
    - System Design of URL Shortening Service like TinyURL: / tinyurl-design-from-th...
    - File Sharing Service Like Dropbox Or Google Drive - How To Tackle System Design Interview: / how-to-tackle-system-d...
    - Design Twitter - Microservices Architecture of Twitter Service: / design-twitter-microse...
    - How to Effectively Use Mock Interviews to Prepare for FAANG Software Engineering Interviews: / how-to-effectively-use...
    - Robinhood Backend System Design - How to receive realtime stock updates: / robinhood-backend-syst...
    - Selecting the best database for your service: / selecting-the-best-dat...
    #SystemDesign #DistributedSystems #FAANG #Facebook #Google #Amazon #Apple #Microsoft #Uber #Netflix #Oracle #Lyft #Interview #ComputerProgramming

Comments • 56

  • @ThinkSoftware  3 years ago  +6

    Please let me know in the comments below if you find it useful. Also please do not forget to like this video and subscribe to the channel.

    • @srini9  3 years ago

      Almost all (90%) of your course content is available in the public domain. I don't understand why someone would want to buy your course.

    • @ThinkSoftware  3 years ago  +7

      Good question. The same applies to any book or course on any subject, so why would anyone buy a book or course 🤔. We will just say: don't judge a book by its cover; you need to read it before making a judgment. Sometimes the same information is presented in a better way, together with extra information that you may not get in a single place elsewhere. Also, please note that all the system designs we discuss in the course are our own and hence not something available in the public domain. The course is also written based on my 15+ years of industry experience, and there are many things (tips, warnings, etc.) that I share based on that experience which you may not find on the internet easily and would have to do a lot of research for. Similarly, you won't find the mock interviews, with all their insight, freely on the internet.

    • @SpiritOfIndiaaa  2 years ago

      Amazing thanks a lot

  • @AyushJainCodingEnthusiast  3 years ago  +18

    Q: Fetch me the most underrated channel on YouTube
    A: Think Software
    I mean, man, you are a gem. Thanks for making the System Design learning process easy.

    • @ThinkSoftware  3 years ago  +1

      Thanks for the comment 🙂

    • @bramanandhareddyg6723  2 years ago  +1

      @ThinkSoftware Yes, this is one of the most underrated channels. Amazing clarity of thought.

    • @angelmotta  a year ago  +1

      I can also confirm this. Great content!! This is a new channel for me, and you have a new subscriber! Thanks for this lesson!

  • @arunavhkrishnan4932  a year ago  +3

    I have never subscribed to a channel within one minute of watching its first video. I usually do not subscribe at all, and I am yet to go to your channel to check whether you are still active and making videos.
    I can already feel that the content here is super rich.

  • @alexg475  a year ago  +2

    First, congratulations on the content. I just casually found your channel. Lots of interesting things.
    On the second scenario, I think that for a good user experience and functional correctness I'd go with eventual consistency. More specifically, the transactions accessing the count value should use the reread pattern (i.e. optimistic offline locking).
    This means: many more users (than the item inventory) could potentially attempt to purchase the specific item. However, the ones that get to the end of the purchasing process (i.e. submit the data for the last step, the payment) reread the count and then decrease it. Decreasing the count value happens first, and payment processing follows. If payment processing fails, a compensating action re-increments the count. Parallel purchasing sessions that reread and find that the actual count is zero can simply inform the user that the item is out of stock. Other user sessions that access the count value should also be prepared to handle this eventual consistency one way or another.
    I'd use the same solution for the user groups problem.
    What do you think about this approach?
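
    A minimal Python sketch of the reread-and-compensate flow described above (table name, columns, and the process_payment call are hypothetical; sqlite3 is used only to keep the example self-contained):

    import sqlite3

    def try_checkout(conn: sqlite3.Connection, product_id: int) -> bool:
        # Optimistic reread: decrement only if at least one unit is still left.
        cur = conn.execute(
            "UPDATE inventory SET count = count - 1 "
            "WHERE product_id = ? AND count > 0",
            (product_id,),
        )
        conn.commit()
        if cur.rowcount == 0:
            return False                      # out of stock: inform the user
        if not process_payment(product_id):   # hypothetical payment step
            # Compensating action: give the reserved unit back.
            conn.execute(
                "UPDATE inventory SET count = count + 1 WHERE product_id = ?",
                (product_id,),
            )
            conn.commit()
            return False
        return True

    def process_payment(product_id: int) -> bool:
        return True                           # placeholder for the real payment call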

    • @ThinkSoftware  a year ago  +1

      Thanks for the comment. That solution might work.

  • @KajalSingh-og7fk  3 years ago  +3

    Your knowledge is exceptional, and the content you provide is so to the point and clean; it's amazing.

  • @SuperPooja17  3 years ago

    awesome lecture

  • @raghavendraraikar9359  2 years ago

    Excellent food for thought

  • @bishnuagrawal828  3 years ago  +1

    On the likes count: adding a queue to manage consistency is correct up to some extent, but the problem remains the same when the consumer cannot guarantee exactly-once semantics. To maintain exactly-once semantics we need transactions (obviously the load is less compared to the previous approach).
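
    One way to get close to exactly-once on the consumer side is to make the increment idempotent, e.g. by recording each processed event id in the same transaction as the counter update. A rough Python sketch (sqlite3 syntax; table names are hypothetical, and processed_events.event_id is assumed to be a primary key):

    import sqlite3

    def apply_like_event(conn: sqlite3.Connection, event_id: str, tweet_id: int) -> None:
        # Record the event id; INSERT OR IGNORE turns a redelivered event into a no-op.
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_events(event_id) VALUES (?)",
            (event_id,),
        )
        if cur.rowcount == 1:  # first delivery of this event
            conn.execute(
                "UPDATE tweets SET likes_count = likes_count + 1 WHERE tweet_id = ?",
                (tweet_id,),
            )
        conn.commit()          # dedup record and counter update commit together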

  • @obamabinladen5055  3 years ago  +3

    The way to solve the inventory count problem is to GET the count first. If the count is below some arbitrary number (say 5, 10, etc.), we can introduce exclusive locks; otherwise we let the eventual-consistency logic run. What do you think? What is the correct way?
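
    A hedged sketch of that hybrid idea; here a conditional decrement stands in for the exclusive lock on the low-stock path, and the threshold, table, and column names are made up for illustration:

    LOW_STOCK_THRESHOLD = 10   # the "arbitrary number" mentioned above

    def checkout(conn, product_id: int) -> bool:
        (count,) = conn.execute(
            "SELECT count FROM inventory WHERE product_id = ?", (product_id,)
        ).fetchone()
        if count > LOW_STOCK_THRESHOLD:
            # Plenty of stock: a plain decrement under eventual consistency is fine.
            conn.execute(
                "UPDATE inventory SET count = count - 1 WHERE product_id = ?",
                (product_id,),
            )
            conn.commit()
            return True
        # Low stock: switch to a strongly consistent, conditional decrement.
        cur = conn.execute(
            "UPDATE inventory SET count = count - 1 "
            "WHERE product_id = ? AND count > 0",
            (product_id,),
        )
        conn.commit()
        return cur.rowcount == 1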

    • @ThinkSoftware  3 years ago  +2

      May be a topic for a future video :) Thanks for the comment.

  • @asangani888  3 years ago  +3

    Re: checkout cart. I think this is possible if we lock the entire section with a serializable transaction. Something like the below:
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    BEGIN TRANSACTION;
    -- Check inventory when the customer buys the product (i.e. submits the transaction)
    SELECT COUNT(*) FROM Product WHERE productID = 100 AND Inventory > 0;
    -- If the count returned above is non-zero, decrement the inventory
    UPDATE Product SET Inventory = Inventory - 1 WHERE productID = 100;
    COMMIT TRANSACTION;
    The above is possible in a relational DB. I am not sure whether any of the NoSQL DBs provide this functionality as well.

    • @ThinkSoftware  3 years ago  +3

      Usually I don't do this, because setting the isolation level to serializable affects write throughput due to the reduction in concurrent writes. There are other ways to achieve this.

  • @adsd7349  2 years ago

    good explanation

  • @mnkhere  3 years ago  +1

    Since the problem we face is with the update operation on the tweet likes, what we can do is separate it out into another table and keep multiple copies of the same tweet. We can be clever about the association of these rows, but let's say that if the tweet id was 100 we could have 100_1 .. 100_10. Now we randomly choose one of these new tweet ids in this table and update that record. When we have to return the total number of likes we simply do a sum operation, which will be very quick over these rows. Thoughts?
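
    A small Python sketch of that sharded-counter idea; here the shard is a separate column, i.e. a (tweet_id, shard) composite key rather than a concatenated string id, and the table and column names are hypothetical:

    import random

    NUM_SHARDS = 10  # e.g. tweet 100 is spread across shards 0..9

    def add_like(conn, tweet_id: int) -> None:
        # Increment one randomly chosen shard so concurrent likes rarely hit the same row.
        shard = random.randrange(NUM_SHARDS)
        conn.execute(
            "UPDATE like_counts SET count = count + 1 "
            "WHERE tweet_id = ? AND shard = ?",
            (tweet_id, shard),
        )
        conn.commit()

    def total_likes(conn, tweet_id: int) -> int:
        # Cheap aggregation over at most NUM_SHARDS rows.
        (total,) = conn.execute(
            "SELECT SUM(count) FROM like_counts WHERE tweet_id = ?", (tweet_id,)
        ).fetchone()
        return total or 0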

    • @gersonadr2  a year ago

      My 2 cents: it'd be hard to index tweetid if the key is the result of string concatenation.

  • @ravindrabhatt  2 years ago  +1

    At 10:49, why don't we put all the user activities (likes in this case) in a queue and let the client read from the queue and update the database?

    • @ThinkSoftware  2 years ago

      Thanks for the comment. This is exactly what I discuss later in the video.

  • @rongrongmiao3018  2 years ago

    Amazon inventory issue: we can display a message warning the user about a low inventory count. That way, even if we use eventual consistency, there will not be a huge impact on the user experience.

  • @rameshramasamy878  3 years ago

    In the checkout cart, can we use a lock-and-release concept?

  • @aakash1763  2 years ago

    Great video. One doubt: we keep pushing likes for a particular post and a worker keeps incrementing the count, but say some celebrity has 100M followers. When each follower likes the post, before liking we have to show the likes currently present on that post, so will there be 100M calls on this table to get the count?

    • @ThinkSoftware  2 years ago

      We will use a cache in this case. It is not necessary to show the exact likes count, as discussed in the video, so even if the likes count is a bit behind the actual value it does not matter.

  • @wkleunen  2 years ago

    Rather than using a record for the likes count, you could have used a view. This would achieve strong consistency.

  • @Liusy182  3 years ago  +1

    An attempt to answer the question on ways to ensure strong consistency for the checkout cart:
    For a given inventory count, e.g. 12, the backend service generates a bucket of 12 unique tokens and maintains it in a distributed cache (e.g. Redis). For each checkout action, the service first tries to remove a token from the cache; if successful, it updates the inventory count with the count of the remaining tokens in the cache and responds with a successful checkout.
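
    A rough Python sketch of that token idea using redis-py (key names are made up); SPOP is atomic, so at most as many checkouts succeed as there are tokens:

    import uuid
    import redis

    r = redis.Redis()

    def seed_tokens(product_id: int, count: int) -> None:
        # One opaque token per unit in stock.
        tokens = [str(uuid.uuid4()) for _ in range(count)]
        r.sadd(f"stock:{product_id}", *tokens)

    def checkout(product_id: int) -> bool:
        token = r.spop(f"stock:{product_id}")          # atomically claim one unit
        if token is None:
            return False                               # sold out
        remaining = r.scard(f"stock:{product_id}")
        r.set(f"stock_count:{product_id}", remaining)  # publish the remaining count
        return True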

    • @ThinkSoftware  3 years ago  +1

      Thanks.

    • @Eli-ir8tk  3 years ago

      Do we have to use unique tokens? I think something similar to compare-and-swap should also work: just compare the count from the client request with the up-to-date count on the backend service.
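
      A minimal compare-and-swap sketch along those lines, expressed in Python as a conditional UPDATE against a hypothetical inventory table; when it returns False the caller rereads the count and retries, or reports the item as out of stock:

      def cas_decrement(conn, product_id: int, expected_count: int) -> bool:
          # Succeeds only if the stored count still equals what the client last saw.
          cur = conn.execute(
              "UPDATE inventory SET count = count - 1 "
              "WHERE product_id = ? AND count = ?",
              (product_id, expected_count),
          )
          conn.commit()
          return cur.rowcount == 1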

    • @rameshramasamy878  3 years ago

      Creating unique tokens is the biggest task. Take Amazon as an example: it has millions of products and each one has a huge quantity.

    • @Liusy182  3 years ago

      @Eli-ir8tk I like this idea of compare-and-swap, but if I understand correctly, if multiple people start with the same initial inventory (e.g. 12), only one will succeed in making the order by decrementing it to 11. The rest will fail because the backend already has an updated count of 11.

    • @Liusy182  3 years ago

      @rameshramasamy878 That's a good call. How about restricting it: 1) create unique tokens per product, which makes the uniqueness constraint much easier to satisfy; 2) only use this strategy when the inventory goes below a certain limit, say < 30, so that we do not overload the server with unnecessary token creation.

  • @vadivpp  2 years ago

    Why are you omitting the discussion of data denormalization in the first place? Why keep/maintain the likes count as a separate field if it depends on another table and can definitely be taken from a cache? Also, in the end, didn't this guy just state what you arrived at as the conclusion of your reasoning, that eventual consistency is fine? He just saved you some minutes, which may be very valuable in such a short period of time.

    • @vadivpp  2 years ago

      If you mean one is not allowed to skip this reasoning, say so explicitly; don't simply say "it's just wrong".

    • @ThinkSoftware  2 years ago  +6

      Thanks for the comment. I think I have already answered this in the video. Just one thing I will add: you don't know what type of interviewer you will get in an interview. You will find interviewers who will never tell you if you are wrong about something, and you will think you had a great interview until you get the result. So whatever answer you give, you should be able to justify it in front of the interviewer and give proper reasoning for your approach. If you say we should be OK using eventual consistency in some system, an interviewer may or may not ask you why. But what if the interviewer asks why you think eventual consistency is fine and not strong consistency? If you go check that mock interview, I actually specifically asked the candidate why he thought eventual consistency was OK, and the candidate was not able to give a good answer.
      In the interview, the interviewer is not there to evaluate whether the candidate has saved a few minutes. He is there to evaluate the candidate's system design skills: if the candidate were asked to design a system on the job, how would he decide between different design approaches, and if he settled on one approach, would he be able to justify it with good reasons?
      You should understand that I may not be able to discuss all possible solutions for this problem in a video, so I am discussing the pros and cons of one particular solution. Regarding data denormalization: suppose we don't keep the likes count in the tweet table. Then either we keep it in some other table (where we will see the same issue), or, if we don't keep it anywhere, every time a user queries the likes count we may need to run a scan query on the tweet_likes table to get the count for that particular tweet. You can give that answer as well, but then you should be prepared to discuss all the possible issues with that approach too.

    • @vadivpp  2 years ago

      @ThinkSoftware OK, got it. And no, I didn't criticize the approach per se; what I was trying to say is that you, as an instructor, could have elaborated on why this number is kept in a DB table rather than, e.g., in a cache.

    • @triallaga3  a year ago

      Agree, putting the total number of likes in a cache makes more sense to me. From a UI perspective, when the number is small the user cares about the current like count increasing by 1, which no longer matters when the number is very big and the thumbs-up icon is highlighted. In that sense, using the total from the cache, or even the number the UI loaded from the backend, is good enough. So when the user clicks like, the UI increases its local total by 1 and sends a request to the backend to add one record to the likes table, which then updates the total in the cache. So we don't need strong consistency in this case but can still satisfy the three expectations.
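
      A tiny Python sketch of that flow, with Redis standing in for the cache and hypothetical table/key names; the like is recorded durably first, then the cached total the UI reads is bumped:

      import redis

      r = redis.Redis()

      def like_tweet(conn, tweet_id: int, user_id: int) -> int:
          # Durable record of the like.
          conn.execute(
              "INSERT INTO tweet_likes(tweet_id, user_id) VALUES (?, ?)",
              (tweet_id, user_id),
          )
          conn.commit()
          # Approximate total shown to users; a slightly stale value is acceptable here.
          return r.incr(f"likes:{tweet_id}")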

  • @putinkhuilo4315  a year ago

    Dude, you seriously need to quit this. I am not able to follow your thoughts.