Database Internals: Two Phase Locking | Systems Design Interview: 0 to 1 with Ex-Google SWE

Sdílet
Vložit
  • čas přidán 24. 02. 2023
  • Ex-Google L3 just doesn't sound as good as Ex-Google tech lead does it
  • Věda a technologie

Komentáře • 26

  • @vankram1552
    @vankram1552 Před rokem +9

    The Tech Lead arc has begun.

  • @roywastaken
    @roywastaken Před rokem +14

    just got an offer at Microsoft thanks to your channel!

  • @JoseEGuerra
    @JoseEGuerra Před 2 měsíci +1

    really good video!

  • @StellasAdi18
    @StellasAdi18 Před rokem +2

    Eng from Meta and big cudos to all your videos!! Just diving a step in detail, when we create 3rd entry for flexibility class and my friend also tries to create 3rd entry for class; how do phantoms avoid 4th entry. These rows may be different as they never existed?

    • @jordanhasnolife5163
      @jordanhasnolife5163  Před rokem +1

      Well the phantoms are what we are trying to avoid! The point is to stop each other from creating conflicting rows that don't yet exist, we need to grab predicate locks while trying to create the rows so that our queries will both grab the same predicate locks and only one of us could create the row

    • @shineinusa
      @shineinusa Před 25 dny +1

      If a new row matches the predicate conditions and if the predicate is already locked, then your friend can’t add a new row

  • @ronykrell9170
    @ronykrell9170 Před 3 měsíci +1

    Hey , not sure I follow the deadlock example. Wouldn't you release Snoop Dog's lock after reading his cart before writing to yours (and vice versa)?

    • @jordanhasnolife5163
      @jordanhasnolife5163  Před 3 měsíci +1

      So actually we can't release that lock until the transaction is completely finished, or else Snoop Dog's row could change before I complete my write and we have a stale read on my end.

  • @msebrahim-007
    @msebrahim-007 Před měsícem +1

    So the solutions to solve the Phantom issue is to the either locks more rows than necessary or lock all rows that satisfy a predicate. I'm having trouble understanding how either of these solutions of locking rows prevents new rows from being created?
    When we say "lock the rows" are we using a reader-lock or perhaps some other special lock?
    If it's a reader-lock, I don't see how that prevents another transaction from writing the same row.

    • @jordanhasnolife5163
      @jordanhasnolife5163  Před měsícem +1

      How can I write a new row if the prerequisite to this is being able to grab a lock that some other transaction has already grabbed? It's any old lock. You need to grab it in order to perform the write.

    • @bharatahuja_
      @bharatahuja_ Před měsícem +1

      @@jordanhasnolife5163I had same doubts so when you say it need grab the lock means an exclusive lock here right?

    • @jordanhasnolife5163
      @jordanhasnolife5163  Před měsícem

      @@bharatahuja_ Reader writer

  • @slover4384
    @slover4384 Před měsícem +2

    Good content.
    You are conflating phantom reads with read-modify-write.
    Phantom reads are not about writes. They are about predicates in where clause changing mid-transaction because another transaction committed.
    Read-modify-writes (e.g. lost updates) are a sub-example of write skew.
    Write skew is strictly easier to solve with a lower level of isolation than phantom reads.

    • @jordanhasnolife5163
      @jordanhasnolife5163  Před měsícem

      Oops - probably forgot what I was talking about by the time I was making this video lol, thanks for the clarification.

    • @LegitGamer2345
      @LegitGamer2345 Před 14 dny +1

      gpt says that phantom reads is easier than write skew:
      let’s re-examine the concepts of write skew and phantom reads, and the isolation levels required to address them, to ensure clarity and accuracy.
      Phantom Reads
      Phantom reads occur when a transaction reads a set of rows that satisfy a condition, and another transaction modifies the set of rows (by inserting or deleting rows) such that the first transaction sees different rows if it re-executes the query.
      Repeatable Read: This isolation level ensures that if a transaction reads a row, it will see the same data if it reads that row again. However, it does not prevent phantom rows because new rows can still be inserted or deleted by other transactions.
      Serializable: This isolation level ensures complete isolation by making sure that transactions are executed in such a way that they appear to be run sequentially. It prevents phantom reads by locking the entire range of rows that the transaction queries, ensuring that no other transaction can insert or delete rows in the range.
      Write Skew
      Write skew occurs when two transactions read overlapping sets of rows and then perform updates based on what they read, potentially leading to inconsistent results.
      Repeatable Read: This level does not prevent write skew because it only ensures consistent reads for individual rows, not for sets of rows.
      Serializable: This level ensures that transactions are executed in a way that they do not conflict, effectively preventing write skew. Serializable isolation makes sure that any read-modify-write operations are coordinated so that the results are consistent.
      Which is Easier to Solve with Lower Levels of Isolation?
      Phantom Reads: These can be partially mitigated with Repeatable Read or Snapshot Isolation. However, full prevention of phantom reads typically requires Serializable isolation because it ensures that no new rows can be added or existing rows deleted within the range of a query during the transaction.
      Write Skew: Preventing write skew effectively requires Serializable isolation because it ensures that transactions are serialized, preventing conflicts from overlapping read and write sets.
      Conclusion
      Upon careful re-evaluation, it is clear that:
      Phantom reads can be mitigated to some extent with lower isolation levels like Repeatable Read or Snapshot Isolation, but complete prevention typically requires Serializable isolation.
      Write skew is more challenging to solve with lower isolation levels and generally requires Serializable isolation to ensure that transactions do not produce conflicting updates.
      Thus, while both anomalies are best addressed by Serializable isolation, phantom reads can be more easily mitigated with lower levels of isolation compared to write skew. This makes phantom reads slightly easier to handle with lower isolation levels than write skew.

  • @user-ee7oi8qv7f
    @user-ee7oi8qv7f Před měsícem +1

    Does the Index range locking also check whether every new entry matches the predicate while the transaction defining the predicate is in process?

  • @brandonwl
    @brandonwl Před 10 měsíci +2

    Is going this in depth for databases really needed? I thought most of these concepts are already handled by cloud providers?

    • @jordanhasnolife5163
      @jordanhasnolife5163  Před 10 měsíci +6

      If you're planning on passing systems design interviews and doing well in your job I'd posit that the answer is yes

  • @Gerald-iz7mv
    @Gerald-iz7mv Před 7 měsíci +1

    is Two-Phase Locking is a specific protocol that implements pessimistic locking?

    • @jordanhasnolife5163
      @jordanhasnolife5163  Před 7 měsíci

      I would say it's an algorithm that achieves serializability. Not sure I'd say it "implements" pessimistic locking since that's not like a protocol or interface, but it definitely falls under the category of pessimistic locking