Write-ahead-logging

Sdílet
Vložit
  • čas přidán 12. 10. 2023
  • This is the first video in the database internals series. In this series I am going to talk about general patterns and concepts that are used in various databases. I will try to dig deeper and give you practical context for each.
    Stay tuned for more patterns in this series. Next set of videos will cover:
    1) BTree
    2) LSM trees
    3) MVCC
    4) Database caching
    5) Lifecycle of a SQL query
    ….
    Also don't miss out on the amazing conference NODES hosted by neo4j. It's completely FREE and gives you an opportunity to listen to various talks from experts around the world. Your location doesn't matter because there are many sessions in 24 hours across timezones.
    Go checkout and register now: neo4j.com/nodes
    If you like this video, hit the like button and share it with your network.
    You can follow me on Linkedin and Twitter for more updates:
    www.linkedin.com/in/kaivalyaa...
    x.com/thegeeknarrator?s=21
    Stay curious! Keep learning! See you in the next video.
    Cheers,
    The GeekNarrator

Komentáře • 36

  • @AtharvaRao0104
    @AtharvaRao0104 Před 8 měsíci +5

    WAL plays an important role in crash recovery and transaction recovery.
    Because WAL provides durability, it is possible for the page cache to do the updates(write operations) in memory.
    @8:28 - It is because of the page cache that buffers the updates in memory, that we get efficiency. WAL just provides durability to these in memory page cache operations. Writing to the WAL per se is not giving us efficiency but only durability in my understanding. Please let me know if I'm wrong.

    • @TheGeekNarrator
      @TheGeekNarrator  Před 8 měsíci +1

      Thanks yes you are right, I meant WAL gives us the confidence so we could only do writes in the buffer and avoid updating data files. Which indirectly means WAL enables us to gain efficiency. Does that make sense? I agree primary use case for WAL is durability and point in time recovery.
      Thanks for commenting and watching the video 🙏🏻. I really appreciate it.

  • @diegrootam
    @diegrootam Před 8 měsíci +2

    Awesome content. Really helpful to remember some important concepts. Thanks

  • @dhanushshetty7840
    @dhanushshetty7840 Před 9 měsíci +2

    It was like revising multiple db concepts I learnt in past few years. Very informative

  • @sbera87
    @sbera87 Před 3 měsíci +3

    Where is the commit log stored? I assume disk. If so, whats the cost of flushing data to disk vs appending to log on disk?

    • @TheGeekNarrator
      @TheGeekNarrator  Před 3 měsíci +1

      Yes its on the disk. Appending to log is cheaper because it is sequential and hence no random access. Flushing actual data to disk requires random access which is slower and requires tons of IO depending on what you are inserting/updating/deleting.

    • @akashgoyal2567
      @akashgoyal2567 Před 2 měsíci +1

      But log is then also prone to getting lost in case of server crashing since they are stored in in-memory before being flushed to disk . What’s the advantage of creating log then structure then ?

    • @TheGeekNarrator
      @TheGeekNarrator  Před 2 měsíci +1

      @akashgoyal2567 Typically if durability is important, you would fsync the log (Databases have config for that) which means the log is persisted to the disk. Since it is a sequential write it is way faster and more importantly when DBs have high concurrency one fsync call can be used to write 100s if not 1000s of transaction log to disk. That’s when it becomes really light weight as compared to updating data pages. In short log isn’t lost if you use the right configuration.

  • @varunvats32
    @varunvats32 Před 8 měsíci +3

    Would have been very helpful, if you could have explained the format of WAL, how it looks like, maybe an example. Overall a thumbs up

    • @TheGeekNarrator
      @TheGeekNarrator  Před 8 měsíci +5

      Thanks for the suggestion. Yes WAL structure is interesting. I can talk about that in a separate video, I want to keep the video length between 15-25 mins. Unfortunately I have do prioritise what to cover and skip. But please keep the feedback coming.

  • @yraghutube
    @yraghutube Před 9 měsíci +4

    Hi
    Nice informational video!!
    You said when we commit a transaction in DB, it writes to a log and then asynchronously it wites to the disk. So what happens, when DB receives a GET/Read query/request in-between it syncs the data from WAL to disk. ?

    • @TheGeekNarrator
      @TheGeekNarrator  Před 9 měsíci +3

      Great question. There are different implementations in different databases to handle this case. But at a high level, they do the following:
      1) they keep track of whats new in the WAL and how much the data files lag.
      2) If the read is for unmodified data then no worries, it can simply read from the data file. If not, then they try to merge WAL data, in memory state and data files to return correct result.
      3) the idea is to not grow the uncommitted WAL entries so much that it starts impacting the performance. So the background processes are(can be) tuned to checkpoint and commit sooner.
      I hope it clarifies?

    • @TheGeekNarrator
      @TheGeekNarrator  Před 9 měsíci +3

      I am going to cover this in a separate video.

    • @AtharvaRao0104
      @AtharvaRao0104 Před 8 měsíci

      @@TheGeekNarrator IMO - for the situation in this example. The page cache in memory already has the transaction updates. The WAL just provides durability for crash scenarios and enables the page cache to do its work of buffering the updates in memory instead of doing IO on disk every time. The writes to data files on disk can happen later. In the meanwhile if reads come they are always first checked in page cache. The data is synced from page cache in memory to data file pages on disk. Not from WAL to disk.

  • @AtharvaRao0104
    @AtharvaRao0104 Před 8 měsíci +1

    write ahead log - is append only log
    My understanding is - The WAL log is buffered in memory (as it does not occupy an entire disk block) but It has to be flushed to the append only log file on disk when requested by the page cache in a force operation(when log buffer fills up or txn commits).
    A transaction cannot be considered committed until the WAL log file on disk is flushed up to the commit record of the transaction.
    Probably including the log file on disk in the diagram will make it more clear..
    You correctly summarized the functionalities of WAL:
    1. Allow buffering updates in memory (instead of doing IO for every update) while ensuring durability of such changes
    2. Persists all changes until the page cache affected by these changes are flushed to disk(data records).
    3. Helps in crash recovery and transaction recovery.
    I had read about Kafka and I had also read about WAL's important role in databases.
    But it never struck me that the core ideas are very similar. Thank you for linking both of them.
    This idea of writing to a durable log and asynchronously processing the log to do a time consuming activity is so powerful.
    Your series is so interesting. May you continue to enlighten us like Martin Kleppman.

    • @TheGeekNarrator
      @TheGeekNarrator  Před 8 měsíci

      Yes thanks for pointing that out. IIRC I have talked about how fsync helps here in keeping the buffered WAL synced to disk with Seq IO, but that diagram could be clearer I agree.
      Thanks a lot for watching. I really appreciate it.

  • @vinitsunita
    @vinitsunita Před 8 měsíci +1

    Greate Video. It would be great if you can talk about format of WAL, and about the background process that is responsible for syncing WAL to data files on the disk. What is its frequence and what happens if we some concurrent read query is being fired on the table and changes are not yet synced to data files

  • @LeoLeo-nx5gi
    @LeoLeo-nx5gi Před 9 měsíci

    Hi thanks for the info.
    Had one query we are using a WAL, can you please explain to me is it like we are maintaining some kind of append only logs in a file or is that again in memory?
    Basically where is the log file stored? Coz if it's again in memory or is stored in some kind of DB we can have downtime in the WAL itself thus wanted to clarify this, thanks!!

    • @TheGeekNarrator
      @TheGeekNarrator  Před 9 měsíci

      The Log needs to be on the disk. As I have mentioned in the video it goes to the disk otherwise it’s a problem because then you can’t guarantee durability. Databases have specialised techniques to do that. In some cases log are also moved to a separate disk device such that it is safe from disk corruption.

    • @LeoLeo-nx5gi
      @LeoLeo-nx5gi Před 9 měsíci

      @@TheGeekNarrator ahh thanks for pointing out again!!!

    • @AtharvaRao0104
      @AtharvaRao0104 Před 8 měsíci

      WAL log files are append only log files on disk. But they are also buffered in memory as shown in this video for efficiency reasons. Once the log buffer is full or when the transaction is committed, this buffer is flushed to the disk in a sequential write. This info is missing in the diagram. I hope Kaivalya can find time to update the diagram to show a log file on disk so that the concept becomes clear.

  • @john_rambo_27098
    @john_rambo_27098 Před 8 měsíci +1

    As WAL is also in memory, how do ensure durability then. The node hosting WAL might crash as well.

    • @TheGeekNarrator
      @TheGeekNarrator  Před 8 měsíci

      WAL is fsynced to disk, which means it is written to disk in a durable manner. So transaction isn’t complete unless WAL is persisted.

  • @ajinkyagaikwad663
    @ajinkyagaikwad663 Před 8 měsíci +1

    High quality stuff! Love this!
    Things that I especially liked (and hope you continue this):
    - The emphasis on 'why' this is even needed
    - The emphasis on the concept as a 'pattern'
    Nit: In the first WAL slide, you show that the WAL is entirely in memory. I guess there is more detail there. I'd just call out that it is not entirely in memory.
    My noob understanding is that it is segmented and all old segments are stored in disk. The key point here is that these writes do not contribute to inefficient IO
    Keep it up! I'm looking forward to learning more

    • @TheGeekNarrator
      @TheGeekNarrator  Před 8 měsíci

      Thanks for the feedback.Good point, you are right the diagram can be a bit misleading. But I guess I have talked about how the WAL is synced with the disk and persisted. 😀 I hope that will help.

  • @zishanshaikh9375
    @zishanshaikh9375 Před 6 měsíci

    question: if the log is in memory, then won't we lose data when the machine crashes, I'm assuming the log is a data structure maintained in the same machine where the database process is running, if the machine where the db is running crashes, then isn't the log also going to be lost causing data loss ?

    • @TheGeekNarrator
      @TheGeekNarrator  Před 5 měsíci

      The log file is fsynced to disk to make sure data is durable. Data replication is used, so we can recover from complete failures such as disk corruption for example.

  • @mukuljha1738
    @mukuljha1738 Před 5 měsíci

    I have a doubt if the Log file is stored in memory then it doesn't have durability once the server crashes log file might be lost.
    @TheGeekNarrator

    • @TheGeekNarrator
      @TheGeekNarrator  Před 5 měsíci

      Yes the log file isn’t stored in memory only. It is fsynced to the disk. I guess the diagram I used is confusing people a bit. Sorry for that.

  • @joobis.b4568
    @joobis.b4568 Před 6 měsíci +1

    very well put !

  • @user-yw9us2qo6g
    @user-yw9us2qo6g Před 9 měsíci

    Nice

  • @TechieTech-gx2kd
    @TechieTech-gx2kd Před 5 měsíci +1

    LSN | Transaction ID | Operation | Table | Page | Data
    ----------------|-------------------------|------------------|------------------|-----------|---------------------
    0001A10 | 1234 | INSERT | customers | 12 | (id=1, name='John')
    0001B20 | 1234 | UPDATE | orders | 37 | (order_id=50, total=150.00)
    0001C30 | 1235 | DELETE | products | 29 | (product_id=20)
    0001D40 | 1236 | INSERT | employees | 03 | (id=10, name='Alice', role='Sales')
    ...
    This is how a WAL file looks like

    • @TheGeekNarrator
      @TheGeekNarrator  Před 5 měsíci

      Cool - what database is it? WAL format may differ based on the database.