AWS re:Invent 2018: Amazon DynamoDB Under the Hood: How We Built a Hyper-Scale Database (DAT321)

  • Added on 27 Jul 2024
  • Come to this session to learn how Amazon DynamoDB was built as the hyper-scale database for internet-scale applications. In January 2012, Amazon launched DynamoDB, a cloud-based NoSQL database service designed from the ground up to support extreme scale, with the security, availability, performance, and manageability needed to run mission-critical workloads. This session discloses for the first time the underpinnings of DynamoDB, and how we run a fully managed nonrelational database used by more than 100,000 customers. We cover the underlying technical aspects of how an application works with DynamoDB for authentication, metadata, storage nodes, streams, backup, and global replication.

Comments • 31

  • @shawapratim
    @shawapratim 21 days ago

    Amazing. Just what I needed. Thank you for making this available here.

  • @dimkir100
    @dimkir100 5 years ago +36

    This one is an awesome talk. Jaso goes through important innards of DynamoDB, including secondary indexes, throttling, burst capacity, accumulating read capacity, adaptive capacity, etc.

  • @ahmxtb
    @ahmxtb 1 year ago +2

    I've read both Dynamo 2007 and DynamoDB 2022 papers and this is a great summary. Thanks.

    • @awssupport
      @awssupport 1 year ago

      Glad you enjoyed it, Ahmet! 😃 ^KR

  • @saumilkapadia88
    @saumilkapadia88 4 years ago +1

    very nicely described. kept on watching and watching. came back to this video multiple times when i understood more and more.

  • @CAMorales82
    @CAMorales82 2 years ago +8

    0:00 Intro
    2:39 Agenda
    - GetItem/PutItem
    - Auto Scaling
    - Backup Restore
    - Streams
    - Global Tables
    3:04 GetItem/PutItem
    - Request Router
    - Paxos
    - Partition Metadata System
    - Tables (hashing, partitioning)
    - Eventual Consistency
    - Storage Nodes (B-tree, Replication Log)
    - System Management (Auto Admin, Partition Repair)
    - Secondary Index (Log Propagator)
    - Provisioning Table Capacity
    - Adaptive Capacity
    28:48 Auto Scaling
    33:54 Backup Restore
    - Point in Time
    - On-demand backup
    41:43 Streams
    44:45 Global Tables

  • @galeop
    @galeop 1 year ago +3

    Key takeaways:
    - Storage nodes acknowledge a WRITE (on a partition) back to the request router once 2 of the 3 storage nodes for the affected partition have successfully completed the WRITE. So when you perform an "eventually consistent READ", which can land on any of the 3 nodes, the odds of getting the latest value are at least 2/3.
    - The 3 storage nodes of a partition decide among themselves which one will be the partition leader.
    - Leader nodes store the partition data in a B-tree index, and they also maintain a "replication log" (a kind of TX log).
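The quorum arithmetic in these takeaways can be illustrated with a small toy model. This is only a sketch under stated assumptions: the `Partition` class, its method names, and the choice of replica 0 as leader are hypothetical and are not DynamoDB's API or implementation. It shows a write being acknowledged once 2 of 3 replicas have applied it, and why a read sent to a random replica sees the latest value with probability at least 2/3.

```python
import random


class Partition:
    """Toy model of one partition with three replicas (all names hypothetical)."""

    def __init__(self, replica_count=3, quorum=2):
        self.replicas = [{} for _ in range(replica_count)]
        self.quorum = quorum
        self.leader = 0  # assumption: replica 0 has been elected leader

    def write(self, key, value, reachable):
        """Apply the write on the reachable replicas; acknowledge only on quorum."""
        for index in reachable:
            self.replicas[index][key] = value
        return len(reachable) >= self.quorum

    def read_strong(self, key):
        """Strongly consistent read: always served by the leader."""
        return self.replicas[self.leader].get(key)

    def read_eventual(self, key, rng=random):
        """Eventually consistent read: served by any replica, chosen at random."""
        return rng.choice(self.replicas).get(key)

    def fraction_up_to_date(self, key, value):
        """Share of replicas that currently hold `value` for `key`."""
        hits = sum(1 for replica in self.replicas if replica.get(key) == value)
        return hits / len(self.replicas)


partition = Partition()
partition.write("user#1", "v1", reachable={0, 1, 2})         # all three replicas apply v1
acked = partition.write("user#1", "v2", reachable={0, 1})    # replica 2 lags behind on v1
```

In this state the write of `v2` is acknowledged, the leader returns `v2`, and a random-replica read returns `v2` from two of the three replicas and the stale `v1` from the third.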

  • @rahulat85
    @rahulat85 3 years ago

    Good talk! Seems like an industry guest talking about DynamoDB in a classroom.

  • @VitalyZdanevich
    @VitalyZdanevich 4 years ago +11

    Thank you for dark slides.

  • @pemessh
    @pemessh 4 years ago

    Great talk.

  • @brandonhunter3036
    @brandonhunter3036 5 years ago +5

    damn that was a good talk

  • @anjalivas1111
    @anjalivas1111 3 years ago

    Nice. . . very nice.

  • @70lan7
    @70lan7 1 year ago

    Best talk!

  • @frankren1333
    @frankren1333 5 years ago

    Trying to guess the PID controller part: it seems it only needs the P (proportional) term?
    Let multiplier = 1.5
    Let's say we want to allow roughly 60 seconds of consuming 150% of provisioned capacity.
    Then on each iteration (say, once a second), do:
    multiplier = multiplier + (provisioned - consumed) / provisioned / 60 # 60 is for 60 seconds
    multiplier = min(1.5, max(1, multiplier)) # clamp it between 1 and 1.5
    This way consumption can surge to 1.5 times provisioned, and the multiplier will slowly go down to 1, then stay at 1 until consumption drops below provisioned, which slowly brings the multiplier back up to 1.5.
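The guess above can be written out as a runnable sketch. It follows the commenter's update rule only; it is not taken from the talk and is not claimed to be how DynamoDB's controller works. The function names `step_multiplier` and `multiplier_trajectory` are hypothetical. One observation, offered as an assumption rather than a correction: because the multiplier accumulates the error on every tick, the rule behaves more like an integral-style update than a pure proportional term.

```python
def step_multiplier(multiplier, provisioned, consumed,
                    window_s=60.0, low=1.0, high=1.5):
    """One tick of the commenter's rule: nudge the multiplier by the
    relative gap between provisioned and consumed, then clamp it."""
    multiplier += (provisioned - consumed) / provisioned / window_s
    return min(high, max(low, multiplier))


def multiplier_trajectory(provisioned, consumed_per_second, start=1.5):
    """Return the multiplier after each one-second tick."""
    multiplier = start
    trajectory = []
    for consumed in consumed_per_second:
        multiplier = step_multiplier(multiplier, provisioned, consumed)
        trajectory.append(multiplier)
    return trajectory


# 60 seconds at 150% of provisioned, then 60 seconds at 50% of provisioned.
surge_then_idle = [150.0] * 60 + [50.0] * 60
trajectory = multiplier_trajectory(100.0, surge_then_idle)
```

With consumption held at 150% of provisioned, each tick subtracts 0.5/60 from the multiplier, so it falls from 1.5 to the floor of 1 in about 60 ticks; at 50% it climbs back at the same rate.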

  • @Himanshu-mb8nl
    @Himanshu-mb8nl 5 months ago

    At 9:21: the client could be talking to a request router (RR) in an availability zone (AZ) that doesn't host the leader storage node, so the RR might have to forward the request to the leader in a different AZ. Nothing wrong with that, but I wonder whether there are any performance savings in having writes go directly to an RR in the AZ with the leader storage node.

  • @AbHIShEkUPAdhYaYshekup

    Perfect.

  • @galeop
    @galeop 1 year ago

    18:00 I guess we're talking about *global* secondary indexes here, and not *local* secondary indexes, correct?
    Does this propagation of a new value from the main table to the secondary index happen asynchronously?

    • @awssupport
      @awssupport 1 year ago

      Hi Galeop, I appreciate your question and have raised this with the necessary team. Please check out this doc which may provide more insight into this: go.aws/3jETlOJ You're also welcome to reach out and discuss this matter on re:Post, by using this link: go.aws/aws-repost. ^ES

  • @kevin8918
    @kevin8918 4 years ago

    the last few sentences are like "we are hiring; come and work with us" lol

    • @justmeandmy
      @justmeandmy 4 years ago +1

      Yeah, and now that you've seen the talk you're already partially onboarded :P

  • @Mike-ci5io
    @Mike-ci5io 1 year ago

    Does the replication log store the entire history of the data or just the current state?

    • @ahmxtb
      @ahmxtb 1 year ago +1

      Usually the write-ahead log only stores the portion of the data that resides in memory and has not yet been flushed to disk. Databases that use log-structured merge trees typically keep the most recent data in in-memory trees plus a write-ahead log, and periodically flush to disk files.
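The general pattern described in this reply can be sketched as follows. This is a generic illustration of a write-ahead log paired with an in-memory table, not a description of DynamoDB's replication log; `TinyStore` and all of its members are hypothetical names, and plain lists stand in for disk files.

```python
class TinyStore:
    """Toy write-ahead log plus memtable (hypothetical names throughout)."""

    def __init__(self, flush_threshold=3):
        self.wal = []        # append-only log of writes not yet flushed
        self.memtable = {}   # in-memory view of recent writes
        self.segments = []   # flushed, sorted snapshots standing in for disk files
        self.flush_threshold = flush_threshold

    def put(self, key, value):
        self.wal.append((key, value))  # log first, so the write survives a crash
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Persist the memtable as a sorted segment, then truncate the log."""
        if not self.memtable:
            return
        self.segments.append(sorted(self.memtable.items()))
        self.memtable = {}
        self.wal = []  # flushed entries no longer need to be replayed

    def recover(self):
        """Rebuild the memtable by replaying the log after a simulated crash."""
        self.memtable = {}
        for key, value in self.wal:
            self.memtable[key] = value

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for segment in reversed(self.segments):  # newest segment wins
            for segment_key, value in segment:
                if segment_key == key:
                    return value
        return None


store = TinyStore(flush_threshold=3)
store.put("a", 1)
store.put("b", 2)
```

At this point the log holds both writes. Losing the memtable and calling `recover()` restores them, and a third `put` triggers a flush that empties the log, which is why such a log holds only unflushed data rather than the full history.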

  • @Tommy-dd5pz
    @Tommy-dd5pz 2 years ago

    I think it looks like nanoseconds, but is it actually milliseconds? 50:42

  • @sagara9389
    @sagara9389 5 years ago +2

    DynamoDB is awesome for big data work, but it is really hard to work with for CRUD operations.

    • @guild_navigator
      @guild_navigator 4 years ago +2

      What are you struggling with?

    • @kulpranav
      @kulpranav 1 year ago +1

      It's just another KV data store. Why is it hard for CRUD?

  • @youran
    @youran 2 years ago

    Where can I download the PPT, please?

    • @NR-bt7yz
      @NR-bt7yz 1 year ago +1

      d1.awsstatic.com/events/reinvent/2019/Amazon_DynamoDB_Under_the_hood_of_a_hyperscale_database_DAT325.pdf (it's from 2019, but appears to be identical)

  • @galeop
    @galeop 1 year ago

    13:57

  • @arifmalikoracledba9757

    Might it be worth blowing your nose (away from the mic!) and clearing your airways before the start of your session!

  • @grousemoriarty
    @grousemoriarty 8 months ago

    What DB does scamazon use for their own shopping?