WTF is Hadoop? | Systems Design Interview 0 to 1 with Ex-Google SWE

  • Added July 24, 2024
  • Imagine that it took these bums like a decade to add zookeeper into this, scrubs!
  • Science & Technology

Comments • 16

  • @royarijit998 1 year ago +5

    Great video!
    There's so much to learn from your videos, thanks for all the hard work :)

  • @sudarshanprajapati1339 4 months ago +2

    Finally a YT channel that goes in depth about system design topics.

  • @AntonioMac3301 11 months ago +2

    Yooo these videos are goated

  • @user-kf1ul3uh2w 4 months ago +1

    Awesome video! I have a quick question on the replication pipeline, though. You mentioned that if the write from A to B fails, the client could choose to ignore it and let the NameNode handle the replication. What if the write to A itself fails? Either way the client doesn't get an ack, but the second case definitely can't be ignored, since the NameNode can't create a replica without the original data. How does the client tell the difference?

    • @jordanhasnolife5163 4 months ago +1

      I guess generally, as a client, we should assume that no ack = no write and retry the write until we get one.
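
      The "no ack = no write" rule could be sketched roughly as below. This is only an illustration, not the HDFS client API: `FlakyDataNode`, `write_until_acked`, and `max_attempts` are hypothetical names, and the sketch assumes that rewriting the same block ID is idempotent so a retry after a lost ack does no harm.

```python
class FlakyDataNode:
    """Hypothetical stand-in for the first DataNode in the pipeline."""

    def __init__(self, failures_before_success):
        self.failures_left = failures_before_success
        self.blocks = {}

    def write(self, block_id, data):
        """Return True as an ack, or False when the write (or its ack) is lost."""
        if self.failures_left > 0:
            self.failures_left -= 1
            return False
        # Assumption: writing the same block ID again simply overwrites it.
        self.blocks[block_id] = data
        return True


def write_until_acked(node, block_id, data, max_attempts=5):
    """Treat a missing ack as a failed write and retry up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        if node.write(block_id, data):
            return attempt  # number of attempts it took to get an ack
    raise TimeoutError(f"no ack for {block_id} after {max_attempts} attempts")


node = FlakyDataNode(failures_before_success=2)
attempts = write_until_acked(node, "blk_1", b"payload")
print(attempts, node.blocks)
```

      The client never needs to know which hop failed: it just keeps retrying until an ack arrives or it gives up.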

  • @zachlandes5718 10 months ago

    If you know the client is closest to a certain data center, can you manually or automatically have your two replicas assigned to that data center? In the case where you have multiple clients in different zones, would writes be slower on average because you can't guarantee the secondary replica is in the same zone as the client and the primary?

    • @jordanhasnolife5163 10 months ago +1

      It's less that they're in different zones; ideally you have two replicas on the same rack and a third on a different rack in the same datacenter, or something like that.
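
      A minimal sketch of that placement idea, assuming a simple map of rack name to node names. `place_replicas` and the rack and node names are made up for illustration; this is not HDFS's actual block placement policy.

```python
def place_replicas(nodes_by_rack, local_rack):
    """Pick two nodes on the local rack and a third on a different rack.

    nodes_by_rack is a hypothetical mapping of rack name -> list of node names,
    all assumed to live in the same datacenter.
    """
    local_nodes = nodes_by_rack[local_rack]
    if len(local_nodes) < 2:
        raise ValueError("need at least two nodes on the local rack")

    # Any other non-empty rack works; sort so the choice is deterministic.
    remote_racks = sorted(
        rack for rack, nodes in nodes_by_rack.items()
        if rack != local_rack and nodes
    )
    if not remote_racks:
        raise ValueError("need a second rack with at least one node")

    remote_node = nodes_by_rack[remote_racks[0]][0]
    return [local_nodes[0], local_nodes[1], remote_node]


racks = {"rack1": ["dn1", "dn2"], "rack2": ["dn3", "dn4"]}
print(place_replicas(racks, "rack1"))
```

      Keeping two copies on one rack keeps most of the pipeline traffic local, while the third copy on another rack survives the loss of a whole rack.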

  • @PavanBommana19-25 4 months ago +1

    What is the difference between the replication pipeline in Hadoop and chain replication? Both look very similar. Chain replication guarantees strong consistency, so this must also offer strong consistency, right? Is the difference in the way reads happen in the two cases? That is, reads can be served by any DataNode in Hadoop, but in chain replication they are served only by the tail node.

  • @paaaaiiinnnn 1 year ago +2

    I subscribed :)

  • @RalphMartinBido 1 year ago +1

    nah u not ugly bro. U look like Miles Teller

  • @nithinsastrytellapuri291 4 months ago +1

    A small suggestion: please avoid speaking while panning around or zooming in/out on the iPad. It's distracting.