Map Reduce Paper - Distributed data processing

  • Added on 27. 07. 2024
  • The paper that inspired Hadoop. This video explains the MapReduce concepts used for distributed big data processing.
    This video takes some liberties to explain the underlying concept as simply as possible. For example, the map step for the song count is typically implemented by emitting the number 1 for each song title; a combiner function then aggregates (sums) these counts locally per song.
    The video also leaves out many interesting implementation details. I encourage you to read the paper for them.
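
The song-count flow described above (emit 1 per title, combine locally, then reduce) can be sketched in a few lines. This is a single-process illustration only, not how Hadoop or the paper implements it; the function names and the sample song titles are made up for the example.

```python
from collections import Counter, defaultdict

def map_song_plays(chunk):
    """Map: emit (title, 1) for every play record in one input chunk."""
    for title in chunk:
        yield (title, 1)

def combine(pairs):
    """Combiner: sum the counts locally, per title, before anything is shuffled."""
    local = Counter()
    for title, count in pairs:
        local[title] += count
    return list(local.items())

def reduce_counts(title, counts):
    """Reduce: sum the partial counts received for one title."""
    return (title, sum(counts))

def song_count(chunks):
    # Shuffle: group the combined (title, partial_count) pairs by title.
    grouped = defaultdict(list)
    for chunk in chunks:
        for title, partial in combine(map_song_plays(chunk)):
            grouped[title].append(partial)
    return dict(reduce_counts(t, c) for t, c in grouped.items())

# Hypothetical sample data: each inner list stands for one input chunk.
chunks = [["Hey Jude", "Yesterday", "Hey Jude"], ["Yesterday", "Hey Jude"]]
print(song_count(chunks))  # {'Hey Jude': 3, 'Yesterday': 2}
```

The combiner does not change the result; it only reduces how many pairs leave each mapper (here 2 pairs instead of 3 for the first chunk).
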
    Thanks for watching.
    Channel
    ----------------------------------
    Complex concepts explained in a short & simple manner. Topics include Java Concurrency, Spring Boot, Microservices, Distributed Systems etc. Feel free to ask any questions in the comments. Also happy to take requests for new videos.
    Subscribe or explore the channel - / defogtech
    New video added every weekend.
    Popular Videos
    ----------------------------------
    What is an API Gateway - • What is an API Gateway?
    Executor Service - • Java ExecutorService -...
    Introduction to CompletableFuture - • Introduction to Comple...
    Java Memory Model in 10 minutes - • Java Memory Model in 1...
    Volatile vs Atomic - • Using volatile vs Atom...
    What is Spring Webflux - • What is Spring Webflux...
    Java Concurrency Interview question - • Java Concurrency Inter...

Comments • 29

  • @umessi10 4 years ago +33

    It's incredible how you compress a complex paper that can take days or even weeks to fully grasp into a ten minute video. You are an amazing teacher. Props to your animation that is on point.

  • @TheModernPolymath 1 year ago +4

    I can't even begin to explain the level of clarity I achieved after watching this video!! Thanks a lot, sir! Please keep posting more videos, they are very helpful for students like us :)

  • @jigyasarathod6194 3 years ago +3

    Really very well explained in a very short amount of time! Much appreciated

  • @monishchhadwa777 6 months ago

    You are an excellent teacher!
    Please keep making more such videos.

  • @manukhandelwal8878 5 years ago +1

    I highly appreciate the work you do. Keep up the great work

  • @talirabetti8066 5 years ago +1

    Thank you for the video! Very clear explanation. I especially liked the examples part.

  • @chinmaykajalwa 1 year ago

    The best explanation and pictorial representation of MapReduce I have come across. I saved this playlist. It is very good and useful.

  • @MrGreen-kq4ds 4 years ago +5

    Thank you! Can't wait for the Bigtable design review.
    Please do a ZooKeeper / etcd one.

  • @tarunbhatia8652 4 years ago

    One of the best explanations you can find on the internet! Please make a video on HDFS.

  • @aristonchen8782 3 years ago

    The best video explaining this concept I have ever seen. Thanks :)

  • @trysubbu100 2 years ago

    Awesome and crystal clear explanation. Such a big topic condensed into a 10-minute video. Kudos to your work.

  • @sanchitsingh7089 4 years ago

    Dude, this was an amazing explanation!!

  • @ashokrajur09 2 years ago

    Short and crisp explanation, thank you.

  • @SatyaprasadMr 3 years ago +1

    After a long time I have found excellent videos. May I request that you create videos or a playlist on Kafka, Cassandra and AWS Cloud? I find them very tricky and hard to understand. Thanks for making awesome videos.

  • @glsruthi6522 3 years ago

    Thanks for such an awesome explanation. Keep doing the great work 😁👏

  • @sugyansahu9120 5 years ago +1

    Very good that you are also covering the latest technologies like the Hadoop ecosystem. Expecting more things like these. 🙂

  • @jennybolena2341 4 years ago

    Great explanation!!

  • @feniljagani6150 3 years ago

    Excellent!!

  • @yashwantdhole1228 1 year ago

    Excellent explanation....👍👍👍

  • @HarmonicQuest 2 years ago

    That was really good!!!

  • @ruhinapatel6530 2 years ago

    You are brilliant

  • @siddheshswnt 4 years ago +2

    Need the Google Bigtable video, as you promised in the GFS video.

  • @songzhu1085 2 years ago

    Good

  • @pulkitbajpai01 5 years ago

    I have certain questions related to Java memory management and out-of-memory errors. Where can I send them?

  • @Anotender 4 years ago

    Really good explanation! However, I have one question. I may have missed something, but how exactly does it deal with chunks replicated over a couple of nodes? There may be a case where we use some data twice, which could affect the result.

    • @architsaxena3792 3 years ago +1

      I think that's why the client informs the master, right? I mean, the master has all the info about where the chunks are duplicated, so it can avoid duplicate servers.

    • @user-em9mw9ch3y 2 years ago +1

      Operations are run on only one of the 3 replicas (remember that out of the 3 servers, 1 is primary and the others are secondary). If the primary fails, the GFS master sends the operation (the map function) to a secondary replica, keeping the data and the final result on the same server.
      My humble answer. Corrections are welcome.
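
On the replication question in this thread: in the MapReduce paper, the master schedules one map task per input split, preferably on a machine that holds a replica of that split, and re-executes the task on another worker if the first one fails. The sketch below illustrates only that idea. The worker names, the data layout, and the "pick the first live replica holder" policy are assumptions made for the example, not the paper's actual scheduler.

```python
def assign_map_tasks(split_replicas, failed=frozenset()):
    """Return {split: worker}, choosing exactly one live replica holder per split.

    split_replicas: {split_id: [workers holding a replica]} (hypothetical layout)
    failed: workers currently considered dead
    """
    assignment = {}
    for split, workers in split_replicas.items():
        live = [w for w in workers if w not in failed]
        if not live:
            raise RuntimeError(f"no live replica for split {split}")
        # Replication only widens the choice of where the task can run;
        # the split is still processed once, so its records are counted once.
        assignment[split] = live[0]
    return assignment

replicas = {"split-0": ["w1", "w2", "w3"], "split-1": ["w2", "w3", "w4"]}
print(assign_map_tasks(replicas))                 # {'split-0': 'w1', 'split-1': 'w2'}
print(assign_map_tasks(replicas, failed={"w1"}))  # {'split-0': 'w2', 'split-1': 'w2'}
```

Because tasks are keyed by split rather than by replica, three copies of a chunk still yield one map task and one set of emitted pairs.
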

  • @PratapSingh-dz9tj 1 year ago

    Can't we get the read/write frequency count from the GFS master's log files themselves, which are stored remotely, since the master keeps a read/write log for files? I am just learning, so I might have understood this wrongly.

    • @DefogTech 1 year ago

      GFS's responsibility is to act as a massive hard disk; it has no understanding of what is written in the files. If you check the GFS video, clients store data directly on individual machines, and the GFS master is not aware of what is being written.