S3 system design | cloud storage system design | Distributed cloud storage system design

  • Added 5 Jul 2024
  • In this video, let's understand the system design for
    AWS S3 system design
    Azure Blob storage system design
    Distributed cloud storage system design
    Distributed data store system design
    #s3systemdesign #cloudstoragesystemdesign #blobstoresystemdesign

Comments • 114

  • @kumarc4853
    @kumarc4853 4 years ago +32

    I interviewed a candidate recently and he mentioned your channel to me. Thank you for the good content, for teaching a lot of people, and for helping them crack system design interviews.

  • @kumarc4853
    @kumarc4853 3 years ago +8

    A friend of mine got into FB and APPLE. He found your channel (and a couple of other SD channels) very helpful in his prep.
    We can do this!
    Thank you

  • @metalalive2006
    @metalalive2006 3 years ago +28

    20:28 overview of the design with example
    * 22:04 partition layer
    * 23:40 stream layer
    * 26:34 different partition strategies
    27:34 stream layer
    * 28:06 store new file in append-only fashion
    * 29:00 seal file server that is full
    * 31:24 monitor space of all these file servers
    * 32:36 garbage collection performed on sealed file servers
    * 34:30 replication
    * 38:01 health check on the file servers
    * 40:32 block group
    45:27 partition layer
    48:56 performance improvement tips

    • @metalalive2006
      @metalalive2006 1 year ago

      At 28:06, you mentioned that spinning hard disks were a cheap, feasible hardware option for a scalable storage service like S3 and that SSDs were expensive. I am interested to know whether that is still true in 2022, since I know very little about the architecture and market for SSD storage.

  • @kunchasaikrishna
    @kunchasaikrishna 4 years ago +18

    Really, your channel's content is no less than that of any other top online education platform.
    Appreciate your content 😊 Thank you so much 🙏

  • @__abhiruchigupta__
    @__abhiruchigupta__ 4 years ago +3

    Really appreciate the level of detailed information provided in this video. Thanks a lot for your hard work and creating such awesome content !! :D

  • @gunhound45
    @gunhound45 4 years ago +9

    Just want to say that I really love watching these videos. Even if I'm not preparing for system design interviews, it's fun to do these thought exercises to design a big system.

  • @renon3359
    @renon3359 3 years ago +1

    Your channel is priceless brother, thank you.

  • @balakrishnan3725
    @balakrishnan3725 3 years ago

    Thank you Naren! Nice video. I could feel the effort you have put into creating such a video.

  • @kirankothandan5529
    @kirankothandan5529 3 years ago +1

    You are an amazing teacher, bro. I am a frontend person, but I am interested in system design because of you. The way you explain how the designs are made makes me very curious. Thanks for the big effort. Cheers 👌

  • @Siddharth42280
    @Siddharth42280 3 years ago +12

    @Tech Dummies Narendra L: Could you please make videos on a centralized logging system and a distributed job scheduler?

  • @abrarisme
    @abrarisme 3 years ago

    this was great, can't wait to see more videos!

  • @trybeingakr
    @trybeingakr 4 years ago

    Appreciate the drastic improvement in delivery style.

  • @sureshnathann8360
    @sureshnathann8360 4 years ago

    Hi Narendra, you're awesome, man! Keep posting! Keep learning!!

  • @amanpervaiz2843
    @amanpervaiz2843 2 years ago

    This channel is gold!

  • @ramakrishnanvisvanathan3378

    Really liked this comprehensive design session. Great, keep it up, and all the very best. I really appreciate the work you have done towards bringing such wonderful content to us.

  • @aneksingh4496
    @aneksingh4496 4 years ago +2

    Must say, it must have taken a lot of time for you to prepare this content. Kudos!!!

  • @forgotten225522
    @forgotten225522 3 years ago

    Most valuable information ever on your channel.

  • @ullas06
    @ullas06 4 years ago

    Thank you for your time and effort. It's very helpful.

  • @prasadg9583
    @prasadg9583 4 years ago +1

    loved it mate!! thanks ❤️

  • @mattleahy3951
    @mattleahy3951 3 years ago +1

    Great video! The only question I had is about the table you showed for the Stream Manager, where it tracked the start and stop offsets for the primary. It also had fields for the secondary and tertiary replicas, but it didn't separately track their offsets; that would need to be included as well, right? Thanks.

  • @amlanch
    @amlanch 2 years ago

    Terrific presentation! Love your videos

  • @rohitsharma-rp2jh
    @rohitsharma-rp2jh 3 years ago

    shandaar zabardast zindabaad!

  • @bhavyamishra3502
    @bhavyamishra3502 4 years ago +3

    Nice content....keep it up👍👍

  • @fendy0390
    @fendy0390 3 years ago

    Really appreciate your video here. You explain it very clearly.

  • @harishkrish14386
    @harishkrish14386 4 years ago

    Very nice videos, including your perspective on how to get jobs in Germany. Keep going, bro 👌🏻👌🏻

  • @sushantasaha9938
    @sushantasaha9938 4 years ago

    Appreciate your hard work behind it

  • @anuragagnihotri5238
    @anuragagnihotri5238 2 years ago +1

    Thanks a lot for putting in the effort and providing design details of the distributed cloud storage. I had a few questions, though:
    1. I see the Cluster Manager is a SPOF; how do we handle the CM being down?
    2. Why do we use the DNS approach to update available region routing? DNS resolution is usually cached for a few minutes or so, which will increase the downtime.
    3. How do we handle concurrent updates (not appends) to the same file from different users?

  • @boombasach
    @boombasach 2 years ago

    Really appreciate you putting up quality content. Very insightful. A couple of suggestions, though: maybe starting with the high-level user flow, which you started talking about at 21:00, would be useful. Also, I am not sure that having the API server and the Cluster Manager, two separate components, talk to one DB is a good idea.

  • @pramodsingh4668
    @pramodsingh4668 3 years ago

    This channel covers a lot of ground and is probably one of the best channels. But, and it's a big but, it takes 2-3 times more time than needed. There is a lot of duplication and unrelated content, which turns a 20-minute video into an hour-long video. For example, everything in the first 20 minutes could have been finished in just 2-3 minutes. Please keep it short and precise. Appreciate all the hard work you put in and the knowledge you are sharing. Keep going.

  • @a.yashwanth
    @a.yashwanth 4 years ago +9

    The amount of work you put into making these 50-minute-long videos is insane.

    • @kumarc4853
      @kumarc4853 3 years ago

      phenomenal work. we dont have to read books, they are for dummies :p

  • @icey3080
    @icey3080 4 years ago

    this is very useful, thank you

  • @pravaskumar7078
    @pravaskumar7078 4 years ago

    awesome...very helpful

  • @ankita8867
    @ankita8867 3 years ago

    Thanks for posting!!

  • @SunilKumar-yd8xv
    @SunilKumar-yd8xv 3 years ago

    Amazing Content! Really appreciate your efforts.
    One question: do you need the cluster manager in this architecture? Simple, failover, geo, and weighted routing are mostly supported by DNS.

  • @asahikitase5398
    @asahikitase5398 3 years ago

    Thanks, buddy. I do prefer the way you started with a simple architecture and improved the system as the traffic increased.

  • @vigneshrajarajan6724
    @vigneshrajarajan6724 4 years ago +2

    Hi Naren,
    thanks for your work. I have a question on Uber/food delivery design. From what I have gathered, most of these applications rely on state machines to proceed to the next step. Could you please explain how a finite state machine is used in food delivery/Uber designs?

  • @hydtechietalks3607
    @hydtechietalks3607 4 years ago +5

    Great talk, I love this. But to differentiate yourself from others, please announce who the audience is and what level of depth you will go to in the video. For example, are you going to discuss the algorithms used in the design or give an overview of it? Is it scoped for an application developer or for a systems design developer?

  • @kveldgorkon4611
    @kveldgorkon4611 2 years ago

    Thank you .. Great Explanation

  • @progfan234
    @progfan234 3 years ago

    Awesome stuff as always! I have a couple of questions:
    1. What impact will consistent hashing in realtime have on serving requests?
    2. What will happen when a particular partition server goes down? Will it be replaced by a standby? How many standbys should you consider maintaining?
    3. Is the Partition Map table a single point of failure? Or is it a within-cluster replicated data store?
    4. Would there be any benefits to replicating a given file server within a cluster?

  • @praveenjain183
    @praveenjain183 3 years ago

    Great Stuff Narendra, I appreciate the effort you make in gaining all this knowledge from multiple sources and sharing with us. Thanks a lot.

  • @tanayakarmakar2407
    @tanayakarmakar2407 1 year ago

    great content

  • @ravitandon9351
    @ravitandon9351 2 years ago

    Very well done!

  • @Vendettaaaa666
    @Vendettaaaa666 3 years ago

    Mind blown!

  • @Vendettaaaa666
    @Vendettaaaa666 3 years ago +1

    The partition server + linked list of file servers idea seems like "Consistent Hashing on steroids"!
    Basically, instead of a single server on a ring for a given hash range, it's an array of servers.
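Editor's sketch of the idea in the comment above: a minimal range map in which each hash range owns a list of file servers rather than a single node. The class name `RangePartitionMap`, the hash space of 100, and the server names are all hypothetical and are not taken from the video; a real system would also need replication and membership handling.

```python
import bisect
import hashlib


class RangePartitionMap:
    """Hypothetical model: each hash range maps to a list of file servers."""

    def __init__(self, space=100):
        self.space = space
        self.upper_bounds = []  # sorted exclusive upper bound of each range
        self.servers = []       # servers[i] owns hashes below upper_bounds[i]

    def add_range(self, upper_bound, file_servers):
        # Keep both lists aligned and sorted by upper bound.
        i = bisect.bisect_left(self.upper_bounds, upper_bound)
        self.upper_bounds.insert(i, upper_bound)
        self.servers.insert(i, list(file_servers))

    def hash_key(self, key):
        # Stable hash so the same key always lands in the same range.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % self.space

    def lookup_hash(self, h):
        # First range whose exclusive upper bound is greater than h.
        # Raises IndexError if the ranges do not cover the whole space.
        i = bisect.bisect_right(self.upper_bounds, h)
        return self.servers[i]

    def lookup(self, key):
        return self.lookup_hash(self.hash_key(key))


pmap = RangePartitionMap(space=100)
pmap.add_range(50, ["fs-1", "fs-2"])   # hashes 0..49
pmap.add_range(100, ["fs-3"])          # hashes 50..99
print(pmap.lookup("photo.jpg"))
```

Appending a new file server to a range's list, once the current one is sealed, does not change which range a key hashes to, which is the property the comment points at.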

  • @amlanch
    @amlanch 2 years ago

    Excellent explanation. You didn't talk about leader election and manager election in any of the layers, but that's just some more detail.

  • @Miguel-ym2rr
    @Miguel-ym2rr 2 years ago

    This is the first time I have seen how S3 works. Thank you so much! I decided to focus my career on distributed systems as a software engineer. How do you get the base knowledge to design and implement a distributed system?

  • @prashant211087
    @prashant211087 4 years ago +24

    I appreciate your efforts. If possible, can you also share the references you go through for such design questions?

    • @vijayprajapati8475
      @vijayprajapati8475 4 years ago

      444r

    • @fragrancias972
      @fragrancias972 3 years ago +2

      He seems to read a lot of tech companies’ engineering blogs, based on his content.

    • @metalalive2006
      @metalalive2006 3 years ago

      Really appreciate his effort; the engineering blogs at these tech companies are mostly very long articles.

  • @zianxu2006
    @zianxu2006 3 years ago +19

    great content. Really appreciate it. I'm wondering, is it a good idea to start with a simple design and then scale up towards the final target design? I tried that at an interview and got the feedback that I didn't address many of the complexities until later in the discussion... Some other times I jumped into details upfront and got the feedback that I was focusing on details too much too soon....

    • @RajenderReddy12sw
      @RajenderReddy12sw 2 years ago +2

      It's always a good idea to ask the interviewer what they are interested in.

  • @sowjanyav6570
    @sowjanyav6570 3 years ago

    What happens if a user wants to add more content to a file (say the file has lines 1-100 and the user wants to add 10 more lines to it) that is already on a sealed storage server? Will the file be copied to a new server, or will only the extended part go to a different file server?

  • @kdakan
    @kdakan 1 year ago

    How do you do file and disk operations on the remote file server, from the partition server and the stream server (like copying, clearing up space from unused blocks, etc.)? Do you mount an NFS share on these servers and issue local shell commands on these remote shares?

  • @happyandinformedlife1212

    Given a set of processes running on a cluster of hosts, design a system that load-balances the hosts through live migration of the processes. The goal of the load balancer is to minimize or prevent resource starvation, a situation in which processes are not allocated the amount of resources they want to consume. In the case where all hosts in the cluster are overloaded, we want to distribute resources evenly across the demanding processes. Given an imbalanced cluster, we want to bring it to a balanced state as soon as possible at the lowest cost. Can you do a load balancer next?

  • @adithyaks8584
    @adithyaks8584 3 years ago +1

    Wow!! simply wow... Now I can cross question managers at Amazon during interviews

  • @JashanPreetsingh-mi2nl

    Nice

  • @DarwinLo
    @DarwinLo 3 years ago

    The Cluster Manager is responsible for updating the DNS entries upon a cluster failure. What do you suggest doing for client-side caching of DNS queries?

  • @rohanbundelkhandi3202
    @rohanbundelkhandi3202 4 years ago

    Very nice video. One doubt: how does the Partition Server communicate with the Stream Manager, given that we don't have a direct link there?

  • @ranjithsudhakar9304
    @ranjithsudhakar9304 4 years ago +2

    Great work. A small suggestion, if it makes sense to you: videos shorter than 20 minutes are more appealing than longer videos. If a video cannot be condensed, it could be split into parts.
    Awesome work on all your system design videos. Thanks

    • @Reji012345
      @Reji012345 4 years ago +2

      It's better to be at file.. otherwise it will break the flow.

    • @ellakkiankvp6267
      @ellakkiankvp6267 4 years ago +2

      Not really, that can be left to the audience. I mean, if you need a break, you can pause, right? Also, since this is a single entity, it's good for it to be a single video; honestly, I don't see any partitions here. Also, psychologically, IMO, if you recall the flow and feel something's hazy, it's less cognitive load to look for it in the flow than to think across videos.

  • @KimetsuNoYaiba100
    @KimetsuNoYaiba100 3 years ago

    Good followup: How does PUT API work for large files?

  • @zakariamaaraki1130
    @zakariamaaraki1130 3 years ago +1

    Great video, keep going! I have only one remark: at minute 11 you said that replication must go to another region in case of a disaster. I think data must stay in the same region for several reasons (latency, GDPR, ...), but in different Availability Zones instead (this is the default option used by S3). Am I right?

    • @phildinh852
      @phildinh852 2 years ago

      Yes, data is replicated across AZs of the same region. There is an option to replicate data to another bucket in another region.

  • @groinache
    @groinache 2 years ago

    Very nice presentation. Concise, with good pronunciation. However, there is too much echo. I suggest getting a better recording setup with echo reduction.

  • @doydoybb
    @doydoybb 4 years ago

    I have a question. In your first simple design, you have a separate server to store metadata. On your second scaled storage system, where are the metadata stored? Is it all stored in the stream manager? Or is it stored on each individual partition server? Thanks!

  • @TechAwesomeMemories
    @TechAwesomeMemories 3 years ago

    good

  • @viewforsourav
    @viewforsourav 4 years ago

    How does Partition Server handle concurrent write requests if the system wants to honor append mode of writing to disk?
    One solution will be for a Single Stream - one can have multiple writers, each of which write to different file servers. However orchestrating such a model would be excruciatingly complex.
    Or Partition Servers can be logical entities with a 1-1 mapping to the stream id. Definitely that will lead to having many stream ids and some house keeping work for the Stream Manager. This will ensure the append mode of writing data and a better spread of file servers to stream ids.
    Let me know your thoughts Naren@Tech Dummies.
    Thanks for your videos.
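Editor's sketch for the question above: one simple way to keep append-only semantics under concurrent writers is to serialize appends to a stream behind a lock and hand each writer its own offset range. This is an assumption for illustration, not the design from the video; `AppendOnlyStream`, its byte capacity, and the choice to seal on the first write that does not fit are all hypothetical.

```python
import threading


class AppendOnlyStream:
    """Hypothetical single-writer-at-a-time stream that seals when full."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.sealed = False
        self._data = bytearray()
        self._lock = threading.Lock()

    def append(self, payload):
        """Return (start, end) offsets, or None if the stream is sealed."""
        with self._lock:
            # Assumption: the first payload that does not fit seals the stream,
            # and the caller is expected to retry on a fresh stream.
            if self.sealed or len(self._data) + len(payload) > self.capacity_bytes:
                self.sealed = True
                return None
            start = len(self._data)
            self._data.extend(payload)
            return start, len(self._data)

    def read(self, start, end):
        with self._lock:
            return bytes(self._data[start:end])


stream = AppendOnlyStream(capacity_bytes=1024)
results = []

def writer(i):
    results.append(stream.append(f"file-{i}".encode("utf-8")))

threads = [threading.Thread(target=writer, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(results))  # offset ranges never overlap
```

The lock makes one stream a serialization point, which is why the comment's second option (many streams, one per logical partition) spreads write load better than many writers on a single stream.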

    • @willinton06
      @willinton06 4 years ago

      "excruciatingly complex" sounds about right, there's a reason why only a handful of companies even try to get something like this working.

  • @shallimeetyougmail
    @shallimeetyougmail 2 years ago +1

    Time 48:10 Remapping of range from 0-100 to 0-50 and 50-100 is fine. But what happens to the files which are already written in the previous partition? How will the reads for UUIDs with hashes 0-50 map to the older partition?

    • @SudhanshuTamhankar
      @SudhanshuTamhankar 1 year ago

      In that case, the mapping is not updated until the new stream is "warmed up", which means that the files with 0-50 hashes have already been copied over to the new stream. Once this is done, there is a cut-over transaction in the partition manager DB, which now starts routing the calls for 0-50 to the new stream. In the meantime, there might be files that got written to the old stream while this transaction was still happening. That is handled by a catch-up routine, which ensures all files have been copied over.
      Imagine it as a two-stage commit: when the cut-over begins, there is a soft commit that says: write all new files for 0-50 to the new stream. At the same time, while reading, try reading from both the new and the old stream. Once all files are copied over and there are no stale writes left in the old stream, the commit is finalized. Now all reads and writes for 0-50 go to the new stream, and some garbage collection happens on the old stream to free up space.
      Hope this helps.
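Editor's sketch of the cut-over described in the reply above, as a tiny in-memory model. The class `SplitCutover`, its three state names, and the use of plain dictionaries as "streams" are hypothetical simplifications; the reply does not specify an implementation.

```python
class SplitCutover:
    """Hypothetical model of moving hash range [lo, hi) to a new stream."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
        self.old = {}            # file_id -> (hash, data)
        self.new = {}
        self.state = "copying"   # copying -> soft_commit -> finalized

    def _moving(self, h):
        return self.lo <= h < self.hi

    def write(self, file_id, h, data):
        # After the soft commit, writes for the moving range go to the new stream.
        use_new = self._moving(h) and self.state != "copying"
        (self.new if use_new else self.old)[file_id] = (h, data)

    def read(self, file_id, h):
        if not self._moving(h) or self.state == "copying":
            order = [self.old]
        elif self.state == "soft_commit":
            order = [self.new, self.old]   # try the new stream, then fall back
        else:
            order = [self.new]
        for stream in order:
            if file_id in stream:
                return stream[file_id][1]
        return None

    def copy_over(self):
        # Warm-up / catch-up: copy moving files without clobbering newer writes.
        for file_id, (h, data) in self.old.items():
            if self._moving(h) and file_id not in self.new:
                self.new[file_id] = (h, data)

    def soft_commit(self):
        self.copy_over()
        self.state = "soft_commit"

    def finalize(self):
        self.copy_over()
        # Garbage-collect the moved files from the old stream.
        for file_id in [f for f, (h, _) in self.old.items() if self._moving(h)]:
            del self.old[file_id]
        self.state = "finalized"


cut = SplitCutover(lo=0, hi=50)
cut.write("a", 10, b"A")      # lands in the old stream
cut.soft_commit()
cut.write("c", 20, b"C")      # lands in the new stream
cut.finalize()
print(cut.read("a", 10), cut.read("c", 20))
```

A real system would make the state change a durable transaction and would have to handle writes racing with the copy; this model only shows the routing rules in each phase.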

  • @baoleijia3764
    @baoleijia3764 3 years ago

    I appreciate you sharing this, but:
    1. I don't think the replicas are located in different regions; it costs too much to transfer data between replicas.
    2. I don't think the failover switch is done by DNS.

  • @mopsyched
    @mopsyched 3 years ago

    Something like Raft, Frangipani, or Spanner is always used for file servers.

  • @himanshuupadhyay6749
    @himanshuupadhyay6749 3 years ago

    Quick question: when a file upload request goes to the server, is the file chunked on the client side? If so, where does the sync service come into the picture?

    • @Gerald-iz7mv
      @Gerald-iz7mv 2 years ago

      Good question. Shouldn't there be a chunk service that splits the file into chunks?

  • @OnkarSingh-fc8mu
    @OnkarSingh-fc8mu 3 years ago +1

    (Time 48:10) When there is more load on the partition servers and the partition manager splits the range across two partition servers, how does the newly created partition server talk to the older file server in the streaming layer (where the file was actually stored)? Does anything change in the streaming layer as well?

    • @amishsumit
      @amishsumit 2 years ago

      When the partition manager assigns a new partition for a subrange, say 1-50 out of 1-100, it also updates the partition map table entries. For example, say the hash values 14, 36, 42, 58, and 89 were all initially mapped to partition server 2. Once the new partition server is added, the corresponding existing stream servers in the map table (for 14, 36, and 42) are mapped to the new partition server. That way, any further read request for those existing stream servers is served by the new partition server.

    • @phildinh852
      @phildinh852 2 years ago

      @amishsumit But a partition server is assigned to 1 stream only?

  • @metalalive2006
    @metalalive2006 3 years ago

    Does anyone know how cloud storage like Amazon S3 handles access control for each uploaded file? For example, Amazon S3 exposes API endpoints for consumers to read and edit the access control list of a file object. How does S3 do this? I would really appreciate any reply or hints.

  • @shantanu143
    @shantanu143 2 years ago

    Good content. However, one doubt: if we are replicating from Europe to Asia, isn't it asynchronous replication?

  • @pearlssnowboard3793
    @pearlssnowboard3793 3 years ago

    Do you have any idea how to design a system that loads a 5 GB file onto 5,000 servers?

  • @RachnaDiary
    @RachnaDiary 3 years ago

    How do you store images or videos? What is the mechanism behind that? What you explained for storing a file is fine, but how does it work for photos/videos?

  • @andybhat5988
    @andybhat5988 2 years ago

    The Ceph RADOS layer with remote replication can handle this much better. It also does not need a metadata server for replication. Using CRUSH, proper availability can be guaranteed.

  • @rishabhgoel1877
    @rishabhgoel1877 4 years ago

    Thanks, it would have been much better if you had related these concepts in terms of S3 keys and buckets

  • @eugenee3326
    @eugenee3326 1 year ago

    Great video but why can't ZooKeeper just do what Partition Manager does?

  • @noypi613
    @noypi613 3 years ago

    how will the api insert data to the data store server?

  • @tylerscott6531
    @tylerscott6531 3 years ago

    Do AWS regions each represent a continent? I thought "us-east-1" and "us-west-2" were both in the US.

  • @PoojaMehta271
    @PoojaMehta271 2 years ago

    Isn't the API server at 23 min nothing but a load balancer?

  • @noypi613
    @noypi613 3 years ago

    What technology do you use to store the file? Is it a database?

  • @paraschawla3757
    @paraschawla3757 3 years ago

    S3 uses object storage rather than block storage, as mentioned at 43:00. Correct me if I misunderstood.

  • @akashjain2990
    @akashjain2990 2 years ago

    Why do we need the partition layer? Why can't the API layer talk directly to the streaming layer, since there is a 1:1 mapping from partition to streaming layer anyway?

  • @viditmathur8437
    @viditmathur8437 3 years ago

    what happens if cluster manager goes down?

  • @ariellyrycs
    @ariellyrycs 4 years ago +1

    Hey , how can I deposit you the dollar 💵, this is too much work, I have an interview coming up and I’m watching all your videos , thank you

    • @TechDummiesNarendraL
      @TechDummiesNarendraL 4 years ago +1

      Thanks! Join the channel. You will find the Join button on the channel page!

  • @prasenjitkundu7904
    @prasenjitkundu7904 3 years ago

    do you know captain america

  • @gijduvon6379
    @gijduvon6379 2 years ago

    I think no one today uses spinning disks in production, at least in new projects. SSDs are not as costly as they used to be.

  • @nalamda3682
    @nalamda3682 1 year ago

    why not zip?

  • @sumonmal009
    @sumonmal009 3 years ago

    Solution 20:28

  • @zuowang5185
    @zuowang5185 1 month ago

    Is this a mid level answer?

  • @MohanRaj-vp1zt
    @MohanRaj-vp1zt 3 years ago +1

    A lot of content, but the language and presentation are quite poor. Because of that, the flow is broken multiple times. This really doesn't help in a 45-minute interview setting. The first major thing an interviewer would want to see is the REST API signature of the different functionalities offered, for example upload_file.