Dropbox system design | Google drive system design | System design file share and upload
- Added 27 Jul 2024
- Let's design a file hosting service like Dropbox or Google Drive. Cloud file storage enables users to store their data on remote servers. Usually, these servers are maintained by cloud storage providers and made available to users over a network
Diagram: imgur.com/a/pzKb4f7
#systemdesing #dropbox
idea scope 1:38
scale 2:10
HLD 2:41
problem to solve 4:55 6:57
solution 10:41
metadata file 15:26
HLD 17:38
messaging service detail 25:01 device sync feature
metadata handling 28:40
metadata schema 31:48
edge store usage to serve metadata 36:16
search feature 40:01
Thanks for your channel Naren! Brings back my love for computer science. We need more such teachers that can break things down and explain it as simply as you have done here.
Great work Narendra. This is the best video I have found so far on YouTube on the DropBox architecture.
Enjoyed this video more than others because of the cute doggo interruptions. :) Thank you!
Another reason to use async queues: one cannot assume that only a single file will be uploaded. There could be a case in which multiple files could be uploaded and a queue ensures that chunks do not get mixed with each other. I guess one can also talk about failover (what happens when a chunk gets lost during transmission/gets corrupted) but that might not be required.
Edit: NVM he covers this case as well LOL. Love the depth he goes into when covering different components.
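To make the "chunks do not get mixed" and "corrupted chunk" points concrete, here is a minimal sketch. It assumes each chunk carries a file id, an index, and a SHA-256 checksum; the `Chunk` type and both helper names are hypothetical and not taken from the video.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    file_id: str   # hypothetical identifier for the file being uploaded
    index: int     # position of this chunk within its file
    data: bytes
    sha256: str    # checksum computed by the sender


def make_chunks(file_id: str, payload: bytes, size: int) -> list[Chunk]:
    """Split payload into fixed-size chunks, each carrying its own checksum."""
    return [
        Chunk(file_id, i // size, payload[i:i + size],
              hashlib.sha256(payload[i:i + size]).hexdigest())
        for i in range(0, len(payload), size)
    ]


def reassemble(chunks: list[Chunk]) -> dict[str, bytes]:
    """Group chunks by file, reject corrupted ones, join the rest in index order."""
    by_file: dict[str, dict[int, bytes]] = {}
    for c in chunks:
        if hashlib.sha256(c.data).hexdigest() != c.sha256:
            raise ValueError(f"corrupted chunk {c.index} of {c.file_id}")
        by_file.setdefault(c.file_id, {})[c.index] = c.data
    return {fid: b"".join(parts[i] for i in sorted(parts))
            for fid, parts in by_file.items()}
```

Because every chunk names its file and position, chunks from several uploads can arrive interleaved and still be put back together, and a checksum mismatch tells the receiver which chunk to request again.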
15:39 LMFAO! great video man, you are my go to for system design prep
Great system design and clear explanation, thank you !
Your explanations of and approaches to these System Design Problems are absolutely phenomenal.
Give this man Bharat Anmol Ratna : ]. Thanks for SD series it helps us broaden our thinking and not just defect fixing and small CR.
The best part I like about your videos is you do a lot of research to put the information from various sources about a topic into one place. You are our Edgestore ;)
Hello Naren! your channel is a goldmine. I've learned quite a lot. Please consider creating content that dives deep into data models/schemas/datasets. Thanks 🙏
Sure, Thanks
@@TechDummiesNarendraL Do you have code for this explained system?
Very informative. You have covered each layer like front end, Middle tier and database layer effectively. Thanks
Awesome video that comes down to details for real design not just for interviews 😄
Give this man the credit he deserves 👏🏼👏🏼👏🏼
Great job, Naren! Love your work. Keep it up!
You are the best! Thank you so much for explaining this so nicely!!!
Great system design. I really wish he had explained why file change sets need to be ordered and consistent, which is what led him to use a relational database for the metadata.
If you look at his design for google docs, it doesn't even use a relational database for massively concurrently updated files.
Yes. He explained google docs using operational transformation.
Awesome! Loved the explanation and learned a lot. Last part of the search design for this service could be expanded into another video.
Shooting a video for search engine design
Wow. Amazing. U r doing a grt job.
A great video for understanding file storage service design like Dropbox. I'm preparing for an interview and this content is helpful.
So clear and easy to understand, keep going!
You don't have studio but you are delivering better content than those who have studio.
Fabulous videos, excellent information and lots to learn ! Dogs were hilarious.
Truly amazing. Hats off to you. 🙏😍 Request you to upload more of such videos. It would be too awesome if we can have a system design tutorial for beginners and how to improve.
Sure, Some time soon
Clients described at 18:26, taking an example of Google Drive, refer to the various "Backup and Sync" desktop clients which you might have active on multiple devices. All these clients keep listening to a messaging queue. In case one device makes changes to a file, the change is propagated to S3 and all clients are notified of this by publishing the change to the messaging queue which they are listening to. The client which is the originator of the change doesn't care but other clients do and when they know of a change they update their local copies (download the whole file if not present).
Update:
It's not just one Q2. Each client will have its own queue on which the change is broadcasted. This is to have an asynchronous behaviour wherein the client can be offline for a period and then when it is online it starts listening to the queue for any changes
This is my understanding. Correct me if I'm wrong
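A toy version of the per-client queue idea described above, as I understand it. Everything here is an in-memory stand-in: `SyncNotifier` and its method names are made up for illustration, and a real deployment would sit on an actual message broker.

```python
from collections import deque


class SyncNotifier:
    """Toy in-memory stand-in for the messaging service: one queue per client."""

    def __init__(self) -> None:
        self.queues: dict[str, deque[str]] = {}

    def register(self, client_id: str) -> None:
        self.queues.setdefault(client_id, deque())

    def publish(self, origin: str, change: str) -> None:
        """Queue the change for every registered client except the originator."""
        for client_id, q in self.queues.items():
            if client_id != origin:
                q.append(change)

    def drain(self, client_id: str) -> list[str]:
        """Called when a client comes online: return and clear pending changes."""
        q = self.queues[client_id]
        pending = list(q)
        q.clear()
        return pending
```

A client that was offline simply calls `drain` when it reconnects and gets, in order, every change published while it was away.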
Shit, it's fucking perfect explanation. Thanks for all these stuff.
Amazing video, Very detailed and to the point!! If possible, please add Fault tolerance and Security related usecases to be incorporated in the design
Thank you so much. The best system design video on this topic.
Wonderful Explanation...!! Thanks for the work Naren.
Amazing .. I am new in system design and I've learned a lot.. Thankyou so much
Doing great job Naren, keep up the spirit 👍🏻
Excellent the way of explaining the concept.
and really enjoyed the dog pictures while they barked in the middle of the presentation. 🙂👍
THE BEST OF THE BEST -> PLEASE, CONTINUE YOUR CHANNEL!
The most handsome tech guy I have found in youtube! Thanks a lot !
Amazing no nonsense serious designs which are really good hatsoff bro 👍 keep doing good work
Really you done a good & great job annaiah.....Awsome explanation,tq☺️
WTF. only 633 likes out of 38,663 views for this gold? Come on viewers, you are beholden for this guy who is putting enormous effort to share knowledge beyond his boundaries.
excellent video , thanks a ton .
pls make a video on system design for decentralized applications on ethereum and ipfs (like decentralized uber)
thank you for the video
it gets the very general idea about how it works
but without important details though
once again thanks
Thanks, you are doing a great job. Also, It would be really helpful if you could run the whole flow once at the end. So that we don't have to watch the full video when revisiting the video for the second time.
Salaams and respect from Pakistan for you sir! You are a hard working and a smart individual who is helping the IT community across the world using whatever best resources you have. Keep up the good work - Keep posting them system design videos. God Bless!
Great system design video. Thank you !!!!
Very good content. I loved the dog barking.
Great work Narendra. I'm learning a lot from your videos. I have gone through almost all your system design videos. Just checking if you can create one on a Saas product like Salesforce. I didn't find any good video on Salesforce / Shopify like services.
Great video ! Really appreciate the time and effort put into it
Firstly, thanks for the video. it would have been interesting to know how the Edge Wrapper achieves transaction isolation level without explicit locking/transaction.
Great work! I think we should have a block on the client side to reconstruct the document!
Needs an explanation of how exactly does one detect which chunk was changed. Because your applications, video editor, for example, doesn't know anything about chunks, it doesn't change a chunk, it changes your file. It's up to your Dropbox client to figure out which chunk the change corresponds to. And that is not immediately obvious especially for huge binary files.
Hash computation can help
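To spell out the hash idea, here is a rough sketch assuming fixed-size chunks and SHA-256. The chunk size and function names are illustrative assumptions, not something stated in the video.

```python
import hashlib

CHUNK_SIZE = 4  # tiny on purpose for the demo; a real client would use megabytes


def chunk_hashes(data: bytes, size: int = CHUNK_SIZE) -> list[str]:
    """Hash every fixed-size chunk of the file contents."""
    return [hashlib.sha256(data[i:i + size]).hexdigest()
            for i in range(0, len(data), size)]


def changed_chunks(old: bytes, new: bytes, size: int = CHUNK_SIZE) -> list[int]:
    """Indexes of chunks in `new` whose hash differs from, or is missing in, `old`."""
    old_h, new_h = chunk_hashes(old, size), chunk_hashes(new, size)
    return [i for i, h in enumerate(new_h) if i >= len(old_h) or old_h[i] != h]
```

The client never needs to know what the editing application did: it rehashes the saved file and compares against the hashes it stored last time. One caveat with fixed-size chunks is that inserting a single byte near the start shifts every later chunk, so all of them look changed; content-defined chunking with a rolling hash is the usual way around that.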
Love your channel, very useful info, salutes to you!!!
Thank you so much. This was fascinating!
Very nice video! Please do an Instagram system design for the next one! Thank you!
Without reference to original paper “Designing a Dropbox-like File Storage Service” by Alejandro Ramirez, Fariborz Khanzadeh, Hassaan Bukhari. this is unfair.
Thank you for your awesome videos. You rock!
Man, I wish I discovered your channel sooner. I recently failed on a system design interview, Dropbox system design particularly. Thanks for your work. I will study every single of your video and prepare myself for my next interviews.
which company ?
@@lakshminarayanansairam2739 Ledger. The company that builds crypto wallets.
A separate queue for each client doesn't sound good. Additionally, we are using the queue as persistent storage, which should be avoided because a large number of messages can pile up in the queue without any proper ordering. Instead, the client side can call the sync service to fetch the latest file index for the user.
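One way to picture the pull-based alternative suggested here: the server keeps an ordered change log per user and each client remembers a cursor. This is only a sketch under that assumption; `SyncService`, `record`, and `changes_since` are hypothetical names.

```python
class SyncService:
    """Hypothetical server-side change log, kept per user in memory."""

    def __init__(self) -> None:
        self._log: dict[str, list[tuple[int, str]]] = {}

    def record(self, user: str, path: str) -> int:
        """Append a change and return its sequence number (1, 2, 3, ...)."""
        entries = self._log.setdefault(user, [])
        seq = len(entries) + 1
        entries.append((seq, path))
        return seq

    def changes_since(self, user: str, cursor: int) -> tuple[list[str], int]:
        """Return paths changed after `cursor`, in order, plus the new cursor."""
        entries = self._log.get(user, [])
        newer = [(s, p) for s, p in entries if s > cursor]
        new_cursor = newer[-1][0] if newer else cursor
        return [p for _, p in newer], new_cursor
```

With this shape nothing accumulates per device: a client that disappears forever costs the server nothing, and ordering comes from the sequence numbers rather than from the queue.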
believe me, ur channel will gonna have 50K+ subscribers within 3 months, keep up the good work
:) I wish, Thanks
dogs are also barking loudly and disturbing me here as well :) Btw you rocked!
was waiting for that kind of video!
Hey, I really like the explanation and the concept of the solution you provide. Can you make a video on YouTube system design? There is no video on YouTube yet.
Really Awesome. Please keep up the good work.
Thanks for the great video! For content extraction from files, you can use Apache Tika which detects the file type first (using some byte frequency analysis algorithms) and then use the specific parser for that file type to extract its content. It can also extract metadata from files. Of course for images/videos we need some other DNN models to extract meaningful content.
Those are great videos that u r doing. Can you please start a course about system design basics n how to build from scratch to advanced level. Please do that course I would love to buy. Thank you 😀
Nice explanation. Insightful
for the length of video, i learned a ton
Wonderful videos! Learned a lot from your videos.
Amazing video, great explanation!
The content is always great from this channel, but if you can use a microphone while talking that will bring the video to the next level.
You deserve a million subs. please make a system design on Inshorts and Instagram.
Hey Naren. Great job! Few questions for you.
1. Why can't we expose a single service which takes chunks of data and make metadata entry into database and also stores chunks to S3 instead of client calling both services?
2. From your design, if sync service pushes notifications to a topic are we maintaining dedicated topics/partitions for different clients? Or are we pushing notifications via Websockets/HTTP Polling?
Few comments:
1. If clients go offline they can still come back and establish connections via Websockets?
2. We can't have 'n' number of topics because creating Kafka topics/JMS queues need infrastructure support and is a costly operation. Also creating partitions in a live system is a costly affair. Pls let me know if I'm missing anything.
Though this video is a good starter, it gets things wrong in multiple places
sweetest part of the video at 15th Min
Great job Nagendra, look forward to seeing more interesting content from you. Apart from system design, it would also be nice if you could do a couple of class design and DB design examples. Design a chess game (all the classes and design patterns) or design the database schema for Instagram would be good examples.
Sure I will as soon as I get more time to work on videos.
Awesome video! Thanks!
Would you be so kind to do a video on how AWS is structured. I enjoy your videos. Very informative...!!!!!.
honestly as a swe working at dropbox, i don't feel like this is an answer i am looking for. It misses a lot of important stuff like how do you design your database schema for storing the metadata and how would your sync protocols looks like? what if there are write conflicts during sync how do you deal with that? and the search engine part i guess is the least likely bonus question i'll ask in an interview(probably makes more sense in design twitter)
no offense to Narendra, i think you put in a lot of effort/research into this and even referenced dropbox's blog post on network edge infra.
but i think this's a problem to almost all of these youtube system design videos, like, yes you will learn a little bit here and there, but it's not the same as a real interview and don't expect to memorize some sys design solution and pass the interview.
better ways to learn system design:
read DDIA, web scalability for startup engineers, take a distributed system class
listen to real mock interviews if you somehow can(or some faang engineer does these mock interviews and post them somewhere i guess)
design and implement projects at your job if you have the opportunity
While explaining why we need a queue instead of an HTTP call to the sync service, you mentioned we need it because the client may not always be connected. My question is: if the client isn't connected to the internet, the message can't be delivered to the queue either, right?
This is amazing video with lots of details. If you could add more details on which part runs where , it will be complete .
What if we send the diff only, what git does. Storing a tree like structure of changes.
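A small sketch of what "send the diff only" could look like, using the standard library's `difflib.SequenceMatcher` to express the new version as copies from the old one plus literal inserts. This is an illustration of delta encoding in general, not a description of how git or Dropbox actually do it, and the function names are made up.

```python
from difflib import SequenceMatcher


def make_delta(old: bytes, new: bytes) -> list[tuple]:
    """Describe `new` as copies from `old` plus literal inserts."""
    delta = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            delta.append(("insert", new[j1:j2]))
        # "delete" needs no entry: those bytes are simply never copied
    return delta


def apply_delta(old: bytes, delta: list[tuple]) -> bytes:
    """Rebuild the new version from the old bytes and the delta."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)
```

Only the `insert` payloads have to travel over the network; the receiver already holds `old`. `SequenceMatcher` is slow on large inputs, so treat this as a way to see the idea rather than something to run on multi-gigabyte files.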
Sirji excellent video
Outstanding. Thank you.
thanks for detailed videos, please make one more for custom garbage collection too
Informative video 🙌
Hello Narendra,
Could you please make a video to design "google photos" like app? Or what architectural changes you would do in this existing design of drop box to limit it to "google photos"? By the way, your video has been real source of knowledge!
Good Job Naren
The videos are amazing, Very helpful. I have seen all your videos. Thank you so much. Can you please make a video on Designing Amazon Lockers?
Sure, Its in my TODO List
just fyi, cassandra consistency model provides a higher chance of reads being consistent, but doesn't provide true linearizability. This is why it's better to use terms like linearizability and not consistency as DB providers can play games with their definition of "consistency". Cassandra and similar nosql variants are basically partitioned key value stores in disguise and cannot ever compete with a true relational database. Also, even within relational databases, configuring isolation levels is pretty important, and it's easy to get tripped up there.
Best video :)
Thanks a lot for this
you look really good and confident
A big Thank you!
Great content to learn !
this is awesome, when you time can you do one for job scheduler which is scalable where jobs can be scheduled
great job. Big fan of u
Hey Narendra,
Great video. I am learning a lot. In the beginning of the video you talked about designing the system for 10 million users. Is this coming in another video? How is the sizing for required resources done. I am curious. Thanks mate.
Thanks for the useful information bro
I am not quite clear about the response queue. Is it necessary? If each client maps to a response queue, and what if the client never comes back? Are we still posting messages to its queue? Meanwhile, why not just let each client periodically check the diff between the local metadata vs. the latest metadata? By doing this, we can get rid of the response queues, right?
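The periodic-check idea can be sketched as a plain comparison of two metadata maps. This assumes metadata is reduced to a path-to-content-hash dictionary and that the server copy wins; `plan_sync` is a hypothetical helper.

```python
def plan_sync(local: dict[str, str], remote: dict[str, str]) -> dict[str, list[str]]:
    """Compare path -> content-hash maps and decide what the client must do."""
    return {
        # new on the server, or present locally with a different hash
        "download": sorted(p for p, h in remote.items() if local.get(p) != h),
        # present locally but gone from the server
        "delete": sorted(p for p in local if p not in remote),
    }
```

For example, `plan_sync({"a": "h1", "c": "h3"}, {"a": "h1", "d": "h4"})` asks the client to download `d` and delete `c`. The trade-off against queues is that every poll ships or compares the full index, and unsynced local edits would need conflict handling that this sketch leaves out.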
Great video, Narendra. I have a question: shouldn't the data be compressed and encrypted? I think this would help with both bandwidth and security.
Awesome Bro. Can u make a video on "finding trends in social networking", let's say Twitter. Very rare topic.
Very nice...keep it up..
Awesome, thanks brother.
Thanks for this!
great job:
betting systems system design
Truly one of the best system design videos on YouTube. Well done!
Only if you are new to this