Learn system design the easy way! This is the second session on distributed datastore basics. In this session you will learn how NoSQL (Cassandra) works, in contrast to an RDBMS.
I have been going through the videos in this playlist and have watched most of them. The videos are easy to understand, well explained, and touch the core concepts of distributed systems without going too much into fancy details (except perhaps the NFLX video). They are a delight to watch. Thank you, and please keep posting :)
I did really well in a system design interview recently for one of the biggest companies in the world because of you. I didn't end up getting an offer because of another round, but I'm grateful for your videos.
All the best. You will get it :)
This is one of the top channels for system design. The videos are well researched and very well presented. Looking forward to going through all your videos and waiting for more!
This is great and thanks for this video sir
Good explanation 👏
Very clear and concise high level description....
nice explanation ... keep it up
Awesome work man... keep them coming!!
great content!!
Good high level explanation. Thanks.
I thank you for creating such a fantastic and well prepared series
Really simple and awesome to understand, dude. Thanks a ton....
Nice one. Very helpful.
Excellent explanation, like the other videos! Thanks for sharing!
Thank you very much. I needed this brief explanation
Awesome explanation !! Please keep teaching us new concepts.
Excellent explanation. Thanks.
First of all, kudos for the effort, excellent explanation. I would also like to see a video on a distributed messaging queue like Kafka or Amazon SQS.
Easy to understand...awesome video :)
I am preparing for a system design interview. Your video is very helpful.
In all consistent hashing algorithms, when a node (system) joins or leaves the hash ring, only the neighboring nodes are affected (rearranged). Usually it's just the one next node. That's the main purpose of consistent hashing: less damage on ring changes!
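To make the point above concrete, here is a minimal sketch of a hash ring in Python. It is only an illustration under stated assumptions, not how Cassandra or the video implements it: the `HashRing` class, the node names, and the choice of MD5 are all hypothetical, and there are no virtual nodes. It checks that when a node joins, every key that moves lands on the new node, and nothing is shuffled between the old nodes.

```python
import bisect
import hashlib


def ring_position(value: str) -> int:
    """Map a string to a position on the ring (MD5 is an arbitrary choice here)."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class HashRing:
    """Hypothetical minimal ring: one position per node, no virtual nodes."""

    def __init__(self, nodes=()):
        self._positions = []  # sorted ring positions
        self._owner = {}      # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        pos = ring_position(node)
        bisect.insort(self._positions, pos)
        self._owner[pos] = node

    def remove(self, node: str) -> None:
        pos = ring_position(node)
        self._positions.remove(pos)
        del self._owner[pos]

    def lookup(self, key: str) -> str:
        """Return the first node clockwise from the key's position."""
        idx = bisect.bisect_right(self._positions, ring_position(key))
        return self._owner[self._positions[idx % len(self._positions)]]


keys = [f"user-{i}" for i in range(1000)]
ring = HashRing(["node-A", "node-B", "node-C"])
before = {k: ring.lookup(k) for k in keys}

ring.add("node-D")
after = {k: ring.lookup(k) for k in keys}

# Keys whose owner changed when node-D joined.
moved = [k for k in keys if before[k] != after[k]]
print(len(moved), "keys moved; all to node-D:",
      all(after[k] == "node-D" for k in moved))
```

Removing `node-D` again restores the original mapping, because only the range it owned is handed back to its clockwise neighbor.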
Wonderful Sir!!
Nice. I really like your videos. Very helpful
I don't understand why 7 people have disliked the video..
Anyways great content. Thanks for sharing your knowledge with us.
Awesome job :)
Great Job Narendra !
good job Bro...Keep it up :)
Nice explanation.
Nice explanation 👍
Good One!
Great video
Thanks a lot
Awesome!
Thanks for the clear explanation. As suggested, I had a look at the Cassandra architecture.
I have a question: when a node goes down, new writes to that node will be directed to another node, which means the hashes get remapped.
Q1: If the node comes back up after a while, would it again be responsible for the same hash range or a different range?
Q2: Does data balancing occur every time a node goes down or joins the cluster?
Q3: What happens to reads/writes while this re-balancing is happening? Specifically, what about availability?
Q4: Typically, how long would the re-balancing operation take?
Maybe it would be good to have a video on Cassandra.
Thank you so much. It really helps. Can you please explain the Craigslist system design and its API design?
Thank you!
Great.
Sir, you are the best
You are my man!
Thanks Narendra, I have just started viewing your series of videos on distributed system design.
Your explanation looks simple and good.
However, I have one basic question to ask you... why have you used the term "Datastore" and not "Database"? Is there any particular reason for this terminology?
Sir please make video for IRCTC system design
NOTES:
In this cluster, there is no master/slave.
Every node stores a portion of the data [primary duty] and a backup of other nodes' data [secondary duty].
2:09 - NoSQL data is stored across various nodes
6:50 - Consistent hashing tells us which node should back up which data of which node
9:20 - In the case of sharding, records go to a server based on key values. In consistent hashing, records go to a server based on the value of hash(key)
So what is the difference between sharding and consistent hashing?
@sauravdas7591 Consistent hashing is a strategy for sharding.
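The 9:20 note and the reply above can be sketched roughly as follows. This is a hedged illustration only: the server names, the letter ranges, and both function names are made up, and `hash_route` uses plain modulo hashing rather than a ring, purely to show the difference between routing on the key's value and routing on hash(key).

```python
import hashlib

SERVERS = ["server-0", "server-1", "server-2"]  # hypothetical server names


def range_route(user_name: str) -> str:
    """Key-range sharding: route on the key's own value (its first letter here)."""
    first = user_name[0].lower()
    if first <= "h":
        return SERVERS[0]
    if first <= "q":
        return SERVERS[1]
    return SERVERS[2]


def hash_route(user_name: str) -> str:
    """Hash-based sharding: route on hash(key), so similar keys tend to scatter.

    Plain modulo is used for brevity; a consistent-hashing ring would replace
    the modulo step so that fewer keys move when servers are added or removed.
    """
    digest = int(hashlib.md5(user_name.encode()).hexdigest(), 16)
    return SERVERS[digest % len(SERVERS)]


for name in ["alice", "alicia", "alina", "zoe"]:
    print(name, range_route(name), hash_route(name))
```

With range routing, every name starting with "a" lands on `server-0`; with hash routing, placement depends only on the hash, so neighboring keys are not tied to the same server.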
This really helped me thank you.
What algorithm is used for Rebalancing after a node is added/deleted?
You are really good at teaching, but please improve the sound quality.
Any good resources to check out the internal workings of Cassandra? Thanks
Great videos. Please make some videos on the role of concurrency in the context of distributed systems/System Design interview?
Sure :)
What are some good books on architecting such systems?
I don't think that's how data gets replicated to other nodes. If I am updating some data, each replica node gets the entire data, not random pieces of it. In your example, when [1,2,3] came to node1 and the replication factor is, let's say, 2, then 2 more nodes (clockwise or anti-clockwise) get that data too (sync or async replication, depending on how we define it). Am I missing anything here?
Sir, what if both the primary and the secondary node containing the data CAT go down in such a distributed datastore?
Great videos! Question: are there any distributed RDBMS databases? Why does NoSQL work better for a distributed datastore?
In NoSQL there are no joins and aggregation, so it's easy to scale NoSQL datastores... and yes, Amazon RDS is an example of a distributed RDBMS, but it has some limitations.
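The replication described in the comment above can be sketched like this. It is a minimal illustration, not Cassandra's actual code: `replicas_for`, `write`, the node names, and the use of SHA-1 are assumptions, and the replication factor here counts the total number of copies including the owner (which is how Cassandra defines it), whereas the comment counts the extra copies.

```python
import bisect
import hashlib


def ring_position(value: str) -> int:
    """Map a string to a ring position (SHA-1 is an arbitrary choice here)."""
    return int(hashlib.sha1(value.encode()).hexdigest(), 16)


def replicas_for(key: str, nodes: list[str], replication_factor: int) -> list[str]:
    """Return the owner of `key` followed by the next distinct nodes clockwise."""
    ring = sorted((ring_position(n), n) for n in nodes)
    positions = [pos for pos, _ in ring]
    start = bisect.bisect_right(positions, ring_position(key))
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


nodes = ["node-1", "node-2", "node-3", "node-4"]
store = {n: {} for n in nodes}  # in-memory stand-in for each node's data


def write(key: str, value, replication_factor: int = 3) -> None:
    # Every replica receives the whole record, not a fragment of it.
    for node in replicas_for(key, nodes, replication_factor):
        store[node][key] = value


write("CAT", [1, 2, 3])
holders = [n for n in nodes if "CAT" in store[n]]
print(holders)
```

Whether the copies are written synchronously or asynchronously is a separate choice that this sketch does not model.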
Can you recommend some resources where we can learn the internals of Cassandra?
Awesome explanation.
Just one doubt. What is the logic for getting the partition key?
So can you please address the original question from the first video: how should we store millions of users in a datastore?
@techdummies -- are you doing more videos on the concepts?
Yes, it will be available in January. (There is a slight delay as I am on a long trip :l )
How does a distributed datastore work with SQL databases?
THIS COMMENT IS FOR MY PERSONAL REFERENCE. TO UNDERSTAND PROPERLY WATCH THE FULL VIDEO
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
idea 0:58
partitioning 1:57
partition and replication example 3:57
consistent hashing 7:17
replication factor 12:19
rebalancing node addition 14:43
rebalancing node deletion 17:45
Narendra, I would like to request a session on Google Docs, please.
Recording the Google Docs video, it will be out soon.
Could you please do something about the sound of your videos? The sound is low.
19:37 In case of HDFC :D listen in 0.5 speed
nice explanation.