System Design: Uber Lyft ride sharing services - Interview question
Vložit
- čas přidán 26. 07. 2024
- How to design a ride sharing service like Uber or Lyft? In this video we talk about many aspects when designing complex distributed systems with mapping and routing components. Under the hood they use sophisticated graph algorithms to display precise ETA's and shortest routes. Additionally we talk about storage (Redis, NoSQL, Cassandra ), provisioning (Mesos, Teraform), networking, docker containers and logging.
Success in Tech now also offers coaching, mentoring and mock interview sessions on www.successintech.com
Music: www.bensound.com
Follow SuccessInTech on Facebook: / successintech
Follow SuccessInTech on Twitter: / _sh4dy_
www.successintech.com - Věda a technologie
I'm doing a little experiment over on IGTV, the SiT VLOG. Check it out! instagram.com/tv/BkYf4GphfQz/
The Uber Finding Route is NOT a Vehicle Routing Problem travelling salesman problem , it is Shortest Path or Minimum Spanning Tree in one direction. Dijkstra and A* are for minimum spanning tree. Travelling Sales man is for visiting all customers not shortest path to closest user. For example a Uber need to be dispatched to closest/cheapest customer not that How that uber should be routed so that it serves all the customer in the city with shortest time.
Exactly!
Satya Prakash I think, yes, that case is similar to salesman, but with the addition that you can't visit the nodes in any order (you need to pick up the passenger before going to its destination). I guess that you can use the easy heuristic, like going for the closest rider first, then to that rider destination or to the other passenger (whichever closer) and so on.
and also TSP is not np-complete, it's np-hard.
Upvote! I came here to say the exact same thing.
I don't know how he brought TSP into this, but it's completely unrelated. As you've said, it's just dijkstra with the caveat that weights can change over time and can't be precomputed - although at scale they can.
Better to describe this video as "Infrastructure at Uber " as compared to system design interview.
Exactly. If this was a real sys design interview answer. It would be a fail. I'm sure he could do it, but he totally missed designing the high level system before getting into perf optimizations.
Your attempt to describe a tough design is wonderful.
Good luck to anyone who gets this in the interview.. LOL
I feel uber’s core is taking passenger’s loc and find the top k drivers nearby, the challenge here is both passenger and drivers loc are dynamic, then delegate to google map for path navigation and ETA estimation. It can be another topic to design google map. So the unique challenge for uber here is how to take all those realtime loc update from both drivers and passengers and match them, for example just take ETA for pickup, for idle drivers, it’s directly the ETA bt drivers current loc and passenger loc, if driver is finishing a trip, then its the sum of ETA to finish current trip and the new ETA from trip destination to new customer loc.
Really thankful for the efforts in making the video with such a precise explanation making it available at free of cost to all.
Great work Roman! I am not expecting all in one video. Also, please note that in actual interview, interviewer is guiding candidate on the track he wants to verify. In all great work Roman, please continue with your effort.
Agree!
I watched your other videos on system design and was impressed by the content you shared on them. In comparison to your other videos, this one is too specific and surfacial. Like the other comments, I would say the given the time frame of the interview, a simple design with potential to scale, perform, and manage is what expected. I appreciate all the technology stack you shared but the video is way more complex for system design. The interviewer is not asking how is Uber or Lyft doing something, the main focus is how you would design some simple system that model the features of Uber or Lyft. Start from simple and present generic techniques for scalability, performance, and maintenance. Just a personal opinion. Thanks.
Exactly, this is what bothered me from the very start. It's about designing an Uber-like system. Not talking about how is Uber actually doing every aspect of its system.
Thank you a lot Ramon for your effort. I really admire your work and am thankful. Whenever I see a new system design video posted by you, I will make my girlfriend happy.
Alin Idomir Well good for you man 😄 Cheers!
By letting her see this video, because she is also a software developer? :-) haha
+Olha R 😂
Not yet a developer, but after making her happy, maybe she will want to become one. :P
tech couple make lots of money. atb bro. I m trying too.
Very nice explanation . Extremely informative and tons of new things I got to know . :)
Great insight on the inner workings of Uber.
Hi, Thanks for the wonderful explanation. I had a question about why we need to invoke the travelling salesman problem (visiting all nodes and returning to the start)? To the best of my understanding the problem is the path between two nodes (driver to customer and driver to customer's destination) solvable by Dijkstra(or that class of algorithms) with dynamic weights based on but not exclusively cartographic direction, traffic, distance.
I think that whenever you have more than 2 nodes (i.e. besides the "start" and "end" node), you'll need the travelling salesman algorithm to figure out a way to visit all of them in an optimal fashion (for instance, when you use Uber pool and you have more than one customer in the car, each with different destinations).
+Mahanth B Yes what you’re saying is correct, I should have been clearer there: According to their engineering blog they did experiments using Djikstra or A* but because of their latency constraints it wasn’t a viable option. I mentioned TSP because calculations on graphs are hard problems and a good example is TSP because it was researched for a long time and there are many excellent resources out there covering it.
@@SuccessinTech Do you have link to this blog page?
very useful! great explanation and information of technology stack!
Uber and Lyft are such complex systems, but you're very thorough in your explanation, Ramon. Thanks for the hard work that you put into preparing these great videos.
+Fabien Teulieres Thank you, Fabien!
Tnks, i learnt alot. Thnks for the time you put into these videos.. Can you continue d video, i'd love to really learn more. Tnks
Thanks so much man. Helped me a lot
FOR ETA - Divide the city into small square parts (by unique number) and construct the graph based on that (edge from one part to another)? Do the graph search on realtime ? but the respond time is like in milliseconds ?
Thanks a lot for your educational videos. However, I have two comments:
1. 30:35 The traveling salesman problem is the problem in which order to visit certain nodes minimizing the total path, not finding the shortest path between two nodes.
2. I am pretty sure Djikstra and A* are not NP complete, simply, because they visit each node only once, so they should scale linearly in the number of edges.
Yes, it is a total confusion on his side! Fastest route search is not related at all to traveling salesman problem and NP complete.There is also quite a stupid explanation about route precomputing, since fastest search algorithm finds the fastest route in milliseconds. Any navigation program like Google maps and Waze solving this problem all the time.
I was thinking the same thing!
Thanks for your video. I'm a Facebook Engineer previously from Grab (Grab is the Uber of South-east Asia and has just recently acquired Uber's business in SEA). Having conducted various interviews at work, I'm pretty sure if a candidate mentioned the things in your video in a real interview, they would not pass it.
In my opinion, you are focusing on the wrong things and have failed to touch on the most crucial part of a ridesharing service - the driver-passenger matching logic. You omitted that important part and chose to focus on the secondary stuff like data warehousing and graph algorithms which is extremely puzzling for a system design interview. Data warehousing is relevant to most domains and there isn't much value in talking about it here. Graph algorithms are helpful, but in reality, Uber, Lyft and Grab rely on Google Maps' APIs heavily for ETAs and routing.
In the context of an interview, I'd say to focus on how drivers and passengers are matched and what different microservices are needed to support those business flows and how you can scale the services to support millions of passengers and drivers. The content you have presented here is not a positive example.
I have to agree, I'm disappointed that details on how customer and driver client applications would interact with the system is omitted. This feels too high level.
My videos are not thought to be tutorials on how to exactly pass system design interviews. I pick up interesting topics (in this case ETA calculations) and share them. You still have to turn on your brain and cover the appropriate topics needed in an interview situation.
I don't disagree with your sentiment, but it can be helpful to get an insight into a full solution as to get acclimatised to expectations. To that end I feel your example is lacking but I appreciate the context.
It‘s misleading to think that you can plan out an answer to a system design question beforehand. A good interviewer will steer the interview into different directions to probe the candidates abilities and knowledge. It‘s a dialogue not a memorized monologue.
If you are not doing an interview for a Uber-like company, maybe the interviewer is more interested in how you'd deal with scalability and data storage rather than customer-driver matching.
In fact, you can't presume what will the interviewer be more interested in. You'll only know during the interview, actually I think it's ok to ask the interviewer what do they want you to focus on while designing a Uber-like system, since this is a very broad question and you have a short time to talk and design. Well, at least this is my opinion.
Great video Ramon. Just like ur other one on Messenger service. However, I have a couple of quick questions. Is there a way to know if a website is built on microservices or is monolithic by viewing its source or any other technique? I have asked several people this question but to no avail. Also, Where does the much talked about Headless CMS sit in these System Designs? Thanks in advance :-)
Keep it up buddy!
Travelling salesman problem is different from finding best route between two points... The former problem is NP-complete and the latter can easily be solved by Dijkstra's algorithm in poly time.
Yeah, traveling salesman problem is not really related to this design problem. Maybe for some specific advanced use case, but certainly not for main features of Lyft/Uber. I'm pretty sure mentioning that during an interview would be a bad idea.
Very well explained!!
Very informative videos!
thanks for the insight!
very well explained!!!!!!!!!!
Do we need to know about each of these technology like Cassandra, Mesos, Teraform etc.?
+Gaurav Yadav No but it helps reading about them because it will give you insights how real world systems use them.
Hi, what's the tool you mentioned that is used to moniter and manage services' life cycles? I heard something sounds like "mazos", but not sure what's the correct spelling.
Meso (generic open source) or Mesosphere (freemium). You might also consider Kubernetes
What happened to th 2nd approach for TA, how does splitting the city into smaller parts help? Some details are missing here...
Hello sit team ,could give me some refrences or resource for how uber app shows nearest car to the user in map.
Excellent video, I really enjoyed it.
This video is very helpful. Would you be able to do a video on system design for Gmail?
can you please make a video regarding designing payment transaction system
Hi
In the video, you said we will keep the driver location in NoSQL DB, what would be the sharding key for cabs location data? To fetch fast their current location. The driver will be continuously moving after every 4-sec new location feed will come and the old location feed will be stale data.
Please help.
Very Detailed Explanation ! I think you covered almost all the aspects of designing uber, lyft like service . Looking forward to more such videos :-)
12:59 logging
16:20 SOA
Finally
great work
Could you make a video about UberEats System design?
Hi Ramon, great job and great explanation. I have just 1 question as of now...
> How does Uber or Lyft store the current co-ordinate info about drivers? Since this is the key thing on which the efficiency of the system as a whole depends a lot. Please do reply
Atleast Uber uses Google's S2 library that divides the map into grid and cells with a certain radius.
Shortest Distance problem is solved by Dijkstra Algorithm, which is not NP-complete.
Comments brief:
• not TSP
• "aarkive"
• this would fail the interview
Despite the above cons mentioned by others, this video really helped me a lot! Loved the part when you explained more about SOA, in such a clear manner. So my most sincere thanks! Keep up the good work!
Is caching/ in-memory DB like Redis used in this system and if so what are they used for and what is the advantage?
In general in-memory db as the name says put the database in memory (usually RAM) which is 1000 factor faster than conventional hard drive.
The benefit is obvious from above fact that information can be fetched, stored, updated very fast. So this is for latency and for end-user they have very good responsive system.
But also one drawback is if not implemented properly there would be data loss and sometimes it's impossible to recover this data. For this nowadays in-memory DB has around 3 default replicas and this lies in different GEO Data center so they can recover from any live copy.
This video is best example on how to fail system design interviews.
hi ramon. can you do a video on nextflix and app store like services?
+lokesh kumar The Netflix question is something I discuss with people I coach, you won‘t see it on the channel anytime soon ;) I noted the app store services, though!
thanks
How does real time matching happen for driver and users? sticky sessions?
Your explanation are very Amazing, thank you. If you can make a video about System design of Google Drive / Dropbox / system like complex cloud storage? is intersting also how netflix and youtube are made. Thanks for everything
12:30 You mentioned caching live data (such as driver's position) is not the best solution, then you suggest something like Redis, which is a cache mechanism. I think I missed something here.
Yeah I think the caching mechanism wasn't mentioned later and would love to know how you support millions of users with less latency.
Oh man, nice explanation. How do you know all that information about uber?
Check the Uber engineering blog. It’s all publicly available information.
hey, thanks for the awesome videos. Can you just attack links in the video descriptions for some further reading, who have less background about the basics of the topics covered l didn't know what low latency is. So, that would be helpful. Thanks
+Asutosh Sahoo My bad, I wanted to attach the Uber Engineering blog: eng.uber.com/category/architecture/
thanks :)
I expected quad trees to be covered at some point. Please do a video on how quad trees are stored and maintained
I have no clue what quad trees are, sorry!
Hi Ramon
I liked your other videos. But felt like in this one you just swayed away from the actual things which were needed to be discussed like Creating trips, Matching drivers with customers. And instead focused on Data warehouses, logging and such.
The first 28 minutes or so has nothing to do with what we should talk about if this question is actually asked in the Interview. And even the matching solution does not feel satisfactory. We should be concerned with just the Time as edge weight for example between 2 locations. Speed has nothing to do with it cos distance is also a factor. And I have limited knowledge about graph, but it seems like a Dijkstra algorithm problem to me.
But thanks for the effort.
Thank you very much for all these videos! Very helpful my friend.
Thanks a lot for sharing , excellent explanation👌👌👌👌👌👌👌👌👌👌👌👌👌👌👌👌👌👌👌👌
routing discussion around minute 24 is good
Very informative video.
Thanks for visualizing eng.uber.com/tech-stack-part-one/ and eng.uber.com/tech-stack-part-two/. But you can do better than that. I really like your twitter video, because that video hits the core, i.e., why is twitter AT A MASSIVE SCALE hard. I think Interviewers value insights more than mere regurgitation of engineering facts. Instead of diving into storage and SOA, you need to show how to cope with massive mapping and matching at Uber scale, i.e., what you wrote on the right side of the white board.
Lots of this sounds like it could work but is actually very inaccurate. I was there at Uber from 2015-2018.
that was an amazing explination thank you for your time
"30.35, Shortest path between 2 nodes in A graph is an NP-complete problem", LOL . It's a (|E|+|V|\log |V|) time Algo called Dijkstra's . Invented Almost a half-century ago . Longest path in NP-complete. . :P
I am 30 mins into the video. Still haven't gotten to how "1. Customer Driver Matching" is done. What does that problem have to do with multiple containers on a machine.
There are many aspects to a complex use case like this one. I can't cover everything in a 20-40min video and you won't have the chance to cover everything in an interview either. Sometimes you end up discussing the Ops side of things and sometimes a specific aspect of the business logic.
You are right. Sorry about my previous dismissive comment. Keep up the great work!
great info tmxmzigo
Better to describe this video as "graph algorithms " as compared to system design interview.
It’s not Traveling Salesman. It’s simple shortest path with positive weights.
Yes correct, I think I said „a familiar problem is TSP“. If not, sorry about that!
The problem of shortest path is not NP complete at all.... actually it is very convenient famous algorithm Dijkstra which is not NP complete..I can't understand why do you talk about TSP!!!?.... and then NP complete...
I know this didn‘t come across clearly but all I said is that TSP is another familiar graph problem. I didn‘t say this ETA calculation is TSP
28:44 core modules / graph and mapping problems
Forgot the Kafka to Redis arrow...
Isn't it shortest path problem? Why is it traveling salesman?
He Chen I should have been clearer there: According to their engineering blog they did experiments using Djikstra or A* but because of their latency constraints it wasn’t a viable option. I mentioned TSP because calculations on graphs are hard problems and a good example is TSP because it was researched for a long time and there are many excellent resources out there covering it.
Yes, it's the shortest path problem. But if you look at "Uber Pool" it must be solved using TSP. And Dijkstra is not an NP-complete but TSP is.
can you please make a video regarding designing payment transaction system
please create one video on designing of third party payment gateway.
Why is there a Dislike for Knowledge?
what is KA(Foreign Key)A?
KAFKA
You are talking about Service based architecture. Not servicd oriented architecture. Those are differentZ
as former uber eng, I have to say the high level is ok, things about the graph are wrong, some key components are missing ~~~~
Don't forget to check out www.successintech.com
Sorry, but you didn't do any system design, nevertheless a good discussion, but it was not what I was searching for.
Less of a system design and more of KT on uber infrastructure
🚩🚩🚩🚩🚩🚩🚩🚩🚩🚩🚩🚩🚩PLEASE FOLKS, FOR YOUR SAFETY AS THE DRIVER, PLEASE DO NOT PICK UP ANY PASSENGER WITH A LOW RATING (4.7 or lower). Safety over “rights”…Also!!! Please don’t pick up anyone who uses STREET NAMES ESPECIALLY LATE NIGHT/EARLY MORNINGS IN BAD OR POOR NEIGHBORHOODS!!!! THEY HAVE LOW RATINGS FOR A REASON!!!!!!🚩🚩🚩🚩🚩🚩🚩🚩🚩🚩🚩🚩🚩
Not recommended! This video contains unnecessary details and miss the gist of Uber like how would you maintain direction and updating of driver's position, when he is free etc.
How do you suggest covering these topics?
@@aditibhatnagar6970 the maps are usually represented based on grids of definite size and then the location of driver keeps on updating after certain time. Check out the videos from Uber engineering on CZcams.
@@akr22075 thanks!
You're assuming my intent. The intent is to become familiar, understand the type of granularity and detail that might be expected. To analyse an answer to help provoke ideas. It's not to memorise an answer. By failing to provide that level of detail I don't feel I have any basic level of confidence on what to expect or what to focus on. Examples are a form of experience. If people wanted to use them by rote then I'd be just as critical as you. That seems intellectually dishonest and unethical.
how to fail system design 101
"Archive" is pronounced "ar-kive"
It’s pronounced ‘AarKive’
Noted! :)