System Design Interview: TikTok architecture with

  • Added on 10 June 2024
  • We attempt to design a large-scale distributed video hosting platform like TikTok or Instagram Reels.
    The engineering involved in building these systems is complex, and our attempt does not (even nearly) cover all the challenges that these engineering teams face. We instead have a mock system design interview setup. Yogita will have 45 minutes to design an architecture that can scale, is performant, fault-tolerant, and meets the functional requirements.
    00:00 Intro
    00:34 Problem Statement
    01:24 Requirement listing
    04:00 Capacity Estimation
    06:34 Design skeleton APIs
    08:34 Choosing datastores
    12:10 Comparing datastores
    19:16 Ingestion Engine
    24:21 Video pipeline
    30:59 Last mile delivery
    33:46 What is a CDN?
    35:52 Network Protocol
    38:03 End to end request flow
    39:54 Caching
    41:19 Evaluation and verdict
    45:03 Final Architecture
    Yogita's Channel (sudoCODE): / @sudocode
    InterviewReady: interviewready.io/?_aff=SUDOCODE
    Social Media:
    Github: github.com/coding-parrot/
    Instagram: / applepie404
    LinkedIn: / gaurav-sen-56b6a941
    Twitter: / gkcs_
    #SystemDesign #InterviewReady #SoftwareEngineering

Comments • 723

  • @gkcs
    @gkcs  2 years ago +61

    If you are preparing for a system design interview, try get.interviewready.io.
    All the best 😁

    • @karunagadde
      @karunagadde 2 years ago +4

      S3 is not file storage; it is object storage.

    • @vishal733
      @vishal733 2 years ago

      Hi. Could you please share the name of the online tool you are using for collaborating?

    • @ManishSharma-pe1jf
      @ManishSharma-pe1jf 2 years ago

      @vishal733 Most online meeting services, such as Webex and Zoom, have a built-in whiteboard.

    • @sayantangangopadhyay669
      @sayantangangopadhyay669 2 years ago +4

      I have two questions about the final architecture diagram. First, why is the raw video sent directly from ingestion to S3? Shouldn't S3 only receive the final video after the workers have processed it? Second, why does the arrow point from the devices to the CDN instead of from the CDN to the devices?

    • @tanvirbinazam548
      @tanvirbinazam548 2 years ago +1

      What software is used for drawing in this video?

  • @paperguns115
    @paperguns115 Před 2 lety +6

    Thank you both for putting this together and providing this content openly. This is very helpful for those trying to prepare for this exact type of interview scenario and who might not be familiar with the format. Excellent job!

  • @vibhoragarwal2935
    @vibhoragarwal2935 Před 2 lety +28

    Scrolling tiktok for 45 min. - No
    Watch whole video for 45 min. - Yes, it's great.

  • @rashmendrasai496
    @rashmendrasai496 2 years ago +140

    These kinds of mock discussions on SD are really helpful. They show the viewer a thought process for dealing with such questions. Kindly do more videos of this kind ...

  • @sachin_getsgoin
    @sachin_getsgoin Před 2 lety +26

    Very detailed, touches very important system design aspects. Gives many pointers for further research!
    A zillion Thanks!

  • @sandeepg1983
    @sandeepg1983 2 years ago +88

    Another awesome delivery, thanks Gaurav.
    One thought: we increased the storage to ~6x to account for different resolutions and formats. We could avoid that by introducing two entities in the system. First, to avoid multiple formats, we can provide users with a dedicated video player that understands only our format. Second, a resolution manager placed before the streaming engine can upgrade or downgrade the resolution according to the user's bandwidth or request.
    Take Netflix and YouTube as examples: they have their own media players that understand their recording format. Yes, one extra task is converting uploaded videos to the application's format at upload time, but that pays off by saving the 6x storage cost.
    Resolution can also be handled at runtime in two ways:
    - Always keep a high-resolution copy and downgrade it at runtime before serving it to the user. The downside is the extra storage for the high-resolution copies.
    - Always keep a low-resolution copy for reference, along with some pixel pattern files used to convert the low-resolution copy to high resolution at runtime. The upside is that storage cost drops significantly.
    To keep conversion performant, a dedicated system with predefined resolution converter filters can work.

    • @gkcs
      @gkcs  Před 2 lety +11

      Brilliant points, thanks!

    • @shirsh3
      @shirsh3 2 years ago +1

      It would also be a good idea to take a look at ffmpeg and how "ts" files are created.

    • @edwardspencer9397
      @edwardspencer9397 Před 2 lety +3

      Yes it is common sense to create your own video player which supports all devices instead of creating 20 formats lol.

    • @lhxperimental
      @lhxperimental 2 years ago +4

      @edwardspencer9397 It's not just about creating an app that can play video. You'll of course have an app. Different formats have different properties. Some have small file sizes but need hardware acceleration to perform well, which may not be available on all devices. So even if you create your own player, it will fall back to software decoding, which is slow: users will complain about phones getting warm, high battery consumption, and sluggish performance. Instead, you create different formats, each optimized for a particular family of hardware. There can always be a basic format as a fallback, but you should cover the large percentage of devices with formats optimized for them.

    • @edwardspencer9397
      @edwardspencer9397 2 years ago +1

      @lhxperimental Large percentage of devices is no longer true. Businesses always prefer those who have medium / high end phones/devices capable of hardware acceleration because all the others owning low end phones are mostly poor people who have no intention to spend any money on subscriptions or visit advertisers. So even if a poor guy uninstalls something due to overheating issues it shouldn't be a problem.

  • @yagyanshbhatia5045
    @yagyanshbhatia5045 2 years ago +44

    A few ideas!
    - Most requests are for videos that are trending, and trends die in about a month. So instead of storing all the transcoded files, we can have a live transcoder and store its results in a cache (or CDN) with a TTL of about a month (the exact time can be decided by data analysis). Twitter did this and was able to save millions on storage costs.
    - We can keep live websockets open with online users, so that whenever the video is ready we can notify them, and maybe also the users who were tagged or who are very engaged with an account.
    - Instead of dividing videos into chunks after receiving the whole video, let the client do the chunking and upload only chunks. This would result in far fewer failures: if an upload fails after 95% of the video has been sent, you don't need to re-upload the entire file.
    - Maybe have caches on top of the databases.
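
The client-side chunking idea above can be sketched roughly as follows. This is a minimal in-memory simulation, not the protocol used in the video or by any real platform: `UploadSession`, `split_into_chunks`, `upload`, the `fail_at` switch, and the 512-byte chunk size are all hypothetical names and values chosen for illustration. The point it shows is that after a failure the client only sends the chunks the server is still missing.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class UploadSession:
    """Server-side record of which chunks of one video have arrived (hypothetical)."""
    total_chunks: int
    received: Dict[int, bytes] = field(default_factory=dict)

    def put_chunk(self, index: int, data: bytes) -> None:
        # Re-sending a chunk is harmless: the write is idempotent.
        self.received[index] = data

    def missing(self) -> List[int]:
        return [i for i in range(self.total_chunks) if i not in self.received]

    def assemble(self) -> bytes:
        if self.missing():
            raise ValueError("upload incomplete")
        return b"".join(self.received[i] for i in range(self.total_chunks))


def split_into_chunks(video: bytes, chunk_size: int) -> List[bytes]:
    """Client-side chunking: fixed-size slices, the last one may be shorter."""
    return [video[i:i + chunk_size] for i in range(0, len(video), chunk_size)]


def upload(session: UploadSession, chunks: List[bytes],
           fail_at: Optional[int] = None) -> int:
    """Send only the chunks the server is missing; return how many were sent.

    fail_at simulates a dropped connection just before that chunk index.
    """
    sent = 0
    for index in session.missing():
        if fail_at is not None and index == fail_at:
            break  # simulated network failure
        session.put_chunk(index, chunks[index])
        sent += 1
    return sent


video = bytes(range(256)) * 40              # 10,240 bytes of stand-in data
chunks = split_into_chunks(video, 512)      # 20 chunks
session = UploadSession(total_chunks=len(chunks))

first_attempt = upload(session, chunks, fail_at=19)  # connection drops at 95%
still_missing = session.missing()                    # only the last chunk
second_attempt = upload(session, chunks)             # resume sends one chunk
```

In a real system the session state would live on the server and the client would ask for the missing chunk list before resuming; here both sides share one object to keep the sketch short.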

    • @VikashSharmaVS
      @VikashSharmaVS 1 year ago +3

      S3 also has multiple storage tiers. You can set a rule to move files to a lower tier after a set time, and to lower tiers still after that.

    • @mostaza1464
      @mostaza1464 1 year ago +1

      Agree with chunking the video on the client side!

  • @richakaur28
    @richakaur28 Před 2 měsíci

    You both are just too good!! I love the authenticity and simplicity. The actual interview does take this similar course. Keep up the great work.

  • @jesu9313
    @jesu9313 Před 2 lety +9

    one of the most valuable content in youtube for young IT engineers

  • @chostislas
    @chostislas Před 2 lety +18

    There should be more sessions like this. It's super helpful. I loved it!

  • @manoharkamath8561
    @manoharkamath8561 Před 2 lety +25

    I love this video and got to know atleast at a basic level the system design approach.

  • @anastasianaumko923
    @anastasianaumko923 Před rokem +1

    Awesome, guys! It is really valuable to see such interview in action. Feels like you are the one who is being interviewed. Good job, thank you! 🤩

  • @ShivamMishra-td3jz
    @ShivamMishra-td3jz Před 2 lety +16

    Two of my fav youtubers on system desigm

  • @sanjana8358
    @sanjana8358 Před 2 lety +4

    This was probably the best video so far. Please try to make more such videos

  • @spiritual5750
    @spiritual5750 2 years ago +29

    This video is so good. It's so helpful when talking to an engineering manager.

    • @pratikpatil5452
      @pratikpatil5452 2 years ago +7

      Liar, it's nowhere near real-world projects...!! Although they are really good, it only gives us an idea of an MVP and of how to crack interviews!! Real-world scenarios are much worse and terrifying👻😱!!

  • @riceball100
    @riceball100 Před 2 lety +6

    Thanks so much Sen-sei

  • @KCrimi
    @KCrimi Před 2 měsíci

    Kudos on this interview. So refreshing to see a mock sys design on youtube where the interviewer takes it seriously, challenges, questions and pushes the decisions of the interviewee.👏

  • @alpacino3989
    @alpacino3989 Před 2 lety +3

    Amazing video!!! Learnt a lot. The parallel workflow thing blew my mind. I thought it could be done later on, maybe post the original upload in a slower way. But that matrix thing was amazing!!

  • @kayalskettle4063
    @kayalskettle4063 Před 2 lety

    Excellent video ! Thanks Yogita for putting yourself out there for our benefit.

  • @prakharkhandelwal739
    @prakharkhandelwal739 Před rokem +4

    That was really amazing... like how smoothly she explains bits and pieces of the problem.
    loved it.
    Learned a lot.
    .
    .
    Thanks a lot for this content guyz.

    • @gkcs
      @gkcs  Před rokem +1

      You're very welcome!

  • @vinayshukla6316
    @vinayshukla6316 2 years ago +6

    Hey Gaurav, this is very helpful for freshers and people with 1-2 years of experience in this field, because this is how we deal with upper management. I always get those diagrams and do my implementation based on them, but only now do I know how they reach their conclusions about what needs to be done. Thanks for this. 👍

  • @skobanemusic5752
    @skobanemusic5752 Před rokem

    I LOVE THIS VIDEO!!! You brought a pro and the back and forth brings that dual insight

  • @sumanthvarada
    @sumanthvarada Před 2 lety +1

    Thanks a lot Gaurav for this extremely useful video. I must appreciate Yogita for this very detailed system design and component choices right from the queue, S3, CDN, Diff DB's, etc were awesome and especially the processing part of the video via workers. Thank you both!!

  • @balaji.bodkekar
    @balaji.bodkekar 2 years ago +11

    This is the way to learn how to do system design with respect to requirements.

  • @itsme1547
    @itsme1547 2 years ago +14

    By watching this video I have fallen in love with System Design 😅

  • @premmkrishnashenoy4070
    @premmkrishnashenoy4070 Před 2 lety +16

    Coincidentally Akamai CDN was down just a few days after this video was uploaded

  • @srinadhp
    @srinadhp Před 2 lety +3

    Great discussion. Yogita, huge respect. The way you explained the different choices you took, is an eye opener for people like me who is going to take the bull by horn soon. Subscribed to your channel as well. Thank you Gaurav.

  • @rodoherty1
    @rodoherty1 Před 2 lety

    Fantastic video, guys! Thanks so much for sharing! Very insightful!

  • @beautyofthenature7060
    @beautyofthenature7060 Před 2 lety +1

    It was too good! informative. Hoping to see more such videos. Thanks Gaurva and Yogita.

  • @jonlenescastro1662
    @jonlenescastro1662 Před 2 lety

    The best mock I saw in my 2 months studying for my interview.

  • @st3114rr
    @st3114rr Před 2 lety

    amazing, thank you both for this

  • @ParadiseQ
    @ParadiseQ 2 years ago +42

    There should be some questions asked upfront before diving in, such as "do we want video searching", "do we want to generate a news feed", "what about video sharing", "are users able to download videos", "are users able to follow other people", etc. After that we can focus on what the interviewer is really interested in.

    • @vikrantsai7
      @vikrantsai7 Před 2 lety +1

      ya i was wondering the same

    • @ashishprasad1963
      @ashishprasad1963 Před 2 lety +3

      That would be really a microservices part AFAIK. Scalable architecture is the first goal followed by additive services.

    • @surbjitsingh3261
      @surbjitsingh3261 Před rokem

      @@ashishprasad1963 correct

  • @ishikajain8143
    @ishikajain8143 Před rokem

    Great video...
    The way she used all of her info and Gaurav summarized, it is just great in a short time.
    Thank you

  • @rops009
    @rops009 Před 2 lety

    I'm just 10 minutes in the video and it's already great! Thank you for this! :D

  • @vaibhavdadas5372
    @vaibhavdadas5372 Před 2 lety +5

    When i started watching i thought ill quit in between but the session was so nice and non boring and interactive that I watched the hole video thanks a lot for this

    • @ashish7516
      @ashish7516 Před 2 lety

      this video was not on hole, are you sure watched this video only ?

  • @SheshagiriPai
    @SheshagiriPai Před 2 lety

    This is so practical and relevant. Thank you.

  • @badrinarayanan5183
    @badrinarayanan5183 Před 2 lety

    Awesome stuff ! Thanks for this, Gaurav !

  • @amitdeshwal2860
    @amitdeshwal2860 Před 2 lety +1

    This was very informative, thank you !

  • @pravaskumar7078
    @pravaskumar7078 Před 2 lety +3

    Excellent session very helpful..u guys r actual heroes for dev like us..

  • @aadeshsharma0001
    @aadeshsharma0001 Před 2 lety

    really enjoyed the session and also learned new things, keep uploading more

  • @kameshkamesh9953
    @kameshkamesh9953 Před rokem

    One of the best videos to understand system design. Thanks guys

  • @ANDRYAN182
    @ANDRYAN182 Před 2 lety

    this is so good, thank you Gaurav and Yogita!

  • @sololife9403
    @sololife9403 Před 2 lety

    this video is just so precious . many thanks

  • @AgAnushree
    @AgAnushree Před 2 lety

    This video is amazing guys, great work

  • @avinashsakroji1811
    @avinashsakroji1811 Před 4 měsíci +1

    I am watching this video after almost 2 years. Thanks for uploading these kind of videos, They are very helpful.

    • @gkcs
      @gkcs  Před 4 měsíci

      Thank you!

  • @himanshugupta7010
    @himanshugupta7010 5 months ago

    Thank you so much, Gaurav and Yogita. I got to learn a lot from this particular video. Please keep posting such videos for the community. Thanks again.

  • @dapeng1919
    @dapeng1919 Před 2 lety +3

    i think the integrations of s3/cdn and cache/cdn are something i would like to learn more as a followup. Great video btw!

  • @rameshthamizhselvan2458

    I searched through so many videos for the difference between SQL and NoSQL without understanding the use cases, but here I got a clear picture of the use case for NoSQL. Thanks for this, keep posting your videos, especially Yogita.

  • @KaustubhPande
    @KaustubhPande Před 2 lety

    Great video as always Gaurav. Well done. Look forward to more such interviews. :)

  • @DeepakSingh-uf5vv
    @DeepakSingh-uf5vv Před 2 lety

    super informative , sudoCode effort was really great. Keep making more such content, lets take airbnb as next system.

  • @deepmp1
    @deepmp1 Před 2 lety

    Thanks Yogita and Gaurav, looking forward to more such videos

  • @khurram6700
    @khurram6700 2 years ago +2

    Thanks @gaurav for making such an extremely handy and useful video. Kudos for that. 👍
    Can we please have a part 2 of this video that discusses:
    1. Exception handling and reporting.
    2. A ballpark estimate for each component of this system.
    3. What strategy to use a month or a year later to decrease the load on the file system.

  • @neel3297
    @neel3297 2 years ago

    I read that some people have already talked about this. As another solution for this requirement, I feel you need not wait for all the formats and resolutions to become available one at a time. You can push them to a queue and let a worker group keep processing them, which allows more parallelism. This way the lower-resolution/smaller version of the video can be made available for preview while the uploader's UI shows that the rest are still being processed. Alternatively, the original video can be uploaded directly and the format and resolution work can be done later, since we often edit videos. Once all formats are available, the video can be made viewable to the public.
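
The queue-plus-worker-group idea above might look roughly like this. It is a sketch under stated assumptions, not the pipeline from the video: the resolution ladder, the two formats, `transcode`, and `process_upload` are hypothetical, and `transcode` only builds an output name instead of calling a real encoder. Jobs go into a priority queue keyed by resolution, so the smallest rendition (the preview) is picked up first, and the video is marked public only when every rendition is done.

```python
import queue
import threading

RESOLUTIONS = [240, 480, 720, 1080]   # assumed resolution ladder
FORMATS = ["mp4", "webm"]             # assumed output formats


def transcode(video_id: str, height: int, fmt: str) -> str:
    """Stand-in for a real transcoder call; returns the output object key."""
    return f"{video_id}/{height}p.{fmt}"


def process_upload(video_id: str, workers: int = 4) -> dict:
    jobs = queue.PriorityQueue()
    for height in RESOLUTIONS:
        for fmt in FORMATS:
            jobs.put((height, fmt))   # lowest resolution is dequeued first

    done = []
    lock = threading.Lock()

    def worker() -> None:
        # Each worker drains the shared queue until it is empty.
        while True:
            try:
                height, fmt = jobs.get_nowait()
            except queue.Empty:
                return
            key = transcode(video_id, height, fmt)
            with lock:
                done.append(key)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    expected = len(RESOLUTIONS) * len(FORMATS)
    return {
        "renditions": sorted(done),
        "preview_ready": f"{video_id}/240p.mp4" in done,
        "public": len(done) == expected,   # publish only when all are ready
    }


result = process_upload("v1")
```

In production the queue would be a durable message queue and the workers separate processes or machines; threads are used here only to keep the example self-contained.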

  • @hamidja1537
    @hamidja1537 Před rokem

    Many thanks for sharing. It is helpful to see the chain of thoughts, when architecting the solution.

  • @sanjayg2686
    @sanjayg2686 2 years ago

    Wow, this was really great. I had been waiting a long time for this kind of video to understand in detail how system design discussions are done, which you did. Thank you both so much, and I request you to come up with similar videos for different complex use cases like banking or insurance.

  • @sankalparora9374
    @sankalparora9374 Před rokem

    Very helpful.
    Have used all the knowledge gathered so far in the playlist.
    Thanks for sharing this discussion!

    • @gkcs
      @gkcs  Před rokem

      You're welcome!

  • @komalgupta8558
    @komalgupta8558 Před 2 lety

    Very helpful discussion around databases. Thanks Yogita and Gaurav!

  • @chetanmotamarri6942
    @chetanmotamarri6942 Před 2 lety +3

    This is really informative. Good job folks. Looking for more sessions like these.

  • @Marcus-yc3ib
    @Marcus-yc3ib Před 2 lety

    I learned a lot from this video. Thank you very much.

  • @shridhar_rao
    @shridhar_rao Před 2 lety

    More of this please! ♥️

  • @preetiirrothi744
    @preetiirrothi744 1 year ago +8

    Great video! One piece of feedback: I didn't see the 1.2TB figure you calculated being used. A translation into how many servers (with resources like CPU, RAM, disk, IO, etc.) the ingestion pipeline and storage would need would have been helpful. Some interesting scenarios, like thundering herd and data compression to reduce cost, would also have been of great help. And don't you think putting all the videos in the CDN would be costly? There should be some strategy based on popularity/recency/TTL to upload videos to and remove them from the CDN.
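
As a rough illustration of how the capacity figure could be carried forward, the arithmetic below starts from the ~1.2TB of raw video per day and the ~6x multiplier for resolutions and formats that are quoted in this thread. The replica count and the per-node disk size are assumptions made only for this sketch, so the resulting node count is illustrative rather than a sizing recommendation.

```python
import math

DAILY_RAW_GB = 1_200          # ~1.2 TB/day of raw uploads, the figure quoted in this thread
RENDITION_FACTOR = 6          # ~6x for resolutions and formats, as quoted in another comment
REPLICAS = 3                  # assumption: copies kept for fault tolerance
NODE_CAPACITY_GB = 100_000    # assumption: 100 TB of usable disk per storage node
SECONDS_PER_DAY = 86_400


def stored_gb(days: int) -> int:
    """Total stored data after `days` of ingestion, in GB."""
    return DAILY_RAW_GB * RENDITION_FACTOR * REPLICAS * days


def storage_nodes(days: int) -> int:
    """Storage nodes needed to hold `days` of data, rounded up."""
    return math.ceil(stored_gb(days) / NODE_CAPACITY_GB)


def average_ingest_mb_per_s() -> float:
    """Average raw upload rate; real traffic will peak well above this."""
    return DAILY_RAW_GB * 1_000 / SECONDS_PER_DAY


per_day = stored_gb(1)                              # 21,600 GB
per_year = stored_gb(365)                           # 7,884,000 GB, about 7.9 PB
nodes_year_one = storage_nodes(365)                 # 79 nodes
ingest_rate = round(average_ingest_mb_per_s(), 1)   # 13.9 MB/s on average
```

The same structure extends to CPU or bandwidth estimates by swapping in the relevant per-node limits, which the thread does not provide.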

  • @cloudpachehra1113
    @cloudpachehra1113 Před 2 lety

    Amazing ....u guys rock...thanks for sharing , waiting for more 🙂🙂

  • @curious1731
    @curious1731 Před 2 lety

    Very good for some one who is interested in designing solutions...hits the basics really hard.

  • @26goutam
    @26goutam Před rokem

    One of the best video on this channel.

  • @harshkumar-qg1rh
    @harshkumar-qg1rh Před 2 lety

    Hey Gaurav,
    Love to see this amazing and informative video.
    Please make more mock interviews video.
    All the best and Happy Deepawali 💥

  • @shravandhar6169
    @shravandhar6169 2 years ago +14

    Great take at the design problem. :)
    However I'd have a different approach for replication. We're replicating the video in s3 for 2 reasons:
    1. Fault tolerance
    2. Latency due to geographical location
    I'd suggest to replicate to far fewer s3 locations and that too only for (1).
    To tackle (2) we can have this approach -->
    1. Buffer around 1 second or so of the video on the device upfront.
    2. When user starts watching the video, then lazily load the rest of the video in chunks.
    The buffering strategy further depends on (to name a few):
    1. Device network quality
    2. Prediction of potential videos which user might want to watch based on some ranking algorithm
    Also, regarding hot video metadata caching:
    1. We can cache the API response at the CloudFront end.
    2. Redis can also be used alternatively.
    Redis might be a better approach here because it is distributed, and if the video is deleted/modified by the OP we can update it accordingly.
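
The metadata caching point above, where an edit or delete by the OP must be reflected promptly, is commonly handled with a cache-aside pattern plus explicit invalidation. The sketch below assumes that pattern; it uses an in-process dictionary as a stand-in for Redis and makes no real Redis calls. `MetadataCache`, `load_from_db`, the 300-second TTL, and the sample record are hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class MetadataCache:
    """Cache-aside store for hot video metadata; a dict stands in for Redis."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, video_id: str, load: Callable[[str], dict]) -> dict:
        entry = self._entries.get(video_id)
        now = time.monotonic()
        if entry is not None and entry[0] > now:
            return entry[1]                    # cache hit
        value = load(video_id)                 # miss or expired: read the database
        self._entries[video_id] = (now + self.ttl_seconds, value)
        return value

    def invalidate(self, video_id: str) -> None:
        # Called when the owner edits or deletes the video.
        self._entries.pop(video_id, None)


database = {"v1": {"title": "First cut", "url": "cdn.example/v1"}}
db_reads = {"count": 0}


def load_from_db(video_id: str) -> dict:
    db_reads["count"] += 1
    return dict(database[video_id])


cache = MetadataCache(ttl_seconds=300)
first = cache.get("v1", load_from_db)
second = cache.get("v1", load_from_db)     # served from the cache
reads_before_edit = db_reads["count"]

database["v1"]["title"] = "Final cut"      # the OP edits the video
cache.invalidate("v1")
third = cache.get("v1", load_from_db)      # reloaded with the new title
```

The TTL bounds how stale an entry can get if an invalidation is ever missed, which is why both mechanisms are kept.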

    • @kanuj.bhatnagar
      @kanuj.bhatnagar 2 years ago +2

      "1. We can cache the api response at cloudfront end." -> AWS has the Global Accelerator for this purpose. It's costly, but if you're ingesting ~1.2TB of videos every day, you can afford it.

  • @shamkantdesale8994
    @shamkantdesale8994 Před 5 měsíci +2

    Thanks Gaurav Sen & Yogita for informative contents. You guys are great. I was looking for such videos since long time. Finally found one. Thanks again.

    • @gkcs
      @gkcs  Před 5 měsíci

      Our pleasure!

  • @tirupatirao7521
    @tirupatirao7521 2 years ago +1

    Thank you Gaurav for the video. This kind of interactive video raises more and more questions that help in understanding SD.

  • @cewlguy
    @cewlguy Před 2 lety

    Awesome thanks Gaurav and Yogita 👍

  • @ChronicPassion
    @ChronicPassion Před 2 lety +30

    Amazing video....lot of questions were addressed. This duo should do a video series covering other case studies like :
    stock broker platform , uber , whatsapp etc

    • @gkcs
      @gkcs  Před 2 lety +3

      czcams.com/video/vvhC64hQZMk/video.html

  • @lonewolf2547
    @lonewolf2547 Před 2 lety +1

    amazing video...You should do videos like these more often....

  • @matrixRule127
    @matrixRule127 Před 2 lety

    Long time subscriber of Yogita's channel here!

  • @arunprasath9586
    @arunprasath9586 Před rokem +1

    She came really prepared for this question! Didn’t she 😂 she was playing back what she prepped really nicely for this video. Great stuff folks 👍

  • @ayazherekar
    @ayazherekar 1 year ago

    Loved it... Thanks a lot... So much knowledge in a 45 min video.

  • @sudhirchoudhary5323
    @sudhirchoudhary5323 Před 2 lety

    This video is very informative , thanks to both of u .

  • @tanyarajhans7
    @tanyarajhans7 Před 2 lety

    Wow, this is so awesome!

  • @nextgodlevel4056
    @nextgodlevel4056 Před 2 lety +1

    This is my first system Design video that I watch till end 😅

  • @hakimbencella4242
    @hakimbencella4242 Před 2 lety

    Thanks a lot for this awesome content 🙏

  • @pawandeepchor89
    @pawandeepchor89 Před 2 lety

    Very well designed ... Loved it 👍

  • @abhisekpatnala
    @abhisekpatnala Před 2 lety +3

    This was really nice discussion, AWS has got a good endorsement…. On a lighter note

  • @raj_kundalia
    @raj_kundalia Před rokem

    Thanks for this!

  • @vinaygupta2369
    @vinaygupta2369 Před 2 lety

    Good one, @yogita explained very well.

  • @joshiadvait8
    @joshiadvait8 Před 2 lety +2

    Ultimate knowledge 🔥

  • @joyjitchakrabarti9092
    @joyjitchakrabarti9092 Před 2 lety

    Very Informative! Thanks for sharing

  • @sarathfromsaudi
    @sarathfromsaudi Před 2 lety

    Fabulous video.. Thank you @Gaurav and @Yogitha

  • @vamsikrishnapasupuleti7443

    Inspired me to think about IT in a significant way for the first time

  • @empr1ze
    @empr1ze Před rokem

    Thanks, good video that explains how the world's most popular app works

  • @pavangrandhi
    @pavangrandhi Před 2 lety

    Very useful video! Thank you

  • @architjain5108
    @architjain5108 Před 2 lety

    This concept of video is awesome

  • @tirthvora8591
    @tirthvora8591 Před 3 měsíci

    wow the end-to-end request flow was really smart, as we're just returning the list of metadata it'll be fast and metadata will have actual video link too

  • @akashdeepwadhwa5828
    @akashdeepwadhwa5828 2 years ago

    Hi, first of all, thank you both so much for sharing how things work. I wish you the best for the future.

  • @ishankagarwal7798
    @ishankagarwal7798 Před 2 lety

    Great video. Made me like and subscribe within 3 mins

  • @amityanarayan8808
    @amityanarayan8808 2 years ago

    Gaurav sir, you got clean bowled. The interviewer was impressed throughout. Thanks so much for the efforts.

  • @merxxibeaucoup9093
    @merxxibeaucoup9093 Před 2 lety

    Wow very very educative !! Big ups !!

  • @igorburilo3937
    @igorburilo3937 Před 2 lety

    Great job, thanks!

  • @sumeetbasu1526
    @sumeetbasu1526 Před 2 lety +1

    Great discussion...The most important parts starts at 19:20 and 38:04 to be specific

  • @sudarshanrbhat7686
    @sudarshanrbhat7686 Před 2 lety +1

    We want more of these mock interviews plz..

  • @avicool08
    @avicool08 Před rokem

    Super one, good work you both 👍

  • @vishalabhang1152
    @vishalabhang1152 2 years ago +7

    Instead of uploading files through the API,
    we can upload them directly to S3 using a presigned S3 URL.

  • @avinashb4485
    @avinashb4485 1 year ago

    The idea of splitting the video file into chunks and processing them in parallel is really interesting, and I feel it is fundamental to processing input in general.

    • @rajeevrp1500
      @rajeevrp1500 1 year ago +2

      How does that happen exactly, by the way? Do you literally split a 1 MB file into three 333 KB files, convert them using a file format converter like FFmpeg, and then merge them again?