How Roblox Went Down For 73 Hours

  • Published 9 July 2024
  • A look into what happened behind the scenes during the longest outage in Roblox history.
    Sources:
    blog.roblox.com/2022/01/roblox-return-to-service-10-28-10-31-2021/
    www.hashicorp.com/resources/how-we-used-the-hashistack-to-transform-the-world-of-roblox
    roblox.fandom.com/wiki/2021_Roblox_outage/
    roblox.fandom.com/wiki/Timeline_of_Roblox_history/2016#August_2016
    news.ycombinator.com/item?id=30013919
    raft.github.io/
    www.lmdb.tech/media/20130329-devox-MDB.pdf
    www.lmdb.tech/doc/
    db.cs.cmu.edu/mmap-cidr2022/
    czcams.com/video/HDOipdFPbB4/video.html
    Chapters:
    0:00 Intro
    0:33 HashiStack Explanation
    4:47 Outage Investigation
    8:20 Root Causes Found
    11:30 Return to Service
    12:19 Slow Leaders
    15:56 Resolution
    Corrections:
    - At 9:44, the video describes Go's default channel incorrectly. A default (unbuffered) channel has a buffer size of 0 and holds no items: a send blocks until another goroutine is ready to receive the value. The illustration in the video shows a **buffered channel of size 1**; however, the overall point still stands.
    Music Credits:
    - Firecracker by LEMMiNO (czcams.com/video/ulfoU2MziOc/video.html)
    - Impact Prelude by Kevin MacLeod
    - We're Finally Landing by Home
  • Science & Technology

Comments • 531

  • @sirgamsay3596
    @sirgamsay3596 23 days ago +3176

    Imagine just doing a hobby project to understand a piece of software and suddenly the complete Roblox infrastructure is built on it.

    • @AjaxGb
      @AjaxGb 23 days ago +848

      Open source developers: "Hey guys check out this thing I built in my spare time! It's not perfect but I'm making it freely available so other people can learn from--"
      Large corporations: "FREE?? 👀👀🥵🥵👀👀"

    • @JusticeNDOU
      @JusticeNDOU 23 days ago +55

      i thought this was a joke,

    • @xelspeth
      @xelspeth 23 days ago +107

      xkcd 2347

    • @shantilkhadatkar1195
      @shantilkhadatkar1195 23 days ago +3

      @@xelspeth what is xkcd

    • @michael_betts
      @michael_betts 23 days ago

      @@shantilkhadatkar1195 webcomic. type that text into Google and you'll get the comic.

  • @agnor9978
    @agnor9978 23 days ago +1487

    every time I hear that someone's hobby project caused a major outage somewhere I get the feeling that maybe big corporations should check what software they are built on and support its development/maintenance

    • @fisch37
      @fisch37 23 days ago +83

      XKCD 2347

    • @humza890
      @humza890 23 days ago +43

      Except that software relies on another software, which then relies on another software, which then relies on another software....
      It can turn into an endless loop

    • @agnor9978
      @agnor9978 22 days ago +47

      @@humza890 it can't: circular dependencies are usually rare, and you can stop following a dependency once you've already seen it. I also didn't mean that every company has to look through all of their dependencies and maintain them all, but maybe picking a few or doing an audit of some of them every now and then would be beneficial not only to them, but to the world as a whole
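
To make the "stop once you've seen it" point concrete, here is a minimal Go sketch of a dependency walk that keeps a visited set, so a cycle cannot turn into an endless loop. The graph, the package names, and the function `transitiveDeps` are all hypothetical and purely illustrative; they are not taken from any real project or package manager.

```go
package main

import "fmt"

// transitiveDeps returns every package reachable from root, breadth-first.
// The seen map ensures each package is visited at most once, so the walk
// terminates even when the (hypothetical) graph contains a cycle.
func transitiveDeps(graph map[string][]string, root string) []string {
	seen := map[string]bool{root: true}
	var order []string
	queue := []string{root}
	for len(queue) > 0 {
		pkg := queue[0]
		queue = queue[1:]
		for _, dep := range graph[pkg] {
			if seen[dep] {
				continue // already audited: stop here instead of looping
			}
			seen[dep] = true
			order = append(order, dep)
			queue = append(queue, dep)
		}
	}
	return order
}

func main() {
	// Made-up dependency graph; "c" depends back on "a", forming a cycle.
	graph := map[string][]string{
		"app": {"a", "b"},
		"a":   {"c"},
		"b":   {"c"},
		"c":   {"a"},
	}
	fmt.Println(transitiveDeps(graph, "app")) // prints [a b c]
}
```

The walk does work proportional to the number of packages and edges, which is why an occasional audit of a finite dependency tree is tedious but always terminates.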

    • @GoogleDoesEvil
      @GoogleDoesEvil 22 days ago +10

      The Unix philosophy of "do one thing" and link against a ton of dependencies was a mistake.

    • @bookle5829
      @bookle5829 22 days ago

      It's what FUTO stands for.

  • @c1ph9r
    @c1ph9r 23 days ago +1284

    the negative 900 million dollars hits hard 😭

    • @_GhostMiner
      @_GhostMiner 22 days ago +11

      Why? Trash game has trash income

    • @Luna5829
      @Luna5829 22 days ago +100

      @@_GhostMiner its not a game tho
      its a game engine and hoster

    • @_GhostMiner
      @_GhostMiner 22 days ago

      @@Luna5829 **H O S T*

    • @_GhostMiner
      @_GhostMiner 22 days ago +7

      @@Luna5829 ackhually

    • @ducksongfans
      @ducksongfans 22 days ago +84

      @@_GhostMiner plenty of trash games hosted on roblox, plenty of great ones

  • @ratm0
    @ratm0 23 days ago +1462

    "A massive company with ... -$924 million net income" 💀

    • @klafbang
      @klafbang 23 days ago +181

      "Each minute of downtime costs us negative $1750, this must be fixed ASAP!"

    • @kosmonautofficial296
      @kosmonautofficial296 22 days ago +6

      @@klafbang lolol

    • @Mempler
      @Mempler 22 days ago +2

      That is absurd lmao

    • @zaper2904
      @zaper2904 22 days ago +66

      @@klafbang So does that mean they were earning money when they were down? 🤔

    • @LibertyMonk
      @LibertyMonk 22 days ago

      @@zaper2904 no, because they still had expenses (developers trying to fix the servers) but reduced income (no microtransactions available).

  • @k4l1hm4n
    @k4l1hm4n 23 days ago +846

    Turns out, this video could be a great introduction to modern backend architecture and development.

    • @SrKinko
      @SrKinko 23 days ago +98

      I think all of his videos are a good resource for understanding different architectures and subsequently how fragile they can be lol

    • @wangjiefan8939
      @wangjiefan8939 22 days ago +12

      I worked at a global e-commerce company a year ago and their platform infrastructure is pretty similar, down to their use of etcd and go channel spaghetti 💀

    • @百合仙子
      @百合仙子 21 days ago

      and a great counter-example for troubleshooting....

    • @bounceysteve
      @bounceysteve 13 days ago

      the leaks are too

  • @erie7452
    @erie7452 13 days ago +27

    Crowdstrike video when?

  • @yeetyeet7070
    @yeetyeet7070 23 days ago +271

    github repo: "it was a toy project never meant for production"
    multibillion dollar company: "YAYEET"

  • @ccccy-o7x
    @ccccy-o7x 23 days ago +196

    Hi Kevin, amazing content as always! One minor correction @9:54 tho: a Go unbuffered channel's capacity is 0, not 1, which means the sender blocks until a receiver takes the value. What the video shows @9:54 is actually a buffered channel with capacity 1 (e.g. the result of make(chan string, 1)).
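
The difference described above can be seen in a few lines of Go. This is only a small sketch: the helper `trySend` is a made-up name for illustration, and it uses a `select` with a `default` case so the program reports whether a send would have blocked instead of actually hanging.

```go
package main

import "fmt"

// trySend attempts a non-blocking send on ch and reports whether the value
// was accepted. It is a hypothetical helper, not part of any library.
func trySend(ch chan string, v string) bool {
	select {
	case ch <- v:
		return true
	default:
		return false // a plain `ch <- v` would have blocked here
	}
}

func main() {
	unbuffered := make(chan string)  // capacity 0: what Go gives you by default
	buffered := make(chan string, 1) // capacity 1: what the illustration shows

	fmt.Println(cap(unbuffered), cap(buffered)) // prints 0 1

	fmt.Println(trySend(unbuffered, "a")) // false: no receiver is waiting
	fmt.Println(trySend(buffered, "a"))   // true: the buffer had a free slot
	fmt.Println(trySend(buffered, "b"))   // false: the buffer is now full
}
```

In both cases a slow or absent receiver eventually stalls the sender; the unbuffered channel just stalls it one item sooner.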

  • @mrdabup
    @mrdabup 23 days ago +239

    I still remember the day it went down. People were blaming Chipotle (an American fast-casual chain) because it had an event that same day where you could claim a free burrito, and they suspected the outage was due to a mass influx of players. I (and a bunch of other people) knew that the influx wasn't the issue. At the end of the day, it was a fun journey (more or less, with the conspiracies, people correctly guessing months before this outage that it would go down for 3 days, and YouTubers just milking the outage). Thank you for making a video about this.

    • @frezzingaces
      @frezzingaces 23 days ago +8

      Wait. How in tf could Chipotle's traffic affect Roblox's servers. What's the theoretical connection?

    • @baribari1000
      @baribari1000 23 days ago +30

      @@frezzingaces it was a sort of partnership between chipotle and roblox, so if you installed roblox and did a bunch of stuff you'd get a free burrito. I think that's what it was, roblox has done tons of these

    • @fitmotheyap
      @fitmotheyap 22 days ago +6

      Oh this happened during that time? Man the memes about the roblox crashes during its downtime were so enjoyable

    • @bruhlake
      @bruhlake 18 days ago +2

      @@baribari1000 Yeah, it was super easy too, you could do it in like 2 minutes on a new account, it gave you a free entrée instead of a free burrito, so you could actually choose most meals you wanted. The few times they did the event with chipotle, I probably earned like 35 or so free entrees, which is pretty decent!

    • @argynews2825
      @argynews2825 17 days ago

      wasnt there also a massive adopt me update at the time which also probably caused a large increase of active accounts

  • @theprantadutta
    @theprantadutta 23 days ago +298

    This is one of the biggest challenges of modern programming: depending on various 3rd party packages without knowing what a package is, what it does, or whether it's even reliable, let alone what that package's own dependencies are and whether they are safe.

    • @Paulo27
      @Paulo27 22 days ago +18

      Also never update anything

    • @juniorwmg
      @juniorwmg 22 days ago

      @@Paulo27 *If it's not a security fix

    • @tbuk8350
      @tbuk8350 22 days ago

      Or HashiCorp, being a multi-billion dollar company, could just maintain the fucking project themselves instead of blindly using a 4-year-old abandoned pet project from some random person's GitHub page and trusting it to work in a large production environment.

    • @ironcanon4920
      @ironcanon4920 22 days ago +4

      And that's before the issues of relying on additional 3rd party companies to supply the correct 3rd party packages. Supply chain issues the whole way down.

  • @mahnibba2674
    @mahnibba2674 13 days ago +24

    Came here to look for crowdstrike, seems like im way too late🤣

  • @michaellin7936
    @michaellin7936 13 days ago +20

    Crowdstrike video incoming in 2 years

  • @andreyabrz
    @andreyabrz 13 days ago +22

    Well.. now we know the next video

  • @useruser-ti1og
    @useruser-ti1og 23 days ago +150

    This is like the XKCD of all of the world depending on a toy project someone abandoned 10 years ago

    • @imgladnotu9527
      @imgladnotu9527 22 days ago +12

      probably 2347... as someone mentioned in some comment above.....

  • @shalodey
    @shalodey 13 days ago +14

    IT global outage vid gonna go crazy

  • @Sam_Hue
    @Sam_Hue 13 days ago +10

    The Crowdstrike video is going to hit pretty hard

  • @levimatheri7682
    @levimatheri7682 12 days ago +12

    Waiting for the Crowdstrike outage video!

  • @sergelorenzvillasica2361
    @sergelorenzvillasica2361 13 days ago +7

    Can't wait for the CrowdStrike episode 😀

  • @arnavn2554
    @arnavn2554 13 days ago +8

    You gotta make a video about the CrowdStrike outage

  • @pdlbackup
    @pdlbackup 23 days ago +24

    Roblox players figuring out about the DNS steering and sharing ips for early access is kinda crazy 💀

  • @TheeSirRandom
    @TheeSirRandom 21 days ago +20

    Imagine how it must feel, starting a free project just as a hobby, and planning to abandon it eventually, then pretty much half the internet starts using it as an important building block to support the web. Now you're just sitting there, and have a choice to make. Stop maintaining the software, and pretty much break half the internet or keep going, getting zero thanks, and zero dollars for your work.

  • @mortred4144
    @mortred4144 13 days ago +12

    yo when is the CrowdStrike video coming

  • @ski3r3n
    @ski3r3n 22 days ago +77

    the kids enter angry
    the kids leave confused

  • @darthmaul5413
    @darthmaul5413 12 days ago +6

    can you do a video about the current CrowdStrike Outage?

  • @KieranMahoney
    @KieranMahoney 12 days ago +8

    WHENS THE CROWDSTRIKE EPISODE COMING OUT???

  • @nebufabu
    @nebufabu 23 days ago +170

    Whatever it took to make a video about a Roblox server crash and not use the "oof" SFX even once... I salute it.

    • @MartijnvanBerkel
      @MartijnvanBerkel 23 days ago +71

      It's on 6:45

    • @yeetyeet7070
      @yeetyeet7070 23 days ago +5

      @@MartijnvanBerkel gottem

    • @aze4308
      @aze4308 23 days ago +2

      6:45

    • @nebufabu
      @nebufabu 23 days ago +14

      @@MartijnvanBerkel I stand corrected. Frankly, using it only once is even more impressive.

    • @vincentschumann937
      @vincentschumann937 21 days ago

      i read this 2 seconds before the oof sound played, well done sir

  • @TheRealStevenPolley
    @TheRealStevenPolley 12 days ago +5

    Kevin Fang, big fan here. Please cover the clownstrike incident

  • @ChineseKiwi
    @ChineseKiwi 13 days ago +7

    Kevin, get busy and make the Crowdstrike video 😂😢

  • @bummbumm6
    @bummbumm6 23 days ago +158

    This happened in the middle of my friends sleepover, when we were COMPLETELY into Roblox. He pretty much just came to play it. We checked like every 5 minutes if it got better.
    We eventually just slept. THROUGH THE WHOLE THING
    Edit: Are some of you really watching videos on Roblox and just hate people in the comment section who used to like the game? Find something better to do jeez

    • @ProblematicParag0n
      @ProblematicParag0n 23 days ago +16

      Seems like you guys need to find better games

    • @N30ZUK1
      @N30ZUK1 22 days ago +2

      ​@@ProblematicParag0n Isn't your avatar from a ripoff of Minetest?

    • @Hellscaped
      @Hellscaped 22 days ago

      @@N30ZUK1 minetest is a clone of minecraft...

    • @tbuk8350
      @tbuk8350 22 days ago +41

      @@N30ZUK1 Calling Minecraft a ripoff of Minetest is the most sweaty nerd Redditor thing you could do

    • @dexahtheman
      @dexahtheman 22 days ago

      @@tbuk8350 Tbh nothing is correct here. Minetest is not trying to be minecraft it's trying to be a general purpose voxel game engine (check out it's other gamemodes there's some pretty unique cool stuff in there)

  • @yaakovwaxman4807
    @yaakovwaxman4807 23 days ago +18

    This is by far my favorite documentary channel on yt

  • @BananasAintCheap
    @BananasAintCheap 23 days ago +14

    It’s crazy how much of the internet as a whole is in the hands of solo developers who made a thing in their spare time for fun

  • @MaximumADHD
    @MaximumADHD 23 days ago +57

    Oh shit I was gonna suggest this as an idea, awesome to see that you did it!

    • @0x7f2c
      @0x7f2c 22 days ago +3

      Lol nice you're here

    • @glefyr
      @glefyr 22 days ago +4

      is that

    • @use2l
      @use2l 21 days ago +1

      @@glefyr hello call of duty black ops guy

  • @superbobsaget9000
    @superbobsaget9000 23 days ago +32

    Thank you for all the work you put into making this!!

  • @HarishDoredla
    @HarishDoredla 13 days ago +5

    Next video on Crowd Strike update causing global outage!!

    • @ChineseKiwi
      @ChineseKiwi 13 days ago

      It was Crowdstrike, not Microsoft

  • @_tylerkinney
    @_tylerkinney 23 days ago +3

    Thank you for this, been waiting for this one for awhile now!

  • @tekratek4077
    @tekratek4077 23 days ago +14

    Nice technical aspect of the outage!

  • @Aunarky
    @Aunarky 23 days ago +1

    I'm glad you made a video on that. I had no idea how it went down behind the scenes! :D

    • @Komas19Gaming
      @Komas19Gaming 16 days ago

      there was a blog post made after the outage

  • @iqmal
    @iqmal 13 days ago +4

    Great. Hopefully you'll make a video about Windows bsod due to CrowdStrike

  • @rusprice
    @rusprice 20 days ago

    Thanks! I submitted this in as a suggestion a while ago, never thought it’d be published.

  • @pompomaddons
    @pompomaddons 23 days ago +151

    KEVIN FANG JUST DROPPED A VIDEO ABOUT THE HALLOWS OUTAGE OH MY GOD

  • @Hopgop1
    @Hopgop1 23 days ago

    Man I love your videos, this was a particularly technical one, but still really well presented and interesting.

  • @i-am-linja
    @i-am-linja 22 days ago +11

    I'd imagine programmer Hell is just a bug like this which takes all of Eternity to fix, also it takes down the company's internal issue tracker and communication system.

  • @davidslevs
    @davidslevs 23 days ago +6

    Roblox is actually a bigger company than most think. Thanks for doing a video on it.

  • @zenobikraweznick
    @zenobikraweznick 22 days ago +1

    Amazing CGI as always, thanks !!!

  • @asmith7966
    @asmith7966 21 days ago +2

    Haven't finished the video yet, but this already makes me feel better about the half-day internet outage I fixed at work

  • @poketopa1234
    @poketopa1234 19 days ago

    Great great video, I seriously love the format and I learn so much

  • @xFrednet
    @xFrednet 23 days ago

    Awesome summary, as always. Thank you! :D

  • @Cmanorange
    @Cmanorange 22 days ago +1

    daily appreciation of kevin's visual style, i love how you're able to break down the language i might take for granted and make it easily followable

  • @patahgaming
    @patahgaming 23 days ago +40

    Saddest day ever for 7-year-olds, I hope they can recover from this 😢

    • @dagdnoob
      @dagdnoob 22 days ago

      😂😂😂😂😂😂😂😂😂😂😂

    • @gn2b445
      @gn2b445 15 days ago

      developers probably missed out on millions of dollars too!

  • @pitust
    @pitust 22 days ago +8

    9:45 "A default channel can only hold one piece of data at a time" It's actually even worse than this: an unbuffered channel also requires that this piece of data is received before a send can complete (!)

  • @MohamedAruham
    @MohamedAruham 23 days ago +8

    Damn I was waiting for this one

  • @Evercreeper
    @Evercreeper 22 days ago

    YAY GLAD YOU DROPPED THIS

  • @wormonastring6888
    @wormonastring6888 23 days ago +2

    Another super interesting well researched + explained video! As a back end game dev, thanks for the nightmares!

  • @TheGrimravager
    @TheGrimravager 23 days ago +12

    > And probably some machine learning and block chain for good measure
    lmao nice

  • @mwalton9526
    @mwalton9526 12 days ago +9

    How fast can you pop out a video? I think there might be something video worthy.

  • @hdgrove5567
    @hdgrove5567 22 days ago +2

    Love these videos please keep them coming!

  • @passenger175
    @passenger175 23 days ago

    Good work, these are both interesting from the tech perspective and just plain fun hah

  • @hasanpatel9029
    @hasanpatel9029 23 days ago

    The oof sound was a chef's kiss to this masterpiece of a video. Great work as always.

  • @Dudex11a
    @Dudex11a 22 days ago +2

    This video is very well executed!

  • @warw
    @warw 23 days ago +3

    Great video!

  • @_xord
    @_xord 23 days ago +12

    new kevin fang video
    today is a good day

  • @ElioAllen-sb6by
    @ElioAllen-sb6by 23 days ago +1

    I like your stuff keep it up make more security related stuff!

  • @ishan6771
    @ishan6771 21 days ago

    Well done as always

  • @flokibyarian6832
    @flokibyarian6832 20 days ago +1

    Thank you for the great information and entertainment video like always😊

  • @JustDeeevin
    @JustDeeevin 19 days ago

    Tons of love for captioning your videos❤❤

  • @3rdalbum
    @3rdalbum 17 days ago

    Another great video, I really enjoyed it.
    There's probably heaps of outages you can do next, but perhaps you could do a video on the "OpenOffice can't print on Tuesdays" bug?

  • @mat-hu5ys
    @mat-hu5ys 23 days ago +2

    love your vids! please make more

  • @frosty4769
    @frosty4769 22 days ago

    the goat's back with another banger

  • @gareth2021
    @gareth2021 23 days ago

    great video, thanks dude

  • @matze489
    @matze489 23 days ago +4

    its a good day when there is a new kevin fang video

  • @vash47
    @vash47 22 days ago +1

    your videos are quality over quantity

  • @Spiffycaius
    @Spiffycaius 23 days ago +2

    Man I remember when this happened this was crazy.

  • @ibis8566
    @ibis8566 22 days ago +1

    these videos make me feel like im watching some type of CSI crime documentary

  • @Viniter
    @Viniter 22 days ago +23

    I love this series. It's like true crime or airplane disaster videos, but it can be fun, because nobody really gets hurt. Except for big corporations and Roblox players, and well... screw them.

    • @hagangray8006
      @hagangray8006 22 days ago +10

      That’s a bit harsh on Roblox players… I mean most of them are like 9 years old

    • @absoultethings4213
      @absoultethings4213 22 days ago

      @@hagangray8006 if they aren’t 9 there’s a 50% chance they’re a predator or another kind of scum

    • @apersoniguess_
      @apersoniguess_ 21 days ago +6

      @@absoultethings4213 or… just normal people. Big shocker I know

    • @adityaramadhan1708
      @adityaramadhan1708 19 days ago

      @@apersoniguess_ impossible😱😱😱😱😱

    • @enthuscimandiri1640
      @enthuscimandiri1640 19 days ago

      unti money some rando get involved, yeahhh its really fun

  • @NoobieNoodle89
    @NoobieNoodle89 22 days ago +1

    I love the way that you explain these complex incidents. You deserve a 冰淇淋🍦

  • @ImTotallyTechy
    @ImTotallyTechy 23 days ago +2

    In life... you have roblox
    (another BANGER kevin fang video, cant wait for the next)

  • @thegammingbros6231
    @thegammingbros6231 12 days ago

    love this video makes everything understandable!

  • @H-E-S-C
    @H-E-S-C 19 days ago

    finally, a good video on the infamous outage

  • @heyjakeay
    @heyjakeay 23 days ago +27

    yo honey wake up, new Kevin Fang video to watch while at work

  • @greatcanadianmoose3965
    @greatcanadianmoose3965 22 days ago +1

    Always love kevin fang videos... but would you mind using Home's We're Finally Landing closer to the end of the video please? Thx

  • @GardenOfUna
    @GardenOfUna 23 days ago +1

    I don't understand a single thing but I'm so incredibly curious that I want to know more
    I genuinely really love this for some reason

  • @Mihacappy
    @Mihacappy 16 days ago +1

    Ah yes, that day in 2021 that i was working in Studio and the toolbox stopped working, and my ass almost had a heart attack because i thought i got banned.

  • @No-day-off
    @No-day-off 11 days ago +1

    Let’s go bro. CrowdStrike is giving free material to your next video.

  • @fang-penlin4482
    @fang-penlin4482 23 days ago +3

    Oh man, I love your channel so much. I can't wait to see the XZ backdoor video made by you, it's gonna be fun 😂

  • @5TC
    @5TC 23 days ago

    Wasn't expecting him to talk about this but man I remember when this happened

  • @randomazzy11
    @randomazzy11 22 days ago +3

    5:04 I also heard avatars broke before the whole game went out, and some players were able to play roblox but most of the scripts were missing so it was pretty unplayable. Is it because the game couldnt fetch those? Wow

  • @CiY3
    @CiY3 23 days ago

    Finally, a Kevin Fang video about an outage I was witness to.

  • @brawldude2656
    @brawldude2656 20 days ago

    These server incidents always feel like a surgery where you have to save a person in their current form ASAP

  • @siz1700
    @siz1700 22 days ago +1

    Nice! I wish Roblox had never recovered from that!

  • @Jmcgee1125
    @Jmcgee1125 23 days ago

    I paused the video when that perf screenshot came up. 5 seconds later I'm like "why the hell did nobody check this before?" We love lock contention.

  • @chawrx3
    @chawrx3 23 days ago

    STARTING THE DAY OFF NICE !

  • @leosh9026
    @leosh9026 17 days ago

    Thanks for the explanation dude

  • @Core533
    @Core533 5 days ago +4

    Waiting for the crowdstrike video

  • @imchillingbro
    @imchillingbro 17 days ago

    i remember this outage, me and my friends kept playing muck while waiting

  • @06NinjaKid06
    @06NinjaKid06 23 days ago +1

    best roblox video

  • @mementomori8856
    @mementomori8856 23 days ago

    GO mentioned!
    So you're telling me that I should continue to be paranoid about how every single line of code of my personal projects is not efficient or secure enough? Deal! Love this thank you!

  • @caduhidalgo4996
    @caduhidalgo4996 Před 23 dny +4

    Baby, wake up!
    New Kevin Fang lore just dropped 🎉

  • @soulsmwc
    @soulsmwc Před 23 dny +1

    Great!

  • @dogeextras
    @dogeextras 10 days ago

    My friend was stuck in a game during the roblox outage and saw lots of things happen

  • @BeaStScoPesHD
    @BeaStScoPesHD 22 days ago

    3 days to figure out that they turned on a bad feature a day ago is actually insane