Cloudflare Deploys Really Slow Code, Takes Down Entire Company

  • Added 2 June 2024
  • Cloudflare is back at it again with more regex and state machines.
    Previously on Cloudflare: • How One Line of Code A...
    Sources:
    blog.cloudflare.com/details-o...
    blog.cloudflare.com/introduci...
    swtch.com/~rsc/regexp/regexp1...
    www.regular-expressions.info/...
    cyberzhg.github.io/toolbox/nf...
    www.businessinsider.com/cloud...
    Chapters:
    0:00 Part 1: Intro
    0:51 Part 2: Regex
    2:29 Part 3: Deployment Process
    4:20 Part 4: Disaster Strikes
    6:25 Part 5: Root Cause
    12:22 Part 6: Aftermath
    Corrections:
    - Missed opening bracket [ in the domain name part of the expression 6:36
    - This particular regex is worst case quadratic, not exponential 8:30. The example right after w/ 1 million steps is exponential.
    - The DFAs at 10:40 and 11:50 should have the starting states marked as end states as well to properly match empty strings
    Music:
    - Nocturnal by LEMMiNO ( • LEMMiNO - Nocturnal (BGM) )
    - Smooth by Silent Partner
    - Encounters by LEMMiNO ( • LEMMiNO - Encounters (... )
    - Cipher by LEMMiNO ( • LEMMiNO - Cipher (BGM) )
    - Fine Dining by TrackTribe
  • Science & Technology

Comments • 746

  • @rockyvillano777
    10 months ago +1472

    If I had a nickel for every time quadratic complexity passed testing but blew up in prod I'd be rich

    • @allyourpie4323
      10 months ago +167

      What if every time quadratic complexity passed testing, you got a nickel for every time quadratic complexity passed testing?

    • @renakunisaki
      10 months ago +95

      If I had a nickel for every time I got a nickel...

    • @Bollibompa
      10 months ago +39

      If I got a nickel for every time someone says "If I got a nickel" I would be a gabbagillionaire!

    • @Aaron.Thomas
      10 months ago +19

      This is why we have and should have computer science and not just computer engineering - math is important.

    • @yaksher
      10 months ago +17

      This is exponential, not quadratic. If it were quadratic, there wouldn't be any problems.

  • @nwrocketman6438
    10 months ago +1625

    2:51 I like how the 1000x engineer just foreshadows all the events that are about to happen, and then approves the change.

    • @tehroflcopters
      10 months ago +103

      par for the course tbh

    • @1008OH
      10 months ago +283

      Yeah average senior software engineer moment tbh

    • @EgonFreeman
      10 months ago +256

      Yeah, because if you go "Hey guise, this might actually break in a very non-obvious way, let me run some sims..." you're usually ignored and the go-ahead is given _anyway._ Nobody wants to sit around waiting for That One Guy Who Keeps Envisioning Black Day Scenarios to finish their petty "I must be included as a vital part of this conversation" so-called-tests (because believe me, there are people who see it as such). Besides, production-level testing usually ends up being faster... :D

    • @jfbeam
      10 months ago +122

      @@EgonFreeman From experience... had they spoken up, they would almost certainly have been met with one of "who asked you?", "shut up, we're doing this!", or just ignored. Dozens of others didn't notice any issues, so who's going to listen to "that guy"?

    • @xhivo97
      10 months ago +25

      a true 1000x enginerd would not use regex lol

  • @CYXXYC
    10 months ago +554

    2:37 laughed my ass off at "delete master after the pull request is merged"

    • @ABaumstumpf
      10 months ago +35

      and there are people that would click that checkbox.

    • @Ultrajuiced
      10 months ago +53

      Chill, "master" is the dev branch here while "main" is the master branch of course.

    • @zerto111
      10 months ago +26

      @@Ultrajuiced Let's just agree that it's a funny easter egg in the video and laughing is justified.

    • @nicholasfinch4087
      10 months ago +4

      Glad to see others had seen that too 😂❤

    • @MrWarlock616
      10 months ago

      Yeah that was such a nice touch!😂

  • @xorinzor
    10 months ago +2400

    Despite all this, I still very much love Cloudflare especially because of their transparency. They always go into great depth explaining what happened, what they did, and how they resolved it.
    Many companies can learn a thing or 2 from them in that regard. Customers tend to have more faith in a company that just owns up to its mistakes rather than trying to have a PR department cover it up in nice words.

    • @StkyDkNMeBlz
      10 months ago +93

      @@Kabodanki Captcha? Google the "Privacy Pass" extension. It lets you skip the tests by doing tests beforehand.

    • @jacksoncremean1664
      10 months ago +35

      @@StkyDkNMeBlz they don't have captcha anymore, they use turnstile

    • @maskettaman1488
      10 months ago +115

      I love my state sponsored man in the middle

    • @returndislikes6906
      10 months ago

      I hate Cloudflare because it's trying to become a monopoly on internet ethics. It's not your job to pass judgement on what is allowed on the internet. Banning something that is illegal is fair. But banning because it's immoral according to them... yeah, I hate Cloudflare.

    • @returndislikes6906
      10 months ago +49

      @@StkyDkNMeBlz I really want to be tracked by Google across the internet with their corpo issued cryptographic IDs. You do understand what you are shilling?

  • @ejcx_
    10 months ago +538

    I was actually at Cloudflare in the room for Cloudbleed and this issue, in SF for Cloudbleed and happened to be in London for this one. The real story is much better than this. We were at lunch doing a tech talk in the lunchroom when someone grabbed the mic and announced we were having a P0. We stampeded back to our desks and got to work fixing it. The issue was obviously related to the WAF from the start and it was just a matter of cleaning up. Keep up the videos they are great

    • @Nayayom
      10 months ago +12

      So what's it like working at Cloudflare? :)

    • @0xggbrnr
      10 months ago +92

      @@Nayayom It’s actually really awesome. Tons of really smart, kind, curious people. Everything internally is about transparency, execution, and learning. Definitely engineering-centric, but also super product-focused in that the customer is always considered during meetings/talks/decisions.

    • @Nayayom
      10 months ago +19

      @@0xggbrnr sounds like a good place to work! Glad to hear that

    • @CarlosISoares
      10 months ago +11

      Ok, but it's not BETTER than the video story lol

    • @magno5157
      10 months ago +8

      What happened to the employee who made the regex?

  • @eldrago19
    10 months ago +35

    "Some programmers run into a problem and think, 'I will use regex to solve this!' Now they have two problems."
    - Zawinski

  • @mwissel
    10 months ago +169

    Love the little details, like the upside down cloudflare icon in Australia. Good job editing!

  • @greg_289
    10 months ago +403

    Worth mentioning that Cloudflare isn’t just a CDN. It’s predominantly used by most websites as a web proxy responsible for the majority if not all requests to the origin.

    • @emeraldbonsai
      10 months ago +21

      the webproxy is a cdn last i checked

    • @VeggieRice
      10 months ago +3

      users can elect their own dns service

    • @hoo2042
      10 months ago +19

      @@emeraldbonsai Typically a CDN serves static or at least mostly static data. A CDN may be implemented as a caching web proxy, but a web proxy can do a lot more than what usually falls under the definition of "CDN". In CloudFlare's case, they basically offer both and blur the line about which is which, which is fine since it's a blurry line, but the person you are replying to isn't wrong.

    • @hoo2042
      10 months ago +5

      @@VeggieRice DNS has nothing to do with what's being discussed here (aside from being an earlier step in the chain that would take you to the page's configured web proxy or CDN, of course, but equivalently so to saying "the user can elect their own browser").

    • @benhook1013
      10 months ago +7

      The distinction is useful here as the outage is much more impactful if your web pages won't even load themselves (because the web proxy is down) rather than just CDN assets not loading (which could be only large assets).

  • @evilsqirrel
    10 months ago +130

    I work in a cybersecurity administration space where regex is used all the time as a necessity. This is a story we tell people all the time to make sure they understand how important it is to make efficient regex.

    • @Wyvernnnn
      10 months ago +1

      The regex is fine there was no reason for the engine to backtrack on it

    • @Zei33
      10 months ago +12

      @@Wyvernnnn I disagree. The pattern didn’t make much sense. It was clearly missing something between the initial wildcard and non-capturing group. There’s never any reason to put two wildcards next to each other like that.

    • @Wyvernnnn
      10 months ago +4

      @@Zei33 Yeah that was weird, but it should still get O(n) in the end, that's the whole point of regular expressions (as long as you don't have capturing groups that can be re-used within the regex aka backtracking)

    • @Croz89
      9 months ago +1

      I can understand how it happened, even an experienced programmer can struggle to parse a regex by eye and it's easy to make something that's a resource hog without realising. It's certainly a lesson to test your regex thoroughly before release.

    • @Zei33
      9 months ago

      @@Croz89 oh yeah I’m constantly making mistakes. No one is perfect, that’s why debugging and beta testing exists. After over a decade of programming, I don’t make a lot of mistakes, but when I do they’re usually obscure cases or subtle logic errors. When you’re working with tens of thousands of lines of code and looking at them for 8 to 20 hours a day, mistakes are gonna happen.

  • @Thect
    10 months ago +209

    "like re2, which work by converting regex to a state machine, or fancy computer science flowcharts"
    Damn, I wish I could say this line to the professor who teaches the compiler course at my university lol

    • @mr_confuse
      10 months ago +17

      I had straight vietnam flashbacks when the statemachine came up lmao

    • @skyhappy
      10 months ago +5

      State machines were the most useless thing I learned 2nd year in "computational theory" class. Whole class was academic fluff.

    • @tissuepaper9962
      10 months ago +50

      ​@@skyhappy FSMs are used everywhere, they're the basic building block of most digital protocols and embedded systems. Definitely not "useless".

    • @DNX3M
      10 months ago +3

      I call them spaghetti meatballs. Feel free to use that one.

    • @arthurpenndragon6434
      10 months ago +27

      @@skyhappy average framework enthusiast with no understanding of computer science.

  • @violabrockman5284
    10 months ago +220

    It's important to note that re2 actually has other downsides compared to other regex engines, such as being unable to handle lookaheads and lookbehinds. This isn't just an implementation issue either: adding these operations actually makes regex strictly stronger than a finite state machine (instead it becomes a pushdown automaton). There's also a lot of fun math with finite state machines, where it turns out they're strictly equivalent to generating functions, which are basically power series where you don't care about convergence!

    • @Ceelvain
      10 months ago +5

      I think "look-around" assertions could still be implemented to run in linear time. As far as I know, back references is the only feature that can make the run time go exponential. In fact, matching regexes with backrefs is proven NP-hard.

    • @hoo2042
      10 months ago +18

      Lacking some of the more advanced PCRE features in order to make guarantees about the maximum runtime seems like the right compromise to make for a high-volume security frontend that sits between the global population and a large swath of the internet.

    • @Ceelvain
      10 months ago +20

      @@hoo2042 That's actually why Russ Cox developed RE2 in the first place. He made it for google code search (now defunct). You can't really be Google and expect tech people to only input well-behaved regexes. He has a very interesting series of articles named "Implementing Regular Expressions". I really recommend every developer to read them.

    • @tommihommi1
      10 months ago +2

      shouldn't regex expanded like this be called cfex instead, since it's, well, no longer a regular expression

    • @mustard96
      10 months ago +4

      @@hoo2042 The problem now is that they use it on every Google product. re2 is the regex engine of BigQuery and I’m stuck with these limitations. It doesn’t make sense in a data warehouse.

  • @royalepros669
    10 months ago +73

    my brain automatically shut down when you start explaining the regex...

    • @TMRick1
      10 months ago +2

      Same here dude. I can't understand how people are still relying on regex for such important aspects of the code. It's just mind-blowing that a firewall rule is managed with that in 2023.

    • @arielcg_
      10 months ago +14

      @@TMRick1 what alternative is there that is universally supported and has the same level of flexibility for how "compact" it is?

    • @tissuepaper9962
      10 months ago +15

      ​@@TMRick1 you just don't know the pain of using anything else to do what regex can do. What do you suggest? awk?

    • @user-eh7hy2xn3w
      10 months ago +3

      lol just write your own parser. You are acting as if that's a hard problem to solve and as if customers are not important. You just want to make your lives as "programmers" easier. Have some responsibility for the unnecessary amount of code that runs on users machines.

    • @jtfoog5220
      3 months ago

      @@user-eh7hy2xn3w I use arch btw lol

  • @thewhitefalcon8539
    10 months ago +38

    Someone wrote a paper ages ago about backtracking vs non-backtracking regex engines and the state of software slowness...
    The title is "Regular Expression Matching Can Be Simple And Fast (but is slow in Java, Perl, PHP, Python, Ruby, ...)" written by a Russ Cox in 2007. I bet he's feeling vindicated

    • @lhpl
      10 months ago +8

      That article, and several others, should be mandatory reading for anyone using regular expressions.

  • @Aldrasio
    10 months ago +34

    This is why as a general rule I NEVER use .* in my regexes. If I want to match everything before an equals sign, I'd use [^=]*= rather than .*= because it's always better to be as explicit as possible.

    • @framegrace1
      10 months ago +1

      But that would match just the first '=' , not all of them. If you have a lot of parameters on a URL , you will have a lot of '=' and you will want to search all of them for certain things.

    • @HenryLoenwind
      10 months ago +9

      @@framegrace1 That's why you don't anchor the expression to the end of the string in this case. We don't care what else is at the end of the URL if we find a "bad thing" near the start. Also, most regex engines have a shortcut implementation for regexes ending in ".*"/".*$", so the one at the end is of no concern.
      And BTW, the issue was mostly the ".*.*", not so much the ".*=". Backtracking the latter isn't so expensive---it doesn't really matter if the engine has to search for the = from the start or the end of the remaining string. It most likely has a shortcut for "fixed character after match all" anyway. There's a good chance that ".*?=" is faster than "[^=]*?="/"[^=]*=" as it can scan the string using a simple "equals" comparison and be done. This, however, all goes out the window once there are multiple ways to match, like the infamous ".*.*". So when using this optimisation on purpose, it makes sense to manually commit after the "=" (e.g. with "(*COMMIT)").

    • @AbiGail-ok7fc
      10 months ago +2

      @@framegrace1 You can still get the last '=' by being more explicit: /([^=]*=)*=/ or /=[^=]*$/
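The difference between `.*=` and `[^=]*=` argued over in this thread can be checked with a short Python sketch. It uses only the standard `re` module; the helper names and the sample query string are made up for illustration, and Python's engine is not necessarily the one any particular WAF uses.

```python
import re

# Hypothetical sample input: a query string with several '=' signs.
SAMPLE = "a=1&b=2&c=3"


def greedy_prefix(text: str) -> str:
    """Match '.*=': the greedy wildcard runs to the end of the string,
    then backtracks until it finds the LAST '='."""
    m = re.match(r".*=", text)
    return m.group(0) if m else ""


def negated_prefix(text: str) -> str:
    """Match '[^=]*=': the character class cannot cross an '=',
    so the match stops at the FIRST '=' with nothing to backtrack over."""
    m = re.match(r"[^=]*=", text)
    return m.group(0) if m else ""


def last_segment(text: str) -> str:
    """Match '=[^=]*$': the last '=' and everything after it."""
    m = re.search(r"=[^=]*$", text)
    return m.group(0) if m else ""


if __name__ == "__main__":
    print(greedy_prefix(SAMPLE))
    print(negated_prefix(SAMPLE))
    print(last_segment(SAMPLE))
```

Both prefix patterns succeed on the same input but select different text, which is the point @framegrace1 raises: swapping `.*` for a negated class changes what is matched, not only how fast it is matched.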

  • @Kabodanki
    10 months ago +11

    upside down cloudflare logo for australia was gold

  • @MindLaboratory
    10 months ago +175

    It's amazing how casually things are actually handled behind the scenes in the IT world. I once wrote some software for a bank, did a 3 hour audit of the code with 5 of their top developers, after which they installed a pre-compiled earlier test version on their prod system. smh

    • @jnz007
      10 months ago +1

      😂

    • @DanielSmedegaardBuus
      10 months ago +55

      I once fixed a globally crashing iOS app by hacking the backend to send out technically incorrect data. The app passed all tests because the test suites didn't include any data to reveal division-by-zero bugs.
      This was especially bad since the time to get Apple to review and deploy an updated version could take a week or more, IIRC.
      After conversing with the dev responsible, I asked him how he handled fractional numbers, and he was sure that fractional numbers were always displayed as integers, so I changed the API to send instances of 0 as 0.001, effectively circumventing the bug while displaying calculated numbers (and 0s) correctly in the app.
      I think it's the most hacky fix I've ever deployed. It felt terrible and exhilarating and awesome all at the same time 😂 I'm actually a little proud 😇

  • @andrewallbright658
    10 months ago +72

    These stories are so cathartic. Thanks for applying your storytelling to these niche topics!

  • @chewcodes
    10 months ago +145

    this is my new favorite channel. explaining everything clearly, and being humorous with small jokes here and there

    • @gblargg
      10 months ago +19

      And explosions, lots of explosions.

    • @princesssshortie
      10 months ago +9

      Same. And the upside-down cloudflare logo on Australia killed me. 😂

  • @UnrealOG137
    10 months ago +16

    3:50 Putting Cloudflare upside down above Australia was a hilarious touch!

  • @glitchy_weasel
    10 months ago +10

    Absolutely crazy videos you're pumping out. Love your comedic editing style too!
    Every video of yours makes me feel like the entire internet could break at any moment lol

  • @pav431
    10 months ago +49

    I work as a sysadmin, yet I wish I had this much insight into how all the technologies I use daily work this deeply. I never finished college, so I only ever heard of DFA, but despite that, your video explained it very well, and showed how much of an issue a simple regex can be, when executed thousands of times a second.
    Please make more videos, I cannot wait for more

    • @Atlantis357
      10 months ago +9

      It is interesting to note that any regex can be represented as a nondeterministic finite automaton (NFA) and any NFA can be converted into a DFA using a simple algorithm. The only downside is that the DFA may end up with exponentially more states than the NFA, which can take up a lot of memory.

    • @Nayayom
      10 months ago +3

      Good thing you enjoyed it, I have not so fun memories doing DFAs and NFAs by hand in college. :(

    • @habama1077
      10 months ago +1

      @@Nayayom bro same. I hated it. But understanding them is pretty good. We can see how Google's devs used automata and formal grammar theory to develop a useful practical application with regex.

    • @lhpl
      10 months ago +4

      @pav431 even without some formal education, I'd say anyone working as sysadmin should know something about algorithms and complexity theory. Especially when writing code that systems used by others depend on. And shell scripts _are_ code.
      Knowing that there are regular expressions, and highly irregular expressions that are just _called_ something like "Perl compatible regular expressions" or "extended regexp" or whatever, is important. So is not writing scripts that unnecessarily nest three or more loops, and work fine on small test data, but take "forever" with realistic sizes of data. Know and understand the various o-notations. Just because quicksort is usually quick, doesn't prevent it from having O(n²) worst case complexity. You may be fine with that, but you will want to know why you can live with it. There has to be a metric ton of good books on this, so it's possible to learn. Enjoy!

    • @Aaron.Thomas
      10 months ago

      ​@@lhpl As someone who's inherited, and then had to completely rewrite from scratch, core automation scripts for clusters that were written by novice sysadmins, I concur that learning these things is important.
      Some sysadmins learn that awk and grep exist and that's the end of their training. Rewriting from scratch saved me hours upon hours it would have cost me to try to maintain the poorly made code inherited from novice sysadmins.
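@Atlantis357's point earlier in this thread, that converting an NFA to a DFA can blow up the number of states exponentially, can be illustrated with the textbook subset construction. The sketch below is an assumption-laden toy: it builds a hand-written NFA for "the n-th symbol from the end is `a`" over the alphabet {a, b} (a classic worst case, not anything from the Cloudflare incident) and counts the reachable DFA states. The function names are invented for this example.

```python
from collections import deque


def build_nfa(n: int) -> dict:
    """NFA for 'the n-th symbol from the end is a' over {a, b}.

    States are 0..n. State 0 loops on every symbol and also guesses,
    on an 'a', that this is the n-th symbol from the end.
    State n is the only accepting state.
    """
    trans = {(0, "a"): {0, 1}, (0, "b"): {0}}
    for i in range(1, n):
        trans[(i, "a")] = {i + 1}
        trans[(i, "b")] = {i + 1}
    return trans


def count_dfa_states(n: int) -> int:
    """Run the subset construction and count reachable DFA states."""
    trans = build_nfa(n)
    start = frozenset({0})
    seen = {start}
    queue = deque([start])
    while queue:
        subset = queue.popleft()
        for ch in "ab":
            # The DFA state reached is the union of all NFA moves on ch.
            target = frozenset(
                nxt for state in subset for nxt in trans.get((state, ch), ())
            )
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return len(seen)


if __name__ == "__main__":
    for n in (1, 3, 10):
        print(n, "->", count_dfa_states(n))
```

The NFA has n + 1 states, while the DFA has to remember the last n symbols, so the count doubles with each extra state. This is the memory trade-off that engines built on automata have to manage, for example by building DFA states lazily.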

  • @aethese
    10 months ago +7

    Love these videos Kevin! Your amazing storytelling, editing, animations, and everything else comes together in an amazing way! Love watching every video you put out, keep it up :)

  • @mariadragonbreath7130
    4 months ago +1

    I stumbled across your videos yesterday, and I find them really entertaining and interesting to watch! Thank you for explaining these topics in a clear way so that even if I know nothing about regex or Cloudflare, I can still follow along and understand the video :)

  • @BrandonCallender
    10 months ago +77

    I don't blame you for doing Cloudflare again, their RCAs are always excellent. This is such an excellent channel, you deserve far more subs! These are exceedingly entertaining and interesting for software engineers (and probably most other folks too!)

  • @MrMCMaxLP
    10 months ago +8

    Ken Thompson is crying really hard. His work has been around for decades. As a CS guy who has specialized in algorithms, this hurts in the middle of the heart.

  • @patterntrader690
    10 months ago +5

    Your videos are amazing, keep going you are bound to blow up

  • @ballinlikebill8334
    10 months ago +7

    love the style keep it up man

  • @tehlaser
    10 months ago +26

    I had a regex blow up on me like that once. Not **quite** as silly as .*(?:.*=.*), but pretty close. The regex library we were using implemented backtracking with recursion, so instead of eating CPUs like a bag of chips it would instead masticate for a while before eventually running out of stack, whereupon it would puke Pringles. This was an especially fun one to fix because if you google “regex stack overflow” you’ll find that there are zillions of questions on stackoverflow about regexes that have nothing to do with stack overflows.
    And yes, in shame I must admit the regex in question did not fit on my screen all at once. In my defense, however, that was because a year or so earlier I had torn the line noise apart and put 3 to 5 characters of actual regex on each line, followed by a comment. Only two lines had // I have no idea, this shouldn’t do anything, but it doesn’t work without it.

    • @hentai824
      10 months ago +1

      lmaoo man i feel you

    • @tehlaser
      10 months ago +1

      Come to the dark side. We indent our regexes.
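Why stacked wildcards such as the `.*(?:.*=.*)` mentioned at the top of this thread hurt a backtracking engine can be seen with a toy matcher that simply counts how many match attempts it makes. This is a minimal sketch, not how PCRE or any real library is implemented: the token-list pattern format and the function name are invented here, and it only supports literal characters and a greedy `.*`.

```python
def count_backtracking_steps(tokens, text):
    """Naively match tokens against text starting at position 0.

    tokens is a list where ".*" means a greedy wildcard and any other
    entry is a single literal character. Returns (matched, steps),
    where steps counts every recursive match attempt.
    """
    steps = 0

    def match(ti, pos):
        nonlocal steps
        steps += 1
        if ti == len(tokens):
            return True  # every token consumed; trailing text is ignored
        tok = tokens[ti]
        if tok == ".*":
            # Greedy: try the longest span first, then give characters back.
            for end in range(len(text), pos - 1, -1):
                if match(ti + 1, end):
                    return True
            return False
        if pos < len(text) and text[pos] == tok:
            return match(ti + 1, pos + 1)
        return False

    return match(0, 0), steps


if __name__ == "__main__":
    # Input without any '=' forces the matcher to exhaust every split.
    for n in (10, 20, 40):
        _, one = count_backtracking_steps([".*", "="], "x" * n)
        _, two = count_backtracking_steps([".*", ".*", "="], "x" * n)
        print(f"n={n}: one wildcard {one} steps, two wildcards {two} steps")
```

With one wildcard the step count grows linearly with the input length. With two adjacent wildcards every split point of the first is retried against every split point of the second, so the count grows roughly with the square of the length, which matches the "worst case quadratic" correction in the video description.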

  • @CALEB94
    10 months ago +7

    Awesome and informative video! A small correction: NFA matching is still linear in the input string. You just have to store the configuration as a set of NFA states, rather than a single state. You don't get exponentially many paths in the way you describe in the video because paths ending at the same state are merged in this set representation.
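The set-of-states simulation described in this comment can be sketched in a few lines of Python. The NFA below is a hand-built, hypothetical example for the regex `(a|b)*abb`; the transition table and names are assumptions for illustration, not something taken from the video or from RE2.

```python
# Hypothetical hand-built NFA for (a|b)*abb.
# Keys are (state, symbol); values are the sets of possible next states.
TRANSITIONS = {
    (0, "a"): {0, 1},  # stay in the loop, or guess that "abb" starts here
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}
START = 0
ACCEPT = {3}


def nfa_matches(text: str) -> bool:
    """Simulate the NFA by tracking the SET of states it could be in.

    Paths that reach the same state are merged in the set, so the work
    per input character is bounded by the number of NFA states and the
    total time is linear in len(text).
    """
    current = {START}
    for ch in text:
        nxt = set()
        for state in current:
            nxt |= TRANSITIONS.get((state, ch), set())
        current = nxt
        if not current:
            return False  # no live path remains
    return bool(current & ACCEPT)


if __name__ == "__main__":
    for sample in ("abb", "babababb", "ab", "a" * 10000 + "abb"):
        print(len(sample), nfa_matches(sample))
```

The set can never hold more than four states here regardless of input length, which is why this style of matching has no pathological inputs, at the cost of not supporting features such as backreferences.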

  • @minimalist_zero
    10 months ago +3

    This is an incredible video. You took very complex and difficult to understand concepts and simplified it well. Well done.

  • @creamyhorror
    10 months ago

    I can't believe I've only just run across your content - it's really well done and humorous, you're going places!

  • @BurzowySzczurek
    10 months ago +3

    Another nice dev story, with interesting storytelling, really enjoyable to watch. Thanks, we are waiting for more : )

  • @kanal7523
    10 months ago

    I love your videos and goofy animations please never stop doing these

  • @thedoble
    10 months ago +1

    The animation and comedic aspect of this video are great. Plus it's explained extremely well. Nice

  • @LordMegatherium
    10 months ago +43

    Regex is great like shell scripts: works everywhere and does its job... up until a certain script size when the chance of bugs starts increasing and you should think of using another tool instead or in conjunction.
    Also this sounds like GitOps to the extreme: when you can only change your state via your repo and all the triggers that come with it you might as well replace your CD with a single bash script (see above).

  • @RSZA011
    10 months ago

    this is by far my most interesting youtube channel . please keep it up ! I really enjoy this content

  • @hanlonm
    10 months ago

    Great video! Really like the style and explanations

  • @NishthaSharma-nt9hk
    10 months ago

    loving these vids. keep em coming!

  • @Cyber_Chriis
    10 months ago +1

    Dude your illustrations are so good and funny! :D

  • @D0Samp
    10 months ago +23

    The reason we predominantly use NFA regular expression engines is not just because they're usually faster if we don't throw degenerate expressions at them, but also because they support expressions that exceed the capabilities of a regular grammar, such as back references to a specific capture group that has been seen previously.

    • @MH_VOID
      10 months ago +1

      I was under the impression that they're generally slower

    • @georgehelyar
      10 months ago +2

      @@MH_VOID for a normal case the performance is generally similar, but the difference is that these linear engines like RE2 are more predictable and less likely to blow up in your face.
      If you don't have control of the pattern and the input, they are *much* safer, and losing features that depend on backtracking is generally not a big deal.
      If it's really performance critical just don't use regex at all if you can avoid it.

    • @teunmathijssen7459
      10 months ago +1

      NFAs and DFAs are computationally equivalent and recognise only exactly the regular languages. So a NFA-backed RE engine would have to implement additional functionality as languages with backreferences are not regular.

    • @D0Samp
      10 months ago

      I've learned in the meantime that the biggest speed advantage is actually due to unrelated technologies such as a JIT compiler in PCRE2, which is in fact a top-down parser that happens to accept regex-like expressions. The only thing that is definitely faster about NFA is compiling regular expressions.

    • @paulstelian97
      10 months ago +1

      @@D0Samp Certain types of things that regex engines/matchers support aren't true regex, and can't be covered by a DFA/NFA. That's a bit BS to me.

  • @dasherreal
    10 months ago

    This channel concept is brilliant. Thank you so much.

  • @RMDragon3
    10 months ago +9

    I kept wondering why they didn't just do a rollback to fix the issue, thanks for addressing that at the end.

    • @romannasuti25
      10 months ago

      Yeah, as much as I love Cloudflare for smaller stuff there's a reason a lot of large enterprises use Akamai. A little overpriced for a simple growth phase startup and not as transparent as Cloudflare when something breaks on their end, but that massive bucket list of features available with Ion Premier, Cloudlets, and many more, especially Datastreams and their web security analytics portal, is an absolute lifesaver. Hell, it helps us debug all sorts of broken stuff upstream of it too, although I wouldn't be surprised if Cloudflare offered something like Akamai Reference IDs for easy, enterprise-friendly tracing. Specifically, Akamai is really particular about having identical Staging and Production sections with really fast rollback when production error rates increase even a little.

    • @framegrace1
      10 months ago

      All those rules are stored locally to each node, and you cannot rollback to a machine that is dead or so high on CPU that can't even handle a connection. I presume they globally disabled WAF and restarted the nodes, so when up, they didn't try to apply the WAF rules and were free to be rolled back/forward. Then they re-enabled WAF (very slowly, I presume :) ) and all was back to normal.

  • @kleinekip1234
    10 months ago

    Love your content, motivating me to learn more about coding and software development (have always been interested in) and you are able to explain it in a way where you somehow use terms used by those actually working with databases yet I'm able to follow what you're talking about and how it all works, keep it up man :)

  • @sairao4492
    10 months ago +1

    This is a great video. You made it very easy to follow. Took me back to my computational models class.

  • @RandomFish-gx7pj
    9 months ago +4

    It's kinda funny that the Internet was designed to be a `web` that hopefully would prevent failures of a single node taking down the whole system, but nowadays we heavily rely on a handful of service providers just to run the Internet.

  • @DiegoGuerrero-zy5ne
    10 months ago

    Great videos! Keep ‘em coming

  • @Dardasha_Studios
    10 months ago

    You are a gem,
    thanks for the detailed information.
    You didn't only explain complex information, you also explained how WAF companies work.
    Thanks again.
    Salam!!

  • @jacobclark288
    6 months ago +1

    these videos are so well produced, the jokes in the imagery are so on point.

  • @owenschwartz
    10 months ago

    Loving these videos!

  • @TheDarkWayne
    10 months ago

    Your content is pure gold. Keep up the great work!

  • @violetwtf
    10 months ago

    you are my favorite channel. so excited to watch this on my business trip

  • @mattym8
    @mattym8 10 months ago

    Excellent vid. Couldn't have been shorter. Didn't need to be longer. Learned lots. You might've achieved perfection.

  • @mangoodbad13
    @mangoodbad13 9 months ago

    Things exploding in your videos is the entire reason I wake up every morning, thank you friend, it's freakin hilarious

  • @MrNobbless
    @MrNobbless 10 months ago +1

    i love these videos, keep them coming please!

  • @maddoggLP
    @maddoggLP 6 months ago

    Didn't expect to hear about theoretical computer science (which is a subject I'm taking this year) in this video, but nice work. It's nice to see actual real-world usage of converting ε-NFAs to DFAs. I wish our prof had included this video in his lecture...

  • @Timi7007
    @Timi7007 10 months ago +2

    I love the graphics, depicting real processes very well but hilariously funny at the same time!

  • @candle_eatist
    @candle_eatist 10 months ago

    every time this dude uploads it's an absolute banger

  • @ladyravendale1
    @ladyravendale1 10 months ago

    I love your videos on the internet blowing up. Perfect blend of programming, memes, and good graphics

  • @Kevin-wj1do
    @Kevin-wj1do 10 months ago

    I love these videos, they make me actually lol. Keep up the great work on these.

  • @ighsight
    @ighsight 3 months ago

    A beautiful explanation, wrapped in humor. I need to subscribe to this channel.

  • @morgankuphal3417
    @morgankuphal3417 10 months ago

    I’m a subscriber after this! Excellent content!

  • @GottgleicherMaster
    @GottgleicherMaster 10 months ago

    excellent Video and great explanation and great visuals. subbed :)

  • @Danielo515
    @Danielo515 4 months ago

    This is the best explanation of the incident that I saw. Good job

  • @himanshutripathi7441
    @himanshutripathi7441 10 months ago

    Thanks for the DFA/NFA part, loved it.

  • @riddixdan5572
    @riddixdan5572 10 months ago

    love the content. lots to learn from

  • @quixadhal
    @quixadhal 10 months ago +2

    Well, from the thumbnail image, the regexp (.*=.*) says "find the LARGEST chunk of text possible before a literal = sign, then find the largest chunk after it, including other = signs if they exist", and it will walk the entire chunk of data many times to ensure it gets ALL of them.
    They probably meant to do (.*?=.*?), which would have found the SMALLEST chunks of text around literal = signs, and would stop as soon as it found even a single = sign.
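For anyone who wants to see the greedy vs. lazy difference described above, here is a small sketch using Python's `re` module. The sample string `a=b=c` is made up for illustration, and the patterns are the ones quoted in this comment, not necessarily the exact rule Cloudflare shipped. Note that both patterns find a match in exactly the same inputs (anything containing an `=`); only the extent of the captured text differs.

```python
import re

# Made-up sample input with two '=' signs.
text = "a=b=c"

# Greedy: each .* grabs as much as it can, then backtracks to the LAST '='.
greedy = re.search(r"(.*)=(.*)", text)

# Lazy: each .*? grabs as little as it can, so it stops at the FIRST '='.
lazy = re.search(r"(.*?)=(.*?)", text)

greedy_groups = greedy.groups()  # text before/after the last '='
lazy_groups = lazy.groups()      # text before the first '=', then an empty tail

print("greedy:", greedy.group(0), greedy_groups)
print("lazy:  ", lazy.group(0), lazy_groups)
```

Switching to lazy quantifiers changes what gets captured, but on its own it says nothing about how many steps a backtracking engine needs when the input contains no `=` at all.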

  • @shroomer3867
    @shroomer3867 10 months ago +5

    The regex that ChatGPT creates:

  • @soumyadeepbasak4565
    @soumyadeepbasak4565 10 months ago

    I found this channel by mistake but I am glad, pls continue posting :)

  • @OlegDorbitt
    @OlegDorbitt 10 months ago

    11:00 Wow, you've explained the usefulness of using DFA way better than my professor! Now it all makes sense!
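To make the DFA point concrete, here is a minimal hand-written sketch of a two-state DFA for "the text contains an `=`", which is roughly what a pattern shaped like `.*=.*` asks for. The state numbering and function names are hypothetical, and the sketch assumes any character (including newlines) counts for `.`. The point is that a DFA reads each character exactly once and never backtracks.

```python
# Hypothetical two-state DFA for "the text contains '='".
# State 0: no '=' seen yet. State 1: '=' seen (accepting).
ACCEPTING = {1}


def next_state(state, ch):
    """Transition function: only an '=' read in state 0 changes the state."""
    if state == 0 and ch == "=":
        return 1
    return state


def dfa_accepts(text):
    """Run the DFA over text; return (accepted, number_of_steps)."""
    state, steps = 0, 0
    for ch in text:
        state = next_state(state, ch)
        steps += 1  # exactly one step per input character
    return state in ACCEPTING, steps


print(dfa_accepts("key=value"))
print(dfa_accepts("x" * 1000))
```

The step count is always the length of the input, whether or not the text matches, which is the linear-time guarantee the video attributes to DFA-based engines.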

  • @fabricio5p
    @fabricio5p 10 months ago

    Your channel is awesome learning material, keep it up

  • @EchoingRuby
    @EchoingRuby 10 months ago +27

    I mean they could have just done .*?=.* but I guess RE2 is safer long-term. Still this screams "I don't understand regex, it's just magic to me" on the part of that developer.

    • @killingtimeitself
      @killingtimeitself 4 months ago

      which is fair honestly, regex is basically just magic, and once you understand the syntax you don't question its ways.
      though I'm surprised that nobody else along the development process was concerned about it. Apparently nobody who looked it over had any idea what it was doing.
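A rough way to see why the doubled `.*` hurts is to count steps in a toy backtracking matcher. The sketch below is not how PCRE or any real engine is implemented; `count_steps` and its token format are made up for illustration, matching is anchored at the start of the text, and a "step" is just one call of the inner function.

```python
def count_steps(pattern, text):
    """Toy backtracking matcher, anchored at the start of text.

    pattern is a list of tokens: ".*" (greedy run of any characters),
    ".*?" (lazy run of any characters), or a single literal character.
    Returns (matched, steps), where steps counts calls to match().
    """
    steps = 0

    def match(pi, ti):
        nonlocal steps
        steps += 1
        if pi == len(pattern):
            return True  # every token consumed: success
        tok = pattern[pi]
        if tok in (".*", ".*?"):
            remaining = len(text) - ti
            # Greedy tries the longest run first, lazy the shortest first.
            if tok == ".*":
                lengths = range(remaining, -1, -1)
            else:
                lengths = range(0, remaining + 1)
            return any(match(pi + 1, ti + k) for k in lengths)
        # Literal token: must equal the current character.
        return ti < len(text) and text[ti] == tok and match(pi + 1, ti + 1)

    matched = match(0, 0)
    return matched, steps


# Inputs with no '=' force the matcher to try every split before giving up.
for n in (100, 200, 400):
    _, double_star = count_steps([".*", ".*", "="], "x" * n)
    _, single_star = count_steps([".*", "="], "x" * n)
    print(n, double_star, single_star)
```

In this sketch, doubling the input length roughly quadruples the steps for the two-star pattern and only doubles them for the one-star pattern, which lines up with the "worst case quadratic" correction in the video description. Swapping `.*` for `.*?` in the two-star pattern gives the same step count on non-matching input, since every split still has to be tried.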

  • @ThisIsAnAccount
    @ThisIsAnAccount 6 months ago

    God I love watching these vids, I love the duality of high quality, digestible, information coupled with a nice sprinkling of "don't be a dipsh*t" commentary over issues and causation. Developer wise, nothing brings me more joy about my job than someone pointing out how much of an imbecile I *could* have been on that one day.

  • @Not_Even_Wrong
    @Not_Even_Wrong 10 months ago

    Wow, great explanation of the matching process. I already did not like regex, but this is next level... especially because one could easily write UNDERSTANDABLE code that does the job, but runs in O(n)...
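As a sketch of what such plain O(n) code might look like, assume the job is simply "find the first `=` and split around it". That is a guess at the intent, not something the incident report confirms, and `split_key_value` is a hypothetical helper name.

```python
def split_key_value(text):
    """Split text around the first '=' in a single left-to-right scan.

    Returns (key, value), or None when there is no '=' at all.
    """
    # str.partition scans the string once and never backtracks.
    key, sep, value = text.partition("=")
    if not sep:
        return None
    return key, value


print(split_key_value("user=admin"))
print(split_key_value("no equals sign here"))
```

The trade-off is that a real WAF rule usually combines many conditions, which is where a regex (ideally on a linear-time engine) becomes attractive again.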

  • @LKRaider
    @LKRaider 10 months ago +5

    I like the part of explaining non-capturing groups and then throwing them out the window immediately after

    • @yfakolh7154
      @yfakolh7154 10 months ago

      Except it's wrong. You use non-capturing groups for performance reasons, to consume characters as a group without saving them (it's simply a reuse of the parentheses syntax). In this particular case it was so obviously wrong that I can't imagine anyone familiar with regex not spotting it, but in general you shouldn't capture what isn't required after the match is done.

  • @n0tharv
    @n0tharv 10 months ago

    loving the videos bro keep it up 👍

  • @Shimasen.
    @Shimasen. 10 months ago

    Never have I ever expected to hear LEMMiNO soundtracks anywhere, glad to hear it from this channel!

  • @RandomDeforge
    @RandomDeforge 10 months ago +3

    out of the 10,000 times that this topic has been covered on YouTube in this exact amount of detail,
    this is so far the most recent.
    kudos!

  • @krozareq
    @krozareq 10 months ago +3

    When in doubt, implement your expression in a delayed loop so it doesn't murder everything.

    • @PinePizza
      @PinePizza 10 months ago +2

      Even without being into coding I understand how this could work and it puzzles me why they didn't do this lol.

  • @nwordfword8073
    @nwordfword8073 8 months ago

    This shit is very informational and great funny editing! Glad I found you king

  • @EndMaster0
    @EndMaster0 10 months ago +10

    I guarantee all the engineers who reviewed it didn't even look at the regex. You don't go poking someone else's regex

  • @BurnerWah
    @BurnerWah 10 months ago +2

    This video has a pretty good explanation of regex engines TBH

  • @TheDarkestPaladin
    @TheDarkestPaladin 10 months ago

    Greatest youtube recommendation in a very long time.

  • @user-xo2iw6lz2n
    @user-xo2iw6lz2n 10 months ago

    breuuugh, how has this channel not been suggested to me much sooner

  • @beep6844
    @beep6844 10 months ago +2

    The Lemmino music really rounds this video off

  • @catalystlover
    @catalystlover 10 months ago

    Great video, really informative!

  • @alexanderjohnston2658
    @alexanderjohnston2658 10 months ago

    Love your vids

  • @qu765
    @qu765 10 months ago

    just wanna say that this is a high quality video, very nice

  • @qm3ster
    @qm3ster 10 months ago +3

    Non-capturing groups (unlike lookahead and lookbehind) do get included in the match result (think $0), they just don't create an additional sub-match.
    Eg at 6:35, that would match $0: $1: $2: while removing the ?: would make it $0: $1: $2: $3:
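The difference between non-capturing groups and lookaheads is easy to check in Python's `re` module. The `key=value` sample and the patterns below are made up for illustration; they are not the expression shown at 6:35.

```python
import re

# Made-up sample input.
text = "key=value"

# Non-capturing group: "key" is consumed and appears in group(0),
# but it does not get a group number of its own.
non_capturing = re.match(r"(?:\w+)=(\w+)", text)

# Capturing group: same overall match, one extra numbered sub-match.
capturing = re.match(r"(\w+)=(\w+)", text)

# Lookahead: the '=' is checked but NOT consumed, so it stays out of group(0).
lookahead = re.match(r"\w+(?==)", text)

print(non_capturing.group(0), non_capturing.groups())
print(capturing.group(0), capturing.groups())
print(lookahead.group(0), lookahead.groups())
```

So dropping the `?:` changes how many numbered groups come back, not which text the overall match covers.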

  • @hemal7551
    @hemal7551 10 months ago

    12:17 I really loved how you used sarcasm throughout the video😂
    You have perfect combination bro ngl

  • @MogDog66
    @MogDog66 10 months ago +1

    2:37 I love that he checks the "delete master branch after merging" box

  • @CFEF44AB1399978B0011
    @CFEF44AB1399978B0011 10 months ago +3

    The entire world runs on regular expressions that were written in rage.

  • @skyracer-mk8hg
    @skyracer-mk8hg 10 months ago +1

    Another video yay!

  • @genericyoutubechannel2601
    @genericyoutubechannel2601 10 months ago

    Came away from this with a moderately stronger understanding of regex, thanks!

  • @QuickQuips
    @QuickQuips 7 months ago

    I really like how you described the newer forms of regex. They look like Moore/Mealy machines, which use 1s and 0s and are handy in digital design.

    • @abebuckingham8198
      @abebuckingham8198 1 month ago +1

      Mealy/Moore machines are a kind of deterministic finite automaton that also produces output. So it's no coincidence they look the same.

  • @Steamrick
    @Steamrick 10 months ago +3

    I'm a simple man. The upside down Cloudflare over Australia made me laugh. Thanks.

  • @christopheralbright9650
    @christopheralbright9650 10 months ago

    ...been trying to grok coding how i do geometry. After years of intermediate searching, you helped me see some of said "theory"...thank you!

  • @lowe7372
    @lowe7372 9 months ago

    Amazing videos man

  • @AndersonPEM
    @AndersonPEM 7 months ago +3

    8:46 what tool are you using here? Can I use it to visualize other programming languages' regex engines?

    • @GamBar64
      @GamBar64 3 months ago

      I was gonna ask the same thing

  • @elevul
    @elevul 10 months ago

    Amazing explanation!

  • @dorboi
    @dorboi 10 months ago

    love your videos!