Code That MURDERED 6 People | Prime Reacts

  • Added 16 Sep 2023
  • Recorded live on twitch, GET IN
    / theprimeagen
    Reviewed video: • how a simple programmi...
    Channel: Low Level Learning | / @lowlevellearning
    MY MAIN YT CHANNEL: Has well edited engineering videos
    / theprimeagen
    Discord
    / discord
    Have something for me to read or react to?: / theprimeagenreact
    Hey I am sponsored by Turso, an edge database. I think they are pretty neat. Give them a try for free and if you want you can get a decent amount off (the free tier is the best (better than planetscale or any other))
    turso.tech/deeznuts
  • Science & Technology

Comments • 775

  • @tiagorosolen
    @tiagorosolen 8 months ago +1078

    In general, I do agree with the whole "too much testing is BS" idea, but since I started working with medical software for implantable devices, I got absolutely crazy about testing. Especially system and integration tests that try to guarantee that all the different parts can work together.

    • @peppybocan
      @peppybocan 8 months ago +157

      I work in the fintech space, and I also got very crazy about testing shit properly. You don't want to nuke the entire company out of laziness. This is no joke.

    • @CTimmerman
      @CTimmerman 8 months ago

      ​@@peppybocan See: Knight Capital Gets Hammered Following $440M Flash-Crash Loss

    • @Mel-mu8ox
      @Mel-mu8ox 8 months ago +87

      It's a bit different when you know there are lives on the line...
      Human error is terrifying.
      Not just from a programmer perspective, but also from a user perspective

    • @Expllosaoriginal
      @Expllosaoriginal 8 months ago +59

      Yeah, places where bugs will lead to deaths are ABSOLUTELY the right ones to strive for that 100% test coverage of lines, branches and whatnot.... Throw in some integration tests and a separate environment as well

    • @insertoyouroemail
      @insertoyouroemail 8 months ago +61

      Static analysis, unit testing, code contracts, property testing etc. Everything. Writing tests is boring? Types are annoying? There's the door. GTFO.

  • @crstnio
    @crstnio 8 months ago +233

    „Radiation level too high or too low“
    Operator: „50/50 chance, let’s try again!!!“

    • @bitbraindev
      @bitbraindev 5 months ago +33

      Also, given that 'too high' will be fatal, why is this not its own error? The exception design is just wrong. Any exceptional case that could put someone's health at risk even the slightest bit should be a FULL STOP error that cannot be circumvented.
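
A minimal sketch of the split being asked for here, in Python. The class names, the 5% tolerance and the `overridable` flag are all assumptions made for illustration, not anything from the Therac-25 code. Each direction of the fault gets its own exception type, and only faults explicitly marked overridable can be dismissed by the operator. Whether a too-low fault should be retryable at all is itself a design assumption.

```python
class DoseError(Exception):
    """Base class for dose faults (all names here are hypothetical)."""


class DoseTooLowError(DoseError):
    """Measured dose is below the allowed window."""
    overridable = True   # assumption: a retry may be permitted


class DoseTooHighError(DoseError):
    """Measured dose is above the allowed window: full stop."""
    overridable = False  # assumption: never dismissible from the console


def check_dose(measured: float, prescribed: float, tolerance: float = 0.05) -> None:
    """Raise a distinct error for each direction of the fault."""
    low = prescribed * (1 - tolerance)
    high = prescribed * (1 + tolerance)
    if measured > high:
        raise DoseTooHighError(f"measured {measured} exceeds limit {high}")
    if measured < low:
        raise DoseTooLowError(f"measured {measured} is below limit {low}")


def operator_may_proceed(error: DoseError) -> bool:
    """Only faults explicitly marked overridable can be dismissed."""
    return getattr(error, "overridable", False)
```

With this shape, `check_dose(200.0, 100.0)` raises `DoseTooHighError`, and `operator_may_proceed` returns `False` for it, so there is no "try again" path for an overdose.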

    • @pythagorasaurusrex9853
      @pythagorasaurusrex9853 3 months ago +3

      Ah yeah! That message made me mad! How can whoever wrote the software or the manuals give customers such a stupid error message? I am speechless.

    • @brandonandrews4009
      @brandonandrews4009 3 months ago +2

      @@bitbraindev Too low would also be fatal in the case of life-saving treatment, but I agree that they should have separate error codes.

    • @ultru3525
      @ultru3525 3 months ago +7

      Maybe it took integer overflow into account. If it’s an unsigned int and you get a value that’s lower than expected, then either it really was too low, or it was so high it overflowed.
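
To illustrate the wraparound described above, here is a small Python sketch. Python integers do not overflow, so the 16-bit register is simulated with a mask; the register width and the reading of 70,000 are made-up values, not figures from the Therac-25.

```python
def to_uint16(value: int) -> int:
    """Keep only the low 16 bits, as a 16-bit unsigned register would."""
    return value & 0xFFFF


true_reading = 70_000             # hypothetical raw count, above the 65_535 maximum
stored = to_uint16(true_reading)  # what the register would actually hold
print(stored)                     # 4464
```

A stored value of 4464 looks "too low" even though the real reading was far too high, which is the ambiguity the comment is pointing at.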

    • @sukitta2
      @sukitta2 24 days ago +1

      @@bitbraindev Considering that they had a previous version of this machine with a hardware safety measure for that condition, it wouldn't have been necessary, except the company removed it. It seems likely that the guy who wrote the software wrote the code with that in mind, even. Even if that's the case, the major fault is removing the safety system, imo. Crazy situation, whatever the case was. Goes to show how far we've come, generally speaking, even if we still insist on screwing up every so often.

  • @ExpensivePizza
    @ExpensivePizza 8 months ago +536

    As a software engineer for over 30 years this kind of thing still makes me feel a little sick.
    I think the "software never fails" mindset comes from the idea that computers will always do what you tell them to do.
    Unfortunately, sometimes what you tell it to do is still wrong.

    • @cerulity32k
      @cerulity32k 8 months ago +117

      The computer does what you tell it to do: 😄
      *the computer does what you tell it to do:* 😳

    • @RandomStuff-zt6qf
      @RandomStuff-zt6qf 8 months ago +2

      ditto

    • @Mark-kt5mh
      @Mark-kt5mh 8 months ago +6

      Unless there is a bug in the processor's ISA itself

    • @matheusjahnke8643
      @matheusjahnke8643 8 months ago

      Also, background radiation induces bit flips.
      There was an election where one candidate got an extra 4096 votes because of that...
      Veritasium made a video on that:
      czcams.com/video/AaZ_RSt0KP8/video.html
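
The 4096 figure fits a single flipped bit, since 4096 is 2 to the power 12. A small Python sketch, with a made-up vote count:

```python
def flip_bit(value: int, bit: int) -> int:
    """Return value with one bit inverted, as a single-event upset would do."""
    return value ^ (1 << bit)


votes = 500                      # hypothetical true count (bit 12 is clear)
corrupted = flip_bit(votes, 12)  # bit 12 has weight 2**12 = 4096
print(corrupted - votes)         # 4096
```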

    • @jeal5022
      @jeal5022 8 months ago +5

      I remember this quote I heard before: "Computers only do what you command, not what you meant"

  • @rb9238
    @rb9238 8 months ago +660

    This is not just a software failure. The whole equipment design, UI/UX, the hospital procedures and software together killed the people.

    • @lowwastehighmelanin
      @lowwastehighmelanin 8 months ago +21

      Yup, but it started with the healthcare workers who have no coding experience trusting that the machine works as intended.

    • @TheNewton
      @TheNewton 8 months ago +17

      There's a name for it in risk management: the Swiss cheese model of failure.

    • @stoogel
      @stoogel 8 months ago +59

      ​​@@lowwastehighmelanin ...and there's nothing wrong with that. They should trust that in using the equipment the way they were trained, it will work. They don't need to learn to code- that's what a user interface is for.

    • @hwstar9416
      @hwstar9416 8 months ago +11

      this has nothing to do with UI/UX

    • @tomwillalwaysbe
      @tomwillalwaysbe 8 months ago +12

      @@hwstar9416 weird that we studied this case in college in a UI/UX course then.
      some of this could've been prevented with proper UI/UX design: for instance, if the UI had indicated that you couldn't enter anything while the other thread was running, or if it had made it mandatory to wait 10 seconds before changing the value.
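
A rough sketch of the kind of guard described above, in Python. `TreatmentConsole` and its methods are hypothetical names, and a real console would also tell the operator why the edit was refused. The point is only that edits are rejected, not silently accepted, while setup is still in progress.

```python
import threading


class TreatmentConsole:
    """Hypothetical console that refuses edits while the hardware is moving."""

    def __init__(self) -> None:
        self._lock = threading.Lock()  # guards _busy and mode across threads
        self._busy = False
        self.mode = "xray"

    def begin_setup(self) -> None:
        """Called when the hardware starts moving into position."""
        with self._lock:
            self._busy = True

    def finish_setup(self) -> None:
        """Called once the hardware reports that setup is complete."""
        with self._lock:
            self._busy = False

    def set_mode(self, mode: str) -> bool:
        """Return True if the edit was accepted, False if it was rejected."""
        with self._lock:
            if self._busy:
                return False
            self.mode = mode
            return True
```

While `begin_setup()` is in effect, `set_mode("electron")` returns `False` and the stored mode stays unchanged; after `finish_setup()` the same call succeeds.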

  • @combatcorgiofficial
    @combatcorgiofficial 8 months ago +305

    Spent most of my career at NASA. Our test campaigns are extreme, to the point that the public likes to shit on us for being slow lmao. None of us give a shit, though. Better safe than a murderer

    • @ficolas2
      @ficolas2 8 months ago +18

      Nowadays, medical testing also tends to be extremely thorough, from hardware to software.
      With multiple safety measures for things that likely will never ever need those safety measures.
      Or at least so I have heard, I honestly am no NASA engineer, and no medical engineer lmao

    • @HermanWillems
      @HermanWillems 8 months ago +7

      This. So much. Yes, such software takes long to develop, but it's part of the job. As you're with NASA :) you probably know about the Ariane 5 rocket....

    • @gerdokurt
      @gerdokurt 8 months ago

      NASA has "extreme" test campaigns in software, but then they use made-in-China stuff to save 2 dollars on a ring seal and the shuttle explodes!
      very competent decision making at NASA!

    • @Dogo.R
      @Dogo.R 8 months ago +6

      Don't worry, your laws, roads, medical insurance system, hospital ownership system, jails, financial laws, etc. unalive tons of people. So even if you unalived a couple, you would be way better than the rest of the systems in place.

    • @ficolas2
      @ficolas2 8 months ago +16

      @@Dogo.R are you mentally ok?

  • @martenkahr3365
    @martenkahr3365 8 months ago +100

    The "hobbyist programmer" was only a hobbyist because it wasn't his full-time job, which made him a hobbyist by the standards of that time. A big part of the emphasis on that is because the initial PR strategy of the company was shifting blame to an allegedly incompetent outsider, away from the deficiencies of their own development and QA process. While his exact identity remains unknown, he was in fact the same person who wrote the code for the earlier Therac-6 and Therac-20 machines (also alone and without supervision). There are also claims that he was an electronics engineer in his late 40s at the time of updating the old code for the Therac-25, but I've not been able to find the source I read it from a few years back, so take that with a grain of salt. Still, it definitely seems to me that painting the picture of some nerd working out of his mother's basement might have been deliberate misdirection.
    Also, the hospital industry of the time wasn't entirely blameless here either: they were the customers paying 6+ figures for these machines, not the patients receiving treatment from them. As a result of those customer preferences, the primary directive for coding these machines was "Make sure the software can't brick the expensive electronics if the operator screws up", rather than anything to do with patient safety. And unhandled software crashes bricking the physical electronics was a major concern back in those days, because those assembly programs had basically no safety layer to stop them from giving the circuitry instructions that would melt it. Which led to a lot of boilerplate error messages that were more intended for developer debugging than operator correction. Which meant a lot of alarm fatigue due to most error messages resulting from input mistakes being harmless, and the operator procedure for the vast majority of error messages would have been to ignore the error and proceed. Pressing [I]gnore or some equivalent for an error message was part of routine operation in a lot of electronics back then, not just medical equipment, and up until this disaster, nobody in the hospital industry had a problem with it.

    • @fltfathin
      @fltfathin 4 months ago +4

      This also makes right to repair a paramount right, so that even the operator can know what's wrong and what to fix instead of waiting through a 2-day support delay or operating in an unsafe way due to an intermittent problem that the software didn't handle correctly.

    • @somebodyoncetoldme1704
      @somebodyoncetoldme1704 4 months ago

      Thanks

    • @noctisocculta4820
      @noctisocculta4820 3 months ago

      Heh, typical. Take all the credit, pass on all the blame. I bet he stayed anonymous because they'd have to share the profits with him.

    • @AndrewTSq
      @AndrewTSq 3 hours ago

      Most people I knew were hobby programmers in the 80s. Only my neighbour was working at an airplane company as a real programmer. But back in the 80s we had 15-year-olds developing their own 3D graphics routines in software, even with raytracing algorithms in 68k asm. So I would rate them over many real engineers of today.

  • @Cyberspine
    @Cyberspine 8 months ago +279

    I work at a company that makes these kinds of radiation therapy machines. The way the software regulation works these days is that everything from requirements to acceptance criteria to test cases to executing the test cases is checked and double-checked, so that nothing ever is seen by one pair of eyes only. The same applies to any changes in the code. This means that the code base changes slowly and iteratively. Even seemingly small changes may necessitate a huge amount of re-testing. In my team there are almost as many SQEs as there are devs.

    • @HermanWillems
      @HermanWillems 8 months ago +14

      And what is your salary for this responsibility? Lower or higher than a front-end JavaScript script kiddie? Mostly lower... Same in the embedded world. The software is so much harder, but the pay is so much lower.

    • @lppedd
      @lppedd 8 months ago +25

      @@HermanWillems it all comes down to the amount of money you generate with your code. A shitty platform made with some JS framework may generate way more money, especially in the short term.

    • @SeanJMay
      @SeanJMay 8 months ago +38

      @@HermanWillems hate to break it to you, but
      1. plenty of devices that get people killed have JS in them for UI purposes
      2. there are plenty of fantastic engineers that also do JS
      3. Bjarne Stroustrup, himself, was hired to work on several plane contracts, because developers could not be trusted to follow the try/swallow or try/ignore pattern of coding. Rebooting a plane at Mach 2 is not a thing that can occur. Note that they were not using JS, and still needed oversight from the highest levels.
      It's not the language, it's the people, the culture, and the lack of care or oversight to protect the real people on the other side.

    • @SeanJMay
      @SeanJMay 8 months ago +15

      The deeply unfortunate piece here is that regulation in a lot of these circumstances is based less on having the best of the best analyzing the code and documentation (on the regulators’ side), but rather how well things are tracked, and how similar this thing is to other things that came before it.
      It's why, in the past I’ve made sure teams know:
      “This will be used as a suicide device. This will be used as a murder device.
      And that's sad, but it can't be helped. What we must never, ever allow, is for it to be an accidental death, because during operation, they had a tiny tremor and accidentally clicked the button twice, or because we didn't clear state and start over, for every mode-switch, or because we wrote the code in a way that the logic in a vacuum, or the system outside of the hardware, couldn't be isolated and tested."

    • @userubuntu1373
      @userubuntu1373 8 months ago +2

      @@SeanJMay I do not think that using a GC language for such applications is a viable choice.
      Nevertheless, it is as you said the people, the culture, lack of care, oversight and comprehension of when you code these kinds of systems that a real person will be affected by it.

  • @MatzWerk
    @MatzWerk 8 months ago +160

    Back when I was in engineering school, we had this one teacher who was both awesome and super strict. He was strict because he had to deal with a court case after two people got killed due to mishandling machinery.

    • @ww123ification
      @ww123ification 8 months ago +12

      Back when I ...........(the same text, only replace "super" with "extremely super").............because he had seriously hurt himself while working with a machine tool. He didn't want the students to experience the same.

  • @istasi5201
    @istasi5201 8 months ago +102

    if i recall the story right, the software wasn't even designed for that machine, but was reused for the later machine without the dev's input

    • @johnnycochicken
      @johnnycochicken 8 months ago +29

      that is even worse and further exonerates the programmer if true

    • @stoogel
      @stoogel 8 months ago +48

      Yes, it reused software from earlier machines that had hardware locks (which had masked the software errors). They trusted the software simply because it had worked previously. Lots of stuff in the Wikipedia article; as with most engineering disasters, there were numerous causes.

    • @tiranito2834
      @tiranito2834 6 months ago +10

      @@johnnycochicken I mean, the video says that the machine used to have hardware locks, and that the dev was there during the time he developed the software and just left. This is a piece of information I was not aware of, I always assumed it was completely the programmer's fault for making a bug, but it seems like he actually did a perfect job, because his code was built for a machine that was completely different from the one that was deployed. I'm kind of surprised this part of the story tends to be eliminated when people share the story, I mean, I sort of get why, because if this is true, it not only exonerates the programmer, it also makes it clear that he made no bugs, worked all on his own, had no credentials and was simply a hobbyist working in assembly, and made a perfect job that would have worked flawlessly had the company not cut corners and never removed the hardware locks, and we all know how much people hate "rock star" programmer stories like that... because it further encourages people to do wild shit. In any case, if what the video says is true, then the programmer was never guilty to begin with and it was entirely the company's fault for not only not testing the device, but for deploying something completely different from what the code was designed for in the first place.

    • @AntiCookieMonster
      @AntiCookieMonster 4 months ago +2

      If the purpose of these retellings is to impart some important lesson about code safety and you misrepresent such a major part of the story, you are the irresponsible one.
      I believe there is a different takeaway: do redundant testing for the not unlikely case that negligent people run your code in an environment much different from the one you developed for; be prepared that they will try to push all the responsibility for any errors onto you.
      I was expecting programmers to be more pedantic about details than to tell folk stories.

    • @woopsserg
      @woopsserg 4 months ago +4

      @@tiranito2834 It still was a race condition bug; however, previous machines were prevented from running with an incorrect hardware configuration by the presence of hardware interlocks. Bug or not, the fault is in the hardware. From my perspective as an electronics engineer, it's absolutely ridiculous to allow software to configure hardware to do something deadly or just self-destructive. Of course there are some edge cases where it's impossible to prevent that. However, this wasn't one of them by far. From what I've seen, a few limit switches and relays would be enough to prevent the device from running if it wasn't configured correctly for the mode used. Even a $40 microwave oven has at least 3 limit switches preventing it from running with the door open.

  • @xerostyle
    @xerostyle 8 months ago +121

    in college i took M programming. an esoteric programming language that's similar to C, but written more like ASM. it was a fun class, and i got one of the highest scores. afterwards i was approached by a local medical hardware company called McKesson. turns out the whole class was basically a "last starfighter" type scenario looking for employees. i ended up turning the job down for exactly this reason. my intrusive thoughts keep saying: knowing me, i'd pull some "office space" type shiz and unalive a bunch of people over a decimal point.

    • @nandoflorestan
      @nandoflorestan 8 months ago +13

      I love all the cultural references. This man has good taste.

    • @neonraytracer8846
      @neonraytracer8846 8 months ago +17

      It's important to decide if you're up for such a task. Good that you made the decision you're comfortable with.
      That being said, this is a comment section. You can write kill.

    • @raffimolero64
      @raffimolero64 8 months ago

      You turned it down.
      Who will take the burden instead? Were you hoping that they would change their methodology?

    • @TheNewton
      @TheNewton 8 months ago +14

      @@raffimolero64 Who will take the burden instead? Not unlikely someone that wouldn't be burdened by such thoughts of accountability.
      That's the cost when conscientious objectors don't participate.

    • @fltfathin
      @fltfathin 4 months ago

      @@raffimolero64 yep, there are plenty of ways to write code nowadays, with good programmers who can build a computer inside games or, if equipped, wire one themselves.
      and you expect someone to babysit a machine older than themselves alone without references and full knowledge on how the thing runs?
      Might as well build it from scratch

  • @KangoV
    @KangoV 7 months ago +7

    Boeing did the same DELIBERATELY by not adding a backup sensor on the 737 MAX. This was done to avoid FAA re-testing of the aircraft. As it was not tested, nothing was put into the flight manuals either (to avoid the FAA spotting it). This was all done because the newer engines were much bigger and had to be moved up the wings, which disrupted the airflow/lift. The software was supposed to compensate for this by taking over the yoke. When the main sensor malfunctioned, there was no backup, which caused the plane to dive with no way for the pilot to pull up. The pilots had NO idea what was happening. This is what caused 2 crashes.

  • @TMRick1
    @TMRick1 8 months ago +127

    I can't imagine taking on the responsibility of dealing with code that manages something like this, for real. One time I dealt with a fairly simple database that stored crucial medical data for patients at a hospital and already got a lot of anxiety just thinking: what if I mess up some SQL statement or flow control and it causes wrong data for someone's blood type, for example?

    • @JohnDoe-up2qp
      @JohnDoe-up2qp 8 months ago +20

      Especially when the company is too greedy to provide what's necessary or to care about sufficient capacity and testing, and even removes safeguards to save some $$$

    • @rifle
      @rifle 8 months ago +11

      Same thoughts here, creating rules for test results, and if a false negative happens, without going into too much detail, it could lead to lives being ruined.

    • @oleksiistri8429
      @oleksiistri8429 8 months ago +6

      @@JohnDoe-up2qp if that happens, just leave that company and never come back

  • @ethanphelps5308
    @ethanphelps5308 8 months ago +84

    One of my Comp Sci professors, Clark Turner, was part of the investigation into the Therac-25 incident and I remember him telling us the story about how he and another person found the race condition that led to these people's demise. They wrote a paper about the investigation. Crazy stuff

    • @NuncNuncNuncNunc
      @NuncNuncNuncNunc 8 months ago

      Link: web.stanford.edu/class/cs240/old/sp2014/readings/therac-25.pdf

  • @flaxeneel2905
    @flaxeneel2905 8 months ago +60

    Edit: Oh, looks like the turntable position overflow wasn't mentioned. Well, that's another problem that was found with the Therac-25 and fixed, after which AECL claimed some stupid 5 9s (99.999%) safety increase (i don't remember the exact number of 9s but it was smth like that), but that turned out to not help much either; in fact it just led to further radiation-related injuries, since people were made to believe it was safe to use the machine now. But yeah, the point still stands: no proper code review is asking for trouble.
    The Therac-25 having NO hardware interlocks was a huge jump that did not make sense to take. Yeah, software is great and all, but where someone's life is on the line, it is never bad to have more failsafes. At worst, it's more of a headache for the operators maybe, but it would at least not kill the patients. The entirety of the program was made by a hobbyist, so something like the turntable position checker (8 bit) overflowing to read 0, and the treatment proceeding because the software deems the turntable to be at the right position, is a mistake any programmer can make, no matter how skilled and experienced, and it wouldn't even have been caught for a while cuz it was all written by a single person, with no one else cross-checking the code. It is ALWAYS better to have hardware failsafes.
    Kyle Hill also has an awesome video about it, which goes more in depth. I would highly recommend it if you are interested
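
The one-byte overflow described above can be sketched in Python. This is an illustration based on the comment's description, with hypothetical names and a simulated 8-bit flag, not the original assembly: a flag that is incremented on every pass wraps to 0 on every 256th pass, and 0 is read as "skip the position check".

```python
from typing import Callable


def needs_position_check(flag: int) -> bool:
    """A nonzero flag means 'verify the turntable before firing'."""
    return flag != 0


def bump_flag_buggy(flag: int) -> int:
    """Increment a one-byte flag; 255 + 1 wraps around to 0."""
    return (flag + 1) & 0xFF


def set_flag_fixed(_flag: int) -> int:
    """Assign a constant nonzero value instead of incrementing."""
    return 1


def count_skipped_checks(update: Callable[[int], int], passes: int) -> int:
    """Count how many passes would skip the position check."""
    flag = 0
    skipped = 0
    for _ in range(passes):
        flag = update(flag)
        if not needs_position_check(flag):
            skipped += 1
    return skipped


print(count_skipped_checks(bump_flag_buggy, 512))  # 2
print(count_skipped_checks(set_flag_fixed, 512))   # 0
```

Two skipped checks in 512 passes looks rare, which is exactly why a single author testing alone could miss it.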

    • @gljames24
      @gljames24 8 months ago +5

      But hardware costs money and we can't stop making bigger and bigger margins for the shareholders.

    • @CTimmerman
      @CTimmerman 8 months ago +1

      SPARK is a mathematically safe subset of Ada that was made to rule out such software errors.

    • @vasiliigulevich9202
      @vasiliigulevich9202 8 months ago

      How would a hardware lock help if the operator can only check the settings in the buggy UI? The UI says good to go, the operator removes the lock and kills the patient.

    • @flaxeneel2905
      @flaxeneel2905 8 months ago

      @@vasiliigulevich9202 by this, i mean hardware interlocks that are completely independent of the UI/software. Interlocks that make it physically impossible for something like this to happen. Interlocks that cannot be disabled by the software.

    • @flaxeneel2905
      @flaxeneel2905 8 months ago +3

      @@gljames24 Till the cost savings kill people and destroy ur company's rep, yeah

  • @timeimp
    @timeimp 8 months ago +37

    This is a first-year topic in almost all Computer Science degrees. It's a really horrible way to make you realise that your code may kill - even if accidentally.

    • @arthurmoore9488
      @arthurmoore9488 1 month ago

      There's still plenty of stuff that can injure or kill people controlled by software. That's obvious. What should scare you is that some of it is written by a single person with a boss upset about deadlines. I'm not talking about the 80s. I'm talking about the 2000s, because I have been that dev.

    • @AndrewTSq
      @AndrewTSq 2 hours ago

      Not only software, but also the wrong type of sensors for the software. Like Tesla using only cameras to detect depth etc. in a picture, instead of a combination of sensors like ultrasound, radar etc. But I would say these "agile" software releases are also bad, because when I tell them the software is not ready, they just tell me to ship it anyway and fix the bugs later.

  • @helidrones
    @helidrones 8 months ago +21

    I was a software developer in the 1980s and had to deal with assembly language a lot. From today's perspective, programming back then was both easier and more difficult at the same time. There were fewer layers of abstraction and the tasks were less complex, but the tools were also less sophisticated and programmers often needed to talk to the machine directly. Compilers/assemblers were slow; imagine two hours of compile time. For this reason we often applied little changes directly to the object code and marked those changes in the printed (and archived) listing. Sometimes someone forgot to put a change note in the listing…

    • @snooks5607
      @snooks5607 6 months ago +2

      a bit younger but I remember the early internet wild west days before proper version control and build pipelines. people sometimes hacked stuff directly in prod out of convenience/laziness and then got their changes overwritten when things were "deployed" from QA. nowadays changes are supposed to go through a release process but hotfixes still sometimes override procedures (was swapping library files live this week when AWS deprecated old TLS and broke part of a sales system, git was the last place where they landed)

  • @gameprogramme
    @gameprogramme 8 months ago +14

    The important takeaway on this was kind of glossed over: the software with the race condition was written for the device with the hardware interlock, preventing the dangerous dosage. The next version of the device removed the hardware interlock but reused the lion's share of the software. (If I recall correctly, but can't find corroborating documentation, the engineer that made the changes for the machine that removed the interlocks was a DIFFERENT ENGINEER than the one that originally wrote the code, so was likely less familiar with it.)

    • @bakirev
      @bakirev 1 month ago

      You'd still be giving the wrong type of radiation, right? Not great either.

  • @sdstorm
    @sdstorm 8 months ago +25

    Too bad this was before Tom's time. JDSL would have prevented this. There would have been no way to change anything in the 8 seconds, because the UI would have been frozen waiting for a Subversion checkout.

    • @AndrewTSq
      @AndrewTSq 3 hours ago

      Or just print "moving arm" and wait in a loop for a value from a sensor confirming that all parts are where they are supposed to be, and if the loop takes more than x seconds, make it fail.
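
A sketch of that loop in Python, assuming a hypothetical `read_sensor` callable that returns `True` once every part reports being in position; the timeout and poll interval are placeholders.

```python
import time
from typing import Callable


class PositionTimeout(Exception):
    """Raised when the parts do not reach position in time."""


def wait_until_in_position(
    read_sensor: Callable[[], bool],
    timeout_s: float,
    poll_s: float = 0.01,
) -> None:
    """Block until the sensor confirms position, or fail after timeout_s."""
    print("moving arm")
    deadline = time.monotonic() + timeout_s
    while not read_sensor():
        if time.monotonic() >= deadline:
            raise PositionTimeout(f"not in position after {timeout_s} s")
        time.sleep(poll_s)
```

The caller only proceeds if the function returns normally; a `PositionTimeout` is meant to abort the treatment rather than be retried blindly.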

  • @crusaderanimation6967
    @crusaderanimation6967 8 months ago +22

    5:22 OK, so the error that said the patient got a too low or TOO HIGH dose of radiation needed to be decoded by a freaking technician, like it's a McDonald's ice cream machine? Instead of... you know, having that information at least in the manual, so the user can know on the spot that the patient might have a problem.

    • @anlumo1
      @anlumo1 8 months ago +2

      Yeah, I don't get why the errors are so obscure, this is just a medical radiation machine and not something really complex like an ice cream machine.

    • @R_got_a_name_change
      @R_got_a_name_change 8 months ago

      Yeah why is the death ray interface designed like a sony playstation?

    • @crusaderanimation6967
      @crusaderanimation6967 8 months ago

      @@anlumo1 I'm not saying the McDonald's ice cream machine is complicated, but that its error codes are purposefully confusing to earn money on tech support.

    • @anlumo1
      @anlumo1 8 months ago +1

      @@crusaderanimation6967 yeah I know, it was sarcasm.

    • @crusaderanimation6967
      @crusaderanimation6967 8 months ago +1

      @@anlumo1 touché

  • @annaczgli2983
    @annaczgli2983 8 months ago +15

    Ah, the Therac case, a staple of SW engineering classes since God knows how long. Now being covered by YT content creators. It's an Oldie but a Goldie.

  • @Kane0123
    @Kane0123 8 months ago +98

    It’s one of those things. I love writing code that can’t hurt anyone. But someone has to do it…
    Big ups to those who are passionate about writing code for stuff that matters!

    • @vitalyl1327
      @vitalyl1327 8 months ago +10

      Even a puny little frontend developer can write code that hurts people. My eyes and my aesthetic feelings are hurt almost every time I browse the web. Not to mention my time wasted on navigating badly designed UI with too many steps to do simple things.
      All developers must be regulated and treated the same way as doctors and civil engineers.

    • @Legac3e
      @Legac3e 8 months ago +1

      ​@@vitalyl1327Eh, frontend devs have only a minor possibility of hurting. Whereas those working on medical (and other life/death) devices have a highly probable chance of hurting without regulation. It doesn't really make sense to hold all software up to that standard unless we had something like an overabundance of regulators (maybe AI could get us there someday).
      Not to say that we shouldn't be looking to improve things across the board - but your stance doesn't seem reasonable or productive, imo.

    • @Legac3e
      @Legac3e 8 months ago

      Oh, and I'll just add that any developer who is working on something with a level of risk that is anywhere near what a civil engineer or doctor (or other) has should 100% be regulated up to the same standards as them. So my apologies if that is what your original point was.

    • @vitalyl1327
      @vitalyl1327 8 months ago +2

      @@Legac3e yep, and this is exactly what is not happening in the industry. And if I am to choose between regulating the crap out of everyone and not regulating at all, since selective regulation can be very hard to apply, I am for the former.

    • @Legac3e
      @Legac3e 8 months ago +1

      @@vitalyl1327 That is fair. Of the two extremes, I'd be for regulating everything, too. And at minimum increasing our current standards of regulation would likely be a positive overall, even if it isn't applied to everyone (yet?).

  • @de-ep
    @de-ep 8 months ago +29

    this is the first horror content on prime's channel

  • @ArkhKGB
    @ArkhKGB 8 months ago +16

    If you can't imagine this kind of cowboy coding with machines outputting X-rays, check the story of the demon core or some of the criticality incidents. Early nuclear handling was wild.

    • @raffimolero64
      @raffimolero64 Před 8 měsíci

      Just how many scientists made the exact same mistake...

  • @duncanw9901
    @duncanw9901 Před 8 měsíci +5

    This is what those weird people at your school with incomprehensible ramblings about formal semantics of programs were worried about.
    I just got a job writing testing tools for aerospace systems, so this all is top-of-mind.

  • @Mth-Ryan
    @Mth-Ryan Před 8 měsíci +11

    Usually, these kinds of machines are built with special distributions of languages designed for "critical missions". It's a really incredible field. SPARK, a variant of Ada, is one of these languages. Rust has a toolchain like this too, called Ferrocene. Imagine having to build something like this in C or goddamn assembly.

    • @HermanWillems
      @HermanWillems Před 8 měsíci +2

      You have MISRA C. But... Rust already covers a lot of MISRA C rules. If you would make a strict version of Rust it would be very well suited for such systems.

    • @Mth-Ryan
      @Mth-Ryan Před 8 měsíci +3

      @@HermanWillems oh, I didn't know about MISRA. I agree with you, a strict Rust has great potential for this niche. Actually, AdaCore, the company behind Ada and SPARK, is a major sponsor of the Rust Foundation. Given that this company is funded by the US Department of Defense, this is quite interesting.

  • @7heMech
    @7heMech Před 8 měsíci +112

    Here is why writing good code is important kids.

    • @IAmOxidised7525
      @IAmOxidised7525 Před 8 měsíci +6

      No, you need to test rigorously, good or bad is decided after testing

    • @AlexanderNecheff
      @AlexanderNecheff Před 8 měsíci

      @@IAmOxidised7525 Oh trust me, you can have shit-tier code that is difficult to review/modify/etc. that passes tests, maybe even for the wrong reasons because some shit happened to line up in memory just right.
      Good code is a separate thing from tested code.
      You need both good code and good testing to have a good product.

    • @vitalyl1327
      @vitalyl1327 Před 8 měsíci

      ​@@IAmOxidised7525Testing is not sufficient, you need a formal verification.

    • @Cyberspine
      @Cyberspine Před 8 měsíci +1

      @@IAmOxidised7525 The approach the companies take is that they make sure the code is good and then they test it exhaustively.

    • @HermanWillems
      @HermanWillems Před 8 měsíci +2

      @@IAmOxidised7525 Have you also written tests for your tests? By separate people?

  • @eufrozinak9461
    @eufrozinak9461 Před 8 měsíci +18

    Low Level Learning is great but the Well There's Your Problem podcast covered this topic in much more detail and they even had a medical professional as a guest

  • @MasterHigure
    @MasterHigure Před 8 měsíci +3

    It is wild to me that the software in cars isn't openly available. You just know thousands of volunteer hours would go completely unprompted into hunting for weaknesses in those things. "What is the failure mode on the brake pad thermometers and the tire pressure meters, and how does the code that takes in brake pedal position and transforms it into actual braking react to said failure mode, on a 2018 Nissan Qashqai?" And millions of similar little questions, many of which would be answered, because someone became curious and just went and found out. And the only thing that would happen is we would all be a bit safer.

  • @notapplicable7292
    @notapplicable7292 Před 8 měsíci +21

    Hardware interlocks are programmers' best friends. They genuinely help me sleep at night

    • @thewhitefalcon8539
      @thewhitefalcon8539 Před 8 měsíci +2

      Software interlocks are hardware designers' best friends.

    • @nyahhbinghi
      @nyahhbinghi Před 4 měsíci

      the ultimate interlock is Erlang or Node.js where zero memory is shared between threads. I am sure this program written in assembly had some concurrency memory issue race condition bullshit, which won't happen if you run in an environment that's universally strictly thread-safe. Only hard-real-time systems (like control systems for airplanes) need the performance level of assembly. This laser machine needed soft-real-time at best.

    • @fltfathin
      @fltfathin Před 4 měsíci

      I still don't understand how such a failure of a machine was made.
      There should be a feedback switch or something that indicates the raw state on screen, like lamp on = E mode, with the intensity shown on a dial or somewhere.
      Also, if it can run in dual mode, you should be able to limit the output of the beam with a hardware switch that changes the DAC input to a safe amount.

  • @b0b0_
    @b0b0_ Před 8 měsíci +5

    Prime: sees a picture
    Prime: "is that a chat GIBIDY image ?"

  • @kaansal9523
    @kaansal9523 Před 8 měsíci +18

    When I read the title, I read the "Prime Reacts" as "Crime Reacts" because of the word "killed".

    • @devqubs
      @devqubs Před 8 měsíci

      Me too

    • @andrejszasz2816
      @andrejszasz2816 Před 7 měsíci

      When I see the moustache I don’t even read that part of the title ;)

  • @Brawaru
    @Brawaru Před 8 měsíci +31

    There's a much better and more detailed video on that by Kyle Hill, I really recommend watching it too. It wasn't just this bug alone, and it wasn't just about the code (although you pretty much figured this out at this point). It's a must-learn topic in computer ethics class.

    • @gilbertovampre9494
      @gilbertovampre9494 Před 8 měsíci +1

      Could you please send a link to that video you mentioned?

    • @benjie_wh
      @benjie_wh Před 8 měsíci

      ​@@gilbertovampre9494just search for Kyle Hill Therac 25

    • @velifurkanturkoglu1387
      @velifurkanturkoglu1387 Před 8 měsíci

      @@gilbertovampre9494 czcams.com/video/Ap0orGCiou8/video.html I guess this is this one.

    • @stefano_schmidt
      @stefano_schmidt Před 8 měsíci

      ​@@gilbertovampre9494
      links are deleted. Type the name:
      "History's worst software error"

    • @polygontower
      @polygontower Před 8 měsíci

      @@gilbertovampre9494 czcams.com/video/Ap0orGCiou8/video.html

  • @danielreed5199
    @danielreed5199 Před 15 dny +1

    This reminds me of the programmer who was found dead in the shower holding a bottle of shampoo.
    The autopsy revealed he died from starvation and exhaustion, they could not figure out why this had happened until
    they read the instructions on the shampoo bottle "Wash, Rinse, Repeat".

    • @tempname8263
      @tempname8263 Před 13 dny

      This is horrifying...

    • @AlexRodriguez-gb9ez
      @AlexRodriguez-gb9ez Před dnem

      I have an idea for fixing infinite loops; it's called probabilistic coding. You can change the formal semantics of Lisp/eval and Haskell's lambda calculus eval to change all expressions from being E to being (E, 0.95), where 0.95 is a probability. You can then run a program 1000 times and check whether some constraint holds to see if the code is being fixed or not.

  • @colemichae
    @colemichae Před 8 měsíci +5

    In the 80's some people did think it would just work. And remember, it was the operator altering a previous selection after the system had started adjusting, so it would never have been a primary test. They would have tested likely changes, not X to E, as that would have been weird: the assumption was that the operator would never select the wrong value and then change it after all the other figures were entered.
    Don't start moving until the last punch has been verified.

  • @sdstorm
    @sdstorm Před 8 měsíci +4

    It's interesting how this story exploded recently. I've heard this story like 5 separate times in my university lectures 10 years ago.

  • @RandomStuff-zt6qf
    @RandomStuff-zt6qf Před 8 měsíci +8

    It's not that we believed "software will never fail", but rather "computers don't make mistakes, people do". That hasn't changed. I've been coding since 1992 btw.

    • @TheNewton
      @TheNewton Před 8 měsíci

      And the missing unsaid part: "computers don't make mistakes, people do" "and don't need to be accountable as a discipline in the same way that engineers or doctors are regulated"

    • @RandomStuff-zt6qf
      @RandomStuff-zt6qf Před 8 měsíci +3

      @@TheNewton what are you even saying? Doctors have one of the highest error rates of any profession

  • @klausjoachimrichter3495
    @klausjoachimrichter3495 Před 8 měsíci +3

    I started programming in 1981 as a student and financed my studies by programming during the semester breaks. I mostly programmed in Z80 assembler language. Using assembler was quite normal back then. However, the mindset back then, either for me or my colleagues, was not that software always works if it has worked once before. Testing was necessary even then. But there were no unit tests at that time. Therefore, I often worked with the debugger.

  • @matthewstott3493
    @matthewstott3493 Před 6 měsíci +1

    I read this story somewhere else years ago. They didn't have the developer redesign the logic. I seem to remember something about technicians removing keys from the keyboard to prevent the operators from altering the prescription once the process started. Deleting the keybinding by removing the physical key itself on the keyboard! The operator had no way to change the X / E parameter once the form is submitted to the next stage. Only an abort function keybinding which would reset the machine to baseline and start the whole thing over from the beginning. Eventually the machines were recalled / replaced. It really was the Wild West in the early 80's. Software Engineering was completely new. Most programmers were self-taught without formal college engineering programs. There were no regulations nor industry standards nor oversight, etc. You had very limited memory, storage, and processing power, every byte counted. Ultimately the manufacturer was liable. This isn't exactly a bug and it wasn't entirely operator error. The flaw was not handling the unexpected input properly.

  • @blarghblargh
    @blarghblargh Před 8 měsíci +2

    glad you gave the example to the people in chat. this is as serious as you're presenting it

  • @bassycounter
    @bassycounter Před 21 dnem

    Whoever said “b for beam - you know that’s a good program” in the twitch chat made my day 😂

  • @scheimong
    @scheimong Před 8 měsíci +8

    I was writing a version extraction function a bit earlier today, something that takes the string printed by `program --version` and extracts the version number.
    I thought "oh it's literally a simple regex and five lines in total so why bother with a test". And if it was my personal project that would have been it. But then I thought "okay this is a public repo, I'll be a bit more responsible" and went ahead and wrote a simple test.
    Lo and behold, I incorrectly used a `+` modifier on a capture group instead of a `*`, so now a naked major version is not matching 🤦. Needless to say I'm glad I wrote that test.
    I guess the lesson here is to just accept the fact that you will always be a shitty programmer, no shame in that 😅.
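The `+` versus `*` mistake described above can be sketched roughly like this. The actual regex from that repo isn't shown, so the patterns and the `extract_version` helper below are hypothetical stand-ins, assuming a version is a major number followed by optional `.N` parts:

```python
import re

# Hypothetical patterns, not the commenter's real ones.
# Correct: a major number followed by ZERO or more ".N" groups.
VERSION_RE = re.compile(r"\b(\d+(?:\.\d+)*)\b")
# Buggy: `+` demands at least one ".N" group, so a bare major version fails.
BUGGY_RE = re.compile(r"\b(\d+(?:\.\d+)+)\b")


def extract_version(output, pattern=VERSION_RE):
    """Return the first version-looking token in `output`, or None."""
    match = pattern.search(output)
    return match.group(1) if match else None


# A tiny test like the one the comment describes: it passes for the
# correct pattern and exposes the bare-major-version hole in the buggy one.
assert extract_version("program 1.2.3") == "1.2.3"
assert extract_version("program 7") == "7"
assert extract_version("program 7", BUGGY_RE) is None
```

The last assertion is the whole point: without a test case for a naked major version, both patterns look identical on the usual `1.2.3` input.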

    • @TinBryn
      @TinBryn Před 8 měsíci +7

      I'm reminded of the saying "I have a problem so I will try regex, now I have 2 problems"

  • @tankspl7
    @tankspl7 Před 8 měsíci +4

    Having programmed PDP-11s using RTX-11 real time operating system in assembly I can understand how those errors can occur. That being said, race conditions etc. are typical errors one should watch and test for. That's when code reviews (yes, we had them back then) and exhaustive testing come into play.

  • @reyco1982
    @reyco1982 Před 8 měsíci +4

    Also happening in self-driving cars, even though they are using the best practices in writing software.

    • @vitalyl1327
      @vitalyl1327 Před 8 měsíci +3

      Not even self-driving - see the Toyota braking bug.

    • @HermanWillems
      @HermanWillems Před 8 měsíci

      @@vitalyl1327 Toyota has more. Toyota had spaghetti code that killed many people by suddenly accelerating!!! Without you even pressing the gas pedal.

  • @claverbarreto5588
    @claverbarreto5588 Před 8 měsíci +2

    dude i was feeling sad at first, but that "Dividing by Zero shit" caught me off guard 🤣

  • @FabulousFadz
    @FabulousFadz Před 8 měsíci +14

    I'm 5 minutes in and already horrified. This is why I'm extra cautious with my code quality, even if the stakes aren't this high. It was such a long time ago (still in high school) when I read Alan Cooper's book, "The Inmates are Running the Asylum", and it had a number of ways code could screw things up... from mildly annoying to downright dangerous. Each chapter started with a question, "What do you get when you cross a computer and a *Foo?"* for any foo. The answer was always a computer, because of how screwing things up in the code would mess up the other thing. You can't reboot a plane, a laser gun, or a car... the code shouldn't require that when mixed with these things.

    • @monad_tcp
      @monad_tcp Před 8 měsíci +3

      " You can't reboot a car " I literally had to reboot my car in the middle of a onramp stop because of an ECU problem. That day I was thinking, what if that thing happened when I was on the road. I thank the engineering for my brakes being totally manual and having no electronics.

    • @FabulousFadz
      @FabulousFadz Před 8 měsíci +1

      @@monad_tcp ooh, that sucks. (PS: I should have qualified that statement. Haha. You can't safely reboot a car, plane, etc when in motion.)
      From the book: _*"What Do You Get When You Cross a Computer with a Car?*
      A computer! Porsche’s beautiful high-tech sports car, the Boxster, has seven computers in it to help manage its complex systems. One of them is dedicated to managing the engine. It has special procedures built into it to deal with abnormal situations. Unfortunately, these sometimes backfire. In some early models, if the fuel level in the gas tank got very low (only a gallon or so remaining), the centrifugal force of a sharp turn could cause the fuel to collect in the side of the tank, allowing air to enter the fuel lines. The computer sensed this as a dramatic change in the incoming fuel mixture and interpreted it as a catastrophic failure of the injection system. To prevent damage, the computer would shut down the ignition and stop the car. Also to prevent damage, the computer wouldn’t let the driver restart the engine until the car had been towed to a shop and serviced."_

    • @monad_tcp
      @monad_tcp Před 8 měsíci +1

      @@FabulousFadz also, to prevent damage the computer would not let you apply the brakes because it was fly by wire.
      Those are my nightmares with cars.

  • @gavinbarnard2220
    @gavinbarnard2220 Před 2 dny

    This is a story about a Hydro Power company in northern BC. Bills were sent out on computer punch cards in the late 70s to mid 80s. There was an error and a person was overbilled. They contested that the bill was incorrect and received the response "The computers are never incorrect!". This was sent back with a request to have it notarized and returned, and if it was, they would pay off the balance. A few months later, after paying the bill normally so as not to raise suspicion immediately, the person began altering the punch card portion at the top right so that the bill showed a negative balance, and paid it for 0. The person did this for about 3 months until the Hydro Power company finally determined there was a billing error and that they didn't really owe a credit. The person responded with a notarized copy of the previous letter. This ended up in claims court, with the judge ruling in favour of the full credit balance being paid to the client, as per the company's own notarized letter: "The computers are never incorrect!"

  • @makesnosense6304
    @makesnosense6304 Před 8 měsíci +2

    Software does EXACTLY what you tell it to do. And that is also the problem, as you might have missed edge cases or not handled certain extreme values. This is why fuzz testing is good. Writing unit tests is good, but it's not only about coverage; it's about testing all kinds of input values to make sure you handle them properly. It's easy to write code that works and have unit tests for that, but the problems occur when the input is something you didn't think of, and when it fails and you don't handle cases you didn't think of. This is also, by the way, why I like Go's approach, where every function follows the flow of return on failure, and if you reach the bottom of the function, everything above is OK. But this still requires you to unit test with data beyond what you were initially expecting.
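A minimal sketch of both ideas from this comment, written in Python rather than Go: a function that returns early on every failure, and a crude random fuzzer that throws inputs nobody planned for at it. `parse_dose`, `MAX_DOSE` and the input alphabet are all made up for illustration; a real project would more likely use a dedicated fuzzing or property-testing tool.

```python
import random

MAX_DOSE = 200  # hypothetical upper limit, for illustration only


def parse_dose(text):
    """Return (value, None) on success or (None, reason) on failure.

    Every failure returns early; reaching the bottom means all checks passed.
    """
    try:
        value = int(text.strip())
    except ValueError:
        return None, "not an integer"
    if value < 0:
        return None, "negative"
    if value > MAX_DOSE:
        return None, "above limit"
    return value, None


def fuzz(rounds=10_000, seed=0):
    """Feed random short strings to parse_dose and check its invariants."""
    rng = random.Random(seed)
    alphabet = "0123456789-+ .eExX\t"
    for _ in range(rounds):
        text = "".join(rng.choice(alphabet) for _ in range(rng.randint(0, 6)))
        value, err = parse_dose(text)
        # Exactly one of (value, err) must be set.
        assert (value is None) != (err is None), repr(text)
        # Anything accepted must be inside the allowed range.
        if value is not None:
            assert 0 <= value <= MAX_DOSE, repr(text)
    return rounds
```

The fuzzer does not know what the "right" answer is for each input; it only checks properties that must hold for every input, which is what lets it cover cases the author never thought of.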

  • @fatalsg92
    @fatalsg92 Před 7 měsíci +2

    Note to self: if your software involves the safety of human lives, make damn sure that you implement testing and diligently do it properly, even if your managers are already harping on you about some stupid deadline.

  • @Mel-mu8ox
    @Mel-mu8ox Před 8 měsíci +3

    I watched a different vid about this.
    It had pictures of the ppl before they died...
    It was like holes were rotting through them where the radiation had gone through :/

  • @sabarblatoe
    @sabarblatoe Před 8 měsíci +2

    Honcho: "So what Dev Team are we using for the software?"
    .............
    Emp: "Craig........."

  • @whatwhat9519
    @whatwhat9519 Před 8 měsíci +1

    i told my dad this yesterday: that behind every app, website, program, computer there is a person who chose (either purposely or accidentally) for it to look and operate that way
    and that it doesn't all just come out of the ether

  • @almari3954
    @almari3954 Před 3 měsíci +2

    This was a concurrency bug. No amount of traditional testing will help with that. You need model checking and/or formal verification. Alternatively no tasks, just a single polling loop and even then rigorous measurement of timings is necessary.
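To make the "single polling loop" alternative concrete, here is a toy sketch. It is not how the Therac-25 software was structured; the `Machine` class, the event names and the X/E mode strings are assumptions chosen only to show the shape of the idea: one loop handles one event at a time, so an operator edit can never interleave with hardware setup, and firing is refused unless the configured state matches the latest request.

```python
from collections import deque


class Machine:
    """Toy model: all state changes go through step(), one event at a time."""

    def __init__(self):
        self.requested_mode = None   # what the operator last entered
        self.configured_mode = None  # what the hardware was last set up for

    def step(self, event):
        kind, arg = event
        if kind == "set_mode":
            self.requested_mode = arg
            return "mode requested"
        if kind == "configure":
            self.configured_mode = self.requested_mode
            return "configured"
        if kind == "fire":
            # Re-check at the last moment: a late edit invalidates the setup.
            if self.configured_mode is None:
                return "refused"
            if self.configured_mode != self.requested_mode:
                return "refused"
            return "fired " + self.configured_mode
        return "ignored"


def run(events):
    """Single polling loop: drain the queue strictly in order."""
    machine = Machine()
    queue = deque(events)
    log = []
    while queue:
        log.append(machine.step(queue.popleft()))
    return log
```

For example, `run([("set_mode", "X"), ("configure", None), ("set_mode", "E"), ("fire", None)])` ends in `"refused"`, because the edit arrived after the hardware was configured. As the comment says, this only removes the interleaving problem; timing still has to be measured, and it is no substitute for model checking.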

    • @nackha1
      @nackha1 Před 25 dny +1

      This. It seems like those mathematical methods are completely forgotten during discussions of current software development, even though the foundations have been around for decades

    • @almari3954
      @almari3954 Před 25 dny +1

      @@nackha1 Yeah, totally agree. There is some adoption by industry (distributed systems mostly), but way too little in my opinion.

  • @PorthoGamesBR
    @PorthoGamesBR Před 8 měsíci +8

    For me there should be a big red sign saying: "If you messed something up, don't try to fix it. Turn everything off and start again", because this is as much an error from the manufacturers as it is from the user.
    Edit: I put the operator when I was actually referring to the user, the hospital in this case. They should train the operator on a default case: basically, when an error happens, something out of the plan (like writing something wrong in the input), no matter how small it seems, always turn the machine off and restart the whole thing.

    • @kevinscales
      @kevinscales Před 8 měsíci +2

      Only if the manufacturer has communicated that rule to the operator. Otherwise it's all on the manufacturer.

    • @Ausar0
      @Ausar0 Před 8 měsíci +1

      nah the operator shouldn't be blamed at all in this instance. they were trained how to use it, and they used it exactly according to that training. I can't even blame the person who wrote the code, because it should have gone through testing.
      This is 100% on the manufacturers for removing the hardware locks and not properly testing the machine before sending it out.

    • @andrejszasz2816
      @andrejszasz2816 Před 7 měsíci

      This is just a workaround for bad or nonexistent testing. This approach only works because the default “happy path” behavior is usually tested by the programmers themselves. Many consecutive times. It must be daunting to start everything over from scratch all the time.
      Like you have misspelled the zip code for the patient which would then reset the form to blank and start over. I don’t think even nurses can handle this long term (they are experts at repetitive tasks)

  • @WisherTheKing
    @WisherTheKing Před 8 měsíci +12

    It’s quite scary to think somebody like us programmed such machines. I do freak out thinking about this. 😮

    • @bionic_batman
      @bionic_batman Před 4 měsíci

      Not that scary, at least nobody programs those machines in Javascript yet so they are at least somewhat reliable even when not properly tested

  • @billeterk
    @billeterk Před 8 měsíci +5

    We covered this in undergrad comp sci.

  • @JArielALamus
    @JArielALamus Před 8 měsíci +3

    People being too confident in software is something still present today. Just this year, I found out that a company that offers tax calculation and payment services, has their accountants being too confident in their in-house software resulting in them not doing their job properly, and wrong taxes being presented due to faulty data.
    They should detect edge cases but if the software doesn't ask for it, they won't ask for it. The cherry on top is, the software doesn't allow for corrections and breaks spectacularly if you try to do so

    • @TheNewton
      @TheNewton Před 8 měsíci

      ftfy: c̶o̶n̶f̶i̶d̶e̶n̶t̶ not legally regulated or accountable.
      Even though actual accountants are regulated. It's bizarre, but it's bound to change.

    • @JArielALamus
      @JArielALamus Před 8 měsíci

      @@TheNewton they are legally accountable, after all they have to put their signatures and their professional license to get those tax declarations accepted.
      Also, the government here likes to wait to make the fines bigger, so it takes time before that company starts getting legal action coming to them and it won't be cheap

  • @zebedie2
    @zebedie2 Před 8 měsíci +3

    This was around 1982, so home computers were barely a thing at this point (think ZX Spectrums and other 8-bit machines) and the memory constraints were really tight. So it doesn't surprise me it was written in assembler, as it's not just the machine you're programming that has tight constraints but also the machine doing the programming. The big flaw was only having one guy write and test the code, but it was a very specialist thing back then.
    This does remind me of the plastic surgery machine in Logan's Run that goes haywire, I think it was the "escapulator"

    • @CertifiedDynamite
      @CertifiedDynamite Před 8 měsíci +3

      I read a technical paper about this incident (there were, in fact, more glitches in Therac 25 than this one) and the use of assembly as the programming language was not seen as a contributing factor. The real flaws were poor troubleshooting, incompetent risk analysis, failure to act in time and the fact that Therac 25 was never tested as fully assembled by AECL. Marietta hospital physicist Tim Still deserves a lot of recognition for figuring out how to repro the bug.

  • @voidmachines
    @voidmachines Před 8 měsíci +2

    that's why formal verification tools like TLA+ were created

  • @heyariaz
    @heyariaz Před 8 měsíci +5

    A finger chopping machine sent me here 😂

  • @ww123ification
    @ww123ification Před 8 měsíci +1

    I heard this during one MeetUp session where some SW related to DevOps was being introduced: "The log files will record all errors except for irreproducible bugs, because if a bug cannot be reproduced, there is nothing we can do about it". So it "taught" me that if you ignored the worst kind of bugs, they did not exist (until someone got hurt or killed). Needless to say, the code of this SW was not functional or immutable, which is not a silver bullet by itself, but FP can help quite a lot to limit bug occurrence.

  • @malcolmhutchison
    @malcolmhutchison Před 8 měsíci +2

    I remember the days we laughed about billing software that sent final demands for 0.00

  • @justinbrooks7823
    @justinbrooks7823 Před 8 měsíci +2

    I programmed in assembler in the 1980s alongside engineers building prototype hardware. Of course either could fail, and part of the skillset was working out which was the issue for a given bug. None of it was safety critical (sound mixing gear) but all of it was still built and tested BEFORE going into the wild. This story was no more acceptable then than it would be today.

  • @homelessrobot
    @homelessrobot Před 8 měsíci +2

    you have to think about 'software responsibility' like you do human responsibility (because that's actually what it is). If the software is employed in a way where the people who commissioned and designed it knew that faults had a high potential of being directly involved in causing an injury or death, then they are legally obligated almost everywhere to take extraordinary precautions to prevent such a thing. Not doing so is criminal negligence, like drunk driving or shooting someone on a movie set with a real fake gun.

  • @bambitsunami4165
    @bambitsunami4165 Před 8 měsíci +11

    The really scary thing is that even if there were hardware locks, several professional engineers coding in a high level language, more extensive software testing AND more extensive testing with users/operators, there might STILL have been an “unpredictable” software error that resulted in deaths. Testing reduces the chance of errors (drastically) but doesn't guarantee that all possible errors are accounted for. Sometimes engineering is about preventing mistakes, and sometimes it's about learning from mistakes.

    • @dioneto6855
      @dioneto6855 Před 8 měsíci +3

      Not long ago, two 737 MAX planes crashed due to software relying on a single sensor, which in one case was installed without being calibrated. Worth taking a look at the case.

    • @stoogel
      @stoogel Před 8 měsíci

      ​@@dioneto6855Seemingly another greed move- if I remember correctly Boeing had built redundant sensors for military planes with the MCAS.

    • @Andyzzzz501
      @Andyzzzz501 Před 8 měsíci +2

      There is something called "software correctness proofs"

  • @Astraeus..
    @Astraeus.. Před 3 měsíci

    I've seen software errors related specifically to a user changing an input parameter after having set it in the first place, just as the nurse here did when changing her X to an E. Specifically, I've seen cases where the visual interface shows the input change (in this case, from the X to the E) but the actual change in software never occurs. That's approximately the case in this scenario; it's just one of the many things that can go wrong in these kinds of systems. It's also one of the things that bug testing (which wasn't really even done here) doesn't necessarily catch, since you can't really check for things you don't think to check for, and it's the kind of simple little error that somebody could very easily overlook. I've even seen situations where the change from, for example, X to E works as intended, but if you do it the other way around, E to X, the UI updates while the actual change doesn't take place in software. So again, it's one of those things that might not be caught with testing unless the person specifically thought to check every possible sequence of inputs and input changes.
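Checking "every possible sequence of inputs and input changes" is cheap to automate when the input space is small. A rough sketch, assuming a hypothetical form with one mode field that takes X or E: `Form` keeps the displayed and applied values in sync, while `BuggyForm` imitates the failure described above by updating the display on an edit but leaving the applied value stale.

```python
from itertools import product

MODES = ("X", "E")


class Form:
    """Hypothetical entry form: what the UI shows vs. what the machine uses."""

    def __init__(self):
        self.displayed = None
        self.applied = None

    def enter(self, mode):
        self.displayed = mode
        self.applied = mode


class BuggyForm(Form):
    """Same UI, but edits after the first entry never reach `applied`."""

    def enter(self, mode):
        self.displayed = mode
        if self.applied is None:
            self.applied = mode


def failing_sequences(form_cls, max_len=3):
    """Try every entry/edit sequence up to max_len; return the ones that break."""
    failures = []
    for length in range(1, max_len + 1):
        for seq in product(MODES, repeat=length):
            form = form_cls()
            for mode in seq:
                form.enter(mode)
            # Display and applied state must both equal the last thing entered.
            if not (form.displayed == form.applied == seq[-1]):
                failures.append(seq)
    return failures
```

`failing_sequences(Form)` comes back empty, while `failing_sequences(BuggyForm)` lists every sequence whose last entry differs from its first, including both `("X", "E")` and `("E", "X")`, so a one-directional bug would show up as well.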

  • @yannick5099
    @yannick5099 Před 8 měsíci +35

    This gives UX a completely new meaning. Industries where you can harm someone just by making a seemingly small or easy to overlook error are really scary. I find it crazy that we have so little legislation about software developers despite the huge role software has in modern devices.

    • @Cyberspine
      @Cyberspine Před 8 měsíci

      Medical device software is heavily regulated and monitored by the FDA.

    • @HermanWillems
      @HermanWillems Před 8 měsíci +6

      Uhm, developing software for devices that have the possibility to kill someone is a TOTALLY different world than script kiddies writing front end Javascript. It's just totally different.

    • @tah3460
      @tah3460 Před 8 měsíci +2

      Even more legislation in the medical field scares me even more

    • @thibautconstant3942
      @thibautconstant3942 Před 8 měsíci +3

      I have some experience with medical UI for embedded systems and HIS. They are mostly confusing, error-prone and often buggy. I attribute it to the fact that at the end of the day, the medical software industry is still a niche market.

    • @TehKarmalizer
      @TehKarmalizer Před 8 měsíci

      @@thibautconstant3942I attribute it to the fact that at the end of the day, it’s still software.

  • @rymdsate
    @rymdsate Před 3 měsíci

    I used to work as a programmer within medtech, more specifically visualization of MRI, CT, PET, NMR, etc.
    Back in the days, a small error in a printer routine could start a chain of delays, which in effect postponed emergency treatment, thus potentially costing lives. :/

  • @MrDivinePotato
    @MrDivinePotato Před 25 dny

    I watched Kyle Hill's video on this with some programmer friends and he opens with "We don't think of software as something that can break". Needless to say, hilarity ensued.

  • @pluvio_phile4554
    @pluvio_phile4554 Před 8 měsíci +2

    "a finger chopping machine sent me here"

  • @patrickhollywood93
    @patrickhollywood93 Před 8 měsíci +7

    omg. Saying software doesn't fail is like saying numbers don't lie. If you've ever worked with either you know this to not be the case. I suppose we live and learn. Sad story, but a good one. Thanks PrimeTime. ;)

    • @TheNewton
      @TheNewton Před 8 měsíci +1

      Ah, the joys of sidestepping rhetoric and strawmen
      "Computers don't make mistakes, people do"
      "Guns don't kill, people do"
      "Chemicals don't pollute the waters, people do"
      "numbers don't lie, people do"
      Frame it as an argument about the former's inability to have involvement, to avoid discussing accountability for the latter, and regulation for both.

  • @CoseDaTux
    @CoseDaTux Před 4 měsíci

    i read a book called Rendezvous with Rama (by Arthur C. Clarke) where the commander of a space mission died because of an infinite loop in the code of the medical system during an operation

  • @andrejszasz2816
    @andrejszasz2816 Před 7 měsíci

    I got an FMEA ad after watching this 😂
    Failure Modes and Effects Analysis (FMEA) is a systematic, proactive method for evaluating a process to identify where and how it might fail and to assess the relative impact of different failures, in order to identify the parts of the process that are most in need of change.

  • @heavymetalmixer91
    @heavymetalmixer91 Před 8 měsíci +2

    Imagine: Someone creates a new medical treatment machine, programmed with JS.

  • @jimtekkit
    @jimtekkit Před 3 měsíci

    I've done a little bit of assembly programming and you very quickly learn that you're basically walking on a tightrope. Make one wrong move and there's no safety net to catch your fall, it's game over. Just the idea of running it in production with no risk mitigation or testing makes you go numb with fear.

  • @fu886
    @fu886 Před 8 měsíci +1

    it was pretty common in the 80s to have a 2-3 person studio, and most code was written in assembly

  • @blargd
    @blargd Před 8 měsíci +4

    It sounds stupid but there's a reason I actively stay away from programming jobs that can have that kind of impact if I go wrong, I used to work with big pharma companies and that just was the worst, ever since then I've stuck to jobs with companies that don't have that level of responsibility if I get it wrong, currently working in e-commerce worst case is if I get it wrong a multi-billion company loses some money but nobody gets hurt at least

  • @Iscream4j0y
    @Iscream4j0y Před 2 měsíci

    I write tests to ENSURE that the code works the way I think it does, I write my tests to challenge my code and make sure that it's not doing something other than I intended. I work in the healthcare industry, my code gets hammered on by the public more than 10,000 times a day.
    It drives me nuts when I learn that the code that I've been logging out every single step of the way has not actually been working the way I intended.
    It's forced me to hard-type everything I possibly can

  • @ingrudmessenger1193
    @ingrudmessenger1193 4 months ago

    I mean... a guy I know was an electrical engineer and installed an infrared laser to cut metal at one of the world's leading companies for car parts. An actual invisible laser that cuts metal. He was a hobby programmer... and alone xD That wasn't in the 90s... more like the early 2000s.

  • @UnityGuy149
    @UnityGuy149 3 months ago

    Literally the same decade as the Shuttle explosion. They literally threw caution to the wind with systems that they knew should be in place, but assumedddddddd things would be okay without them. To quote the dad in Bluey, "The 80s was a wild place"

  • @kyoai
    @kyoai 5 months ago

    Imagine having a dialog message pop up: "Radiation dose either too high or too low. Continue?" Would you hit 'continue'?

  • @cprn.
    @cprn. 4 months ago

    It's very, very easy to understand 80s devs. Business: "Just write a proof of concept so we know what we need, and then somebody with engineering experience will go over it."

  • @H3cJP
    @H3cJP 8 months ago +1

    Some of the other times it failed, it wasn't because of that. It was because of another bug, an integer overflow in a one-byte counter, a classic that killed people.

  • @haschid
    @haschid 3 months ago

    7:25 To be fair, SPARK (which is used on planes, military equipment, etc.) supports mathematical proof of code correctness. This specific error would not have happened if the system had been built with SPARK.

  • @drakecarman-loveless9151
    @drakecarman-loveless9151 8 months ago +5

    Finger chopping machine sent me here

  • @stanisawdawidowicz5100
    @stanisawdawidowicz5100 2 months ago

    They had hardware locks on the version that was manually operated (in those versions the software did some calculations for you and told you what values to set on various knobs and dials, but you still set those manually). They removed the hardware locks on the version that was software operated because of the whole "software never fails" mentality.
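
    The point about removed hardware locks can be illustrated with a minimal sketch, assuming a made-up model of the machine. The names `Sensors`, `software_check`, `hardware_interlock`, and `may_fire` are hypothetical and not taken from the Therac-25 code. The idea is only that an independent check on physical state still blocks the beam when the software's own bookkeeping is wrong.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sensors:
    """Physical state read directly from the hardware (hypothetical fields)."""
    target_in_place: bool
    beam_current_high: bool


def software_check(believed_mode: str) -> bool:
    # Software-only check: it trusts its own idea of the current mode.
    return believed_mode in ("xray", "electron")


def hardware_interlock(sensors: Sensors) -> bool:
    # Independent check on physical state (assumed rule for this sketch):
    # a high beam current is only permitted with the target in place.
    return sensors.target_in_place or not sensors.beam_current_high


def may_fire(believed_mode: str, sensors: Sensors) -> bool:
    # Defense in depth: both layers must agree before the beam fires.
    return software_check(believed_mode) and hardware_interlock(sensors)


if __name__ == "__main__":
    # Software believes everything is fine, but the hardware is in an
    # unsafe combination: high current with no target in the beam path.
    unsafe = Sensors(target_in_place=False, beam_current_high=True)
    print(software_check("electron"))    # software alone would allow it
    print(may_fire("electron", unsafe))  # the interlock blocks it
```

    In a real machine the interlock would be a physical circuit rather than a function; the function here only models the effect of keeping a second, independent layer.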

  • @123Handbuch
    @123Handbuch 7 months ago

    Reminds me of priority inversion (or whatever it was called) and of parallelized code not being allowed in software used for airplanes.

  • @pabloagsutinnavavieyra2308
    @pabloagsutinnavavieyra2308 8 months ago +4

    A finger chopping machine sent me here

  • @GRHmedia
    @GRHmedia 8 months ago +1

    There are a few things that will also get you. The computer you are shown is just a terminal. It is connected to a PDP-11 multiprocessing computer, which is what controls the Therac-25.
    It was due to the race condition which occurred from multiprocessing that the error was allowed to happen. If the process had been limited to a single thread it would likely have been caught.
    They probably did test the software, but with a dummy system rather than the actual Therac-25. In that case the dummy probably did something like light up a certain light if the signal did what it was supposed to. The problem is that it most likely didn't replicate the Therac-25's behavior entirely, such as that 8-second block of time in which it doesn't see input changes. That would most likely have been done for several reasons: the cost of using the actual machine, legal requirements for the operator of the machine, and their licensing requirements. (Imagine trying to find a programmer who also happens to be a trained radiologist.) I would bet that is where this issue really starts: in a politician's lap. Some idiot created a law without regard for how it would affect the development process. I've seen that too many times.
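
    The kind of race described above, where an edit lands during a window in which the machine does not see input changes, can be sketched roughly as follows. This is a simplified, deterministic model and not the actual Therac-25 code: `Console`, `Machine`, `begin_setup`, `fire_unchecked`, and `fire_checked` are hypothetical names, and the interleaving is scripted by hand instead of using real concurrent tasks so that the outcome is reproducible.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Console:
    """Operator-entered prescription (hypothetical model)."""
    mode: str = "xray"
    version: int = 0  # bumped on every operator edit

    def edit(self, mode: str) -> None:
        self.mode = mode
        self.version += 1


@dataclass
class Machine:
    """State configured by the setup task (hypothetical model)."""
    configured_mode: Optional[str] = None
    configured_version: int = -1


def begin_setup(console: Console, machine: Machine) -> None:
    # Snapshot the prescription; the slow hardware setup works from this copy.
    machine.configured_mode = console.mode
    machine.configured_version = console.version


def fire_unchecked(console: Console, machine: Machine) -> str:
    # Bug: assumes nothing changed on the console while setup was running.
    return f"fired: hardware={machine.configured_mode}, screen={console.mode}"


def fire_checked(console: Console, machine: Machine) -> str:
    # Guard: refuse to fire if the console was edited after the snapshot.
    if machine.configured_version != console.version:
        return "refused: prescription changed during setup"
    return f"fired: hardware={machine.configured_mode}, screen={console.mode}"


def run(fire) -> str:
    console, machine = Console(), Machine()
    begin_setup(console, machine)  # setup task takes its snapshot
    console.edit("electron")       # operator edits during the setup window
    return fire(console, machine)  # beam task runs afterwards


if __name__ == "__main__":
    print(run(fire_unchecked))  # fired: hardware=xray, screen=electron
    print(run(fire_checked))    # refused: prescription changed during setup
```

    A test rig that never edits the prescription during the setup window would pass with either version, which fits the comment's point that a dummy system can miss timing-dependent behavior.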

  • @Akaterial
    @Akaterial 5 months ago

    If you want an idea of what people thought of computers back then, watch Superman III with Richard Pryor as a programmer named Gus. Gus creates an energy crisis by sending all the oil tankers to the middle of the ocean and creates a tornado with a weather satellite, in addition to more run-of-the-mill stuff like embezzling. Programmers were magicians.

  • @lucasirondesouzacamargo1540
    @lucasirondesouzacamargo1540 8 months ago +1

    7:40 better than the alternative, which is imagining those programs were written by machines.

  • @dezlymac
    @dezlymac 8 months ago +2

    And...This is one of the reasons why I'll probably never get LASIK (laser eye surgery). As a programmer, I can't afford to take such risks with my biggest asset, my eyes. I know they test these things, but all you need is ONE tiny screw-up at the software, hardware, or staff level and yikes. Sure, you could sue the company for money, but some things are just too precious to lose.
    I think I'll stick with my regular prescription glasses unless I absolutely need surgery.

  • @demolazer
    @demolazer 8 months ago +2

    Heard this story before. They had one guy coding the whole thing with no oversight whatsoever, then denied the problem existed. Utterly disgusting, and they faced no consequences.
    And 5 patients died and nobody banned the goddamn thing for a year until it was tested into the ground. FFS

  • @talis1063
    @talis1063 8 months ago +1

    Any software developer or engineer that's required to be solely responsible for any important part of a safety critical system should just refuse the job. The higher-ups that asked should be responsible as well if anyone should accept.

  • @pursuantspy1557
    @pursuantspy1557 7 months ago

    This is such a famous incident that it’s taught in engineering school. It was one of the first things we learned about, in fact.

  • @Leppits
    @Leppits 3 months ago

    10:31 This is probably the scariest part: they'd literally blame the single person who programmed it instead of the people who made the really bad decision to have only one person do it.

  • @IvanKravarscan
    @IvanKravarscan 8 months ago

    So many things went wrong in the hospital in the story, so much negligence at every corner. If it had not been faulty radiation code, it would have been mold or a collapsing ceiling.

  • @roberthoople
    @roberthoople 4 months ago

    "Software Doesn't Fail."
    Microsoft: "Hold my Kombucha..."

  • @skrypets
    @skrypets 8 months ago +2

    Now we know where "Works on My Machine" comes from....so sad

  • @Altrue
    @Altrue 8 months ago

    Before "It works on my machine" we apparently had "It works on my patient"

  • @RogerValor
    @RogerValor 6 months ago +1

    I think in the 80s, the difference was not "software never fails", but rather "the only guy who can solve this on our hardware writes the code in the basement".
    Tests were also, in my opinion, rarely part of general software development until the 90s.
    I am amazed it had "tasks".