Zebras All the Way Down - Bryan Cantrill, Uptime 2017
- Added on 23. 07. 2024
- Zebras all the way down: The engineering challenges of the data path
Presented by Bryan Cantrill
Much attention is rightfully devoted to the development and deployment of stateless services, but these services are not themselves devoid of persistent state; rather, they rely on other services to manage this state for them. This data path, however -- that stack of software that is emphatically not stateless, being responsible for distributed and/or persistent state -- is entirely different in its constraints and failure modes. This software takes years or even decades to get right, can be arduous to upgrade, and -- even in a post-cloud era -- lives and dies by the fickle whims of hardware and firmware. This talk will reflect on two decades of building the data path, from the dawn of storage networking through modern cloud storage services.
Presented at Code & Supply's Uptime conference in Pittsburgh, PA. Learn more at www.codeandsupply.co uptime.events - Science & Technology
Love Bryan, never a boring talk! We need more presenter programmers like him!
"We shouldn't use CAP theorem as an excuse to give up on humanity." I love it.
His energetic speech style is truly one of a kind. Loves it.
I think his energetic style has understandable reasons
As a paramedic and a programmer this is hilarious.
I love how Bryan just throws up his sister's lab specimens with not so much as a warning, someone probably clenched their eyes shut fast to that one. Fun talk!
Felt good to rewatch this talk. Very glad to have had Bryan in Pittsburgh.
This is my favorite of his talks
Ignore this comment, I'm just using it as a personal highlighter:
"Restarting a component is the wrong first motion for something that's misbehaving in production."
"The world is too complicated, we need to be very mindful about making it more complicated in the name of availability, because that complexity will cut into the very availability that we are trying to deliver"
"There's only one body of software where you can just drop work when you are under load, and that's the TCP/IP stack; if it's important they'll resend it."
28:40 "I don't know why anybody would run on anything else than ZFS"
Me neither. Been using it to store my data and backups since 2008.
I was able to detect a controller issue, and later a bad connection/cable thanks to the scrubs.
Try watching it at 1.5 speed. Soo much fun.
hah! having edited it, I was watching it at 8x and 16x sometimes. I was able to actually figure out what he was saying sometimes, so there's room for him to go even faster.
And then?.... The singularity
hahahahahahah...... very funny,,,,
I actually found it refreshing that he wasn't taaaaaaaaaalking iiiiiiiiiiinnnnnnnnnnnncccccccccrrrrrrreaaaaddddiiiiibbbbllllly sloooooowwwllly, as is common in other talks.
0.8 nanometers is a quarter of the width of a DNA molecule, or 6 silver atoms
14:05 it doesn't appear on x-rays? 🤔
❤️
abdominal pain is _indeed_ something you need to pay attention to.
if they did laparoscopic exploratory surgery they _barely_ "cut her open" 😁 it's like... a 2-3 cm long incision so the implements can get in?
Heya! The laparoscopic surgery was the appendectomy; as I explained, the Meckel's stones were found in an exploratory laparotomy -- which is a large incision, not a small one. They are not done frequently -- for good reason! ;)
@@bcantrill ah, so laparotomy, not laparoscopic surgery, i see.
can you imagine surgeons operating with the same rigor we write code?
It's not inconceivable to have firmware that can write its state and the values of its variables to a log that is accessible higher up the abstraction stack, but there are two options, each with its own challenge: (1) write to an I/O device directly, which slows the firmware because latency goes up, or (2) write to RAM and, on error, dump the RAM to an I/O device that the software on top can access to analyse, reproduce, and debug the error. The problem is that firmware operates in a very memory-constrained environment: there just isn't enough RAM for option 2, and option 1 is not possible either because the latency is unacceptable.
The only real way to achieve this is by changing the system architecture, as most architectures do not have this kind of support. I don't think firmware vs humans is much of a fight, humans will whoop ass, but realistically it's more like humans vs hardware manufacturing business constraints: make it cheap, make it fast, make it small, etc.
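For illustration, option 2 above (log to RAM, dump to an I/O device only on error) is often sketched as a fixed-size ring buffer, which caps the RAM cost by overwriting the oldest entries. This is a rough Python sketch of the idea, not firmware code and not anything shown in the talk; `TraceRing`, `record`, `dump_on_error`, and the plain list standing in for the I/O device are all hypothetical names chosen for the example.

```python
from collections import deque


class TraceRing:
    """Bounded in-RAM trace log (hypothetical illustration).

    Holds at most `capacity` entries; once full, each new entry
    evicts the oldest one, so memory use stays fixed.
    """

    def __init__(self, capacity, sink):
        self._entries = deque(maxlen=capacity)  # bounded buffer in "RAM"
        self._sink = sink  # stand-in for the I/O device (here: a list)

    def record(self, name, value):
        # Cheap path: only touches RAM, never the I/O device.
        self._entries.append((name, value))

    def dump_on_error(self):
        # Slow path, taken only on error: copy the buffer to the device.
        self._sink.extend(self._entries)
        self._entries.clear()


device = []  # assumption: a list plays the role of the I/O device
ring = TraceRing(capacity=3, sink=device)
for i in range(5):
    ring.record("counter", i)  # entries 0 and 1 get overwritten

ring.dump_on_error()
print(device)
```

The trade-off described in the comment still holds: a small `capacity` keeps the RAM footprint low but means only the last few state changes before the error survive.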
Speaking from experience, it would be way easier if specs and firmware were just openly accessible. Having narrowed down an issue to firmware is one thing, handling it is another. Just look at how many workarounds are in OS kernels such as Linux just because you can't tell; see e.g. the Dell SMM handler workarounds for hwmon.
Go ask a Pure Storage engineer about Tungsten sometime.
"We shouldn't actually use CAP theorem as a reason to give up on humanity" 🤣
❌👁️❌👁️❌
jfc. a great talk. But...stick to the topic.
i had to stop watching this... started to have an anxiety attack from the speaking style of the presenter
d0lvl0 I mean, enthusiasm is appreciated. I'd much rather listen to someone who cares about what they're talking about.
I agree, but this is a little too much for me.
@@LostieTrekieTechie we
@@d0lvl0 you
Interesting, his style for me is one of the only ones that can keep me hooked on