researchers find an unfixable bug in EVERY ARM cpu
- Added on 9 July 2024
- ARM is a great computer architecture with some great security features. In this video we talk about TikTag, a new attack that shows how one can use speculative execution to see the future.
arxiv.org/pdf/2406.08719
🏫 COURSES 🏫 Learn to code in C at lowlevel.academy
🛒 GREAT BOOKS FOR THE LOWEST LEVEL🛒
Blue Fox: Arm Assembly Internals and Reverse Engineering: amzn.to/4394t87
Practical Reverse Engineering: x86, x64, ARM, Windows Kernel, Reversing Tools, and Obfuscation : amzn.to/3C1z4sk
Practical Malware Analysis: The Hands-On Guide to Dissecting Malicious Software : amzn.to/3C1daFy
The Ghidra Book: The Definitive Guide: amzn.to/3WC2Vkg
🔥 SOCIALS 🔥
Come hang out at lowlevel.tv - Science & Technology
haha wow that lowlevel.academy guy seemed pretty cool huh?
Whos that?
Never heard of that guy... Does anyone know that guy?
Yeah, I like his hair
😮 Very tempted by this assembly course. I’ve done a bit of assembly in some really low-level optimisation work (comparing what different Rust functions compile to), very nice very cool
My Bitdefender gives a warning on that website.
Modern day computing is too unsafe, let's all go be Amish.
lmfao yea
when i retire i'm building chairs in a log cabin
@@WarDucc Amish computing is too unsafe, let's go back to stone tablets 😅
@@LowLevelLearning i will be reinventing the wheel see you when you retire!
You are confusing the Amish with Luddites.
Every time I hear the phrase 'speculative execution', I am reminded of what a late friend of mine used to say: "CPU designs should never incorporate speculative execution or branch prediction. They will inevitably lead to security vulnerabilities." He was also a big fan of the ARM architecture, because it did not use to do this thing. He passed away about fifteen years ago, but as it turns out he was right...
Only in architectures where it was added long after the instruction set was finalized. The problem is not that CPUs have speculative execution, but that the 8080 they're based on didn't.
the problem is that speculative execution / branch prediction brings huge performance benefits, there is a reason why we have it and still use it
@@darrennew8211 Not true. The ARM ISA is not based on the 8080 architecture and now also seems to suffer from it.
My friend was very adamant about this at the time, that this would not be restricted to architectures that weren't built around it.
@@juhotuho10 That is the counterargument that I put before him all those years ago and I was treated to a lecture about why the benefits could never outweigh the costs and why especially in multiprocessor/multicore systems this would lead to all kinds of security vulnerabilities. And he pointed out exactly the kind of security vulnerabilities that were discovered in the past decade or so.
@@juhotuho10 It brings huge performance benefits if your architecture is such that it pretends to execute one instruction at a time in order. You don't need it if your instruction set is designed from the ground up to keep every computational unit busy all the time. You need it because you execute one load instruction then one add instruction and then one multiply instruction then one store instruction and expect the CPU to behave like it's not doing all that in parallel.
people that figure this stuff out are so amazing. like I understand it, after you explain it, and am like "yep I get it," but I could never actually figure it out beforehand or even consider that it exists.
@@c.ladimore1237 I’m not claiming that it is easy by any means, but these people spend everyday searching for bugs like these. Surely, at some point, they develop some kind of intuition.
That's also part of the skill of the presenter. A good presenter can easily make you feel like you know more than you do.
@@c.ladimore1237 I don’t professionally find exploits, but I have found unique ways of using things in unintended ways.
My understanding is exploits like this are either people looking at how things work and being like “wait, that means theoretically it will do this thing too” or people being like “I wonder if it will also do this thing too” and trying it.
So to me, it seems more akin to educated experimentation with the scientific method, while software development (although there is experimentation) is more akin to writing a book.
Because it was a team of hundreds of people working on it
If you know how a CPU works at the low level, I guess you can think up these things?
"There are 2 hard problems in computer science: cache invalidation, naming things, and off-by-1 errors." (Leon Bambrick)
Let me add two other hard problems. Memory allocation and bounds checking, hunter2
What a quote lmao
Don't forget cache invalidation
@@BobFlats7 cache invalidation is 0th in the list!
Funny, but naming things isn't hard at all.
Weeks ago UEFI, now ARM. Last year I joked about hardware backdoors this year
STOP JOKING! :D
THANKS FOR JINXING IT XD
Please stop helping...
Except neither were backdoors. In the first case it's just a standard buffer overflow bug, except because you're running directly in ring -3 there's no ASLR to save you. The ARM bug is actually a feature that speeds up the CPU, which is good, but accidentally was implemented wrong. The difference is that buffer overflows can be patched by a software update (if you haven't downloaded the UEFI security update please do so right now), but a bug in the CPU itself means you need a new CPU.
You are the guy that says "q***t day" in the office/chat aren't you
My God. I guess time to check off "security vulnerability found in something you worked on" off my bucket list.
I was an intern at Arm, on the team that worked on MTE. I did some work around the generation of the tags, and on simulating the overhead they would have in caches and memory.
I have such mixed feelings right now. :D
This seems like something we could have thought of. Meltdown and Spectre were fresh on our minds and a major topic of discussion in the company. I can imagine an alternate universe where I told my manager (or someone else on the team) "hey, have we thought about if tag mismatches could be a cache side channel?" Yet I don't think we ever discussed anything related to this? At least not in any of the meetings I was in.
But hindsight is 20/20. In retrospect, these things always seem obvious.
We were mostly focused on minimizing the performance overhead of memory tagging, because we were worried it would get in the way of adoption. We wanted our new optional security features to be supported by hardware manufacturers, who might not be happy if there was too much perf or memory overhead, extra hardware complexity, or cost / die area increase.
Though, I guess, despite this new vulnerability, it still delivers on its goals. MTE was supposed to be something that offers substantial security improvements for cheap. A "better than nothing" optional feature which, when enabled, has a good chance of catching some bugs that might not be found otherwise. It is probabilistic: even if it worked perfectly, there is still a small chance a memory bug might go undetected by it (if different allocations happen to be assigned the same tag by chance). It was not meant to be perfect, or any sort of bulletproof defense. Just a way to hopefully catch more bugs in the wild. If a vulnerability makes it less effective, that's still better than every other CPU that does not have something like MTE at all.
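To make the "probabilistic" point concrete, here is a minimal toy model of memory tagging. It assumes 4-bit tags stored in pointer bits 59:56 and one tag per 16-byte granule, and it assumes tags are drawn uniformly at random (real allocators may exclude some tags or force neighbours to differ). The names `TaggedHeap` and `undetected_overflow_rate` are made up for illustration and are not part of any real API.

```python
import random

TAG_BITS = 4      # MTE tags are 4 bits wide
TAG_SHIFT = 56    # the tag sits in pointer bits 59:56
GRANULE = 16      # one tag covers a 16-byte granule


class TaggedHeap:
    """Toy model of tagged memory; not how real hardware or allocators work."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.granule_tags = {}      # granule index -> tag
        self.next_addr = 0x1000     # bump allocator, granule aligned

    def malloc(self, size):
        # Assumption: the tag is chosen uniformly at random.
        tag = self.rng.randrange(1 << TAG_BITS)
        addr = self.next_addr
        granules = (size + GRANULE - 1) // GRANULE
        for g in range(granules):
            self.granule_tags[addr // GRANULE + g] = tag
        self.next_addr += granules * GRANULE
        return (tag << TAG_SHIFT) | addr   # tagged pointer

    def check(self, ptr):
        # An access is allowed only if the pointer tag matches the memory tag.
        tag = (ptr >> TAG_SHIFT) & ((1 << TAG_BITS) - 1)
        addr = ptr & ((1 << TAG_SHIFT) - 1)
        return self.granule_tags.get(addr // GRANULE) == tag


def undetected_overflow_rate(trials, seed=0):
    """Fraction of one-granule overflows that land on a neighbour with the same tag."""
    heap = TaggedHeap(seed)
    missed = 0
    for _ in range(trials):
        a = heap.malloc(GRANULE)
        heap.malloc(GRANULE)             # the neighbouring allocation
        if heap.check(a + GRANULE):      # overflow from a into its neighbour
            missed += 1
    return missed / trials
```

Under these assumptions, `undetected_overflow_rate(20000)` should land near 1/16, which is the "small chance a memory bug might go undetected" mentioned above.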
It has its value as a hardware address sanitizer. I used it on C code within an Android App on the Google Pixel 8, which supports MTE, and it helped to figure out and fix a hidden memory management bug (a use after free).
@@olafschluter706 Yep. "Hardware ASAN" is pretty much how we thought of it when designing it. The motivation for MTE was "imagine ASAN but with low enough overhead that you could deploy it in release/production builds and just enable it everywhere, and hopefully also catch bugs in the wild instead of just during development."
@@inodedentry8887 yeah. Arm have said that the tags aren't secret. The video is somewhat misleading. Not all Arm CPUs have MTE, and it isn't used much, it seems
That’s good to hear. It’s a somewhat obvious exploit in the context of meltdown and spectre so the question of potential value is a business decision (as you reference) and not an engineering one. And I assume intern means you were young and less experienced so you certainly aren’t at fault.
Is it better than nothing at all? That’s the hard question.
@@HayesHaugen I think if it helps to catch memory management bugs, it helps to reduce the attack surface and the number of possible exploits of software checked by it.
I am a (retired) professional programmer. I never wanted my programs to run as fast as possible. I wanted them to run as reliably as possible, i.e. rock-solid reliably. I have seen countless examples of programmers being led astray by the siren song of premature optimization.
It depends. ARM processors are often used in embedded devices with few resources and hard real-time requirements, and programs that are not as efficient as possible may not be appropriate.
OK, interesting, but this is a way to defeat a secondary defence. The program still has to contain an exploitable memory corruption in the first place. I think describing it as an unfixable bug is to some extent click-bait.
@@sylviaelse5086 I agree. It's also not close to EVERY ARM CPU. Only newer Cortex-A CPUs, no M devices at all. Seems like a bad bug, but color me underwhelmed after that title.
Given how many "unfixable bugs" have been found and, voilà, fixed in one way or another, yeah, clickbait.
Clickbait doesn't win subscriptions, it wins unsubscriptions.
from what i understand, you need to achieve arbitrary code execution to achieve arbitrary code execution. it is a little silly.
@@nocakewalk the M chips already have their own vulnerability lmao, they don't need this one
@@not_kode_kun which vulnerability?
Every time I hear Speculative Execution, it is about a security vulnerability
i mean, when else are most people profoundly affected by low-level cpu optimizations
@@rccli Row hammer. A brute-force trick that we have had a hard time dealing with for as long as DRAM has existed.
Yep it's a complex system for a complex problem. It's been around for decades, but it's still not perfect.
If we were to completely disable it now, we'd see processor speeds jump back 2 generations across the board. (very rough guesstimate)
So yeah... fun world we live in, huh?
What's crazy to me is that these CPU optimizations have basically existed since the 1990s. When I heard about the first speculative execution vulnerability, it immediately reminded me of a presentation I gave as an undergrad student at the end of the '90s: 'RISC Processors - Pipelining' ... and all those optimizations like speculative execution and branch prediction were part of my talk. But back then it would never have crossed my mind to look at that from a security perspective. All you thought about and talked about was how it improved performance. So the Meltdown and Spectre vulnerabilities were found decades after those optimizations were introduced in the first processors.
So basically you can say all those vulnerabilities are out there because these optimization techniques have been developed and have gotten more and more sophisticated over several decades of processor development, starting with RISC processors in the nineties. But the awareness to look at things like that as a possible vulnerability and attack point was non-existent ... I'd say as cyber security research progressed and looked at similar mechanisms in software and elsewhere, the researchers suddenly became aware that there is also this huge problem with CPUs, turning all these awesome optimizations into security vulnerabilities, and only then did everyone start looking into it, after decades of not thinking of that at all.
IA64 had a ton of problems, but I really believe that explicit speculation was a great idea. So many of these attacks would be impossible on Itanium. (Insert joke about them not being attacked because no one used them)
What is explicit execution?
@@deusexaethera IA64 puts the work of avoiding problems due to parallel execution in the hands of the compiler. I.e., no mechanism to back out unexplored paths like with speculative execution. The idea was to run the CPU fast and loose, and just force compiler writers to deal with the burden to take advantage of full speed. Problem is, there are lots of languages and compilers, and not everyone wants to incorporate this stuff into code generation, and not everyone is good at it.
so the "feature" was it didn't do anything special?
@@MadsterV more correctly, the CPU didn’t hard-code any of the behaviors: the pathways existed in similar ways to x86, but required explicit control via ultra-wide instructions (VLIW architecture) which meant explicit, multi-instruction parallelism. In some ways, this arguably complicated the CPU as it made instruction parsing many times more complicated; on other archs those features would run mostly on autopilot while the instructions remained easy to parse and prevent collisions/weird behavior.
Hitachi SH5 also had a very nice explicit branch prediction architecture. Unfortunately that went nowhere :/
CPU vulnerabilities usually need relatively low hardware access in order to work.
But when I heard you saying somebody managed to exploit it from within V8 (being a web dev) it literally just hit me - We're f**d.
JS isn't as much of a toy these days. You can easily manipulate raw binary data in JavaScript. Some more tinkering and this would easily escalate to a sandbox escape and really, really low-level code injection... From within a browser...
reject modernity, let's go back to monke! err... I mean DHTML
Tbh v8 0-days are being discovered every week now. It's easy to get RCE without some crazy CPU bug.
@@theairaccumulator7144 Yes, but for good results you'd need to escalate privileges, injecting direct CPU instructions omits that completely.
@@theairaccumulator7144 Yes, but in general, first you have to escape the sandbox, then find a way to execute your code in something like a shell, and then gain admin access.
The paper covered in this video describes how it was done all in one step.
also: does WebAssembly still exist? It is lower level than JS, so it should be easier to predict which wasm instruction compiles to which native machine code, making side-channel attacks even easier and more reliable than using JS.
Great breakdown! Not surprised to see that speculative execution is causing vulnerabilities on more than just x86 - really feels like it was only a matter of time before something like this was uncovered. The way it was done, though, is absolutely wild.
Let's wait for dozens of fixes that will decrease performance compared to leaving the feature off. No lessons learned whatsoever.
@@alexturnbackthearmy1907 Not doing speculative execution isn't really an option though...
That would cause a FULL pipeline stall after every branch. And not doing prefetching is even worse.
Complex problems require complex solutions and those oversights are sadly the cost of that.
We can only hope that most things are found and fixed before they can turn into widespread exploits in the wild or hope for memory to suddenly get 1000x faster without any other downsides.
@@Momi_V Eh, if things were actually done the right way, we wouldn't be having this conversation at all. At least there is hope that they don't sweep it under the rug (just like the "superior" Windows ARM hardware, which isn't really).
@@alexturnbackthearmy1907 modern CPUs without any branch prediction won't stand a chance in terms of performance against ones that have all mitigations enabled, even the non-applicable ones
I did not expect to find a MY here 😂😂
Access to leaked tags doesn't ensure exploitation. It simply means that an attacker capable of exploiting a particular memory bug on an affected device wouldn't be thwarted by MTE.
But since this re-opens the door for buffer overflows, which after all is the most commonly found attack vector, we're basically back to square one. If someone finds an exploitable buffer overflow bug in the V8 sandbox, then you're looking at unprivileged code execution, which can be problematic enough. If someone finds one in both V8 and a kernel call then you have complete device pwnage. This smells a lot like how the PS3 was pwned.
@@andersjjensen or uglier, crash-o-matic, one runs into race conditions if the software didn't return a clean abort.
Still, code should be able to work around, like all of the other "unfixable bugs" over the years.
I am Pentium of Borg, you will be approximated.
The door was never "shut" to buffer overflows by MTE; it's a second line of defence, and to breach it you still need a memory vulnerability in a target program (which MTE in this specific case will never catch anyway, it's not designed to be perfect), and an incredibly niche one at that for this exploit. Problems like this can be better prevented when we move towards safer languages for userspace like Rust and the lot.
As is usual with security, you can't rely on any one countermeasure, you need defense in depth.
My jaw dropped when you said it works inside the V8 sandbox. Bless the researchers for finding this.
I think Spectre and Meltdown also worked in JS, in the browser. The speculation engine will see any code that runs on the CPU.....
Misleading title, there are ARM "chips" that do not have these extensions, and a lot of them do not even support virtual memory
You have in my opinion some of the best content ever hosted on CZcams. If this existed in 2004 my early programmer self would have had a much easier time learning how to exploit for fun ;).
V8 engine screams to me : "you can do this on your phone right now"
"EVERY ARM cpu" article shows that it was introduced in arm v8.5
And everyone talks about Cortex A and forgets that Cortex R and Cortex M realtime and microcontrollers are massively different.
spec. execution is not only about filling up the cache to be ready, it can actually execute part of the code in different execution units but later either keep or discard the results depending on the path taken
Tf is your pfp
Exactly. See Lex Fridman's first podcast with Jim Keller for a really good explanation of how modern processors work in this way.
Somewhere I read and/or saw John Hennessy and David Patterson. They discussed the limitations of current processor designs, emphasizing that security vulnerabilities like Spectre and Meltdown, as well as diminishing performance returns, stem from reliance on techniques such as speculative execution. They propose a shift towards domain-specific architectures (DSAs) and processors capable of executing high-level language constructs directly. This approach would enhance security, performance, and energy efficiency by reducing the need for complex compiler translations and leveraging the open-source ecosystem for rapid innovation. But then legacy support as we have it now digging back to the 70s would be hard to maintain .. ;)
If you can run arbitrary tik tag code on the cpu, you don't need to break the memory tagging, just run whatever arbitrary code you want on the cpu.
Half true, this can be used for privilege escalation.
This reminds me of PAC introduced in iOS 14 that made jailbreaking very difficult. Eventually a couple Chinese researchers found a way to sign the pointers themselves to bypass it, but I still was fascinated enough by it that I did a college presentation on it in my computer architecture class.
The way you explain in these videos even a golden retriever can grok these topics. No pun intended
OMG, it's amazing! When you said they did it in V8 it was... OMG, incredible! How many layers of security they managed to bypass!
the "hats off" right after talking about a hair cut was accidentally brilliant 😂
I suspect we're heading towards a fundamentally unpatchable, ubiquitous and catastrophically effective exploit that forces us to fundamentally re-think chip design.
With software moving faster than hardware this has always been inevitable, but it's still crazy to think this is probably coming in my lifetime.
Even crazier to think that the chip that's supposed to solve all these problems may end up being the Mark of the Beast described in the Bible
This just defeats a defense in depth measure. The computer is still secure.
The answer is rust. Rust all the way down.
@@mfaizsyahmi If an r0 exploit can for example manipulate any memory, nothing running on that system is secure, at any level. Not rust, not other drivers, literally every computer state can be manipulated - the entire stack even the bios.
@@74Gee A vulnerability is not automatically an exploit. If your computer only ran rust programs compiled with a trusted compiler, the chance of an r0 vulnerability leading to an exploit would be drastically reduced. Similarly, if I had a fully secure interpreter I could run untrusted interpreted programs on a CPU architecture without any hardware/firmware security features at all and still be secure.
Ergo any hardware vulnerability can theoretically be patched in software, with a certain performance penalty. In practice, any sufficiently severe exploit could take down the internet causing untold damage.
Thank you for your vids. Any update on that php vulnerability? Couldn't find further info on the details of it, beyond being related to language/encoding.
@@kiverismusic iconv chinese extended character bug, the fix is with a glibc update
The first sponsorship I’ll click and use in my life 😆 thanks for your awesome content! 💪
Damn this is such a good video, thanks for explanation. I have only recently started learning stuff abt comp architecture and security and this video is still explaining the paper in the most crystal clear way possible that even I understood it.
0:09 You know that there's three computers in the term "ARM computer"?
First, the obvious "computer". Second, "ARM" stands for "ACORN RISC Machine", "Machine" referring to a computer. Third, "RISC" stands for "Reduced Instruction Set Computer", revealing the third computer.
Almost blew my mind when I first realized that XD
@@Lampe2020 so spell it out, Acorn Reduced Instruction Set Computer Machine Computer 😂
@@nicholasvinen
Exactly.
That brings to mind the people who say things like, "ATM machine" and "PIN number".
Arm no longer stands for anything.
It stopped standing for Acorn and moved to Advanced RISC Machine in the mid 90s. And in 2017 moved from ARM to Arm.
(Source: I'm an employee.)
@@m1geo Your message explains a lot.
TMA = Too Many Acronyms
Jeez. What's up with all of those serious recent exploits?
honestly this is common, i'm just making more people aware of it. bugs are everywhere
Probably recency bias. Exploits come out all the time, but due to the big ones early this year people are on edge and more of them go mainstream.
@@LowLevelLearning all these code issues is why I'm waiting for the day computers program computers. Humans arguably suck at it, as we've seen.
@@IncertusetNescio this kid really thinks AI is going to take over😂😂😂
@@IncertusetNescio I don't think that's happening anytime soon. AI is trained off human data, and thus makes just as many errors as the average human, if not more
Found Ed thru John Hammond, but since John doesn't seem to do vids that aren't just straight ads anymore, I'm excited this is still here to learn from. Thank you, sir!
Yea John hasn't been a reliable source of info in years, bros sold for real.
I find it amazing that people can speak about such advanced subjects while I am simply trying to fit an excess-127 code for a normal overflow fix in a VHDL DSP FPU unit. My God, where do you find the time to read about these subjects?
Sending my appreciation. Sometimes when searching for work you have a not so wonderful interview for various reasons including just forgetting a term you couldn't recall in a moment. Sometimes a few can affect your mental health especially if not handled with understanding that it has nothing to do with your worth. I had known and worked with assembly. I had known and worked with memory, pointers, understanding buffer overflows, operating systems, and so on building up to a good, extensive software engineering mastery, ethics, and leadership. All of the concepts you mentioned as part of my education. I felt so let down as it seemed no one cared that I knew this stuff and it made me question if I should have specialized in a different path (CE, CS, EE even, physics, etc) when feeling like things weren't working out. I was lifted up as I could follow everything you noted and that I was able to see how worthwhile my time and degree were at my university. I just mean to say I appreciated so much having a reminder when you feel a job struggle to see that you have value and no one can take that away, including in this small way like having an education even if no one is acknowledging it yet. 🙌🏾
The pacman vulnerability has existed for a few years, the big take away from this paper is that they found a pattern to exploit it in other code.
@LowLevelLearning
Just because I'm not sure if I've understood everything correctly.
This memory tagging is just an additional security mechanism in ARM processors and not the only one?
So this design flaw doesn't make ARM processors less secure than other processor architectures, it just makes them less secure than intended. Correct?
Or do ARM processors lack other security mechanisms that other architectures have?
Remember Pointer is the variable holding the address not the address itself, Dope content, massive respect …
2024 - The year of the backdoor and the vulnerability
hold your popcorn... AI is coming hard
Spectre and meltdown did not break the internet.
Thanks for the video and book suggestions 👍
Love that they’re called gadgets, like in hardness proofs
Spectre broke literally nothing. It was a hype wave that lingered for a couple weeks and went away. Nothing ever was heard about any hacks exploiting it after. I expect the same is going to happen to this bug too.
This is fundamentally similar to a hash collision exploit, so the solution is the same. Increase the entropy on the memory tags so that the reuse is practically impossible.
I think calling speculative execution "execution in the future" is misleading, as it conveys the idea of a "front-running thread", which is a very distinct and different thing.
The processor simply runs a program and if it needs to make a branch/turn and does not know which way to go, it speculates.
To keep a proper program state, this speculative execution cannot do certain things, but once the speculation is confirmed to be correct, the accumulated speculated results can be committed.
From the perspective of the processor running the program, it's just executing current code, only on a speculated branch.
There is of course a lagging program-state that represents the validated non-speculative outcomes.
It can restart from this state when the speculated code turned out to be the wrong code and resume with the correct code instead.
A processor is thus not "executing future code".
It might run the wrong code and discard the results, but it's not running ahead of the actual program.
That is a lot less mystic and magical to me.
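The commit-or-discard behaviour described above can be sketched as a toy model. Everything here is hypothetical (`ToyCore`, `run_branch`, the `cache` set); the cache is added as an assumption to show the one thing this toy model does not roll back, which is where side channels come from.

```python
class ToyCore:
    """Toy model of speculate, then commit or squash. Not a real pipeline."""

    def __init__(self):
        self.committed = {"x": 0}   # validated, non-speculative program state
        self.cache = set()          # microarchitectural state, never rolled back here

    def run_branch(self, predicted_taken, actually_taken, body):
        # Speculate on a scratch copy so the committed state stays clean.
        spec = dict(self.committed)
        if predicted_taken:
            body(spec, self.cache)

        if predicted_taken == actually_taken:
            self.committed = spec          # speculation confirmed: commit
            return "committed"

        # Misprediction: drop the scratch copy and resume from committed state.
        if actually_taken:
            body(self.committed, self.cache)
        return "squashed"


def body(regs, cache):
    """The code behind the branch: one register update and one memory touch."""
    regs["x"] += 1
    cache.add("line_A")


core = ToyCore()
outcome = core.run_branch(predicted_taken=True, actually_taken=False, body=body)
# The register update was discarded, but the cache line touched during the
# wrong-path execution is still present in this model.
```

In this sketch the processor never runs "ahead" of the program; it runs the guessed path and either keeps or throws away the register results.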
I just assume that all computers are inherently insecure and act accordingly
Great video and information!
Will this require a hardware level redesign or can it be fixed with compiler patches?
Who would've thought that doing insane things just so you wouldn't have to admit to yourself that Moore's Law has been dead for a lot longer than people imagine would've caused so many security issues?
Seriously underrated comment.
nice sponsor, heard good things about that dude
Because of you I am more interested in assembly language and CPU architecture
Thank you, very distinctive explanation ! Keep up ! Good luck ! I have some different CPU boards (AllWinners family) but luckily they are v6 and v7.
Pretty awesome find by the team
It's a classic side-channel attack, more exactly a timing attack. It's pretty well-known in cryptography. Nice work, in a way. That's hardly a bug, but I suppose the title is more catchy.
Seriously, the people behind that paper need to be praised as heroes.
Kinda neat explanation of virtual memory, wish I had it when I wrote a driver for the Armv8 MMU. Also, not the speculative execution exploit again
That is a super cool exploit.
if we know buffer overflows in certain areas are prone to attack, can we just monitor those buffers for hack attempts?
Assembly code since the 70s here .. and yes, we're still longhaired and play music .. approaching 62 :)
It's been a bad few months for security vulnerabilities
The burning question I have after all this is... are the implementations of speculative execution flawed or is speculative execution itself flawed?
It seems like this is very similar to PACMAN, except that paper breaks the pointer authentication code instead of the memory tag. Both take the approach of brute-forcing a small secret by abusing speculation (a 16-bit PAC there, a 4-bit tag here).
The mere mention of "speculative" and "prediction" already makes my neck hair stand up...
This is why I use an abacus. Granted, AR/VR apps are tricky, but no viruses!
The JavaScript V8 engine uses a technique called NaN Boxing and Pointer Tagging which attaches the variable type inside the pointer address
So this is brute-forcing the tag by exploiting the CPU's speculation about the outcome of code that is about to run? Brilliant!
Can this be fixed with the next gen of cpus or will arm be always vulnerable to this?
I remember reading that from Aleph1 back in the day 😯seeing that paper just took me way back!
I'm pretty sure tagging is not the only memory protection mechanism, but rather additional one.
Is the browser sandbox additional around the JS sandbox or is it the same and people just call it differently?
Love this guy. Incredibly smart, incredibly articulate. Really impressed. An inspiration to us all.
2:32 I wonder if Vanguard does something similar with a virtual buffer.
What i mean is basically when it does memory scanning to keep a duplicate to always check back if something changed.
It does this on the PCIe slot iirc, in order to detect dma cheating hardware
@@Seppevh Really??? Quite interesting to know, + it makes perfect sense! Thanks!
sick video thanks
Why doesn't speculative execution invalidate and clear the speculatively cached data lines when dismissing the results of erroneously executed code?
I'm guessing it's too complicated and nobody realized this could be an issue.
The only time I ever hear about speculative exec is as a security vulnerability😂
Speaking of which, could you do a video on the *benefits* of spec exec? I’m really curious now lol
In a nutshell, branch prediction and speculative execution exist to prevent the performance hit that would come from stalling the processor until the correct outcome of the branch instruction is known.
Ever since the 486 and Pentium, CPUs have been prefetching instructions from memory and decoding them in anticipation of executing them; the difference being that the 486 would stall its pipeline until it knew which way a branch would go. The Pentium was faster in part because it would predict which way a branch would go and continue fetching and decoding (but not executing) instructions along that path. It was also able to execute instructions up to the jump point, as long as all the inputs were known (out-of-order execution). Speculative execution takes this mechanism further by out-of-order executing the instructions ahead of the branch, placing the results into temporary registers; committing them to real registers (and saving execution time) if the branch was predicted correctly.
Out-of-order execution on the Pentium was interesting, because well-optimized assembly code could actually arrange to have the inputs to a jump instruction available just as the CPU was ready to execute the jump; simply by changing the order of seemingly unrelated instructions.
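For anyone curious what "predicting which way a branch will go" can look like, here is a minimal sketch of a two-bit saturating counter, a common textbook scheme. It is not claimed to be what the Pentium or any particular CPU implements; `TwoBitPredictor` and `count_mispredictions` are made-up names.

```python
class TwoBitPredictor:
    """Textbook two-bit saturating counter for a single branch."""

    def __init__(self):
        # 0 and 1 predict "not taken", 2 and 3 predict "taken".
        self.counter = 2   # start at "weakly taken" (an arbitrary choice)

    def predict(self):
        return self.counter >= 2

    def update(self, taken):
        # Move one step towards the observed outcome, saturating at 0 and 3.
        if taken:
            self.counter = min(3, self.counter + 1)
        else:
            self.counter = max(0, self.counter - 1)


def count_mispredictions(outcomes):
    """Run the predictor over a list of branch outcomes (True means taken)."""
    predictor = TwoBitPredictor()
    misses = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            misses += 1
        predictor.update(taken)
    return misses


# A loop branch: taken nine times, then falls through once, repeated three times.
loop_branch = ([True] * 9 + [False]) * 3
loop_misses = count_mispredictions(loop_branch)
```

With this starting state the loop branch is mispredicted only on each loop exit, while a strictly alternating branch is mispredicted half the time, which is why real predictors keep more history than this.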
I would like to know more about how they determine if the cache was filled or not
It may do wonders for performance and optimisation, but nondeterministic processing is abysmal in terms of security. Cache management, branch prediction, and speculative execution, what an unholy trinity.
Despite your wonderful presentation, why does the initial lower case in the title bother me so much? Thanks for the content
There's a lot of 'IF's in there. If you can find the right code, if you can find the tag, if you can change it, if... if... if...
Whilst this is a possible route for an attack, has anyone actually used this in the real world, not just in the research lab?
@@kevintedder4202 if anyone did, it would probably be state level threat actors. These are the kind of zero days that sell for tens of millions.
so does this affect apple silicon? could be interesting if i tried to do it on my macbook
Apple: "It's not our Apple Silicon ARM chip, you're using your Macbook wrong"
I would need to read the original paper, but if the behavior of speculative execution can be tweaked to put a fake tag in the cache even when the check fails, wouldn't that fix this bug?
it's another "we speculated, rewound and forgot to invalidate the cache" error. When will CPU designers learn to make cache invalidation the default behavior on a speculation rewind if there was a cache swap during the speculative block?
@@fluffy_tail4365 except they never got cached, and that's how they figure out what the memory tag is: they iterate through the numbers and see which one was in cache, cuz that's the real one. The real exploit here is the side channel memory access.
performance hit from failed speculations would be a dog
The issue here is that no cache fill happens for the speculated code when the tag is wrong, which can be detected later on.
And as the wrongly speculated code generates no error, they can keep trying with new tags until they find the correct one.
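The guessing loop described above can be sketched as a toy model in Python. This is not the gadget from the paper, only a simulation of the idea in this thread: a wrong tag leaves no trace and raises no fault, a right tag leaves a cache line behind. `ToyMachine`, `speculative_gadget` and `leak_tag` are hypothetical names, and the "cache" is just a set. On real hardware the `is_cached` check would be a timing measurement. The one hard fact used is that Arm MTE keeps its 4-bit tag in pointer bits 59:56.

```python
TAG_SHIFT = 56  # Arm MTE stores a 4-bit tag in pointer bits 59:56
TAG_MASK = 0xF


def with_tag(addr, tag):
    """Return addr with its tag bits replaced by tag."""
    return (addr & ~(TAG_MASK << TAG_SHIFT)) | ((tag & TAG_MASK) << TAG_SHIFT)


def tag_of(ptr):
    return (ptr >> TAG_SHIFT) & TAG_MASK


class ToyMachine:
    """Hypothetical stand-in for a CPU: a secret memory tag plus a cache."""

    def __init__(self, secret_tag):
        self._memory_tag = secret_tag
        self.cache = set()
        self.faults = 0  # stays 0: squashed speculation never faults

    def flush(self):
        self.cache.clear()

    def speculative_gadget(self, guess_ptr, probe_line):
        # Models the mispredicted path: results are thrown away, but a
        # matching tag still leaves probe_line in the cache.
        if tag_of(guess_ptr) == self._memory_tag:
            self.cache.add(probe_line)

    def is_cached(self, line):
        # Assumption: on real hardware this is inferred from access latency.
        return line in self.cache


def leak_tag(machine, addr, probe_line=0x1000):
    """Try all 16 tags and return the one that left a cache footprint."""
    for guess in range(16):
        machine.flush()
        machine.speculative_gadget(with_tag(addr, guess), probe_line)
        if machine.is_cached(probe_line):
            return guess
    return None


machine = ToyMachine(secret_tag=0xB)
leaked = leak_tag(machine, 0x7FFF0000)
print(hex(leaked), machine.faults)
```

With only 16 possible tags and no penalty for a wrong guess, the loop always finishes in at most 16 tries, which is why a silent wrong guess matters so much here.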
For me the real question is how they consistently fool the branch predictor to speculatively execute code for a branch never taken!
Because that is what bypasses the security here.
I would not call this a timing attack, but a branch algorithm attack.
@@TheEVEInspiration It's in the paper. You can see it in the short glimpse you see of the page before he zooms in (around 6:48). It says that they run the code multiple times with correct pointers and *cond_ptr true, to condition the branch predictor. They then make one guess with *cond_ptr false that triggers the speculative execution.
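A minimal sketch of that training trick, assuming a textbook 2-bit saturating counter as the predictor. Real Arm cores use far more elaborate predictors, so this only illustrates the principle, and the class and function names are made up for the example.

```python
class TwoBitPredictor:
    """Textbook 2-bit saturating counter: states 0-1 predict 'not taken',
    states 2-3 predict 'taken'."""

    def __init__(self):
        self.state = 0

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def run(predictor, cond_values):
    """Feed a sequence of real branch outcomes and log (predicted, actual)."""
    log = []
    for cond in cond_values:
        log.append((predictor.predict(), cond))
        predictor.update(cond)
    return log


# Train with cond true a few times, then flip it once.
history = run(TwoBitPredictor(), [True, True, True, True, False])
predicted, actual = history[-1]
print(predicted, actual)
```

On the last run the predictor still says "taken" while the real condition is false, so the body of the branch runs speculatively exactly once before being thrown away.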
@@HerrNilssonOmJagFarBe Interesting, that is just changing data out after a few tries, so simple.
Someone correct me if I'm wrong, but it sounds like really only the Cortex-A series is affected and the Cortex-M series is not. We make MCUs with the M22, M33, and M85, and these don't have the same memory instructions as the A series.
The M4 had zero cache, but from the M7 onward they put L1D and L1I cache in there and implemented branch prediction units. A couple of years ago there was talk in the ARM dev scene that it was "theoretically" possible that side channel attacks *may* be possible during a DMA operation, but I think worrying about these kinds of theoretical rather than practical issues is just a waste of time.
The target hardware controlled by an MCU makes it especially difficult, as they're far more locked down by definition and not designed to be running random, arbitrary compiled code off the internet. I guess if someone is crazy enough to JTAG their way into a system then anything is theoretically possible, but again... the probability is vanishingly low.
Not *EVERY* ARM cpu! I moved into developing 32 bit asm on the ARM2 and even had a go at an original ARM1 BBC Micro cheese wedge which never was really a product, just a dev system.
I can categorically say that this exploit will not work on either of those CPUs as they had exactly zero kilobytes of cache :) With 4k cache on the ARM3, and a 24/26 bit address bus and processor status stuffed into the remaining 6/8 of 32 bits... I still think you'd find it impossible.
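For anyone who never saw the 26-bit layout: on those early chips R15 held the program counter and the status bits together in one 32-bit word. A small Python sketch of decoding it is below. The field positions are the documented 26-bit ARM layout (flags N, Z, C, V, I, F in bits 31:26, PC in bits 25:2, mode in bits 1:0), while the function name and the sample value are made up for the example.

```python
# 26-bit ARM R15 layout: flags in bits 31:26, PC in bits 25:2, mode in bits 1:0
FLAG_BITS = (("N", 31), ("Z", 30), ("C", 29), ("V", 28), ("I", 27), ("F", 26))
PC_MASK = 0x03FFFFFC
MODE_MASK = 0x3


def decode_r15(r15):
    """Split a combined PC/PSR word into its fields."""
    return {
        "pc": r15 & PC_MASK,
        "mode": r15 & MODE_MASK,
        "flags": {name: bool((r15 >> bit) & 1) for name, bit in FLAG_BITS},
    }


# Hypothetical sample value: Z, C, I, F set, PC = 0x8000, mode 3
fields = decode_r15(0x6C008003)
print(hex(fields["pc"]), fields["mode"], fields["flags"])
```

There are no spare bits left anywhere in that word, so there is nowhere a tag could go.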
Very interesting, but when I google this I'm seeing no news on it. That's strange, no? Is the sector sleeping on this issue or dismissing it?
Are RDIMM-based servers using encrypted and scrambled data also vulnerable?
Sometimes I imagine the biggest security flaw ever, one that will wreck almost every computer and grind the world to a halt for a decade while companies bootstrap back up to the kinds of machines capable of making more computers, since those were affected too. I imagine that this security flaw is being implemented around now, by some guy in an office making a small arbitrary decision in some new architecture that nobody thinks to question and that eventually makes its way into the industry standard, only for the flaw to be discovered decades from now.
So what's left? RISC-V will be bug free now?
This is crazy smart
Pro tip: show hex values (like pointers with embedded info for tags or virtual memory) in a monospaced font. Programmers can visually parse the fields much more easily. Thanks.
Hi @LowLevelLearning I just took your course from Low-level academy... Would be great if u can add a detailed OS course to that... Also add more content for ARM and C
Broken arm
Speculative execution was always Pandora's box. This is quite clear after Spectre and Meltdown. It's damn hard for chip designers and ISA designers to get it 100% correct.
Even if they do get it 100% correct, it's still going to be vulnerable to a cache timing side channel attack.
@@BrendonGreenNZL Yes, true. My statement above is not precise enough. Spectre lives off the behavior of the cache itself in combination with speculative execution and branch prediction.
Interesting. I wonder if this is just ARMv8 or older variants too? And if it could be used as part of something like an iPhone jailbreak, as those run ARMv8. Really fascinating - hacking the future to pwn the present!
awesome content
It seems similar to the architecture bug that was discovered on Intel CPUs (more than one generation) some 5(?) years ago,
where memory preloaded into the cache could be read without encryption or protection from the operating system.
Every smartphone be quaking
Could this be exploited via WebAssembly?
I don’t know if WASM code runs inside the V8 sandbox, but considering the JS integration, I think it might.
The pointers not being entirely uppercase past the 0x prefix really bothers me for some reason... Nice video though.