How The RIDL CPU Vulnerability Was Found
- Added on 22 July 2024
- In this video we explore the basic ideas behind CPU vulnerabilities and have a closer look at RIDL.
This video is sponsored by Intel and their Project Circuit Breaker: www.projectcircuitbreaker.com/
How to Benchmark Code Execution Times: www.intel.com/content/dam/www...
Anders Fogh: cyber.wtf/2017/07/28/negative...
Speculose: arxiv.org/abs/1801.04084
RIDL Paper: mdsattacks.com/files/ridl.pdf
Foreshadow PoC: github.com/gregvish/l1tf-poc/...
Sebastian Österlund: osterlund.xyz/
Chapters:
00:00 - Intro & Motivation
00:57 - Concept #1: CPU Caches
01:57 - Measure Cache Access Time with rdtscp
05:00 - Concept #2: Out-of-order Execution
06:11 - CPU Pipelining
07:13 - Out-of-order Execution Example
09:19 - CPU Caching + Out-of-order Execution = Attack Idea!!
10:33 - Negative Result: Reading Kernel Memory From User Mode
13:45 - Pandora's Box
14:23 - Interview with Sebastian Österlund
17:24 - Accidental RIDL Discovery
19:31 - NULL Pointer Bug
21:50 - Investigating Root Cause
23:28 - Conclusion
24:24 - Outro
=[ ❤️ Support ]=
→ per Video: / liveoverflow
→ per Month: / @liveoverflow
=[ 🐕 Social ]=
→ Twitter: / liveoverflow
→ Instagram: / liveoverflow
→ Blog: liveoverflow.com/
→ Subreddit: / liveoverflow
→ Facebook: / liveoverflow
Your comment about the page size isn't quite correct: a modern x86 CPU fetches and writes memory in 64-byte chunks (the cache line size). The 4096-byte page size is the smallest unit of memory that can be mapped from virtual to physical addresses. So basically, as you're watching, replace "page" with "cache line" in most of this video. Page size only becomes relevant later when it comes to memory access controls.
Also, when accessing a virtual address, the processor caches the virtual page number and the corresponding physical frame number in the TLB for faster lookup, which also speeds up data access.
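To make the cache line vs. page distinction from the two comments above concrete, here is a minimal sketch that splits an address into the pieces they mention. The function name `split_address` and the sample address are made up for illustration; only the 64-byte and 4096-byte sizes come from the comments.

```python
CACHE_LINE_SIZE = 64   # bytes per cache line, as stated in the comment above
PAGE_SIZE = 4096       # bytes per page, the smallest mappable unit on x86

# How many cache lines fit into one page.
LINES_PER_PAGE = PAGE_SIZE // CACHE_LINE_SIZE


def split_address(addr: int) -> dict:
    """Break a virtual address into page-level and cache-line-level parts.

    Hypothetical helper for illustration; not taken from the video.
    """
    page_offset = addr % PAGE_SIZE
    return {
        "page_number": addr // PAGE_SIZE,               # the part a TLB entry translates
        "page_offset": page_offset,                     # unchanged by translation
        "line_in_page": page_offset // CACHE_LINE_SIZE, # which 64-byte line of the page
        "byte_in_line": addr % CACHE_LINE_SIZE,         # position inside that line
    }


if __name__ == "__main__":
    # Arbitrary example address, chosen only to show the arithmetic.
    print(split_address(0x7FFD1234))
    print("cache lines per page:", LINES_PER_PAGE)
```

So a single page spans 64 cache lines, and a timing probe that touches one byte pulls in only the 64-byte line around it, not the whole page.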
@TheAirr13 And to ensure that no process can access the virtual memory of other processes, each entry in the TLB is tagged with its corresponding process ID, which is what Sebastian was talking about in the video when he wondered whether the tag check can be circumvented.
Also 1’or1’ doesn’t always = 1
These aren't Linux process IDs. The hardware functionality is called PCID. There exist only 4096 different ones. Linux effectively uses only 6 (+ the upper bits for Meltdown mitigation). So the TLB doesn't have to be flushed every time a new thread is scheduled.
And to my knowledge the check cannot be circumvented. I did some experiments and wasn't successful.
@DNA I am not a Linux kernel engineer, but to my understanding you are right. PCID is needed to avoid a TLB flush every time a process switches between kernel and userspace. Each process has a part that runs in userspace and a part that runs in kernelspace, and PCID speeds up the switch. I read about that a few months ago but have not fully understood it yet. But from within the kernel you can leak everything.
I love how a negative result was so pivotal.
I discovered this channel 5 years ago, thanks to the reverse engineering playlist. I started CS at uni 3 years ago, inspired by this channel. Some months ago I started writing my thesis on the formalization of relaxed memory models and their speculative behaviour, and today this video is uploaded. What a journey, Live :)
Congrats and good luck with the thesis, man. I know how you feel: thanks to this channel I finally decided which master's I want to do, and I also got inspired to take CS.
@naxneedssomeprivacy You can find a playlist right in this channel.
I really respect Intel for not only taking silicon vulnerabilities so seriously, not only starting a bug bounty program, but sponsoring people to promote it by analyzing existing bugs. This is dedication, and I really hope we see more companies treat security in this way. I've seen more and more companies start bug bounty programs recently, and it's definitely a move in the right direction.
The reason they do this is to make sure no one else finds their backdoor like they did on the Celeron.
@MikaelIsaksson that... Doesn't make any sense
@DanKaschel Sure it does. Now they can have a bunch of really smart people trying to find it. If they don't, great: now we can feel a bit more sure it won't be found in the wild. If they do, oops, a "vulnerability", better fix it. To be clear, it's really hypocritical of them to care about hardware vulnerabilities when they have put them in on purpose in the past. In case you didn't know, they crammed a small operating system into the CPU that could be accessed from user level by calling secret opcodes, elevating the following commands to above ring 0. Basically a hardware trojan.
@MikaelIsaksson
Did they become self-conscious and stop doing that in newer generations of CPUs, or do they put more effort into hiding the HW trojans better?
While I might be a bit biased, I really have to say that this video turned out extremely nice! Great job explaining this in a very easy to follow way!
I never had someone explain branch prediction so well to me. Thank lord.
🙂 yes
This video doesn't really talk about branch prediction, but rather only speculative execution.
Branch prediction is only concerned with conditional jumps like JNZ (jump if not zero). It is a function in the CPU looking for patterns in whether a certain conditional jump is taken or not and tells the CPU which branch to load into the pipeline (for older CPUs, before speculative execution) or which branch to speculatively execute (for modern CPUs). Note that some CPUs may speculatively execute both branches (jump taken as well as not taken), the branch predictor would merely tell the CPU which branch to prefer when neither branch is stalled (waiting for memory or slow computation result).
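As a rough illustration of "looking for patterns in whether a certain conditional jump is taken or not", here is a minimal sketch of the classic textbook 2-bit saturating counter. This is an assumption chosen for illustration, not a description of what any particular Intel CPU implements, and the names `TwoBitPredictor` and `count_mispredictions` are made up.

```python
class TwoBitPredictor:
    """Textbook 2-bit saturating counter for a single branch.

    States 0 and 1 predict "not taken", states 2 and 3 predict "taken".
    """

    def __init__(self, state: int = 1):
        self.state = state

    def predict(self) -> bool:
        return self.state >= 2

    def update(self, taken: bool) -> None:
        # Move one step towards the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def count_mispredictions(outcomes, predictor=None) -> int:
    """Replay a sequence of branch outcomes and count wrong predictions."""
    if predictor is None:
        predictor = TwoBitPredictor()
    misses = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            misses += 1
        predictor.update(taken)
    return misses


if __name__ == "__main__":
    # A loop branch that is taken 9 times and then falls through once.
    loop_branch = [True] * 9 + [False]
    shared = TwoBitPredictor()
    print("first run :", count_mispredictions(loop_branch, shared))
    print("second run:", count_mispredictions(loop_branch, shared))
```

The second run of the same loop mispredicts less often because the counter still remembers that the branch is usually taken.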
"The forty-twoth page" really gets me.
Forty-second.
Fourty-tooth.
I looked it up.
English is still weird about number names, where the 1st, 2nd, and 3rd numbers in each group of 10 starting at 20 have separate names -- but at least it ain't French or Danish!
21: twenty-first (21st) note ...th
22: twenty-second (22nd) note ...nd
23: twenty-third (23rd) note ...rd
24: twenty-fourth (24th) note ...th
25: twenty-fifth (25th) note...th
etc.
Same for 31, 32, 33, ... 41, 42, 43... etc.
I'm also a German and French speaker so I can relate -- I ALWAYS forget that 81 and 91 in French DOESN'T use the "-et-" before the "un" or "onze" but it does in 21, 31, 41, 51, 61, 71 ("...et-onze") -- but not 81 and 91 as they are "too long" for adding the "-et-".
GAHHHHH!!!!
What great timing for this upload, since I just read about them but didn't know how you would discover something like this.
You actually show out-of-order-execution (; ) vulnerabilities, like meltdown. Speculative execution (foo: xor rax, rax; jnz bar; jmp foo; bar: ) vulnerabilities like spectre are slightly different concepts. The first class is afaik intel-only, the second class is an issue for other modern CPUs of other ISAs too.
@DNA Cortex-A75 and IBM's Power microarchitecture seem to be affected as well… but so are basically all modern (until 2019, I guess) Intel CPUs, so this is basically an Intel issue. The IMHO more useful speculative-execution vulnerability, which can be triggered without a signal handler, therefore cannot be mitigated by the kernel that simply, and can also be done in non-native code like JavaScript, also affects a lot of other CPUs.
@RepublikSivizien Meltdown is far from Intel-only. Btw, the "signal handler" can be avoided with self-modifying code, like changing nops into a jmp right before the transient instructions. I had never heard about this method before, but it also worked.
@PS-bp4ju That is Spectre, not Meltdown. You might have luck with the illegal out-of-order instruction in a thread. It should be possible that an illegal instruction in a child does not kill the parent, but it must be on the same core due to the cache, IIRC.
Awesome video! A difficult topic but very well explained and broken down to smaller pieces!
Super interesting, thanks for sharing and the great editing/research. Love your channel, huge fan!
Waaaaaaaau, I always wanted to understand that issue and you just explained it brilliantly! I salute you, man!
Amazing video! You got me interested in security years ago and I finally ended up at the DEFCON CTF. Might bait me into CPU bugs now...
Amazing video to start digging CPU vulnerabilities!
4:12 I don't know if anybody has said this yet, there are only 243 comments right now, but:
What you pronounced as "forty-two'th" should be "forty-second".
In general: good job. Your work is appreciated.
Always awesome content @liveoverflow!
In my view, every field should have journals of negative results. I had no idea that the history of the speculative execution vulnerabilities was so rich.
I mean, they do. Scientific journals very frequently publish negative results.
Amazing video! I hope we get more content on hardware-type vulnerabilities and “hacking”!
I wrote my Bachelor's thesis about RIDL, it was awesome! 😍 I basically used it to leak the hash of the root password of my professor's PC remotely through ssh. Cool video, thank you!
This is one of the best videos you've posted. Well done!
This was awesome! Been grinding through your binary exploitation playlist. Keep it up🔥
Very interesting topic. I must admit I didn't understand 100% of everything but it definitely gave a nice insight into the topic.
If anyone else wants more videos like this to watch, Christopher Domas' Defcon talks on x86 architecture are extremely fascinating.
The dude probably has the Intel architecture documents as light bedside reading lol. He did write "reductio ad absurdum", which is a program with 13 lines of x64 assembly and is Turing complete.
I did not understand much of the video but still find it interesting.
shoutout to intel for sponsoring this, lol!
amazing video as always
Holy smokes, i was waiting on this one ! Big Thanks.
Love watching your videos man. Amazing detail.
Big props to you and intel for doing this!
42 TOOTH lmao. These things just make my day. Thank you!
this reminded me of Chris Domas on his research on the x86 instruction set. loved his defcon talks
The research was 🤯, think time to start exploring micro architecture
This video is just pure Gold. Thx
Thank you for this high quality content!
Very nice video! I wish I understood 100% of it!
In reality bug bounties are the most cost effective way to handle security related topics, as you find the people who are very vested in the topic spending countless hours that you don't have to pay for. Then just pay for the result.
I am surprised it took them so long to find someone that figured that out O_o
This is fascinating 👏 This is a very great video and in depth explanation. I love your channel 😃 keep it up sir
great video man the fact that intel sponsored the video is crazy haha
Anyone knows what's that IDE theme (2:50)? Looks nice
This shows the importance of publishing negative results! In some areas of research, negative results never see the light of day because they have a much smaller chance of getting accepted into journals. I think this needs to change!
nice video! very informative and relatable
great video! very educational
I was at EuroBSDCon in 2017, and someone modified the kernel to exit instead of throwing a segfault. I didn't understand it at the time, but now I think this could mitigate this bug.
Maybe we rely too much on buggy code, so that segfaults are not treated as critically as they should be.
Amazing video
Great Work ❤️
Could you maybe look into the USB-JTAG vulnerability on older Intel CPUs?
Thread is the kernel-side term for a process. To be specific, a thread whose ID is the same as its thread group ID is a process, while a thread whose ID differs from its thread group ID is a thread in the userspace sense.
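The thread ID vs. thread group ID point can be observed from userspace. Below is a minimal sketch, assuming Python 3.8+ for `threading.get_native_id()`; the helper name `observe_ids` is made up. On Linux, `os.getpid()` returns the thread group ID, and for the initial thread of a process the native thread ID is typically that same number, while additional threads get their own.

```python
import os
import threading


def observe_ids() -> dict:
    """Collect the process ID and native thread ID for two threads.

    Hypothetical helper for illustration. "caller" is whichever thread
    invokes this function, "worker" is a freshly started thread.
    """
    ids = {
        "caller_pid": os.getpid(),
        "caller_tid": threading.get_native_id(),
    }

    def worker():
        ids["worker_pid"] = os.getpid()
        ids["worker_tid"] = threading.get_native_id()

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return ids


if __name__ == "__main__":
    for name, value in observe_ids().items():
        print(f"{name}: {value}")
```

Both threads report the same process ID (the thread group), but each has its own thread ID, which matches the distinction drawn in the comment above.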
love for your super explanation.
There is a talk/video from 33C3 back in 2016 titled "What could possibly go wrong with (insert x86 instruction here)?" which goes through the CPU cache side-channel attacks.
Can you tell where your intro music / medley is from or who produced it? Cheers!
How would the speculative execution behave if one instruction *will* change the opcode of one of the next instructions? I know it's not the usual case for the executable code to change the next executable instructions, but it's still possible to do this, right?
4:12 I'm sorry but the forty twoth (?) is triggering me so much . . . nonono, forty second (!) :(
What was the "small mistake" the initial blog/paper missed in exploiting leaking kernel memory?
What a great video !
4:10 42th? :DDDD I think you meant 42nd?
Where can I find the code you show in the video at 18:30?
Spectre and Meltdown really changed the way we look at malware
VUSEC gives great courses by the way!
They teach it at the Vrije Universiteit Amsterdam
In the courses I took we got to reproduce one of their papers actually. I reproduced GLitch :)
High quality content fr
Awesome!!!
Yes, CPU pipelining has been around for ages (the Motorola 68040 from 1990 was pipelined, and the 386 and 486 definitely were pipelined).
Out-of-order execution goes back to the Pentium Pro (Pentium II).
I somehow expected superscalar to be mentioned as that came before Out of Order Execution, but I see how it wasn't super relevant to the video.
Intel: Bounties are too expensive, we need to hire a hacker on the cheap... 😂🤣
This again is a great showcase of the outstanding cyber security research going on in Germany! No matter whether it's CISPA in Saarbrücken or the HGI in Bochum.
Developing CPU attacks? Standardizing the new post-quantum cryptography schemes? Germany plays a major role there!
Of course our neighbours from the Netherlands and other universities are also very good ;)
Amazing!!!
If checking cache access times after an invalid access is how you have to exploit any of these, can't you just have the kernel flush the cache completely before it calls the sigsegv handler?
This might mitigate out-of-order-execution vulnerabilities like meltdown, but not speculative-execution vulnerabilities like spectre. In the latter, there are no segfaults.
"42th page" was kinda painful
2:04 compiler explorer is everywhere
bit of a random question, but what kind of shop would I find club mate in? is it just any old supermarket, or do i have to go to a special mate shop? (assuming im already in germany)
You can find Club Mate in a lot of normal supermarkets, e.g. REWE or Edeka, but your best chances are in beverage markets, where there might also be other types of Mate (e.g. Mio Mio) or other lesser known types of beverages.
@felixe2890 thanks!
Wish I could understand what you said. I was interested nonetheless :)
11:42 how could I get a kernel address? doesn't my process use virtual memory? I should not be able to address kernel pages at all... ?
EDIT: ok, if it's just another user process, it's not weird. But reading kernel memory still eludes me.
Have you considered doing more general overviews/tutorials related to programming oriented towards a more professional audience? While I love computer science, your channel is one of the very few that has managed to keep me interested. Of the programming channels I have tried watching, most are either lengthy tutorials for complete beginners or short overviews of frameworks/libraries. I wish there was a place I could find programming deep dives on more advanced/novel concepts while assuming some industry experience from the viewer.
maybe the interest is a "you-problem"
This is more of a defcon-style approach, which the general hacker community has. I'm sure if you want a more professional-audience catered style, you could look at Def Con or BlackHat conference talks. If you're looking for much different I'll tell you now that most of the audience does not want that.
What a great video
Really was looking forward to it. So nice that Intel actually contacted you, since they reacted quite "salty" to the work of one of my lecturers (whom I admire, you might know him: Michael Schwarz). Really really cool video! :) He taught us about fencing etc. and the simplicity of analyzing the "performance" by plotting a histogram. No big ML needed here. :D I don't know, but the segfault handler seems either like a really useful feature or as if you shot yourself in the foot. xD
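The "plot a histogram, no ML needed" idea can be sketched in a few lines. Everything below is an assumption for illustration: the helper names are made up, and the cycle counts are synthetic numbers invented to show two clusters, not measurements from any real CPU.

```python
from collections import Counter


def text_histogram(samples, bucket=20):
    """Return one text line per non-empty bucket of access times."""
    buckets = Counter((s // bucket) * bucket for s in samples)
    return [
        f"{start:>4}-{start + bucket - 1:<4} {'#' * count}"
        for start, count in sorted(buckets.items())
    ]


def midpoint_threshold(samples):
    """Pick a cut-off in the middle of the largest gap between samples."""
    ordered = sorted(samples)
    _, low, high = max((b - a, a, b) for a, b in zip(ordered, ordered[1:]))
    return (low + high) // 2


def classify(sample, threshold):
    """Label a single timing as a cache hit or a cache miss."""
    return "hit" if sample < threshold else "miss"


if __name__ == "__main__":
    # Synthetic cycle counts, invented for this example only.
    timings = [42, 45, 48, 51, 44, 47, 210, 230, 225, 240]
    print("\n".join(text_histogram(timings)))
    cut = midpoint_threshold(timings)
    print("threshold:", cut)
    print([classify(t, cut) for t in timings])
```

With real measurements the two clusters are noisier, but the same visual gap between fast (cached) and slow (uncached) accesses is what makes a simple threshold workable.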
Interestingly enough, in 2017 I watched the Computer Science Crash Course videos, and when they mentioned caches and pipelining I wondered whether you could measure the cache access time of forbidden variables. But I brushed it off, thinking that when the CPU mispredicts it would also flush the cache.
I would like a video on what microcode is and how it can fix these problems.
Did I understand correctly?
The parent code will try to read the secret value, which is at the same address in both processes, and speculative execution will run it. The speculative execution will run with the actual secret value, and then it will learn that it made an error because the secret's value in the parent process is nullptr. Then it will trigger an exception. And then we can't simply check which page table is loaded very fast.
with those steps of:
1. prepare weird payload (something known that shouldn't work)
2. use it
3. measure
seems awfully like how people use cheatengine.... interesting.
Hope you have a great day & Safe Travels!
Thanks for explaining, you have such great energy
Cool!
Now do the TPU and the baked in "Management" ROMS. ;-)
I'm shocked this is the thumbnail that won the poll
when the sponsor wants them to talk bad about him, that's wild!
so can this be automated?
Very good explanation, but looking at the example code: does fixing something like this make sense given the performance loss??
Personally, I would not bother and dismiss this edge case finding. Nobody should even be able to execute arbitrary code anyway, plus with knowledge about the issue, software if required can guard itself from these flaws.
What CPU manufacturers plan to do with these vulnerabilities?
I believe that in the accidental discovery, you need to guarantee that you are running both processes on the same core...
P.S.: It's curious how the video approaches RIDL without needing to talk about Meltdown.. time really goes fast...
Thanks for the video.
18:40 "I hope this code looks familiar"
Me who only used nested loops for printing stars
Isn't it fun seeing the wheels turn inside the minds of incredibly intelligent people?
15:33 this is what students are supposed to do when writing their thesis :)
Probably meant 42nd as in forty second instead of 42th? :)
21:55 is this him rocking Grado cans with shipibo pads???
Okay! A new challenge for you (I don't know how to do it).
How do you get the firmware (bootloader + OS + app) out of an embedded system (from the device, not from a URL)? I don't know where the UART and JTAG interfaces are on the device, and there might be some flush mechanisms or read-write protection that I don't know about.
Amazing video... name should be How to find new class of vulnerabilities 😅
Really good stuff, gave me some ideas. I'll definitely provide credit if it gets a CVE :D
We are crowdsourcing, practically for free, the work that Intel should be doing themselves.
To be fair, it did take collaboration between several security researchers to find this class of bugs. I don't know if Intel is to blame here when it seems this could affect any type of processor of any architecture.
@D M Except Intel has the source code (VHDL or Verilog) for the circuitry. They could analyse it much more easily.
The reality is Intel couldn't hire enough people to find these kinds of bugs. The best situation is having countless people trying to exploit the systems, with a meaningful reward for finding bugs so that they can then be fixed. This is true of all companies, not just Intel. Bug bounties are great because they are open to everyone who wants to give it a try; they just need rewards good enough to make bugs worth turning in rather than selling on the black market.
Then Intel would just end up with all the people working in security research (hardware and software) and keep going around in that loop. There is something called a product development cycle, and there are a lot of additional new things being researched.
No one writes bug-free code; it's how they approach their mistakes and fixes that makes them better.
Plus this is global-scale research, and that's how all bug bounties work.
@tomaspecl1082 The issue doesn't come from the source code for the circuitry. It comes from the architecture, and this was a hard topic to reason about until the field exploded in 2018.
I feel like a 10x hardware hacker now!!!🤪
you are so pro
Thanks for this very good and interesting video! I personally like these low level / computer architecture videos a lot more.
A thread is the actual code being executed. A process is the container that holds the address space and other process data, including the threads. A thread is what gets executed, not a process.
4:13 "Forty-second", not "Forty-tooth"
Doesn't releasing failed results (15:34) contradict the guidelines for ethical disclosure of vulnerabilities?
I mean, the bad guys might already have used this information to figure out the remaining piece of the puzzle before the researchers did, and hence also before Intel had the opportunity to fix it?
So, referring to some other comments I read:
I agree that sharing negative results is a good idea, but only from the scientific perspective!
Taking into account the negative side effects mentioned above, this may be a bad idea for IT security.
What do you think?