Lightning Talks: 5 Things You Didn't Know Your CPU Did For You - Matt Godbolt - C++ on Sea 2023

  • added 3 June 2024
  • cpponsea.uk/
    ---
    A very super fast tour of the magic the microarchitecture of your CPU does for you without you even knowing!
    ---
    Slides: github.com/philsquared/cppons...
    Sponsored by think-cell: www.think-cell.com/en/
    ---
    Matt Godbolt
    I'm a C++ developer who's passionate about the seemingly opposite goals of good, readable code and high performance code. I love taking the lid off and looking underneath, be it the compiler, the operating system, or even the silicon that runs everything.
    By day I write software for quantitative trading company Aquatic. By night I hack on hobby projects ranging from emulating old computers in your browser to compiler exploration tools.
    ---
    C++ on Sea is an annual C++ and coding conference, in Folkestone, in the UK.
    - Annual C++ on Sea, C++ conference: cpponsea.uk/
    - 2023 Program: cpponsea.uk/2023/schedule/
    - Twitter: / cpponsea
    ---
    CZcams Videos Filmed, Edited & Optimised by Digital Medium: events.digital-medium.co.uk
    #cpp #cpponsea #cpu
  • Science & Technology

Comments • 13

  • @xDeltaF1x
    @xDeltaF1x 7 months ago +14

    Does anyone have a link to the branch prediction reverse-engineering he mentioned?

    • @timcussins
      @timcussins 6 months ago +7

      Look for the Half&Half paper from May 2023.
      "Half&Half: Demystifying Intel’s Directional Branch Predictors for Fast, Secure Partitioned Execution"

  • @elliotbarlas
    @elliotbarlas 7 months ago +6

    Brilliant presentation, Matt!

  • @capability-snob
    @capability-snob 7 months ago +4

    The CISC to RISC thing always bothers me. If you told me the PHP interpreter converted my code to Elixir internally, I would be asking why I can't just be allowed to write the nicer language in the first place.

    • @coolcax99
      @coolcax99 6 months ago +1

      Technically, RISC is worse for humans to write in. Going from CISC to RISC would be like going from Python to C: simpler, sure, but more verbose and less abstract. In x86 (CISC), for example, you straight up have a memcpy instruction (REP MOVSB). On a RISC system, you would have to write the same thing with at least 6 instructions.

    • @VFPn96kQT
      @VFPn96kQT 6 months ago +2

      There are billions and billions of lines of compiled code that expect the same old instructions. You can't just make all that code stop working.

    • @capability-snob
      @capability-snob 6 months ago

      @coolcax99 Not at all. If you want to MUL or IMUL on x86, for example, you first have to save EAX and EDX if you need them, then perform the multiply, then move the result to wherever you need it, and then possibly restore those registers. On a RISC system, thanks to the large number of general-purpose registers and the orthogonal instruction format, you can usually just multiply, specifying the registers you want to operate on. The x86 version is much more complex for the programmer.
      As to using REP to do memcpy: do modern compilers even use it? I thought the move was generally toward using SSE registers as a scratch space.
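For readers following the MUL/IMUL point, here is a minimal C++ sketch of the two kinds of multiply being discussed. The one-operand x86 MUL produces a double-width result with the high half in EDX and the low half in EAX; the two- and three-operand forms of IMUL write a truncated result to a general-purpose register of your choice. The names `WideProduct`, `widening_mul`, `low_mul`, and `widening_mul_demo` are hypothetical, chosen only for this illustration, and which instruction a compiler actually emits for each function is an assumption that depends on the compiler and target.

```cpp
#include <cassert>
#include <cstdint>

// The two halves of a 64-bit product, labelled with the registers that the
// one-operand x86 MUL writes them to. Hypothetical type for illustration.
struct WideProduct {
    std::uint32_t high;  // corresponds to EDX after MUL r/m32
    std::uint32_t low;   // corresponds to EAX after MUL r/m32
};

// Unsigned 32x32 -> 64-bit multiply, split into its two 32-bit halves.
inline WideProduct widening_mul(std::uint32_t a, std::uint32_t b) {
    const std::uint64_t full = static_cast<std::uint64_t>(a) * b;
    return {static_cast<std::uint32_t>(full >> 32),
            static_cast<std::uint32_t>(full)};
}

// Multiply that keeps only the low 32 bits (the result wraps modulo 2^32).
inline std::uint32_t low_mul(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::uint32_t>(static_cast<std::uint64_t>(a) * b);
}

// Usage example: 2^16 * 2^16 = 2^32, which does not fit in 32 bits.
inline void widening_mul_demo() {
    const WideProduct p = widening_mul(0x10000u, 0x10000u);
    assert(p.high == 1u && p.low == 0u);        // full product is kept
    assert(low_mul(0x10000u, 0x10000u) == 0u);  // high half is discarded
}
```

In C++ the register shuffling described in the comment is the compiler's job rather than the programmer's; the fixed EDX:EAX pair only matters when the full double-width product is wanted.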

    • @capability-snob
      @capability-snob 6 months ago

      Turns out memcpy is implemented in most libcs, on pretty much every platform, by prefetching, loading up a stack of registers, and then writing them out. It's not SSE, though.
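The "load up a stack of registers, then write them out" pattern can be sketched roughly as follows. This is a hypothetical illustration, not the code of any real libc: `block_copy` is an invented name, the block size of four 64-bit words is an arbitrary assumption, prefetching is left out, and, like `memcpy`, it assumes the source and destination do not overlap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical sketch of a block copy: load several words into locals
// (which a compiler may keep in registers), then store them all, and
// finish the remaining bytes one at a time.
inline void* block_copy(void* dst, const void* src, std::size_t n) {
    assert((dst != nullptr && src != nullptr) || n == 0);
    auto* d = static_cast<unsigned char*>(dst);
    const auto* s = static_cast<const unsigned char*>(src);
    constexpr std::size_t kWord = sizeof(std::uint64_t);
    constexpr std::size_t kBlock = 4 * kWord;

    while (n >= kBlock) {
        std::uint64_t r0, r1, r2, r3;
        // Loads first. A fixed-size std::memcpy is the portable way to
        // express an unaligned word load in C++.
        std::memcpy(&r0, s + 0 * kWord, kWord);
        std::memcpy(&r1, s + 1 * kWord, kWord);
        std::memcpy(&r2, s + 2 * kWord, kWord);
        std::memcpy(&r3, s + 3 * kWord, kWord);
        // Then the stores.
        std::memcpy(d + 0 * kWord, &r0, kWord);
        std::memcpy(d + 1 * kWord, &r1, kWord);
        std::memcpy(d + 2 * kWord, &r2, kWord);
        std::memcpy(d + 3 * kWord, &r3, kWord);
        s += kBlock;
        d += kBlock;
        n -= kBlock;
    }
    while (n > 0) {  // tail: fewer than kBlock bytes remain
        *d++ = *s++;
        --n;
    }
    return dst;
}
```

Grouping the loads ahead of the stores gives the hardware several independent memory operations to overlap. Real implementations are typically hand-tuned per architecture and pick a strategy based on the copy size, so treat this only as the general shape.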

  • @whamer100
    @whamer100 7 months ago +1

    damn, i didn't realize my cpu was THAT smart, just wish it acted that way sometimes lmao

  • @anon_y_mousse
    @anon_y_mousse 7 months ago +6

    Unfortunately for Matt's title, anyone who's ever read the manuals and/or written a compiler likely knows all of this already. The title should probably s/Didn't/Might Not/;s/Did/Does/;. The more important lesson is that the CPU, having the access that it does to your program and its running state, can do this better than your compiler, or even you writing it by hand, can. Further, it can also do it better because it doesn't have to guess what you're trying to do with your instructions, since it's blatantly obvious, instead of being split up in weird ways by either you or your compiler. This is why CISC CPUs are faster, have always been faster, and will never be beaten in a like-for-like comparison. Realistically, development time should be spent on making more power-efficient CISC chips, regardless of ABI, instead of all of the RISC-based designs that keep getting created. A completely new chip design would be awesome, especially if it was designed from the ground up without making the mistakes of every other chip that currently exists.

    • @Yupppi
      @Yupppi 6 months ago +5

      Haha, people who have read the manual.

    • @coolcax99
      @coolcax99 6 months ago

      It’s not “blatantly obvious” to the CPU what you are trying to do outside of the instruction(s) currently executing. While it may have runtime information, it is limited by resources and time. CPUs are actually pretty blind to the larger structure of the program (e.g., anything outside of maybe one nested loop).
      Compilers, on the other hand, have practically infinite time and resources but no runtime information. Each needs to cover the other's blind spots.
      It is similar with RISC vs CISC. Hennessy and Patterson seem to be content seeing x86 ultimately break down into RISC-like micro-ops, but it’s not as straightforward. CISC allows compressed information in a single instruction, but executing that instruction is then more challenging; it is harder to design the CPU pipeline and keep it busy at high frequencies. RISC allows a simpler pipeline design but blows up instruction sizes. This micro-ops approach seems like a nice compromise that most have settled on, but there’s hardly any conclusive evidence that it (or CISC) is the best strategy. Indeed, the extreme point of CISC designs, VLIW, has failed to catch on in the general-purpose CPU space.
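The split between compile-time and runtime knowledge can be illustrated with a data-dependent branch. In the sketch below the compiler sees exactly the same loop whether the input is sorted or shuffled, while a branch predictor sees a very different pattern of outcomes at runtime. All names (`count_at_least`, `make_bytes`, `sorted_copy`, `branch_demo`) are hypothetical, and whether any timing difference shows up is an assumption that depends on the compiler and CPU: an optimizer may replace the branch with branchless or vectorized code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Counts the values >= threshold using a data-dependent branch.
inline std::size_t count_at_least(const std::vector<std::uint8_t>& data,
                                  std::uint8_t threshold) {
    std::size_t count = 0;
    for (std::uint8_t v : data) {
        if (v >= threshold) {  // outcome is only known at runtime
            ++count;
        }
    }
    return count;
}

// n pseudo-random bytes from a fixed seed, so runs are reproducible.
inline std::vector<std::uint8_t> make_bytes(std::size_t n, std::uint32_t seed) {
    std::mt19937 rng(seed);
    std::vector<std::uint8_t> out(n);
    for (auto& v : out) {
        v = static_cast<std::uint8_t>(rng() & 0xFFu);
    }
    return out;
}

// Same values, ordered so the branch outcome changes only once in the loop.
inline std::vector<std::uint8_t> sorted_copy(std::vector<std::uint8_t> data) {
    std::sort(data.begin(), data.end());
    return data;
}

// Usage example: the answer is identical for both orderings; only the
// predictability of the branch differs.
inline void branch_demo() {
    const auto shuffled = make_bytes(1u << 16, 12345u);
    const auto sorted = sorted_copy(shuffled);
    assert(count_at_least(shuffled, 128) == count_at_least(sorted, 128));
}
```

Timing `count_at_least` on both inputs (for example with `std::chrono::steady_clock`) is the usual way to look for the effect; no numbers are given here because they vary by machine and build flags.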

    • @coolcax99
      @coolcax99 6 months ago

      Also, I would argue the concepts in this talk are more familiar to an architect/CPU hardware designer than to compiler writers and software developers who read manuals. After all, the devs are programming for an abstract machine which works quite differently from the real machine, and compiler writers interact more with the ISA than with the microarchitecture when compiling code. Compilers don’t rely on features like prefetching or out-of-order processing being available, since they aren’t specified in the ISA and are usually processor specific. So compiler writers who write processor-specific optimizations will probably be more intimately familiar with these concepts.
      Then again, it’s just a lightning talk. How much content can you really present in 5 minutes?