5. C to Assembly

  • Added 15. 05. 2024
  • MIT 6.172 Performance Engineering of Software Systems, Fall 2018
    Instructor: Tao B. Schardl
    View the complete course: ocw.mit.edu/6-172F18
    CZcams Playlist: • MIT 6.172 Performance ...
    This lecture focuses on how C code is implemented in x86-64 assembly. Dr. Schardl reasons through the mapping from C code to assembly in two steps: C to LLVM IR and then LLVM IR to Assembly.
    License: Creative Commons BY-NC-SA
    More information at ocw.mit.edu/terms
    More courses at ocw.mit.edu

Comments • 144

  • @leixun 3 years ago +185

    *My takeaways:*
    1. How does C code become assembly 5:20
    2. LLVM IR primer 10:25
    3. C to LLVM IR 19:45
    4. LLVM IR to assembly 48:18
    5. Case study 1:07:42

    • @Sam-AZ 2 years ago +1

      Thank you.

    • @leixun 2 years ago +1

      @Sam-AZ You are welcome!

    • @frejustossou9910 2 years ago +1

      @leixun Hello, which is your channel?

    • @frejustossou9910 2 years ago +1

      @leixun Done. Your channel is very interesting.

    • @leixun 2 years ago +2

      @frejustossou9910 Thanks!

  • @klausehrhardt4481 2 years ago +4

    Nice to have a look at the intermediate LLVM output files and the way compilers do their work.

  • @shirleyachara3809 2 years ago +7

    Best lecture on this topic! Thanks 🙏

  • @josephgibson2548 4 years ago +20

    Thank you for these videos!

  • @SuperlativeCG 2 years ago +49

    The only assembly language I can read are IKEA instructions.

  • @jeffpowell860 3 years ago +19

    primer, priming, primed....primmed?

  • @filipecotrimmelo7714 2 years ago +1

    This class was really amazing.

  • @jimjrivan 2 months ago

    Congratulations on the lecture, professor!

  • @byarkan 2 years ago +32

    Amish man is my new favorite instructor right now

  • @bariswheel 2 years ago +4

    Great stuff thank you

  • @norbi4148 2 years ago +4

    Seeing the MIT logo gave me a scare, and I suddenly thought of BME MIT. I still have PTSD from it to this day :D

  • @piotrlenarczyk5803 2 years ago +1

    Thank you for the video.

  • @user-ju1qd9ek2m 1 year ago

    very clear talk, thanks

  • @custodiogomesbarcellos4972

    Great content.

  • @ill_t5 1 year ago

    Truly something wonderful 🤍🤍.

  • @davereid-daly2205 2 years ago +6

    Brilliant, simple, straightforward explanation. Why does everyone else make this seem complicated?

    • @pschneider1968 2 years ago

      Because it is complicated?!

    • @davereid-daly2205 2 years ago +4

      @pschneider1968 I don't find it complicated. Generally, in my experience, people who don't fully understand a system struggle to teach others how it works, and they tend to complicate things. I see this all the time in my area of expertise. People love to teach others, but there are not many good teachers because too few of them have an in-depth grasp of the material.

    • @pschneider1968 2 years ago

      @davereid-daly2205 Don't get me wrong, the lecture is great! But for me, this is really complicated stuff, e.g. if you look at the x86-64 instruction pipeline, branch prediction, etc. Someone once said (I don't recall who it was) that "the true sign of advanced technology is that it's indistinguishable from magic" 😁

    • @davereid-daly2205 2 years ago +2

      @pschneider1968 I'm not sure what you are trying to say, to be honest. But I don't agree with your quoted statement. Magic is largely a product of misdirection, whereas computers are a product of engineering and great design, two very different things.

    • @pschneider1968 2 years ago

      @davereid-daly2205 Yeah, I see that you are not getting my point. Let's leave it at that.

  • @pythontron8710 2 years ago +19

    I would appreciate a Holy C to assembly course.

    • @Antagon666 2 years ago +12

      Functions are called Prayers, and are executed by God himself.

  • @raghav151196 3 years ago +7

    In the final example of fib, under LBB0_1,
    we calculate (n-1) as: leaq -1(%rbx), %rdi
    but calculate (n-2) in 2 steps.
    Why did we do so?

    • @rickr530 2 years ago +1

      Yes it does not make sense. It should be optimized to a single instruction: leaq -2(%rbx), %rdi.

    • @digama0 2 years ago +6

      It is because the optimization was set to -O1 in this example, instead of -O3 which can make line-by-line comparison harder. You can see on godbolt (link block seems active, see /z/nzfvh5Gfc on the compiler explorer) that clang will in fact optimize the second ADD to LEA when you crank up the optimization level. My guess is that the reason it didn't select LEA to begin with was because the second calculation was the last use of %rbx, so it figured it would reuse %rbx for the result of computing n-2, and then filling the linkage arguments is a separate stage, where %rdi gets filled. (The -O3 code is even more confusing because it actually inlines one of the recursive calls into a while loop, so it would not have made a good demonstration for the example.)

  • @deanlhouston 2 years ago +35

    One of the students asks why is C called 'C'... and the instructor didn't know.
    C language is called 'C' because C comes after 'B'. C was derived from and an improvement over an earlier programming language called 'B'. Nothing mysterious or philosophical was involved in naming it.

    • @josh5457 2 years ago +8

      B itself was derived from BCPL (which came from CPL). One could imagine that C's name is a double entendre, as C comes after B alphabetically but is also the next letter in BCPL (so should C's successor be called D or P?). Honestly I doubt Ritchie and Thompson had this in mind lol but it's fun to think about

    • @jamesmillerjo 2 years ago

      And no theory has been confirmed

    • @slbewsp 2 years ago +10

      I believe the question was about phi, not about C.

    • @OEFarredondo 1 year ago

      I'd have called it G

  • @TB-jl9fr 2 years ago +1

    Lots of Indian dudes in the lecture. Seems to be the same everywhere in this sort of lecture :D
    After watching 5 min of that video I remembered why I changed my subject from embedded systems to electrical engineering xD

  • @richardhelper167 2 years ago

    I wonder why, in the LLVM IR representation of the mm_base function at 25:22, parameters A and B are marked as readonly. They'd better be preceded by a const keyword.

    • @digama0 2 years ago

      Adding the readonly keyword only means that writing to the pointer is UB, so as long as the C compiler knows that the pointer isn't written to and doesn't pass the pointer to something else that does, it is free to insert the attribute. LLVM can do this as well during optimization, but the C compiler knows the C spec better and can infer constness in places where LLVM might not be able to.

  • @banalestorchid5814 2 years ago

    @9:34 "Primer" is not pronounced "prim-mer" if it were then it would be written "primmer". It is "prime-er" as in "priming" you for something.

  • @CaseyAnthonyVEVO 2 years ago

    @16:44 it might just be the network security guy in me but when I saw ICMP I thought of something other than Boolean logic LOL. How confusing.

  • @StephenCameron 2 years ago +2

    At 1:18:26, why did it use leaq to calculate n-1, but movq and addq to compute n-2? Why not again use leaq -2(%rbx), %rdi to compute n-2?

    • @StephenCameron 2 years ago +1

      To answer my own question, the answer *may* be: because leaq is done by address decode hw and add/mov are not, and so they can be executed simultaneously by the CPU, but it may not do two leaq's simultaneously. But this is just a guess, I don't actually know this.

    • @isovideo7497 2 years ago +2

      The n-1 calculation needs to preserve the original n as n is used later. The n-2 term doesn't need to preserve n, so the addq is faster, and this is what you see prior to optimization. The optimizer could indeed combine the addq and movq as that is a simple multi-instruction optimization.

  • @vivekkaushik9508 1 year ago +1

    Watch this before going to bed. You'll wake up smarter.

  • @fabiobairros3582 2 years ago +4

    Congrats on the class!! How can I implement the C operation (x % y) in assembly?

    • @MrGeorge1896 2 years ago +12

      Use the DIV/IDIV divide instruction. The remainder of the division will be stored in register RDX. (=EDX in 32 bit mode)

    • @H33t3Speaks 2 years ago

      @MrGeorge1896 Pretty sure there’s a ‘mod’ instruction between two registers or an ALU.

    • @karolmaczek 2 years ago +2

      @H33t3Speaks no

    • @isovideo7497 2 years ago +1

      A div instruction should also give the remainder, but if y is a constant power of two, it may use faster and-mask operations (e.g. for unsigned x, x % 4 == x & 0x3).

    • @fabiobairros3582 2 years ago

      @isovideo7497 Thanks!! Do you know how the Python function pow(a, b, m) works? What algorithm does it use?

  • @luserdroog 2 years ago

    The shape of the greek letter phi looks like the graph with lots of loops.

    • @StephenCameron 2 years ago +2

      Seems like psi would have been a better fit.

  • @93hothead 2 years ago

    I'm just looking at these and am completely clueless as to what is going on.... Is there any way to get used to RISC-V architecture??

    • @sergiog5543 2 years ago

      no

    • @pschneider1968 2 years ago +1

      @sergiog5543 There is a reason Andrew S. Tanenbaum, in the initial newsgroup discussion "Linux is obsolete", said that Intel x86 was a "weird" architecture 😉😁

  • @marcioaso 4 years ago +20

    I thought it would be C to Assembly, not C to LLVM.

  • @sethtrowbridge9122 2 years ago +10

    you. in the background. with the squeaky chair: I don't know who you are. I don't know what you want. If you are looking for ransom I can tell you I don't have money, but what I do have are a very particular set of skills. Skills I have acquired over a very long career. Skills that make me a nightmare for people like you.

  • @kenichimori8533 2 years ago

    Define

  • @solome6478 2 years ago

    1:10 ahh college life of pulling multiple all-nighters to cram...

  • @bagtea 3 years ago +6

    who else is strugglin on Assembly to machine :(

    • @mvisperas 3 years ago

      Try hand assembling. You will learn how the assembly is translated to machine code. I used to do hand calculations of relative jumps; this can teach you how positive and negative numbers work.

    • @bagtea 3 years ago +5

      @mvisperas lol I somehow managed to learn and do well in that exam but now I forgot everything

  • @pekertimulia125 2 years ago +1

    Smalltalk and C++

  • @magno5157 2 years ago +7

    59:34 the use of "top" and "bottom" here (referring to the fact that stack grows "downward") is very awful.
    It's much, much more common that people say "the top of the stack" and "the base of a stack frame". In which case, %rbp points to the bottom of the current stack frame and %rsp points to the top of the stack, instead of reversing the sense of direction by describing %rbp as the "top" and %rsp as the "bottom".

    • @haydenp14 2 years ago +1

      good insight

    • @evanconley9825 2 years ago +1

      The professor's reference was accurate and is definitely the proper way to define both the "direction" that the stack "grows" as well as "where in the stack frame" each of %rsp and %rbp points to. It is exactly as shown, regardless of changing the semantics. Using the phrases "top of stack" and "bottom of stack frame" doesn't change the definitions, and that explanation doesn't result in the description you provided, because %rbp does not point to the bottom of the current stack frame, it points to the top; %rsp points to the bottom of the current stack frame.
      This is not a matter of interpretation or perspective. %rbp position 0 is near the top of the current frame, holding previous return address, and %rbp position >0 is at the top of the current frame, holding the current return address. %rbp

    • @magno5157 2 years ago

      @evanconley9825 I never said it was wrong. His way of referencing was just against common usage.

    • @akshatghoshal6098 2 years ago

      @magno5157 I think this is how they taught me in university as well, just like how this professor is teaching

    • @akshatghoshal6098 2 years ago

      @magno5157 actually never mind, I don't know exactly what you're talking about since I am bad lol

  • @AndreyVarlamov 2 years ago

    learning assembly in 2020 = reinventing wheel...

  • @day7141 2 years ago

    What’s this, only plebes use C?

    • @longlostwraith5106 2 years ago

      C runs everything, BOI.

    • @day7141 2 years ago

      @@longlostwraith5106 You’re delusional. Crypts run everything? By that you mean Vice Lords, which are a gang of Rabbi’s.
      There’s a reason the BL’s always attack those old Jews in New York all the time. It’s because they’re the leadership of the Vice Lords. The Folk are the White people branches. The Crypts are the PoC.
      You’re delusional. These groups are all trash. They ruin lives for dollars and street corners. Gangs are stupid people trying to compete in a modern economy.

  • @darkwoodmovies 5 months ago

    Love this open courseware, truly thank you... but as an aside, I don't think we need to donate to a school with a $23.5 billion endowment.

    • @mitocw 5 months ago +1

      The additional funds we are asking for are not for survival but to thrive! MIT gives $1-2 million every year to MIT OpenCourseWare and that's not going away. We've been publishing for 20+ years now, i.e. MIT has given tens of millions of dollars away for free (not to mention the generous material contributions of all the instructors and students at MIT... which are purely voluntary). We will always be publishing courses... but we could always publish more with more money. You can help us publish more courses and help us share more knowledge. ocw.mit.edu/donate

  • @whkee 2 years ago +7

    Assembly to Machine ☝️🤣

  • @tamoghnamukerjee9283 2 years ago +3

    I am sorry, I thought I get a 'C' just to sign up and 'assemble' an imaginary study table.
    You mean to say I was wrong.

  • @jonathanjollimore4794 2 years ago

    Will yea look at all the numberphile someone times yea need a good old head from logic problems and puzzles ;)

  • @justcurious1940 7 months ago

    I was planning to dig more under 'C' but after seeing this video I lost interest, It's so messy and more complex than 'C' or even 'C++'.

  • @arturo.gonzalex 2 years ago +2

    In European universities they teach us exactly the same. But we can study for free, and in the US the price is 100k. Why?

    • @love_pets1363 2 years ago +1

      It's the american dream.

    • @timothydee1507 2 years ago +2

      MIT is ranked #3 in the world and like 8 of the top 10 universities are also American

    • @swarnavasamanta2628 2 years ago

      The value of a degree from MIT is far more than any university from EU

    • @arturo.gonzalex 2 years ago

      @swarnavasamanta2628 What exactly makes it more valuable? Enormous student debt?

    • @swarnavasamanta2628 2 years ago

      @arturo.gonzalex What makes it valuable is the people who are around you. Everyone who gets into these colleges is smart and works hard, so you're in a good environment. Also, the research opportunities in these colleges are manifold. Not to mention there are numerous scholarships for these prestigious colleges. And you get picked up like a hot cake when you graduate from any of these colleges. European universities are good but these are the best.

  • @ScoopexUs 2 years ago +2

    This isn't really Computer Science, since it analyzes one type of compiler, not even one type of CPU let alone all types (for which Assembler works very similarly). To take the phi loop example, in another compiler there would be nothing to correspond to a phi "instruction", and the variable i would correspond to a single register (by any alias).
    If instead Assembler was taught generally, anyone could write their own compiler for a current or future CPU (by synthesis, bottom-up software engineering). They would still be as unable as you (or any professor) to parse the output of another compiler. You are left with identifying patterns, part of which you understand. But what actual code is run, and how CPUs execute code, is left a mystery to CS students.
    This is why there are no hands in the air. Doing it this way leaves the subject too alien to what they've been taught for years, which is a top-down software engineering approach (yes, even for C which has all the power of Assembler without the benefits and speed.)

    • @alex_pincha 2 years ago +6

      6:57 "this is not a compiler class........"

    • @generessler6282 2 years ago +4

      He didn't mention that phi is a standard part of static single assignment, which has been part of the compiler literature for 15 years or more. Any compiler that uses SSA as an intermediate rep (which these days is virtually all) will need a phi representation. In fact the reason llvm and clang exist is to use best-of-breed compiler techniques - like SSA - in a fresh impl rather than trying to glue them into gcc. So this lecture is about compiler engineering as it affects the students in the course, which is about implementing high performing software. Whether that's computer science is a religious issue.

  • @kenichimori8533 2 years ago

    Cyrillic alphabet assembly Osakana.
    Fish

  • @josemanuelquispemamani9672

    We need Spanish subtitles

  • @filipecotrimmelo7714 2 years ago +1

    Assembly is easier than LLVM lol

  • @setandforgetinvesting6708

    My brain can't understand this

  • @jamesmillerjo 2 years ago

    So many MIT jokes

  • @rty1955 2 years ago +1

    OMG x86? Really? What a highly limited, brain dead processor.
    The x86 is a toy trying to compete in a grown up world.
    I'm an old-timer who has written about a million lines of assembler code for over 14 different CPUs. x86 has got to be the worst I've coded on; the best? IBM mainframes. I've written for: Data General, General Automation 460, CDC 1700, PDP 11 series, Quotron, IBM Series/1, Intel 8080, Zilog Z80, PIC processors, TI, IBM mainframes from 1401 thru s/390, Cray, Silicon Graphics and a bunch of other types of machines. I even wrote microcode for IBM 360/30 and PDP 11/44. So I am pretty well versed in many CPU architectures.
    I taught COBOL programmers how to read core dumps as well.
    I used to read 3,000 page core dumps often when major subsystems failed.
    I taught mainframe operating systems concepts at NYU in NY.
    To me, computers are tools to get a job done. Some computers do things very well others do not. Same thing for computer languages. There is no one language that is efficient for all cases. This is why I learned 12 different languages. As a programmer you should ALWAYS be learning. I was self taught IBM assembler and when I did, a big light went on. Suddenly everything made sense. I was a sponge to learn more. Things I could do in assembly could not even be dreamed of by a high level language programmer. I could do more with very limited memory by adopting assembly concepts. I made the mainframe do things that IBM said could not be done. Some code I wrote in the 80s is still running today

  • @MrTiagovla 3 years ago

    Just jump to the last 10min.

  • @notgate2624 2 years ago +1

    Way too high-level. I feel like anyone who has heard of LLVM wouldn't get a lot out of this. No details were given on how it WORKS. He just talks about what it does.
    How to draw an owl:
    1) Draw a circle
    2) Draw the rest of the owl

  • @makerofstartup7902 2 years ago

    This video is for complete youngsters, and once you see the fib call at 8:57 you are pretty safe to close this video.
    Tip: modern software uses graphics processing, input processing and much more, not fib algorithms or calls.

  • @astaghfirullahalzimastaghf3648

    I thought every lecturer at this university was good.
    Unfortunately some are pure capitalists or mercenaries
    who don't know how to give a proper lecture.

    • @ASCENDANTGAMERSAGE 2 years ago +10

      What?

    • @josephphillips865 2 years ago

      @ASCENDANTGAMERSAGE This university has very good lecturers; however, many universities seem to be all about the money rather than providing a quality education that best serves the needs of students. Ex: A school that charges a bunch of money while having low standards for hiring instructors. Students might get a degree but likely will not have credits that will transfer to a properly accredited university.

  • @illonggoako1372 4 years ago +2

    To Mark Zuckerberg this is just history... museum information..

    • @tratbagd4500 4 years ago +7

      What are you talking about ?

    • @Itachi.Uchiha.Offical 3 years ago +7

      Yes, what are you talking about? :D

    • @piggubiggu5324 3 years ago +1

      He's saying that Mark Zuckerberg thinks all of this is garbage. He's probably right.

    • @starc0w 2 years ago +9

      @piggubiggu5324 No, he's definitely not right about that.
      Compilers don't fall from the sky. And someone has to understand the basics. That is essential.
      This knowledge is very valuable and important.

    • @Raison_d-etre 2 years ago +2

      He also said he was too busy to read books. He wouldn't care about politics but for its impact on his company. Why would you think Zuckerberg is a good judge of anything other than what would improve his company's bottom line?

  • @illonggoako1372 4 years ago +1

    Obsolete

    • @doggo660 4 years ago +6

      lmao what?

    • @mvisperas 3 years ago +10

      Assembly is not obsolete. The only language a CPU knows is the machine codes. Somebody has to write those compilers.

    • @vladusa 2 years ago +7

      whoever this commenter is has no idea what the absolute shit he's doing

    • @davidomar742 2 years ago +7

      go back to writing your cute little JavaScript, kid

    • @vladusa 2 years ago

      @davidomar742 you talking about me?

  • @ridwanm5789 2 years ago +1

    gcc -S untitled.c > untitled.s (am I correct?)

  • @thomasclapton2010 1 year ago

    Amish man is my new favorite instructor right now