What Happens After The Compiler in C++ (How Linking Works) - Anders Schau Knatten - C++ on Sea 2023

  • Added 10 Sep 2023
  • cpponsea.uk/
    ---
    We all know roughly what the compiler does: it translates your source code into machine code. Thanks to Compiler Explorer, many of us now even look at the generated assembly code.
    But wait a minute, that code is full of labels and function names, and the CPU knows of no such things! Most of these are also defined in different compilation units, so how can we jump to code when we don't know where it comes from? And even for our own compilation unit, how can the compiler know where in memory the machine code will eventually be loaded, so it can generate the right jumps? Even worse, what if that function comes from a dynamic shared object?
    This talk gives an introduction to how the compiler, linker, loader and operating system cooperate to get from a compilation unit to a running process. We'll look at static and dynamic linking, relocations, position-independent code, sections and segments, and virtual memory. The talk covers Linux only, but similar principles apply on Windows and Mac.
    ---
    Slides: github.com/philsquared/cppons...
    Sponsored By think-cell: www.think-cell.com/en/
    ---
    Anders Schau Knatten
    Anders started programming in Turbo Pascal in 1995, and has been programming professionally in various languages since 2001. He's currently a principal developer at Ascenium, working on a new general-purpose CPU design. He's the author of cppquiz.org and the blog C++ on a Friday.
    ---
    C++ on Sea is an annual C++ and coding conference, in Folkestone, in the UK.
    - Annual C++ on Sea, C++ conference: cpponsea.uk/
    - 2023 Program: cpponsea.uk/2023/schedule/
    - Twitter: / cpponsea
    ---
    YouTube Videos Filmed, Edited & Optimised by Digital Medium: events.digital-medium.co.uk
    #cpp #cpponsea #compiler
  • Science & Technology

Comments • 22

  • @deckard5pegasus673 7 months ago +5

    50:14
    -fpic = position-independent code that may be subject to a machine-specific limit on the size of the GOT.
    -fPIC = position-independent code with no limit on the size of the GOT.

  • @hedgechasing 8 months ago +17

    Around 8:50, the mov eax, 0 before the call is not about main returning zero when you don't specify a return value; it is part of the C ABI. The functions here are written with nothing in the parentheses, as is normal in C++, but in C empty parentheses do not mean "no arguments": they declare a K&R-style function with an unknown number of arguments of unknown types. The actual definition would need to specify its parameters in order to use them, but callers could just write extern void whatever() to declare it, since K&R function calls are not type checked. What the 0 specifically represents is the number of vector registers (usually holding floating-point values) used to pass arguments to the function. Variadic functions need to know how many vector registers to save, and that value gives them an upper bound. If the empty parens were replaced with (void), the mov eax, 0 would go away even at -O0; without that change it will persist even at higher optimization levels (assuming the two functions are in separate translation units, so the call doesn't get inlined).

    • @andersknatten 7 months ago +3

      Thanks for correcting that! I had forgotten about this difference between C and C++. I mostly write C++; I guess it shows. :)

  • @Byynx 3 months ago +1

    This video is a gem!!!

  • @nitsanbh 8 months ago +4

    I learned so much from this talk. Thank you!

  • @widnyj5561 8 months ago +2

    The part about function calling near the end was the most interesting

  • @alx9r 8 months ago +3

    I can also recommend James McNellis’ “Everything you wanted to know about DLLs” on this topic.

  • @denisfedotov6954 8 months ago +5

    Nice talk! However, lazy binding is disabled by default in many modern Linux distributions as an attack mitigation, so that the GOT entries used by the PLT are read-only during program execution. This is known as full RELRO.

  • @rezwanarefin3493 5 months ago

    18:05 Actually, the compiler does know which compute() function you are calling in this example, because compute() is defined in the same file. In fact, even at -O1 it will remove the call and inline compute(). The compiler wouldn't know that if compute() were not available in the same translation unit.

  • @VincentZalzal 8 months ago +1

    Excellent talk, the clearest I've seen on this topic!

    • @cpponsea 8 months ago

      Great to hear! Thank you for your comment.

    • @andersknatten 7 months ago +1

      Thank you! I'm very happy to hear that.

  • @Danielm103 8 months ago +1

    Awesome talk. I'd be interested to know what "Use Link Time Code Generation" and other optimizations like COMDAT folding and /OPT:REF do.

  • @dascandy 8 months ago +1

    @6:21 The middle line on the right has "48 89 e5", which is the start of your compute function; the bottom left has "b8 01 00 00 00", which moves 1 into eax, followed by 5d (pop rbp) and c3 (ret).

    • @andersknatten 8 months ago +2

      Yeah, that's what I'm trying to point out at @7:25 too.

  • @gustavbw 7 months ago

    53:20 (on lazy loading): I understand the concept as partially preparing data when it is declared and only loading the full extent when it is used (or not even then), or as disguising access to some data as actually fetching it first: it is declared and you can reference it, but it's not actually there. Instead, the instructions to get it there are.
    What you're describing sounds to me like caching, i.e. storing the output of some functionality in an easily accessible way so that you do not have to invoke said functionality again. But I might be off here (I also come from a very much non-systems/compiler background, so I completely understand if "lazy loading" is the term used for it in your field).
    Side question: would this mean that you could have runtime dynamic linking if you implemented cache invalidation for this step of the process (i.e. be able to change bits of the machine code as stored on disk, which would take effect when the invalidation occurs)?

    • @andersknatten 6 months ago

      Yes, lazy loading is a good way to describe this!
      I guess you could do some sort of runtime dynamic linking if you had some way of resetting the GOT to point back into the PLT stub and then convince the dynamic linker/loader to load something different next time. Provided that you have prepared GOT/PLT entries for everything during compilation. Depending on what you mean by runtime dynamic linking of course, I'm just replying very generally here.
      Note, btw, that we never change any *machine code* here, we only change data. It's just pointers in the GOT that are updated, from pointing at the stubs in the PLT to pointing at the real functions.
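
The point that only data changes can be modeled in ordinary C++ with a function pointer. This is a toy sketch, not what the real PLT or dynamic linker does: `got_slot`, `resolver_stub`, `real_compute` and `call_compute` are hypothetical names invented for illustration, and the "symbol lookup" is just an assignment.

```cpp
#include <cassert>

// Hypothetical stand-in for a function that lives in a shared object.
int real_compute(int x) { return x * 2; }

// Counts how often the simulated dynamic linker had to resolve the symbol.
int resolve_count = 0;

int resolver_stub(int x);  // forward declaration

// Simulated GOT slot: plain writable data holding a code address.
// It starts out pointing at the stub, like an unresolved entry.
int (*got_slot)(int) = resolver_stub;

// Simulated PLT stub plus dynamic linker: "look up" the real function,
// patch the slot (data only, no code is modified), then forward the call.
int resolver_stub(int x) {
    ++resolve_count;
    got_slot = real_compute;
    return got_slot(x);
}

// What a call site does: always an indirect call through the slot.
int call_compute(int x) { return got_slot(x); }

// Usage sketch: only the first call pays for resolution.
void demo() {
    assert(call_compute(3) == 6);  // goes through the stub, patches the slot
    assert(call_compute(4) == 8);  // goes straight to real_compute
    assert(resolve_count == 1);
}
```

Resetting `got_slot` back to `resolver_stub` would force the next call to resolve again, which is roughly the "reset the GOT" idea from the reply above.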

  • @rudalert 8 months ago

    Thank you for the interesting talk!
    Question about the last chapter: will the loader copy ("load") the function from the shared object into the .got section? I am confused how the state (if the function has any) is differentiated between the processes using the same shared object.

    • @andersknatten 7 months ago +1

      What kind of state are you thinking of? If you're thinking of function arguments and local variables, these go on the stack or in registers, which are unique to each process and in fact each invocation in that process. If you're thinking of local static variables, these go in data sections like `.data`, which each process gets a unique copy of. Only the read-only segments are shared between processes.
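
The per-process copy of writable data can be observed with a small experiment. This is a sketch under assumptions: it needs a POSIX system, it uses `fork()` as a stand-in for two processes that map the same object, and `next_id` and `parent_id_after_child_calls` are hypothetical names. The child's writes land in its own copy-on-write pages, so the parent's counter is untouched.

```cpp
#include <cassert>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical function with state: the code itself is read-only, but the
// counter lives in a writable data section.
int next_id() {
    static int counter = 0;
    return ++counter;
}

// Forks a child that calls next_id() three times, then reports what the
// parent's own next_id() returns afterwards. Returns -1 if fork() fails.
int parent_id_after_child_calls() {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        next_id(); next_id(); next_id();  // modifies only the child's copy
        _exit(0);
    }
    int status = 0;
    waitpid(pid, &status, 0);  // wait until the child has finished writing
    return next_id();          // the parent's counter was never touched
}

// Usage sketch: valid for the first call in a fresh process.
void demo() {
    assert(parent_id_after_child_calls() == 1);
}
```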

  • @mikefochtman7164 7 months ago

    Boy, this explains a lot of nitty-gritty details. We had an application that required several separate processes to have access to a large block of common memory (about 64 KB). We did this by defining a large int array in a shared object and initializing it to non-zero. This was 20-some years ago; it might not still work, I don't know. But by initializing it, the array was put in the shared .data segment, so each process had access to the same large array and one process could 'see' what another process wrote. (Yes, there were other concerns about collisions and such, but the gist of it was that the DLL and its .data segment were shared by all.)

    • @kayakMike1000 7 months ago

      Well, I suppose you could put a lock on that shared memory to ensure concurrent integrity.
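
For comparison, here is one way two processes can deliberately share writable memory on a POSIX system. Note that this is a different technique from the one described in the thread: it uses an anonymous `MAP_SHARED` mapping rather than a shared object's data segment, and an atomic counter rather than a lock. The name `shared_total` is hypothetical, and the sketch assumes `std::atomic<int>` is lock-free on the target, which is what makes it usable across processes.

```cpp
#include <atomic>
#include <cassert>
#include <new>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

// Parent and child each add `increments_each` to one counter that lives in
// memory visible to both. Returns the final total, or -1 on failure.
int shared_total(int increments_each) {
    const size_t size = sizeof(std::atomic<int>);
    void* mem = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;

    // Construct the counter inside the shared mapping.
    auto* counter = new (mem) std::atomic<int>(0);

    pid_t pid = fork();
    if (pid < 0) {
        munmap(mem, size);
        return -1;
    }
    if (pid == 0) {
        for (int i = 0; i < increments_each; ++i) counter->fetch_add(1);
        _exit(0);
    }
    for (int i = 0; i < increments_each; ++i) counter->fetch_add(1);
    waitpid(pid, nullptr, 0);  // make sure the child's writes are done

    int total = counter->load();
    munmap(mem, size);
    return total;
}

// Usage sketch: both writers' increments are visible in the total.
void demo() {
    assert(shared_total(1000) == 2000);
}
```

Unlike a private data section, writes through this mapping are seen by both processes, so the atomic increments are what keep concurrent updates from being lost.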