Linkers, Loaders and Shared Libraries in Windows, Linux, and C++ - Ofek Shilon - CppCon 2023

  • Added 22 Jan 2024
  • cppcon.org/
    ---
    github.com/CppCon/CppCon2023
    This talk gives a crash intro to linkers, loaders and the layout of program binaries, and explores just enough internals to understand some observable differences in C++ builds between Linux and Windows.
    We will discuss the GOT, the PLT, symbol visibility, interposition, lazy binding and more. There will be a lot of details, but also a lot of 'why's and opinions.
    We will also touch/rant on what the C++ standard has to say on adjacent matters. There's a good chance you've heard before "shared libraries are outside the scope of the standard", but it doesn't mean what you think it does.
    ---
    Ofek Shilon
    A Mathematics MA by training, but a 20Y C++ developer, writer and speaker in both the Linux and MS universes. Fascinated by compilers, debuggers and pretty much anything low level. Fiercely hated by his cat for no apparent reason.
    ---
    Videos Filmed & Edited by Bash Films: www.BashFilms.com
    YouTube Channel Managed by Digital Medium Ltd: events.digital-medium.co.uk
    ---
    Registration for CppCon: cppcon.org/registration/
    #cppcon #cppprogramming #cpp
  • Science & Technology

Comments • 38

  • @origamibulldoser1618 4 months ago +20

    Once again a very interesting talk from Ofek. Excellent speaker. Clear and concise.

  • @jopa19991 4 months ago +5

    Amazing talk. The more complex your project gets, the more useful this information becomes.

  • @sjswitzer1 4 months ago +6

    Excellent talk. This thoroughly addresses so many nagging questions I’ve had for ages.

  • @gjvdspam 3 months ago +1

    I like this talk a lot. Very good speaker. Clear, calm, nice to listen to, and interesting.

  • @AlexVaisman 4 months ago +4

    Great presentation. Special thanks for the resources list.

  • @niklkelbon3662 4 months ago +3

    Thank you, I think this is the most useful hour at CppCon.
    About interposition: both GCC and Clang use it to create singletons and plugins, and this functionality does not work on Windows =(
    Another example is pointers to global constants, for example std::type_info objects ('type id'): on Linux you can compare pointers to compare types, while on Windows you are required to store and compare strings.
    So it is a VERY big thing for removing the difference between dynamic and static libraries: on Linux you literally cannot 'observe' it, while on Windows you are required to know about it.
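
The two ways of comparing types mentioned in this comment can be sketched as follows. This is only an illustration: the helper names `same_type_by_address` and `same_type_by_name` are hypothetical, and the C++ standard itself does not promise that every `typeid` evaluation for one type yields the same `std::type_info` object, so the address-based check relies on platform behaviour.

```cpp
#include <cassert>
#include <cstring>
#include <typeinfo>

// Hypothetical helper: cheap identity check. It is only reliable where the
// platform keeps a single type_info object per type for the whole process.
bool same_type_by_address(const std::type_info& a, const std::type_info& b) {
    return &a == &b;
}

// Hypothetical helper: slower fallback that compares the implementation-defined
// type names, which does not depend on type_info objects being unique.
bool same_type_by_name(const std::type_info& a, const std::type_info& b) {
    return std::strcmp(a.name(), b.name()) == 0;
}

// Small self-check that stays within what a single binary can demonstrate.
bool demo_type_comparison() {
    assert(same_type_by_name(typeid(int), typeid(int)));
    return !same_type_by_address(typeid(int), typeid(double));
}
```

In portable code, `typeid(a) == typeid(b)` or `std::type_index` leaves the choice of strategy to the standard library.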

  • @athavale1989 4 months ago +2

    Legendary talk, I will come back to this from time to time

  • @jankrasinski1265 3 months ago +1

    Very interesting talk! To answer Ofek's question: I sometimes use interposition/LD_PRELOAD for debugging purposes, to block some irrelevant calls and reach a particular execution flow, or to add a simple debug version of new/delete with extra logging, canaries, etc. This is quite useful when working on custom hardware without ASan. But I have never found a case where this was treated as a feature in production code.
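
A "simple debug version of new/delete" of the kind described here might look like the sketch below. It replaces the global allocation functions inside one program, which the C++ standard permits, and only counts calls. The counters and the `live_allocations` helper are hypothetical names, and this is not an LD_PRELOAD interposer, which would instead be built as a separate shared object.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counters for this sketch; constant-initialized, so they are
// safe to touch even from allocations made during static initialization.
static std::atomic<std::size_t> g_allocs{0};
static std::atomic<std::size_t> g_frees{0};

// Replacement global operator new. Avoid anything that may itself allocate
// (such as iostreams) inside these functions.
void* operator new(std::size_t size) {
    if (size == 0) size = 1;
    void* p = std::malloc(size);
    if (!p) throw std::bad_alloc{};
    g_allocs.fetch_add(1, std::memory_order_relaxed);
    return p;
}

// Replacement global operator delete, paired with the malloc above.
void operator delete(void* p) noexcept {
    if (!p) return;
    g_frees.fetch_add(1, std::memory_order_relaxed);
    std::free(p);
}

// Sized variant forwards to the unsized one.
void operator delete(void* p, std::size_t) noexcept {
    ::operator delete(p);
}

// Number of allocations that have not been freed yet.
std::size_t live_allocations() {
    return g_allocs.load() - g_frees.load();
}

// Small self-check: one direct allocation raises the live count by one,
// and freeing it brings the count back.
bool demo_counts_allocation() {
    const std::size_t before = live_allocations();
    void* p = ::operator new(32);
    assert(p != nullptr);
    const bool counted = (live_allocations() == before + 1);
    ::operator delete(p);
    return counted && live_allocations() == before;
}
```

The check calls `::operator new` directly because compilers are allowed to elide matching `new`/`delete` expressions.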

  • @JohnDoe4321 4 months ago +12

    One consequence of the differences in Windows vs. POSIX symbol handling is that there is no such thing as *the* C runtime or *the* STL on Windows. Every module (EXE or DLL) chooses which C/C++ standard library it uses. You can't assume that memory allocated by malloc/new in one module can be released by free/delete in a different module. You can't assume that STL objects like std::vector can be shared across modules.
    You can avoid these problems if you control all of the modules, can dictate how they're built, and force them to use the same standard runtimes. But it only takes one external DLL dependency to ruin that plan.

    • @ark.1424 4 months ago

      Yes, this is how the STL on Windows works. I don't see what is wrong with that.
      Isn't it great that a 3rd-party DLL can choose which STL to use?
      Imagine you are linking 10 dynamic libs from 10 vendors: on Windows you are fine with any versions as long as the interface remains the same. On Linux you additionally have to load only libs compiled against the same STL version.
      So on Windows it is enough to have one DLL for any program, while on Linux you need one lib per compiler/STL in use.
      Another aspect is maintenance (bugfixes, optimization): on Windows you can use the latest and greatest version, while on Linux it depends on whether the changes are part of the required version.
      It is all about design; both have pros and cons for sure.
      On Windows this approach can use more RAM for shared libs, but the Windows kernel also has an optimization that loads a DLL's code section into RAM just once and then maps it into all processes. Not a real issue.
      What else is wrong with the Windows approach?
      IMHO it is more flexible and easier to maintain.

    • @ark.1424 4 months ago

      For Visual C++, the C and C++ runtimes ship in the Visual C++ Redistributable: "The Visual C++ Redistributable installs Microsoft C and C++ (MSVC) runtime libraries."
      It is up to the developer whether an EXE or DLL uses the dynamic runtime or the static runtime (compiled into the binary).

    • @JohnDoe4321 4 months ago +2

      @ark.1424 There is nothing "wrong" with how Windows handles this. But it does force you to think about library design, and what types are used in public interfaces. Some things that might be a "best practice" on POSIX are a "requirement" on Windows.
      It isn't just the STL. You can't freely pass C runtime "objects" across DLL boundaries either (e.g. FILE* or int file descriptors). And dynamically allocated memory must be freed by the module that allocated it. I've encountered many C libraries that originated on POSIX that have problems with this.

    • @maksymiliank5135 3 months ago +1

      @JohnDoe4321 So what you are saying is that every lib on Windows should export its own version of a "free" function? For example: void some_object_free(some_object* ptr) { free(ptr); }
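
The pattern asked about here, where the module that allocates an object also exports the function that releases it, might be sketched as below. Everything lives in one file for illustration. `some_object` and `some_object_free` come from the comment, while `some_object_create`, `some_object_deleter`, `some_object_ptr` and `make_some_object` are hypothetical names. The sketch uses new/delete rather than malloc/free, which does not change the idea.

```cpp
#include <cassert>
#include <memory>

// --- What the library would expose ---
struct some_object {
    int value;
};

// Allocation and release both happen inside the library, so they always use
// the same runtime, whatever runtime the caller was built against.
extern "C" some_object* some_object_create(int value) {
    return new some_object{value};
}

extern "C" void some_object_free(some_object* ptr) {
    delete ptr;
}

// --- What a client could do (hypothetical convenience wrapper) ---
struct some_object_deleter {
    void operator()(some_object* p) const noexcept { some_object_free(p); }
};

using some_object_ptr = std::unique_ptr<some_object, some_object_deleter>;

inline some_object_ptr make_some_object(int value) {
    return some_object_ptr{some_object_create(value)};
}

// Small self-check: the object is released through the library's own
// function when the unique_ptr goes out of scope.
int demo_value() {
    some_object_ptr obj = make_some_object(42);
    assert(obj != nullptr);
    return obj->value;
}
```

Wrapping the exported pair in a `std::unique_ptr` with a custom deleter keeps client code from calling `delete` or `free` on the pointer by accident.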

  • @djouze00 4 months ago +2

    Amazing talk about a very interesting subject!

  • @MrAmorphX 4 months ago +6

    After 10 years of C++ development this topic is still highly relevant for me. Nevertheless, linkage types, symbol visibility, etc. seem like pretty basic knowledge, yet not many people know them. Why? It's an open question. Maybe most people are mostly occupied with business logic.
    BTW, while build systems may cover all the aforementioned switches, which are not really related to the C++ language, the __attribute__ and __declspec things are related to it, and their syntax seems inconvenient to me.

    • @nullplan01 4 months ago +1

      Thankfully, both Windows and POSIX allow you to put the information into auxiliary files (DEF files in Windows, the dynamic list or version script in POSIX), so you don't have to clutter up the source code with them.
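
For readers who do keep the annotations in the source, a common way to hide the __declspec/__attribute__ difference is a single export macro, sketched below. `MYLIB_API`, `MYLIB_BUILDING` and `mylib_add` are hypothetical names. In a real project `MYLIB_BUILDING` would normally be defined by the build system only while compiling the library itself, and on GCC/Clang the attribute mostly matters when building with `-fvisibility=hidden`.

```cpp
#include <cassert>

// Assumption for this single-file sketch: we are "building the library".
// Normally this comes from the build system (e.g. -DMYLIB_BUILDING).
#define MYLIB_BUILDING 1

#if defined(_WIN32)
#  if defined(MYLIB_BUILDING)
#    define MYLIB_API __declspec(dllexport)   // building the DLL
#  else
#    define MYLIB_API __declspec(dllimport)   // consuming the DLL
#  endif
#elif defined(__GNUC__)
#  define MYLIB_API __attribute__((visibility("default")))
#else
#  define MYLIB_API
#endif

// Public declaration, as it would appear in the library header.
MYLIB_API int mylib_add(int a, int b);

// Definition, as it would appear in the library source.
int mylib_add(int a, int b) {
    return a + b;
}

// Small self-check that the annotated function is callable as usual.
bool demo_export_macro() {
    assert(mylib_add(1, 1) == 2);
    return mylib_add(2, 3) == 5;
}
```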

  • @ElPikacupacabra 4 months ago +3

    Excellent talk!

  • @rostislavstelmach9168 3 months ago

    Great talk! Thank you!

  • @kspangsege 4 months ago

    Learned a lot here. Thanks for the great talk.

  • @darranrowe174 2 months ago

    29:05 There is one important restriction for lazy binding/delay loading on Windows: you cannot delay load kernel32.dll. The reasons are twofold. First, the delay loading mechanism cannot delay load individual functions from a library. Second, the runtime dynamic linking functions are exported from kernel32.dll, and the delay loading mechanism is implemented in terms of runtime dynamic linking.

  • @ark.1424 4 months ago +1

    Great, interesting talk!
    I did not know there is so much runtime overhead for calling functions on Linux compared to Windows.

    • @denisfedotov6954 4 months ago

      Linux indeed does have runtime overhead, but I think it wasn't covered in the talk. On Windows, if you declare an imported function as __declspec(dllimport), the compiler is able to effectively inline the PLT function so that there are 2 branches on Linux versus 1 branch on Windows.

  • @victoreijkhout7115 4 months ago

    Very interesting stuff!

  • @spacechild2 4 months ago

    Great talk!

  • @LucasSantos-ji1zp 4 months ago +1

    4:50
    Windows has a strategy for sharing executables even when they need relocation. Raymond Chen talks about it in the article "What is DLL import binding?".

    • @JohnDoe4321 4 months ago +1

      Import binding is rarely used these days. Address space randomization (ASLR) made it obsolete.

  • @ark.1424 4 months ago +1

    I did not get what happens with the order of name resolution for lazy binding on Linux.
    Let's say we have 2 delay-loaded functions from 2 libs.
    Let's say func1 is called first, so lib1 is loaded. Can lib1 also resolve func2?
    In another run of the exe, func2 is called first, so lib2 is loaded before lib1.
    Meaning func2 can have 2 different implementations (depending on which of func1 or func2 is called first), which is a side effect of runtime execution, right?

    • @Bokto1 4 months ago

      Lazy binding of symbols != lazy loading of shared objects. AFAIK, if we leave aside dlopen() and LD_PRELOAD, the order of resolution is determined by the filesystem state.
      And the three cases where it's not (dl, preload, messing with the FS) are a direct request for runtime side effects.

  • @Sarbajit-ye8pj 4 months ago

    Does anyone have a link to the Discord channel where this presentation was uploaded?

    • @Roibarkan 4 months ago +3

      Go to the github page mentioned in the video description, where all the slides have been uploaded, and look at Presentations/Shared_Librariess_CppCon_2023.pdf

  • @MonochromeWench 4 months ago +2

    Windows DLL design predating C++ considerations: yeah, mostly. DLLs started out as a solution for memory-constrained 16-bit systems (optimising for size or speed was not a choice; you had to do both). The requirements of a 16-bit platform really don't match a modern 64-bit platform, but Microsoft won't make breaking changes here, and neither, it seems, will anyone else when it comes to shared libraries.

  • @Bokto1 4 months ago

    I truly detest what the presenter suggests. Interposition and lazy loading make debugging, testing and tooling binaries easier. Removing them from the defaults of a toolchain would make things like fuzzing and working with legacy binaries hell. Linux caters to developers, and this helps produce more good-quality software.
    Otherwise a great talk, a treasure trove of info, thanks.

  • @yaroslavpanych2067 4 months ago

    16:00 OK, that looks like BS. What does Windows have to do with that C++ standard rule? C++ rules end right after the compiler is done. What do import and export have to do with what already happened during creation of the DLL? When a DLL is linked, all those questions about new are resolved. The symbol resolution rules fully allow conforming to that rule: object files are searched first, then import libraries and static libraries. Therefore the rule is followed.

    • @ALivingDinosaur 4 months ago

      Your argument is valid only if you apply the term "program" to a single DLL. The fact that a shared library can define its own "operator new" and/or "operator delete" without exporting them means foo() cannot safely call "delete" for a pointer returned by "new" in bar() if foo() and bar() happen to be in different binaries. In Linux that is non-default behavior but the order in which symbols with "default" visibility are resolved by the loader makes the clause possible. In Windows exported symbols are visible only to direct importers so there's no not-too-hacky way of making sure an EXE itself and every DLL it loads use the same implementation of memory allocation functions (even standard ones if different binaries are linked to different versions of VCRedist libraries).

    • @yaroslavpanych2067 4 months ago

      @ALivingDinosaur Well, again, please tell me whether a shared library is part of the "program" according to the definition in the ISO C++ standard. Probably not, or it is "implementation defined".
      The "interposition" thingy that Ofek talks about works with MSVC on Windows with no problems within a binary: regardless of where operator new/delete was defined or invoked, the whole program has access to it. And yes, I stick to the opinion that the described behavior is defined within a binary: exe, dll, whatever. Technically C++ doesn't care much what happens at link time as long as program behavior remains compliant. Load time is even less interesting.
      The fact that Linux, ELF, or whatever it is, has the ability to interpose functions at load time (if the binary allows it) is just a feature of that specific platform, that specific implementation, at load time. It is nice to have LD_PRELOAD in some cases, I agree, but as Ofek said, "Security books a vacation at this point". I don't want my module to be that easy to hijack with a 1-line script.

  • @u28OO 4 months ago +1

    I hate msvc!!!!
    Why do they have a billion different ABIs!!!!!
    wtf!!!

    • @AlexVaisman 4 months ago

      At least since 2015, C++ name mangling has been stable.

    • @araarathisyomama787 4 months ago

      I really hate MSVC from an interface/compiler/standard compliance standpoint as well, and I think GCC and Clang do it so much better, yet IMHO Microsoft did get linking more right than Linux. Sometimes it's just internal compiler errors out of nowhere, and sometimes it works surprisingly well, like when I wrote 2 DLL proxies for a 25-year-old game and it just worked. Maybe I was just lucky.