The Hidden Performance Price of C++ Virtual Functions - Ivica Bogosavljevic - CppCon 2022

  • Added 18 Mar 2024
  • A New Version Of This Talk Has Been Uploaded Here: • Introduction to Hardwa...
    cppcon.org/
    ---
    The Hidden Performance Price of C++ Virtual Functions - Ivica Bogosavljevic - CppCon 2022
    github.com/CppCon/CppCon2022
    The virtual function mechanism is one of the core concepts of C++; however, it does come with a performance price. But how high is that price? In this talk we are going to dissect virtual functions to understand when they are slow and why they are slow. We will investigate how well virtual functions use the CPU's underlying resources and how good the compiler is at optimizing virtual functions. We will also present several techniques to help you speed up programs that use virtual functions.
    ---
    Ivica Bogosavljevic
    Senior Software Engineer with 10 years of experience in Linux and bare-metal embedded systems. His professional focus is application performance improvement: techniques used to make your C/C++ program run faster by using better algorithms, better exploiting the underlying hardware, and making better use of the standard library, programming language, and the operating system. Writer for a performance-related tech blog: johnysswlab.com
    ---
    Videos Streamed & Edited by Digital Medium: online.digital-medium.co.uk
    #cppcon #programming #functions
  • Science & Technology

Comments • 38

  • @stavb9400 1 year ago +19

    I think the takeaway of this talk is focus on your memory allocation optimization first, then take a look to see if removing virtual functions actually makes a difference

  • @kamilziemian995 1 year ago +1

    Valuable talk.

  • @transire3450 1 year ago +19

    9:23 - Can also be allocated on stack. Placement of the object does not affect vcalls.

    • @user-wl1sn8qr5f 1 year ago

      Was a bit discouraged about this moment too. Even took some time to look around and verify - it's all right with vcalls ))

    • @llothar68 10 months ago +1

      That's not what he is saying. He is talking about activation; bad wording. He means the case where the compiler must use the VMT and can't deduce the right call, as it can for an object on the stack.

  • @NielsProsch 1 year ago +3

    A really good presentation! To the point ☝️ Bravo!

  • @acestapp1884 1 year ago +3

    Virtual calls apply primarily to heap objects because the compiler knows the exact type of the stack object and will devirtualize any calls.

    • @user-wl1sn8qr5f 1 year ago

      This seems to apply only to straightforward access through an object or pointer. If one has, for instance, two derived objects and a pointer to base that is assigned one or the other in an "if" statement, then using the pointer after that "if" will require the full vcall mechanics, right?
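The two cases discussed in this thread can be sketched as follows. This is a minimal illustration, not code from the talk; `Shape`, `Circle`, `Square`, `known_type` and `runtime_type` are hypothetical names. Both objects live on the stack in both functions, so what differs is only whether the compiler can see the dynamic type at the call site.

```cpp
#include <string>

struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const = 0;
};

struct Circle final : Shape {
    std::string name() const override { return "circle"; }
};

struct Square final : Shape {
    std::string name() const override { return "square"; }
};

// The dynamic type is visible here: c is a local Circle, so the compiler
// is allowed to call Circle::name directly and may inline it.
std::string known_type() {
    Circle c;
    return c.name();
}

// The dynamic type depends on a runtime value. Unless the optimizer can
// prove which branch was taken, the call is dispatched through the vtable,
// even though both objects are on the stack.
std::string runtime_type(bool pick_circle) {
    Circle c;
    Square s;
    const Shape* p = &s;
    if (pick_circle) {
        p = &c;
    }
    return p->name();
}
```

Whether a particular call is actually devirtualized depends on the compiler and optimization level; inspecting the generated assembly is the reliable way to check.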

  • @CartoType 1 year ago +24

    Error at 10 minutes. It is stated that objects with virtual functions have to be allocated on the heap. That is of course untrue, and there are various useful coding strategies which involve creating objects on the stack and passing pointers to them to functions taking the base class.
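A small sketch of the strategy described in this comment, assuming hypothetical types `Shape`, `Rect` and `Tri`: the objects have automatic storage, and only pointers or references to the base class are passed around.

```cpp
#include <array>

struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

struct Rect : Shape {
    double w, h;
    Rect(double w_, double h_) : w(w_), h(h_) {}
    double area() const override { return w * h; }
};

struct Tri : Shape {
    double base, height;
    Tri(double b_, double h_) : base(b_), height(h_) {}
    double area() const override { return 0.5 * base * height; }
};

// Takes the base class; it neither knows nor cares where the object lives.
double area_of(const Shape& s) { return s.area(); }

double total_area_on_stack() {
    Rect r(2.0, 3.0);  // automatic storage, no new/delete involved
    Tri t(4.0, 5.0);
    std::array<const Shape*, 2> shapes{&r, &t};
    double sum = 0.0;
    for (const Shape* s : shapes) {
        sum += area_of(*s);  // virtual call on a stack object
    }
    return sum;
}
```

The pointers must not outlive the objects they point to, which is the lifetime concern raised in the replies.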

    • @ZeroPlayerGame 1 year ago +10

      I see you've learned from compilers by stopping on the first error.

    • @petera.schneider2140 1 year ago +1

      Yes, true; but his "error" boils down to omitting "typically", because the life time of (copied) pointers is detached from the life time of the objects they point to, so "typically" you need the objects dynamically allocated. (Think pointers in containers or the factory pattern.)

    • @johnmcleodvii 1 year ago

      The vtable is known at compile time, and is of fixed size. The vtable will be in the code segment, not either the heap or the stack. Yes, there is a cost to virtual functions, but if you need them, use them.

    • @llothar68 10 months ago

      He means out of lexical context where a compiler can deduce the type.

    • @oraz. 7 months ago

      Even if they are allocated on the heap, isn't what he's describing not actually a problem of virtual functions themselves, just a possibly related scenario?

  • @williamdavidwallace3904

    My expectation is that AMD's big increase in cache size might obviate some of the performance penalty of heap allocation of objects.
    With processors that have an adequate number of registers (32), the compiler optimizer could move the loads for the addresses of the virtual methods far enough apart that the performance hit is much reduced.

  • @AaronNGray 1 year ago +1

    What speed CPU? Also, if you are calling the same virtual function, then it's all going to be in Level 1 cache anyway. The real overheads occur with real code, when things are not in cache.

  • @KhalilEstell 1 year ago +1

    I don't understand why so many presentations about virtual always go down the "heap allocation" route. You don't, under any circumstances, need to use the heap to use virtual. You can if you like, but it is not necessary. If you avoid it, you avoid large portions of what is discussed here.

  • @oraz. 7 months ago

    It sounds like he's saying scenarios where you might use virtual functions are possibly ones where the cache is not coherent, but that's not a property of the virtual function calls themselves. Am I crazy?

  • @mminich87 6 months ago

    I think the "153 ms" and "126 ms" (milliseconds) times for the "Short and fast function" (at video time 7:25) should surely be more like 153 ns and 126 ns (nanoseconds). Milliseconds on modern systems are a very long time!

  • @AaronNGray 1 year ago

    You don't need virtual dispatch on object arrays/vectors, as the compiler uses normal function call "dispatch".

  • @pawello87 1 year ago +5

    Great material. I really like DOD, it seems to me, ironically, more natural than OOP which is supposed to 'model reality'.
    BTW Virtual functions go hand in hand with polymorphism, which consists in the fact that we do not know what type is hidden under the pointer. So how do we sort a vector by type if we don't know the types? :)

    • @niklkelbon3662 1 year ago

      Sort by typeid or by vtable pointer (not exactly the same as sorting by type, but the most efficient).

    • @SkorjOlafsen 1 year ago

      Sometimes you can change the design in order to generate the vector by type.
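The typeid suggestion in this thread can be sketched with `std::type_index`, which gives `std::type_info` a usable ordering. This is an illustration under assumptions: `Base`, `A`, `B` and `sort_by_dynamic_type` are hypothetical names, and reading the vtable pointer directly (the other suggestion) is not portable C++, so it is not shown.

```cpp
#include <algorithm>
#include <memory>
#include <typeindex>
#include <typeinfo>
#include <vector>

struct Base {
    virtual ~Base() = default;
    virtual int id() const = 0;
};

struct A : Base { int id() const override { return 1; } };
struct B : Base { int id() const override { return 2; } };

// Groups objects of the same dynamic type next to each other, so that
// consecutive virtual calls in a later loop hit the same target.
// The relative order of the groups is implementation-defined.
void sort_by_dynamic_type(std::vector<std::unique_ptr<Base>>& v) {
    std::stable_sort(v.begin(), v.end(),
        [](const std::unique_ptr<Base>& l, const std::unique_ptr<Base>& r) {
            const Base& lb = *l;
            const Base& rb = *r;
            // typeid on a polymorphic reference yields the dynamic type.
            return std::type_index(typeid(lb)) < std::type_index(typeid(rb));
        });
}
```

The code never needs to name the concrete types; it only needs some consistent ordering of them, which answers the question of how to sort without knowing what is behind the pointer.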

  • @cyanmargh 1 year ago

    At the start he stumbles as if he were being interrogated)) Though, to be fair, I doubt I could give a talk this long without stumbling myself.

  • @Dziaji 1 year ago +8

    I recommend against using virtual functions entirely, when possible. A cool trick I learned is to use CRTP to have your base class accept the derived class as a template argument. Then instead of making a virtual function in the base class where the compiler has to use the virtual table, you write the function in the base class so that it casts itself to the derived class and calls the derived class directly, with no performance loss. The 2 caveats for this method are:
    1. The base classes of the derived classes are all different types when using CRTP, so a function that accepts an instance of the base class as a parameter has to become a template function, so that the parameter can accept "const baseclass<Derived>& instance" instead of just "const baseclass& instance". This is not a big deal, and it actually helps in a lot of circumstances because you always explicitly know the derived class when working with the base.
    2. You can no longer make a container of base classes with different derived types without type erasure, because even the base classes are now different types, so creating a vector of them is not possible while keeping the information about what the derived types are. You can however, make a container of baseclass, and still call any functions directly on the base class that doesn't involve the derived class, so you basically just ignore the fact that you are referring to a class of type baseclass as an instance of baseclass. If practical, you can also just make multiple containers, 1 for each derived class, and then write some helper templates that accept template lambdas and loop through the different containers.
    If you really just need a single container that can hold multiple instances of the base that keep track of their derived class, then i think you have to use virtual functions, but look into replacing some with CRTP and I'm sure you will find some awesome techniques.
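A minimal sketch of the CRTP pattern this comment describes, including the first caveat. The names `Animal`, `Dog`, `Cat`, `speak_impl` and `greet` are hypothetical and chosen only for illustration.

```cpp
#include <string>

// CRTP base: the derived class is passed as a template argument, so the
// call to the derived implementation is resolved at compile time and no
// vtable is involved.
template <typename Derived>
struct Animal {
    std::string speak() const {
        return static_cast<const Derived&>(*this).speak_impl();
    }
};

struct Dog : Animal<Dog> {
    std::string speak_impl() const { return "woof"; }
};

struct Cat : Animal<Cat> {
    std::string speak_impl() const { return "meow"; }
};

// Caveat 1: Animal<Dog> and Animal<Cat> are unrelated types, so a function
// that accepts "any animal" has to be a template itself.
template <typename Derived>
std::string greet(const Animal<Derived>& a) {
    return "says " + a.speak();
}
```

Calling `greet(Dog{})` and `greet(Cat{})` instantiates two separate functions. Caveat 2 follows from the same fact: there is no common type to use for a `std::vector` that holds both.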

    • @Dziaji 1 year ago

      Also, you can just use a single class that has a "type" member and use a switch statement as control flow logic instead of letting the compiler do it with vtables. This can get a little messier with complicated code, and may require you to define functions that you never actually want called, just so that compilation works nicely with variadic templates, but that might be preferable to virtual functions in some circumstances because the implicit magic of virtual functions can be a little confusing to navigate.
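A sketch of the "type member plus switch" approach, assuming a hypothetical `Shape` struct with a `Kind` tag. All objects have the same type, so they can be stored by value in one contiguous vector.

```cpp
#include <vector>

enum class Kind { Square, Triangle };

// One concrete struct for every case; the tag selects the behavior.
struct Shape {
    Kind kind;
    double size;  // side length for Square, leg length for Triangle
};

double area(const Shape& s) {
    switch (s.kind) {
        case Kind::Square:   return s.size * s.size;
        case Kind::Triangle: return 0.5 * s.size * s.size;  // right isosceles
    }
    return 0.0;  // unreachable for valid tags; keeps compilers quiet
}

double total_area(const std::vector<Shape>& shapes) {
    double sum = 0.0;
    for (const Shape& s : shapes) {
        sum += area(s);  // direct call; the switch can be inlined into the loop
    }
    return sum;
}
```

The branch on `kind` is still data-dependent, so it can mispredict much as an indirect call can; what is gained is inlining and contiguous storage.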

    • @niklkelbon3662 1 year ago +2

      Type erasure allows you to use dynamic polymorphism only where it is really needed, without changing the code.
      CRTP by itself can't help here because it's still static polymorphism.
      Personally I use the AnyAny library (on GitHub) for type erasure; it has many advantages over virtual functions (fewer allocations, more readable code, etc.)

    • @dat_21 1 year ago +2

      @Dziaji I'd take slightly messier code over dealing with templates. Using mind trickery like CRTP means that you don't really need polymorphic behaviour. On top of that, if you use type erasure, it will have the same (if not more) overhead as virtual dispatch.
      A switch statement as an alternative to the vtable has the advantage of inlining, but it still has the same misprediction penalties. So it's not a huge win here, because you shouldn't virtualize small functions anyway. And a data-dependent switch won't let the compiler optimize nicely either.
      So the only true method of reducing virtual function overhead is amortization: doing more actual work per polymorphic call. Or avoiding them entirely by not over-generalizing. Which is what making containers of derived classes really is; it's the opposite of generalization.
      I mean, let's be real, how often do you need more than 3-4 derived classes that would justify complex template machinery?
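The amortization idea from this comment can be sketched by moving the loop inside the virtual function, so one indirect call covers a whole buffer. `Filter`, `Gain`, `run_per_sample` and `run_batched` are hypothetical names used only for this illustration.

```cpp
#include <cstddef>
#include <vector>

struct Filter {
    virtual ~Filter() = default;
    // Fine-grained interface: one virtual call per sample.
    virtual float apply(float x) const = 0;
    // Coarse-grained interface: one virtual call per buffer.
    virtual void apply_all(float* data, std::size_t n) const = 0;
};

struct Gain final : Filter {
    float g;
    explicit Gain(float g_) : g(g_) {}
    float apply(float x) const override { return g * x; }
    void apply_all(float* data, std::size_t n) const override {
        // The loop lives inside the override, so the per-element work is
        // plain code the compiler can optimize or vectorize.
        for (std::size_t i = 0; i < n; ++i) {
            data[i] = g * data[i];
        }
    }
};

void run_per_sample(const Filter& f, std::vector<float>& buf) {
    for (float& x : buf) {
        x = f.apply(x);  // buf.size() indirect calls
    }
}

void run_batched(const Filter& f, std::vector<float>& buf) {
    f.apply_all(buf.data(), buf.size());  // a single indirect call
}
```

Both functions produce the same result; the batched version pays the dispatch cost once per buffer instead of once per element.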

    • @christophclear1438 1 year ago +2

      Having a `type` member seems like a poor implementation of variant to me that mixes variant into the underlying types.
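For comparison, the same idea expressed with `std::variant` (C++17) keeps the tag out of the underlying types, which is the point this reply makes. A minimal sketch with hypothetical types `Square` and `Triangle`:

```cpp
#include <variant>
#include <vector>

// The alternatives are plain types that know nothing about each other
// or about any type tag.
struct Square   { double side; };
struct Triangle { double leg; };  // right isosceles triangle

using Shape = std::variant<Square, Triangle>;

// One overload per alternative; std::visit picks the right one at run time.
struct AreaVisitor {
    double operator()(const Square& s) const { return s.side * s.side; }
    double operator()(const Triangle& t) const { return 0.5 * t.leg * t.leg; }
};

double total_area(const std::vector<Shape>& shapes) {
    double sum = 0.0;
    for (const Shape& s : shapes) {
        sum += std::visit(AreaVisitor{}, s);
    }
    return sum;
}
```

The variant stores the discriminator itself, and forgetting to handle an alternative in the visitor is a compile-time error rather than a silent fall-through in a switch.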

    • @llothar68 10 months ago

      But this goes against my recommendation to avoid templates as much as possible

  • @__hannibaalbarca__ 9 months ago

    Who cares about performance when we have 20CoreCpu + 100...0CoreGpu + 1×xGRam +...acc+... .
    Remember the 90s, how we struggled.

  • @AaronNGray 1 year ago

    There are cache overflows at 64K, 1M, and 4M.

    • @4lpha0ne 1 year ago +1

      Depends on the CPU. It could also be nearly any combination of (32K|64K)/(256K|512K|1M|...)/(0M|2M|...).

  • @Antagon666 9 months ago

    Don't you dare write anything like this on Stack Overflow.
    Virtual is superior to everything else, and you are stupid if you try replacing it with custom variants or data-oriented design.

  • @AaronNGray 1 year ago +2

    The dispatch of virtual functions in C++ is slower than that of normal methods. This is for two reasons: the use of vtables rather than arrays of pointers to functions indexed by class/type (as used in Eiffel), and the lack of HotSpot-style dispatch mechanisms (as used in Self and Java).

    • @llothar68 10 months ago

      Thumbs up for someone who knows how Eiffel implements virtuals. Unfortunately it's not usable in C++ because it requires full system analysis and no DLL loading. Unless people start to work on linkers as much as on compilers.