Non-conforming C++: the Secrets the Committee Is Hiding From You - Miro Knejp - CppCon 2019

  • Published on 30 Sep 2019
  • CppCon.org
    Discussion & Comments: / cpp
    Presentation Slides, PDFs, Source Code and other presenter materials are available at: github.com/CppCon/CppCon2019
    -
    Non-conforming C++: the Secrets the Committee Is Hiding From You
    These days everyone talks about conforming and portable C++. Compiler vendors celebrate increasing conformance. Committee agents blind us with new shiny toys coming to the language. But there is a darker side to C++. A C++ you are not supposed to know about.
    What if I told you there was more to C++ than what the agents of The Committee want us to believe? Over decades programmers all around the world have added features to the language in the form of compiler extensions that let us do even greater things. Some are completely new, and some are lifted from C to C++ to allow some interesting, and sometimes more efficient, applications.
    We will see how statements can become expressions, how "goto" with extra superpowers can make your programs faster, and why there exists an operator named after a famous rock star. These are just a few examples of what to expect as listing any more would draw unwanted attention from The Committee. Unfortunately, because these extensions are not part of ISO C++, using any of them comes at the expense of portability. Or does it?
    -
    Miro Knejp
    Miro wrote his first line of C++ code in 1997 at the age of 12, and it has been his programming language of choice ever since. He’s especially passionate about low-level programming, assembly, 3D graphics, and games engineering. Miro holds a Master’s degree in Computer Science from the Technical University of Munich. He has worked on projects ranging from designing 3D rendering libraries to building airport self-boarding control systems. He currently works as a freelancer and trainer, with the goal of creating his own video game one day.
    -
    Videos Filmed & Edited by Bash Films: www.BashFilms.com
    *-----*
    Register Now For CppCon 2022: cppcon.org/registration/
    *-----*
  • Science & Technology

Comments • 115

  • @henke37
    @henke37 4 years ago +432

    Best talk by a man wearing a tinfoil hat I've ever seen.

  • @luisc5922
    @luisc5922 3 years ago +57

    Glad to see a talk that makes sense, I mean the intermolecular interactions with the classes of carcinoma neurosis always caused problems with inheritance

  • @georganatoly6646
    @georganatoly6646 3 years ago +30

    Dude's a great charismatic speaker and presenter.

  • @thevinn
    @thevinn 4 years ago +129

    OOOH This video DELIVERS!!!

  • @wolfgangsanyer3544
    @wolfgangsanyer3544 2 years ago +16

    Phenomenal slide transitions. Absolutely top notch.

  • @tomaszstanislawski457
    @tomaszstanislawski457 2 years ago +5

    Flexible array members have been part of standard C since 1999; the same goes for designated initializers.

  • @MattGodbolt
    @MattGodbolt 4 years ago +53

    This is so awesome! Thank you Miro :)

    • @mknejp
      @mknejp 4 years ago +7

      Thank you for the CE talk on Monday. I learned about conformance view and used it to get the compiler versions for my slides the next day. You were also right about the ABAB pattern prediction, so thanks for telling me. The Hallway Track certainly is one of the best!

  • @andreasfett6415
    @andreasfett6415 2 years ago +7

    I love this talk... have been watching it a couple times just for pure entertainment. Let's disobey the committee!

    • @DerpMooseFish
      @DerpMooseFish 1 year ago

      Just rewatched for my third time while playing games. It's enjoyable!

  • @brenogi
    @brenogi 4 years ago +36

    Cool presentation! After seeing the flexible array part, I have some code to review...

  • @MatthijsvanDuin
    @MatthijsvanDuin 4 years ago +13

    5:00 Depending on the situation a better reply may be: "GCC explicitly allows accessing the bytes of a union member via another member of that union. In fact it even allows this if the members have different types." Of course this is only a useful reply if you don't need portability to compilers other than GCC (and presumably clang, which aims to be GCC-compatible).
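As a rough illustration of the union access described in this comment, here is a minimal sketch. The function names are made up for the example, and the union version relies on GCC's documented behaviour (which clang is expected to mirror); ISO C++ does not define it. A `memcpy`-based version is shown next to it as the portable route.

```cpp
#include <cstdint>
#include <cstring>

// Reads the bits of a float through a different union member.
// Undefined in ISO C++; documented as supported by GCC, so this is a
// non-portable sketch. Helper names here are hypothetical.
std::uint32_t bits_via_union(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "sketch assumes a 32-bit float");
    union { float f; std::uint32_t u; } pun;
    pun.f = value;
    return pun.u;  // reads the member that was not written last
}

// Portable alternative: copy the object representation byte by byte.
std::uint32_t bits_via_memcpy(float value) {
    std::uint32_t u;
    std::memcpy(&u, &value, sizeof u);
    return u;
}
```

On GCC and clang both functions should return the same value; only the second one is guaranteed by the standard.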

  • @StefanReich
    @StefanReich 4 years ago +22

    42:00 That's really good. One thought: A smart compiler could theoretically take a normal switch statement and inline the jumps the way you do with computed gotos, no? Maybe they should do it by default for switch-based loops with a relatively simple switch expression?
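For readers who have not seen the two dispatch styles being compared here, this is a small sketch with a made-up four-opcode bytecode (not the one from the talk). `run_goto` uses the GCC/clang "labels as values" extension, so it is not ISO C++ and will not build with MSVC.

```cpp
#include <cstddef>

// Hypothetical toy bytecode, invented for this sketch.
enum Op : unsigned char { Inc, Dec, Dbl, Halt };

// Switch in a loop: every opcode returns to one shared dispatch point.
int run_switch(const Op* code) {
    int acc = 0;
    for (std::size_t pc = 0;; ++pc) {
        switch (code[pc]) {
            case Inc:  ++acc;    break;
            case Dec:  --acc;    break;
            case Dbl:  acc *= 2; break;
            case Halt: return acc;
        }
    }
}

// Computed goto: each handler ends with its own indirect jump, so the
// dispatch is replicated once per opcode instead of shared.
int run_goto(const Op* code) {
    static void* const table[] = { &&do_inc, &&do_dec, &&do_dbl, &&do_halt };
    int acc = 0;
    std::size_t pc = 0;
    goto *table[code[pc]];
do_inc:  ++acc;    goto *table[code[++pc]];
do_dec:  --acc;    goto *table[code[++pc]];
do_dbl:  acc *= 2; goto *table[code[++pc]];
do_halt: return acc;
}
```

Both functions compute the same result; whether a compiler turns the first form into the second on its own is exactly the question raised in the comment.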

  • @standingpad
    @standingpad 1 year ago +3

    I swear, there's so many things that make sense (like range cases in switch statements) in C++ but aren't in C++
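The case ranges mentioned here look roughly like this. It is a GCC/clang extension rather than ISO C++, and `classify` is a hypothetical helper written for this note.

```cpp
// Classifies a character using case ranges ("low ... high").
// GCC/clang extension; a pedantic build will warn about it.
char classify(char c) {
    switch (c) {
        case '0' ... '9': return 'd';  // digit
        case 'a' ... 'z':
        case 'A' ... 'Z': return 'l';  // letter
        default:          return '?';  // anything else
    }
}
```

The GCC documentation recommends keeping spaces around the `...`, since `1...5` with integer literals would otherwise be parsed incorrectly.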

  • @oscareriksson9414
    @oscareriksson9414 1 year ago +1

    Refreshing talk! Cool!

  • @ninepoints5932
    @ninepoints5932 4 years ago +30

    There's also std::align if you don't want to do the alignment arithmetic yourself (although often I do the arithmetic anyways because I can never remember the order of arguments to that function)

    • @WolfrostWasTaken
      @WolfrostWasTaken 4 years ago +6

      I think std::align is a bit overkill... most of the time the arithmetic is pretty doable on its own
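To make the two options in this exchange concrete, here is a small sketch of both. The wrapper names are invented for the example; `std::align` itself takes its arguments in the order alignment, size, pointer, remaining space, and updates the last two by reference.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>

// Returns the first suitably aligned address inside [buf, buf + size)
// that still has room for `bytes` bytes, or nullptr if nothing fits.
void* aligned_in(void* buf, std::size_t size,
                 std::size_t alignment, std::size_t bytes) {
    void* p = buf;
    std::size_t space = size;
    return std::align(alignment, bytes, p, space);
}

// The hand-written arithmetic, for comparison.
// `alignment` must be a power of two.
std::uintptr_t align_up(std::uintptr_t addr, std::size_t alignment) {
    const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    return (addr + mask) & ~mask;
}
```

The arithmetic version is shorter, but `std::align` also checks that the aligned object still fits in the remaining space.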

  • @ebakes
    @ebakes 2 years ago

    For the flexible array member section, you could make a function template with an unsigned template parameter; changing its value would give you a compiler-generated static vector

    • @mansandersen1110
      @mansandersen1110 2 years ago +1

      That wouldn't work with runtime values, which is what the whole problem was about. If you had the number at compile time, you could just use a statically sized array to begin with
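For context, the runtime-sized case under discussion looks roughly like the sketch below. `Samples`, `make_samples` and `free_samples` are hypothetical names; the `double data[];` member is standard in C99 but only an extension in C++ (accepted by GCC and clang, with a warning in pedantic mode).

```cpp
#include <cstddef>
#include <cstdlib>

// Header and payload live in one allocation; the element count is only
// known at run time. Flexible array member: C99, an extension in C++.
struct Samples {
    std::size_t count;
    double data[];
};

// Allocates room for the header plus n doubles and zeroes the payload.
// Returns nullptr if the allocation fails.
Samples* make_samples(std::size_t n) {
    void* raw = std::malloc(sizeof(Samples) + n * sizeof(double));
    if (raw == nullptr) return nullptr;
    Samples* s = static_cast<Samples*>(raw);
    s->count = n;
    for (std::size_t i = 0; i < n; ++i) s->data[i] = 0.0;
    return s;
}

void free_samples(Samples* s) { std::free(s); }
```

A template parameter cannot replace `n` here because the size would then have to be fixed at compile time.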

  • @drink__more__water
    @drink__more__water 4 years ago +15

    49:55 is comedy gold. Plus all the stuff before ;P

  • @ratinabox1065
    @ratinabox1065 3 years ago +3

    I swear they were even hiding this video from me.

  • @on-hv9co
    @on-hv9co 1 year ago +1

    In case it's been buried in the comments: the part about placement new at 24:40 seems to have been identified as a defect and updated. It now explicitly states "except when referencing the library function operator new[](std::size_t, void*)"

  • @carmelkozlov1373
    @carmelkozlov1373 4 years ago +1

    Great talk!

  • @superscatboy
    @superscatboy 4 years ago +4

    This is the stuff of legends.

  • @moonythm
    @moonythm 4 years ago

    I loved the portal joke

  • @LDdrums20
    @LDdrums20 2 years ago +1

    😂😂😂 what an awesome and funny take on it

  • @NKernytskyy
    @NKernytskyy 2 years ago +2

    32:42 - faster goto... Never expected to see that. LOL.

  • @evennot
    @evennot 4 years ago +2

    That's some next level stuff

  • @Elite7555
    @Elite7555 4 years ago +4

    I didn't know GCC is so cool. I think I should use it more often. On the other hand the code also isn't portable anymore.

    • @irvvine
      @irvvine 4 years ago +8

      There is a portability table near the end of the talk. MSVC is the only incompatible compiler of the commonly used ones even a decade ago.

  • @crafty1853
    @crafty1853 4 years ago +9

    tinfoil hat mode engaged

  • @treyquattro
    @treyquattro 4 years ago +10

    10:51 the uint values for cyan and yellow are reversed. Good talk, cool hat (every C++ engineer has one, real or imaginary)

    • @IluvSD40s
      @IluvSD40s 2 years ago

      It's being stored red-lowest. Note that cyan = green|blue and yellow=red|green

  • @TinBryn
    @TinBryn 4 years ago +32

    With that indirect goto trick, could you instead put everything into functions and have an array of function pointers with tail call optimisation? This should produce the same code without putting yourself on the committee's hitlist

    • @mknejp
      @mknejp 4 years ago +15

      If you can get the compiler to inline all those indirect calls, maybe. I haven't tried yet. Though Jason Turner tried the tail recursion variant, and of course the compiler turned it into a loop which then produced the same assembly as the switch loop.

    • @Germanwtb
      @Germanwtb 4 years ago +3

      @@mknejp According to: czcams.com/video/ieERUEhs910/video.html it won't. And as far as I understand it (which isn't very far) it can't.

    • @ZiggyGrok
      @ZiggyGrok 4 years ago +3

      I've hit this performance difference myself in the 2006 ICFP programming contest (they had you implement a virtual machine for a made up assembly language just to get started). Both the switch statement & callback approaches were slower.

    • @mknejp
      @mknejp 4 years ago +6

      @@Germanwtb It does work with function pointers, just got it to work: godbolt.org/z/XVfy3x

    • @VioletGiraffe
      @VioletGiraffe 4 years ago +1

      A brilliant suggestion and comment!
      I'm sorely disappointed that the most straightforward and obvious way to implement this, which is switch, is not the fastest. This is wrong.
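The function-pointer variant discussed in this thread could be sketched as follows. This is not the code behind the godbolt link; the opcode set and handler names are invented, and whether the calls actually become jumps depends on the compiler and optimization level, since ISO C++ does not guarantee tail-call elimination.

```cpp
#include <cstddef>

// Hypothetical toy bytecode, invented for this sketch.
enum Op : unsigned char { Inc, Dec, Dbl, Halt };

using Handler = int (*)(const Op* code, std::size_t pc, int acc);
extern const Handler handlers[];

// Each handler ends by calling the next opcode's handler in tail
// position; an optimizing compiler may turn that call into a jump.
int op_inc(const Op* c, std::size_t pc, int acc) {
    return handlers[c[pc + 1]](c, pc + 1, acc + 1);
}
int op_dec(const Op* c, std::size_t pc, int acc) {
    return handlers[c[pc + 1]](c, pc + 1, acc - 1);
}
int op_dbl(const Op* c, std::size_t pc, int acc) {
    return handlers[c[pc + 1]](c, pc + 1, acc * 2);
}
int op_halt(const Op*, std::size_t, int acc) { return acc; }

// Table order must match the enumerator order of Op.
const Handler handlers[] = { op_inc, op_dec, op_dbl, op_halt };

int run_calls(const Op* code) { return handlers[code[0]](code, 0, 0); }
```

Without optimization each opcode adds a stack frame, so this form only matches the computed goto when the tail calls are really eliminated.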

  • @VivekKumar-bb2pk
    @VivekKumar-bb2pk 4 years ago +4

    Pretty advanced content

  • @friedkeenan
    @friedkeenan 2 years ago

    For the operator new[] kerfuffle, couldn't you construct a std::array instead?

  • @meph2473
    @meph2473 4 years ago

    13:30 I have problems getting the ranged or the mixed types of designated array initializers to work in C++.
    In C they work fine.
    The errors range from the compiler giving me
    "sorry, not implemented" (no joke, this is literally the error message)
    to
    "either all initializer clauses should be designated or none of them should be"
    I guess these are the features we don't get, or have I missed something here?

    • @mknejp
      @mknejp 4 years ago +1

      It seems like GCC is back-porting the C++20 rules for designated initializers to this, as in C++ designated and positional initializers cannot be mixed. It's weird considering array initializers aren't even covered by C++.

  • @IluvSD40s
    @IluvSD40s 4 years ago +12

    At 28:40 : Is there any citation for why it's UB to treat the result of the last reinterpret_cast as a pointer-to-array instead of a pointer to a single object? If true, it seems that it would be impossible to implement std::vector in standard C++, as I'm pretty sure that std::vector instances just have to reinterpret_cast the void* from the allocator into a T pointer that is treated as an array (or I guess the allocator really does this -- the point is, at some point some chunk of memory gets cast and treated as an array). I know the issue is not the reinterpret_cast per se, but the idea is the same: an std::vector never uses an array new (placement new or otherwise), and the objects past the end iterator really are dead and not merely default-constructed. Yet std::vectors seem to work without special compiler support, so what's the difference?

    • @mknejp
      @mknejp 4 years ago +22

      You are correct in your assessment of std::vector. It is indeed impossible to implement std::vector in standard C++ without any UB right now. The standard library gets a pass because it is part of "the implementation" and thus can do whatever it takes (and exploit knowledge about how "the implementation" works) to implement the specified observable behavior. begin_lifetime_as is one step towards making vector implementable, but I'm not 100% sure if that is enough to also fix the other reasons. As references go, there is [basic.object] specifying the only ways in which objects are created (arrays are objects too) and there is [basic.std.dynamic.safety] describing the rules of how "traceable pointers" work. In fact there is nothing in the standard guaranteeing that two objects of type T declared next to each other have the same memory layout as a T[].

    • @kebien6020
      @kebien6020 3 years ago +11

      @@mknejp "It is indeed impossible to implement std::vector in standard C++ without any UB right now"
      Whaat? This actually blows my mind

  • @MatthijsvanDuin
    @MatthijsvanDuin 4 years ago +7

    12:20 While this works in clang, gcc only supports this for C code, not for C++ code ("sorry, unimplemented: non-trivial designated initializers not supported"). You can syntactically use designated initializers, but you cannot use them to reorder the initialization of members, rendering the feature close to useless. About the most luxurious thing you can do is skip struct members that have a default initializer. I think the feature added to C++20 has the same restrictions.

    • @joshuascholar3220
      @joshuascholar3220 2 years ago

      I know this is a year later, but it sounds like you're talking past each other. He doesn't care what order the initializers run in; he cares what order the members of an array are stored in, and whether they correspond to an enum.
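The restricted form that C++20 standardized for structs, as described in this thread, can be sketched like this. `WindowConfig` is a hypothetical aggregate; the C-style array designators (`[index] = value`) that the talk uses for enum-indexed tables are not part of C++20 and remain compiler-specific.

```cpp
// Hypothetical aggregate with default member initializers.
struct WindowConfig {
    int  width      = 640;
    int  height     = 480;
    bool fullscreen = false;
    int  samples    = 1;
};

// Designators must appear in declaration order. Members that are
// skipped (here: fullscreen) keep their default member initializer.
WindowConfig make_config() {
    return WindowConfig{ .width = 1920, .height = 1080, .samples = 4 };
}

// Both of these are rejected under the C++20 rules:
//   WindowConfig{ .samples = 4, .width = 1920 };  // out of order
//   WindowConfig{ .width = 1920, 1080 };          // designated mixed with positional
```

Before C++20, GCC and clang accept the same syntax in C++ as an extension.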

  • @eigenfield
    @eigenfield 1 year ago

    Is the horror story in the 24th minute slide, the main reason why C++ was not considered in the Linux kernel?

  • @Mo6eB
    @Mo6eB 4 years ago +3

    A note for flexible array members concerning Microsoft's compiler: While gcc and clang use the double data[]; syntax, msvc uses double data[1]. That is the official blessed way to have a flexible array member and the same thing that Windows' own structs use. On every other compiler it would be undefined behaviour, because you'll be accessing elements past the first one, but on msvc it's the official way of doing it. Fortunately macros still exist.
    Edit: I know I read this on MSDN somewhere but I can't locate the documentation now. If anyone finds it, please reply with the link.

    • @Germanwtb
      @Germanwtb 4 years ago

      Is it really UB if you allocate enough space?

    • @mknejp
      @mknejp 4 years ago +1

      @@Germanwtb Yes. Accessing an array of size 1 past index 0 is always UB. MSVC allows you to do it due to historical reasons with the Win32 API, which was created before flexible array members were a thing.

    • @Carewolf
      @Carewolf 4 years ago +1

      It is not undefined on clang and gcc, because the flexible array implementation is actually declaring an array with 0 elements, so it also accesses over the size. It doesn't matter if you have the defined size as 0 or 1.

    • @jan-lukas
      @jan-lukas 2 years ago

      @@mknejp it's UB according to the standard, but the implementation of all compilers for arrays is with pointer maths AFAIK (don't know if this is part of the standard), array[n] is equivalent to *(array+n), so accessing non-existent indices will return the next parts of memory, which might result in a "correct" result if that memory is actually allocated or a segmentation fault if the memory isn't allocated. If the memory you access happens to be the right data type (data types don't exist at runtime but you know what I mean) your program might even not crash and just use the wrongly accessed memory

  • @Dth091
    @Dth091 4 years ago +8

    Man, the clause on placement new[] is really disappointing. Is there any reason for that unspecified value? Placement new[] is actually impossible to use correctly with it...

    • @mknejp
      @mknejp 4 years ago +6

      I believe they use it to store the number of elements. Which is really unfortunate because you are saving this value yourself somewhere else and you technically cannot access this value to make use of it.

    • @gergelynagy1225
      @gergelynagy1225 4 years ago +1

      Isn't this extra data there to support an overloaded placement array delete operator? It gets a size argument which cannot be deduced from the size of the type.

    • @marceloconceicao2587
      @marceloconceicao2587 4 years ago +6

      @@gergelynagy1225 That is indeed the reason why it exists. It is not needed for placement new operation though, as the data will not be deleted with delete[], but of course the standard says nothing about this so you're better off just not using placement new for arrays.

  • @VickyGYT
    @VickyGYT 4 years ago +5

    very informative, what's the name of the font used for code snippets?

    • @paulcam206
      @paulcam206 4 years ago

      Looks like Consolas to me...

    • @antekone1
      @antekone1 4 years ago +3

      It's Fira Mono - mozilla.github.io/Fira/, also check out Iosevka if you like this one

    • @mknejp
      @mknejp 4 years ago +2

      Fira Code, you can get it for free here github.com/tonsky/FiraCode

  • @SolomonUcko
    @SolomonUcko 4 years ago +2

    Tweets: twitter.com/mknejp/status/1148689298201415682, twitter.com/bstamour1/status/1148747992809299973, twitter.com/Cor3ntin/status/1149046823832698889

  • @NKernytskyy
    @NKernytskyy 2 years ago

    41:16 - goto is no longer faster? Plz, Xplane!

  • @foryouify
    @foryouify 4 years ago

    23:11 Any idea which compiler, version, and compiler flags produce an unknown 'x' that is not 0? I tried to replicate the issue on godbolt using the example from the slides as a base, but it seems that at least gcc and clang don't have an issue with that. Instead of the doubles I also tried using a small struct with a constructor which prints the "this" pointer of the object and later the "data" pointer. So if "data" and "this" of the first element of the array are not the same, an 'x' was added.

  • @altmindo
    @altmindo 4 years ago +1

    39:31 why 0% branch prediction rate gets better performance than 100% prediction rate?

    • @mknejp
      @mknejp 4 years ago +9

      I think you misunderstand. The 0% prediction case is slower. The computed goto version is always at least as fast as switch, but faster in most cases.

    • @KaneYork
      @KaneYork 4 years ago +1

      @@mknejp Not always - It's not as fast as a topswitch in the special case of i$ exhaustion - if your total code size is bigger than your L1 instruction cache, the 20% inflation can cause a major performance hit.
      If you're hitting i$ exhaustion in real life, your code might be too complicated ;)

    • @nonamehere9658
      @nonamehere9658 2 years ago

      @@mknejp In theory, the branch predictor (which is effectively a black box ATM) could be as accurate on "normal" switch (if the branch predictor "remembers" the location of the previous jump) as on the "computed go" switch. Though the "In theory" part doesn't happen in practice, LOL!

  • @astarothgr
    @astarothgr 4 years ago +9

    The flexible array part was... terrifying*.
    Rust, take me with you, and I will try to appease The Borrow Checker as best as I can!
    Just... don't let me deal with this ...unholy abomination. I'm scared.
    * the talk itself was EXCELLENT.

  • @codenamelambda
    @codenamelambda 1 year ago

    Couldn't you emulate a computed `goto` with tail calls? That feels like it should be possible...

  • @fdwr
    @fdwr 1 month ago

    Many of these should just be codified, because they are de facto in-practice standard in many compilers anyway.

  • @ericsbuds
    @ericsbuds 4 years ago +3

    hilarious man

  • @cmdlp4178
    @cmdlp4178 4 years ago +4

  • @ArjanvanVught
    @ArjanvanVught 4 years ago

    @12:37 -> error: ISO C++ does not allow C99 designated initializers [-Werror=pedantic]

  • @Ariccio123
    @Ariccio123 4 years ago

    32:21 using reserve 👍😎
    Using push_back 😭

    • @kebien6020
      @kebien6020 3 years ago

      The type bytecode seems to be simply an enum class or similar, which would make it TriviallyCopyable (copy is the same as move). So push_back is just as efficient as emplace_back for this type (it will generate the exact same code).

    • @OMGclueless
      @OMGclueless 3 years ago

      What's wrong there? He back-inserts (count - 1) opcodes then pushes one more halt instruction. Should all work out with no dynamic resizing.
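The two replies above can be checked with a short sketch. `Bytecode` here is a stand-in enum, not the type from the slides, and the function name is made up; it reserves once, appends `count - 1` opcodes plus a final halt, and reports whether the buffer ever moved.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Stand-in for the talk's bytecode type (an assumption for this sketch).
enum class Bytecode : unsigned char { Nop, Halt };
static_assert(std::is_trivially_copyable_v<Bytecode>,
              "enumerations are trivially copyable");

// Returns true if count elements were appended after reserve(count)
// without the vector reallocating. Expects count >= 1.
bool appends_without_reallocating(std::size_t count) {
    std::vector<Bytecode> code;
    code.reserve(count);
    const Bytecode* before = code.data();
    for (std::size_t i = 0; i + 1 < count; ++i) {
        code.push_back(Bytecode::Nop);
    }
    code.push_back(Bytecode::Halt);
    return code.size() == count && code.data() == before;
}
```

The standard guarantees no reallocation as long as the size stays within the reserved capacity, so `push_back` after `reserve` is fine here.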

  • @PrivateUsername
    @PrivateUsername 2 years ago +1

    Designated Initializers! Finally, C++ will be almost as useful as C.

  • @4otko999
    @4otko999 4 years ago +3

    really like the talk. 28:00 this data[0] = data[1] + data[2] bs is very disturbing. data[0] should be equal to *(data + 0) and data[1] should be equal to *(data + 1). if pointer arithmetic works correctly then this is defined and if this doesn't work as expected then this should be considered a compiler bug imo.

    • @mknejp
      @mknejp 4 years ago +5

      Pointer arithmetic also only works on arrays. The syntax is equivalent, yes, and thus both need the existence of an array object. Remember you're dealing with the abstract machine, not the physical machine. It cannot be a compiler bug if the compiler is following the rules of the standard.

    • @De5ertFi5h
      @De5ertFi5h 4 years ago

      Pretty sure pointer arithmetic on something that isn't an array is UB. However, I would expect that code to just work and wouldn't worry that much.

    • @mknejp
      @mknejp 4 years ago +2

      @@De5ertFi5h That's what people always say. "It works, don't worry" until suddenly compilers start optimizing based on this particular form of UB. It's an arms race that you cannot win because you're not following the rules.

    • @TimVerweij
      @TimVerweij 4 years ago +2

      @@mknejp I'm puzzled. Is this UB because the compiler can see where you got the pointer from, and that this pointer is not pointing to an array? If so, would this issue be solved by P0593R4 "Implicit creation of objects for low-level object manipulation", possibly using bless/start_lifetime_as, telling the compiler there actually is an array there?
      Or... is it UB in general to subscript a pointer to a non-array object (beyond [1])? If so, how would that allow pointer-to-element types to work as an iterator? E.g. If I call begin with an array type I will also just get a pointer to the first element. How would you ever be able to get to / read the next element?

    • @TimVerweij
      @TimVerweij 4 years ago

      I just realized you can't use bless because you don't know the size of the array upfront.

  • @l3VGV
    @l3VGV 3 years ago +3

    So if I just use C for everything, how thick does my hat have to be?

  • @rterminatu
    @rterminatu 4 years ago +4

    (expression) ? do_something() : 0;
    also compiles

  • @314Labs
    @314Labs 8 months ago

    Hilarious

  • @Fetrovsky
    @Fetrovsky 2 years ago

    Nit: MFLOPS per second.

  • @sakari_n
    @sakari_n 2 years ago

    Flexible arrays are such a basic and obvious thing to do. They are extremely useful. You would be an idiot not to use them. It literally does not make any sense not to use them.

  • @LordHonkInc
    @LordHonkInc 2 years ago +1

    Incredible presentation, both interesting and enjoyable. Those "computed goto"s are definitely going on my list of favorite ways to royally screw with anybody inheriting your codebase xD