C++ Super Optimization: 1000X Faster

Sdílet
Vložit
  • čas přidán 9. 06. 2024
  • Obey this one simple C++ trick for a 1000x performance gain: constexpr! Check out the free sample of my book on the Spectrum at Amazon: amzn.to/42TfSJX
    Language Racing Series, Github Code: github.com/PlummersSoftwareLL...
  • Věda a technologie

Komentáře • 805

  • @DavesGarage
    @DavesGarage  Před 3 měsíci +426

    Thanks to everyone for the kind words on the intro... I was a little worried about using it, but had fun making it, so went out on a limb!

    • @c4cypher
      @c4cypher Před 3 měsíci +5

      Entertaining and informative, thank you.

    • @NoLoseJustLearn
      @NoLoseJustLearn Před 3 měsíci +2

      Loved it

    • @systemloc
      @systemloc Před 3 měsíci +5

      I laughed SO HARD when constexpr popped up. I haven't laughed that hard in awhile. Thank you for that.

    • @jstro-hobbytech
      @jstro-hobbytech Před 3 měsíci

      Dave. Can you do me a favor. I'll even donate 100cad to a charity of your choice.
      In the fastled library there's a udl in the tetris game written by Aaron liddement. I for the life of me do not understand the bitwise based udl code he has written to define the tertris blocks. I get what he's doing but can't find any other similar cases of the bitwise code using hex to and unsigned long data like the way he's done it.
      Scott Manley made a wearable matrix using his code years ago which made me see what was going on under the hood. I don't like using code I don't understand and it's driven me nuts ever since. I too am on the spectrum and have a software degree I never used until I had to medically retire a while back.
      I love your channel and have been around since the beginning and even if you could have a quick look and reccomend a book or something if you don't have the time to make a video. It's one of those things that gets stuck in my head from time to time and it kills me that I do not understand what he's doing fully. I suspect it's intentionally obfuscated and as someone who spends his retirement doing charity work and learning it's been a thorn in my side. I am 45 and plan on using my past credits toward a compsci or ee degree (as a bucket list item and to help me tutor programmers and develop free courses for a local non-profit makerspace). I've googled, bought books, stopped short of chatgpt because I don't want the info handed to me on a silver platter because I prefer to learn hands on. I understand how transistors perform math using different binary rules because I like to do things by coding and then trying the same wirh passive circuit components. He rewrote the sprite class so it no longer uses the udl but it still drives me nuts that I can't figure it out. I'm also a fellow Canadian but I hail from the ns side od the nfld ferry. I'm waiting for approval for funding for a 7 week inpatient program for how to live with ptsd and I'd like to spend my free time coding c++ while I'm away. I was going to bring a guitar but I'd rather work on a skill that I'm not as good at as I'm deciding which degree to pursue. I have a year and a half of credits which transfer to compsci or ee but I don't want to drag around my overpriced ee lab when I could write code instead with a decently powerful laptop. I plan to travel to Utah this summer to camp under the milky way before I hit the books again. I've spent the last 4 years donating half my disposable income from my pension to others so they can learn and put a proper electronics section in the local makerspace. My ptsd keeps me from being able to leave the house and teach there in person or attend the labs I'll have to attend when I go back to school.
      Sorry. I ramble. I figured you'd get what I mean when something is stuck in your head and can't move past it. I could link the git repository if you like. Id be extremely thankful and would make good on my donation.

    • @egeres14
      @egeres14 Před 3 měsíci

      Omg, my sides 😂, that intro was so good

  • @unclerojelio6320
    @unclerojelio6320 Před 3 měsíci +950

    CZcams kept recommending this to me and I was starting to get annoyed until I finally noticed it was a Dave’s Garage video.

    • @EricBRowell
      @EricBRowell Před 3 měsíci +30

      😂

    • @DavesGarage
      @DavesGarage  Před 3 měsíci +191

      I was wondering about that, but sticking my head in the thumbnail seemed weird... maybe I should at first?

    • @unclerojelio6320
      @unclerojelio6320 Před 3 měsíci +73

      @@DavesGarage Yeah, the lack of your mug is why I didn’t immediately click on the video.

    • @JamesPotts
      @JamesPotts Před 3 měsíci

      ​@@DavesGaragequick work!

    • @danwilson7581
      @danwilson7581 Před 3 měsíci +22

      @DavesGarage it's much better than sticking your head in other places.

  • @taltal1122
    @taltal1122 Před 3 měsíci +80

    Please note: constexpr just declares that it is POSSIBLE to calculate at compile time. It does not actually guarantee it unless the context demands it (as a template argument for example). The compiler will do its best but it might just call the function. c++20 added consteval which forces compile time evaluation.

    • @skilz8098
      @skilz8098 Před měsícem

      Thank you for pointing that out! I'm sure many are aware of this, yet there are just as many who aren't and for those who are new to it, this is really good information to be aware of. As a side note, this could also potentially be compiler dependent (implementation defined). I'm not 100% sure on this based on the literature of the standard so don't quote me on it. However, using MSVC vs Clang, vs GCC, vs Intel, vs MingW, etc... may provide the same results if they support and implement this feature of the language which by now I'm sure all major compilers do. Yet their methods of doing so can and will vary which also depends on the various compiler flags that are being used during compilation. It all depends on how each compiler parses, tokenizes and analyzes or handles the source code it is given to compile as well as how the programmer - user instructs the compiler or gives hints to the compiler to do its job. For simple expressions I would assume that almost all compilers would generate the same assembly, binary results. However for more complex expressions, the results may and can vary. Again, this is a really good and underrated comment. It is truly good advice to give!

    • @ColinBroderickMaths
      @ColinBroderickMaths Před měsícem +1

      It is guaranteed if you just constexpr the result. That is guaranteed to calculate at compile time or be a compile error.

    • @skilz8098
      @skilz8098 Před měsícem

      @@ColinBroderickMathsSure when the assignment operator is involved for a specific type. Now as for the evaluation of an expression, mileage may vary.

  • @ernestg.harveyjr7087
    @ernestg.harveyjr7087 Před 3 měsíci +773

    That opening was fkn hilarious

    • @drewwilson8756
      @drewwilson8756 Před 3 měsíci +8

      Thumbnail was pretty good as well.

    • @Avighna
      @Avighna Před 3 měsíci +4

      Agreed!!!

    • @DM-qm5sc
      @DM-qm5sc Před 3 měsíci +4

      Its from "Better call Saul" if you didnt know. Good show!

    • @Roboprogs
      @Roboprogs Před 3 měsíci +1

      @@DM-qm5scso, Steven Ogg (I think I got “Trevor”’s real name right) had a guest bit?
      My wife used to watch Sol, but I only sampled a few episodes.

  • @hopperpl
    @hopperpl Před 3 měsíci +303

    We use constexpr a lot for FIR/DCT/DFT (fourier transforms, wavelets) constant generation; this removes literally thousands of magic float numbers in the code, that are so error prone. Miss a number, flip a digit, mixup digit order. No code reviewer would ever notice if the 17th digit of PI is 3 or 2. The code generation only uses sqrt/sin/cos etc for which we have constexpr implementations. constexpr simply guarantees that the value is never ever calculated at runtime, and that's the real beauty of it.

    • @serhii-ratz
      @serhii-ratz Před 3 měsíci +17

      thank's for sharing practical example

    • @sergeykolesnik1171
      @sergeykolesnik1171 Před 3 měsíci +18

      this is not an optimization, this is meta programming. Your algorithm doesn't become faster because of the constexpr. It runs just as slow, but during compilation.

    • @Dom-zy1qy
      @Dom-zy1qy Před 3 měsíci

      ​@@sergeykolesnik1171That was my takeaway

    • @privateger
      @privateger Před 3 měsíci +74

      ​@@sergeykolesnik1171Precomputation absolutely is an optimization. You don't tend to compile as often as you run.

    • @ra2enjoyer708
      @ra2enjoyer708 Před 3 měsíci +15

      @@sergeykolesnik1171 Algorithms don't have "speed", only complexity. "Fast" algos merely trade time complexity in exchange for space complexity and shit themselves the moment you start to hit RAM limits. I.e. a typical hash table has assumptions about the maximum size of the key storage and upon hitting this limit in best case scenario crashes, in the worst case rears its ugly side effects.

  • @bobthemagicmoose
    @bobthemagicmoose Před 3 měsíci +278

    I had a VERILOG class where a question on the exam was: create a component that calculates the factorial of 32 (or some number, I don’t remember). I hardcoded the result and returned the value :) got an A

    • @NinjaRunningWild
      @NinjaRunningWild Před 3 měsíci +41

      😂 Doesn't speak highly of grading in classes.

    • @Divine_Evil
      @Divine_Evil Před 3 měsíci +49

      Yeah typical unclear requirements... It should have been more generic. Something like: Having an 8bit input N, handshake start/result_rdy interface, clk, reset and an output 31 bit... Calculate the N! when start is asserted and then assert result_rdy when the correct value is done.

    • @NotMarkKnopfler
      @NotMarkKnopfler Před 3 měsíci +7

      🤣🤣@@Divine_Evil

    • @pilotashish
      @pilotashish Před 3 měsíci +25

      don’t ask stupid questions if you don’t want stupid answers

    • @muggzzzzz
      @muggzzzzz Před 3 měsíci +49

      @@pilotashish Actually, there was nothing stupid in this question. The question was to compute a factorial of 32, not to make a function that computes a factorial for any given integer number (32 in particular case), and the program did exactly that.

  • @Evan490BC
    @Evan490BC Před 3 měsíci +96

    I think it would be more accurate if we said that the `constexpr` specifier indicates that the expression *may* be evaluated at compile time, in a constant expression context, but not necessarily. And a `constexpr` function specifier may return such a value. If we want to force the function to produce a compile time constant expression (under some constraints), we must use `consteval`.

    • @Malaka1802
      @Malaka1802 Před 7 dny

      Why would you ever not want a function that can be evaluated at runtime or compile time? Just because the size of the binary?

    • @Evan490BC
      @Evan490BC Před 7 dny

      @@Malaka1802 I'm not sure I understand your question. Compile-time evaluation is completely different than run-time: you must know every value beforehand. The size of the binary is not the main concern here, the goal is to pre-compute as much as you can before the runtime stage. Think of it as a _kind of_ "memoisation".

  • @nunyobiznez875
    @nunyobiznez875 Před 3 měsíci +87

    It's also worth mentioning that C23, which ironically should have just been submitted for final publication..today as a matter of fact, is adding 'constexpr' to C as well.

    • @aniketbisht2823
      @aniketbisht2823 Před 3 měsíci +16

      Nope, just constexpr variables not expression/function. So it's not that useful. Also C doesn't have "if constexpr" like C++ to make compile time code path decisions.

    • @nunyobiznez875
      @nunyobiznez875 Před 3 měsíci

      @@aniketbisht2823 Nope what? It would have taken you much less time to go to google and type "c23 constexpr".

    • @zsomborgyenge4359
      @zsomborgyenge4359 Před 3 měsíci

      @@aniketbisht2823 macro magic can do this in c

  • @lpprogrammingllc
    @lpprogrammingllc Před 3 měsíci +73

    Aside from performance gains, the best use of constexpr is unit testing. First, it's important to note that, unlike inlined internal functions, the constexpr functions themselves _are_ included in the program itself, unless stripped by a later step (such as `--whole-program` in g++), so you can have a program that uses constexpr to automatically return any values below some threshold and fall back to runtime evaluation _of the same code_ if in excess of that value.
    This means you can write fairly complicated state machines (first case I saw was a complete 6502 emulator) with a complete set of unit tests using `constexpr`. Then, they'll read the "real" data at runtime. By embedding the unit tests in the main compilation unit, you get the neat property that if it compiles, it's correct (so long as you don't encounter a compiler bug). You can even give it simple integration tests (like 10 steps of Conway's game of life on the virtual 6502).

    • @JohnMullee
      @JohnMullee Před 3 měsíci +1

      Nice

    • @ThePC007
      @ThePC007 Před 3 měsíci +12

      Doesn't that just mean that compiling now takes just as long as running the actual test would?

    • @craigslist6988
      @craigslist6988 Před 3 měsíci +1

      very cool and pretty well explained, thanks!

    • @AnttiBrax
      @AnttiBrax Před 3 měsíci +8

      ​@@ThePC007That was implied in the latter part of the comment: "if it compiles, it is correct". So instead of running compile && test you just run compile.

    • @ThePC007
      @ThePC007 Před 3 měsíci +2

      @@AnttiBrax But then, what do you gain from doing that? Since unit tests are compiled once, and the run once, there shouldn’t really be much of a performance differnce. I guess you might benefit from your IDE showing the error right inside your code, though, assuming it can actually catch such bugs.

  • @mbnqpl
    @mbnqpl Před 3 měsíci +30

    Wait, so I can make my hello world 1000x faster?

    • @feel65
      @feel65 Před měsícem +2

      print(“hi”)

  • @TomMinnick
    @TomMinnick Před 3 měsíci +81

    Probably one of my favorite scenes in Better call Saul. Steven Ogg is awesome. I also live your mashup here with it!

    • @whannabi
      @whannabi Před 3 měsíci

      Big shots in that show

  • @germanassasin1046
    @germanassasin1046 Před 3 měsíci +16

    My favorite application of constexpr is compile time regular expression library, it evaluates your regex at compile time so you don’t have to worry about runtime cost of creating a regex. Also pretty cool one is in the fmt library where it checks whether the format string you provided is valid, btw this library partially made it into c++20 and was mostly completed in c++23 and I love it. Man, people are so creative.

  • @isodoublet
    @isodoublet Před 3 měsíci +23

    To me, the biggest benefit of constexpr is that it significantly simplifies some stuff that previously would have to be written with template metaprogramming. Thanks to constexpr, just about anything that could be a regular operation on values can now be done in mostly ordinary-looking C++ instead of template stuff, which makes it much more likely to be done at all -- nobody wants to maintain template metaprograms.

    • @johnshaw6702
      @johnshaw6702 Před 3 měsíci +1

      Well I wouldn't necessarily say that. I created my own meta templates before they became part of the standard. It was quite fascinating at the time and a very productive learning experience. The introduction of 'constexpr' was an extension of what some people were doing with templates already. It just allows you to write them in a more easily understood manor.

    • @isodoublet
      @isodoublet Před 3 měsíci

      @@johnshaw6702 It's true, templates were already Turing-complete so there was nothing you couldn't do before that you can now. But template metaprogramming is very hard to teach, and the syntax makes the code rather difficult to understand and debug. It's also purely functional (excepting the stateful template instantations thing which IMO should not be used in serious code), which is a completely different paradigm to normal C++.
      So constexpr stuff, while not giving any strictly "new" capabilities, makes the existing capabilities easy enough to use that I can convince my team to use them, for example, and be confident that people will "get it" and continue to get it over time. I don't shy away from template metaprogramming myself and there's some great tricks you can still only do using it, but the more that can be done using familiar syntax and patterns, the better.

  • @yarost12
    @yarost12 Před 3 měsíci +11

    We had a C++ logging library that ran on embedded hardware, it was used to convert numerical debug level values into strings. There was a heap of functions that converted different log message types into readable strings.
    A few developers from the customer side complained that this lib alone could use the CPU to 100% when debugging was enabled slightly above default. We made those conversions constexpr and switched them to literal types and dropped the load quite significantly.
    What has to be mentioned is that constexpr functions will consume more stack space and programms will be larger, so you have to be careful with that. Sometimes it can be worth it, when you're fighting for kilobytes.

  • @sietzethebest
    @sietzethebest Před 3 měsíci +14

    Mostly used constexpr to generate lookup tables at compile time. I remember writing a cloth physics simulation that could use a lookup table of size 256 to avoid multiple multiplications and divisions. And as the table is small it will most likely be stored in a cache somewhere so it was really fast. Got an A+ on that assignment. :)

    • @revenevan11
      @revenevan11 Před 2 měsíci

      That's a great example of a use for that, feel like I just understood the practicality a lot better now. Thanks!

    • @jennalove6755
      @jennalove6755 Před 5 dny

      Thats not a simulation then

  • @ljw4103
    @ljw4103 Před 3 měsíci +81

    If you made a c++ tutorial I would buy it.

    • @theograice8080
      @theograice8080 Před 3 měsíci +11

      I concur! I've wanted to learn C++ for ages. I started to try to grok the low-level details of systems programming years ago as a teen, but modern OS UIs abstract away so much of what is really happening that any program seems like a miracle to me.

    • @NinjaRunningWild
      @NinjaRunningWild Před 3 měsíci

      @@theograice8080Try game programming & write your own engine in DirectX, OpenGL, or SDL. You'll learn all you need to know along the way.

    • @theintjengineer
      @theintjengineer Před 3 měsíci

      @@theograice8080, get the book Beginning C++23 by Ivor Norton and start there.

    • @lennymclennington
      @lennymclennington Před 3 měsíci

      @@theograice8080 if you actually want to learn C++, use learncpp. It is a good tutorial that is suitable to teach you about software development in C++ from zero prior knowledge about programming. If you already know some programming languages you can still use it and just skip a few of the first chapters.

  • @nezbrun872
    @nezbrun872 Před 3 měsíci +141

    "Recursion: a sexy computer science concept that's rarely practical in real life"
    Ain't that the truth. Furthermore, recursive solutions are also typically write-only.
    Ex colleague of mine came up with a SQL Server recursive algo to write check amounts in words, very clever I thought, although it was left to me to debug it when it crashed the production server one day... it'd eaten up a couple of 100GB of RAM. Turns out he hadn't written it at all, it'd been copy and pasted from Stack Overflow.

    • @TheNewFaceOfHSP
      @TheNewFaceOfHSP Před 3 měsíci +7

      Aren't they always?
      I've been doing this for 10+ years and every single time I think of something clever, somebody has usually already made it (to one degree or another).

    • @NinjaRunningWild
      @NinjaRunningWild Před 3 měsíci +13

      There's a reason NASA doesn't allow it in their coding though. Error-prone & hard to debug.

    • @JohnDoe-sq5nv
      @JohnDoe-sq5nv Před 3 měsíci +7

      You're missing the most important part of recursion: They are FUN!
      Nah, but seriously I often solve problems by recursion because it is easier for me to think that way. The benefit when you write something recursively is that you through smart use of parameters can easily control the data in every context, that is in every call in the recursion. I understand, however, that they are more complicated to debug for others who aren't used to thinking recursively. Therefore it is, as always, extremely important to properly test your recursive functions for all possible inputs that can come through the public interface, and never, ever, have them read or write data not in the input parameters.

    • @isodoublet
      @isodoublet Před 3 měsíci +13

      @@NinjaRunningWild No, that's not why. Recursive solutions in fact tend to be much easier to understand and debug than iterative solutions. The problem is that they don't fail gracefully: try to solve too big a problem and you'll get a literal stack overflow, whereas an iterative solution would typically run just fine.

    • @fantasypvp
      @fantasypvp Před 3 měsíci

      Recursive solutions generally use up a lot more memory than iterative ones due to each function call adding memory to the stack

  • @Pedantic2025
    @Pedantic2025 Před 3 měsíci +19

    two things on this:
    1. the [[assume(expr)]] attribute in C++ does something similar (it indicates to the compiler that it can assume expr holds.
    2. using C++20's requires for templates, you can achieve a similar effect, but have the code detect how deep it has recursed and switch to a runtime implementation

    • @Evan490BC
      @Evan490BC Před 3 měsíci +1

      Nice! I regard C++20's Concepts more as *declarative* type constraints (so I prefer `constexpr` for this case). In essence, you are suggesting something similar to Rust's procedural macros, if I'm not mistaken.

  • @MorebitsUK
    @MorebitsUK Před 3 měsíci +17

    Hi Dave, I've been trying constexpr and static constexpr lately, in a calculator for room heating ratings.
    Instead of having an array load in at runtime, I made the array static constexpr; low and behold the lookups were instantaneous and the it was 10x faster than the one that didn't do that.
    On a side note, a C++ Guru showed how it works in that case and all the array elements are coded as a lookup table in the code, making it 100x quicker.
    Great video as usual, keep up the content. 😃

  • @gabydewilde
    @gabydewilde Před 3 měsíci +22

    You can hard code the results in any language of course but if you want to enjoy some time off while compiling it's perfect.

    • @attilatorok5767
      @attilatorok5767 Před 3 měsíci

      constexpr can never be as powerful as a python script that generates some code called in your makefile. It's dirty but it works.

    • @tommclean9208
      @tommclean9208 Před 9 dny

      @@attilatorok5767no

  • @clemensschotte
    @clemensschotte Před 3 měsíci +9

    During the era of the Commodore 64, we relied on precalculated paths utilizing sine and cosine functions, which were too slow for real-time usage in demos and games. Consequently, when you observe sprites gracefully traversing the screen, it's all precalculated. Today, you can adopt a similar approach, such as with a website, where you generate as much content as possible beforehand and serve the already existing pages.

    • @snemarch
      @snemarch Před 3 měsíci +3

      Even on a 486 (with hardware FPU support for SIN/COS), which is some order of magnitude faster than the C=64, there was a lot of speed to gain by using lookup tables. But at least we could do LUT + interpolation instead of fully pre-calculated paths :P

    • @revenevan11
      @revenevan11 Před 2 měsíci +2

      This makes me chuckle a bit, thinking about how people used to look up the trig tables in the back of their math book to evaluate something, and I grew up thankful I never really had to beyond an exercise, because we have computers to evaluate those for any value for us... only to know realize that in some cases the fastest way for a computer to do so is sometimes to make a table ahead of time it can look up to get the answer! (or part of the answer at least) 😅

  • @BMW_M-Performance
    @BMW_M-Performance Před 3 měsíci +5

    You are one of the reasons I started learning C/C++. I am extremely grateful for this channel, I love your content! Programming is just great.

  • @johnmckown1267
    @johnmckown1267 Před 3 měsíci +58

    So the compile time becomes O(2^n)?

    • @michaelfuller34
      @michaelfuller34 Před 3 měsíci +11

      Only if your compile environment has the requisite resources 😂

    • @Obscurai
      @Obscurai Před 3 měsíci +16

      Yep, it's a cheat. Trading compile time for run time.

    • @Bolidoo
      @Bolidoo Před 3 měsíci +6

      Well that finonacci implementation is horrible. A simple iterative one can get you O(n), a smarter one O(log n) and an aproximate one (using the formula for the nth finonacci number) runs in O(1). Also constexpr is not really needed, if you run the compiler with optimizations activated (O3) it will probably do this by itself.

    • @dismuter_yt
      @dismuter_yt Před 3 měsíci +5

      Yes, but you can get O(1) both for runtime and compile time by calculating the value once on the side and putting it in manually in the code. So if it actually never changes, you'll have paid for the computation only once, and not at every compilation.

    • @AnttiBrax
      @AnttiBrax Před 3 měsíci +5

      ​@@ObscuraiIt sounds like a cheat if you only concentrate on the drag race example. The benefits to constexpr were really badly sold in the vid I think. There are a lot of better examples of great real world applications in the comments.

  • @mgancarzjr
    @mgancarzjr Před 3 měsíci +40

    "I bet it's constexpr."
    If I'm remembering correctly, constexpr had some other very interesting effects. It may be in the book C++ High Performance by Andrist and Sehr.

    • @blackcitadel37
      @blackcitadel37 Před 3 měsíci +1

      there's constexpr young brother too, consteval.

  • @0ADVISOR0
    @0ADVISOR0 Před 3 měsíci +4

    Man, let me tell you, this exact screen 1:49 is one of the reasons I love this Channel. The code neatly on the left the face not too big not too small and rounded soft edges on the right, THIS is quality!

  • @corvinyt
    @corvinyt Před 3 měsíci +25

    This is just shifting the algorithm runtime from program run time to compile time.

    • @diogeneslaertius3365
      @diogeneslaertius3365 Před 3 měsíci +6

      Exactly my thought. Why doesn't he just hard-code the result of the function, this way we don't even need to bother the compiler.
      "Hey folks, I optimized the program so it runs 1000000 times faster when you call factorial(1e7). How? I hard-coded the constant with the result."

    • @eDoc2020
      @eDoc2020 Před 3 měsíci

      @@diogeneslaertius3365 Because then you have hardcoded magic values in your code. If that magic value is incorrect how will you know? If it's done with code you can fix errors more easily.

    • @AelfricLake
      @AelfricLake Před 3 měsíci +1

      Yeah in a complex program you'll just hold every possible result in memory? What...

    • @eDoc2020
      @eDoc2020 Před 3 měsíci +4

      @@AelfricLake Not every possible result, only hardcoded results for specific constant inputs given at compile time. It's of limited usefulness because most inputs won't be known at compile time.

    • @futuza
      @futuza Před 3 měsíci +1

      Sure, but it's still a useful tool no?

  • @monawoka97
    @monawoka97 Před 3 měsíci +25

    Constexpr is really amazing from a compiler tech perspective, but it's still hard to think of circumstances where it enables something genuinely new. These examples are effectively using constexpr to bake a lookup table into the exe. It's really cool that this is possible, but you could achieve this in C. It's obviously more tedious. Either an executable as part of your build script that code gens the lookup table as a pre-process if you want it bundled into the executable - or just loading in an external file at runtime with the lookup table. It's still pretty neat. Definitely ergonomic.

    • @Ch40zz
      @Ch40zz Před 3 měsíci +2

      constexpr is very useful for code obfuscation, string encryption etc. enjoy doing that in C

    • @mgancarzjr
      @mgancarzjr Před 3 měsíci +3

      If you can throw constexpr in front of a function, you turn it, potentially, into a lookup table without having to find and replace all the code where the function is called.

    • @strehlow
      @strehlow Před 3 měsíci +2

      @@mgancarzjr or just re-code the function to do the lookup rather than compute the result. The calling code wouldn't need any changes.

    • @mgancarzjr
      @mgancarzjr Před 3 měsíci

      @@strehlow how do you propose letting each instance of the function know where to look for the lookup table? Genuinely curious. I can only think of static class members being able to automatically access the same memory.

    • @strehlow
      @strehlow Před 3 měsíci +1

      @@mgancarzjr The function definition has the lookup table. The calling code doesn't know or care how the function produces the result. It doesn't need direct access to the lookup table.
      This is no different than refactoring a function with a more efficient algorithm, or any other optimizations. As long as it still produces the same mapping of input to output, why would the calling code need any changes?

  • @Carstuff111
    @Carstuff111 Před 3 měsíci +1

    That opening.....man this is just one of the great bonuses to this channel! Here to learn and got a great laugh as a bonus right off the bat!

  • @joshuawillis7874
    @joshuawillis7874 Před 3 měsíci +10

    Last thing I expected: Dave using a Breaking Bad meme to introduce his video...
    Being able to talk to our compilers directly is one of the reasons I love systems languages! Have you explored Rust very much, Dave? It has some of the most impressive metaprogramming features I've seen - including full on injection into its AST to create custom syntax.

    • @coldlyanalytical1351
      @coldlyanalytical1351 Před 3 měsíci +2

      Forth could do that in the 1980s .. a brilliant langauge .. but that sort of 'refine the language' trick creates write-once-read-never code.

  • @BartoszZasieczny
    @BartoszZasieczny Před 3 měsíci +3

    Once I got a task to implement loading some lookup table that was supposed to be loaded form a csv file. There was an equation and argument values were a discrete finite set with even intervals between values. I threw out the CSV file and jsut implemented a constexpr function what would calculate the lookup table at compile-time, put all the values inside a vector and then just read them. Awesome stufff.

  • @Hadzz95
    @Hadzz95 Před 3 měsíci +3

    Great video Dave! Would love to see more videos about algorithms

  • @pete540Z
    @pete540Z Před 3 měsíci

    Thanks, Dave.
    I look forward to a video on the ct_sqrt function mentioned.

  • @samanthajarosek9756
    @samanthajarosek9756 Před 3 měsíci +1

    I love this channel, I am always learning amazing things

  • @johnshaw6702
    @johnshaw6702 Před 3 měsíci +2

    I loved your coverage of this. Many years ago I fell in love with templates, especially meta expressions. The introduction of 'constexpr' just simplified what I already was doing. If you understand templates, then you see the limitations of the C++ language up close and personal, as well as the power. I believe your examples could be done using templates alone, it just wont be as clean.
    I created template code that doesn't directly know the integral type being used. When an 'int' was 16 bits and a 'long' was 32 bits, and you wanted the minimum of a 32 bit integer, it would result in a 'long' integer. But when a compiler came with a standard library that said an 'int' was 32 bits, that was chosen instead. I realy didn't like to depend on the standard library to determine the choice, but it's the only way I knew how to do it. Letting the compiler dertermine something at complie time is a great way to reduce runtime overhead.

  • @XzatonJW
    @XzatonJW Před 3 měsíci +28

    Constexpr in C++20 is amazing, youre also allowed to do dynamic allocations provided you dont leak the memory out of the constexpr context.
    For example, you can use std vector, and so long as you copy the data into an array, or similar, its totally allowed

    • @AJMansfield1
      @AJMansfield1 Před 3 měsíci +3

      If you do some tricky stuff with arena allocators, even that's not necessarily impossible either. You can absolutely have a constexpr function returning pointers into an rodata segment of the compiled binary.

  • @damindra
    @damindra Před 3 měsíci

    Great video, just wrote a fibonacci program in c++ as I am learning , and this will be great to test against your constexpr code! Love your videos keep them coming

    • @ACCPhil
      @ACCPhil Před 3 měsíci +1

      Using tail recursion massively speeds things up. I wonder if I can paste code
      __int64 Fibtail(__int64 n, __int64 p = 0, __int64 c =1)
      {
      if (n == 0)
      return p;
      else if (n == 1)
      return c;
      else
      return Fibtail(n - 1, c, p+c);
      }

  • @jbuchan12
    @jbuchan12 Před 3 měsíci

    This is awesome Dave thanks, I'm a c# software engineer, and I am learning c++ atm. This is very good timing.

  • @JamieBainbridge
    @JamieBainbridge Před měsícem

    Awesome video. I'm a C99 grognard but had always wondered what this keyword meant. Your presentation is information dense but easy to understand with lots of laughs too. Thanks!

  • @emjizone
    @emjizone Před 3 měsíci +2

    Good to understand and remember.
    I'm much more used to writing static code than programming the final compiler to build static code, but this promises a level of machine-level optimization that I've too often neglected.
    *Thank you* Dave from @DavesGarage for this valuable lesson.
    In any case, I would never calculate functions such as fibonacci or factorial by top-down recursion in production, of course, not even with memorization. This silly approach only serves to demonstrate concepts.

  • @l.piekha100
    @l.piekha100 Před 19 dny

    Your videos are simply awesome and perfect, thank you so much.

  • @firiasu
    @firiasu Před 3 měsíci +2

    You're awesome man! Like C++ itself!

  • @zamf
    @zamf Před 21 dnem +1

    constexpr is so powerful that if you go far enough in theory you can have your binary contain every possible state that your program can be in (calculated at compile-time) and then just "jump" to that program state based on your inputs that are only known at runtime. Of course, this would mean the binary could grow to an unimaginable size but in theory it would run in a split second.
    For example, Jason Turner (a big proponent of constexpr in the C++ community) suggested a game engine where all the states of the game are calculated at compile time using constexpr functions and then the whole game would be a sequence of pre-drawn frames that appear on the screen based on user input. And in this theoretical engine the FPS could reach the CPU frequency.

  • @TIME1minute
    @TIME1minute Před 3 měsíci +1

    Hi Dave! I work with Qt C++ right now and your videos are helping me make the quality of my work better! Thanks again! Do you think you'll do any more videos on compile-time execution, i.e. template arguments and properly using constexpr in class definitions?

  • @xnadave
    @xnadave Před 3 měsíci +1

    So, I can stop writing code generators for LUTs? Very cool. I was already vaguely aware of how this worked, but seeing a concrete analysis is very helpful. Thanks for sharing.

  • @bowiemtl
    @bowiemtl Před 3 měsíci

    I went from learning about tidbits of early windows programs to learning about relevant code features. Thank you sir!

  • @mikeh_nz
    @mikeh_nz Před 3 měsíci

    Neat! I loved the breaking bad intro and hadn’t heard of constexpr, so googled when it was the throat shot
    I hadn’t heard of this - I must be somewhat ancient back at the original c++ versions
    Yeah I guess the downside is the that timebuild is broken due to some intern doing a fib(2^100) in their code.
    Love the video

  • @KillerMZE
    @KillerMZE Před 3 měsíci

    "if constexpr" is super useful for making templated functions that act mostly the same except for a small snippet. For example I've successfully used it for a container that needed both a Peek and a Read function, that did pretty much the same thing with the difference of either advancing an internal variable or not

  • @marksmit6086
    @marksmit6086 Před 3 měsíci

    I use constexpr with a hash method for strings. That way in a switch/case statement strings can be used as a selector. This makes code more readable, I love that.

  • @erroneum
    @erroneum Před 3 měsíci

    Another great thing about constexpr is that you don't need to pass only constant arguments in (unlike non-type template parameters or consteval functions), but if you have even basic optimizations turned on, when you do pass in constant arguments, the compiler should evaluate the result and substitute it in.
    As an example, I'm writing a calculator, but I want to be able to use machine integers for speed, but switch to GMP when they get too large. To do so, I needed to implement the logic to check if there would be an overflow/underflow, so I made 3 functions, willOverflowAddition, willOverflowSubtraction, and willOverflowMultiplication, which are template constexpr functions for two signed integers. In practice they won't be being used in constant expressions, but for the optimizer it's helpful to afford it any opportunities to simplify the code you can.

  • @jonathanlapierre8620
    @jonathanlapierre8620 Před 3 měsíci +8

    Unless I do not understand here what you are basically saying is that Program A aka the compiler, does the heavy lifting of compute and pre-resolve the sieve which is pushed in the binaries of Program B aka the actual competition submission. Although this is a "clever" niched solution, it's no more useful than having the sieve solution injected from a JSON that was pre-calculated by another program before.
    I am no C++ guru and I love that I have learned something but it seems this is no different than compilers environment variables logics.
    As for the sieve, I totally get why it would not be included in the competition.
    Big fan. Awesome stuff. I should have started with that.
    Ciao!

    • @bretthunter2828
      @bretthunter2828 Před 3 měsíci +4

      I thought the same - if you want fib(35) hard coded why bother even using a constexp "function" but just put the value in directly - you are just as likely mistype the functional code as you would type the exact number in from google or calculator. The only upside would be if you are unsure at the time of writing the main program what fib number you want to use - and hence easy to change to say fib(22)

    • @MatthewMakesAU
      @MatthewMakesAU Před 3 měsíci

      ​@@bretthunter2828it's a lot easier to review code for errors than a magic number

    • @germanassasin1046
      @germanassasin1046 Před 3 měsíci

      ⁠@@bretthunter2828what you experience is what is called a failure of imagination. I also thought like you, but when I saw what kind of sorcery people came up with I was shocked. For example I saw a compile time regex parser library, if you know regex it can construct state machine at compile time which is an incredible speed up. And it was simple to use!

    • @-_James_-
      @-_James_- Před 3 měsíci +3

      To give a concrete example to what M-FeD was saying, if you have a string and you need to know which of one possible values it contains, you can switch on the runtime-calculated hash of the string and have case statements of compile-time calculated hashes of possible strings to compare against. The alternative would be a ton of if...else if... statements and a bazillion string compares.
      E.g.:
      switch( hash( str ))
      {
      case hash( "colour" ):
      break;
      case hash( "size" ):
      break;
      //etc...
      }
      All you need is a couple of hash functions that can hash strings or string literals, and you're good to go.

  • @IamSholiSJ
    @IamSholiSJ Před 9 dny

    I am not coding anymore, but this is something I would tried in during my college days for coding competitions, good content 🎉

  • @drop0ne_f20
    @drop0ne_f20 Před 10 dny

    I freaking love the intro Dave!!!

  • @AdvancedSoul
    @AdvancedSoul Před 3 měsíci

    The coolest compile time practical code I've seen in use in the wild is Zig's regex compiler

  • @palaashatri
    @palaashatri Před 3 měsíci

    Hahaha love the intro, and the video. Will definitely share it with my colleagues!

  • @idaknow716
    @idaknow716 Před 3 měsíci

    Thank you for another interesting programming video.. I confess that much of it is jibberish, but with the meager programming i have learned thus far, I am left with this question.. Do you recommend any curriculum, online or institutional to learn programming? Reading the code from previous videos and from your github, I can understand the flow as if in a fog, but lack the base knowledge that might make it more clear. Just curious to ask.
    Loved the intro too.. I have seen so many internet meme's using that scene... :)

  • @VoidloniXaarii
    @VoidloniXaarii Před 3 měsíci

    Thank you! Might be that your explanation finally made it stick 4 me

  • @rohann8073
    @rohann8073 Před měsícem

    this is awesome
    thanks Mr. Plummer

  • @CremeDeLaCoulee
    @CremeDeLaCoulee Před 3 měsíci

    For the sieve prime problem, it shouldn't be necessary to separately calculate the square root of the upper bound before starting the algorithm. It seems like a more efficient algorithm might be ... As we're marking off the multiples of primes, keep track of the multiplier in a local variable incremented at each pass, and if the multiplier is less than or equal to prime itself after we've passed the upper bound, then we know we are at/past the square root of the upper bound and can call it quits on marking off more multiples of primes.
    The cost of this approach is two new operations -- incrementing the multiplier in the inner loop (while crossing off each multiple of the prime) and a comparison at the end of each prime loop to check whether this multiplier is less than or equal to the prime itself (and if so, then break from the outer loop). The second operation is a substitute for comparison against the pre-calculated square root, so isn't really an additional operation, it is just placed after rather than before. So, unless the upper bound is the square of a prime, we will have to run one additional prime to know we have reached this point, as compared to the pre-calculated square root which will tell us before we run that prime.
    The time-saving on this approach would depend partially on the efficiency of the dedicated square root algorithm (I'm skeptical on the efficiency of the one shown...). And the relative cost of the increment operation that's tracking the multiplier. My instinct is that since the multiplier is a local variable on the stack whereas crossing out the prime multiples is an array/memory operation presumably on the heap, its likely the increment will be done concurrently by the CPU while waiting for the memory-write operation to complete, and thus the cost will be nonexistent in practice on a modern CPU. I could be wrong... in which case the additional cost of this incrementing would grow O(n^2) with n = the upper bound... but then what is the complexity of the recursive square root??? Plus there's no chance of optimizing that with concurrent memory operations that need to be done anyways... hmmmmmm?????

  • @ycombinator765
    @ycombinator765 Před 3 měsíci

    So constexpr just compiles my code before I hit the compile button? noice!
    Thanks for this banger

    • @mario50000
      @mario50000 Před 3 měsíci +1

      no
      constant expressions are evaluated at compile time
      the point is that the expression is constant and thus won't need to be calculated during runtime as it was already evaluated and baked in during compilation

  • @kensmith5694
    @kensmith5694 Před 3 měsíci

    Thoughts on using this:
    1) It could be used to precompute the twiddle factors for an FFT if you are working to make the FFT really fast. It is an alternative to writing a program to compute them and then including the text into your C++ source.
    2) In a hardware thing I needed to know the best combination of what was loaded into two registers of a device. One register could selected between 7 clock sources and the other was the (integer) factor to divide by. The goal was to hit an exact frequency as best I could. I solved this again by writing code that figured it out and spit it out as text to include.

  • @HritikV
    @HritikV Před 3 měsíci

    the starting scene was magnetic ! taught so much by baiting us all

  • @rikschaaf
    @rikschaaf Před 3 měsíci

    If you know an upper bound to computation and the result is very memory efficient (like with the sieve only having 1 integer as an output per function call) and you know that the function would have been called more often than it would be called by pre-computing upto the upper bound, then it might be worth to calculate it at compile time.

  • @JxH
    @JxH Před 3 měsíci +2

    In a sense, it's a small step away from just working out the answer in advance, and then just outputting the answer at "Run" (sic) time.

  • @aniketbisht2823
    @aniketbisht2823 Před 3 měsíci

    You can also use the entirety of (C++23) and allocate at compile time (C++20) to perform arbitrary computations at compile time.
    Constexpr should be the reason alone for many C bois to consider using C++ (another compelling features being RAII, templates and type-safety).

  • @hhlavacs
    @hhlavacs Před 3 měsíci

    Nice video about an important topic. Thanks Dave! constexpr is really a super power! People have implemented constexpr ray tracers! Just one side note: There is a closed form formula for Fibonacci numbers (no recursion required), which can be evaluated in microseconds as well, without any computational limitations on the number. It is based on linear algebra, eigenvectors and the eigenvalues (1+sqrt(5))/2 and (1-sqrt(5))/2.

    • @milanstevic8424
      @milanstevic8424 Před 16 dny

      in other words, phi and 1-phi
      where phi is defined as phi = 1+1/phi (or phi^2 = phi+1) and is known as golden ratio
      and 1-phi is known as a golden ratio conjugate (1-phi = -1/phi)
      phi can be declared as a constant phi = 1.6180339887498948482045868343656
      whereas 1-phi = -0.6180339887498948482045868343656
      so due to how floating point numbers work, it's best to instead define
      apc = 0.6180339887498948482045868343656 (absolute phi conjugate)
      and then apply that by doing
      1.0+apc (instead of phi) and -apc (instead of conjugate)

  • @jeffscheeler2532
    @jeffscheeler2532 Před 3 měsíci +5

    Most of this went way over my head!

  • @yephick
    @yephick Před 3 měsíci +2

    Writing embedded code (a USB camera) this `constexpr` allows me to pre-determine many things, including building the complete USB descriptors' set, and just "bake it in" into firmware image

  • @gehirndoper
    @gehirndoper Před 3 dny

    Another main use for constexpr is to allow template initialisation with more complicated values. E. g. you can use something like std::array where iexp(3, 10) computes 3 to the power of 10.

  • @JackRussell021
    @JackRussell021 Před 3 měsíci

    I guess I first starting paying close attention to this when I ran across some more complicated C++ templates, and where one could "turn on" or "turn off" certain class methods based upon the properties of the type being used when instantiating the template. When I was trying to disentangle all of this, I would have liked some compiler diagnostics related to how all of those things are evaluated.

  • @danwilson7581
    @danwilson7581 Před 3 měsíci

    I like this vid Dave. Its back in the 'slightly overhead' range, as im a net dev type, with lots of net. Much better than boring network snoozefests. By the way, ever develop anything that plays with tcp sliding windows?

  • @DavidLindes
    @DavidLindes Před 3 měsíci

    This is cool. And wasn't there when I learned C++ (which I know a bit of, but don't use super often for the kinds of work I do), so... I'd missed the memo! Thanks for sharing!

  • @BoyCcns
    @BoyCcns Před 3 měsíci +1

    A practical use-case: My ui engine uses constexpr crc32 to make string comparison of keywords in the css evaluator a LOT faster.

  • @pleasurewasmine3173
    @pleasurewasmine3173 Před měsícem

    i envy your memory and general cognitive ability Dave. You are a fortunate person to have such a beautiful mind. in a world where everything seems to be a race against time, you have seen this over the years and better yet you were able to keep up with it.

  • @thomaswilliams690
    @thomaswilliams690 Před 3 měsíci

    Wow, amazing! Now to try some things.

  • @arnold5328
    @arnold5328 Před měsícem

    I‘m so happy to find this randomly

  • @shm236
    @shm236 Před měsícem

    Finally, just what my todo app needed.

  • @michaelbauers8800
    @michaelbauers8800 Před 3 měsíci

    I pre upvoted due to this (a) being Dave's Garage which I respect, (b) C++, and (c) constant expressions, which I love. const expressions are easier to understand than some crazy template metaprogramming to generate primes :)

  • @The_Pariah
    @The_Pariah Před 3 měsíci

    It is intimidating how well you understand code.

  • @codeinnovator
    @codeinnovator Před 3 měsíci +1

    I guess the limitation is that the input should also be something that can be deterministically determined before hand.

  • @drewmortenson
    @drewmortenson Před 3 měsíci

    I'm being brought onto a project with an associate professor at my university, he needs me to write a GUI and optimize his python code. I'll be re-writing it in C++, and there are a lot of physics calculations so I'm certain this will be helpful for setting constants needed in those calculations.

  • @johannesschneider1784
    @johannesschneider1784 Před 3 měsíci

    This is such an excellent explanation of constexpr omg

  • @OZtwo
    @OZtwo Před 3 měsíci

    This is cool! Or another way to think of it is that constexpr is more or less converting you code into an executable script.

    • @dutchdykefinger
      @dutchdykefinger Před 3 měsíci

      yeah it's like evaluating the code, then pasting the output back as constants lol

  • @fiffy6572
    @fiffy6572 Před 3 měsíci

    i didn't know baout constexpr, thank you!

  • @BGFutureBG
    @BGFutureBG Před 18 dny

    Small addition: constexpr functions do NOT guarantee execution at runtime, they enable it. constexpr functions can be invoked with runtime values for which they will compute the result during runtime. This is why the C++20 standard introduced the "consteval" keyword, which actually forces the compiler to never execute a function during runtime.

  • @jimchoate6912
    @jimchoate6912 Před měsícem

    I love that scene, watched it a thousand time.

  • @MoSiddeek
    @MoSiddeek Před 3 měsíci

    Great work! I still think these are contrived examples however. How does the consexpr implementation compare to hardcoding pre-populated isPrime[] or fib[]? I can't see the advantage really.

  • @oblivion_2852
    @oblivion_2852 Před 3 měsíci

    My favourite video on this topic is "Rich code for tiny computers" where Jason Turner writes very complex C++ functions that generate all sorts of tables for the c64.

  • @pcdoodle1
    @pcdoodle1 Před 3 měsíci +2

    that intro is gold

  • @SoapSoapCrayon
    @SoapSoapCrayon Před 3 měsíci

    If you're wondering (and on windows) the main reason this took 38 microseconds was due to std::endl. It has an implicit os.flush() which can take a while on most platforms, especially windows. You can just output a "
    " which shouldn't flush. It won't change the outcome much (though it's loads faster), constexpr is amazing, and this video is great. If you're looking at very time sensitive things, it's probably better to sample the clock before/after the calculation then print it all in one big go, rather than measure the print too! Sorry for nitpicking!

  • @danthepyroman1
    @danthepyroman1 Před 3 měsíci

    I did something similar for solving Dao using minimax with precomputing moves. Very interesting that the compiler can perform the work instead.

  • @khaledf3977
    @khaledf3977 Před 16 dny

    What greet way of explaining thanks

  • @BMW_M-Performance
    @BMW_M-Performance Před 3 měsíci

    That intro was awesome!

  • @stke1982
    @stke1982 Před 3 měsíci +1

    Well, in practice the performance improvements could have also be achieved otherwise, but constexpr allowed for a lot clearer code and less magic numbers or handcrafted LUT, thus less errors.
    However if you work on a big project you want to keep compile times reasonably small and should be careful with constexpr that take very long to compile (or have a compile toggle to use the expensive ones)

  • @hebozhe
    @hebozhe Před 3 měsíci

    I'm subbing for the intro, alone.

  • @kasufert
    @kasufert Před 3 měsíci +1

    That intro skit hit diff

  • @hongkongcantonese501
    @hongkongcantonese501 Před 3 měsíci

    I never thought a C++ joke(!) would be so gd funny! Gonna re-read Petzhold for the gags. :)

  • @strehlow
    @strehlow Před 3 měsíci +1

    So you're using the compiler as an interpreter? Does it compile the code faster than the same algorithm runs it?

  • @Sluggernaut
    @Sluggernaut Před 2 dny

    You at the end of the video: "This is all probably fairly new to you"
    Me: Yep this super new technique is amazing!
    Then I go do some research, look at your github and follow the trail to the post about constexpr square root function. Aaaaannnnddd its a post that will turn 10 years old next month?! wtf? How have I never seen this? Forget that it has to be ran at compile time and the long-ish time it takes to compile. It's still very very cool.
    Thanks for this video!

  • @MikesGoAcademy
    @MikesGoAcademy Před 2 měsíci

    I was once asked in a job interview to write conversion to roman numerals and about midway i realized, it can be a constexper and showed to to the interviewer, he was blown away

  • @melodicmonster
    @melodicmonster Před 3 měsíci

    I use constexpr with specializations involving integer_sequence to generate lookup arrays for sequences like this so runtime calls can also benefit. Fibonacci(x) just becomes a wrapper for the array index, e.g. return x < size ? sequence[x] : some_error_expression;