All about MEMORY // Code Review

  • Added on 25 June 2024
  • Keep exploring at brilliant.org/TheCherno/ Get started for free, and hurry: the first 200 people get 20% off an annual premium subscription.
    Patreon ► / thecherno
    Instagram ► / thecherno
    Twitter ► / thecherno
    Discord ► / discord
    Code ► github.com/St0wy/GPR4400-Phys...
    Designing a Physics Engine in 5 minutes ► • Designing a Physics En...
    Send an email to chernoreview@gmail.com with your source code, a brief explanation, and what you need help with/want me to review and you could be in the next episode of my Code Review series! Also let me know if you would like to remain anonymous.
    CHAPTERS
    0:00 - 2D Physics Engine
    13:17 - Heap allocation, memory fragmentation and the CPU cache
    18:30 - Logging performance considerations
    23:12 - Smarter memory allocators
    29:12 - Allocate memory once ahead of time when possible
    This video is sponsored by Brilliant.
    #CodeReview

Comments • 214

  • @TheCherno
    @TheCherno 1 year ago +183

    Who's excited for part 2?
    Keep exploring at brilliant.org/TheCherno/ Get started for free, and hurry: the first 200 people get 20% off an annual premium subscription.

    • @pushqrdx
      @pushqrdx 1 year ago +2

      Can you please point me to that sweet Visual Studio color scheme you're using?

    • @heman922
      @heman922 1 year ago +3

      Please do a series on the CPU

    • @PhoenixDigitalGamer
      @PhoenixDigitalGamer 1 year ago +1

      Can you make a tutorial on creating a game engine cinematics system? Please :)

    • @mr.anderson5077
      @mr.anderson5077 1 year ago

      yes please

    • @harshsulakhe2720
      @harshsulakhe2720 1 year ago +3

      Can you please make a complete video on the assembler, similar to the ones on the linker and the compiler?

  • @Stowy
    @Stowy 1 year ago +376

    Thanks a lot for looking at my code! For the logging, I was using spdlog, but then I removed it because I wasn't able to import it using FetchContent haha. This is very useful feedback and I can't wait for part 2!

    • @ThePowerRanger
      @ThePowerRanger 1 year ago +3

      Cheers, good luck for your classes!

    • @blazefirer
      @blazefirer 1 year ago +18

      2rd

    • @Stowy
      @Stowy 1 year ago +6

      @@blazefirer English is not my first language lol, my b

    • @blazefirer
      @blazefirer 1 year ago +14

      @@Stowy It's ok. I saw that there was only one reply and I would be the 2nd, so I couldn't resist making the joke

    • @ohmree
      @ohmree 1 year ago +1

      I suggest taking a look at xmake as a replacement for cmake, it probably has spdlog in its repos and is just a pleasure to use in general.

  • @anon_y_mousse
    @anon_y_mousse 1 year ago +202

    As a predominately C developer, I agree with and applaud his choice of adding "pp" to the end of file names to differentiate C and C++ header/source files. They are separate and it should be noted. Arena allocators are a good idea and I've implemented several that I use in my own libraries. Heap allocation need not always be super expensive, even with "vectors", and the mitigation technique I learned years ago that still works beautifully to this day is to scale by a factor of two and always reserve memory starting at some power of two. As to the comment about logging, yes, it is a Windows "feature" to slow down by such a large factor when logging to a console. If you use a Linux distro of nearly any variety you'll be surprised by how quick the terminal updates as compared to Windows.
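
A minimal sketch of the doubling strategy described in the comment above, assuming a hypothetical `IntArray` type and an initial capacity of 16 (both are illustrative and not taken from the reviewed project). Capacity always moves to the next power of two, so the number of heap calls grows with the logarithm of the element count:

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical growable int array; all names here are illustrative only.
struct IntArray {
    int* data = nullptr;
    std::size_t size = 0;
    std::size_t capacity = 0;
    std::size_t reallocations = 0;  // counts how often we went to the heap
};

// Smallest power of two >= n, never below the assumed initial capacity of 16.
inline std::size_t next_power_of_two(std::size_t n) {
    std::size_t cap = 16;
    while (cap < n) cap *= 2;
    return cap;
}

// Appends a value, doubling the capacity whenever the array is full.
inline bool push(IntArray& a, int value) {
    if (a.size == a.capacity) {
        std::size_t new_cap = next_power_of_two(a.capacity + 1);
        int* p = static_cast<int*>(std::realloc(a.data, new_cap * sizeof(int)));
        if (p == nullptr) return false;  // old block is still valid on failure
        a.data = p;
        a.capacity = new_cap;
        ++a.reallocations;
    }
    a.data[a.size++] = value;
    return true;
}

// Releases the storage and resets the array to its empty state.
inline void destroy(IntArray& a) {
    std::free(a.data);
    a = IntArray{};
}
```

With this scheme, pushing 1000 values goes through the capacities 16, 32, 64, and so on up to 1024, which is only a handful of reallocations.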

  • @mr.anderson5077
    @mr.anderson5077 1 year ago +10

    Cherno has a huge backlog of "the topic for another video", please keep it coming. Yes, we do want CPU cache, memory fragmentation, and whatnot in the multiverse video

  • @paligamy93
    @paligamy93 1 year ago +2

    @8:13 I would not recommend ever starting an identifier with _ because it's too easy to make a mistake: "Use of two sequential underscore characters ( __ ) at the beginning of an identifier, or a single leading underscore followed by a capital letter, is reserved for C++ implementations in all scopes."
    @13:31 Not only do you want to be using pointers, but ask yourself "Do I need a hierarchy or do I just need several implementations of void ClassName::Update(float deltaTime)"? Because if you don't need a hierarchy, don't use one! Use type erasure. Yes, it's still a function pointer and a potential cache miss, but it will simplify your code structure. Now you have a folder called Entities instead of a type named Entity that everything derives from, and your type-erased entity type now defines the contract a type must fulfill to be an entity, instead of saying you HAVE to derive from Entity to be useful here.
    @14:51 Also known as a "cache miss", because the writer was not as concerned about "cache locality".
    @19:59 std::ios_base::sync_with_stdio(false) improves that time considerably, but C++ iostreams are notoriously slow, and the reason is all the safeguarding overhead they carry. The console is slow because it has to render, which as you know is bleh. Logging libraries are the way to go in this case, and have them output to files rather than to the console. This is a graphical program, so there shouldn't be a "console out" anyway. Create a new global logger named log or something at the very least. There are multithreaded logging libraries that will attempt to put your logs in chronological order if you don't want to split them.
    @25:41 On the virtual part: the operating system will allocate you a "page" of memory when your current page is full, so it's basically the same thing as a small arena allocator, but it's so much smaller than what an arena allocator will give you, and the many, many system calls to the OS to ask for more "pages" are what make allocation take so long. You're giving your CPU cycles over to the OS, and that's going to mess up your instruction cache because it's code that's not in your program that's being called. malloc or whatever is going to be a function in a dynamic library, aka a function pointer, and more cache misses. Profile your system calls! You may find more than you expect. Also align your types (this adds padding) so that when you do ask for a value it's not going to have to fetch 2 cache lines because half of your object is on one line and the other half on another.
    @31:47 It looks like the number you're looking for is already computed with the collision pairs as well. You seemed to know you needed to make a vector but made it too early! Make instances as close as possible to where you use them.
    @32:09 I think the multiple solver problem is something that should be handled with a template. From what I saw, you don't need to change your solver dynamically at runtime with the same types. Make your solver something the compiler figures out.
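
The type erasure idea from the 13:31 note can be sketched roughly as follows. `AnyEntity`, `Ball` and `Timer` are hypothetical names invented for this illustration; the only contract assumed is a member `Update(float)`. User types do not derive from anything, and the virtual call is hidden inside the wrapper:

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical type-erased wrapper: any type with Update(float) qualifies.
class AnyEntity {
public:
    template <typename T>
    explicit AnyEntity(T value)
        : self_(std::make_unique<Model<T>>(std::move(value))) {}

    void Update(float deltaTime) { self_->Update(deltaTime); }

private:
    // Internal interface; user types never see or inherit from it.
    struct Concept {
        virtual ~Concept() = default;
        virtual void Update(float deltaTime) = 0;
    };

    // Adapter that forwards to the wrapped object's own Update.
    template <typename T>
    struct Model final : Concept {
        explicit Model(T v) : value(std::move(v)) {}
        void Update(float deltaTime) override { value.Update(deltaTime); }
        T value;
    };

    std::unique_ptr<Concept> self_;
};

// Two unrelated example types with no common base class.
struct Ball {
    float* y;
    float vy;
    void Update(float dt) { *y += vy * dt; }
};

struct Timer {
    float* elapsed;
    void Update(float dt) { *elapsed += dt; }
};

// Updates every entity through the erased interface.
inline void UpdateAll(std::vector<AnyEntity>& entities, float dt) {
    for (AnyEntity& e : entities) e.Update(dt);
}
```

As the comment itself notes, this still costs an indirect call and a heap allocation per entity, so it simplifies the code structure rather than the memory layout.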

  • @simonesasso8379
    @simonesasso8379 1 year ago +124

    Yes, implementation and profiling of the optimizations would be super interesting to see!

    • @crumbled9774
      @crumbled9774 1 year ago +4

      yes yes yes. Can't wish for anything better!

    • @ChrisM541
      @ChrisM541 1 year ago +2

      Totally agree, that would be super interesting.

    • @ibrahimmahdi1299
      @ibrahimmahdi1299 1 year ago

      can't wait for a video like that from the best "TheCherno"

  • @miguelguthridge
    @miguelguthridge 1 year ago +8

    At 14:40 where you're talking about cache misses, there's a relevant article which is really good called "Your computer is not a fast PDP-11"

  • @thwKobas
    @thwKobas 1 year ago +39

    I left C++ about 7 years ago, and this brings back so many memories and a smile to my face. I've been watching your videos for a few weeks now and must say good job and keep uploading. :)

    • @tathagatmani
      @tathagatmani 1 year ago +1

      What did you switch to?

    • @matthewe3813
      @matthewe3813 1 year ago +1

      @@tathagatmani probably Rust or C

    • @thwKobas
      @thwKobas 1 year ago +3

      @@tathagatmani Actually I switched first to Objective-C and then Swift :D Doing iOS mobile development now

    • @Alperic27
      @Alperic27 1 year ago

      C++ has evolved a looot … but he seems to be stuck in C++9x style.

  • @crystalferrai
    @crystalferrai 1 year ago +6

    31:45 Good advice about preallocating vectors. If this is a function that runs every frame, I would take it a step further and make the vectors persistent. Clear them at the start of the function and reuse them. This way the memory remains allocated and keeps getting reused. Another option would be to use an auto-resetting frame allocator like you mentioned earlier. However you go about it, the main idea is to not make new heap allocations every frame.
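
A rough sketch of the "persistent vector" idea from the comment above, assuming a hypothetical `PhysicsStep` class and `CollisionPair` struct (neither name comes from the reviewed code). The vector is a member that is cleared and refilled each frame, so after the first frame the same buffer is reused:

```cpp
#include <cstddef>
#include <vector>

struct CollisionPair {  // hypothetical stand-in for the engine's pair type
    int a;
    int b;
};

class PhysicsStep {
public:
    // Called every frame; reuses the member buffer instead of a new local vector.
    void Run(int bodyCount) {
        pairs_.clear();  // size drops to 0, the allocated capacity is kept
        for (int i = 0; i < bodyCount; ++i)
            for (int j = i + 1; j < bodyCount; ++j)
                pairs_.push_back({i, j});
    }

    std::size_t PairCount() const { return pairs_.size(); }
    std::size_t Capacity() const { return pairs_.capacity(); }

private:
    std::vector<CollisionPair> pairs_;  // persists across frames
};
```

Calling `Run` twice with the same body count leaves `Capacity()` unchanged on the second call, which means no new heap allocation happened in that frame.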

  • @ThePhyskid
    @ThePhyskid 1 year ago +19

    I'd really be interested in seeing how you add the optimizations. In particular, I'd be interested in seeing how you clean up the memory used by the arena allocator once you're left with holes.

  • @jonathangrahl
    @jonathangrahl 1 year ago +8

    Great topic! This has been on my mind these last few weeks while implementing my path tracer and SAH BVH, and the optimisations really add up. Especially referring to objects by index and storing them in a 1D array.

  • @Klusio19
    @Klusio19 1 year ago +5

    I just started learning C++, currently I (I think) finished learning OOP concepts, and this video is so interesting for me actually! The stuff about the memory and access times is pretty interesting.

  • @Mnmn-xi6cj
    @Mnmn-xi6cj 1 year ago +17

    Would love to see you profiling this after your first look at it. I'm sure the stack allocation and growing of the vector each frame hits like a truck. That would also allow you to show some before/after benchmarks!

  • @sixtenhugosson
    @sixtenhugosson 1 year ago +5

    If anyone wants to learn more about memory arenas, there's a good write-up called "Untangling Lifetimes: The Arena Allocator" by Ryan Fleury.

  • @Basel-ll8fj
    @Basel-ll8fj 1 year ago +4

    this series is really fun to watch and very helpful

  • @ShaunYCheng
    @ShaunYCheng 1 year ago +16

    I'm not a game dev but this is still very educational.

  • @jeffcummings3842
    @jeffcummings3842 1 year ago +6

    You really caught my attention when talking about the CPU cache, as I've done some work with Assembly Language programming WAY in the past, but yeah, understanding how that works is an amazing detail for optimization. OMG, great idea with the logging to a file vs console, I'm just getting to the point in my project where it's starting to become medium sized, and logging is an issue already, so great to know that logging to files is more efficient...plus the macros... it probably helps that I am watching your video at a time when I'm considering re-working my entire codebase for my main project too. LOL OMG, that's amazing that you can pre-allocate memory and pass an allocator to the vector class, I'm totally going to look into this and try it! Great video, thanks for sharing.

  • @HLCaptain
    @HLCaptain 1 year ago +12

    What I would like to see is you optimizing a project based on the recommendations you gave in this video, then comparing the results with the unoptimized solution via a profiler. Would be super interesting! Great video though! :)

  • @SkyCityInc
    @SkyCityInc 1 year ago +4

    This is awesome, makes me want to write my own physics engine as an exercise. Can't wait for the next video!

  • @aaron6807
    @aaron6807 1 year ago

    FINALLY! I'VE BEEN WAITING FOR THIS EPISODE FOR AGES

  • @fellypsantos_
    @fellypsantos_ 1 year ago +4

    extremely valuable knowledge passed here, thanks Cherno ♥

  • @StevenMartinGuitar
    @StevenMartinGuitar 1 year ago +2

    Would def love to see you profile this and then implement the optimisations and profile again! (threading, arena, allocators, less heap etc) great video!

  • @atraps7882
    @atraps7882 1 year ago +2

    I'm not even a game developer, I just work on the web and the cloud doing backend stuff, but this is really interesting to watch. Subbed!!

  • @Sebanisu
    @Sebanisu 1 year ago +1

    Just realized you are still doing code reviews and this one had 3 videos. So Now I got my afternoon planned out heh.

  • @uploadschedule
    @uploadschedule 1 year ago +2

    At the moment I don't have time to watch it, but I will watch this vid later. I'm sure it's interesting, because videos about how hardware components work are always something I like learning about :D

  • @mementomori7160
    @mementomori7160 1 year ago +2

    I really liked this video, all in for part 2

  • @darioabbece3948
    @darioabbece3948 1 year ago +1

    The project: c++ gameplay
    The cherno explanations: c++ lore

  • @jef777
    @jef777 1 year ago +4

    This main function looks so nice. I wish mine could look so inviting.

  • @douglasullman
    @douglasullman 1 year ago +1

    I've been loving your stuff and gotta say the plug for Brilliant is brilliant! I'm going to check that out. Thank you so much, Sir.

  • @viraatchandra8498
    @viraatchandra8498 1 year ago +3

    For simple C++ logging, you can look at the `sync_with_stdio(false)` and `std::cin.tie(NULL)` calls to accelerate your `cout` code a bit. `printf` will in general be faster though, because it doesn't deal a lot with multi-threaded scenarios. There are even faster ways to output logs, but of course, it's non-trivial overhead.

  • @cyphre117
    @cyphre117 1 year ago +1

    Would be great to hear you talking about Static vs Dynamic libraries!

  • @IkeVoodoo
    @IkeVoodoo 1 year ago +4

    Great video, though each time I wish that we could see the final optimized version of the project :D

  • @Beatsbasteln
    @Beatsbasteln 1 year ago +1

    this was fascinating. can you make a video about how to make an arena allocator and then show how you use it when creating vectors?

  • @thehambone1454
    @thehambone1454 1 year ago +1

    Would love a video about the CPU cache and related topics!

  • @ricardopieper11
    @ricardopieper11 1 year ago

    This is the 1th The Cherno video I watch

  • @Thomas_Lo
    @Thomas_Lo 1 year ago

    Cool reference video for quite a lot of topics. Works well as a refresher :-)

  • @squelchedotter
    @squelchedotter 1 year ago +3

    I wouldn't expect that the virtual memory thing matters all that much considering current CPUs don't prefetch across page boundaries anyway. But things like huge pages do have advantages in terms of TLB lookups and hit rates.

  • @-infality
    @-infality 1 year ago +5

    Regarding the slow Windows terminal you may be interested in Casey Muratori's videos about it and his refterm prototype project

    • @Macuyiko
      @Macuyiko 1 year ago

      Was going to mention that as well. He goes into some interesting details about conhost if I remember correctly which is doing a lot of crazy things that make consoles slow on Windows.

  • @nathantonning
    @nathantonning 1 year ago

    Great code review.

  • @on-hv9co
    @on-hv9co 1 year ago +1

    I do something very similar with that log macro. It's essentially just an X macro that wraps cerr and uses the ANSI color codes. From there, DLOG and RLOG are called and will log their respective debug/(sparse) release output.

  • @tolkienfan1972
    @tolkienfan1972 1 year ago +3

    Often the dependencies between chained pointers are more important than the fragmentation. I.e. you could explicitly construct a linked list in contiguous memory, but iterating will still involve the CPU waiting for each load to complete before it can calculate the next pointer. Iterating over the exact same nodes, but using an index instead of the next pointers, will be much faster. The CPU can prefetch the cache lines.
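
The two traversals contrasted in the comment above might look like this. It is only a sketch with hypothetical names (`Node`, `MakeContiguousList`, `SumByPointer`, `SumByIndex`); both loops visit the same contiguous nodes and produce the same sum, and the difference is only in how the next address is obtained. No timing claim is made here, so measure on your own hardware:

```cpp
#include <cstddef>
#include <vector>

struct Node {
    int value = 0;
    Node* next = nullptr;
};

// Builds a linked list whose nodes sit contiguously and are linked in order.
inline std::vector<Node> MakeContiguousList(std::size_t n) {
    std::vector<Node> nodes(n);
    for (std::size_t i = 0; i < n; ++i) {
        nodes[i].value = static_cast<int>(i);
        nodes[i].next = (i + 1 < n) ? &nodes[i + 1] : nullptr;
    }
    return nodes;
}

// Pointer chasing: the address of each load depends on the previous load.
inline long long SumByPointer(const std::vector<Node>& nodes) {
    long long sum = 0;
    const Node* p = nodes.empty() ? nullptr : nodes.data();
    for (; p != nullptr; p = p->next) sum += p->value;
    return sum;
}

// Index walk: every address is computable up front, independent of loaded data.
inline long long SumByIndex(const std::vector<Node>& nodes) {
    long long sum = 0;
    for (std::size_t i = 0; i < nodes.size(); ++i) sum += nodes[i].value;
    return sum;
}
```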

  • @Spartan322
    @Spartan322 1 year ago

    Terminal logging is slow in C++ because most streams, especially cout, tend to flush constantly, whereas most file logging implementations in C++ don't perform a constant and immediate flush for every input.

  • @christopherprobst-ranly6357
    @christopherprobst-ranly6357 8 months ago

    Strong argument for using .hpp: a potential user does not need to think about extern "C". If it's .hpp, it can only be included directly from C++. .h leaves a lot of room for speculation. Can you include it from C? Can you include it from C++? Do you NEED extern "C"? It's there for a reason.

  • @enigma7791
    @enigma7791 1 year ago +3

    Yes if you could look at your optimisations and the effect on performance that would be really cool! Often I spend too much time optimising code for very little return. EDIT...but I do note the FPS is massive here anyway so it is difficult to quantify if it's worth it. Maybe throw in something that really puts a strain on the FPS and see the optimisations make it smooth again? Either way great code Stowy and great review Cherno.

  • @F1nalspace
    @F1nalspace 1 year ago +2

    Nice project and good talk about memory improvements! Memory arenas and transient memory are great and are my most used techniques when I program these days.
    If you are interested, I have a similar physics project (2D fluid simulation) that is a little bit more complex, due to its multi-threading, integrated benchmark support and 4 versions in different C++ styles, where I tried to show the difference between naive/from-the-book C++ programming and data-oriented programming, but didn't get it exactly right, especially the data-oriented part. Just give me a hint and I will send you the details.

  • @shalip
    @shalip 1 year ago +1

    please release a video where you implement your suggestions. It would be so GREAT !!

  • @MrDenniable
    @MrDenniable 1 year ago

    @19:45 About the huge time consumption of logging... You should check out Trice! It speeds up your logging performance on embedded systems :)

  • @SC2Villares
    @SC2Villares 1 year ago

    Why is that channel so good? Humanity deserves it? Oh my, what a gift!

  • @MrSandshadow
    @MrSandshadow 1 year ago +1

    23:50 it's called 'placement new'
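
For readers who have not seen it, placement new constructs an object in memory you already own instead of asking the heap for it. A minimal sketch, with `Circle`, `CircleSlot`, `ConstructIn` and `DestroyIn` as hypothetical names for this illustration:

```cpp
#include <new>

struct Circle {
    float x;
    float y;
    float radius;
};

// Hypothetical pre-allocated slot, sized and aligned for one Circle.
struct CircleSlot {
    alignas(Circle) unsigned char storage[sizeof(Circle)];
};

// Constructs a Circle inside the slot; no heap allocation takes place.
inline Circle* ConstructIn(CircleSlot& slot, float x, float y, float r) {
    return new (slot.storage) Circle{x, y, r};
}

// Objects created with placement new are destroyed by calling the destructor
// explicitly; the memory itself is not freed here.
inline void DestroyIn(Circle* c) {
    c->~Circle();
}
```

The returned pointer refers to the slot's own storage, which is what lets an arena hand out objects from one large block.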

  • @dealloc
    @dealloc 1 year ago

    The reason it's slow to write to stdout is that things like std::flush, std::endl and newlines ("\n") will flush the contents of the cout buffer to the terminal. This shows up instantly because terminals usually have little or no buffering. The same happens with files on disk, although that is perceived as faster because the contents are not flushed as frequently, due to how the OS buffers them before writing to the file on disk. So it's not that terminals are slow, it's that any I/O is slow in general.
    You can avoid this by flushing the cout buffer less frequently (i.e. outside of loops), but it can be an architectural nightmare and is often not needed, since you're probably more interested in up-to-date info when debugging. Do what Cherno (and many other projects) does and use different levels of logging for more granularity.

  • @sethmoore5903
    @sethmoore5903 1 year ago +1

    I'm curious how the actual defragmentation process works in a game engine and how it affects performance in a simulation where we have lots of circles dying

  • @bu3778
    @bu3778 1 year ago

    damn this was a nice review

  • @nikeedev
    @nikeedev 1 year ago +2

    I read up on the C++ standard 2 months ago, and it said that C++23 (C++2b) will support .h files as standard header files. That doesn't mean .hpp shouldn't be used, but .h will stay supported: there were plans to phase it out, but since it is used a lot in both C and C++, they will keep it

    • @ultimatesoup
      @ultimatesoup 9 months ago

      You can actually get rid of headers entirely if you use modules

  • @MosiuoaF
    @MosiuoaF 1 year ago

    Thank You!

  • @simonkufeld7903
    @simonkufeld7903 1 year ago

    this channel should have more subs

  • @TuxikCE
    @TuxikCE 1 year ago +4

    Pls bring more of these code reviews!

  • @wright777
    @wright777 1 year ago

    For better std::cout -> console performance:
    1. Call std::ios_base::sync_with_stdio(false);
    2. Call std::cin.tie(nullptr);
    3. Use '\n' instead of std::endl

  • @cloud9sl98
    @cloud9sl98 1 year ago

    WORKING thx bro

  • @stdc_tri
    @stdc_tri 1 year ago +1

    As a convention, I think using p_Member for protected members is better than m_Member; it just makes it clearer, in my opinion.

  • @Overminddl1
    @Overminddl1 1 year ago

    Logging to the console on Windows is indeed substantially, like Substantially, slower than on Linux. However, there are ways to speed it up as well, both by using Microsoft's new terminal and by buffering in the program instead of flushing every single log immediately. Still not as fast as on Linux, but it helps a ton.

  • @deconline1320
    @deconline1320 1 year ago +2

    We see it often in code, but in C++ it's not a good idea to start a variable identifier with an underscore. Some combinations of single/double underscore identifiers are reserved for the compiler implementation by the C++ standard. I would avoid it completely.

  • @billynugget7102
    @billynugget7102 1 year ago

    C++ ALREADY HAS AN ARENA ALLOCATOR. It works for all std structures/containers, even vector. It's called PMR.
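
A minimal sketch of what the comment refers to, using the C++17 `<memory_resource>` header. The function name `SumWithArena` and the 4096-byte buffer size are arbitrary choices for this illustration. The vector draws its storage from a stack buffer through `std::pmr::monotonic_buffer_resource`, and the upstream is set to `std::pmr::null_memory_resource()` so that overflowing the buffer would throw instead of silently falling back to the heap:

```cpp
#include <array>
#include <cstddef>
#include <memory_resource>
#include <vector>

inline int SumWithArena() {
    std::array<std::byte, 4096> buffer;  // stack storage backing the arena

    // Bump allocator over the buffer; with this upstream, running out of
    // space throws std::bad_alloc rather than allocating from the heap.
    std::pmr::monotonic_buffer_resource arena(
        buffer.data(), buffer.size(), std::pmr::null_memory_resource());

    std::pmr::vector<int> values(&arena);
    values.reserve(256);  // 256 ints carved out of the buffer in one step
    for (int i = 0; i < 256; ++i) values.push_back(i);

    int sum = 0;
    for (int v : values) sum += v;
    return sum;
}
```

A monotonic resource never reuses memory released by individual deallocations; it is released all at once when the resource is destroyed, which suits per-frame scratch data.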

  • @kursatyakupkukul7670
    @kursatyakupkukul7670 1 year ago +1

    Wow, really enjoyed this one as a non game/game engine developer!

  • @featherless656
    @featherless656 1 year ago +2

    I wish I could find the motivation and smarts to be able to do stuff like this

  • @GautamSharma-un3cr
    @GautamSharma-un3cr 1 year ago

    Please make a video on how to exploit cache lines and CPU cache in order to build blazing fast applications

  • @MrFlyingChip
    @MrFlyingChip 1 year ago

    Haven't seen this in the comments, so I will leave it here. There's an article called "What Every Programmer Should Know About Memory". It explains in detail how the CPU works with memory, how RAM works, why it's so slow, and why CPU cache memory is so fast. I really recommend reading it (you only need to read the first 3-4 chapters).

  • @rajpootmhm
    @rajpootmhm 1 year ago

    Please make a video on handling big data
    Along with memory management and time complexity

  • @m3taldragon1
    @m3taldragon1 1 year ago

    Certain IDEs require you to use hpp vs just h if you are using any C++.

  • @andreidumitras4237
    @andreidumitras4237 1 year ago

    What color scheme do you use?
    Awesome video btw.

  • @codemastercpp
    @codemastercpp 1 year ago

    To speed up console output,
    you can unsync from stdio:
    ```
    std::ios_base::sync_with_stdio(false);
    std::cin.tie(nullptr);
    ```

  • @SETHthegodofchaos
    @SETHthegodofchaos 1 year ago

    15:20 Is there a difference between an "Entity Component" system and an "Entity Component System" system/architecture? Both can be implemented with a data-oriented memory layout, correct?

  • @christopherprobst-ranly6357
    @christopherprobst-ranly6357 8 months ago

    Logging on Linux/macOS: Yes, their terminals are orders of magnitude faster than on Windows. The reason is that they are implemented totally differently, and the console on Windows is just slow. I read somewhere why it's hard to change. But files are always faster, that's true.

  • @draco5991rep
    @draco5991rep 7 months ago

    I just started programming in C and I wonder a lot about when to use the heap and when to use the stack. Because I am more comfortable using the stack, I predominantly put all data onto the stack. Is there an easy rule of thumb for when to use one or the other?

  • @freandtuber
    @freandtuber 1 year ago

    Maybe it's time to have a look into OpenMP for loading and shaping allocated memory 🤔

  • @DanteWolfwood
    @DanteWolfwood 1 year ago

    you suggested allocating things like rigid body to the stack because of cpu optimizations but shouldn't the programmer worry about space? Are you banking on the fact that vectors allocate on the heap contiguously? Or should there be a specific buffer created or contiguous heap memory?

  • @HappyHorge
    @HappyHorge 1 year ago +18

    javidx9 has some quite excellent videos on how you can make games in C++ and programming in embedded systems, which is really nice if you're into that kind of low level programming 😄 Low Level Learning is also a great channel for that kind of knowledge 😄

    • @paligamy93
      @paligamy93 1 year ago

      CppCon and CppNow also great channels for the more advanced. Amazing talks by Michael Caisse and Luke Valenty this year about what can be done with compile time programming and the type system.

  • @xxdeadmonkxx
    @xxdeadmonkxx 1 year ago

    I really want to know how you would deallocate an item from a custom memory pool (arena?)

  • @davidcmoffatt
    @davidcmoffatt 1 year ago

    There are more benefits to contiguous data storage. Cutting down on TLB misses and VM page misses comes to mind.

  • @frankhaugen
    @frankhaugen 1 year ago

    The reason why writing to the console is slow is that Windows assumes a window, so the output goes through the UI interop layer, while file writing is just bits on disk

  • @mobslicer1529
    @mobslicer1529 1 year ago

    with logging what i do is for stuff that gets called all the time i only log failures so you know what happens with those but don't flood the log.

  • @roz1
    @roz1 1 year ago

    @Cherno We can do calloc rather than malloc which will be a contiguous allocation .... that can help but still it can't beat the stack memory.

    • @TheCherno
      @TheCherno 1 year ago

      Both calloc and malloc return a contiguous allocation of memory - there’s actually very little difference between how those two work
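
To make the reply above concrete: both functions hand back a single contiguous block, and the practical difference is that `calloc` zero-fills it. The wrapper names `AllocUninitialized` and `AllocZeroed` below are hypothetical and exist only for this sketch:

```cpp
#include <cstddef>
#include <cstdlib>

// One contiguous block for `count` ints; contents are indeterminate.
inline int* AllocUninitialized(std::size_t count) {
    return static_cast<int*>(std::malloc(count * sizeof(int)));
}

// One contiguous block for `count` ints; contents are zero-filled.
inline int* AllocZeroed(std::size_t count) {
    return static_cast<int*>(std::calloc(count, sizeof(int)));
}
```

Either pointer must be released with `std::free`, and neither call says anything about where the block sits relative to other allocations.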

  • @BradenBest
    @BradenBest 1 year ago +1

    I wouldn't worry about fragmentation. It's the heap allocator's job to worry about managing that. And in the general sense, as long as you free memory in the opposite order that you allocated it, fragmentation will not be a problem. I say this as someone who has implemented malloc+free in C. To get a memory leak from allocator fragmentation, you would have to do some insanely stupid things. Of course don't just allocate willy nilly from the heap if you don't have to. Heap allocation carries a performance overhead because when malloc has to get more memory, it has to do so via a system call, which means a context switch, which is slow. That's the `sys` metric given by the `time` command.
    Regarding specifically what is said in the video, where you go into low level machine details like the CPU cache, I especially wouldn't worry about that, because that's premature optimization. Worry about choosing efficient algorithms, not about how the machine accomplishes a task. That's the compiler's job. Turn on that -O3 flag. Or -Ofast if you're not worried about slightly less precise math. Sometimes you can justify low level optimizations, like when the Quake devs implemented the fast inverse square root using low level floating point math. But then look what happened--the chipset manufacturers and compiler vendors caught up. Nowadays, the quake inverse square root is no faster (and sometimes slower) than code that a compiler will generate for a more straightforward algorithm. I do not recommend wasting your time optimizing for hardware. The compiler has already done it and you can save a lot more time by choosing a better algorithm. C (and by extension C++) is not a low level language, and your computer is not a fast PDP.

    • @BradenBest
      @BradenBest 1 year ago +1

      A big problem with that argument is the assumption that the pieces of data necessarily will be fragmented. It's "whataboutism" taken to the extreme. But let's look at an average case where you allocate 100 small objects using a heap allocator: the heap allocator has a free pool of memory, so it slices a chunk off for both the object and the bookkeeping node to manage that memory, and updates the other node to account for the borrow. It does this over and over again until 89 objects in, the pool doesn't have enough memory. So the allocator will do a context switch asking for more memory. The memory comes from the heap, so it will be adjacent to the previous memory, but it will continue to allocate memory until all objects are allocated. The allocator is smart, it doesn't want to waste CPU time by making a bunch of syscalls to allocate tiny blocks of memory, so it does them in bulk. Pages and pools of memory that it marks up and manages. If the addresses were wildly spread out, that would mean the allocator is allocating random pages for every single allocation request, and all those context switches would be a far worse bottleneck than a cache miss. But as it turns out, the heap grows upward. The addresses are all fairly close together.
      Now, you can optimize your code to assume that the allocator allocates a huge chunk of memory that's all close together, or you can optimize it to assume that the addresses will be far apart, but in the end, that's all you're doing: assuming. The standard says nothing about how the allocator is implemented. Don't assume. Write better algorithms. If the compiler thinks your array of structs will be more efficient if it turns it into individual arrays of the one element you access, it will do exactly that. That's the ultimate lesson: the compiler is better at optimizing than you are.

  • @lionkor98
    @lionkor98 1 year ago

    You can log into a queue, and then flush the queue on a separate thread

  • @Amitkumar-dv1kk
    @Amitkumar-dv1kk 1 year ago

    Do you also review Java code, or is it only C++?

  • @IgnoreSolutions
    @IgnoreSolutions 1 year ago

    I’m surprised you didn’t mention the fact that variables starting with just an underscore are considered reserved by the language.

  • @ByChris
    @ByChris 1 year ago

    How comfortable would you feel about making a C++ Graphics course for udemy?

  • @odarkeq
    @odarkeq 1 year ago

    11:33 The webcam picture quality begins to tank because of the video encoding all the little gaps between so many moving circles. It's interesting to see a non-FPS-related side-effect appear while testing FPS-related benchmarks.

  • @alyshmahell
    @alyshmahell 1 year ago

    21:05 how do we define "DISTRIBUTION" for the whole project?

  • @unkgames-abdullahali4048

    Physics engine: is an engine about physics!! 👍👍👍

  • @0xCAFEF00D
    @0xCAFEF00D 1 year ago

    25:45
    Does it really work like this? That you have fragmentation in any perceivable way? I thought that with virtual memory you're not taking any penalty for reading across pages, beyond taking up more TLB space because you have multiple pages. Is there any gain in having the actual pages be contiguous?

  • @kuroakevizago
    @kuroakevizago 1 year ago +2

    Thanks, you're giving me a heads up on what to do next. I'm probably going to start making a 2D physics engine.
    Thanks, btw got your Brilliant discount :)

  • @ValinorFP
    @ValinorFP 8 months ago

    Great video, thank you! In modern C++, is heap memory fragmentation a concern for developers, given that the OS uses virtual memory to map to physical memory? My hypothesis is that even if physical RAM is fragmented, but virtual memory is contiguous, the C++ program's performance will not be affected.

    • @majormalfunction0071
      @majormalfunction0071 8 months ago

      Maybe or maybe not. CPUs don't prefetch across page boundaries, probably because of kernel-side page permissions / residency state. The more pages you access, the more TLB slots you use. TLB misses hurt, but maybe not to the level of framerate problems. It's an extra memory access, paid serially. Huge TLB requires defragmented memory on the kernel-side, and has a system-wide limit. Running kernel code to change page residency really hurts. It's many instructions, and a possible disk access.

  • @12affes
    @12affes 1 year ago +1

    Excellent video, memory is always an interesting topic!
    My one suggestion would be to change the storage of bodies in DynamicsWorld. On line 23 in the source file (seen at 27:45) the whole 'if (!body->IsDynamic()) continue;' means that static bodies are loaded into the L1 cache and then immediately discarded. Splitting the storage into static and dynamic bodies will ease the pressure on both the cache and the branch predictor.
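
The suggestion above can be sketched roughly as follows. This is a minimal illustration, not the reviewed engine's code: `Body`, `World`, and their members are hypothetical names, and the integration step is deliberately simplified. The point is only that the update loop walks a contiguous array of dynamic bodies and never loads a static body just to skip it.

```cpp
#include <vector>

// Hypothetical, simplified body type; not the reviewed engine's actual class.
struct Body {
    float x = 0.0f, y = 0.0f;    // position
    float vx = 0.0f, vy = 0.0f;  // velocity
};

// Stores dynamic and static bodies in separate contiguous arrays.
class World {
public:
    void AddDynamic(const Body& body) { dynamicBodies_.push_back(body); }
    void AddStatic(const Body& body) { staticBodies_.push_back(body); }

    // Integrates positions. Only dynamic bodies are visited, so there is
    // no per-body IsDynamic() check inside the loop.
    void Step(float dt) {
        for (Body& body : dynamicBodies_) {
            body.x += body.vx * dt;
            body.y += body.vy * dt;
        }
    }

    const std::vector<Body>& DynamicBodies() const { return dynamicBodies_; }
    const std::vector<Body>& StaticBodies() const { return staticBodies_; }

private:
    std::vector<Body> dynamicBodies_;
    std::vector<Body> staticBodies_;
};
```

Collision detection would still need to read both arrays; the split only helps the loops that care about one kind of body.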

  • @fenril6685
    @fenril6685 1 year ago +4

    With regards to the .hpp header specification over .h, I find it to be very necessary in a lot of projects which are larger where you have a mixture of both c and cpp code (happens way more often than you might think at some companies where you have legacy code).
    It does make a huge difference in those cases because you need to compile those .h files as C code only in some situations and not as C++, especially if they are separate projects in a larger solution base. It just makes it easier to distinguish directly what you are looking at.
    I used to be one of the .h default people, and never did understand why someone would use .hpp until I started working on legacy code bases created by other developers in large teams, now it makes sense because organizationally it serves an actual purpose.
    I now just use .hpp as default as a result, because I'd rather not go back after the fact and have to specify hey this is actually a cpp header file and you should compile it in your makefile or whatever build system you are using as C++ code specifically and not C code. Just something to consider.

    • @tolkienfan1972
      @tolkienfan1972 1 year ago +1

      Not just legacy. Many of us use modern C. There are numerous cases I prefer C for.

    • @user-dh8oi2mk4f
      @user-dh8oi2mk4f 1 year ago

      But you don't compile header files?

    • @fenril6685
      @fenril6685 1 year ago +2

      @user-dh8oi2mk4f What I mean is that usually in external build systems you have some method of determining which files are included in which compilation processes, typically by some kind of pattern matching.
      You do NOT want .h files which are strictly c linked in unnecessarily with C++ compilation units. This can result in all sorts of unexpected behaviors, especially if you have C headers putting things in global scope with simplified names, which is pretty frequent in legacy code.
      If I have multiple binaries in a solution that I need to compile some as C and some as C++ then you don't want to pattern match against all .h files when you are building C++ code in your build steps specifically.

    • @user-dh8oi2mk4f
      @user-dh8oi2mk4f 1 year ago

      @fenril6685 But why would you need to figure out which headers are c and c++? The compiler simply pastes the contents of the includes directly into the source file. I don't understand why you need to know which headers are which. Maybe this is helpful if you mix c and c++ in the same directory, but I don't get how it would help with a build system

  • @roykapon181
    @roykapon181 1 year ago +1

    Is a std::vector with preallocated size a decent way to implement this kind of memory management? Or do you need to do it manually? I'm a cpp newbie so pls don't roast me :)
    Btw, a great video! Looking forward to pt 2

    • @Larock-wu1uu
      @Larock-wu1uu 1 year ago

      I am curious about this as well

    • @roykapon181
      @roykapon181 1 year ago

      I forgot to note that it will probably not work well with deleting items (I guess that for this we need a more sophisticated method)...
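
For what the question above describes, one possible sketch is a `std::vector` whose capacity is reserved once, combined with "swap and pop" removal to address the deletion concern. `Particle` and `ParticlePool` are hypothetical names chosen for illustration, and this assumes element order does not matter; it is not presented as the approach from the video.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical element type, used only for illustration.
struct Particle {
    float x = 0.0f;
    float y = 0.0f;
};

// Fixed-capacity pool on top of std::vector: one allocation up front.
class ParticlePool {
public:
    explicit ParticlePool(std::size_t capacity) { items_.reserve(capacity); }

    // Returns false instead of growing, so the buffer is never reallocated
    // and pointers to existing elements stay valid across Add calls.
    bool Add(const Particle& particle) {
        if (items_.size() == items_.capacity()) return false;
        items_.push_back(particle);
        return true;
    }

    // O(1) removal: overwrite the slot with the last element, then pop.
    // Element order is not preserved. Precondition: index < Size().
    void RemoveAt(std::size_t index) {
        items_[index] = items_.back();
        items_.pop_back();
    }

    std::size_t Size() const { return items_.size(); }
    const Particle& At(std::size_t index) const { return items_[index]; }
    const Particle* Data() const { return items_.data(); }

private:
    std::vector<Particle> items_;
};
```

Note that `reserve` only guarantees a capacity of at least the requested size, and that indices held elsewhere become stale after `RemoveAt` moves the last element into the freed slot.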

  • @sherazali8691
    @sherazali8691 1 year ago +1

    About logging, can we just create a static class and call its function to log something there (through parameters)
    like:
    Logger.Log(_currentFps);
    and in our release build, we just comment out all the statements in that function.
    We would still have an overhead of calling that function and passing parameters, but is it okay to do it like this?

    • @nickgennady
      @nickgennady 1 year ago

      It's simpler and more straightforward to set up, sure, but you have to keep commenting and uncommenting every time you want to change build type, and you have to remember to do that.
      His macro way is much better.

    • @user-dh8oi2mk4f
      @user-dh8oi2mk4f 1 year ago

      I would be quite surprised if your compiler left the function call to an empty function with max optimization

    • @nickgennady
      @nickgennady 1 year ago

      @user-dh8oi2mk4f fair. Did not think of that

  • @nullbeyondo
    @nullbeyondo 1 year ago

    If he just uses a clock for the delta time instead of a fixed time-step, that means his physics engine is not deterministic and thus will produce different results every time he runs a simulation.

    • @Stowy
      @Stowy 1 year ago +4

      Yes, I didn't know that at the time, but I'm working on networking at the moment so I realized that mistake. I'll definitely be careful about that if I ever do something like that again haha
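
A common way to get the fixed time-step discussed above is to accumulate the measured frame time and consume it in constant-size steps. The sketch below is a simplified illustration under that assumption: `State` and `FixedStepper` are hypothetical names, and the "physics" is a single position update. It makes the sequence of simulation steps independent of how the frame time happens to be split up; identical results across different machines or compilers would need more care with floating point.

```cpp
// Hypothetical one-dimensional state, just enough to show the stepping pattern.
struct State {
    float position = 0.0f;
    float velocity = 1.0f;
};

class FixedStepper {
public:
    explicit FixedStepper(float stepSeconds) : step_(stepSeconds) {}

    // Feed in the measured frame time; runs zero or more fixed-size steps
    // and keeps the leftover time for the next frame.
    // Returns the number of steps executed.
    int Advance(State& state, float frameSeconds) {
        accumulator_ += frameSeconds;
        int steps = 0;
        while (accumulator_ >= step_) {
            state.position += state.velocity * step_;  // always the same dt
            accumulator_ -= step_;
            ++steps;
        }
        return steps;
    }

private:
    float step_;
    float accumulator_ = 0.0f;
};
```

Real loops usually also clamp `frameSeconds` so that one very long frame cannot trigger an unbounded number of catch-up steps.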

  • @spider853
    @spider853 1 year ago

    I suspect you participated in the Ludum Dare jam? We participated too ) Can you please link to your game?

  • @cameleon5724
    @cameleon5724 1 year ago

    One content, two languages. What I have now written may have a perfect mirror in another language. You can create a program that searches for the perfect language mirror. Thanks to this, you will be able to speak two languages and perform tasks in the shade. Endless enigmatic book in all languages. You can write a book with mirrors in all languages of the world. You can speak two languages at once, you just need to find the perfect reflection, same content, different translation. Infinite Mirrors. Pi 3.14 XBooks. Hybrid language. The algorithm flows through our heads, endless coding, just take off the chameleon masks. Connect words without spaces and you will find hidden tasks in all languages. Our conversations collide in the process, some words as well as numbers in words. We perform tasks hidden between words. You can create a Python coding language from a spoken language. You just need to find the mirrors. Two tongues glued together.

  • @TGAPOO
    @TGAPOO 1 year ago

    Leading underscores are reserved in Microsoft code. You should never use leading-underscore variable names if you expect to work on Windows. Prefer trailing underscores if you must.