One reason to Modify your Memory Allocator (C/C++)

  • Added 29 Nov 2021
  • Patreon ➤ / jacobsorber
    Courses ➤ jacobsorber.thinkific.com
    Website ➤ www.jacobsorber.com
    ---
    One reason to Modify your Memory Allocator (C/C++) // We tend to take our memory allocator (malloc) for granted. It's always there and it works well. But, there's always room for improvement. This video tutorial describes one reason why you might want to make some changes to malloc that are going to make it easier to catch bugs in your programs.
    Related Videos:
    xmalloc: • Memory Allocator Error...
    shims: • How to Intercept and M...
    ***
    Welcome! I post videos that help you learn to program and become a more confident software developer. I cover beginner-to-advanced systems topics ranging from network programming, threads, processes, operating systems, embedded systems and others. My goal is to help you get under-the-hood and better understand how computers work and how you can use them to become stronger students and more capable professional developers.
    About me: I'm a computer scientist, electrical engineer, researcher, and teacher. I specialize in embedded systems, mobile computing, sensor networks, and the Internet of Things. I teach systems and networking courses at Clemson University, where I also lead the PERSIST research lab.
    More about me and what I do:
    www.jacobsorber.com
    people.cs.clemson.edu/~jsorber/
    persist.cs.clemson.edu/
    To Support the Channel:
    + like, subscribe, spread the word
    + contribute via Patreon --- [ / jacobsorber ]
    Source code is also available to Patreon supporters. --- [jsorber-youtube-source.heroku...]

Comments • 91

  • @applesushi 2 years ago +35

    Back in the stone age when I took CS classes on SunOS workstations, we had to write a simple malloc/free library for one of our classes. One of my classmates went all out and wrote a malloc implementation that was more efficient at freeing up memory than Sun's standard C library. I can't vouch for it being more bug free though.
    I just ran across this channel and it makes me miss programming. I work in IT but I didn't become a developer/programmer. I am glad I can still follow *most* of this. 😃

  • @jesselangham 2 years ago +14

    NUMA is a good example of why you'd want to write or modify your allocator. If you want to make sure the memory you allocate is local to your current processor, you need to specify the NUMA node on a multi-node system.

  • @Psykorr 2 years ago +14

    Allocators are actually really rewarding to make. I like making arenas and dividing parts of the program into groups. This can avoid complicated malloc-free or new-delete strategies, as you can just reset an arena when you are done with some problem that needed a few, or many, different allocations. And the next time you need to allocate into the same arena, you don't even need to get memory from the OS; you just reuse the memory from the arena you just reset.
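
The arena idea described above can be sketched roughly as follows. This is a minimal bump-pointer arena, not anyone's production allocator; the names `Arena`, `arena_init`, `arena_alloc`, `arena_reset`, and `arena_destroy` are hypothetical and chosen only for illustration.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical bump-pointer arena; every name here is illustrative.
struct Arena {
    unsigned char *base;   // start of the backing block
    std::size_t capacity;  // total bytes available
    std::size_t used;      // bytes handed out so far
};

// Get one backing block from the system allocator up front.
bool arena_init(Arena *a, std::size_t capacity) {
    a->base = static_cast<unsigned char *>(std::malloc(capacity));
    a->capacity = (a->base != nullptr) ? capacity : 0;
    a->used = 0;
    return a->base != nullptr;
}

// Hand out the next chunk, with its start rounded up to max_align_t alignment.
void *arena_alloc(Arena *a, std::size_t size) {
    const std::size_t align = alignof(std::max_align_t);
    const std::size_t start = (a->used + align - 1) / align * align;
    if (start > a->capacity || size > a->capacity - start) {
        return nullptr;  // arena exhausted
    }
    a->used = start + size;
    return a->base + start;
}

// "Free" every allocation at once; the backing block is kept for reuse.
void arena_reset(Arena *a) { a->used = 0; }

// Return the backing block to the system allocator.
void arena_destroy(Arena *a) {
    std::free(a->base);
    a->base = nullptr;
    a->capacity = 0;
    a->used = 0;
}
```

After `arena_reset`, the next `arena_alloc` starts again at the beginning of the same block, which is the reuse the comment describes: no individual frees and no new request to the OS.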

  • @glee21012 2 years ago +23

    You can also put printf's in there to see when you are allocating and freeing memory (helps to find leaks) in the console. Good video.

    • @maxsilvester1327 2 years ago +16

      But printf could use malloc, and calling malloc inside malloc is a really bad idea.

    • @SerBallister 2 years ago +1

      @maxsilvester1327 You can guard against re-entrancy
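
One way to read the re-entrancy suggestion is a per-thread flag that is set while the hook is doing its own logging. The sketch below does not interpose the real `malloc`; `traced_malloc` and `log_allocation` are hypothetical names, and `log_allocation` deliberately allocates through the hook to stand in for a `printf` that calls `malloc` internally.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical names throughout; this is a sketch, not a real malloc shim.
thread_local bool in_hook = false;  // true while the hook is running its own logging
unsigned long traced_calls = 0;     // number of allocations that were logged

void *traced_malloc(std::size_t size);

// Stand-in for printf: it allocates through the same hook, the way a libc
// printf might once malloc has been replaced.
static void log_allocation(std::size_t size, void *p) {
    char *buf = static_cast<char *>(traced_malloc(64));  // re-enters the hook
    if (buf == nullptr) return;
    std::snprintf(buf, 64, "malloc(%zu) = %p\n", size, p);
    std::fputs(buf, stderr);
    std::free(buf);
}

void *traced_malloc(std::size_t size) {
    void *p = std::malloc(size);
    if (in_hook) {
        return p;  // nested call from the logger: allocate, but do not log again
    }
    in_hook = true;
    ++traced_calls;
    log_allocation(size, p);
    in_hook = false;
    return p;
}
```

Without the `in_hook` check, `log_allocation` would call `traced_malloc`, which would log again, and so on without end. With it, the nested call still gets its memory but skips the logging path.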

  • @leocelente 2 years ago +4

    Interesting way to wrap standard functions. Very useful.

  • @AlexBlack-xz8hp 2 years ago

    Fantastic tip! Thanks for sharing

  • @enderger5308 2 years ago +16

    One cool thing about Zig is that you choose your memory allocator.

    • @user-ux2kk5vp7m 2 years ago +18

      But one issue with zig is the entire language

    • @inkryption3386 2 years ago +3

      @user-ux2kk5vp7m How so? Seems pretty good to me

  • @mav474 2 years ago +1

    This Prof has made CS interesting! Thanks!

  • @alevez2004 2 years ago +1

    Amazing video. This topic is very interesting

  • @xorlop 2 years ago

    Wow! Super helpful... thank you!

  • @andrewnorris5415 1 year ago

    Very cool. Out of interest, I was thinking about how else you can intercept calls. One way is to write your own strace and intercept the system calls. I have been playing with LibVMI too, which was difficult to get up and running and is not good on documentation/tutorials. But it is potentially very powerful. Uses include fuzzing and malware monitoring.

  • @jacko314 2 years ago

    In C++ in dispatch mode you want to replace the new allocator, as you want the memory to be non-pageable, depending on the driver, especially if your code is running on the paging disk. If (on Windows) you are on the paging storage stack, you also need to make sure you have a backup non-pageable heap to make forward progress in the case of extreme memory pressure. So it becomes more complicated, because if the initial allocation fails you want to go into serial allocation mode, which is why low-level kernel code can be complicated. If using C you simply call the exallocnonpageable functions (Windows) to do that. But that also means you need to serialize your ops, because your previous allocation can really only be used for I/O and building MDLs. In my code, on boot you create I/O packets preallocated specifically for this. But a more generic case is that you create a backup heap and make sure anything that touches that heap is marked as non-pageable.

  • @vladrootgmailcom 2 years ago

    10:15 - thanks, man. Like the bugs I already found were not enough for me...

  • @mrcrackerist 2 years ago

    I have started using backtrace, but this also looks interesting

  • @skeleton_craftGaming 6 months ago

    New actually calls the 'constructor' for int, which always initializes it to zero by default, which is why you weren't getting a warning...

  • @harelrubin1432 8 months ago +1

    It initialized *p2 to 0 because (I think) new calls the constructor; in the case of int it's 0

  • @LoesserOf2Evils 2 years ago +1

    Dr. Sorber, these videos are indeed instructive to me.

    • @JacobSorber 2 years ago +4

      That's always good to hear. Glad I could help.

  • @d97x17 2 years ago +19

    Thank you for the video! Could you comment on how this relates to a general Allocator type in C++ (e.g. std::allocator)? Would making a custom implementation of the Allocator type in C++ be a better approach than overloading malloc?

    • @edwinontiveros8701 2 years ago +11

      In C++ it's best to use a custom Allocator; it is more idiomatic. Using cstdlib in C++ is considered a code smell, especially malloc and free. You might as well use C in that case.

    • @YoloMonstaaa 2 years ago +2

      You can also overload the new operator

    • @StefanoTrevisani 2 years ago +2

      I found custom allocators very useful for CUDA. You just write a simple custom allocator, and you get all the STL containers working on your GPU! I know there is a CUDA version of the STL maintained by Nvidia, but still

  • @SeleDreams 1 year ago

    Isn't dlsym platform-specific?
    I'm not sure it's usable on some platforms that, for instance, do not support any dynamic library loading (like the 3DS).

  • @lordansem2 2 years ago +1

    I swear I remember a video where he recreates Malloc but can’t seem to find it. Which was it?

  • @voytechj 2 years ago +1

    It would be better to switch DEBUG to NDEBUG. It is already used, for example, in the "cassert"/"assert.h" header file, and some build systems (CMake) define NDEBUG for Release builds.
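
A small sketch of what keying debug-only allocator behaviour on `NDEBUG` could look like. The function name `checked_malloc` and the fill byte `0xCC` are assumptions made for illustration; the point is only that the poisoning disappears in builds that define `NDEBUG`, the same switch that disables `assert`.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical wrapper: debug builds poison fresh memory so that reads of
// uninitialized data show an obvious pattern instead of a plausible value.
void *checked_malloc(std::size_t size) {
    void *p = std::malloc(size);
#ifndef NDEBUG
    if (p != nullptr) {
        std::memset(p, 0xCC, size);  // 0xCC is an arbitrary fill byte (assumption)
    }
#endif
    return p;
}
```

With this arrangement there is no separate `DEBUG` macro to remember: a Release configuration that already passes `-DNDEBUG` gets the plain allocation path.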

  • @leokiller123able 2 years ago +17

    In C++ when you call `new int` it calls the default constructor of int, which sets the variable to 0
    (Edit: wrong. Built-in types don't have constructors, only initialization syntax that makes them look like they do, so you have to call `new int();` for zero initialization)

    • @nikitabelov1478 2 years ago +1

      The default constructor of int? As far as I know, base types don't have any constructors.

    • @leokiller123able 2 years ago

      @nikitabelov1478 They do in C++, otherwise you couldn't do `int i(42);` for example

    • @nikitabelov1478 2 years ago +1

      @leokiller123able It's not a constructor. It is value-initialization, which just has the same syntax.
      i.e.:
      int i(42) != string a("something")
      int i(42) is just another way of saying
      int i = 42
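
The three spellings discussed in this thread behave differently for built-in types. The sketch below uses hypothetical function names. In standard terminology, `new int()` is value-initialization, `int i(42)` is direct-initialization, and plain `new int` is default-initialization, which leaves the value indeterminate.

```cpp
// Value-initialization: the empty parentheses (or braces) zero a built-in type.
int value_initialized() {
    int *p = new int();
    int v = *p;  // reads 0
    delete p;
    return v;
}

// Direct-initialization of a built-in: no constructor runs, and the effect
// is the same as `int i = 42;`.
int direct_initialized() {
    int i(42);
    return i;
}

// Default-initialization: `new int` leaves the value indeterminate, so this
// sketch assigns before reading rather than relying on a zero being there.
int default_then_assigned() {
    int *p = new int;
    *p = 7;
    int v = *p;
    delete p;
    return v;
}
```

Reading `*p` straight after `new int` might happen to show 0 on a fresh heap page, but the language does not promise it, which is the kind of bug the allocator trick in the video is meant to surface.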

  • @keesterwee4890 2 years ago +4

    Excellent as always! I added a global counter, and in the malloc function this counter is increased by one. It turned out at the end of the program that this counter has the value 646. Hence the malloc function was called that many times! That puzzles me. Could you elaborate on where all these calls come from? For completeness, I ran this C (not C++) program on a Raspberry 3 compiled with gcc.

    • @sourestcake 2 years ago +3

      If possible, you could make the malloc wrapper take in a line number and file name, then just print those. You can wrap the malloc wrapper in a macro, like `#define malloc(SIZE) (malloc_wrapper((SIZE), __FILE__, __LINE__))`. This would show you where the calls come from. You can additionally use `__func__` to get the function name.
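
A rough sketch of that call-site macro. `malloc_wrapper` comes from the comment; the macro is named `TRACKED_MALLOC` here (an assumption) instead of `malloc`, because in C++ a function-like macro called `malloc` would also rewrite qualified calls such as `std::malloc(...)` into something that no longer compiles.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Allocates and reports which file, line, and function asked for the memory.
void *malloc_wrapper(std::size_t size, const char *file, int line, const char *func) {
    void *p = std::malloc(size);
    std::fprintf(stderr, "%s:%d (%s): malloc(%zu) = %p\n", file, line, func, size, p);
    return p;
}

// Hypothetical macro name; it expands at the call site, so __FILE__, __LINE__,
// and __func__ describe the caller rather than the wrapper.
#define TRACKED_MALLOC(SIZE) malloc_wrapper((SIZE), __FILE__, __LINE__, __func__)
```

A macro like this only sees calls in source files compiled with it. Allocations made inside already-compiled libraries will not show up, so it may account for only part of a large call count.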

    • @SerBallister 2 years ago

      You could add a breakpoint inside your custom malloc function and look at the call stack to see where it's being called from.

  • @tomaszstanislawski457 8 months ago

    Alternatively, one could typedef the function type directly: `typedef void *malloc_like_function(size_t);` and next use `malloc_like_function*` whenever a function pointer is desired.
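
A short sketch of that typedef in use. `malloc_like_function` is the name from the comment; `zeroing_alloc`, `current_allocator`, and `allocate_with` are hypothetical helpers added only to show the pointer syntax.

```cpp
#include <cstddef>
#include <cstdlib>

// The function type itself, not a pointer to it.
typedef void *malloc_like_function(std::size_t);

// Any function with a matching signature can be used where malloc is expected.
void *zeroing_alloc(std::size_t size) { return std::calloc(1, size); }

// The pointer is spelled with an explicit '*', as with any other type.
malloc_like_function *current_allocator = std::malloc;

// Accepts any malloc-like function and forwards the request to it.
void *allocate_with(malloc_like_function *fn, std::size_t size) {
    return fn(size);
}
```

Compared with `typedef void *(*malloc_ptr)(size_t);`, this keeps the `*` visible at each use, so it is harder to forget that the variable is a pointer.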

  • @suvalaki 2 years ago

    Any recommendations on a book to learn how to write your own allocator?

  • @awabomer 2 years ago +1

    When is there going to be a full C/C++ course, Professor Jacob?

    • @gregoryfenn1462 2 years ago +1

      There are millions of them online.

    • @awabomer 2 years ago

      @gregoryfenn1462 Well, it's not the same teacher. Professor Jacob's teaching method is totally different, not to mention the experience he has. It saves a lot of time because you get to know the best practices. I am sure everyone on this channel would want a full course from him. What other good course would you recommend?

  • @tk36_real 2 years ago

    just confirming: vs still puts safety nets inside and around allocated memory (0xCC)

  • @saeedmahmoodi7211 2 years ago +2

    Could we make this code OS-independent?

  • @reactst 2 years ago

    What are the jobs where you write C for a living, other than professor?

    • @gregoryfenn1462 2 years ago +1

      Embedded development, e.g. for flight ✈️ control systems. That's my job

    • @inkryption3386 2 years ago +1

      Legacy maintenance as well. There is tonnes of old C code that isn't going to be replaced anytime soon, and you need someone to make sure it's maintained.

  • @decky1990 1 year ago

    Really interesting video, but don’t think I’d have called it C++ - just looks like C with a .cpp extension

  • @PauxloE 2 years ago

    Hmm, isn't there a way of getting the sysmalloc only once, instead of in each call of the new malloc?
    This looks horribly inefficient.

  • @betareleasemusic 2 years ago

    Isn't it better to just avoid using uninitialized memory? For local variables const and auto already make this kind of error impossible; for heap allocation - make_shared and make_unique.

    • @alevez2004 2 years ago

      That’s the issue, you can forget to initialise a variable

    • @leokiller123able 2 years ago

      In C++ we have default constructors, so when you write `int i;` for example, i is automatically set to 0 (it's the same as `int i();`), which is not the case in C as there are no such things as constructors/destructors. So in C++ you can't have uninitialized values (except if you explicitly call malloc or some allocator other than new for pointers)

    • @betareleasemusic 2 years ago +1

      @leokiller123able Wrong in multiple ways.
      EDIT: "class_name object();" on its own does not run the default constructor, "class_name object;" does.
      For global and static variables, `int i;` will indeed initialize to zero, for *local* variables the value is undefined.
      Not to mention it is bad style to rely on a variable (maybe) being initialized to zero. It makes the code more readable to just spell it out:
      int i = 0;
      or, much better in C++,
      auto i = 0;
      because auto makes it impossible to "forget" to spell out the "= 0" part.

    • @leokiller123able 2 years ago

      @betareleasemusic Are you sure about that? Because for class types, when you just declare `class_name variable_name;` it calls the default constructor, and I believed that it's the same for built-in types in C++. It feels wrong that it isn't, and I always thought it was the case, but sorry if I said something wrong

    • @ANSIcode 2 years ago

      @leokiller123able Yes, but `int` is not a class in C++...

  • @henrykkaufman1488 1 year ago

    "If you've been programming for more than a few hours you've probably used it". In the meantime, I have about 5000 SLOC in my game written in C without a single dynamic allocation. 🙂

  • @privileged6453 2 years ago +2

    obviously In c++0x nobody would use new like that, you’d just do ‘int *x = new int()’ for an int initialised to 0

    • @glee21012 2 years ago

      new uses malloc() under the hood, delete uses free()

    • @privileged6453 2 years ago +1

      @glee21012 I don't disagree; it does this mainly for backward compatibility reasons. But there's not much point in writing your own malloc to initialise values like this in C++ when you can do it as part of the language

    • @privileged6453 2 years ago

      @glee21012 Not to mention that free doesn't have to use malloc, if you overload it to prevent it from doing so.

    • @coopercone4293 2 years ago +1

      @privileged6453 As other people have said, this technique can be used for other purposes, like finding memory leaks. In malloc, you can register the file and line that it was called from (using the preprocessor) to some kind of dictionary, and then when you free the memory, remove it from the dictionary. Then at the termination of your program you can print out all the data in the dictionary to catch when you aren't freeing your memory.
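
The dictionary approach described above might look roughly like this. It is a sketch built on wrapper functions; it does not replace `malloc` itself, and every name in it (`AllocSite`, `live_allocations`, `tracked_malloc`, `tracked_free`, `report_leaks`, `TRACK_MALLOC`) is hypothetical. Because `std::unordered_map` allocates too, a version that truly interposed `malloc` would need a table that does not call back into the allocator.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <unordered_map>

// Where a live allocation came from.
struct AllocSite {
    const char *file;
    int line;
    std::size_t size;
};

// The "dictionary": pointer -> call site, for every block not yet freed.
std::unordered_map<void *, AllocSite> &live_allocations() {
    static std::unordered_map<void *, AllocSite> table;
    return table;
}

// Allocate and register the call site.
void *tracked_malloc(std::size_t size, const char *file, int line) {
    void *p = std::malloc(size);
    if (p != nullptr) {
        live_allocations()[p] = AllocSite{file, line, size};
    }
    return p;
}

// Unregister the block, then release it.
void tracked_free(void *p) {
    if (p != nullptr) {
        live_allocations().erase(p);
    }
    std::free(p);
}

// Print whatever is still registered; returns the number of leaked blocks.
std::size_t report_leaks() {
    for (const auto &entry : live_allocations()) {
        std::fprintf(stderr, "leak: %zu bytes allocated at %s:%d\n",
                     entry.second.size, entry.second.file, entry.second.line);
    }
    return live_allocations().size();
}

#define TRACK_MALLOC(SIZE) tracked_malloc((SIZE), __FILE__, __LINE__)
```

Calling `report_leaks()` at the end of `main`, or registering it with `atexit` through a small `void` wrapper, gives the end-of-program report the comment mentions.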

    • @glee21012 2 years ago

      @privileged6453 Nothing worse than using free on a bad pointer - I KNOW lol

  • @ohwow2074 2 years ago +3

    Great content. Had never seen something like this. Also I wanna mention that this code was almost pure C. I see no reason for you to promote it as C++ code. It has hardly any kind of resemblance to C++.

    • @JacobSorber 2 years ago +1

      Thanks. Always glad when I can show someone something new.
      And, please see my video about C/C++ and my opinions about what gets to be called C++. Spoiler: if it compiles with a C++ compiler, it's C++ (by definition).

    • @ohwow2074 2 years ago +1

      @JacobSorber It's true. But this code is still more C-like than C++. You know that modern C++ looks quite different.
      Your code can't be considered proper modern C++ since it uses deprecated features, like deprecated headers such as stdio.h and string.h (these are deprecated in C++, and the equivalents cstring and cstdio are strongly recommended by the committee).
      As a software engineer said in an answer to a Quora question, if you want to write C then write in pure C. Not C++.
      Also, compiling with a C++ compiler doesn't have any benefits for C code as far as I know. C compilers might even generate faster code in some cases in which C differs from C++.

    • @edwinontiveros8701 2 years ago +7

      C style c++ is an actual thing, doom 3 was written like that.

  • @jenidu9642 2 years ago

    Don't really like this for Linux, it's much better to use valgrind, since this catches 95% of memory bugs (from my experience)

  • @debanjanbarman7212 2 years ago

    1'st

  • @WolvericCatkin 2 years ago

    569 likes... _nice..._

  • @betareleasemusic 2 years ago +1

    I think it is disingenuous to say that this is a C/C++ trick:
    1. dlsym() is not part of either the C or C++ standard library. It is a UNIX system call. It doesn't have the portability of C or C++.
    2. It solves a problem that you would never encounter in C++ if you followed the C++ Core Guidelines.

    • @JacobSorber 2 years ago +8

      I hope you mean "imprecise" rather than "disingenuous", unless of course you think that I'm intentionally leading people astray. You are right that dlsym() is not part of the language standard (it is part of the POSIX standard), and while we're being precise, it's a library call, but not a system call (at least on Linux and MacOS). But, replacing your allocator is something that you can do (for debugging purposes) on just about any OS using C and C++, which was the point I was trying to make. Sorry if that wasn't clear. As for #2, that's like saying, "you don't need to debug code if you just follow best practices and always write correct code." I suppose it's true, but it doesn't actually seem to decrease the number of bugs out there or impact the need for debugging.