How to Map Files into Memory in C (mmap, memory mapped file io)

  • Added 23 Jul 2024
  • Patreon ➤ / jacobsorber
    Courses ➤ jacobsorber.thinkific.com
    Website ➤ www.jacobsorber.com
    ---
    Want to read an entire file into an array in memory all at once, in C? Memory-mapped file I/O is one of those super useful tools that so many programmers don't know about. Let's fix that.
    My recent, related videos:
    Basic File IO in C
    • Reading and Writing Fi...
    Get the size of files.
    • How to get a file's si...
    Easier working with file paths, with realpath
    • Simpler Paths in C wit...
    Using mmap to request more memory
    • How processes get more...
    Note that any Amazon links in my video descriptions are generated by Amazon. If you click one of them and then buy something it helps support this channel. Thanks.
    ***
    Welcome! I post videos that help you learn to program and become a more confident software developer. I cover beginner-to-advanced systems topics ranging from network programming, threads, processes, operating systems, embedded systems and others. My goal is to help you get under-the-hood and better understand how computers work and how you can use them to become stronger students and more capable professional developers.
    About me: I'm a computer scientist, electrical engineer, researcher, and teacher. I specialize in embedded systems, mobile computing, sensor networks, and the Internet of Things. I teach systems and networking courses at Clemson University, where I also lead the PERSIST research lab.
    More about me and what I do:
    www.jacobsorber.com
    people.cs.clemson.edu/~jsorber/
    persist.cs.clemson.edu/
    To Support the Channel:
    + like, subscribe, spread the word
    + contribute via Patreon --- [ / jacobsorber ]
    + rep the channel with nerdy merch --- [teespring.com/stores/jacob-so...]
    Source code is also available to Patreon supporters. --- [jsorber-youtube-source.heroku...]
    Want me to review your code?
    Email the code to js.reviews.code@gmail.com. Code should be simple and in one of the following languages: C, C++, python, java, ruby. You must be the author of the code and have rights to post it. Please include the following statement in your email: "I attest that this is my code, and I hereby give Jacob Sorber the right to use, review, post, comment on, and modify this code on his videos."
    You can also find more info about code reviews here.
    • I want to review your ...

Comments • 180

  • @beegbraining
    @beegbraining 3 years ago +106

    The background music makes it sound like we're solving a murder case. Great video by the way

  • @tigerrabbit3261
    @tigerrabbit3261 5 years ago +4

    ayyyy this is awesome very very good! you made mmap very easy to understand. craving for more !

  • @deepshankarjha5344
    @deepshankarjha5344 3 years ago

    Amazing video. Simplicity and clarity. Hats off.

  • @topG448
    @topG448 5 years ago +27

    This man is a boss! Your video has some excellent stuff. Please keep sharing :)

  • @8015908
    @8015908 3 years ago +2

    Nice, quick, to-the-point videos. Lectures and textbooks often go too in-depth and confuse more than they clarify. Your videos give brief insight and then show an actual functioning demonstration.

  • @kotravaijm250
    @kotravaijm250 1 year ago +1

    NO ONE explains like you do. Thank You!

  • @LinucNerd
    @LinucNerd 6 years ago +13

    Thank god someone finally made a video on mmap that made sense!
    Thank you sooooooo much

    • @JacobSorber
      @JacobSorber 6 years ago +2

      You're welcome. Glad it made sense.

    • @LinucNerd
      @LinucNerd 6 years ago

      Also, why is the like/dislike ratio hidden?
      The quality of your other videos (including this one) is fine.

    • @JacobSorber
      @JacobSorber 6 years ago +1

      Good question. I guess I hadn't checked that option.

  • @zrodger2296
    @zrodger2296 1 year ago

    This is exactly what I need to know for a project I'm working on. Thank you!

  • @anycaroliny7900
    @anycaroliny7900 2 years ago +2

    Perfect explanation!

  • @rafabaranowski513
    @rafabaranowski513 5 years ago +4

    I tried to understand this for a whole day. You explained it in a few minutes :)

  • @strayedaway19
    @strayedaway19 2 years ago

    So freaking cool, thank you for this !

  • @islandcave8738
    @islandcave8738 3 years ago

    Wow, this is amazing. My mouth was gaping when I saw this.

  • @djpolyester
    @djpolyester 3 years ago +19

    Thank you Jacob! However from the man page:
    `if neither O_CREAT nor O_TMPFILE is specified, then mode is ignored.`
    You don't have to use S_IRUSR and S_IWUSR as arguments. They should be ignored. Wanted to point out in case. Appreciate your great work!
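
The point about the ignored mode argument can be sketched as follows. This is a minimal illustration, assuming a POSIX system and a non-empty regular file; the helper name `map_existing_file` is made up for this example. Because `open` is called without `O_CREAT` (or `O_TMPFILE`), the mode argument is simply left out.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: maps an existing file read/write and stores its
 * length in *len_out. Returns NULL on failure or if the file is empty. */
char *map_existing_file(const char *path, size_t *len_out)
{
    /* No O_CREAT (or O_TMPFILE) here, so no mode argument is passed. */
    int fd = open(path, O_RDWR);
    if (fd == -1)
        return NULL;

    struct stat sb;
    if (fstat(fd, &sb) == -1 || sb.st_size == 0) {
        close(fd);
        return NULL;
    }

    char *data = mmap(NULL, (size_t)sb.st_size, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    close(fd); /* the descriptor is not needed once the mapping exists */
    if (data == MAP_FAILED)
        return NULL;

    *len_out = (size_t)sb.st_size;
    return data;
}
```

A caller would edit the returned bytes in place and call `munmap(ptr, len)` when finished.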

  • @karlforshaw
    @karlforshaw 6 years ago

    Thank you for posting this

  • @dayol2026
    @dayol2026 3 years ago

    Thanks for the video!!

  • @jeyanthinba9142
    @jeyanthinba9142 4 years ago

    This video helped a lot...Thank you So Much

  • @maellet.9707
    @maellet.9707 4 years ago

    great explanations, thanks !

    • @JacobSorber
      @JacobSorber 4 years ago

      You're welcome. Let me know if there are other topics you would like to see discussed on the channel.

  • @LITHIUMINWATER
    @LITHIUMINWATER 3 years ago

    great video, thanks!

  • @rafaelnagel5253
    @rafaelnagel5253 4 years ago

    Thank you! Excellent content.

  • @pankajkushwaha3793
    @pankajkushwaha3793 3 years ago

    Great tutorial.

  • @shaharrefaelshoshany9442

    This is gold !!

  • @jinti4epicclips
    @jinti4epicclips 3 years ago

    Thanks dude!

  • @officialsterlingarcher
    @officialsterlingarcher 5 years ago +8

    For my OS course, one of our assignments is the classic producer-consumer problem, but with processes. One part of this assignment is writing to a file from one of our consumer processes, but I was having the hardest time understanding how to share a file in memory between processes. This definitely cleared it up! Thanks :D

  • @dhruvakumar6964
    @dhruvakumar6964 2 years ago

    Thank you.

  • @jonathondelemos4609
    @jonathondelemos4609 2 years ago

    The background music caught me off guard at first. After you executed the code, I said wtf like ten times and the music really made me laugh.

  • @thepickicool97
    @thepickicool97 3 years ago +3

    you saved my life,
    and i'm not overreacting !
    Thank you !!!

  • @arturesMC
    @arturesMC 3 years ago +8

    Hi.
    Is there any difference in memory access time between an mmapped file and memory allocated with mmap?
    What if I want a fast way of loading stuff, but then I also need high runtime performance?

  • @ArunPrabhath
    @ArunPrabhath 5 years ago +5

    Hi, your videos are awesome. I would like to request a video on listening socket descriptors that are shared across different processes.

  • @thom9909
    @thom9909 9 months ago

    awesome channel

  • @leokiller123able
    @leokiller123able 2 years ago +9

    Hey, great video! I have one question though: how do you append characters at the end of the file this way? Do we need a reallocation or something?
    EDIT: I found a way. First I tried allocating more space with mmap, but surprisingly it doesn't append the characters that I `strcat` at the end. After a little digging I found that if you edit files with `mmap`, the size of the file is not automatically updated like it is with `write`, so we need to call `truncate` (or `ftruncate`) to manually change the size.
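
The approach described in the EDIT can be sketched roughly like this: grow the file with `ftruncate` first, then map the new length and copy into the added bytes. This is a minimal sketch assuming a POSIX system; `append_with_mmap` is a hypothetical name, and `memcpy` is used instead of `strcat` because the mapped file contents are not guaranteed to be NUL-terminated.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: appends `len` bytes to the file at `path` through a
 * mapping. Returns 0 on success, -1 on failure. */
int append_with_mmap(const char *path, const char *text, size_t len)
{
    if (len == 0)
        return 0;

    int fd = open(path, O_RDWR);
    if (fd == -1)
        return -1;

    struct stat sb;
    if (fstat(fd, &sb) == -1) {
        close(fd);
        return -1;
    }
    size_t old_size = (size_t)sb.st_size;
    size_t new_size = old_size + len;

    /* Grow the file first; writing through the mapping cannot extend it. */
    if (ftruncate(fd, (off_t)new_size) == -1) {
        close(fd);
        return -1;
    }

    char *data = mmap(NULL, new_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (data == MAP_FAILED)
        return -1;

    memcpy(data + old_size, text, len); /* fill the newly added bytes */
    return munmap(data, new_size);
}
```

For plain appends, `write` on a descriptor opened with `O_APPEND` is usually simpler; the mapping route mainly makes sense when the file is being edited through a mapping anyway.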

  • @milossimicsimo
    @milossimicsimo 4 years ago +2

    Hi, this is a super cool example. I have one question: in this example you changed the content of the file, but what if I want to increase the size of the file? For example, I have some data in that file and I want to add more data to it. Is that something that could be done?

  • @utubeuser907
    @utubeuser907 5 years ago +1

    Nice videos.. to the point and no BS.. btw, what is the editor you are using? Is it on Windows?

    • @yomanos
      @yomanos 4 years ago

      Late response, but it's called Atom

  • @user-bp4sx6hx3q
    @user-bp4sx6hx3q 5 years ago

    love you

  • @MrUmang40
    @MrUmang40 5 years ago +2

    Put everything aside.....did anyone notice.......the background music in this video.....it is so intense.... actually I loved it.......

    • @JacobSorber
      @JacobSorber 5 years ago

      Yeah, it felt like an intense topic at the time. Glad you liked it. :)

  • @nabilelouahabi6735
    @nabilelouahabi6735 3 years ago +1

    You're a sweetheart

  • @kathiravankathir3089
    @kathiravankathir3089 5 years ago +3

    Hi, when mapping a new empty file (opened with O_TRUNC):
    1) SIGBUS is generated when accessing the memory if I pass the page size as the size to mmap.
    2) SIGSEGV is generated when accessing the memory if I pass the size obtained from fstat.
    Please help me resolve this!
    Thanks.

    • @TimothyEBaldwin
      @TimothyEBaldwin 3 years ago +2

      You need to enlarge the file (for example by using ftruncate) to avoid SIGBUS, and you must not write outside the mapping (which can be larger or smaller than the file) to avoid SIGSEGV and other undefined behaviour.
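
For the empty-file case discussed in this thread, one way to avoid the SIGBUS is to give the new file a real length before mapping it. A minimal sketch, assuming a POSIX system; `create_and_map` is a hypothetical helper, not code from the video.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: creates (or truncates) `path`, sizes it to `size`
 * bytes, and returns a writable shared mapping of the whole file.
 * Returns NULL on failure or when `size` is 0. */
void *create_and_map(const char *path, size_t size)
{
    /* O_CREAT is used here, so this time the mode argument matters. */
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return NULL;

    /* The file is 0 bytes long at this point. Touching a mapping of it
     * would raise SIGBUS, so set the real length before mapping. */
    if (size == 0 || ftruncate(fd, (off_t)size) == -1) {
        close(fd);
        return NULL;
    }

    void *data = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    return data == MAP_FAILED ? NULL : data;
}
```

The bytes added by `ftruncate` read as zeros, and writes must stay within the first `size` bytes of the returned pointer.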

  • @leikang8653
    @leikang8653 4 years ago

    genius!

  • @mariamka
    @mariamka 1 year ago +1

    Hello. First of all, let me thank you for your videos. They really teach a lot. I'm just a beginner and still have a lot to learn. Would you mind answering my question? Do I need to allocate memory when using the mmap function, or does it already not cause leaks? Thank you :)

    • @JacobSorber
      @JacobSorber 1 year ago +3

      Thanks. I'm glad the videos have been helpful. I'm not sure if I'm completely understanding your question (I'm guessing you meant "deallocate"). But, as with most things in computing, it depends. If you are using mmap to map a file into memory , it's probably a good idea to unmap that memory (using munmap) when you're done with it. If you're using mmap to write your own memory allocator (probably not something you will do as a beginner, but you do you), you might keep empty pages around to satisfy future requests. In that case, I wouldn't call it a memory leak, because you have a plan for reusing the memory. And, of course, if your program is about to terminate, and you're trying to maximize speed, you can just leave it and have the OS clean up your pages for you. A lot of terminal programs that just do one thing (like ls) often do this. It's still a memory leak, but (on laptops/desktops/servers) memory leaks are generally only problematic for programs that run for a long time and allocate a lot of memory blocks.

  • @thomaspalade9946
    @thomaspalade9946 4 years ago

    bravo tata

  • @veggiefoodadventure
    @veggiefoodadventure 4 years ago +1

    what kind of speed tool do you use? ;)

  • @greatbullet7372
    @greatbullet7372 4 years ago +6

    Whoever reads this, spread Jacob Sorber's channel as has been done with Cherno. The quality is good, the insight is insane! Have a nice day

    • @JacobSorber
      @JacobSorber 4 years ago

      Thanks. All sharing is definitely appreciated.

    • @greatbullet7372
      @greatbullet7372 4 years ago

      @@JacobSorber I see your channel already has 200k subs; you've been trending in my YouTube recommendations since yesterday. I can only beg you to show us more, as C is just as interesting as C++. Suggestions: IPC, sockets, and design patterns in practical use cases, because in my opinion you can learn those best in C. Lean and mean.

  • @MECHANISMUS
    @MECHANISMUS 1 year ago

    Do we have to page-align the mapped memory as shown in the man page example?

  • @jcialdella
    @jcialdella 3 years ago

    How do I move a memory file of dictionary words into hcreate/hsearch?

  • @MultiNova100
    @MultiNova100 6 years ago

    Does mmap load chunks of the file into memory a page at a time, on demand? How does the OS handle a memory-mapped file underneath?

    • @JacobSorber
      @JacobSorber 5 years ago +2

      I guess I missed this one. How pages are brought into memory varies by operating system and possibly the current workload. They might use simple demand paging, but they might also try to pull in extra pages (if there are physical frames available) to reduce future page faults.

  • @yjc149
    @yjc149 2 years ago

    nice

  • @MultiNova100
    @MultiNova100 6 years ago

    If on a 32bit machine, I want to mmap 4 files of size 1GB, will I get any error saying the system is out of address space when I map the last file (or even a file before the last one, because other parts of the memory like stack or heap could be taking up a lot of space already)?

    • @JacobSorber
      @JacobSorber 6 years ago

      Try it and see. At some point, your OS will not let you map any more pages. It usually happens before you exhaust the entire address space. I haven't played around with 32-bit machines for years, though. So, try it and let me know what happens.

  • @realdragon
    @realdragon 6 months ago

    It actually could be useful, but I'm using fscanf to read floats from a file and I don't know if that would work with mmap.

  • @ohreally1021
    @ohreally1021 4 years ago

    Thanks for the vid! Which platforms will this work on?

    • @JacobSorber
      @JacobSorber 4 years ago +2

      This should work on macOS and Linux. Most Unix-based OSes support mmap. On Windows, look into the CreateFileMapping function.

  • @lordadamson
    @lordadamson 5 years ago +1

    I love you

  • @sghsghdk
    @sghsghdk 3 years ago +1

    How about appending to files? Is that possible?

  • @emiliocanton
    @emiliocanton 4 years ago

    Hey! Great videos. Can I change the file descriptor that was mapped to memory (e.g. I mmap fd 0, and once I've mapped it I change the fd to 1 so changes are made to that file)?

    • @JacobSorber
      @JacobSorber 4 years ago

      Thanks. And, good question. So, are you thinking that you would call mmap once with a block of memory and then call mmap again with another file descriptor? I haven't ever tried to switch the file descriptor for a mapped block of memory. Sounds like an intriguing experiment. Just want to make sure I understand exactly what you have in mind.

    • @bonbonpony
      @bonbonpony 3 years ago +2

      If you close the file descriptor `fd`, it doesn't affect the memory mapping at all (and it doesn't actually close the underlying file either), because `mmap` keeps its own reference to the open file, so the system knows that someone else is still using it. (That's why calling `close` on the descriptor doesn't actually release the file.) From the GNU libc manual:
      "A new reference for the file specified by filedes is created, which is not removed by closing the file."
      www.gnu.org/software/libc/manual/html_node/Memory_002dmapped-I_002fO.html
      Also, if you modify the file on disk externally (especially if you change its size), the manual says that:
      "The effect of changing the size of the underlying file of a mapping on the pages that correspond to added or removed regions of the file is unspecified."
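
The first point in this reply (closing the descriptor leaves the mapping intact) can be checked with a small sketch. It assumes a POSIX system and a non-empty file; `sum_bytes_after_close` is a hypothetical name chosen for the illustration.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: sums the bytes of a file through a mapping, reading
 * them only after the descriptor has been closed. Returns -1 on failure. */
long sum_bytes_after_close(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    struct stat sb;
    if (fstat(fd, &sb) == -1 || sb.st_size == 0) {
        close(fd);
        return -1;
    }
    size_t len = (size_t)sb.st_size;

    const unsigned char *data = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the descriptor is gone; the mapping is not */
    if (data == MAP_FAILED)
        return -1;

    long sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += data[i]; /* every access happens after close() */

    munmap((void *)data, len);
    return sum;
}
```

The mapping is released by `munmap` (or process exit), not by closing the descriptor.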

  • @justwanderin847
    @justwanderin847 2 years ago +1

    I am on Linux and cannot access more than 4 GB. I have a 64-bit system with 32 GB of RAM, and when I do the mmap it appears to work (8 GB), but when I try to access past 4 GB it gives a memory signal error. I am using GnuCOBOL with the table defined in the LINKAGE SECTION, and setting the pointer works fine. It works for 4 GB or less.
    Is there something in my Linux configuration that I need to change? (I use Ubuntu and Pop!_OS on two different machines.) I should be able to mmap and access 90 GB, since there should be no limit in the LINKAGE SECTION, right?

  • @MultiNova100
    @MultiNova100 6 years ago

    If I open the file in a mode (say read-only, or append) that's different from the permission I give to mmap, which will apply?

    • @JacobSorber
      @JacobSorber 6 years ago

      I would expect both to apply. One controls access to the file on disk through the file descriptor. The other controls access to the memory map. But I haven't tested this recently (maybe ever). Why don't you try it out and see? Let me know if it does anything surprising.
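
The suggested experiment can be sketched as below. It is a minimal probe, assuming a POSIX system; `try_mapping` is a hypothetical helper that reports the `errno` of whichever call failed. On Linux, and per the POSIX description of `mmap`, asking for a writable `MAP_SHARED` mapping of a descriptor opened `O_RDONLY` is expected to fail with `EACCES`, while a writable `MAP_PRIVATE` mapping of the same descriptor is allowed because its changes never reach the file.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: opens `path` with `open_flags`, tries to map its
 * first `len` bytes with the given protection and flags, and returns 0 on
 * success or the errno value of the call that failed. */
int try_mapping(const char *path, int open_flags, size_t len, int prot,
                int map_flags)
{
    int fd = open(path, open_flags);
    if (fd == -1)
        return errno;

    void *p = mmap(NULL, len, prot, map_flags, fd, 0);
    int err = (p == MAP_FAILED) ? errno : 0;
    if (p != MAP_FAILED)
        munmap(p, len);
    close(fd);
    return err;
}
```

For example, `try_mapping(path, O_RDONLY, 4, PROT_READ | PROT_WRITE, MAP_SHARED)` can be compared against the same call with `MAP_PRIVATE` to see how the two permission checks interact.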

  • @paulwomack5866
    @paulwomack5866 4 years ago +4

    mmap is awesome and cool, and easy to use. Just RTFM.
    But for professional use, it's useless without exception based error handling.
    Because errors that you could handle via return failure code using read/write/seek (etc) just become memory errors.
    I don't know of a way to do high grade error checking when using mmap().

  • @michaelclift6849
    @michaelclift6849 4 years ago +3

    What if my modifications change the size of the file? Do you have a suggestion for how to handle that?

    • @bonbonpony
      @bonbonpony 3 years ago +2

      In that case, you have to change the size of the file the usual way, i.e. by seeking to some offset past the end of the file and writing some random byte there. This has nothing to do with memory mapping, though, because memory mapping just makes a 1:1 map between some part of the file and some range of virtual addresses. You can't "insert" data into that mapped region and expect it to shift the data after it to higher addresses, as you do when inserting text in text editors. Instead, it works more like the overwrite mode in text editors: you can only overwrite what's already in there, inside that mapped region.

  • @MECHANISMUS
    @MECHANISMUS 1 year ago +1

    When could regular io be preferred over mmap?

  • @ProkenKey
    @ProkenKey 3 years ago

    Can you show an example of offset?

  • @benedictionbora5902
    @benedictionbora5902 1 year ago

    I get errors using "open", "close", "write", and "read", saying that they are "invalid in C99". How do I fix this (I am using Vim on macOS Big Sur)?
    "fopen" and "fclose" work fine, but I just wanted to try the other functions.

  • @athinakyriakou4440
    @athinakyriakou4440 5 years ago

    So, mmap() allocates memory on the heap like malloc()?

    • @JacobSorber
      @JacobSorber 5 years ago +7

      Sort of. Malloc actually uses mmap to get the memory that it then hands out to your program. Think of it this way. Mmap allows you to request blocks of memory from the kernel (the operating system's privileged code), but those blocks have to be a multiple of the system's page size (4096 on most systems). I actually made another video on this (czcams.com/video/XV5sRaSVtXQ/video.html). Hopefully, that helps fill in the rest of the details.

  • @spartacuspro88
    @spartacuspro88 5 years ago

    Is this just more convenient than allocating a buffer with malloc and then writing to it using fread, or are there other benefits?

    • @JacobSorber
      @JacobSorber 5 years ago +1

      For me, it's mostly about convenience. In some cases, it might also be a bit faster.

  • @adreto2978
    @adreto2978 10 months ago

    When should I use shm_open() over open()? People online say that in modern POSIX environments they're basically equivalent. Is this true?

  • @mostafaomar5441
    @mostafaomar5441 4 years ago

    What is the difference between using mmap for a file and reading the whole file into memory in the beginning, accessing it in the memory as much as you want, then writing it back to the disk before the program ends? Does mmap work somehow similar to this?
    Also, what does mmap do when the file is way too big to fit in the memory?

    • @JacobSorber
      @JacobSorber 4 years ago +6

      One difference is that you will write much less code, using mmap. The main advantage is that you leverage your operating system's virtual memory system to handle things like 1) what to read into memory when, 2) what to write out when, and 3) what to do when it doesn't fit into memory. The complete answer would be too long to give in a comment, but say you have a large file, but you only need to read part of it (some at the beginning and some somewhere in the middle-but you don't know where until you read the stuff at the beginning). You can use read/fread/fseek to get just the bits of the file that you want, but the code might get a bit complicated. With mmap, this will happen automatically. The OS will just page in the parts you actually access (assuming it's using demand-paging). Hope that helps.

  • @CerbTheUnidog
    @CerbTheUnidog 5 years ago

    What happens if you have an mmap'ed file in a running program, but you delete the underlying file?

    • @JacobSorber
      @JacobSorber 5 years ago +1

      Good question. You should try it out. I haven't explored all of the different failure scenarios, but I just tried deleting a mapped file on a simple example (on Linux), and it didn't change the program's behavior. My suspicion is that it's removing the directory entry, but not removing the file blocks while the mapping is active, but I could be wrong.
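
The experiment described in this reply can be reproduced with a small sketch. It assumes a POSIX system and a local file system, where `unlink` removes the name while the file's storage stays around as long as something (here, the mapping) still references it. `read_after_unlink` is a hypothetical helper, and `len` must not exceed the file size.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: maps the first `len` bytes of `path`, deletes the
 * file, then copies the bytes out of the mapping into `out`.
 * Returns 0 on success, -1 on failure. */
int read_after_unlink(const char *path, size_t len, void *out)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    void *data = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    if (data == MAP_FAILED)
        return -1;

    /* Removes the directory entry; the mapping still references the file. */
    if (unlink(path) == -1) {
        munmap(data, len);
        return -1;
    }

    memcpy(out, data, len);   /* still readable after the delete */
    return munmap(data, len); /* last reference gone: storage can be freed */
}
```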

  • @BHASKAR26able
    @BHASKAR26able 4 years ago

    Will this only help in printing string by string or can we also write complete output into file without doing so?

    • @JacobSorber
      @JacobSorber 4 years ago

      mmap just maps the bytes into memory. You can modify them however you choose, line by line, or just write in one big memcpy or memset operation. Or, you can just change a random byte in the middle of the file. It just treats the whole thing like a big array. It doesn't care how you access that array.

  • @TheMentalGentelman
    @TheMentalGentelman 5 years ago

    This is awesome; no more read and write calls. Whoo! Just a question though, I assume this works well with small to medium files but does this work with large files? Like, say, a one GB file?

    • @JacobSorber
      @JacobSorber 5 years ago +1

      You should try it out and see. (Yeah, it should, within reason.)

    • @bonbonpony
      @bonbonpony 3 years ago +7

      This is PRECISELY what `mmap` is used for: peeking through huge files fast as if they were memory. Of course it wouldn't be very smart to map a several-gigabyte file as memory (unless you actually have that much RAM, or at least swap space, but the latter would be slow anyway), because you rarely need the entire file at the same time. Instead, you map just some portion of the file (a buffer) as memory, do whatever you need with it, then you can map a different piece of the file at the same memory buffer, do some stuff with it, etc.
      You can think of it like opening that file in a text editor - after all you don't see the entire file in your editor, just some "window" of it (e.g. one page / one screen of text). But you can now move that window over the file to see different parts of it through that window. So the memory buffer is your "window", and mapping different parts of the file into that window at different times is like scrolling over that text ;)
      The good thing is that this way you don't have to load the entire file into memory to peek at its last couple of bytes - now you can just map the part you're interested in into memory.
      Better still is that with `mmap` the part of the file that you mapped into memory doesn't have to take any actual RAM space until it's actually needed! :> The system just remembers that this range of addresses is mapped to that range of data in the file, but it loads it into actual physical memory only when someone tries to access it. Until then, the pages that have not been accessed yet remain unloaded. Similarly, the system may swap some of the pages of your mapped memory back into the file (or swap file/partition) when it needs to free some physical memory for something else, until you try accessing those pages again (in which case it loads them back into memory).
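
The "window" idea described above can be sketched with the `offset` argument of `mmap`. This is a minimal sketch, assuming a POSIX system; `read_window` is a hypothetical helper. The one detail that tends to trip people up is that the offset must be a multiple of the page size, so the sketch rounds the requested position down and skips the extra leading bytes.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copies `count` bytes starting at file position `pos`
 * into `out`, mapping only the pages that cover that range.
 * Returns 0 on success, -1 on failure or if the range is outside the file. */
int read_window(const char *path, off_t pos, size_t count, void *out)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    struct stat sb;
    if (fstat(fd, &sb) == -1 || pos < 0 || count == 0 ||
        pos > sb.st_size || count > (size_t)(sb.st_size - pos)) {
        close(fd);
        return -1;
    }

    /* mmap offsets must be a multiple of the page size, so round down
     * and remember how far into the window the requested data starts. */
    off_t page = (off_t)sysconf(_SC_PAGESIZE);
    off_t start = pos - (pos % page);
    size_t skip = (size_t)(pos - start);
    size_t win_len = skip + count;

    char *win = mmap(NULL, win_len, PROT_READ, MAP_PRIVATE, fd, start);
    close(fd);
    if (win == MAP_FAILED)
        return -1;

    memcpy(out, win + skip, count);
    return munmap(win, win_len);
}
```

Moving the window is then just a matter of calling the helper (or `mmap` directly) with a different position.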

  • @SlideRSB
    @SlideRSB 2 years ago +1

    What's with the slasher movie music?

  • @feastures
    @feastures 5 years ago

    I've actually crashed Linux, long time ago, by allocating more memory than was physically available. Allocation (malloc()s) went all okay. But then I started writing to that memory, the kernel crashed at a certain point. I ran the same test program on SunOS, which didn't have this issue.

    • @JacobSorber
      @JacobSorber 5 years ago

      Software bugs happen in operating systems, too. Glad they seem to have fixed that issue.

  • @sibendupaul6250
    @sibendupaul6250 6 years ago

    @Jacob, in what kind of scenarios are memory-mapped files useful? Can they be used to dump data directly to RAM rather than disk while getting content from a TCP socket?

    • @JacobSorber
      @JacobSorber 6 years ago +3

      Memory-mapped files are typically used to efficiently read large files or to edit files in place. You can use them for writing as well, but mmap won't expand the file size so you would have to do that separately. If you're getting data from a TCP socket and trying to dump it to disk, I doubt that memory mapped files will make things any easier for you. You might get some performance benefit, but I'm not sure.

    • @sibendupaul6250
      @sibendupaul6250 6 years ago

      Thanks for the reply. I am trying to directly write the data obtained from a TCP socket and write it to the virtual memory of the process by using mmap files. Not on to the disk first. Any idea how good/bad it will be?

    • @JacobSorber
      @JacobSorber 6 years ago +1

      Is your goal to put the data into RAM or onto the disk? Whatever you do, the data from the socket will start out in memory (in the process's virtual memory). If you want it to be written to disk, you can use either conventional file IO or memory mapped file IO. In this case, I think conventional file IO would probably be simpler to program and might be just as fast (because with mmap, you'll have to keep increasing the file size). In the end, you could implement both and time them and see which is faster.

    • @sibendupaul6250
      @sibendupaul6250 6 years ago

      My goal is to put the data into RAM (process virtual memory which is already allocated). How can I put it directly into RAM? Assume the file size is not an issue. Any suggestions now?

    • @JacobSorber
      @JacobSorber 6 years ago +2

      So, when you call "read" to get data from the socket, you give it a pointer (an address) to a place in memory that you want the data to be copied to. That can be a global array, or something on the stack or heap. At that point, it's in RAM (in the process's virtual memory).

  • @HeavyRainMeditation
    @HeavyRainMeditation 4 years ago

    sys/mmap.h gives an error like "no such file or directory"

  • @homelessrobot
    @homelessrobot 3 years ago +1

    Next-level big-brain mmapping is when the file isn't a regular file, so it doesn't have anything at all to do with a disk, like a device file. The fd for the file has to be seekable (random access... like memory), but that's the only restriction. This is how you get memory-mapped device IO into userspace for efficient userspace device drivers/controllers. You can even double this back on itself to do your own userspace virtual addressing by remapping mapped portions of /proc/self/mem into a more convenient location.

  • @CrazedMachine
    @CrazedMachine 2 years ago

    So why wouldn't we use mmap every time in lieu of fopen?

  • @inanismailov
    @inanismailov 4 years ago

    Your code, for some reason, does not compile when I use GCC or G++ in my makefile. After some fiddling I managed to get it to compile, but when I run it the file size is either 1 or 0 (it's an entire novel), which is incorrect, or a segfault occurs. Any ideas on what could be causing this?

    • @inanismailov
      @inanismailov 4 years ago

      nevermind... thanks for the help :D

    • @JacobSorber
      @JacobSorber 4 years ago +1

      Glad you got it sorted out. Do you mind telling me what the issue was, for posterity's sake?

  • @mohssenelg4501
    @mohssenelg4501 1 year ago

    Can I read my file line by line using mmap()?
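
One possible way to walk a mapped file line by line is to scan for newlines with `memchr`, since the mapping is just an array of bytes. The sketch below assumes a POSIX system and `'\n'` line endings; `longest_line` is a hypothetical helper used only to show the loop.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: walks a mapped file line by line and returns the
 * length of the longest line (without its newline), or -1 on failure. */
long longest_line(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    struct stat sb;
    if (fstat(fd, &sb) == -1) {
        close(fd);
        return -1;
    }
    if (sb.st_size == 0) { /* nothing to map; mmap rejects a length of 0 */
        close(fd);
        return 0;
    }
    size_t len = (size_t)sb.st_size;

    const char *data = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (data == MAP_FAILED)
        return -1;

    long longest = 0;
    const char *line = data;
    const char *end = data + len;
    while (line < end) {
        /* The mapping is not NUL-terminated, so search with an explicit
         * length instead of relying on strchr-style functions. */
        const char *nl = memchr(line, '\n', (size_t)(end - line));
        const char *line_end = nl ? nl : end;
        if (line_end - line > longest)
            longest = (long)(line_end - line);
        line = nl ? nl + 1 : end;
    }

    munmap((void *)data, len);
    return longest;
}
```

Each iteration has `line` and `line_end` delimiting one line, so any per-line processing can go where the length comparison is.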

  • @davidramsay9321
    @davidramsay9321 3 years ago

    Works nice when compiling in Linux but the function is not available in Linux or Windows Mingw :(

  • @a4e69636b
    @a4e69636b 2 years ago

    The music sounds creepy, but thank you for the video.

  • @MECHANISMUS
    @MECHANISMUS 1 year ago

    into or onto?

  • @ignacionr
    @ignacionr 2 years ago

    why do you have books in Thai language on your shelf?

  • @coolshailendra2805
    @coolshailendra2805 6 years ago

    Is it possible to update the mapped memory when the underlying file is updated? Suppose an independent process A has read the file "file.txt", where the data is abcd. Later, process B updates this file to abcde. How can the mapped memory get this updated data?

    • @JacobSorber
      @JacobSorber 6 years ago

      Good question. Try it out and see. I think it probably won't work. I did a quick test and didn't see the changes reflected, but there might be a way to get it to update in both directions. In general, things get a little tricky when processes are messing with an open file concurrently.

    • @coolshailendra2805
      @coolshailendra2805 6 years ago

      Hi, I tried it. With MAP_SHARED it is possible, but the content of the file should be updated through open (from C or Python). :)
      All of this is working fine. Just one problem is left: can we mmap an empty file? Because if we do, we get an out-of-bounds pointer, which leads to a segmentation fault.

    • @JacobSorber
      @JacobSorber 6 years ago

      Can you explain exactly what you did that worked? Are you modifying the file in the same process that mapped the file?
      For the empty file case, just use "truncate" to extend the file length after creating it before mapping it.

    • @coolshailendra2805
      @coolshailendra2805 6 years ago

      Jacob Sorber, I followed these steps:
      1. Process p1 opens the file read-only (as I don't want to write).
      2. p1 mmaps the opened fd with MAP_SHARED.
      3. p1 prints the content through the pointer returned by mmap.
      4. p1 pauses here (I used cin/scanf).
      5. p2 writes to the same file from Python in another session.
      6. p1 prints again. The new changes are reflected here.

    • @JacobSorber
      @JacobSorber 6 years ago

      What OS are you using? I tried something similar and didn't see the changes. Trying to figure out what we're doing differently. Thanks.
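
A single-process version of the steps in this thread can be sketched as follows, with a second `open` plus `pwrite` standing in for the other process. It assumes a POSIX system and a non-empty file; `byte_seen_after_write` is a hypothetical name. On Linux a `MAP_SHARED` mapping is expected to show the new byte. Whether a `MAP_PRIVATE` mapping does is unspecified, and a tool that saves by writing a new file and renaming it replaces the file rather than modifying it, so an existing mapping would not see that kind of update. Either could explain differing results.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical experiment: maps the start of `path` read-only with
 * `map_flags`, overwrites byte 0 through a second descriptor, and returns
 * the byte now visible through the mapping (-1 on failure). */
int byte_seen_after_write(const char *path, int map_flags, char new_byte)
{
    int rfd = open(path, O_RDONLY);
    if (rfd == -1)
        return -1;
    const char *view = mmap(NULL, 1, PROT_READ, map_flags, rfd, 0);
    close(rfd);
    if (view == MAP_FAILED)
        return -1;

    /* Stands in for the "other process": a separate open() and write. */
    int wfd = open(path, O_WRONLY);
    if (wfd == -1 || pwrite(wfd, &new_byte, 1, 0) != 1) {
        if (wfd != -1)
            close(wfd);
        munmap((void *)view, 1);
        return -1;
    }
    close(wfd);

    int seen = (unsigned char)view[0];
    munmap((void *)view, 1);
    return seen;
}
```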

  • @Rebecca-sv3vd
    @Rebecca-sv3vd 3 years ago

    I still don't understand why this is faster. If we are in essence just hijacking the VM system's way of swapping from disk to memory, then why not let the VM system handle it? Also, what if the file is super big (either more than you have in memory or close to what you have in memory)? If we're mmapping to memory (requesting memory from the OS), doesn't this cut into its reserves?

    • @JacobSorber
      @JacobSorber 3 years ago +1

      With mmap, you are letting the VM system handle it. And, it's not *always* faster. It usually is, though. The virtual memory system already has techniques it uses for swapping memory in and out of RAM from disk. So, if the file is super big, those techniques will just bring in a part of the data as you need it.

  • @andrewdunbar828
    @andrewdunbar828 3 years ago

    Have you got any videos about learning Khmer?

    • @JacobSorber
      @JacobSorber  3 years ago

      No. Not really. Would you like to see more Khmer on the channel? 🤔

    • @andrewdunbar828
      @andrewdunbar828 3 years ago

      @@JacobSorber Well I was looking for a video on mmap but I'm also a language nerd and there are no good Khmer channels by foreigners. There's plenty on Chinese, Japanese, Korean, Thai, and Vietnamese. There's a niche waiting for you man!

  • @267praveen
    @267praveen 3 years ago

    What's the windows equivalent for this ?

    • @JacobSorber
      @JacobSorber  3 years ago

      I believe the function is called CreateFileMappingA, but it's been a while.

  • @bananalord8575
    @bananalord8575 2 years ago

    Great video why the creepy music tho?

  • @bonbonpony
    @bonbonpony 3 years ago +4

    So when you write something to that memory, _when_ does the operating system write this data into the file? I don't suppose it does it immediately, because that would dramatically slow down the access to that memory :q But in that case, my modifications to that memory won't appear in the file immediately, and they can be lost when there's a power outage in the meantime :q Is there some way to control when the operating system flushes those data to the file, or to cause it on demand?
    I'm also a bit dubious about your benchmarks, because from what I've been told, `stdio` functions use `mmap` under the hood to do their own buffering. E.g. when you `fseek` to some location in the file, it mmaps a piece of the file into its internal buffer beginning from that offset in the file. I think what might have slowed down your "without `mmap`" version is those multiple calls to `fseek` being done inside the loop, because that would most likely cause a lot of calls to `mmap` under the hood whenever you jump to an offset that is not in the range of the currently mapped buffer (besides the function call overhead).
    Also, when you use `MAP_PRIVATE` with `PROT_WRITE`, the changes won't actually be written to the original file :q The system makes a private copy of the file in memory, that is swapped as any other virtual memory region. It won't be swapped into the original file.
    P.S. What's with that dramatic horror music? :q Is it supposed to scare us from using that dangerous C language? :D
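The `MAP_PRIVATE` point above can be checked with a small sketch, assuming a POSIX system. It writes through a private, writable mapping and then reads the file back with `read` to see whether the file changed. The function name `private_write_leaves_file_unchanged` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical check: returns 1 if the file still holds its original bytes
 * after a write through a MAP_PRIVATE mapping, 0 if the file changed,
 * -1 on error. */
int private_write_leaves_file_unchanged(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return -1;
    if (write(fd, "abcd", 4) != 4) {
        close(fd);
        return -1;
    }

    char *p = mmap(NULL, 4, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    p[0] = 'X'; /* modifies this process's private copy of the page */
    munmap(p, 4);

    /* Read the file itself back through the descriptor. */
    char buf[4];
    int ok = lseek(fd, 0, SEEK_SET) == 0 && read(fd, buf, 4) == 4;
    close(fd);
    if (!ok)
        return -1;
    return memcmp(buf, "abcd", 4) == 0;
}
```

A return value of 1 means the store to `p[0]` stayed in the process's private copy and the file on disk kept `abcd`.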

    • @bonbonpony
      @bonbonpony 3 years ago +4

      OK, never mind, I'm answering my own questions :D
      The data stored in the memory-mapped region are _not_ written directly to disk until the system needs to _swap_ it (e.g. when it needs to grant physical memory to other processes). In that case, two different things may happen, depending on whether you used `MAP_PRIVATE` or `MAP_SHARED`:
      For `MAP_SHARED`, the original file is used as the swap space, so the system dumps the content from physical memory into the original file, then uses that physical memory at a different virtual address for a (possibly) different process.
      For `MAP_PRIVATE`, the system makes a separate (private) copy of the data from the file in virtual memory of the process, and in that case they are *never* saved into the original file! Instead, the system swaps them as any other virtual memory - into a separate swap file/partition - when it needs to use that physical memory for something else, and reads it back from that swap file/partition into memory when the process needs it again.
      So the actual writing to disk is rather unpredictable and controlled by the system (it happens when it needs to swap that memory).
      But there is a way to request it on demand: using `msync` call. Also, `munmap` does it naturally when you finish using that mapping.
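A minimal sketch of the on-demand flush described above, assuming a POSIX system. It writes through a `MAP_SHARED` mapping and then calls `msync` with `MS_SYNC`, which asks the system to write the modified pages to the file before returning. The function name `shared_write_with_msync` is hypothetical. Note that reading the file back only shows that the file content changed; it cannot prove the bytes reached the physical disk.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical check: returns 1 if a write through a MAP_SHARED mapping,
 * followed by msync, is visible when the file is read back; 0 if not;
 * -1 on error. */
int shared_write_with_msync(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return -1;
    if (write(fd, "abcd", 4) != 4) {
        close(fd);
        return -1;
    }

    char *p = mmap(NULL, 4, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    p[0] = 'X'; /* with MAP_SHARED this store targets the file's own page */

    /* Request a synchronous write-back of the dirty range. The address
     * returned by mmap is page-aligned, as msync requires. */
    int synced = msync(p, 4, MS_SYNC) == 0;
    munmap(p, 4);

    char buf[4];
    int ok = synced && lseek(fd, 0, SEEK_SET) == 0 && read(fd, buf, 4) == 4;
    close(fd);
    if (!ok)
        return -1;
    return memcmp(buf, "Xbcd", 4) == 0;
}
```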

    • @edwingarcia5043
      @edwingarcia5043 2 years ago

      @@bonbonpony interesting

  • @obinator9065
    @obinator9065 1 year ago

    1:55 the criminal music lol

  • @R4ngeR4pidz
    @R4ngeR4pidz 4 years ago

    Why is the background music so ominous? This is making me think this method is dangerous in some way o.o

  • @khomo12
    @khomo12 4 months ago

    👍👍👍

  • @12crenshaw
    @12crenshaw 2 years ago

    I wish it was so easy with winapi...

  • @jodiethemathgenius9204

    omg lego

  • @wenglish1968
    @wenglish1968 3 years ago

    Unfortunately, mmap isn't portable to Windows :-(

  • @jenaipour2mn
    @jenaipour2mn 2 years ago

    great video, thank you. Music is awful though...

  • @ahrarcorson6452
    @ahrarcorson6452 4 years ago

    Has anyone ever told you that you look exactly like Matthew McConaughey?

  • @MrTomro
    @MrTomro 3 years ago

    a really cool vid but bruh that music

  • @liamquinlan3456
    @liamquinlan3456 4 years ago

    Why are you calling fseek at all? fopen fully buffers the file by default, and fgetc increments the FILE structure's position counter already.
    Also I trust you are rebooting between benchmark runs so that the first run's residuals in memory/disk cache aren't slanting the scales?
    (lastly, "time" makes me sad. perf stat?)

    • @JacobSorber
      @JacobSorber  4 years ago

      The point of using fseek is to not read sequentially, because I expect to get very similar results if I read sequentially. As for the timing runs, I don't recall exactly what I did, but it wasn't intended to be a scientifically publishable experiment. Just a quick demo. I'm sure I played around with different orderings and different files, but I'm guessing I didn't reboot the machine between runs.
      Also, sorry to add to your sadness. Again, I was just going for a quick and dirty test, and I figured more people know and use time. I do like perf-stat. Might be a good topic for a future video.

    • @liamquinlan3456
      @liamquinlan3456 4 years ago

      @@JacobSorber see that's kinda what I mean though
      If you expect to get similar results *without* the random access component, then you're claiming the random access component is what's probably making the difference.
      Thing is, that means this test mostly boils down to showing you that array indexing is faster than calling an extern function.
      For what it's worth, that's probably still true at optimization levels other than what you used here...
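For anyone who wants to rerun the comparison discussed in this thread, here is a sketch of the two access patterns, assuming a POSIX system. It is not the code from the video: the function names, the fixed seed, and the small generator used to pick offsets are all hypothetical. Both functions read the same sequence of byte offsets, one through `fseek`/`fgetc` and one by indexing a mapping, so their results can be checked against each other before timing them.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: deterministic pseudo-random offset in [0, size). */
static size_t next_offset(unsigned long *state, size_t size)
{
    *state = *state * 1103515245UL + 12345UL;
    return (size_t)((*state >> 16) % size);
}

/* Sum `reads` bytes at scattered offsets using stdio. Returns -1 on error. */
long sum_with_fseek(const char *path, size_t size, int reads)
{
    assert(path != NULL && size > 0);
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    unsigned long state = 1;
    long sum = 0;
    for (int i = 0; i < reads; i++) {
        int c;
        if (fseek(f, (long)next_offset(&state, size), SEEK_SET) != 0 ||
            (c = fgetc(f)) == EOF) {
            fclose(f);
            return -1;
        }
        sum += c;
    }
    fclose(f);
    return sum;
}

/* Sum the same bytes by indexing a read-only mapping. Returns -1 on error. */
long sum_with_mmap(const char *path, size_t size, int reads)
{
    assert(path != NULL && size > 0);
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    const unsigned char *p = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return -1;

    unsigned long state = 1;
    long sum = 0;
    for (int i = 0; i < reads; i++)
        sum += p[next_offset(&state, size)];

    munmap((void *)p, size);
    return sum;
}
```

Both functions expect `size` to be the actual length of the file at `path`. Timing each one separately (with `time`, `perf stat`, or a clock call around the loop) would show how much of the gap comes from the per-read function calls versus the I/O itself; the sketch makes no claim about which is faster.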

  • @morganroberts5946
    @morganroberts5946 4 years ago

    You kinda remind me of Matthew McConaughey

    • @JacobSorber
      @JacobSorber  4 years ago

      A few people have said that. I'm guessing it's the hair. :)

  • @thanlim7749
    @thanlim7749 3 years ago

    You must speak Khmer. I see the Khmer dictionary in the background. I speak Khmer too :)

    • @JacobSorber
      @JacobSorber  3 years ago +1

      ល្អណាស់ រៀននៅណា ? មកពីស្រុកខ្មែរទេ?

    • @thanlim7749
      @thanlim7749 3 years ago

      Jacob Sorber wow you do speak Khmer:). That is awesome!!! Where did you learn it? Why did you want to learn Khmer. ខ្ញុំចេះពីស្រុកខ្មែរ

    • @JacobSorber
      @JacobSorber  3 years ago +1

      @@thanlim7749 I lived in Phnom Penh and Kampong Cham from 1998 to 2000.

    • @thanlim7749
      @thanlim7749 3 years ago

      Jacob Sorber that is awesome!

    • @thanlim7749
      @thanlim7749 3 years ago

      Jacob Sorber can I ask you a question about programming.
      For example, struct donuts *ring;
      How can I insert integers into that ring pointer? I am not sure if we can do that ?
      Thanks

  • @michaelfrank1048
    @michaelfrank1048 2 years ago

    This doesn't work. Trying to access indices of the string returns random numbers instead of the actual characters.

  • @michaelespinoza4562
    @michaelespinoza4562 8 months ago

    Life sucks now that i am an adult. I wish i could be a teenager again.

  • @charleshwankong
    @charleshwankong 2 years ago

    ABrAhAm lInCoLn

  • @littledev09
    @littledev09 1 month ago

    'Baccha chor'