Rust's Alien Data Types đœ Box, Rc, Arc
- Date added: 10. 06. 2024
- Rust's smart pointers can be a bit confusing for developers coming from garbage collected languages. Let's walk through some very simple examples to understand when and how to use the most common ones.
00:00 Intro
00:32 Box
03:34 Rc
09:08 Arc
11:42 Outro
---
Stuff I use to make these videos - I absolutely love all of these products. Using these links is an easy way to support the channel, thank you so much if you do so!!!
Camera: Canon EOS R5 amzn.to/3CCrxzl
Monitor: Dell U4914DW 49in amzn.to/3MJV1jx
Lens: Sigma 24mm f/1.4 DG HSM Art for Canon EF amzn.to/3hZ10mz
SSD for Video Editing: VectoTech Rapid 8TB amzn.to/3hXz9TM
Microphone: Rode NT1-A amzn.to/3vWM4gL
Microphone Interface: Focusrite Clarett+ 2Pre amzn.to/3J5dy7S
Tripod: JOBY GorillaPod 5K amzn.to/3JaPxMA
Keyboard: Redragon Mechanical Gaming Keyboard amzn.to/3I1A7ZD
Mouse: Razer DeathAdder amzn.to/3J9fYCf
Computer: 2021 Macbook Pro amzn.to/3J7FXtW
Caffeine: High Brew Cold Brew Coffee amzn.to/3hXyx0q
More Caffeine: Monster Energy Juice, Pipeline Punch amzn.to/3Czmfox
Building A Second Brain book: amzn.to/3cIShWf
- Science & Technology
ERRATA:
1. I mention that stack memory has faster access time than heap memory. While *allocating* and *deallocating* stack memory is much faster than doing so on the heap, it seems like access time for both types of memory is usually roughly the same.
I was just thinking about this at the beginning of the video. Heap and stack are just different areas of the same system memory.
What matters here is that the stack is used to keep the "frame", i.e. all the values that are local, to the current function. This is how, after a function call returns, local variables retain their values, and this is what makes recursion possible. This stack behavior is implemented by keeping a pointer to the "top" of the stack and, on each function call, moving that pointer by an amount equal to the size of the new function's stack frame. That's why the compiler needs to know the size of the stack frame, and consequently, the size of any local variable to a function. Every other object that's dynamic in nature, or recursive, will have to live outside the stack, i.e. using Box.
And like you just explained, deallocating on the stack is quite fast, since things aren't really "deallocated", the Stack Pointer is just moved back to where it was before the function call, while allocating and deallocating on the heap usually involves interacting with the Operating System to ask for available memory.
Great video! Keep it up!
I think "stack is faster than heap" is a pretty reasonable starting point, especially for a talk that isn't going into nitty gritty details about allocators and caching. Stack memory is pretty much guaranteed to be in your fastest cache, but with heap memory a lot depends on access patterns. If you have a really hot Vec then sure, there's probably no performance difference compared to an array on the stack. But for example a Vec where each String has its own heap pointer into some random page, isn't going to perform as well.
@@oconnor663 For most programmers who aren't going down the nitty-gritty sysprog hole, the assumption that "stack is faster than heap" covers 95% of all use cases. The most time spent when dealing with memory is allocating and deallocating, after all.
You'd need to set a register other than EBP, but the type of memory is indeed exactly the same, and the cache covers both. But there may be system calls when using the heap. "In an ideal world you'd have everything on the stack" - I disagree if that's taken absolutely; bear in mind the stack is limited in size, and you often cannot control what was stacked before your function is called or what will be stacked by the code your function calls. It's not appropriate for collections either, because it would complicate size management and cause more memory moves (which are very power-consuming). But I think you meant it otherwise, for small objects in simple cases where this isn't a concern.
These days memories are so large that people tend to forget about those limitations and then they are surprised the first time they have to deal with embedded code. ;-)
It makes total sense, both are in RAM. The thing is the stack is contiguous so writing to it is fast because the writes are sequential, while the heap is probably fragmented, which means random writes.
Edit: without taking into account what the others have said, about frames, OS allocation, etc, everything contributes.
Sir, your Rust tutorials are cohesive, easy to follow (due to great examples), and don't go overly deep into the details. Perfect combination. Keep up the good work.
Thanks for the kind words Miguel! It's thrilling to know that these videos can make these concepts a bit more palatable.
@codetothemoon, the way you described lifetimes just clicks
Honestly, I 've read about these things 3-4 times, and I more or less understand them, but it really clicks differently when someone tells you "these are the two main uses of Box: unsized things and self-referencing structs". Thank you, this is really helpful!
Nice, I'm so glad you found that perspective valuable!
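The "self-referencing structs" use of Box mentioned above can be sketched in a few lines. This is a minimal illustrative example, not code from the video: without the Box, the compiler cannot compute a finite size for the recursive enum.

```rust
// A recursive type needs indirection: each Cons would otherwise
// contain a full List inline, giving the type infinite size.
#[derive(Debug)]
enum List {
    Cons(i32, Box<List>),
    Nil,
}

fn main() {
    // Box moves each nested List onto the heap, so a List value on
    // the stack is always just (i32, pointer) - a known, fixed size.
    let list = List::Cons(1, Box::new(List::Cons(2, Box::new(List::Nil))));
    println!("{:?}", list);
}
```

Removing the `Box` here reproduces the "recursive type has infinite size" compiler error discussed in the video.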
Stuff on Cell and RefCell would be exactly what I'm looking for, thanks for these great videos! đ
Nice, I've put it on the video idea list!
As far as I can see, if your implementation requires RefCell then your implementation is probably wrong. ;)
WOW WOW WOW! Rust is my favorite programming language, and I've used it for all sorts of things, but I've never dived into smart pointers (except Box) and this was super helpful!
Nice, glad you found it valuable!
Thanks for the helpful video! It takes me a bit to catch everything on the first time around so I need repeat parts, but the clear examples and broken down explanation really help a lot.
It should be noted that in the Rc example, you could just have written truck_b.clone() instead of Rc::clone(&truck_b)
The Rust book teaches it like he did, Rc::clone(&an_rc); I think the reason is just to be explicit that only the pointer is cloned. Nice to know both ways are fine.
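A tiny sketch of that equivalence (the variable names are invented to echo the video's truck example): both spellings bump the reference count, and neither copies the underlying data.

```rust
use std::rc::Rc;

fn main() {
    let truck_a = Rc::new(String::from("Truck A"));

    // These two lines do exactly the same thing: clone the pointer,
    // increment the strong count, leave the String untouched.
    let _truck_b = Rc::clone(&truck_a);
    let _truck_c = truck_a.clone();

    assert_eq!(Rc::strong_count(&truck_a), 3);
}
```

`Rc::clone(&x)` is often preferred in Rust code simply because it signals to the reader that the clone is cheap.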
This is sooo awesome!! I never understood the concept of Arc pointer until now, thank you so much :D
thanks for the kind words, really happy you got something out of the video!
Great video, concise and well explained, just what I was looking for Rc. Please keep them coming.
Nice CJ! Glad you found it valuable - more to come!
Your tutorial is very clear and easy to understand. Thank you so much.
I hope you will create a video about RefCell soon.
I'm so glad that I found your channel. So easy to understand now
Amazing help! Instantly subscribed.. I've been trying to figure out Dependency Injection in Rust and had no idea Rc is what I needed.
you're doing amazing work doing those videos! please keep going. it would be also cool to see ffi and unsafe rust
Thank you gorudonu! More on the way, and I've put FFI/unsafe on the video idea list.
You're a great teacher. I would love videos where you develop small programs that illustrate various language features.
It was very helpful to put forward usage scenarios.
You explained these complicated concepts so clearly~Thx!
Glad it was helpful!
I love your videos. Thanks for taking the time to make these videos.
Great video! I finally understood smart pointers and their appropriate use cases đ
Thanks Ramkumar, so happy it helped you!
These are extremely nice videos, thank you!
Thanks and thanks for watching Jos!
Loved your video. There were some handy pointers in there đ„. But I'd absolutely love to see a video covering RefCell
Haha! Seems like there is a lot of desire for RefCell, I've placed it high on the video idea list.
Absolutely love your videos! Keep up the great work. đ
Thanks so much for your support Fotis!
Thanks for your great content!!
Your tutorials are clean, comparatively fast and easy to understand
Thanks Namaste (amazing name btw!), glad you found it valuable!
Thanks a ton for creating this!
Can't wait for new rust videos.
Thanks for watching, more to come!
Such high quality videos. Thank you :)
thanks for watching!
Welcome back
Thanks!
These videos are wonderful as someone new to the language. Thank you!
Great, that's precisely what I'm aiming for! Glad you found it valuable!
I saw a lot of examples, including THE BOOK, Rust by Example, and a lot of youtube videos, and still didn't fully understand the why, how, and what. Now I think I finally understand Rc. Thank you.
Also, to mention another Box use case: the first use case covers it, but it's not straightforward. Imagine that a function could return any of several structs that implement the same trait. In that case, the return type cannot be known at compile time, so we need to make it Box<dyn Trait>
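A minimal sketch of the pattern this comment describes (the `Vehicle` trait and the two structs are invented for illustration): the concrete type is chosen at runtime, so the return value must live behind a pointer with a known size.

```rust
trait Vehicle {
    fn wheels(&self) -> u32;
}

struct Car;
struct Motorcycle;

impl Vehicle for Car { fn wheels(&self) -> u32 { 4 } }
impl Vehicle for Motorcycle { fn wheels(&self) -> u32 { 2 } }

// `dyn Vehicle` is unsized, so it can't be returned by value;
// Box gives it a fixed-size home on the heap.
fn make_vehicle(two_wheels: bool) -> Box<dyn Vehicle> {
    if two_wheels { Box::new(Motorcycle) } else { Box::new(Car) }
}

fn main() {
    assert_eq!(make_vehicle(true).wheels(), 2);
    assert_eq!(make_vehicle(false).wheels(), 4);
}
```

Where every call site returns the same single type, `impl Vehicle` works without a heap allocation; Box is needed precisely when the branches return different concrete types.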
Wow. Amazing content!!!
thank you!! đ
Thanks for this video! These smart pointers are confusing. Could you also cover Cow in one of your next videos?
Seems like we have a few requests for Cow, Iâve added it to the video idea list!
@@codetothemoon thanks!
This video is great, thank you for making it.
Thanks for watching!
The quality of these videos is great, 60fps is a nice touch
Thanks Gavin! Impressed you noticed the 60fps ;)
Very good meta information! Thank you
Thanks and thanks for watching!
Thanks, just what I needed
glad it was helpful!
Very helpful thanks!
Glad you found it valuable, thanks for watching!
great content!
Wow that's an excellent video!
thank you, glad you got something out of it!
great video!
i'm liking the quick vids
glad to hear, thanks for watching!
Great video! I think a simpler way to explain the difference between Rc and Arc, without mentioning reordering, is that the internal strong and weak counters are represented as AtomicUsize in Arc (i.e. thread-safe) and plain usize in Rc (i.e. non-thread-safe).
Thanks and thanks for the feedback! Touching on ordering was probably a little confusing; to your point, I probably could have just mentioned the different counter types, and that one is thread-safe while the other isn't.
awesome video, thanks.
thanks, glad you liked it!
Literally the best explanation of Box I've found
nice, really happy that you found it valuable!
This video is great, thank you
Glad you found it valuable, thanks for watching!
Just the vid I needed
nice, glad you found it valuable!
I like the pace of this video.
Thanks Thorkil, glad you liked it!
best rust tutorial online, period
thank you so much!
The stack is not faster than the heap. Both are locations in main memory. True, stack values might partially live in registers, but in general the stack is no different from the heap. Heap memory involves an allocator, which of course causes more overhead (internally, some atomics need to be swapped and free memory has to be found). But stack and heap are both located in equally fast main memory.
I misspoke on this - thanks for pointing it out! I made a pinned comment about it.
What about the RefCell? It is mentioned in the intro but never explained what it does
I excluded it from this video to keep things concise, and I wasn't convinced it would be useful for the vast majority of folks. But several people have requested I cover it, so I may at some point. In the meantime there is coverage of it in one of the later chapters of the Rust book.
Good stuff, just came across Box today
Thanks Brandon!
btw mem::drop is in the prelude, so you can just use drop(...)
ohh nice thanks for the pointer (no pun intended) !
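For the record, a small sketch of what that tip looks like in practice (the truck name is borrowed from the video's example): `drop` is an ordinary prelude function that simply takes ownership, which for an Rc means decrementing the strong count.

```rust
use std::rc::Rc;

fn main() {
    let truck = Rc::new(String::from("Truck"));
    let truck2 = Rc::clone(&truck);
    assert_eq!(Rc::strong_count(&truck), 2);

    // No need for std::mem::drop - plain drop() is already in scope.
    drop(truck2);
    assert_eq!(Rc::strong_count(&truck), 1);
}
```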
production. Thanks again!
Thank you too!
Watched a bunch of videos before this and didn't really get it at all. Now I feel like I have a pretty good idea of how to use each
Julian - that's fantastic! It thrills me to make tough concepts more palatable.
This was a super helpful primer on why/when to use these types! Would love to see more content building on it.
I'm trying to form some internal decision tree for how to decide how long a given piece of data should live for. Going to go see if you have any videos on that topic right now... đ

great, really happy you got something out of the video! I don't have a video specifically on deciding how long a piece of data should live for, but "Rust Demystified" does cover lifetimes.
Thanks!
You got a new subscriber !
Thanks Tsiory, very happy to have you onboard!
I am unsure whether one should practice both safe and bad programming. At least it is safe, I suppose. Specifically, I do not understand one of these clone examples, when good programming might ask the instance to remain a singleton all the way through (both literally and figuratively). You show us how to do it, and you behave as if: awesome.
they are singletons - when we call clone on the Rc/Arc smart pointers, it's the pointer that's being cloned, not the underlying data
@@codetothemoon That you can do it is not the point.
you are awesome!!
thank you, glad you found the video valuable!
help me a lot. Thx.
As a C# developer my understanding is that Rc basically turns structs into classes
How so? I thought C# uses garbage collection as opposed to reference counting?
@@codetothemoon I didn't mean on the memory allocation part, more so of how reference types work in C#
I wouldn't say stack memory is faster to access, just that the allocation and deallocation is faster. It might be a bit faster in certain conditions since it will stay in cache most of the time.
Got it! Yeah my understanding was that stack memory is more likely to be stored on the CPU cache - but maybe that's possible for the heap as well... Though I haven't actually benchmarked this, maybe I'll do that...
Ordinary variables could also be assigned by the compiler to CPU registers, which makes them as fast as they get. This doesn't happen to the heap-allocated variables.
@@codetothemoon Access is fastest when the data is "near" the recent access. Which is a part of why data oriented programming is so much faster.
but I bet the methods of memory access have changed so much that what we are taught is not what is implemented in the most recent technology
Finally a rust tutorial that clicks !
awesome, glad you got some value out of it!
Still fairly new to Rust. If a routine has a reference to a cloned structure, can the structure be changed through it, or does the routine more or less get a copy?
Omg, I _love_ your intro graphic, played at 0:30. *It's short!* Who wants to sit through 5 or 10 seconds of boring intro boilerplate every time we visit a channel? Like a bad modal dialog box on some Windows 95 app, it drives me nuts.
thanks modolief! I'd thought about creating a little intro reel, but every time I consider it I conclude that it would hinder my mission to provide as much value as possible in as little time as possible
@@codetothemoon The channel "PBS Eons" also has a really good intro bit. They start their video, then at around 20 or 30 seconds they give their little imprint. But what I really like about it is that even though it's more than about 3 seconds it fades out quickly, and they already start talking again before the sound is done. Very artistic, yet not intrusive.
It's short, which I like, but the sound is kind of jarring.
Was watching your Box part and was like... yep, I know those errors đđđ
they are a rite of passage every Rust developer must traverse.... đ
very nice video
thank you!
Nice continue
Thanks I shall continue!
Sir, what extension do you use to get the "Run" UI above the main function?
Me (a frontend javascript webdev): fascinating!
nice, it seems like many JS frontend devs are interested in Rust!
Hey man, I really like your VSCode theme, can you tell me which one are you using?
Sure it's Dark+!
@@codetothemoon Thanks! Have changed my theme.
So in the RC example would the memory exist until the main function gets completed since it adds to the strong count?
that's correct! Rc doesn't really help much if you intend to hang on to one reference until the program ends - you could just use regular borrows in that case - but in this example, to show the strong_count function, I just kept a reference in main.
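To make the scoping concrete, here's a small sketch of how the count rises and falls (variable names echo the video's example; the inner scope stands in for any shorter-lived owner):

```rust
use std::rc::Rc;

fn main() {
    let truck_b = Rc::new(String::from("Truck B"));
    assert_eq!(Rc::strong_count(&truck_b), 1);
    {
        let _facility = Rc::clone(&truck_b);
        assert_eq!(Rc::strong_count(&truck_b), 2);
    } // _facility dropped here, count decremented back to 1
    assert_eq!(Rc::strong_count(&truck_b), 1);
    // The String itself is freed only when the last Rc - the one
    // held by main - goes out of scope at the end of the program.
}
```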
Good video and One RefCell pls.
Thanks, will do one eventually, wishing I had done it for Halloween as I think it has the appropriate level of spookiness đ
11:02 this is what I don't like Rust for. Where did we pass truck_b ownership to the thread? I don't see any obvious code that tells me truck_b moved to the thread. The variable of type Arc is cloned, so why does ownership pass?
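What moves into the thread is the clone, not the original; the `move` keyword on the closure is the code that transfers it. A hedged sketch, with names echoing the video's example rather than its exact code:

```rust
use std::sync::Arc;
use std::thread;

fn main() {
    let truck_b = Arc::new(String::from("Truck B"));

    // Cheap: copies the pointer and atomically bumps the count.
    let clone = Arc::clone(&truck_b);

    // `move` transfers ownership of `clone` (not truck_b) into the
    // closure; truck_b itself stays usable here in main.
    let handle = thread::spawn(move || clone.len());

    assert_eq!(handle.join().unwrap(), 7);
    assert_eq!(truck_b.len(), 7); // original still owned by main
}
```

Without `move`, the closure would try to borrow `clone`, and the compiler would reject it because the thread may outlive the stack frame.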
Great video. What is this vscode theme?
Thanks and thanks for watching! VSCode theme is Dark+
Hey please create a video about refcell and cell!
Definitely doing this at some point, given the spooky factor it would have been a good one for halloween, but unfortunately it probably won't be ready in time đ
Yeah, please do RefCell as well. I'd also love you looking at Axum/Hyper/Tower ecosystem, or some of the popular data parallel computing libs.
I've added RefCell to the video idea list! I've been curious about those frameworks as well, especially Axum.
The stack and the heap are just as fast, because they are on the same system memory. What takes time is allocation and pointer dereferencing.
yeah, now I see the stickied comment
One more thing. I'm assuming that for clarity you used the explicit Arc::clone instead of the method-call version. You can call .clone() on an Rc/Arc and it will clone the reference instead of the data.
thanks for pointing this out - I should have mentioned this in the video if I didn't!
If you are going to cover RefCell, you should surely also cover its siblings: Cell, UnsafeCell, Mutex, and RwLock.
I have another video for all of these (except UnsafeCell) - check out âRust Interior Mutabilityâ
My fave is Cow; Clone on Write
Nice
Thanks!
Could you demonstrate or explain Yeet? Love your eplanations
I had to look this up - is this what you're referring to? lol areweyeetyet.rs/
Sorry, I misspelled it. It's Yew - the GUI framework for Rust
@@noblenetdk Oh actually I already have a video about Yew - check out "Build A Rust Frontend" from earlier this year!
I'm interested how Rc knows when data is going out of scope, or being dropped like you did. How is it aware that the memory is no longer accessible after a specific point without knowing where the objects are created in the program? How does the Rc know that there is a reference to truck_b in the main function, for example?
great question, in Rc's implementation of clone there is `self.inner().inc_strong();` which increments the strong reference counter. So it doesn't necessarily know where the references are, it just increments a counter each time one is created. Then in Rc's implementation of the Drop trait (which has a drop method that is invoked when the implementor goes out of scope) we have `self.inner().dec_strong();` and then `if self.inner().strong() == 0 { /* code for cleaning up memory here */ }`
@@codetothemoon Ohh I see :)) Thanks very much, that makes sense!
Whenever i use the GMS and put it in the soft, it holds out the note forever! please help, i am very confused
đ„
What about the Cow type? Still struggle with that, even when I have the documentation open
been meaning to make a video about it! stay tuned...
are Rc's safe? How do they prevent immortal reference loops?
Why is "recursive without indirection" an error? (3:00 ish)
Love rust đ
me too!
Awesome, would like to see video on RefCell
I've put it on the video idea list!
Less goooooooooooooo!!!!!!
đđ
God video!
Haha thanks!
This is timely for me. I ran into Rc and cell last night while trying to learn rust with GTK. I find it all very confusing. Anything you can provide including RefCell is greatly appreciated. Thanks.
It's a single-threaded mutex (well, a read/write lock). This might seem useless, but it can be used to create shared references that can still be modified: make an Rc<RefCell<T>>, which you can clone freely, but you can still lock it for mutable writing. (If you try to take multiple write locks at the same time, the thread will panic.) It's sort of like a pointer to an object in a regular OO language. You can also use it to make mutable thread-local data.
Keep in mind anything containing a refcell can't be sent across threads. They're also a pain to serialize.
sorry-- RefCells can be sent but references to them can't be sent, and Rcs / references to rcs can't be sent.
RefCell seems to be frequently requested, I'll probably make a video about it! In the meantime it looks like like strangeWaters has a good description, and there is also an explanation in chapter 15 of the Rust book.
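The Rc<RefCell<T>> pattern described in this thread can be sketched briefly (the data here is made up for illustration): Rc provides the shared ownership, RefCell moves the borrow check from compile time to runtime.

```rust
use std::cell::RefCell;
use std::rc::Rc;

fn main() {
    // Shared, mutable state in a single thread.
    let shared = Rc::new(RefCell::new(vec![1, 2]));
    let alias = Rc::clone(&shared);

    // Mutate through one handle...
    alias.borrow_mut().push(3);

    // ...and observe the change through the other.
    assert_eq!(*shared.borrow(), vec![1, 2, 3]);

    // Holding two overlapping borrow_mut() guards at once would
    // panic at runtime rather than fail to compile.
}
```

The multi-threaded analogue, as the comment above notes, is Arc<Mutex<T>> or Arc<RwLock<T>>.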
For atomics, it is more than just the compiler having to forgo some optimizations: it also has to tell the CPU not to reorder, to lock the bus, and to handle cache-coherency issues.
Both an INCrement and a DECrement, really have three parts load/read, compute, and store/write. Normally, both the compiler and the cpu can reorder many things and be lazy.
So if you had pseudo-code:
y=sin(x); if (cond) {i++}; printf("%d\n",i);
then the compiler could reorder it to asm (pseudo x86):
mov %eax, [i]
mov %ebx, [cond]
fsin x
jz %ebx, prnt_label
inc %eax
prnt_label:
push %eax
push "%d"
call printf
mov [i],%eax
We can have a lot going on between mov %eax, [i] (LOAD) and mov [i],%eax (STORE). For an atomic, the compiler needs to combine mov %eax, [i], inc %eax, and mov [i],%eax into a single instruction: inc [i].
But it also has to go further and add a lock prefix. The lock prefix tells the CPU that it has to hold the bus during the whole LOAD/COMPUTE/STORE phases of the instruction so another CPU doesn't do anything in the middle of all this. It also has to make sure that if other CPUs have L1, L2, etc. caches referencing that memory, they get invalidated.
c9x.me/x86/html/file_module_x86_id_159.html
Woah
Synchronization is expensive. Complexity in the code, complexity in the instructions, complexities in the CPU itself.
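In Rust, that locked read-modify-write is what the atomic types emit under the hood. A minimal sketch (the counter and thread counts are arbitrary): because `fetch_add` is a single atomic instruction, increments from different threads can never interleave and lose updates.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

fn main() {
    let counter = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..4)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..1000 {
                    // Atomic read-modify-write; on x86 this compiles
                    // down to a lock-prefixed add.
                    counter.fetch_add(1, Ordering::SeqCst);
                }
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(counter.load(Ordering::SeqCst), 4000);
}
```

This is exactly the machinery Arc uses for its strong/weak counts, and exactly what Rc skips by using a plain usize.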
How is cyclic data handled by Rc? Since we can mutate the data, we can store a clone of the Rc inside its own value, right? Thus causing the data to never be deallocated.
It isn't. You can define circular data with Rc that will never be deallocated. It's the programmer's job to handle this case correctly.
This was actually at the centre of the Leakpocalypse. It was decided that, while accessing deallocated memory is `unsafe`, leaking memory isn't.
You can somewhat get around this with weak references, to get circular data with deallocation, but it gets complicated pretty quickly.
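A small sketch of the Weak workaround mentioned above (the `Node` type and its fields are invented for illustration): a Weak reference doesn't count toward the strong count, so a back-pointer through Weak can't keep the cycle alive.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    // Weak back-pointer: doesn't contribute to strong_count,
    // so parent<->child links can't leak.
    parent: RefCell<Weak<Node>>,
    value: i32,
}

fn main() {
    let parent = Rc::new(Node { parent: RefCell::new(Weak::new()), value: 1 });
    let child = Rc::new(Node { parent: RefCell::new(Weak::new()), value: 2 });

    *child.parent.borrow_mut() = Rc::downgrade(&parent);

    // upgrade() yields Some while the parent is still alive...
    assert_eq!(child.parent.borrow().upgrade().unwrap().value, 1);

    drop(parent);

    // ...and None afterwards: the Weak link didn't keep it alive.
    assert!(child.parent.borrow().upgrade().is_none());
}
```

Had the back-pointer been a strong Rc instead, dropping `parent` in main would leave the strong count at 1 and the allocation would never be freed.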
Hmm... Interesting. Maybe there is no cost for accessing a variable stored on the heap, but rather a cost for allocation.
Yeah, I definitely appreciate that stack vs heap is much more nuanced than I made it out to be in this video...
now i understand what people mean when they say the learning curve of rust is steep
It's really challenging. But so interesting. And as I learn Rust, I feel as though I am learning very important concepts that are key to becoming a proficient software engineer.
4:10 Truck structure... struckture
lol nice!
Why do you write Rc::clone() explicitly, instead of truck.clone()?
đ€ I would understand them more intuitively if they were named more intuitively and consistently. One is a single-ownership pointer, uniquely owned. One is a shared-ownership pointer, implemented via reference counting. Another is the same as the previous, just with interlocked atomic increment/decrement. Names like "Box" and "Arc", though, feel pulled out of a hat. A box has height, width, and depth, but there is nothing volumetric in Rust's "Box" (and loosely co-opting the concept of "boxing" from C# feels weird here).
Rc stands for Reference Counted and Arc for Atomically Reference Counted; they are just abbreviations, which is good because they are frequently used - imagine writing ReferenceCounter every time, especially when you have to wrap many things in them.
As for Box, it could maybe be named better, but no other type is likely to want the name "box" - a math library would call its type cuboid, cube, rectangular prism, or something else. For frequently used types, short names are good.
Totally understand your frustration - to add to the other response, I believe "Box" and "Boxing" are terms that have histories that extend well prior to the inception of Rust, but are usually hidden from the developer by developer-facing language abstractions. I think Rust is just one of the first to actually expose the term directly to the developer.
@@codetothemoon Example dated usage: X. Leroy, Unboxed objects and polymorphic typing, 1992. The terms have been used in libraries too, at least since 2007 in Haskell and 2000 in Steel Bank Common Lisp. I suspect it could be traced back several decades more.
2:11 Why would accessing the heap be slower? It's still RAM like the stack, and can be cached by the CPU like any other memory. The only drawback of the heap is that it can suffer from fragmentation during allocation and deallocation. But it's incorrect to say it has slower access time.
Allocation and deallocation themselves are slower for the heap. Moreover, (just reading this from StackOverflow), the heap often needs to be thread-safe, meaning it cannot benefit from some of the same optimisations as the stack can.
@@spaghettiking653 yes fragmentation can make allocation slower, but memory access isn't slower, which is what the video implied. Having an object on the heap is exactly as fast as anywhere else, and fragmentation issues only occur in rare cases. We're talking literal nanoseconds slower to find free space on the heap instead of putting it on the stack. Unless we're talking about a very hot loop on performance critical software, it doesn't matter, and you shouldn't allocate in a hot loop anyway.
@@sbef Yes, fair point. What about the problems with thread safety? I really have no clue whether that's a real concern or whether it is a problem at all, as I literally read it minutes ago. What do you think/know?
Yeah I may have misspoken a bit here - stack memory is faster to allocate / deallocate than heap memory. Would patch this if I could :/ I'll pin a comment.
@@spaghettiking653 not sure how thread-safe the Rust default allocator is, to be honest, but I would expect it to be pretty much lock-free even in heavily concurrent applications. It's not my area of expertise, but allocator technology has been refined over the past 3 decades.