Crust of Rust: Atomics and Memory Ordering

  • Uploaded 5 Sep 2024

Comments • 131

  • @haystackdmilith
    @haystackdmilith 3 years ago +150

    "Don't use spinlocks! Don't implement spinlocks! Ok, let's implement a spinlock". Love it!

    • @sycration
      @sycration 3 years ago +1

      @Cooper Rene LOOL just use a private tracker, don't be a punk

  • @molozful
    @molozful 2 years ago +37

    It's incredible how coherent you can be in a live session, and on such a tricky topic. Kudos!

  • @KohuGaly
    @KohuGaly 3 years ago +108

    This video exposes one concept that you rarely notice in regular code. Computer code looks sequential when written down as text, but in actuality it defines a directed graph of operations. If two operations don't lie on the same path through the edges of the graph, their relative order is undefined.
    Multi-threaded memory access exposes this weird quirk because it may itself lack inherent ordering (i.e. it allows for race conditions).
    Memory ordering introduces extra edges to the graph to restrict the possible orderings.

    • @andynicholson7944
      @andynicholson7944 3 years ago +11

      🤯 that’s a really clear explanation. I’d never thought of code in those terms. Thanks for the perspective shift!

    • @michalkupczyk7
      @michalkupczyk7 3 years ago +4

      You may want to mentally map Ordering to Transaction Isolation levels in SQL databases.

    • @carterthaxton
      @carterthaxton 3 years ago +5

      Another important ingredient is out-of-order execution by modern processors. Normally such CPUs and their caches provide a programmer-visible model that appears to execute sequentially. When you run multiple threads, especially on symmetric multiprocessors (e.g. a quad-core CPU), these out-of-order execution models can expose far more interdependence than you might think could occur just due to "parallel execution". Ever since the Pentium Pro, for example, your CPU isn't even executing the raw instructions. It's converting them into micro-code, which is analyzed in real time, much like by a modern compiler, and reordered according to the directed graph of operations you describe. Then as many of those micro-code instructions as possible are executed simultaneously. Add to that the presence of multi-layer caches shared between multiple processors, and the whole mental model of a CPU that executes one instruction at a time just completely breaks down.

    • @JordanBourdeau
      @JordanBourdeau 6 months ago

      This is a great explanation, thanks for sharing!

    • @Axman6
      @Axman6 2 months ago

      This is related to how Haskell's IO type is defined (in GHC) - it is a state monad which passes around a special value (RealWorld#) which the compiler treats as being different every time it sees it: data IO a = IO (RealWorld# -> (a, RealWorld#)) - this creates a strict ordering between IO actions, the compiler can't change them around because the RealWorld# token would no longer be passed between the actions in the same order. Other Haskell code is exactly what you've said, a graph of values, but it's one in which the order doesn't matter, because the values are always pure.
      (I'll note that "ordering of actions" didn't require the use of the m-word, that's what we call the ubiquitous pattern that IO fits nicely into)

  • @kennichdendenn
    @kennichdendenn 3 years ago +76

    I want to give feedback:
    This is great. It fits my difficulty level perfectly and is exactly what I need - as the 2018 guideline suggests.
    I have some experience with Rust and am a computer science student just before my bachelor's degree, so I have some experience with programming and with how a computer works on several levels. Heck, I wrote a compiler for a small subset of C. But I had, for example, never heard of Ordering before.
    These topics normally have a super high barrier to entry because of all the prior knowledge required - only that knowledge has the same barrier. Your videos help break that down and teach a lot about how computers work in general.
    Keep up the great work. I wish all my lectures were even half as engaging, informative, accessible and interesting.

  • @longtomjr22
    @longtomjr22 3 years ago +15

    My wife made me watch the intro 7 times. Makes us laugh every time! (Now if only I could also get her as excited about coding again)

    • @aqua3418
      @aqua3418 2 years ago

      Just have to come at it from an angle that's applicable to her in some way, that way she can see the possibilities and how it benefits/helps her :D

  • @ericseppanen6579
    @ericseppanen6579 3 years ago +7

    One of the exciting aspects of fetch_update (or any CAS loop that generates the new value from the old) is that your closure may be executed more than once. If your closure has side-effects, it sometimes surprises people that they get the side-effect multiple times.
    A more common mistake is for the closure to read and write to a value that exists outside the closure (i.e. "if !initialized { initialized = true; new_value = 42 }"). If the first CAS attempt fails, then the second run of the closure generates an unexpected result because it inherits state from the first run of the closure. I've seen this bug multiple times in production code.
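
The pitfall described above can be sketched in Rust. This is an illustrative example of mine, not code from the video: because a failed CAS attempt re-runs the closure passed to `fetch_update`, the closure must be a pure function of its argument and must not mutate captured state.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Illustrative only: a CAS loop whose closure derives the new value solely
// from the observed old value. `fetch_update` may invoke the closure several
// times if other threads race, so side effects here would be repeated.
fn bump_even(counter: &AtomicU32) -> u32 {
    counter
        .fetch_update(Ordering::Relaxed, Ordering::Relaxed, |v| Some(v + 2))
        .unwrap() // Ok(previous value) on success; the closure never returns None here
}

fn main() {
    let c = AtomicU32::new(0);
    assert_eq!(bump_even(&c), 0); // returns the value *before* the update
    assert_eq!(c.load(Ordering::Relaxed), 2);
}
```

The buggy variant from the comment would capture an `initialized` flag from outside the closure; the second run then inherits state from the first, which is exactly the bug described.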

  • @zamalek4079
    @zamalek4079 3 years ago +14

    Jeff Preshing has an incredible blog (and I think some videos) on this. He's a C developer, but as you said that doesn't matter.

  • @Ergzay
    @Ergzay 3 years ago +5

    For anyone testing along at home: while trying the first section with the "bad" mutex, it reliably generated completely random numbers every run with just "cargo run" instead of "cargo run --release", even without the yield. However, I'm running on a MacBook Air with the M1-based ARM processor, so that probably increases the randomness, as ARM's memory model is a little more relaxed than Intel's, I believe.
    Also, with everything on Relaxed, I still got reliable failures even after switching to the "correct" implementation using compare_exchange. The value was very close to correct, though, and passed most of the time.

  • @nilshaberstroh2705
    @nilshaberstroh2705 3 years ago +3

    Another topic I might have avoided for much longer. But you manage to explain very complicated things (at least to me) in a very digestible way. Hats off 🎩. And the MESI protocol is indeed an interesting topic. Thanks for that hint as well.

  • @shashanksharma21
    @shashanksharma21 1 year ago +1

    Thank you for explaining this in such a brilliant manner! This is the first time I have been able to understand this ordering issue. Thank you Jon!

  • @oerglwoergl
    @oerglwoergl 3 years ago +7

    Very interesting, thanks. The general rule seems to be (as with crypto): don't roll your own (if you can avoid it).

  • @kanuos
    @kanuos 1 year ago +2

    Could not have asked for anything better than this! Thank you so much for the work you do for us!

  • @BrazenNL
    @BrazenNL 2 years ago +1

    7:00 I couldn't live without Dark Reader!

  • @elio8119
    @elio8119 3 years ago +1

    I think this is the best explanation of how to use atomics.

  • @RobertWHurst
    @RobertWHurst 1 year ago +1

    Thank you for creating so many of these awesome videos Jon. I find them very beneficial personally, and always enjoy the adventures you take us on. Cheers 🍻

  • @amirh6712
    @amirh6712 2 years ago +1

    Thank you so much for the great video Jon
    Just want to point something out that I think you overlooked a little bit.
    There is also another guarantee with Relaxed ordering, which is that the operation is guaranteed to happen at some point in the future.
    For example, without any ordering, these two pieces of code are considered equivalent from the compiler's perspective (this optimization is known as hoisting):
    while (!stoprequested) {
    }
    if (!stoprequested) {
        while (true) {
        }
    }
    However, if you set and read the stoprequested variable with Relaxed ordering, it is guaranteed that the first loop will eventually break, meaning that the compiler will not translate it into the second form.

    • @jonhoo
      @jonhoo  2 years ago +1

      I actually looked through the spec for text along these lines, assuming that it _must_ be in there, and couldn't find it. Can you point me to where you see it written? The best I found was something along the lines of "the value is eventually visible" or something to that effect, but with no promises about what "eventually" means.
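
For reference, the stop-flag loop under discussion looks like this with an AtomicBool in Rust. This is a sketch of mine, not the video's code; note that it relies on the store becoming visible "eventually", which is exactly the guarantee being debated above, though it does hold on real hardware:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;
use std::time::Duration;

// Each `load` of an AtomicBool is a distinct atomic operation, so the
// compiler cannot hoist it out of the loop the way it could a plain bool.
fn run_until_stopped(stop: &AtomicBool) -> u64 {
    let mut spins = 0;
    while !stop.load(Ordering::Relaxed) {
        spins += 1; // busy-wait; real code would do work or yield
    }
    spins
}

static STOP: AtomicBool = AtomicBool::new(false);

fn main() {
    let worker = thread::spawn(|| run_until_stopped(&STOP));
    thread::sleep(Duration::from_millis(10));
    STOP.store(true, Ordering::Relaxed); // observed "eventually" in practice
    let spins = worker.join().unwrap();
    println!("worker spun {spins} times before observing the stop request");
}
```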

  • @robert36902
    @robert36902 4 months ago

    Thanks a lot for this video! I feel it's a bit of a low-key hardcore topic that is great to know, even though simple mutexes and atomic counters have been enough for me so far :)

  • @amcsi
    @amcsi 3 years ago +3

    Thank you so much for diving so deep into all the memory orderings, and all of everything regarding atomics! I watched the entire video.
    I was going to ask why they say Rust gives you safe multithreading if there are so many issues with it apparently, but then at the very end you basically said that it's just atomics that are the problem; mutexes are what really give you the safety :)

    • @berylliosis5250
      @berylliosis5250 3 years ago

      The other big difference is that if you don't use Atomics or unsafe, you can't mess up. C/C++ will happily pass a T* to another thread and let you get all the problems atomics have, and more. In Rust, code that doesn't use unsafe or a preexisting synchronization primitive (which necessarily uses unsafe) can't even experience a race condition (afaik).

    • @brdrnda3805
      @brdrnda3805 1 year ago

      @@berylliosis5250 Correction: ...can't even experience a -race condition- data race.
      From the Rustonomicon:
      "Safe Rust guarantees an absence of data races,..." "...However Rust does not prevent general race conditions."

  • @digama0
    @digama0 3 years ago +5

    Regarding your example of relaxed ordering, there is a variant that is even more mind bending:
    thread 1: { let r1 = y; x = r1; r1 }
    thread 2: { let r2 = x; if r2 == 42 { y = 42; } r2 }
    (All loads of x and y are relaxed.) The crazy thing is that it is _still_ allowed that both threads read 42. This seems impossible since there is a control dependency from the read of x to the write of y, but this can actually happen in practice if thread 2 speculatively executes y = 42, thread 1 picks that up and stores x = 42, and then thread 2 receives 42 from x and decides that the speculation was correct. In fact, relaxed loads are even worse than that, since in the following program
    thread 1: { let r1 = y; x = r1; r1 }
    thread 2: { let r2 = x; y = r2; r2 }
    it is _still_ allowed for r1 and r2 to be 42, which is now completely nuts since there aren't any 42's in the program. (Actually C11 has a clause saying there has to be a 42 somewhere in the program but you can just have a random 42 elsewhere to satisfy this constraint.) However this is an unrealistic execution, in that no hardware actually can produce 42 in this situation without literal time travel. (This is also known as the "out-of-thin-air" problem since 42 just appears by time travel shenanigans.)

  • @dmitrysim
    @dmitrysim 5 months ago

    First of all, it's a great video. Thank you.
    I thought it might be a good idea to think about the problem in terms of CPU cores rather than threads.
    Below, "x.load -> c1" denotes "x.load was scheduled to run on core 1".
    2nd problem (Ordering::SeqCst section):
    - (x.store -> c1, y.store -> c2) finish.
    *** At this point c1: {x: true, y: false} c2: {x: false, y: true}
    - (!x.load -> c1, !y.load -> c2), so both loops end;
    - (x.load -> c2, y.load -> c1)
    This way z == 0.
    1st problem (Ordering::Relaxed section):
    To achieve this case, we need at least "x.load finishes after y.store finishes", since y.store is the only way 42 can get into any core's memory.
    So this is only achievable with compiler shenanigans, in this case on one core:
    - y.store -> c1
    *** At this point c1:{y: 42}
    - y.load -> c1
    - x.store -> c1
    *** At this point c1:{x: 42}
    - x.load -> c1
    Also, the loom docs (docs.rs/loom/latest/loom/#relaxed-memory-ordering) state that "B can run before A", not "D can run before C" - meaning "the compiler can shuffle commands even if the 2nd depends on the result of the 1st"? I hope it's just a typo.

  • @casperes0912
    @casperes0912 1 year ago

    For a master’s course I implemented exhaustive and optimal enumeration of value release-acquire traces using an algorithm called DPOR. Fun stuff.

  • @MikeBondar
    @MikeBondar 3 years ago +1

    great video, thanks! this whole memory ordering thing is much clearer now

  • @iilugs
    @iilugs 1 year ago

    I was about to lose hope on understanding atomics... then I saw there's a Jon Gjengset video about it 😍

  • @miriyalajeevankumar5449
    @miriyalajeevankumar5449 1 month ago

    Thanks a lot for your service.

  • @lettere9560
    @lettere9560 2 years ago

    The way I understand it is that a load/store instruction only affects the local cache of a particular core. Ordering::Acquire is analogous to a recv operation, and Ordering::Release is analogous to a send operation. However, there is an additional rule: if there is no incoming message on the bus, Ordering::Acquire just reads directly from its local cache.
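
That send/recv intuition can be sketched in Rust. This is an illustration of mine (the names `send`/`recv` are made up): the Release store publishes everything the writer did before it, and an Acquire load that observes the flag receives it.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::thread;

static DATA: AtomicU64 = AtomicU64::new(0);
static READY: AtomicBool = AtomicBool::new(false);

// "send": the Release store publishes the earlier payload write.
fn send(value: u64) {
    DATA.store(value, Ordering::Relaxed); // payload write
    READY.store(true, Ordering::Release); // publish everything before it
}

// "recv": an Acquire load that sees `true` also sees the payload.
fn recv() -> u64 {
    while !READY.load(Ordering::Acquire) {
        std::hint::spin_loop(); // wait for the flag
    }
    DATA.load(Ordering::Relaxed) // guaranteed to observe the payload
}

fn main() {
    let producer = thread::spawn(|| send(42));
    assert_eq!(recv(), 42);
    producer.join().unwrap();
}
```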

  • @alliu6757
    @alliu6757 3 years ago +3

    Hi, Jon. Thanks for the amazing tutorial. Could you do some videos about tokio, crossbeam, and rayon?

  • @timvw01
    @timvw01 1 year ago

    Learning a lot here. I still can't accurately predict the results when using orderings in certain cases, but this video explains the concepts very well. Perhaps some sketching of the concepts could help, as was done in the left-right/evmap video.
    Thanks!

  • @jocketf3083
    @jocketf3083 3 years ago +3

    Thank You! I learned a lot from this video! However, I almost feel like parts of this API should require unsafe as an "I know what I'm doing" kind of thing. I was thinking that SeqCst could be the "safe" default and that methods where you can specify orderings would be unsafe. Perhaps that is taking it a bit far since this isn't really about memory safety, but it might go a little way towards fearless concurrency. Do you have any thoughts on that?

  • @thomasfrans1185
    @thomasfrans1185 1 year ago

    32:48 is probably the most sarcastic "Oh nice, it works so well" I've ever heard XD.

  • @johndowson1852
    @johndowson1852 3 years ago +18

    You should give Dark Reader extension a try. It gives a decent enough dark mode experience on pretty much any website.

    • @driedurchin
      @driedurchin 3 years ago

      Absolutely! Love that extension.

    • @mwcz5190
      @mwcz5190 3 years ago

      Great extension.

    • @jonhoo
      @jonhoo  3 years ago +22

      Ah, but you see, I don't actually _like_ dark mode for anything that has a lot of text on it like documentation. I *only* use dark mode for streaming because I was told by a bunch of viewers early on that their eyes hurt since they're watching the streams in a dark room ;)

    • @driedurchin
      @driedurchin 3 years ago +8

      @@jonhoo Blasphemer! Dark mode for life 🕶️

    • @AbhinavKulshreshtha
      @AbhinavKulshreshtha 3 years ago +3

      @@jonhoo
      Well, I should thank you for being so considerate of us. I do get flash-blinded when someone in a YouTube video opens a very bright website. Maybe it's because I have become too dependent on dark mode.

  • @adamzimmerman1
    @adamzimmerman1 3 years ago +1

    Gonna leave a plug for "Designing Data-Intensive Applications" by Martin Kleppmann. It talks about a lot of the same things in the context of distributed systems and databases. Specifically around eventual consistency vs. other read/write ordering guarantees, a lot of the same thinking applies. It's a great book, especially if you're working with those types of systems.

  • @mindstorm3854
    @mindstorm3854 3 years ago +3

    omg I will need to see this video 5 times to understand half

  • @xinhao6245
    @xinhao6245 3 years ago

    great video to explain the details, thanks

  • @dinushkam2444
    @dinushkam2444 3 years ago

    Great video, learned a lot. Thanks! I might need to rewatch.

  • @Lexikon00
    @Lexikon00 3 years ago +2

    So, if I increment a counter from different threads and only care about fetching the result after all threads have finished, I can just use Relaxed, because there are no reads, just writes, and therefore the ordering is not relevant? Assuming we take care that everything happens within one execution.

    • @OMGclueless
      @OMGclueless 3 years ago

      Yes, this is mostly correct. Your statement that "there is no read just writes" isn't correct, updating a counter is both a read *and* a write. But when updating a counter you probably don't care what value your thread reads, so it's OK to use relaxed.
      More precisely, if you use `fetch_add` to guarantee that the updates to the counter are atomic, and you call `.join()` on all the threads to provide the Happens-Before relationship that guarantees that the threads' updates to the counter are visible to you before you read the final value, then you can do all the updates with `Ordering::Relaxed` and things will work.
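A minimal sketch of the pattern described in this thread (my own illustration, not the video's code): Relaxed `fetch_add` keeps each update atomic, and `join()` supplies the happens-before edge that makes the final value visible to the reading thread.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

// Many threads bump a shared counter with Relaxed ordering; the RMW itself is
// still atomic, so no increments are lost.
fn parallel_count(threads: usize, per_thread: u64) -> u64 {
    let counter = Arc::new(AtomicU64::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let c = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    c.fetch_add(1, Ordering::Relaxed); // atomic read-modify-write
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap(); // happens-before: all increments visible after this
    }
    counter.load(Ordering::Relaxed)
}

fn main() {
    assert_eq!(parallel_count(4, 1_000), 4_000); // never loses an increment
}
```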

  • @ahuggingsam
    @ahuggingsam 1 year ago

    So I know I'm late to the party, but I do have a question about the Ordering::SeqCst example. Does this amount of synchronization mean we end up regressing to almost single-threaded performance again? Or can the compiler/CPU ensure this ordering while still maintaining concurrency (with, obviously, at least some overhead for the synchronization)?

  • @DHorse
    @DHorse 1 year ago

    Great topic!

  • @BryanChance
    @BryanChance 3 years ago

    This is new to me. My mind blew up after 1:58:00. LOL

  • @nikosefthiasjs
    @nikosefthiasjs 3 years ago +2

    Hi, thanks for the great video.
    What LSP plugin are you using? It is really responsive.

    • @jonhoo
      @jonhoo  3 years ago +4

      I'm using rust-analyzer through coc.nvim :)

  • @TheClonerx
    @TheClonerx 3 years ago

    Even for C++ this is really useful

  • @daniellambert6207
    @daniellambert6207 3 years ago

    2:39:17 always an educational experience :)

  • @marlls1989
    @marlls1989 3 years ago

    On the counter example, wouldn't Relaxed allow the counter to be updated to the same value twice? Like, both threads load a previous value and both add 1 to the same old value.

  • @pm71241
    @pm71241 3 years ago

    Hmm... this loom thing reminds me of Kyle Kingsbury's "jepsen" and "Knossos" software to check distributed databases for consistency issues.

  • @Baptistetriple0
    @Baptistetriple0 1 year ago

    OK, I have a question about the possible reordering of the program by the CPU when operations are independent. If I print something in my function, but the printed value doesn't depend on any operation in the function - like if I just put println!("test"); - is the compiler allowed to move that print around within the function? Or, because IO is considered to have side effects, can it not assume there is no link with the other operations?

  • @error200http
    @error200http 3 years ago

    1:14:47 C++ docs are enlightening

  • @manuelaraujo2571
    @manuelaraujo2571 1 year ago

    I got confused by the argument for why z=0 is impossible in the SeqCst example. I think the argument should be that if t1 does not add, it must be that it sees y = false after it sees x = true, which establishes that the x load happens before the y load, so t2 must see x = true after it sees y = true and it adds. Similarly, if t2 does not add then this means the y load happens first and so t1 adds.
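
For context, here is my own reconstruction of the SeqCst example being discussed (not the video's exact code): with every operation SeqCst there is a single total order over all of them, so at least one of the two reader threads must observe both flags set, and z == 0 is impossible.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::thread;

static X: AtomicBool = AtomicBool::new(false);
static Y: AtomicBool = AtomicBool::new(false);
static Z: AtomicUsize = AtomicUsize::new(0);

// Two writer threads set the flags; two reader threads each wait for one flag
// and then check the other. Under SeqCst, whichever store comes first in the
// total order is seen by the reader that waited on the *other* flag.
fn run() -> usize {
    let threads = vec![
        thread::spawn(|| X.store(true, Ordering::SeqCst)),
        thread::spawn(|| Y.store(true, Ordering::SeqCst)),
        thread::spawn(|| {
            while !X.load(Ordering::SeqCst) {} // wait for x
            if Y.load(Ordering::SeqCst) {
                Z.fetch_add(1, Ordering::SeqCst);
            }
        }),
        thread::spawn(|| {
            while !Y.load(Ordering::SeqCst) {} // wait for y
            if X.load(Ordering::SeqCst) {
                Z.fetch_add(1, Ordering::SeqCst);
            }
        }),
    ];
    for t in threads {
        t.join().unwrap();
    }
    Z.load(Ordering::SeqCst)
}

fn main() {
    // A single run can't prove the property, but under SeqCst z is always 1 or 2.
    let z = run();
    assert!(z == 1 || z == 2);
}
```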

  • @Blure
    @Blure 3 years ago +14

    NOM NOM NOM NOM

  • @hrmny_
    @hrmny_ 3 years ago +1

    I'm not sure whether I understand the example with `SeqCst`, do atomics when they change value not still have the same memory address?
    Can the CPU just change that around?
    And wouldn't it be horrible if it does, but the load can still refer to the old location?
    How can it at that point even guarantee that it ever gets the new value and your references to the atomic don't become stale immediately
    How can `load` get an old value after it has been changed in memory 🤔

    • @aGuyFromSomewhere
      @aGuyFromSomewhere 3 years ago +4

      I think this may, for example, come down to caching. When threads run on different cores, both cores may have a given value copied in their caches. Both cores may modify the value in their cache without the other core seeing the modification. If the ordering of the modifications does not specify a clear "happened before" relation between them, then the cores may choose not to invalidate their caches before the modifications, which would be costly. This is probably also why stricter orderings (than Relaxed) are considered to incur a performance cost: the cores cannot optimize as freely (with caches, out-of-order execution, and the like) when an ordering is specified. So if performance is a concern, you always want to find the "most relaxed" ordering that still makes the program work correctly.
      Anyway, caching is just my guess; there might also be other reasons for threads to see different values of the same memory location.

  • @pictureus
    @pictureus 1 year ago

    When you say at the beginning that one thread may never get the update:
    if you have memory X = 1,
    thread A writes 2 to X in a loop, and
    thread B reads X in a loop,
    wouldn't thread B eventually get the value 2, just not in the same step, as you mention later?

  • @jeffg4686
    @jeffg4686 3 years ago

    Hi Jon, I have a small question regarding std::cell::Cell and std::cell::RefCell, if you happen to have a minute. So, I get the concept of being able to change an inner value through an immutable reference, and I've seen examples where a clone() implementation takes a shared reference to self but also needs to update some other member of self - hence the issue, since it's only a shared reference, and the workaround is to use these cells. I was wondering: do you tend to see this usage often outside of things like clone() implementations? I guess it's good to know these cells exist and what they do, but when thinking about when I'd ever use them, I see it as quite rare - is that your take, or are they actually used all over the place in typical Rust code? Thanks for any thoughts.

    • @jonhoo
      @jonhoo  3 years ago +1

      I most commonly see it in use-cases where you need mutable access to multiple subsets of a collection that are determined at runtime. This comes up frequently in graphs, but that's not the only place. In those cases, you have multiple & to the collection, and then take out &mut through something like RefCell as you, say, traverse nodes in topological order.
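
A tiny sketch of that shape (illustrative code of mine, not from the stream): several shared references to a collection exist, and interior mutability hands out `&mut` to individual elements chosen at runtime.

```rust
use std::cell::RefCell;

fn main() {
    // Several parts of a structure, each independently mutable at runtime
    // through a shared reference to the whole collection.
    let nodes: Vec<RefCell<u32>> = (0..3).map(RefCell::new).collect();
    let view = &nodes; // a shared borrow of the collection...
    for n in view {
        *n.borrow_mut() += 10; // ...through which we still mutate each node
    }
    assert_eq!(*nodes[1].borrow(), 11);
}
```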

  • @fbytgeek
    @fbytgeek 2 years ago

    @jon Gjengset, is it possible to do a stream on Rayon or parallelism in Rust?

  • @droidgod3
    @droidgod3 3 years ago

    Is it possible to get the code for this episode as a gist or git repo?
    Sometimes it is hard to follow, and I have to rewind all the way because I forgot the context when switching between examples.

  • @correabuscar
    @correabuscar 3 months ago

    1:21:40 - that Relaxed (on failure): doesn't it mean it uses that memory ordering to load the value, even before the compare happens? Or is it internally split into two operations - a load for the compare&exchange, and, if the compare failed, a load (again!) for the actual value - with the two grouped into one atomic operation? That doesn't seem to make sense. So doesn't it instead load the value with the failure ordering (Relaxed in this case), compare that value, and, if it's the same as expected, do the store with the success ordering (here Acquire), with the load and (compare&)store together internally atomic? Otherwise it seems kind of weird that it would load the same value twice, once for success and once for failure. If my reading is correct, then the ordering args would be Release (for success) and Acquire (for failure), because it would load with Acquire and store with Release - or maybe storing with Release isn't needed? I'm unclear on that part. To be clear, I'm only talking about compare_exchange_weak here, not the final store of unlocked.

    • @correabuscar
      @correabuscar 3 months ago

      the doc says this:
      "compare_exchange_weak takes two Ordering arguments to describe the memory
      ordering of this operation. success describes the required ordering for the
      read-modify-write operation that takes place if the comparison with current succeeds.
      failure describes the required ordering for the load operation that takes place when
      the comparison fails. Using Acquire as success ordering makes the store part
      of this operation Relaxed, and using Release makes the successful load
      Relaxed. The failure ordering can only be SeqCst, Acquire or Relaxed."

  • @yailkalaf8542
    @yailkalaf8542 3 months ago

    Thanks

  • @arpandhatt6011
    @arpandhatt6011 2 years ago

    If we were to acquire and release atomic variables inside of a loop, can memory ordering allow the CPU to wrap operations around the loop? For example, if we did a Release operation at the end of the loop, the contents of the loop will now be visible to other threads. If the CPU correctly predicts that the loop will run again, it will have already begun processing instructions for the next iteration of the loop. Is it guaranteed that it won’t move these operations ahead of the release synchronization from the previous iteration of the loop?

    • @jonhoo
      @jonhoo  2 years ago +1

      Ah, no, the memory model generally guarantees in-order semantics within a given thread. Reordering is only with respect to the actions of other threads.

  • @austecon6818
    @austecon6818 2 years ago +1

    I think my take-home message from this is: if I do multi-threading in Rust, just use a damned Mutex.

    • @theroboman727
      @theroboman727 2 years ago

      At least learn channels as well. They fit some problems much better. The book has a subchapter on them.

  • @focker0000
    @focker0000 1 month ago

    1:05 the magic happens in the compiler!!!

  • @xwtek3505
    @xwtek3505 2 months ago

    2:36:45 uh, Rust has .as_ptr(), which is surprisingly safe, as dereferencing it is unsafe anyway.

  • @Dygear
    @Dygear 2 years ago

    I wonder if the data race for the locking of the mutex is more likely on big.LITTLE architectures because of the different speeds of the P and E cores.

    • @Dygear
      @Dygear 2 years ago

      In the Ordering::Relaxed section, y can store into x's 42 value because they happen at the same time? The let r1 is step a1, the let r2 is step b1, and they happen at the same time as they are in different threads. The x.store is step a2, and the y.store is step b2. Because it's Relaxed, the compiler and the CPU are free to execute these operations in any order, so a1 can happen at the same time b2 is happening, meaning we see y's stored value cross over into x.

    • @Dygear
      @Dygear 2 years ago

      Getting a Raspberry Pi 4 to show the Relaxed ordering failing could be interesting.
      With how much power Ordering has over how the compiler will execute the statements it sees, it would almost be helpful to have syntax lowlighting in the background in a different color to show where the relationships start and end.

    • @Dygear
      @Dygear 2 years ago

      So atomics are single assembly instructions; an instruction is only considered atomic when no other application, thread, process, or core has a chance to interject itself into your program flow, because it's a single instruction. Electrons are in flight changing the value; nothing else can happen until that's done.

    • @Dygear
      @Dygear 2 years ago

      1:30:55 - Line 864: What on earth is `x @ Ok(_)`?! I've never seen that before.
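
For reference, `x @ Ok(_)` is an `@` binding: it tests the value against the pattern `Ok(_)` *and* binds the whole matched value to `x`. A small standalone illustration (mine, not the stream's code):

```rust
// `x @ Ok(_)` matches any Ok(..) and binds the entire Result to `x`,
// so the arm can use the whole value, not just its contents.
fn describe(r: Result<u32, u32>) -> String {
    match r {
        x @ Ok(_) => format!("success: {x:?}"), // `x` is the whole Ok(..)
        Err(e) => format!("error: {e}"),
    }
}

fn main() {
    assert_eq!(describe(Ok(7)), "success: Ok(7)");
    assert_eq!(describe(Err(3)), "error: 3");
}
```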

  • @jeffg4686
    @jeffg4686 3 years ago

    A little confused about Box::leak. So leak gives you back a reference to the actual value and consumes the owned binding, but I'm confused as to why one would want that over just allocating with Box::new alone. What benefit does that yield? Thanks.

    • @jonhoo
      @jonhoo  3 years ago

      Box::leak gives you a &'static, which you can then give out many copies of, including to other threads. If you have a Box, any reference to the Box is tied to the lifetime of the Box (i.e., it isn't 'static), so you can't send it to other threads and such!

    • @jeffg4686
      @jeffg4686 3 years ago

      @@jonhoo - gotcha, thanks much. Actually, I'm kinda wondering what makes this thread-safe? The 'static lifetime doesn't itself imply thread safety, does it?

    • @SimonBuchanNz
      @SimonBuchanNz 3 years ago +1

      @@jeffg4686 the short version is that it's a non-mut reference, i.e. a "shared" reference, and the type is an atomic, which implements Sync and is thus declaring itself safe to use from multiple threads at the same time.
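
The distinction in this thread can be sketched as follows (illustrative code of mine): Box::leak trades the allocation for a `&'static` reference, which is Copy and can therefore be moved into spawned threads, unlike a reference tied to a Box's lifetime.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

// Leaking the Box yields a &'static reference to the heap allocation; the
// allocation is never freed, but the reference can now outlive any scope.
fn leaked_counter() -> &'static AtomicU32 {
    Box::leak(Box::new(AtomicU32::new(0)))
}

fn main() {
    let counter = leaked_counter();
    let handles: Vec<_> = (0..2)
        .map(|_| thread::spawn(move || counter.fetch_add(1, Ordering::Relaxed)))
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(counter.load(Ordering::Relaxed), 2);
}
```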

  • @timvw01
    @timvw01 2 years ago

    This is great

  • @saseoham
    @saseoham 3 years ago

    Thank you!!!!!

  • @keithmiller8510
    @keithmiller8510 1 year ago

    I knew the difference but 16:00 really gave me an ah-ha moment on usage.

  • @Ergzay
    @Ergzay 3 years ago +4

    I have a problem with the part at 1:53:05, you say "there's no requirement that x is true even though tx has run". That's simply false. An Ordering::Acquire will see all Ordering::Release of the same variable, if it's run. Did you mean to say "even though ty has run"?

    • @jonhoo
      @jonhoo  3 years ago +5

      The C specification is actually extremely vague on this front. From the C specification section 5.1.2.4: "The value observed by a load of an atomic object depends on the “happens before” relation, which in turn depends on the values observed by loads of atomic objects. The intended reading is that there must exist an association of atomic loads with modifications they observe that, together with suitably chosen modification orders and the “happens before” relation derived as described above, satisfy the resulting constraints as imposed here." Section 7.17.3.1 does say that "[i]mplementations should make atomic stores visible to atomic loads within a reasonable amount of time", but that's really all. Ultimately, by my reading at least, if two threads A and B have no relationship except that A writes x (with Release) and B reads x (with Acquire), it doesn't _seem_ like there is a requirement that B _will_ see A's write to x. There _is_ a requirement that _if_ B sees A's write to x, then B read of x "happens after" A's write to x. Now, _in practice_ you won't generally observe this oddity since chances are that there are also other "happens before" relationships that
      The reason I conclude that this is the right reading of the standard, as opposed to "any acquire load of x that occurs after (by wall-clock time) a release store of x _will_ see the x", is that otherwise the example given in the reference (en.cppreference.com/w/cpp/atomic/memory_order#Sequentially-consistent_ordering) would not require SeqCst - we'd be back to "either tx or ty happened first" even for Acquire/Release ordering, in which case an outcome of 0 is impossible. But the reference says that 0 _is_ possible under Acquire/Release, which must mean that it's possible for tx to run, but its store only being visible to one thread but not another.

    • @Ergzay
      @Ergzay 3 years ago +1

      @@jonhoo Thanks for the response. I'm not quite convinced, but I'll think on this some more. I'm wondering if this is maybe a difference between C11 and C++11 given that you're mixing the two languages here.

    • @jonhoo
      @jonhoo  3 years ago +1

      @@Ergzay They both have the same memory ordering model as far as I am aware, I just had the C11 spec more readily available :)

    • @Ergzay
      @Ergzay 3 years ago

      @@jonhoo As a side note, if you have sway in the loom project, you could talk them into increasing MAX_THREADS or making it configurable; it's currently a const hard-coded to 4, which makes it impossible to use loom with this example, as there are 5 threads including the main thread. Using loom with this example, it just instantly panics. It makes it hard for me to understand how it can be used for anything other than trivial examples, given it's so limited.

    • @jonhoo
      @jonhoo  3 years ago +1

      @@Ergzay Yeah, it's a known challenge. Part of the problem is that loom needs to track a lot of per-thread state, and it does so with fixed size arrays to avoid allocating (which would kill performance). Bumping the number of threads would thus also bump the size of the entire working state fairly drastically. Though I wonder if maybe it'd be possible to use the new const generics to make it configurable!
      You can actually model this case in loom just by doing the store to x in the main thread. In general I've found it to be rare that you need more than four threads to replicate any given race condition at the level of atomics.

  • @jacklong2182
    @jacklong2182 3 years ago

    Do you use the vim nerdtree plugin? How do you navigate between different files?

    • @jonhoo
      @jonhoo  3 years ago +1

      Nope. I just use fzf.vim :) github.com/jonhoo/configs/blob/26409d160dff9a667d3fcb854bf83b37b544b154/editor/.config/nvim/init.vim#L28

  • @yuan.pingchen3056
    @yuan.pingchen3056 10 months ago

    Is the Rust language capable of writing a single-process, single-thread, non-atomic-data-type console program?

    • @jonhoo
      @jonhoo  10 months ago +1

      yes

    • @yuan.pingchen3056
      @yuan.pingchen3056 10 months ago

      @@jonhoo This is good news, but it seems like it doesn't have the capability to offload workloads to a GPU, does it? (heterogeneous computing)

  • @mohsanabbas6835
    @mohsanabbas6835 2 years ago

    Yum num yum

  • @garlicxu4768
    @garlicxu4768 3 years ago

    Nice job. Btw, why not just get a dark-mode browser plug-in 😂

  • @pkunk
    @pkunk 3 years ago +2

    Explaining atomics without an Indiana Jones reference.

  • @user-rd9hg3gq7z
    @user-rd9hg3gq7z a month ago

    19:36 memo

  • @xderfszvfthgycu978
    @xderfszvfthgycu978 3 years ago

    underrated

  • @uvarovrs40
    @uvarovrs40 3 years ago

    Can machine instruction reordering make the function below panic by executing line B before line A?
    fn foo(my_vec: &Vec<u32>) -> Result<(), MyErr> {
    ...
    if my_vec.len() < 2 { return Err(MyErr::TooShortErrMsg); } // line A
    let last_value = my_vec.last().unwrap(); // line B; last() returns None if my_vec is empty
    ...
    Ok(())
    }

    • @jonhoo
      @jonhoo  3 years ago +1

      No - the compiler must respect "program order" as far as any visible side-effects. The reason it can seemingly reorder operations when multiple threads are involved is that it's allowed to choose semi-arbitrarily which other threads' writes are visible at what time.
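      To make that guarantee concrete, here is a runnable sketch of the snippet above (with a hypothetical `MyErr` type; the slice parameter is an assumption, since the original elided how `my_vec` is obtained). The early return on line A is a visible side effect, so the compiler must preserve it, and line B can never observe a too-short input:

```rust
#[derive(Debug)]
enum MyErr {
    TooShortErrMsg,
}

fn foo(my_vec: &[u32]) -> Result<u32, MyErr> {
    // Line A: guards line B; the compiler cannot move line B above it,
    // because the early return is an observable side effect.
    if my_vec.len() < 2 {
        return Err(MyErr::TooShortErrMsg);
    }
    // Line B: unwrap() can never panic here, since len() >= 2.
    let last_value = *my_vec.last().unwrap();
    Ok(last_value)
}

fn main() {
    assert!(foo(&[]).is_err()); // line A fires; line B is never reached
    assert!(foo(&[1]).is_err());
    assert_eq!(foo(&[1, 2]).unwrap(), 2);
}
```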

  • @deadskrillaskrit2078
    @deadskrillaskrit2078 3 years ago

    Bruh ya selling me on coding.

  • @user-ov5nd1fb7s
    @user-ov5nd1fb7s a year ago

    czcams.com/video/rMGWeSjctlY/video.html
    Instruction-level parallelism is what you mean, I think.

  • @user-ov5nd1fb7s
    @user-ov5nd1fb7s a year ago

    czcams.com/video/rMGWeSjctlY/video.html
    I think you wanna say data dependencies.

  • @mmmvvkk
    @mmmvvkk 2 years ago

    The low-frequency feedback of your keyboard resonating through your microphone when you're typing - that's really annoying :>

  • @miketurner3461
    @miketurner3461 2 years ago

    Stop trying to make fetch happen

  • @critamine
    @critamine 22 days ago

    wtf this guy is a vim demon 😭