Making variables atomic in C
- date added: 11 Jul 2024
- Patreon ➤ / jacobsorber
Courses ➤ jacobsorber.thinkific.com
Website ➤ www.jacobsorber.com
---
Making variables atomic in C // We're talking about threads, thread safety, and atomicity. Specifically, this video covers the atomic keyword, added in C11, which allows you to make variables atomic without using locks.
Related Videos:
Thread safety: • Safety and Speed Issue...
pthreads: • How to create and join...
***
Welcome! I post videos that help you learn to program and become a more confident software developer. I cover beginner-to-advanced systems topics ranging from network programming and threads to processes, operating systems, and embedded systems. My goal is to help you get under the hood and better understand how computers work and how you can use them to become a stronger student and a more capable professional developer.
About me: I'm a computer scientist, electrical engineer, researcher, and teacher. I specialize in embedded systems, mobile computing, sensor networks, and the Internet of Things. I teach systems and networking courses at Clemson University, where I also lead the PERSIST research lab.
More about me and what I do:
www.jacobsorber.com
people.cs.clemson.edu/~jsorber/
persist.cs.clemson.edu/
To Support the Channel:
+ like, subscribe, spread the word
+ contribute via Patreon --- [ / jacobsorber ]
Source code is also available to Patreon supporters. --- [jsorber-youtube-source.heroku...]
Great video!! Keep up the good work!
great content, thanks for keeping this going
As far as I remember, locks use atomic variables internally for the lock itself; the reason atomics are faster is that locks come with a lot of additional logic compared to a plain atomic variable.
Great video!
One related subject that might be nice to cover is the use of `__atomic_*()` GCC builtins, and how they differ from `_Atomic`.
Thanks for all the great content!
An even better subject, which is more intrinsically related, is the memory model / ordering constraints / memory consistency, especially on a weakly-ordered ARM CPU.
_Atomic is part of the ISO C standard; it got added in 2011. GCC's builtins predate the C standard and are compiler-specific extensions that do basically the same thing as the standard functions. GCC itself says you should no longer use those compiler builtins; you should only use the ISO C atomic implementation.
Not covered: the memory orders acquire, release, relaxed, and seq_cst. These can help performance a lot. By default, the compiler chooses the safest but slowest implementation for an atomic variable. However, in many places where counters are used, no memory ordering is necessary, so we lose performance there.
We want more concurrency videos especially for embedded systems. Thanks for all the amazing videos.
Hi, love your videos!
I was wondering if you can also do videos regarding Distributed Systems with C (MapReduce, scheduling, Sharding, etc.)
Wow. This detailed description... keep on.
While only getting this in C11 is a bit late, I still love the idea of having an atomic keyword. Signaling to the compiler that variable reads/writes should occur atomically opens up way more opportunities than just using a lock yourself. Most platforms support atomic instructions, whereas the compiler could never adopt them if you're locking explicitly. And if they don't, it's easy to auto-generate the lock.
Thanks!
It's great that they're adding new features, but I feel like for most people this is going to be too little, too late. I would suggest, though, instead of printing an error, having the locking code be a fallback. And it's sad if clang doesn't convert var = var + i into var += i, because gcc does, even at -O0, and I'm still using the 10.3 version. Although, with higher optimization levels, writing a minimal testable example is really annoying.
The problem with making a fallback is that the keyword applies to the variable declaration while locks apply to the variable usage. It might be possible with some preprocessor trickery, but the only real option is that all compilers should be forced to recognize the keyword and insert locks if they don't support the feature. Making it optional makes it similar to a non-portable library.
@@redcrafterlppa303 The atomicity of operations that modify the variable apply to its usage as well. The thing is, you should not have global variables unless they're absolutely necessary, and if you have a global counting mechanism it should be obscured behind a function call instead of used directly. So while this isn't the best example, it's extremely easy to have two sets of code to perform the task at hand and literally swap between them with the preprocessor checks. Even if the method chosen were say to have atomic_counter.c and volatile_counter.c then choose which one to include for the functionality that applies.
@@anon_y_mousse True, but this will introduce a function call, which most likely isn't in budget if you are optimizing from locks to intrinsic atomics. Of course, you could work with force-inline, but that's the next compiler-dependent solution...
@@redcrafterlppa303 That's where you weigh the benefits versus the cost. If optimization in one way will lead to breakage in others, don't do it. Of course, if you're writing this project as a solitary developer, then it may be something you find is worth it since you could control every aspect of its use. However, definitely never depend on compilers to be consistent from one to the next, not even one version to the next.
You could always use atomic and fallback to lock yourself if not supported.
Also, maybe mention the stdatomic functions and variables?
One minor point:
Why is count_to_big incrementing by i and not by one?
If it incremented by one, we could assert that counter is equal to 2*BIG.
Can you also cover memory order and memory fences?
A better word for “Mutex” is “Bottleneck”!
Is there any difference between _Atomic and std::atomic ?
At 10:09, you list `a /= b` and `a *= b` as atomic operations. I can't find any reference for C11 implementing these compound assignments atomically. I know that C++11 certainly doesn't, so I would be surprised if C11 does. Generally, when I need such atomic ops I'd implement them manually using `compare_exchange_weak` loops.
It's good to know how to use threads and mutexes appropriately, but I think what would make for a good video would be the major differences between mutexes and semaphores. When I was starting to learn Vulkan, I came across the usage of semaphores as opposed to mutexes, although mutexes could still be used throughout the usage of the API. If I remember correctly, Vulkan allows you to write your own custom memory manager; however, when working with their thread pool, they have built-in semaphores. I know I learned about the two long ago, but it would be a great refresher to hear you explain the difference between them.
Can I recommend a book about that for you?
@@rian0xFFF I'll take it
If there are 2 threads, one of them only updating (not read-and-update, just update) a struct and the other only reading it, what happens if the reading thread is preempted by the writing thread (which might happen because the writing thread has higher priority) and the writer modifies a memory location that was in the middle of being read? Will parts of the struct be read with new values?
Should the lock and unlock be before and after the for loop?
Thanks for a great video! @around 6:00 Is it possible to implement a lock() function if the hardware does not support something like compare/swap? Also, have you done a video on linker scripts? Would be really interesting :-) Thanks & I wish you a fantastic summer.
sir please make a video about gdb for pros!
What level would you say this is? Pro?
made me think of how often it would be easy enough to simply compile and test/compare both versions of some code. generally, when is comparing two or more variants of code trivial and straightforward, and when is it not practical and too much effort? would be great to hear from experienced people what to consider re this
Maybe some day we will see a video about GCC extensions for C. They are very, very interesting.
Referring back to that linked-list synchronisation method I mentioned on a previous video: I'm looking around for a function similar to atexit but for threads specifically. Do you know of any such function(s)?
So, just to make sure I understand, you're looking for a way to register a callback function when a thread exits? If so, maybe check out pthread_cleanup_push.
@@JacobSorber Just found that a few minutes ago. Now I just need to find the Win32 equivalent too, since the customized critical section I'm making relies only on linked lists: no atomics or anything from the system besides the thread id, the yield function, and (with the help of the atexit equivalents) a means of catching unexpected thread deaths, so a dead thread's locks get released to other threads and it detaches itself from those lists. I don't really want the unpredictable loss of locks due to the timed "I'm alive" calls that I hacked in for the initial version.
I was able to create a simple shader using pthreads without having to reprogram all my graphics in OpenGL and still get 60 fps :)
Atomic is wonderful. Related topics to go deeper: memory models, thread sanitizer, false sharing, core ping-pong, and more =)
But what about memory ordering? Which semantics does C use?
In C++ you can/have to specify the memory ordering for atomic operations.
enum memory_order {
memory_order_relaxed,
memory_order_consume,
memory_order_acquire,
memory_order_release,
memory_order_acq_rel,
memory_order_seq_cst
};
This is the enum; it has its own page explaining what each value means. Super cool, and super able to break everything if you mess up. Not a reason not to learn it, though. It needs good, thorough testing if you are going around mutexes with this, but it's a lot faster if you do it right.
Hey! Great video! Atomics are indeed faster in your counter example compared to locking - but it is a little bit misleading since you appear to imply that having an atomic variable that gets absolutely hammered by multiple threads is a good idea.
It would be actually interesting to see how the atomics version compares to count_to_big being called twice on the same thread without the counter being atomic in terms of speed.
Also a follow up on memory barriers would be great - since atomicity entails some pretty big pitfalls around that.
Cheers!
I think mutex locks are great for most situations, but imo they are pretty bad when they're in a loop, locking and unlocking at every iteration.
A next video could cover: excessive locking is bad, and atomic has its issues. A really good way to handle thread safety is to have thread-local variables that are returned to the main scope, where the summing up happens. Safe, fast, and overall happy.
And what is with the radiation?
_Atomic does not tell the compiler that you want access to the variable to be atomic; it only tells the compiler to make sure it chooses a type that can be updated atomically. That "+=" works in your code is coincidental, as the standard does not guarantee that to be the case.
See ISO/IEC 9899:201x, §7.17.7.5, Section 5:
"The operation of the atomic_fetch and modify generic functions are nearly equivalent to the operation of the corresponding op= compound assignment operators. The only differences are that the compound assignment operators are not guaranteed to operate atomically, and the value yielded by a compound assignment operator is the updated value of the object, whereas the value returned by the atomic_fetch and modify generic functions is the previous value of the atomic object."
Note the "are not guaranteed to operate atomically". The standard does not guarantee that your code is atomic, it only is because on your system native 64 bit atomics seem to exist. The correct code that is guaranteed to always be atomic would have been:
atomic_fetch_add(&counter, i);
Just wondering, is this atomic valid on embedded C?
Embedded C is just C.
Imagine if atomics had been a part of all processors from the start
Do a video about how to set up VS Code to work with C and C++; I get include errors and all this VS Code nonsense :). Thx btw, useful videos.
What sort of include errors?
@@JacobSorber I can't include any C standard library header; it gives me something like "include file not found in include path".
Speaking of new features, constexpr is coming.
What happens if you compile on a machine that supports atomics but execute on a machine whose processor doesn't support them?
Good question. Try it out and let me know. 🤔 It's ultimately going to depend on what code the compiler generates to handle the atomic variables.
If the generated code contains instructions that the CPU simply doesn't support, the likely outcome is the program dying with an illegal-instruction fault.
Any js developers watching c tutorials just for fun?
That _Atomic keyword is barmy if it is optional and not even consistent in all cases. A programming language needs to be predictable and reliable. We make flight control systems and autopilot with it!
C++ std::atomic
There's nothing unreliable about it.
#if (__STDC_VERSION__ >= 201112L) \
 && !defined(__STDC_NO_ATOMICS__)
# include <stdatomic.h>
#else
# error "The 1990s called and said it wants its compiler back"
#endif
"Atom-icity"! Wow, Jacob... Iknew you were a C Jedi, but now a word-smith creator too. I'm stealing this word and plan to use it well and often. BTW, I'm loving wearing my new MALLOC T-shirt around all my Python-writing friends... drives then CRAZY.
Wish I could take credit for it. But, it's definitely a good word. Glad you're getting good mileage out of the shirt.
C can mix really well with Python, if done right.
don’t forget to free your tshirt when you’re done with it
When using _Atomic, is it implicitly volatile?
No, you have to write "_Atomic volatile" if you want your atomic variable to also be volatile.
volatile means the variable cannot be optimised out. useful for something like a memory mapped LED. atomic means the operations on the variable should be done in a thread safe way. these two things are perhaps related, but not the same
@@attilatorok5767 I know what volatile means.
And I know what atomic access is.
My question is whether the _Atomic keyword implies volatility of the variable.
Is _every_ variable declared _Atomic also volatile without using the volatile keyword?
No, atomic and volatile are not closely related concepts. The volatile keyword has been misused often enough to cause that false belief to become popular. Volatile is almost exclusively for I/O registers.
So no, atomic is not implicitly volatile.
Why not use rust then :P
You can probably rely on a single read or write on an arithmetic type to be atomic on all modern processors. So you won't have one thread reading an integer while another thread is updating it. Maybe it can happen on some exotic hardware.
"Probably rely", those are words that send chills down my spine...
A 64-bit integer on a 32-bit processor (e.g. Raspberry Pi, not something old) generally won't be. You also have to consider unaligned vs. aligned memory access. Also, atomic will disable certain compiler optimizations that could cause problems if the variable is actually used by multiple threads (usually involving re-ordering of instructions around the variable). With multiple threads running, if you don't tell the compiler and processor what you are doing, you will almost certainly run into problems.
C issue
C89 or CYA L8R
😂
Found the university CS professor ;)
Didn't know C can be racist 😂
Soon, Zig shall replace C..
#if defined(_SOLARIS) || defined(_SOLARIS86)
#include <atomic.h>
#define Increment_Interlocked(x) (long)atomic_inc_ulong_nv((unsigned long *)x)
#define Decrement_Interlocked(x) (long)atomic_dec_ulong_nv((unsigned long *)x)
#define Exchange_Interlocked(x,y) (long)atomic_swap_ulong(((unsigned long *)x),((unsigned long)y))
#define ExchangeAdd_Interlocked(x,y) (long)atomic_add_long_nv(((unsigned long *)x),((long)y))
#define Add_Interlocked(x,y) (long)atomic_add_long_nv(((unsigned long *)x),((long)y))
#define Clear_Interlocked(x) (long)atomic_swap_ulong(((unsigned long *)x),((unsigned long)0L))
#elif defined(_LINUX)
#define Increment_Interlocked(x) __sync_add_and_fetch(x,1)
#define Decrement_Interlocked(x) __sync_sub_and_fetch(x,1)
#define Exchange_Interlocked(x,y) __sync_val_compare_and_swap(x,*x,y)
#define ExchangeAdd_Interlocked(x,y) __sync_fetch_and_add(x,y)
#define Add_Interlocked(x,y) __sync_add_and_fetch(x,y)
#define Clear_Interlocked(x) __sync_fetch_and_and(x,0L)
#else
#define Increment_Interlocked(x) InterlockedIncrement(x)
#define Decrement_Interlocked(x) InterlockedDecrement(x)
#define Exchange_Interlocked(x,y) InterlockedExchange(x,y)
#define ExchangeAdd_Interlocked(x,y) InterlockedExchangeAdd(x,y)
#define Add_Interlocked(x,y) (InterlockedExchangeAdd(x,y) + (y))
#define Clear_Interlocked(x) InterlockedExchange(x,0L)
#endif
If you program in the NT Kernel you need to use spinlocks.
A mutex should never deadlock. It should be coded like kernel interrupt code: pass through quickly, locking more than one variable if needed, change the group of variables, then drop out of the mutex as soon as possible. A mutex can be made better by using a microsecond timer and a pause to ensure that if the next item queued for the lock has spent too much time waiting, it gets the lock, so that high-priority threads do not hog the work to be done. This can be done by the current lock holder determining who gets the next lock if the next queue item's wait has been too long. Also, never EVER develop real-time code on Linux. Develop it on Solaris, then port to Linux. The Linux debugger is a broken Mickey Mouse toy created by idiots. Even better, develop it on a SPARC platform.