I Lied! The Fastest C# Loop Is Even Weirder
- Added on 2 April 2023
- Check out my courses: dometrain.com
Become a Patreon and get source code access: / nickchapsas
Hello everybody I'm Nick, and in this video, I will show you an even weirder way to loop in C#, which in some cases can perform better than every other approach.
Workshops: bit.ly/nickworkshops
Don't forget to comment, like and subscribe :)
Social Media:
Follow me on GitHub: bit.ly/ChapsasGitHub
Follow me on Twitter: bit.ly/ChapsasTwitter
Connect on LinkedIn: bit.ly/ChapsasLinkedIn
Keep coding merch: keepcoding.shop
#csharp #dotnet
when you forget to add the # to your C
yeah that's definitely c prime way of iterating :)
In the real world, though, the actual logic inside the loop takes so much more time and resources than the iteration itself that this optimization is completely unnecessary.
BluntC
Ah yes, the fastest way to write C# is to write it as if it was C.
The fastest way to write code is to actually write C anyway....
@desertfish74 Not the fastest way to write code, but a way to write the fastest code :-). Writing C often takes longer.
@manuelamstutz4468 true haha
"Hello everyone, this is Nick Chapsas and today I will show you a trick which makes your loop allocate NEGATIVE 128 bytes per item, that's right, LESS than 0, and also makes code in other processes on the same machine run 20% faster! "
@First Last the Java video was quite good as it was 😄
I rewrote a legacy C application in C# that was sensitive to performance. I had to use Marshalling to access a legacy C DLL that I couldn’t update but stayed with Spans for the performance increase. In the end the company got a faster, working high performance application that thanks to C# was smaller and much easier to maintain. Your video today convinces me that my choices back then were still correct.
That feeling when you start to get segfault errors in a managed language.
The final AsSpan is the best by far. It's very clear what's happening and extremely close to the unsafe version.
I am definitely using spans a lot more after watching your performance videos. It has helped quite a lot when iterating massive collections.
me as well
Span is the way to go. Cleaner.
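For readers who have not tried the span approach the comments above refer to, here is a minimal sketch. It assumes the loop simply sums an int array; the class and method names (SpanLoopSketch, SumWithSpan) are hypothetical and not taken from the video.
```csharp
using System;

public static class SpanLoopSketch
{
    // Hypothetical helper: sums an array by indexing a span created over it.
    public static int SumWithSpan(int[] items)
    {
        // AsSpan() creates a view over the same memory; nothing is copied.
        Span<int> span = items.AsSpan();
        int sum = 0;
        for (int i = 0; i < span.Length; i++)
        {
            sum += span[i];
        }
        return sum;
    }
}
```
Whether this beats a plain for or foreach over the array depends on the runtime version and the array size, so it is worth benchmarking on your own workload.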
Shouldn't you be able to do something similar with pointers? This seems similar to how you can loop in C++ with pointers. Awesome video. I like that C# does not shut you out of the language but gives you a secret door to the goodies.
You can actually use straight pointers like in C by enabling unsafe compilation. I’ll probably show that in a future video
In fact it's the same thing under the hood, what it's doing is getting two pointers, one to the first element, the other to the last + 1 element, iterating until the pointers are equal. It is quite common to iterate in this way in C, especially on strings, and although in csharp doing this thing is more difficult and less readable, it is fascinating to find out that it is possible.
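The two-pointer idea described above can be sketched with managed references instead of raw pointers. This is a rough illustration, assuming .NET 5 or later (for MemoryMarshal.GetArrayDataReference) and a loop that sums an int array; the names RefLoopSketch and SumWithRefs are hypothetical, and the sketch compares addresses with Unsafe.IsAddressLessThan rather than testing for equality.
```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

public static class RefLoopSketch
{
    // Hypothetical helper: walks a reference from the first element
    // up to (but not including) the position one past the last element.
    public static int SumWithRefs(int[] items)
    {
        // Reference to element 0; no bounds checks are performed from here on.
        ref int current = ref MemoryMarshal.GetArrayDataReference(items);
        // Reference to the position just past the last element; never dereferenced.
        ref int end = ref Unsafe.Add(ref current, items.Length);

        int sum = 0;
        while (Unsafe.IsAddressLessThan(ref current, ref end))
        {
            sum += current;
            // Advance by one element (not one byte).
            current = ref Unsafe.Add(ref current, 1);
        }
        return sum;
    }
}
```
Because nothing here is bounds-checked, a wrong length silently reads past the array, which is the trade-off several commenters point out.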
Good old pointer math is back again.
Love it. C pointer loops are back.
I'm not doing time-critical C# nowadays, but if I had to write an image processing library, this would be a great option to avoid cross-language issues, especially for cross-platform programming.
A couple of videos later: hello, I'm Nick, and today we will run C++ code in our C# project to iterate an array extremely fast))
Has my video backlog leaked or something?
Actually, that's slower since there's no way of inlining the c++ code, and calling into native, and back to C#, is also wasting performance. It's definitely possible though.
While I like your data that shows it's the fastest, its Standard Deviation is high implying something is causing it to occasionally take much longer than the average. I'm wondering what might cause that now.
If I were writing code for a game (especially a mobile one) I would seriously consider it. It might only be a few ns, but if something is executed N hundred times per second it might improve performance in an observable way. The StdDev for FasterLoop is also worth considering in those scenarios.
Even for games, these kinds of optimizations can always be added at a later stage, after profiling your code and determining it's actually spending a lot of time here.
9/10 times you will actually be doing something inside the loop that will totally overshadow the time spent looping.
What can't easily be added later is having proper data structures for the problem. These are definitely worth getting right at earlier stages.
Nice, and good guidance for compiler/GC/runtime future optimisations: we return to the recreating simple optimal assembly/machine code in C# by use of odd obscure keywords.
I really appreciate the advice. I am looking forward to more videos.
Thanks
this man just cant leave the loops alone
In next video: "I Lied! C# Loop that iterates 10.000.000 items in 0.05 nanoseconds"
Accurate
U can probably do that using SIMD 🤣
@metaltyphoon nah, not even with SIMD. 0.05 ns for even 1 instruction requires a 20 GHz processor. Maybe in a few processor generations. 😁
Love your stuff Nick! Thank you!
Hi! I reproduced your tests on my system and I am getting results close to yours, but I also decided to add an actual unsafe method in which I use real raw pointer syntax and the required fixed statement to pin the array. The performance of this is very close to your fastest loop, but not faster. In other tests that I did, I always found that real unsafe code was still the fastest, so that's a bit of a surprise to me.
I also want to point out that there is a tiny difference between storing the Length value in a local variable inside your method first and use that variable in your loop instead of directly using the field repeatedly. This is because accessing locals in the stack is still a bit faster than accessing fields from the managed heap.
I work with Kotlin for the most part and I have almost no experience with C#, I still find those videos very interesting to watch. Thank you!
7:44 What is this fancy debugger? Is it exclusive to this IDE, or can you have it in Visual Studio Code or VS?
Looping like we're C++ now. lol.
Yeah, you can loop that way in C++, but I think performance would be mostly the same as a range-based for, due to zero-cost abstractions and compiler optimizations.
True. I wonder how this would work with goody ol' C# pointers.
@AcidNeko the range stuff is just an abstraction for looping with the begin() and end() functions, which are inlined to loop with the start/end pointers in the exact way Nick did here.
Hence my comment.
That looks a LOOOOOOT like iterators in C++. Where you take the .begin() and the .end() and keep increasing begin while it is less than end
Oh boy, you're starting to get into the territory of how we handle collections in C++, toeing a dangerous line there lol
The StdDev was almost twice as high on the FasterLoop, so its results are a little more iffy.
Epic ! I thank you Nick ! Blessings from Polska
I know about spans but I haven’t started using them. I need to try some experiments in my codebase. What are some good common use cases to explore?
Would stackalloc make a difference in performance?
the "fast loop" using Unsafe.Add with index will be as fast as the "faster loop" if you use nint i instead of int i.
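A sketch of what the comment above suggests: an index-based Unsafe.Add loop where the counter is an nint rather than an int. It assumes .NET 5 or later with C# 9, and a loop that sums an int array; NativeIndexLoopSketch and SumWithNativeIndex are hypothetical names. The performance claim is the commenter's and has not been measured here.
```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

public static class NativeIndexLoopSketch
{
    // Hypothetical helper: indexes off a reference to element 0 using a
    // native-sized integer counter.
    public static int SumWithNativeIndex(int[] items)
    {
        ref int start = ref MemoryMarshal.GetArrayDataReference(items);
        nint length = items.Length;
        int sum = 0;
        for (nint i = 0; i < length; i++)
        {
            // Uses the Unsafe.Add overload that takes a native-sized element offset.
            sum += Unsafe.Add(ref start, i);
        }
        return sum;
    }
}
```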
The disasm you're looking at is not the most optimized one due to tiered JIT. The method will definitely be inlined, but that's not really a problem. It's also worth noting that iterating over the array's span instead of the array is only faster because the JIT optimizes the looping over the span better (which IMHO is a bug). Also, this "even weirder" approach was already commented to the original video ;)
Why then is it not default implementation, are there any downsides?
The cycle has completed: for a long time we moved away from pointers, and now they are coming back.
Have you also tried dropping into unsafe code and using actual pointers? Would be interesting to see.
That's for the next video
How about pointer arithmetics, aka true unsafe code? How much faster would it be? Or would it be any faster than Faster loop? And would the array need to be pinned?
That’s the topic of the next loop video
Effectively, the fastest way is equivalent to the fastest original C way: get a pointer to the start, and add the item size. Doesn't surprise me.
Using spans isn't as clear to people looking at the code (yet), so I know span is there but I almost never need them.
Is array indexing really so slow that calling a function each iteration is faster? Even with spatial locality caching and bus sizes that can return multiple contiguous elements at a time? My intuition would be that the stack frame hit from a function call (pushing and popping the activation record, moving around the program counter, loading and unloading arguments) would be more expensive than the frequent L1 cache hits (or pulling from registers) from array indexing (i.e. ask for one value in the array, it grabs multiple values, amortized cost is very low overall). I'm fully aware that I am missing something, but what is it that I'm missing here?
There are no function calls when the code gets executed. The methods used are so simple that they get inlined automatically and/or have [MethodImpl(MethodImplOptions.AggressiveInlining)].
I think you are missing inlining. All these Unsafe... calls would probably be inlined. Also, this approach is so fast because you work with the array at the memory level, so for example the code will not check for an index-out-of-range exception and so on.
@user-od4ce8pe3u that's a good point! Thanks
Reading the code a year from now will be a "What the hell?" moment.
Sort of how we used to do things in assembler, then.
Method two is how I used to loop back in my 6502 days...
“It’s enough slices!”(loop perf videos)
Oh man, here I am totally not optimizing prematurely.
I'm not so proficient in C# and its history. So why, again, can't Microsoft make for/foreach loops as fast as this method, as they did with said for/foreach in .NET 7?
I don't bother optimising loops in c#.. It was normal practice in my c & c++ days - but foreach is so much cleaner and less error-prone - it's not worth the effort! Besides the new syntax is much less concise than the original C versions.
Isn't it enough to just put the items.Length in a variable, so that it does not need to be evaluated in each loop?
Nope. In these examples it won't make a difference at all
there is a special IL code for getting array length
The compiler is smart enough to do that behind the scenes
This is exactly how you do it in C & C++
Put length in a const variable and not in the loop.
I have to push this to some branch and then let someone CR this loop :D
FasterLoop is not Faster than "normal" in last run. Check deviation
Eventually it becomes c++ iterators xd
Welcome C
You could try caching the items.Length before the loop, that might make the span approach faster also at 10000 items (haven't tried it myself though)
It isn't going to make a difference at all for those sizes
If you ever need this level of performance you should probably be using C++ and doing your pointer arithmetic there! C# goes out of its way to keep this sort of low-level code from being used, though it is interesting to see that it's possible. When you have a problem you wish to solve with code, you should consider what is the best tool for the job, and this often means some languages are better suited than others.
Very cool video, and as others have said, good job putting the disclaimer at the start. I've seen developers who want to do stuff like this when performance isn't an issue. For the sanity of dev teams everywhere, readability and good structure are more important than premature optimization.
Ah good old C-territory spaghetti code
I feel like you could just use C or C++ at this point
hmm, change the order you run the methods.
It doesn't matter
I really hate it when people use the word “lie” instead of “wrong”. One is on purpose and the other isn’t. If the other video was truly a lie then that makes all the videos suspect.
I agree with you but that’s not how the YouTube algorithm and click-through rates work, unfortunately. It’s the game we have to play (and it’s sad but it works)
C Dull
Feels like a Linked List with extra steps
don't see any value in this kind of topics
I think there is value knowing that you can go deeper in C#. Game developers using Unity or people doing image processing, HEAVILY use these techniques to optimize their code.