I Lied! The Fastest C# Loop Is Even Weirder

Nick Chapsas

45,297 views

Added on 2 April 2023
Check out my courses: dometrain.com
Become a Patreon and get source code access: / nickchapsas
Hello everybody, I'm Nick, and in this video I will show you an even weirder way to loop in C#, which in some cases can perform better than every other approach.
Workshops: bit.ly/nickworkshops
Don't forget to comment, like and subscribe :)
Social Media:
Follow me on GitHub: bit.ly/ChapsasGitHub
Follow me on Twitter: bit.ly/ChapsasTwitter
Connect on LinkedIn: bit.ly/ChapsasLinkedIn
Keep coding merch: keepcoding.shop
#csharp #dotnet

Comments • 96

@jackkendall6420 1 year ago ⁺²²³
when you forget to add the # to your C
@mad_t 1 year ago ⁺¹⁵
yeah that's definitely c prime way of iterating :)
In the real world, though, the actual logic inside the loop takes so much more time and resources than the iteration that this optimization is completely unnecessary.
@SharunKumar 1 year ago ⁺¹
BluntC
@MechMK1 1 year ago ⁺⁵⁵
Ah yes, the fastest way to write C# is to write it as if it was C.
@desertfish74 1 year ago ⁺²
The fastest way to write code is to actually write C anyway....
@manuelamstutz4468 1 year ago ⁺¹⁰
@@desertfish74 Not the fastest way to write code, but a way to write the fastest code :-). Writing C often takes longer.
@desertfish74 1 year ago ⁺¹
@@manuelamstutz4468 true haha
@MagicNumberArg 1 year ago ⁺³²
"Hello everyone, this is Nick Chapsas and today I will show you a trick which makes your loop allocate NEGATIVE 128 bytes per item, that's right, LESS than 0, and also makes code in other processes on the same machine run 20% faster!"
@phizc 1 year ago
@First Last the Java video was quite good as it was 😄
@Eric-kx7do 1 year ago ⁺¹⁴
I rewrote a legacy C application in C# that was sensitive to performance. I had to use Marshalling to access a legacy C DLL that I couldn’t update but stayed with Spans for the performance increase. In the end the company got a faster, working high performance application that thanks to C# was smaller and much easier to maintain. Your video today convinces me that my choices back then were still correct.
@ristopaasivirta9770 1 year ago ⁺¹³
That feeling when you start to get segfault errors in a managed language.
@prman9984 1 year ago ⁺¹⁰
The final AsSpan is the best by far. It's very clear what's happening and extremely close to the unsafe version.
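For readers who have not seen the span version, here is a minimal sketch of what an `AsSpan` loop over an array can look like. The class and method names (`SpanLoopSketch`, `SumWithSpan`) are hypothetical and the loop body (summing) is only an illustration; the exact benchmark code from the video may differ.

```csharp
using System;

public static class SpanLoopSketch
{
    // Hypothetical helper: sums an int[] by iterating over a span view of it.
    // AsSpan() does not copy the array; it wraps the same memory.
    public static int SumWithSpan(int[] items)
    {
        int sum = 0;
        foreach (int item in items.AsSpan())
        {
            sum += item;
        }
        return sum;
    }
}
```

The loop body reads the same as an ordinary `foreach`, which is presumably why several commenters call this the clearest of the fast options.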
@arztje 1 year ago ⁺¹³
I am definitely using spans a lot more after watching your performance videos. It has helped quite a lot when iterating massive collections.
@jurgen_kluft 1 year ago
me as well
@tosunabi1664 1 year ago ⁺⁷
Span is the way to go. Cleaner.
@tanglesites 1 year ago ⁺¹⁶
Shouldn't you be able to do something similar with pointers? This seems similar to how you can loop in C++ with pointers. Awesome video. I like that C# does not shut you out of the language but gives you a secret door to the goodies.
@nickchapsas 1 year ago ⁺¹⁹
You can actually use straight pointers like in C by enabling unsafe compilation. I’ll probably show that in a future video
@dryadxon 1 year ago ⁺⁹
In fact it's the same thing under the hood: it gets two pointers, one to the first element and one to the element just past the last, and iterates until the pointers are equal. It is quite common to iterate this way in C, especially over strings, and although doing this in C# is more difficult and less readable, it is fascinating to find out that it is possible.
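The two-reference iteration described in this comment can be sketched in safe-context C# with managed references instead of raw pointers. This is an illustration under assumptions, not necessarily the exact code from the video: the names `RefLoopSketch` and `SumWithRefs` are hypothetical, and the summing body is only there to give the loop something to do. It relies on `MemoryMarshal.GetArrayDataReference`, `Unsafe.Add`, and `Unsafe.IsAddressLessThan`, which are available in .NET 5 and later.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

public static class RefLoopSketch
{
    // Hypothetical helper: walks the array with a moving reference,
    // stopping when it reaches the reference one past the last element.
    // No bounds checks are performed, so a wrong length would read
    // memory outside the array.
    public static int SumWithRefs(int[] items)
    {
        int sum = 0;

        // Reference to element 0 (valid even for an empty array,
        // as long as it is never dereferenced).
        ref int current = ref MemoryMarshal.GetArrayDataReference(items);

        // Reference to the position just past the last element.
        ref int end = ref Unsafe.Add(ref current, items.Length);

        while (Unsafe.IsAddressLessThan(ref current, ref end))
        {
            sum += current;

            // Advance the reference by one element.
            current = ref Unsafe.Add(ref current, 1);
        }

        return sum;
    }
}
```

Because nothing here checks bounds, the safety of the loop depends entirely on `end` being computed from the real array length.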
@chipcode5538 1 year ago ⁺¹
Good old pointer math is back again.
@harald4game 1 year ago ⁺²
Love it. C pointer loops are back.
I'm not doing time-critical C# nowadays, but if I had to write an image processing library, that would be a great option to avoid cross-language issues, especially for cross-platform programming.
@AAAnatoly 1 year ago ⁺¹⁰
A couple of videos later: "Hello, I'm Nick, and today we will run C++ code in our C# project to iterate an array extremely fast" ))
@nickchapsas 1 year ago ⁺¹⁶
Has my video backlog leaked or something?
@phizc 1 year ago ⁺¹
Actually, that's slower since there's no way of inlining the c++ code, and calling into native, and back to C#, is also wasting performance. It's definitely possible though.
@aurinator 1 year ago ⁺¹³
While I like your data that shows it's the fastest, its Standard Deviation is high implying something is causing it to occasionally take much longer than the average. I'm wondering what might cause that now.
@Kazdro009 1 year ago ⁺¹³
If I were to write code for a game (especially a mobile one) I would seriously consider it. It might be only a few ns, but if something is executed N hundred times per second it might improve performance in an observable way. The StdDev for FasterLoop is also worth considering in the scenarios mentioned here.
@dire_prism 1 year ago ⁺³
Even for games, these kinds of optimizations can always be added at a later stage, after profiling your code and determining it's actually spending a lot of time here.
9/10 times you will actually be doing something inside the loop that will totally overshadow the time spent looping.
What can't easily be added later is having proper data structures for the problem. These are definitely worth getting right at earlier stages.
@AndrewJonkers 1 year ago
Nice, and good guidance for future compiler/GC/runtime optimisations: we return to recreating simple, optimal assembly/machine code in C# by using odd, obscure keywords.
@Innovatorsoft 1 year ago
I really appreciate the advice. I am looking more than video.
Thanks
@ali_randomNumberHere 1 year ago ⁺²
this man just can't leave the loops alone
@rasimismatulin1400 1 year ago ⁺⁹
In next video: "I Lied! C# Loop that iterates 10.000.000 items in 0.05 nanoseconds"
@nickchapsas 1 year ago ⁺³
Accurate
@metaltyphoon 1 year ago ⁺²
U can probably do that using SIMD 🤣
@phizc 1 year ago ⁺¹
@@metaltyphoon nah, not even with simd.. 0.05ns for even 1 instruction requires a 20GHz processor. Maybe in a few processor generations. 😁
@patrickcandlin7420 1 year ago
Love your stuff Nick! Thank you!
@jongeduard 1 year ago ⁺¹
Hi! I reproduced your tests on my system and I am getting results close to yours, but I also decided to add an actual unsafe method in which I use real raw pointer syntax and the required fixed statement to pin the array. Performance of this is very close to your fastest loop, but not faster. In other tests that I did, I always discovered that real unsafe code was still the fastest, so that's a bit of a surprise to me.
I also want to point out that there is a tiny difference between first storing the Length value in a local variable inside your method and using that variable in your loop, versus directly using the field repeatedly. This is because accessing locals on the stack is still a bit faster than accessing fields on the managed heap.
@DeadlyAlive... 1 year ago ⁺¹
I work with Kotlin for the most part and I have almost no experience with C#, I still find those videos very interesting to watch. Thank you!
@damkillerxxwin1462 8 months ago
7:44 What is this fancy debugger? Is it exclusive to this IDE, or can you have it in Visual Studio Code or VS?
@billy65bob 1 year ago ⁺⁸
Looping like we're C++ now. lol.
@AcidNeko 1 year ago
Yeah, you can loop that way in C++, but I think performance would be mostly the same as a range-based for, thanks to zero-cost abstractions and compiler optimizations.
@nothingisreal6345 1 year ago
True. I wonder how this would work with goody ol' C# pointers.
@billy65bob 1 year ago
@@AcidNeko the range stuff is just an abstraction for looping with the begin() and end() functions, which are inlined to loop with the start/end pointers in the exact way Nick did here.
Hence my comment.
@sebastiangudino9377 1 year ago ⁺¹
That looks a LOOOOOOT like iterators in C++. Where you take the .begin() and the .end() and keep increasing begin while it is less than end
@charlesmayberry2825 1 year ago
Oh boy, you're starting to get into the territory of how we handle collections in C++, toeing a dangerous line there lol
@matthewsheeran 1 year ago ⁺²
The StdDev was almost twice as high on the FasterLoop, so its results are a little more iffy.
@lifeisgameplayit 1 year ago
Epic ! I thank you Nick ! Blessings from Polska
@djenning90 1 year ago
I know about spans but I haven’t started using them. I need to try some experiments in my codebase. What are some good common use cases to explore?
@FilipCordas 1 year ago ⁺¹
Would stackalloc make a difference in performance?
@chadsquad5050 1 year ago ⁺⁴
The "fast loop" using Unsafe.Add with an index will be as fast as the "faster loop" if you use nint i instead of int i.
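A minimal sketch of the variant this comment describes, assuming the "fast loop" means indexing off a reference to the first element with `Unsafe.Add`. The names `NativeIndexLoopSketch` and `SumWithNativeIndex` are hypothetical, and the performance claim is the commenter's, not something this snippet demonstrates. `nint` needs C# 9 or later and binds to the `Unsafe.Add<T>(ref T, IntPtr)` overload.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

public static class NativeIndexLoopSketch
{
    // Hypothetical helper: indexes from the first element using a
    // native-sized integer (nint) as the loop counter, so the offset
    // passed to Unsafe.Add is already pointer-sized.
    public static int SumWithNativeIndex(int[] items)
    {
        int sum = 0;
        ref int start = ref MemoryMarshal.GetArrayDataReference(items);
        nint length = items.Length;

        for (nint i = 0; i < length; i++)
        {
            // No bounds check here: correctness relies on length
            // matching the real array length.
            sum += Unsafe.Add(ref start, i);
        }

        return sum;
    }
}
```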
@nofella 1 year ago
The disasm you're looking at is not the most optimized one due to tiered JIT. The method will definitely be inlined, but that's not really a problem. It's also worth noting that iterating over the array's span instead of the array is only faster because the JIT optimizes the looping over the span better (which IMHO is a bug). Also, this "even weirder" approach was already commented to the original video ;)
@georjj 1 year ago ⁺¹
Why then is it not the default implementation? Are there any downsides?
@boxacuva 1 year ago
The cycle has completed: for a long time we moved away from pointers, and now they are coming back.
@clantz 1 year ago ⁺¹
Have you also tried dropping into unsafe code and using actual pointers? Would be interesting to see.
@nickchapsas 1 year ago ⁺⁴
That's for the next video
@EtienneFortin 1 year ago
How about pointer arithmetic, aka true unsafe code? How much faster would it be? Or would it be any faster than the Faster loop? And would the array need to be pinned?
@nickchapsas 1 year ago ⁺²
That’s the topic of the next loop video
@HarshColby 1 year ago
Effectively, the fastest way is equivalent to the fastest original C way: get a pointer to the start, and add the item size. Doesn't surprise me.
Using spans isn't as clear to people looking at the code (yet), so I know span is there but I almost never need them.
@pharoah327 1 year ago
Is array indexing really so slow that calling a function each iteration is faster? Even with spatial locality caching and bus sizes that can return multiple contiguous elements at a time? My intuition would be that the stack frame hit from a function call (pushing and popping the activation record, moving around the program counter, loading and unloading arguments) would be more expensive than the frequent L1 cache hits (or pulling from registers) from array indexing (i.e. ask for one value in the array, it grabs multiple values, amortized cost is very low overall). I'm fully aware that I am missing something, but what is it that I'm missing here?
@maschyt 1 year ago
There are no function calls when the code gets executed. The methods used are so simple that they get inlined automatically and/or have [MethodImpl(MethodImplOptions.AggressiveInlining)].
@user-od4ce8pe3u 1 year ago ⁺¹
I think you are missing inlining. All these Unsafe... calls would probably be inlined. This approach is also so fast because you work with the array at the memory level, so, for example, the code will not check for an index-out-of-range exception and so on.
@pharoah327 1 year ago
@@user-od4ce8pe3u that's a good point! Thanks
@deltaphilip8611 1 year ago
Reading the code a year from now, will be a "What the hell?" moment.
@alanbourke4069 1 year ago
Sort of how we used to do things in assembler, then.
@rick2591 1 year ago ⁺¹
Method two is how I used to loop back in my 6502 days...
@adrian_franczak 1 year ago
“It’s enough slices!”(loop perf videos)
@Rafloka 5 months ago
Oh man, here I am totally not optimizing prematurely.
@user-de2px1ed8k 1 year ago ⁺¹
I'm not so proficient in C# and its history. So why can't Microsoft make for/foreach loops as fast as this method, as they did with for/foreach in .NET 7?
@ronosmo 1 year ago ⁺¹
I don't bother optimising loops in C#. It was normal practice in my C and C++ days, but foreach is so much cleaner and less error-prone that it's not worth the effort! Besides, the new syntax is much less concise than the original C versions.
@DennisHaney 1 year ago
Isn't it enough to just put the items.Length in a variable, so that it does not need to be evaluated in each loop?
@nickchapsas 1 year ago
Nope. In these examples it won't make a difference at all.
@1nterstellar_yt 1 year ago
there is a special IL code for getting array length
@sgbench 1 year ago
The compiler is smart enough to do that behind the scenes
@debtpeon 1 year ago
This is exactly how you do it in C & C++
@enitalp 1 year ago
Put length in a const variable and not in the loop.
@DanielCipra 1 year ago
I have to push this to some branch and then let someone CR this loop :D
@vabka-7708 1 year ago ⁺¹
FasterLoop is not faster than "normal" in the last run. Check the deviation.
@user-wx7yx5nw9j 1 year ago ⁺³
Eventually it becomes c++ iterators xd
@1Eagler 1 year ago
Welcome C
@morriscurtis 1 year ago
You could try caching the items.Length before the loop, that might make the span approach faster also at 10000 items (haven't tried it myself though)
@nickchapsas 1 year ago
It isn't going to make a difference at all for those sizes
@fredhair 1 year ago ⁺¹
If you ever need this level of performance you should probably be using C++ and doing your pointer arithmetic there! C# goes out of its way to keep this sort of low-level code from being used, though it is interesting to see that it's possible. When you have a problem you wish to solve with code, you should consider what the best tool for the job is, and this often means some languages are better suited than others.
Very cool video, and as others have said, good job putting the disclaimer at the start. I've seen developers who want to do stuff like this when performance isn't an issue. For the sanity of dev teams everywhere, readability and good structure are more important than premature optimization.
@keit99 1 year ago
Ah good old C-territory spaghetti code
@SIlaelinAndCo 1 year ago
I feel like you could just use C or C++ at this point
@deltaphilip8611 1 year ago
hmm, change the order you run the methods.
@nickchapsas 1 year ago
It doesn't matter
@thx1137 1 year ago
I really hate it when people use the word “lie” instead of “wrong”. One is on purpose and the other isn’t. If the other video was truly a lie then that makes all the videos suspect.
@nickchapsas 1 year ago
I agree with you, but unfortunately that’s not how the YouTube algorithm and click-through rates work. It’s the game we have to play (and it’s sad but it works).
@Tal__Shachar 1 year ago
C Dull
@ad9291 1 year ago
Feels like a Linked List with extra steps
@donelbaron 1 year ago
I don't see any value in these kinds of topics
@nickchapsas 1 year ago ⁺²
I think there is value in knowing that you can go deeper in C#. Game developers using Unity or people doing image processing HEAVILY use these techniques to optimize their code.
