Optimizing String Performance Easily in C#

Sdílet
Vložit
  • čas přidán 1. 02. 2023
  • Enroll to "Cloud Fundamentals: AWS Services for C# Developers" for FREE: bit.ly/3XKUBOH
    Become a Patreon and get source code access: / nickchapsas
    Hello everybody I'm Nick and in this video I will show you a technique with which you can eliminate string allocations completely for your code in some scenarios called String Pooling.
    Don't forget to comment, like and subscribe :)
    Social Media:
    Follow me on GitHub: bit.ly/ChapsasGitHub
    Follow me on Twitter: bit.ly/ChapsasTwitter
    Connect on LinkedIn: bit.ly/ChapsasLinkedIn
    Keep coding merch: keepcoding.shop
    #csharp #dotnet

Komentáře • 132

  • @nickchapsas
    @nickchapsas  Před rokem +24

    Just a quick clarification. The hand-written version of the host extraction assumes that there are no query strings or hashes on the URLs. That's just because of the sepcific usecase. If you want the behavior to match the Uri class you need to add index checks for a question mark and a hash to extract the correct host. Adding this logic won't affect the memory shown in the video nor the execution speed at all (I tested it).

    • @nebuleon
      @nebuleon Před rokem +4

      Or the port, before the slash or end of string! (example:8080)

    • @spoonfuloftactic
      @spoonfuloftactic Před rokem +1

      And it doesn't support IP address hosts, but maybe that's desirable.

  • @jimread2354
    @jimread2354 Před rokem +31

    We used to use a SrtingPool when we were using a middleware messaging library because there was a fixed number of strings we could use to create the recipient and sender strings. We were generating potentially hundreds of these messages per second so we couldn't be thrashing around in memory like that. StringPool to the rescue!

  • @local9
    @local9 Před rokem +3

    I've come to the conclusion you've bugged my house, only yesterday I was having a conversaion about strings and their performance... and you post this video, only last week I had a discusion about Dictionaries and their performance to then find out 5 minutes later you posted a video about something you could do to improve their performance in some cases. Keep up the work! These videos do help a lot!

    • @29Aios
      @29Aios Před rokem

      Let's check if you both have quantum entanglement, think about something... say WeakReference😁

  • @claasdev
    @claasdev Před rokem +17

    I just don’t trust myself to implement uri properly haha

  • @AConversationOn
    @AConversationOn Před rokem +11

    I think it would be worthwhile to create a demo app where you can show that `300 ns` translates to `3 s` under some inputs. And then reuse that demo app for all your real-world benchmarks.
    Eg., an API which runs an ETL pipeline over millions of rows. Here you could have "GetHost" as an ETL step, and then show that this takes 1mill * 300ns for the whole step; and then issue lots of requests, and suddenly you've got significant performance problems.
    The purpose of the API/ETL could be to summarise the last (simulated) few days of IoT sensor events, or something where processing millions of rows / request makes sense.

    • @nickchapsas
      @nickchapsas  Před rokem +18

      What I wanna do is get a project that performs fine and then apply multiple optimizations to show how speed and memory improved overal performance of the app. That's gonna be a good example

    • @dave7038
      @dave7038 Před rokem

      @@nickchapsas I would love to see this, it would be a great way to demonstrate the overall process of identifying problem areas and methods of addressing them.

  • @logank.70
    @logank.70 Před rokem +24

    Out of curiosity I benchmarked using a regular expression. The below expression benchmarked at ~1.5 microseconds and using 3.5KB of memory for the shorter URL and 11.5KB of memory for the longer. So memory usage doesn't scale very well at all (and is significantly worse when compared to the benchmarks in the video both in time and memory).
    (\w+:\/\/)[A-Za-z\.]+

  • @numinaoneiron1655
    @numinaoneiron1655 Před rokem +3

    Great video Nick.Concerning the substring, you should try an alternative which Microsoft provides in Microsoft.Extensions.Primitives.StringSegment.

  • @iGexogen
    @iGexogen Před rokem +39

    It is very tricky memory/cpu tradeoff, only applicable to code working with low diversity strings (sounds pretty untolerant 🙂). Memory optimization can easily become memory disaster, when you collect never needed anymore strings like grannies collect garbage 🙂. This feature must be used either wisely or never, I don't think it works like cache with sliding expiration cause it is expensive to manage and will reset all benefits from those microoptimizations.

    • @millch2k8
      @millch2k8 Před rokem +2

      Fair point regarding string diversity... Unless you're expecting the same ones to regularly occur, not obvious whether the StringPool approach adds that much.

    • @tobyjacobs1310
      @tobyjacobs1310 Před rokem +1

      I mean hostname might well be frequently identical.... I've got immense milage on string pools in the past.

    • @Sergio0694
      @Sergio0694 Před rokem +3

      StringPool has a fixed size (which you can also choose upon creation) and it internally uses an LRU schema to automatically drop old string instances when it reaches full capacity. So it doesn't really have a risk of just completely filling your memory, it can't ever exceed the maximum capacity you decided 🙂

    • @iGexogen
      @iGexogen Před rokem

      @@Sergio0694 Ok that makes sense, it will not exhaust memory, but still will be less efficient in wrong use cases comparing to not using it at all 🙂

    • @Sergio0694
      @Sergio0694 Před rokem +1

      @@iGexogen Yes, using things incorrectly or in the wrong use case is generally expected to not produce great results, that is a universal principle not specific to StringPool 😄

  • @billy65bob
    @billy65bob Před rokem +1

    heh, I had rolled my own string pool and added it into our project. We're not using it properly yet though.
    We do repeat some strings (particularly user names) a stupidly absurd number of times, so I had intended it to get used to coalesce those into a single allocation.
    The basics structure was an int -> weakref dictionary wrapped by a ReaderWriterLock, the int being the string's GetHashCode.
    I did it that way so the GC will clean up any strings that fall out of use, so I made it also force a vacuum after every 32768 inserts.
    I had no idea that nuget package existed though. I'm looking at its code, and it does some crazy stuff with structs and refs, but I don't think it would GC the way I would expect it to.
    At least it can be reset unlike .NET's Interned strings.

  • @jwcodendaal
    @jwcodendaal Před rokem +3

    I may try this. Made a webscraper that parses all found links using URI to get hosts. This would certainly improve performance a bit / memory allocations as it gets called alot in this use case.

  • @99aabbccddeeff
    @99aabbccddeeff Před rokem

    I like your videos and how you explain all those things. Thank you!

  • @matthias916
    @matthias916 Před 11 měsíci +1

    I love these optimization videos, sadly I don't need to optimize much at my job

  • @diadetediotedio6918
    @diadetediotedio6918 Před rokem +2

    In my opinion, this is an excellent solution when you are making a small DSL full of small needs with repeated strings, perhaps it is not ideal for all cases since it considerably increases the complexity, but it is one more tool to have hand when needed.

  • @Speiger
    @Speiger Před rokem +6

    Technically there is another optimization you could do.
    There should be a function for indexOf() with a start index where it should be starting at. At least java does.
    Meaning one subString can be eliminated and you come down to 1 subString allocation instead of 2.
    With spans ofc that isn't required, but it on allocation too.

    • @nebuleon
      @nebuleon Před rokem +4

      I came here to post this. Indeed, in the documentation for the overloads of System.String.IndexOf in the .NET Framework, there's one for (char value, int startIndex) and another for (string value, int startIndex). That would eliminate the call to System.String.Substring in statement 3 of GetHostGood(string).

    • @alexisfibonacci
      @alexisfibonacci Před rokem

      GetHostBetter shows this.

    • @Speiger
      @Speiger Před rokem +1

      @@alexisfibonacci GetBetterHosts only stackallocates instead of heapallocation.
      But the same optimization still can be applied because a stack allocation still has to be moved. Yes its faster but this optimization can be still used.
      Since not a second Stack Allocation isn't required either with my suggestion.

  • @jongeduard
    @jongeduard Před rokem

    Yep, all this modern memory efficiency things in dotnet are great additions. By the way, we have at least 3 different of those pool types: MemoryPool, ArrayPool, StringPool.
    Note that those are also thread-safe! This might explain why you noticed a bit of lower performance when using the StringPool, because of some protective overhead.

  • @FunWithBits
    @FunWithBits Před rokem

    I didn't even notice the memory diff tool was in Visual Studio also! Nice!!!

  • @prowhiskey2678
    @prowhiskey2678 Před rokem +4

    one thing to note that while your implementation may be faster and more memory efficient. But the implementation is also different. The Uri version will not break with a query Param while yours does.

    • @nickchapsas
      @nickchapsas  Před rokem

      Why would it break?

    • @prowhiskey2678
      @prowhiskey2678 Před rokem +2

      @@nickchapsas Apparently I can't give example URIs in comments as they are blocked by CZcams.
      But you search for a trailing slash, but directly after the hostname a fragment or query param can start without the trailing slash.
      Your parsed hostname would include the query param and/or the fragment.

    • @nickchapsas
      @nickchapsas  Před rokem

      @@prowhiskey2678 Sure so you just add a IndexOf check for question mark or hash after the forward slash one fails and you're done. I just tested it and it doesn't affect execution speed or memory at all

  • @ivaniliev93
    @ivaniliev93 Před rokem

    Good stuff, thanks Nick

  • @mumk
    @mumk Před rokem

    holy smokes what a great sponsor from AWS!!!

  • @qnikbud
    @qnikbud Před rokem +4

    Hello, I think you should have mentioned String.Intern(), a mechanism similar to StringPool, but available in the core libraries, what are the pros and cons of both. Thank you.

  • @ArnonDanon
    @ArnonDanon Před rokem

    I realy enjoy the perf videos👍🏽

  • @davideglass
    @davideglass Před rokem +4

    Worth noting that some of these methods won't work with URLs that have fragments, querystrings or (the admitedly rare) username:password style syntax.

  • @MZZenyl
    @MZZenyl Před rokem +4

    I recently optimized some code for printing long (~250,000 characters) strings to the console, specifically for Linux. The tricky part was to turn the long C# StringBuilder's utf16 characters into a utf8 string, without the GC going amok. With a bit of unsafe code to allow manual memory allocation, I was able to turn the StringBuilder into a Span, then converting it to utf8 and storing it in a Span, and then use P/Invoke to print it to the console using libc's printf function. The speed is heavily CPU-dependent, but I managed to print around 12.5 million characters to the console every second (most of that being ANSI escape sequences to apply colors to the console output), all without the GC going crazy, and only using around 12-25 MB of RAM according to Visual Studio.

    • @29Aios
      @29Aios Před rokem

      Uh, char is 1 byte, but utf16 is 2 byte, utf8 is 1-4 bytes, I see your pain !

    • @MZZenyl
      @MZZenyl Před rokem +1

      @@29Aios Converting from UTF16 (char/ushort) to UTF8 (byte) was actually the easy part. System.Text.Encoding.UTF8 has a "GetByteCount" method which tells how many bytes are needed to encode a UTF16 input to UTF8. It also has a "GetBytes" method which has an overload that takes an input Span and an output Span, and it then handles the actual conversion. Then you just gotta remember to add an extra byte to the byte span, and set it to zero, so the UTF8 string is properly null terminated.

    • @29Aios
      @29Aios Před rokem

      @@MZZenyl That's cool !

    • @AEF23C20
      @AEF23C20 Před rokem

      "I managed to print around 12.5 million characters to the console every second"
      what?
      your "print around 12.5M" = 12.5M fps
      it's impossible by definition
      I work with consoles, and heavy console output is almost trivial
      count the time of output lines, and if this time exceeds some delta - skip the entire output buffer
      you need to separate the internal console buffer and the rendering of this console buffer, but you need to write your own console for this
      and then you will be able to output at least as many lines to the console, you just won’t see them))) and you will always see the last buffer
      and yes, what you are doing is crazy, because by definition, millions of lines are impossible to read and therefore it is not necessary
      the console should output the required 400 buffer lines, but this is not true either, the console should output 25 lines and no more
      you have 60 frames per second, let's say 120, 25 lines per frame is 120*25 = 3000
      typical renderer - renders 30-40k lines per second on built-in video cards without drivers, only GDI, it doesn't get any faster
      this means that continuous line rendering at a fixed 60 fps (above) is rendering 1500 lines per second and this is great because it does not load the processor (about 3-5%)
      yes, yes, any huge number of lines and their rendering is about 3-5% CPU load, I have such a self-written console) in the most difficult scenarios it is 3-5%
      and yes, if you write your own console, then don't forget to add a pool of graphical objects, because it makes no sense to rebuild the same graphical objects, keep them immediately ready for rendering, because usually a lot of lines are repeated or contain repeated parts - but this is means that these are repeating graphic objects

    • @MZZenyl
      @MZZenyl Před rokem

      @@AEF23C20 Alright, let's take this step-by-step:
      - You need to calm down. You are needlessly aggressive and judgemental over something that does not concern you. You are jumping to conclusions that, as you might imagine, are wildly inaccurate.
      - The project in question is a TUI framework I'm doing as a hobby project, and the numbers from my original comment relate to a stress test, hence the large number of characters being written to the console. As you might imagine, such a framework prints over the same area of the screen buffer for each render pass. Most of those characters are part of ANSI escape sequences, specifically the 38;2 and 48;2 sequences which dictate the 24-bit RGB value of the fore- and background, respectively.
      - I said 12.5 million characters, not frames. Please learn to read. The performance is naturally heavily dependent on the system it runs on, and the console host it is executed within. My home PC can print around 6-7 "FPS" with conhost set to the default 120x30 character size, while Windows Terminal seems far less consistent in terms of "FPS", jumping between around 30-100 "FPS".

  • @felixp535
    @felixp535 Před rokem +1

    4:24 IndexOf accepts a startIndex argument! No need to substring here!
    When comparing results in the benchmark, the difference is very negligible between the Good & Better solutions, so I'd suggest using the more intuitive and readable one (which in my opinion would be the homemade one but everyone is free to have their own opinion).

  • @dusrdev
    @dusrdev Před rokem

    What if you created a readonly private field for the char array of the ://, then created a method which is gets the string already as ReadOnlySpan, performed the search, while saving the indexes in a Span on the stack, and only returned the substring at the end, without always converting to a span since it is converted implicitly in the parameter?

    • @QwDragon
      @QwDragon Před rokem +1

      You don't need to create anything for literal strings. They are always in memory and never recreated.

  • @andytroo
    @andytroo Před rokem

    java has a string compacting garbage collector where it will consolidate multiple instances of the same string down to the one pointer - it would be great addition to .Net in some circumstances.
    Sometimes an external library hands you lots of repeat strings (i'm looking at you, Azure Blob Storage SDK) and you just have to take the memory hit.

  • @jcmorin2007
    @jcmorin2007 Před rokem

    Thanks for the video, I wonder how String.Intern work. I had the impression it was a kind of pool of string to avoid keeping string reference.

  • @beefydonkey6714
    @beefydonkey6714 Před rokem +5

    Love you!

    • @rngesus8057
      @rngesus8057 Před rokem +1

      i thought you loved me?

    • @nickchapsas
      @nickchapsas  Před rokem +5

      Beefy Donkey, I love you too

    • @rngesus8057
      @rngesus8057 Před rokem +2

      @@nickchapsas im feeling REALLY left out here. i guess im about as needed as a string allocation:(

    • @nickchapsas
      @nickchapsas  Před rokem +4

      rngesus, I love you too too

    • @rngesus8057
      @rngesus8057 Před rokem

      @@nickchapsas hahaha 😄

  • @rade6063
    @rade6063 Před rokem

    Will using ref keyword and changing url variable that is passed be as memory efficient as StringPool but as quick as span? You are not creating new string and changing one that you already have so you need to allocate string only once?

  • @GigAHerZ64
    @GigAHerZ64 Před rokem +2

    Wait... if it's a pool, shouldn't you return the strings back to the pool once you don't need them anymore? Or is StringPool not threadsafe?

  • @urbanelemental3308
    @urbanelemental3308 Před rokem +4

    Maybe you could cover string.Intern as well.

    • @nickchapsas
      @nickchapsas  Před rokem +4

      Good suggestion! Will add it in the list

  • @r.a.avenger
    @r.a.avenger Před rokem

    hi
    I'm curious to know that in real world when we write code and we are almost familiar with docker, kubernetes and ... should we write most performant code that can handle a lot of requests as a single instance or it's more common to write the most human readable code (by writing "new Uri(urlString)") and use load balancing and other techniques to handle more requests?

  • @salmelo16
    @salmelo16 Před rokem

    is manually stack allocing a char array necessary for the span version? I can't remember where i saw it or what the name was (or i'd google it) but I was under the impression the compiler (or maybe runtime?) sort of converts string literals into constants so that it can reuse the same instance every time. In which case, wouldn't just using a string literal (instead of the much harder to read char array) be equally non-allocating? Or alternatively you could manually declare a const string yourself, if the compiler isn't doing it for you like I think. Or is there some reason a const string would still end up allocating that I'm not aware of?

    • @salmelo16
      @salmelo16 Před rokem

      well guess i should've watched the next video, cuz i was definitely thinking of interning. Still curious if there's some reason why using a string literal over a manual char array would be worse for memory, despite interning.

  • @Andrew90046zero
    @Andrew90046zero Před rokem +2

    I think this is literally what I always imagined for creating allocation-less string returns. If I'm correct, StringPool is *essentially* like StringBuilder, except it's a singleton, meaning it's accessed from anywhere. It's essentially a "scratch pad" for your string data! This video made me ask so many other questions though. Like if StringPool was being used in threads, would there be (a need for) a seperate pool for each thread, to assure non momory conflicts? Either way the concept is amazing.
    This would help so much when it comes to games/ui where you are creating new strings each frame. Instead of trying to avoid it by doing value checks or somthing.
    Also, at 5:56 (line 8), I thought if you used a string literal, it is stored globally in read-only memory, so passing in a regular string would not allocate new memory. So no need to stackalloc right?

  • @kinsondigital
    @kinsondigital Před rokem

    Actually pooling is very important in my game framework Velaptor!!
    I use the pooling technique when loading textures and sounds which are expensive operations and its fantastic. Especially in game development where performance is important as well as saving those memory allocations.
    Great video!!

  • @dmitrykim3096
    @dmitrykim3096 Před 8 měsíci

    Do you have any videos on how to find out first what is the bottleneck before trying to do these optimization?

  • @Dustyy01
    @Dustyy01 Před rokem +1

    Could u not just Return a Span (or a readonly one) ?

    • @nickchapsas
      @nickchapsas  Před rokem +3

      Span needs to be returned as a string at some point. You can't visualize a span without eventually allocating it on the heap as a string. The benchmark just cuts to that point directly

    • @Dustyy01
      @Dustyy01 Před rokem +1

      @@nickchapsas right but the span is a view of the originally passed string, so that's where it is allocated right?

    • @nickchapsas
      @nickchapsas  Před rokem +2

      @@Dustyy01 That's for the original string. The host is nickchapsas.com so even though the span points to that, when it is turned into a string it is basically "chopped off" into a whole new string. It detaches from the span.

  • @cicerofoscarini8890
    @cicerofoscarini8890 Před rokem

    Thank you for your video. Few days ago, talking to some IT friends a question came up: Do you actually code? Like, do you work in real projects or not any more? Or never?
    Thank you.

    • @nickchapsas
      @nickchapsas  Před rokem +1

      I do. I've spent the better part of the last 4 years working for one of the largest payment gateways in the world (Stripe competitor) building systems dealing with thousands of requests per second and millions of users. Currently I am educating fulltime, while still practicing the same techniques I used at my work for my own projects. Currently building something that i can't announce yet.

    • @cicerofoscarini8890
      @cicerofoscarini8890 Před rokem

      @@nickchapsas Thank you, Nick, for you reply. I hope my question was not invasive or impolite. And thank you for your videos. We all learn a lot from it.

  • @ryanzwe
    @ryanzwe Před rokem +2

    This is why I'm up at 10:30

    • @ryanzwe
      @ryanzwe Před rokem

      Almost got goofed typing span

  • @Erlisch1337
    @Erlisch1337 Před rokem

    10:23 - wasnt that the one without Span?

  • @ryanelfman3520
    @ryanelfman3520 Před rokem

    What's that site that has all the boilerplate projects you can download?

  • @dmitrykim3096
    @dmitrykim3096 Před 8 měsíci +1

    How stringpool performance compares to interning?

  • @nayanchoudhary4353
    @nayanchoudhary4353 Před rokem +4

    I thought that in C#, the strings were reused via reference and if you create two string objects with the same content, they point to the same memory location. Why would we need this pool then?

    • @vivan000
      @vivan000 Před rokem +6

      that only happens for strings created at compile time and called string interning. Checking every created string at runtime against every string in memory is very expensive.

    • @nayanchoudhary4353
      @nayanchoudhary4353 Před rokem

      @@vivan000 The it seems like we need pools for everything 😀
      Especially for less GC runs.

    • @vivan000
      @vivan000 Před rokem +2

      ​@@nayanchoudhary4353 absolutely not. It's just another optimization technique that might help in some cases, but also could lead to severe problems.

    • @alexisfibonacci
      @alexisfibonacci Před rokem

      @@vivan000 Which is why the StringPool is slower even thought it doesn't allocate. The coder needs to be clear on what they are doing and optimize appropriately.

    • @29Aios
      @29Aios Před rokem

      See, when you add a new string, you need to compare it with all strings in the heap, and as Ivan said it's too expensive.
      Instead it always added to the heap, and references of the same string can be equal only at compile time.
      I suppose when you add a Span to a StringPool, there are 2 cases - 1st, happy, when hash code is calculated from span, and if there is the same entity exist in Dictionary buckets, it fully compared byte by byte, and either returned existing string reference or 2nd - created a new one and returned new string from Span.

  • @XennialGeek
    @XennialGeek Před rokem +1

    Re-inventing the wheel for minor performance boosts on standard tasks that have good general purpose implementations is usually not a good idea.

    • @29Aios
      @29Aios Před rokem

      Surely, it's an optimization for cpu/memory performance which should be used on customer demand and after profiling.
      The maintainability of the code will also suffer because of these optimizations.

  • @Rose-ec6he
    @Rose-ec6he Před rokem +5

    The solution to all the world's problems is span if you ask Nick x3

  • @everythingcode
    @everythingcode Před rokem

    I thought .Net does string pooling by default to reduce allocations whenever possible. Or is that only for string literals?

    • @nickchapsas
      @nickchapsas  Před rokem

      String literals and some other cases. Since so many people misunderstand this, I will be making a video this week to explain it

  • @amirhosseinahmadi3706
    @amirhosseinahmadi3706 Před 9 měsíci

    Why don't they make the implementation of the `Uri` class span-based?

  • @joelhodes1619
    @joelhodes1619 Před rokem

    You’re calling AsSpan method 3 times. Is there any mileage performance wise in creating the span instance initially then manipulating that one object ?

    • @nickchapsas
      @nickchapsas  Před rokem

      It won't make a difference no. As methods are simply casting.

  • @tidus93
    @tidus93 Před rokem

    I thought that using a Dictionary for caching the results from the GetHost function would be better than using a StringPool. What are the advantages of using a StringPool vs Dictionary for caching the results? Wouldn't the Dictionary be better since it's faster?

    • @nickchapsas
      @nickchapsas  Před rokem +3

      There’s quite a few actually. StringPool has logic to automatically unload data that isn’t accessed frequently, it has a fixed size, it is thread safe, it uses a lot of custom set logic to be as efficient as possible.

    • @29Aios
      @29Aios Před rokem

      Guys, ​MemoryCache has the same logic as StringPool and uses ConcurrentDictionary inside, also linked entries and much more.
      However, big advantage of the StringPool.GetOrAdd() is that it accepts Span parameter (MemoryCache/Dictionary do not), I suppose it calculates a Span hash and if it exists in buckets it compares if it the same, like a Dictionary does, but Span == string, and if so, it returns reference to existing string or else allocates a new one from Span. So there no need to allocate a new string on the heap each time, but for MemoryCache/Dictionary you instead allocate a string key on the heap to pass it as parameter to Get()/TryGetValue()
      PS. It would be nice to have string.Intern(Span) overload, so behavior is almost the same as StringPool, but singleton.

  • @urbanelemental3308
    @urbanelemental3308 Před rokem

    Any reason stackalloc would be better than a static readonly instance?

    • @nickchapsas
      @nickchapsas  Před rokem

      In this case, there isn't much of a difference. You can even use the const as it was originally and you won't see much of a diff there. if anything it might be a bit fater

    • @urbanelemental3308
      @urbanelemental3308 Před rokem

      @@nickchapsas Yeah, I would have just used the string "://" and if the extensions don't allow for it directly, just: "://".AsSpan().

  • @ahoja1907
    @ahoja1907 Před rokem

    Why do I need a string pool when the CLR manages an internal string pool itself?

    • @nickchapsas
      @nickchapsas  Před rokem +2

      That's not quite true. People in the comments seem to completely misunderstand how string interning works. I think I need to make a video to explain this

    • @billy65bob
      @billy65bob Před rokem +2

      That 'string pool' applies to the entire App Domain, and cannot ever be emptied or truncated without unloading said AppDomain.
      It contains all constants and literals loaded within the assemblies thus far, and also anything that was explicitly added via String.Intern.
      However, unless you access it via specific paths (e.g. Enum.ToString), or explicitly call String.IsInterned or String.Intern, you will NOT be using these instances, but rather heap allocated copies, and to even call the latter methods, you're still very likely to be using a heap allocated copy to reference and compare.
      In summary, the biggest draw of this StringPool over the built in CLR one is that it can release its references to these strings, and allow them to be GC'd when no longer required or in use, a secondary draw is that you can use Spans to avoid needless allocations.

  • @tofu1687
    @tofu1687 Před rokem

    9:23 : I'm going to change this to release mode - miss click - no ? sorry wasn't looking straight

  • @user-qp8bt7gq4b
    @user-qp8bt7gq4b Před 6 měsíci

    I thought string.Intern does the same. Extract an existing string from the pool or adds one. It's time for me to reread the documentation...

  • @QwDragon
    @QwDragon Před rokem

    Have you seen overload with startIndex?
    public int IndexOf (string value, int startIndex);
    public int IndexOf (string value, int startIndex, int count, StringComparison comparisonType);
    Also replacing of literal string with Span doesn't have sence as literal string is in memory once and forever, but span is recreated each time.
    string GetHost(string url)
    {
    var i = url.IndexOf("://", StringComparison.Ordinal);
    i = i == -1 ? 0 : i+3;
    var j = url.IndexOf('/', i);
    return j == -1 ? url.Substring(i) : url.Substring(i, j-i);
    }
    But actually they all are wrong as host must end not only on /, but also on ? and on #.
    I would've make a static array of these 3 chars and use IndexOfAny(ends, i).

  • @kamushekdev
    @kamushekdev Před rokem

    There is an unnecessary Substring call in the "good" implementation :D

  • @clashclan4739
    @clashclan4739 Před rokem

    aren't strings already pooled ?

    • @nickchapsas
      @nickchapsas  Před rokem +1

      Not really: czcams.com/video/Xeq8YGEyyp8/video.html

  • @dmitrykim3096
    @dmitrykim3096 Před rokem +4

    Could you switch to c++ at this rate? The optimization is more about c++, c# is more about readability

    • @nickchapsas
      @nickchapsas  Před rokem +6

      String pooling is a language agnostic concept. On top of that there is nothing less readable in the final example. It's exactly the same as the previous example with a single line being different

    • @29Aios
      @29Aios Před rokem

      @@nickchapsas I don't think so. In C++ we have pointers to strings. Any mutation of strings in any thread is globally, but in C# it will be first copied, ie will be locally

    • @dmitrykim3096
      @dmitrykim3096 Před 8 měsíci

      ​@@nickchapsasHey love your videos, I just see that you make a lot of optimizations that are not ordinary for c# but a normal day in the office in c++.

  • @everythingiscoolio
    @everythingiscoolio Před rokem

    Is there honestly a problem here you're solving? I mean you could say "every byte is one too many" but does it *really* matter for any other purpose than to learn the internals? If I had a problem with slow services I'd just throw 4 extra nodes against the problem and call it a day. I worked on extremely high traffic APIs (millions of requests per hour) and these things have not once been a problem for us.

  • @snapback9221
    @snapback9221 Před 8 měsíci

    30% of ur vid show me the explorer and the ide, ppl Look vids on their smartphones.

  • @edkachalov
    @edkachalov Před 3 měsíci

    At first you create a language with GC, then you waste your time to make it fast.

  • @Lazzerman42
    @Lazzerman42 Před rokem +1

    if a developer in my team wrote the substring code instead of using URI - I'd kick him from the team :-) KISS is always better than mess.

    • @nickchapsas
      @nickchapsas  Před rokem +4

      👆 Great team lead over here. Wouldn't sit down and talk about it, but simply kick the developer from the team

    • @29Aios
      @29Aios Před rokem +1

      That's depends where this code is used. Say in high load application, when there are thousands/millions of function calls per second, it should be optimized as much as possible, even sometimes it would be preferable to move a code to unmanaged C++ lib, or even in ASM lib, where specific processor code is implemented.
      Just for ex. Medical Visual Systems company, here in Russia, was going to hire me. They write 4K/60 from 8 cams around medical operating room to their DataCenter with cams streams encoded by specific Intel CPUs (not GPUs) to H.264 or H.265 (don't remember which), they are going to provided high-tech medical operating room to Antarctica Vostok station soon.
      So there is no need in code optimization, which is not frequently used, it should be decided on customer demand and then code profiling, but code maintainability is very valuable

  • @alexisfibonacci
    @alexisfibonacci Před rokem

    I think GetHostBetter will be my preferred approach.