You Completely Misunderstand How Strings Work in C#

  • Added 12. 02. 2023
  • Check out my courses: dometrain.com
    Become a Patreon and get source code access: / nickchapsas
    Hello everybody, I'm Nick, and in this video I will explain how strings work in C# and introduce you to the concept of string interning, which, based on previous comments, is misunderstood by most people.
    Workshops: bit.ly/nickworkshops
    Don't forget to comment, like and subscribe :)
    Social Media:
    Follow me on GitHub: bit.ly/ChapsasGitHub
    Follow me on Twitter: bit.ly/ChapsasTwitter
    Connect on LinkedIn: bit.ly/ChapsasLinkedIn
    Keep coding merch: keepcoding.shop
    #csharp #dotnet

Comments • 113

  • @paddeeayen8442 a year ago +85

    That was an awesome video! Would it be possible for you to create one that explains the differences between "const", "readonly", and "static readonly" and when to use each of them? Additionally, a video on data types, their memory sizes, and how they operate would be great as well. I am trying to gain a deeper understanding of programming in C# without delving into assembly, and your videos like this one have been incredibly helpful in achieving that goal.

    • @wertzui01 a year ago +15

      "const int i = 42" will basically replace all occurrences of i with 42 in the compiled assembly. "readonly" and "static readonly" will just instruct the compiler to ensure that you cannot change it. You can change the value of "readonly" through reflection, because they are just variables. You cannot change the value of "const", because the value is directly baked into the assembly code.

    • @petrusion2827 a year ago

      @@wertzui01 You don't even need reflection to mutate readonly variables. You just need unsafe code. If ' i ' is a readonly integer, you just need Unsafe.AsRef(in i) = newValue; to change it.

    • @Briezar a year ago +2

      readonly = call through instance, static readonly = call through class

    • @klocugh12 a year ago +4

      Since const replaces occurrences, you need to be careful with public consts, as other assemblies using them will need to be recompiled as well.

    • @AgentFire0 a year ago +7

      Use `const` if a value is never to be changed at run time, or `readonly` if it is never to be changed after some initialization (usually through the constructor).
      Use `static` if it is not instance-dependent. Constants are always assumed to be static since they are unmodifiable.
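The difference this thread describes can be seen in metadata. Below is a minimal sketch, assuming `int.MaxValue` as a stand-in for any `const` and `string.Empty` as a stand-in for any `static readonly` field; the variable names are made up for illustration.

```csharp
using System;

// A local const: the compiler substitutes the value wherever Answer is used.
const int Answer = 42;
int doubled = Answer * 2;

// int.MaxValue is declared as a const, string.Empty as a static readonly field.
var constField = typeof(int).GetField(nameof(int.MaxValue));
var readonlyField = typeof(string).GetField(nameof(string.Empty));

// A const shows up as a literal (and is implicitly static).
bool constIsLiteral = constField != null && constField.IsLiteral;
bool constIsStatic = constField != null && constField.IsStatic;

// A readonly field is an ordinary field flagged as init-only, not a literal.
bool readonlyIsInitOnly = readonlyField != null && readonlyField.IsInitOnly;
bool readonlyIsLiteral = readonlyField != null && readonlyField.IsLiteral;

Console.WriteLine($"doubled = {doubled}");
Console.WriteLine($"const: literal={constIsLiteral}, static={constIsStatic}");
Console.WriteLine($"readonly: initOnly={readonlyIsInitOnly}, literal={readonlyIsLiteral}");
```

Because a const is copied into every call site, changing a public const only takes effect in other assemblies after they are recompiled, which is the caveat raised above.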

  • @DeserdiVerimas a year ago +21

    This is my favourite kind of video -- revealing some underlying part of the language which is applicable everywhere

  • @ArnonDanon a year ago +25

    This is why I like watching your videos, even when talking about basic stuff like strings I can learn something new and just know you respect our time and knowledge and teach all levels something new ... i didnt know the Intern function or even the name of that compiler feature(interning), i did know what its doing behind the scene though. This is a great reference for this topic, so big 10x🙏🏼

  • @FriendlyYoda a year ago +2

    This was an awesome and really educational video. This is the first time I'm actually hearing about string interning. I'm a junior C# dev and am trying to take that next step by always thinking about performant code and knowing the implications of what I'm actually doing. Content like this is a gem. Super straight to the point and informative.
    Thanks Nick!

  • @KibbleWhite a year ago +7

    This has seriously answered so many questions for me! Mainly around the change in the applications behaviour when you commented-out the final two lines towards the end of this example, and that the var nickChapas became null. I have experienced changes in an applications behaviour by adding and removing variables in other parts a method and always wondered why, now I know!! SO helpful, thanks for answering this age old question of mine. ❤ 🔥

  • @Dummyerer a year ago +4

    I watch your videos even if I think I know the answers. Sometimes I learn a lot other times a little. But it always something and it's worth it. Thanks.

  • @jamesmussett a year ago +5

    I had some optimisation work a few months back where I had to manually intern some strings we were allocating when parsing NMEA messages from sensors.
    I had absolutely no idea this was happening under the hood. If I had had this insight back then, it would have made my life a hell of a lot easier 😂
    Great video by the way! You’re bringing to light topics that even most of us seniors struggle to grasp.

  • @lptf5441 a year ago +3

    Thank you so much Nick! I love your content so much, as you have such an amazing ability to explain complicated topics in a really logical, easy to understand way. I hope you won't mind if I make a tiny suggestion. For important terms like "interning", maybe you could flash the term as text on the screen so people can see it written? It can sometimes be difficult to catch what is being said when you don't know the term already. I don't mean to say that you're difficult to understand, as your English is exceptionally good. It happens with native speakers as well. Thank you so much again for your fantastic content!

  • @carldaniel6510 a year ago +1

    An important topic that's widely misunderstood. I've only forced interning one time, in an application that deals with 10's of millions of strings, many of which are short and many of which are identical. In this application, forced interning cut memory usage massively and also improved performance significantly. The documentation on String.Intern explains the runtime mechanism quite succinctly, but since most developers don't know it even exists, relatively few have read it, I suspect!

  • @lexer_ a year ago +2

    I knew about the general idea of immutable strings and the internal optimizations to reuse constants if possible because string constants work very similarly in most other languages too but I had no idea that it was called interning in C# and that you can manually force a string to be interned like this. It is really surprising how many really dangerous tools C# has that absolutely allow you to shoot yourself in both feet and knees at the same time. And it is even more surprising how rare these kinds of mistakes seem to be in production code especially compared to C++.

  • @hallowedbythyframe a year ago +2

    These lower level vids on the CLR are great since I don't use ASP at work, top tier stuff

  • @dwhxyz a year ago +2

    Very good video. Another helpful video would be to discuss why making strings immutable is a good idea and the huge issues/bugs that would occur if it wasn't. Of all the extremely clever/experienced C# developers I've met in the last 20+ years not one has been able to answer that question and struggle to understand the reasons why when I explain it to them!

  • @robertsilver1296 a year ago

    Learnt a lot, as usual, thank you!

  • @AlanDias17 a year ago

    Loved it! Very informative

  • @nickz6066 a year ago

    Thank you! It was very useful!

  • @dire_prism a year ago

    I wasn't aware of C# using const expressions like that. Pretty cool, and definitely something I need to remember for the future.

    • @dire_prism a year ago

      I did know string literals were being interned. And it makes sense that the compiler performs the concatenation at compile time if the arguments are const, so shouldn't surprise me :)
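A minimal sketch of the const-folding behaviour described in this thread. The char array stands in for any string assembled at run time, and the variable names are illustrative only.

```csharp
using System;

const string Nick = "Nick";
const string Chapsas = "Chapsas";

// Both operands are compile-time constants, so the compiler folds the
// concatenation into the single literal "NickChapsas".
string folded = Nick + Chapsas;
bool foldedSharesLiteral = object.ReferenceEquals(folded, "NickChapsas");

// Assembled at run time from characters, so a separate string object is allocated.
string built = new string(new[] { 'N', 'i', 'c', 'k', 'C', 'h', 'a', 'p', 's', 'a', 's' });
bool builtSharesLiteral = object.ReferenceEquals(built, "NickChapsas");

// The text is still equal even though the objects differ.
bool sameText = built == folded;

Console.WriteLine($"folded shares literal: {foldedSharesLiteral}");
Console.WriteLine($"built shares literal:  {builtSharesLiteral}");
Console.WriteLine($"same text:             {sameText}");
```

The folded value and the literal end up as the same string object, while the run-time string only matches by content.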

  • @DeadDad1 a year ago

    Excellent video! Thank you!

  • @Anequit a year ago

    Very very useful, thank you nick!

  • @FunWithBits a year ago +1

    Thanks. I didn't know the CLR merged similar strings in some cases or that String.IsInterned() even existed. Great to know!

  • @urbanelemental3308 a year ago

    I knew all that because I read Pro Mem management book. What I learned later was the use of string pools. I knew that interning strings is a rare case that you have to be careful about. But once I saw the proper use of string pools with parsers: mindblown.

  • @jongeduard a year ago +2

    Yep, basically I knew almost everything, but except what you show at 7:26. I did not expect those 2 WriteLines with those comparisons to actually make the difference for the compiler to intern the string or not.
    There's one thing I would like to note: When you declare const variables, the C# compiler does a literal copy-paste of the values just like a macro and generates IL that way. This is quite different from the other examples. A bit of trying on Sharplab with your examples demonstrates that well.
    As such, it looks like the actual interning is not done by just the C# compiler, but rather by the JIT compiler at runtime. Can you confirm this?

  • @tsietsiramakatsa7429 a year ago

    I did not know about string interning. Nice to know what optimizations are available for strings.

  • @being_aslam_tiger a year ago

    Thanks for making this video. Please keep making videos.

  • @pqsk a year ago +1

    I knew about all of this except for the Intern methods. Those are interesting. I never actually read about this, but when I was in school we dove deep into Java and learned how strings work, so when I learned C# at school it was mainly to build websites and they didn't go deep into that. I just assumed strings worked the same way and I always treated them this way. StringBuilder is always my friend when working with any kind of strings. I've seen a lot of junior and mid-level devs just doing lots of concats and wondering why their code uses so much memory. lol

  • @dukefleed9525 a year ago +1

    thanks Nick, I've never thought that the compiler cannot intern string concatenation until the originating parts are marked as constant, I always supposed it worked even without "const", i've never stopped reflecting on it until now even if i was well aware of the behavior of concatenating string with other types (i've discovered it in the hard way long time ago). For some reason i have assumed it wrong for i don't know which reason. Even if it is normal that every other type before being concatenated is transformed with ToString implicitly and therefore at the end it is always two strings! Also, I was sure that with "const" the trick works, but i've wrongly extended my mental model to string variables that are not const :| thanks again.

  • @rockymarquiss8327 a year ago

    Very informative !

  • @billy65bob a year ago +1

    Just thought I'd mention it for completeness, but Trim and Concat don't always make new allocations.
    A Trim that removes 0 characters will return the original string.
    Likewise a Concat that only adds nulls or empty strings will also return the original string.
    The only exception to the latter is doing a concat on 2 nulls; the result is string.empty.
    I've worked on projects that massively benefit from these optimisations, as the developers of yore decided to null guard most things string related by concatenating an empty string.
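The no-allocation cases listed in this comment can be checked with reference comparisons. This is a minimal sketch; treat the results as behaviour of current runtime implementations rather than a documented guarantee, and note that the variable names are made up.

```csharp
using System;

// Built at run time so nothing here is folded by the compiler.
string original = new string('x', 3);

// Trim with nothing to remove hands back the same object.
bool trimReturnsSame = object.ReferenceEquals(original, original.Trim());

// Concat with an empty or null second operand hands back the first operand.
bool concatEmptyReturnsSame =
    object.ReferenceEquals(original, string.Concat(original, string.Empty));
bool concatNullReturnsSame =
    object.ReferenceEquals(original, string.Concat(original, default(string)));

// Concat of two nulls yields the empty string rather than null.
string bothNull = string.Concat(default(string), default(string));
bool bothNullIsEmpty = bothNull != null && bothNull.Length == 0;

Console.WriteLine($"Trim returns same instance:          {trimReturnsSame}");
Console.WriteLine($"Concat with \"\" returns same instance: {concatEmptyReturnsSame}");
Console.WriteLine($"Concat with null returns same:       {concatNullReturnsSame}");
Console.WriteLine($"Concat(null, null) is empty:         {bothNullIsEmpty}");
```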

  • @funnymememoments a year ago

    Hey Nick. What materials do you browse to know such a things? It is awesome

  • @dilshodkomilov8649 a year ago

    Thank you, awesome video. I had a question about string.Empty.
    I was thinking that instead of setting "" we should always use string.Empty, but according to this video it seems we don't need it.
    Will string.Empty behave the same way?
    For example:

    string myEmptyString = "";
    string mySecondEmptyString = string.Empty;

    Does it allocate two strings in memory (1 for string.Empty and 2 for "")?
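One way to explore the question above is to compare the two values directly. This is only a sketch: the value comparison is certain, but whether both variables point at the same object is something to observe on your own runtime, since I am not aware of a documented guarantee either way.

```csharp
using System;

string fromLiteral = "";
string fromField = string.Empty;

// Value equality always holds: both are zero-length strings.
bool sameValue = fromLiteral == fromField;

// Reference identity is the interesting part; print it rather than assume it.
bool sameInstance = object.ReferenceEquals(fromLiteral, fromField);

Console.WriteLine($"same value:    {sameValue}");
Console.WriteLine($"same instance: {sameInstance}");
```

If the second line prints True on your runtime, only one empty string object is involved and the choice between "" and string.Empty is purely stylistic there.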

  • @mishalitvinenko5136 a year ago

    You should definitely mention the Flyweight pattern in this video.

  • @Chiramisudo 8 months ago

    If the reference being the same for different variables made you nervous too, you should know that they get different references when the value changes: because strings are immutable, reassigning the value forces a new object to be created.
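A minimal sketch of that point, with illustrative variable names: two variables start out sharing one string object, and "changing" one of them rebinds it to a new object while the other is untouched.

```csharp
using System;

string first = "Nick";
string second = first;   // both variables reference the same object
bool sharedBefore = object.ReferenceEquals(first, second);

// += does not modify the existing string; it builds a new one and rebinds 'second'.
second += " Chapsas";
bool sharedAfter = object.ReferenceEquals(first, second);

Console.WriteLine($"shared before: {sharedBefore}");
Console.WriteLine($"shared after:  {sharedAfter}");
Console.WriteLine($"first is still \"{first}\", second is now \"{second}\"");
```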

  • @Andrew90046zero a year ago

    I love hearing these videos about string, because they do indeed cause the most heap trouble.
    It makes me wonder if there is a super secret way where you can make an array and then mutate it through some unsafe API :o ;)

  • @godfreyofbouillon966 a year ago

    Thank you, that's great to know. So instead of comparing strings we should check if the reference is the same, for performance reasons? Wouldn't this be considered a hack? And why does string comparison not do the same internally instead of comparing every char (if I got it right)?

  • @shikhaaggarwal5796 a year ago

    Very interesting!

  • @andytroo a year ago

    Java has a string compacting garbage collector, where if the sweep notices that 2 strings are the same, it updates them to the same (disposable) reference, and only keeps one of them.

  • @petrusion2827 a year ago

    I knew most of this, the only thing I never realized is that if the strings being concatenated are const the concatenated result will be interned as well. I should have known though, since creating const strings by concatenating other const strings is allowed and C# generally tries to evaluate expressions with consts and literals at compile time.

  • @haxi52 a year ago

    Another reason why it's a good idea to const all your string literals. Good video!

  • @arithex a year ago

    Turns out I did understand how string interning works, after all. (This hasn't changed since .NET v1, or since I can recall.)
    Was expecting maybe to learn some details eg. length-prefixing and null-termination.. UTF-16 vs UTF-8 in-memory representation.. how surrogate pairs are handled. Conversion during interop. Case-insensitive comparison in various locale contexts. (Maybe ideas for followup videos, if you haven't covered that already.)

  • @Mosern1977 a year ago

    Very nice video, thanks for the dirty details.
    That 'IsIntern' functionality seems extremely dangerous to me.

  • @protox4 a year ago

    In full Net Framework, string interning is actually shared across applications, so it was strongly discouraged to manually Intern strings which would never be released, even after your application ends. That's less of a problem with Net Core (Net 5+) since the runtime is isolated to the application, but you should still be careful, because once they are interned, the strings will never be released until your application ends.

  • @Shennzo a year ago

    About the last bit with IsInterned at 07:20 . Is it really because the compiler is "smart" that we get a True for reference or equality checks?
    My thought here is that because we gave the compiler the explicit value "NickChapsas" in the equality/ref checks (lines 8-9), then it stored and interned that string. Then, when calling IsInterned(nick + chapsas), we get something because of that interned explicit string value. Later when lines 8-9 are commented out, the compiler has no reason to intern the value "NickChapsas" because of what you already said earlier in the video

  • @honzajscz a year ago

    Great stuff, I was aware of it but didn't know the two methods. Could you please show the IL where the compiler interns the string? What is the startup time penalty for that? The

    • @phizc a year ago

      No startup penalty. AFAIK, the strings are stored in the dll and copied to memory when the program starts.

  • @millch2k8 a year ago

    That is a really interesting nuance in that the addition of 'const' allows the interning optimisation to occur. I'm surprised that 'nickChapsas' would not need to be marked as constant as well though.

    • @theMagos a year ago

      Since both parts are constant, the compiler can concatenate them at compile time rather than at run time, and is thus able to intern the result.

  • @alexclark6777 a year ago +2

    The coolest thing about this was that commenting out lines later in the execution of the program actually affected the result of lines further up, sort of like a backwards cause & effect situation.

    • @GumbootMan a year ago

      Yep that's because the initial intern pool (consisting of all the string literals in the program) is created at program startup, before the first line of (user) code runs.

    • @billy65bob a year ago

      @@GumbootMan The strings aren't actually interned until something requires it.
      Just try this out for comparison by swapping Method1 and Method2.

      using System;

      class Program
      {
          private const string cHelloWorld = "Hello World";

          // Builds "Hello World" without using a string literal, then asks the intern pool for it.
          static void Method1()
          {
              ReadOnlySpan<char> str = stackalloc char[] { 'H', 'e', 'l', 'l', 'o', ' ', 'W', 'o', 'r', 'l', 'd' };
              var helloWorld = String.IsInterned(new string(str));
              Console.WriteLine(helloWorld);
          }

          // Uses the literal (through the constant), which is what gets it interned.
          static void Method2()
          {
              var helloWorld = String.IsInterned(cHelloWorld);
              Console.WriteLine(helloWorld);
          }

          static void Main(string[] args)
          {
              Method1();
              Method2();
          }
      }

    • @GumbootMan a year ago

      @@billy65bob I don't think that's true. When I run your program I get a blank line from Method1 and then "Hello World" from Method2, which agrees with my original comment. "Hello World" is interned at startup because it's a string literal and cHelloWorld is pointing to that literal in the intern pool. The "stackalloc char[]" is not interned because it's a char array, not a string. The "new string(blah)" call never returns an interned string, as far as I know.

    • @billy65bob a year ago

      @@GumbootMan If you swap it so that Method2 runs first, both will output 'Hello World'. That was my point. The stack-allocated Char array there is just to ensure it doesn't use a string literal in Method1.

    • @GumbootMan a year ago

      @@billy65bob Hmm, you're right. When is the string interned then? I based my comment mostly on the docs: "The common language runtime automatically maintains a table, called the intern pool, which contains a single instance of each unique literal string constant declared in a program, as well as any unique instance of String you add programmatically by calling the Intern method." But maybe I was wrong in my assumption that this happened at startup...
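A smaller way to poke at the pool, independent of when literals get loaded: a minimal sketch using a made-up text ("qx7_zk") that is assembled from characters so it never appears as a string literal. `String.IsInterned` returns null until something puts that text into the pool.

```csharp
using System;

// Assembled from chars so this text never exists as a literal in the program.
char[] letters = { 'q', 'x', '7', '_', 'z', 'k' };
string candidate = new string(letters);

// Nothing has interned this text yet, so the lookup comes back empty-handed.
var lookupBefore = string.IsInterned(candidate);
bool pooledBefore = lookupBefore != null;

// Explicitly add it to the intern pool.
string pooled = string.Intern(candidate);

// A brand-new string with the same text now resolves to the pooled instance.
var lookupAfter = string.IsInterned(new string(letters));
bool lookupFindsPooled = object.ReferenceEquals(lookupAfter, pooled);

Console.WriteLine($"pooled before Intern: {pooledBefore}");
Console.WriteLine($"lookup after Intern returns the pooled instance: {lookupFindsPooled}");
```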

  • @Dimich1993 a year ago

    Cool, interning is new for me.

  • @vertxxyz a year ago +1

    I feel like you have missed a very important fact about interning in this, and that is that the query string still needed to be allocated to get the interned value. So you're not saving allocations in the short term, just long term memory usage.
    This can easily trip people up, where they think the allocation made with the string ops is somehow being avoided
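A minimal sketch of that caveat, using an invented "US-west" value: the string handed to `String.Intern` has to exist first, so the temporary is still allocated even though the call then returns the pooled instance.

```csharp
using System;

char[] letters = { 'U', 'S', '-', 'w', 'e', 's', 't' };

// First occurrence: goes into the intern pool.
string pooled = string.Intern(new string(letters));

// Second occurrence: this temporary is a real, separate allocation...
string temporary = new string(letters);
bool temporaryIsDistinct = !object.ReferenceEquals(temporary, pooled);

// ...and only after it exists can Intern swap it for the pooled instance.
string resolved = string.Intern(temporary);
bool resolvedIsPooled = object.ReferenceEquals(resolved, pooled);

Console.WriteLine($"temporary is a separate object: {temporaryIsDistinct}");
Console.WriteLine($"Intern returned the pooled one: {resolvedIsPooled}");
```

The saving is in what stays alive afterwards: the temporary becomes garbage as soon as only `resolved` is kept.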

  • @zsolt_saskovy a year ago

    @nickchapsas What about string.Empty? I was always told to use that over "". But it makes the code much worse to read imho. Is it also interned? If yes, does it still make sense to use string.Empty?

  • @chris-pee a year ago

    Nice video. But I don't like your first example with TrimEnd. Just because TrimEnd creates a new string, doesn't showcase that strings are immutable.
    It would be better to show what strings are allocated in memory after doing string2 += someOtherString, but you would probably have to fiddle with GC.

  • @IvanRandomDude a year ago +1

    For sure

  • @saeedbarari2207 a year ago

    5:51 Is that true though? The == operator in C# is a different implementation than the Equals() method. Not sure if they're implemented differently for strings though, but I guess that part is a bit misleading

    • @nickchapsas a year ago +1

      It is true, yes. The == operator is overloaded to call the Equals method.

    • @saeedbarari2207 a year ago

      @@nickchapsas 👍👍👍
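A minimal sketch of the distinction discussed in this thread, with made-up variable names: for `string` operands, `==` compares text just like `Equals`, while comparing through `object` falls back to reference comparison.

```csharp
using System;

// Two separate objects holding the same text.
string left = new string(new[] { 'N', 'i', 'c', 'k' });
string right = new string(new[] { 'N', 'i', 'c', 'k' });

bool operatorEqual = left == right;                        // string's overloaded ==, compares text
bool methodEqual = left.Equals(right);                     // compares text
bool sameReference = object.ReferenceEquals(left, right);  // compares object identity
bool objectOperatorEqual = (object)left == (object)right;  // object's ==, compares references

Console.WriteLine($"left == right:                 {operatorEqual}");
Console.WriteLine($"left.Equals(right):            {methodEqual}");
Console.WriteLine($"ReferenceEquals(left, right):  {sameReference}");
Console.WriteLine($"(object)left == (object)right: {objectOperatorEqual}");
```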

  • @manuelornato3722 a year ago

    I knew about string interning but I actually thought it was a CLR thing, not a compiler thing. So I would have sworn all strings, including strings composed at run time, were interned (and I have 20 years of C# behind me...😬)

  • @phizc a year ago

    3:14 Isn't it an optimization by the compiler? If not, then both "Nicks" would be stored in the dll/exe.

  • @ArrovsSpele a year ago +1

    Wait, what did you say about interning and memory bloat? Doesn't it garbage collect those strings after the scope ends?

    • @nickchapsas a year ago

      Check this out: learn.microsoft.com/en-us/dotnet/api/system.string.intern?redirectedfrom=MSDN&view=net-7.0#performance-considerations

    • @RawCoding a year ago

      strings live on the heap as any other object, interned strings (compile or runtime) will be added to an internally managed table by the runtime which are going to keep references to the strings. The choice of wording is "is not likely to be released" in reality these strings will never get GC'ed.

  • @terjes64 5 months ago

    This is a great vid. Too bad YouTube does not allow for 0.9x playback speed.

  • @Chiramisudo 8 months ago

    I have to ask how you discover these caveats. Reading the official docs, seeing something interesting, then playing around?

  • @circular17 a year ago

    Title is untrue for me, though I was surprised the same string content can be at two different memory locations. Not sure where I got the info that strings would always have the same address when equal.

  • @neopiyu a year ago

    Why are they immutable? can you make a video on it?

  • @Wittgensteinien a year ago

    Interesting.

  • @MicroLosi a year ago

    If you want Junior to fail an interview, just ask him - why the lines of code return true. HaHa, Classic! )))

  • @vabhs192003 a year ago

    Proud to be the first one to view the video and comment on it. @Nick I deserve a reward of some sorts. :D. As always a big fan of your video. :D

  • @syriuszb8611 a year ago

    Ah yes, I am sometimes confused about strings because they are reference types but also immutable, so I forget what they really are.

  • @owensigurdson8610 a year ago

    It would be great if interning worked more generally on any immutable type.

  • @cocoscacao6102 a year ago

    Nick Chapsas name of the channel. Nick and Chapsas values for variables named nick and chapsas. I was almost expecting that he would eventually create a var named myVar... Such egomania...

    • @nickchapsas a year ago

      Thanks for the feedback, I’ll rename the channel

    • @cocoscacao6102 a year ago

      @@nickchapsas Just kidding, great video :P

  • @coldhands92 a year ago

    tbh IsInterned not returning boolean is kind of surprise

  • @xeroxeroxeroxeroxeroxeroxero

    here's a comment for your engagement. Cheers!

  • @DomainDrivenDesign a year ago +1

    quantum effects

  • @Ridwanqrn a year ago

    Hello Nick chipsas can you help with me how to use loops to datagridview columns in c# 👍

  • @dmitrykim3096 a year ago

    Can you explain how your string pool is better than this Intern mechanism?

    • @dmitrykim3096 a year ago

      I had to use string.Intern when reading billions of rows from the DB, it was creating a new string instance for the same database text (for example region "US" was allocated millions of times). Using string.Intern saved us.

    • @nickchapsas a year ago +2

      It's better because for strings like the ones in your comment (for example "US"), the string is kept in the pool but later disposed and garbage collected to make room for other strings. The intern pool is unlikely to be garbage collected.

    • @dmitrykim3096 a year ago

      @@nickchapsas thanks for a reply
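To make the contrast in this thread concrete, here is a minimal sketch of an application-owned pool. It is a hypothetical illustration built on a plain `Dictionary`, not the pool from the video or any library's API; `pool` and `GetOrAdd` are names invented for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical application-owned pool: the program decides how long entries live.
var pool = new Dictionary<string, string>(StringComparer.Ordinal);

// Returns the pooled instance for this text, adding the candidate if it is new.
string GetOrAdd(string candidate)
{
    if (pool.TryGetValue(candidate, out var existing))
        return existing;
    pool[candidate] = candidate;
    return candidate;
}

// Simulates reading the same region code from two database rows.
string firstRow = GetOrAdd(new string(new[] { 'U', 'S' }));
string secondRow = GetOrAdd(new string(new[] { 'U', 'S' }));
bool rowsShareInstance = object.ReferenceEquals(firstRow, secondRow);

// Unlike the runtime's intern pool, this one can be emptied, after which the
// garbage collector is free to reclaim strings nothing else references.
pool.Clear();
int remaining = pool.Count;

Console.WriteLine($"rows share one instance: {rowsShareInstance}");
Console.WriteLine($"entries left after Clear: {remaining}");
```

The trade-off is that eviction becomes the application's job; a real pool would likely need a size limit or expiry policy, which this sketch leaves out.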

  • @urbanelemental3308 a year ago

    **applause**

  • @tedchirvasiu a year ago +1

    Wrong. I only partially misunderstood how strings work in C#.

  • @TheOmokage a year ago

    This string magic immediately disappears if you upcast string to object like so: (Object)s2 == (Object)s1 // false

  • @volodymyrusarskyy6987 a year ago +1

    The world is full of miracles, if you don't read the docs :D So clickbait title!

    • @nickchapsas a year ago

      The docs don't tell you half of this behavior

  • @js6pak a year ago

    nou

  • @nertsch77 a year ago

    It should rather be called PriorityBag, because it does not preserve any ordering at all.

  • @daretsuki6988 a year ago

    Another episode from the series: Why you shouldn't be a software engineer.... because you might not know something. I got it... fine.

  • @clashclan4739 a year ago

    So string pooling was a lie 😒. Thought all strings are pooled, then later, if there isn't any reference, it will be garbage collected.

  • @fredhair a year ago

    You underestimate me with your video title you smug twit 😂(jk, I like your content, and I tolerate your condescending titles). To be fair I didn't know about IsInterned, can't think I'd use it really though. I think an idea for an interesting follow-on(ish) episode could be string.Create

    • @nickchapsas a year ago

      I actually have a video on string.Create: czcams.com/video/Kd8oNLeRc2c/video.html

    • @fredhair a year ago

      @@nickchapsas Oh, apologies, didn't know you'd already covered it. Maybe something else you could cover in more depth is utf8 strings (+the new u8 literals) and generally working with utf8 strings and the various optimizations and tricks with byte[], span & how best to work with json especially when it comes to custom (de)serialization (i.e. using utf8 reader / writer vs jsondoc, nodes and the trade-offs between different methods). Keep up the good work 👍

  • @kocot. 10 months ago

    SPOILER ALERT: if you ever heard of string interning, in C#, Java or anywhere else, you understand it just fine o_O. Nice content, but a terrible title

  • @PetrVejchoda a year ago

    I am one of those who thought the string interning is happening by default. I need to dig deeper into it.