How Much Memory for 1,000,000 Threads in 7 Languages | Go, Rust, C#, Elixir, Java, Node, Python

  • Published on 27 May 2023
  • Recorded live on twitch, GET IN
    / theprimeagen
    ty piotr!
    pkolaczk.github.io/memory-con...
    MY MAIN YT CHANNEL: Has well edited engineering videos
    / theprimeagen
    Discord
    / discord
    Have something for me to read or react to?: / theprimeagenreact
  • Science & Technology

Comments • 1.1K

  • @thedoctor5478
    @thedoctor5478 1 year ago +1819

    Using Python's asyncio for this test was the wrong thing to do. It's similar to what was done with NodeJS. Asyncio is an event loop, not a thread. Python has threading libs for threads.

    • @Kobrar44
      @Kobrar44 1 year ago +92

      multiprocessing xD no need for a benchmark, it would be just atrocious

    • @nikonyrh
      @nikonyrh 1 year ago +59

      @@Kobrar44 Yeah just run "multiprocessing.Pool(int(1e6))" and you are good to go :D Argh I hate python, but it is still my main language.

    • @just_a_random_
      @just_a_random_ 1 year ago +21

      @@nikonyrh Just curious, why do you hate Python?

    • @magicbob8
      @magicbob8 1 year ago +63

      But asyncio is faster because Python's multithreading is so bad, so it’s what people use. And it accomplishes the same things.

    • @ibrahimaba8966
      @ibrahimaba8966 1 year ago +32

      this is an IO-Task so asyncio is the good solution!
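
The distinction raised in this thread (asyncio tasks are entries in an event loop, not threads) can be checked directly. The following is a minimal Python sketch, not the benchmark's code; the helper names `idle`, `run_tasks` and `thread_ids_used` are made up for illustration. It starts many sleeping tasks and records which OS thread each one ran on.

```python
import asyncio
import threading


async def idle(delay: float) -> int:
    """Sleep without blocking, then report which OS thread ran this task."""
    await asyncio.sleep(delay)
    return threading.get_ident()


async def run_tasks(n: int, delay: float = 0.01) -> set[int]:
    """Start n concurrent tasks and collect the thread ids they ran on."""
    tasks = [asyncio.create_task(idle(delay)) for _ in range(n)]
    return set(await asyncio.gather(*tasks))


def thread_ids_used(n: int) -> set[int]:
    """Run the experiment on a fresh event loop in the calling thread."""
    return asyncio.run(run_tasks(n))


if __name__ == "__main__":
    ids = thread_ids_used(10_000)
    print(f"10,000 tasks ran on {len(ids)} thread(s)")
```

Every task reports the same thread id, so the memory being measured in such a test is the per-task bookkeeping rather than per-thread stacks.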

  • @jonathan-._.-
    @jonathan-._.- 1 year ago +970

    Comparing actual threads with async tasks seems kinda weird

    • @ccgarciab
      @ccgarciab 1 year ago +114

      And workers and a plain event loop. Terrible all around.

    • @MikyLestat
      @MikyLestat 1 year ago +42

      They are not the same, but having async tasks is a powerful functionality that isn't available in all languages. It is correct he wasn't comparing the same, but you could argue that he was comparing how you would achieve the same thing if you wrote it in each language

    • @lozanov95
      @lozanov95 1 year ago +11

      ​@@MikyLestat Depends, because with Python you will run on a single thread, but with go for example you will use multiple threads. If you are actually computing anything this will make a significant difference.

    • @MikyLestat
      @MikyLestat 1 year ago +3

      ​@@lozanov95 Exactly. I think that the reason for the comparison is to get an indication of how much memory (minimally) each programming language will use to achieve the same thing. Achieving the same thing in each language is translated to using the features and constructs of each language. Python is a great language, but it isn't the fastest. The global-interpreter lock (in addition to Python being interpreted in CPython) causes it to be slow.
      Just because Python doesn't really have multi-threading, it doesn't mean we shouldn't use multi-threading/tasks in other languages and then profile the memory footprint.

    • @davidstephen7070
      @davidstephen7070 1 year ago +2

      @@MikyLestat I think this is the wrong way to compare memory requirements, putting languages that only run on a single thread against multi-threaded ones. A garbage collector can queue work when threads are overloaded, so a faster process means lower memory. And consider tasks with a wide range of sizes, say the first task needs 20 KB and the 70th needs 1 MB: a larger initial heap responds better than starting at 50 KB and reallocating. It all depends on the user's hardware whether to favor CPU or memory. If memory is cheaper than CPU, spend memory; if CPU is cheaper, choose something like Go or Rust that reallocates frequently.

  • @nunograca2779
    @nunograca2779 1 year ago +725

    If I'm not wrong, C# uses a thread pool behind the scenes when using async/await, and it recycles threads. That's why in the first test it was way higher than the others. I think that was the thread pool being initialized with a bunch of threads.

    • @dziarskihenk8798
      @dziarskihenk8798 1 year ago +31

      this.

    • @3ventic
      @3ventic 1 year ago +50

      Yup. It always allocates a fixed size pool of managed threads depending on the system it's running on, unless you set the size yourself, which is possible and would be separately interesting for this benchmark.

    • @MikyLestat
      @MikyLestat 1 year ago +71

      @@3ventic The ThreadPool default is much smaller, it shouldn't take 120 MB at idle. I'm betting he wasn't distinguishing between allocated and committed memory.

    • @GabrielSantAna-sm9zh
      @GabrielSantAna-sm9zh 1 year ago +25

      as far as I know, C# also compiles the async methods to stateful classes, so it generates the states of each “step” of processing beforehand, when you create that amount of tasks you are basically creating a list of super small instances in a queue to the threadpool to consume until the next state (await) and throw again in the end of the queue

    • @3ventic
      @3ventic 1 year ago +8

      ​@@MikyLestat I was a bit mistaken, but there is a fixed minimum number of threads (ThreadPool.GetMinThreads). On my system it's 32 by default and the equivalent program on my system (1 task) takes up 195M RES 108M SHR while a million tasks is using 52 threads and 472M RES 23M SHR.

  • @hansenchrisw
    @hansenchrisw 1 year ago +528

    As a Java apologist, it first got virtual threads in 1997 with version 1.1 (edit: later removed and recently re-added in v 19). Also, Java (and presumably .NET) pre-allocates a bunch of memory by default. Hence mem looks high for small numbers of threads and doesn’t increase until you hit bigger numbers.

    • @Talk378
      @Talk378 1 year ago +50

      Yep, rare prime L

    • @elraito
      @elraito 1 year ago +24

      Yes, but I ran the same code AOT-compiled for C# and it's only 5 MB baseline. The blog author misrepresented C# badly.

    • @hansenchrisw
      @hansenchrisw 1 year ago +26

      @@elraito no doubt, but I don’t expect someone to be proficient at all those langs/runtimes.

    • @giuliopimenoff
      @giuliopimenoff 1 year ago +5

      That's why they should have used Kotlin coroutines

    • @mishikookropiridze5079
      @mishikookropiridze5079 1 year ago +1

      ​@@elraito That's the variation introduced by running it locally.

  • @casperes0912
    @casperes0912 1 year ago +95

    There's also the memory vs. speed tradeoff. Sometimes keeping more things in memory can also make it faster. If the managed environments that have a higher starting point in memory usage already have a bunch of kernel threads lying dormant in a thread pool, that takes up memory but speeds up spawning of threads.

    • @cakedon
      @cakedon 5 months ago +9

      if my hello world doesnt use 27 gigabytes of ram i wont write it

    • @maximumcockage6503
      @maximumcockage6503 5 months ago +3

      Yeah. Bun.js was priding itself on being faster than Rust in its beta. Then when it came out and people started benchmarking, it was slightly faster than Rust by like a few percent, but used 40 times more memory on average.

  • @Deemo_codes
    @Deemo_codes 1 year ago +226

    Each Elixir process spawns with a 50k heap, and garbage collection happens at the per-process level (you don't stop the world, you stop a process). This is because processes in Elixir are used the way microservices are used: each process does a small amount of stuff, then sends a message on to another service.
    The Erlang VM that Elixir runs on launches 1 scheduler per CPU and does preemptive multitasking. So if you had 1 million processes doing stuff, each process would execute for a few ms, then be switched out and added back into the queue that the schedulers pull from. So if you have more cores you get more parallelism; if you only have 1 core you still get concurrency.
    Whereas async runtimes tend to be cooperative and require some form of explicit yielding from a running task, Elixir will just swap stuff out. That makes it good for soft real-time stuff. If you want to do CPU-intensive things, you can delegate to NIFs (native implemented functions) written in C or Rust. The Rust ones tend to be safer since panics are caught and raised as errors in Elixir, whereas a crash in C will take down the whole VM.

    • @Overminddl1
      @Overminddl1 1 year ago +17

      You can also specify the memory usage of a process on the BEAM VM, thus significantly reducing the amount of memory something will use when it's spawned and doesn't really allocate anything, like in this case.

    • @madlep
      @madlep 1 year ago +17

      And to do a test closer to what some of the other runtimes are doing, just call :timer.send_after(10000, :done) a million times, and then do a loop to receive :done 1 million times. Takes about 200mb instead.

    • @genericjam9866
      @genericjam9866 6 months ago +4

      Elixir / Erlang processes have far less memory by default. More like 256 bytes but depends on word size on your system iirc.

    • @nyahhbinghi
      @nyahhbinghi 5 months ago +1

      really smart GC model! Elixir was very well designed

    • @nyahhbinghi
      @nyahhbinghi 5 months ago +2

      I wouldn't compare it to microservices. I would just say Elixir processes are independent and don't share memory. Which really makes it unique (I don't know of another runtime like this except Node.js webworkers).

  • @shreyassreenivas4786
    @shreyassreenivas4786 11 months ago +69

    Go reserves 4K of memory for each thread's stack so you could do quite a bit of work on each of those threads without incurring further costs.

    • @demyk214
      @demyk214 11 months ago +2

      Makes sense

    • @-rate6326
      @-rate6326 5 months ago +5

      goroutines aren't threads.

    • @tablettablete186
      @tablettablete186 2 months ago

      @@-rate6326 Yeah, Go actually creates all threads at startup and just assigns goroutines to them.
      All of this to say: it's a thread pool lol

  • @ThePhoenixProduction
    @ThePhoenixProduction 6 months ago +162

    Where is c++?

    • @ErickBuildsStuff
      @ErickBuildsStuff 1 month ago +9

      None cares😅

    • @SowTag
      @SowTag 1 month ago +118

      @@ErickBuildsStuff Ah yes, no one cares about one of the most important and influential programming languages in all of computing history

    • @InternetExplorer687
      @InternetExplorer687 1 month ago +41

      @@SowTag I'd argue that C is more influential, but yeah, saying no one cares about the language most used in performance-critical applications that also need low-level access to memory is a really big stretch.

    • @jstro-hobbytech
      @jstro-hobbytech 1 month ago +3

      This guy reminds me of yongyea. Parrots other's work and makes more than the authors combined. He has no insight or original opinions or educated insight (from experiences academic or otherwise).
      I hate how people raise this guy up.
      Agreed on c++. That's my personal preference as I like the syntax being I learned it the same term I took cobol, Java (when it was new), visual basic and oop was still being defined.
      I've never worked in industry as a programmer but keep up to a middling ability.
      One thing I do know is that bullshit always smells like bullshit and this dude is full of it. People that talk during react videos do so only to fall under fair use, I see the same here transposed to a topic he is novice. Want for choice as mediocrity's excuse is no less evident than an untrained hand on display for no person's betterment or an opiate of excuse to be subject for one not turning to their purpose.
      I'm as wrong as apt to be right so there's that as well.

    • @idkwhatcouldbeavalable
      @idkwhatcouldbeavalable 1 month ago

      ​@@jstro-hobbytech I personally use Rust as it keeps some of the cpp syntax and adds on top of it to prevent common mistakes.

  • @devotiongeo
    @devotiongeo 1 year ago +233

    Creating a million concurrent "tasks" (or spawning processes as we call them in Erlang/Elixir) and allowing them to remain idle is one thing, while making those processes actually do something, such as each one of them having a persistent connection to a client and feeding it, is something entirely different. In practical terms, when it comes to real-time apps, the BEAM (Elixir/Erlang) outperforms all other languages by a significant margin.
    This is precisely why Brian Acton and Jan Koum chose Erlang for WhatsApp after years of experience with Yahoo Messenger and Yahoo Chat Rooms. If someone hasn't had the opportunity to work with any BEAM language, the above statement may appear to them as an empty boast, and I can't blame them for that.

    • @ThugLifeModafocah
      @ThugLifeModafocah 11 months ago +3

      But then this example needs to be done and shown to the world, like the one Primeagen is reacting to. I'm surprised by Elixir's performance here... in a bad way.

    • @xbmarx
      @xbmarx 11 months ago +39

      @@ThugLifeModafocah I'm not. Erlang processes are completely isolated. COMPLETELY. Every "task" has a separate GC, memory space, everything.

    • @szymonbaranowski8184
      @szymonbaranowski8184 11 months ago +7

      @@xbmarx So if things crash, only those things crash. That's a feature in itself.

    • @Aaku13
      @Aaku13 11 months ago +24

      The BEAM is pretty quick, but it won't "outperform all other languages by a significant margin". Ran several huge elixir services in production with lots of traffic and our Go services were much more performant.

    • @osazemeusen1091
      @osazemeusen1091 9 months ago +8

      @@Aaku13 I can agree only for CPU-bound tasks. For IO-bound tasks, Golang doesn't come close to Elixir in performance.

  • @bryanenglish7841
    @bryanenglish7841 1 year ago +212

    You forgot the extra Rust thread it takes to track all the bullshit drama in the Rust community

    • @Marhaenism1930
      @Marhaenism1930 1 year ago +14

      oopsy! is it new feature of crablang in 2023?

    • @BlackistedGod
      @BlackistedGod 1 year ago +4

      dammit why did I laugh so hard on this

    • @JensRoland
      @JensRoland 11 months ago +13

      The Rust forums are just clogged with unproductive / outdated discussions that lead nowhere and make it harder to get anywhere as a community. The mods should simply go through all the threads once in a while and nuke the ones that are no longer relevant or helpful so the good stuff can get more space and everything would run smoother. Maybe they could even automate this with an LLM agent? They could call it “RustScheduledGarbageRemover”

    • @juniuwu
      @juniuwu 4 months ago +6

      @@JensRoland Garbage Collector? BAN

    • @JensRoland
      @JensRoland 4 months ago +8

      @@juniuwu banning people is just garbage collection for communities ;-)

  • @diadetediotedio6918
    @diadetediotedio6918 1 year ago +254

    C# was the winner, clearly everybody was expecting this

    • @sanampakuwal
      @sanampakuwal 1 year ago +6

      yes

    • @shreyasjejurkar1233
      @shreyasjejurkar1233 11 months ago +22

      Of course, kudos to .NET runtime team! 😎

    • @mattymerr701
      @mattymerr701 10 months ago

      Clearly they fucked their setup
      [Insert cope here]
      To be fair, they did fuck it but...

    • @cnikolov
      @cnikolov 10 months ago +7

      Running as AOT has even smaller footprint

    • @FilipCordas
      @FilipCordas 10 months ago +12

      Also he wasn't using ValueTask, which reduces memory consumption considerably. But I hate tests like this because a compiler could remove everything, since the code isn't doing anything.

  • @TanigaDanae
    @TanigaDanae 1 year ago +129

    One thing that wasn't mentioned in the video: async functions in C# are state machines, and Tasks (part of the Task Parallel Library) are automatically run on thread pools. So the only internal state these async functions have is the time they need to wake up, and all Tasks could theoretically have the same wakeup time.
    I would've loved to see a C# Thread implementation. I suspect the C# compiler is optimizing redundant Tasks away since they lack any side effects.

    • @vitskr1
      @vitskr1 1 year ago +10

      Thread pool has like 512 preallocated threads, hence high memory usage in idle. Tasks are actually running, but max degree of parallelism is 8 (8-thread CPU), so there is practically nothing to allocate.

    • @q1joe
      @q1joe 1 year ago +2

      @@vitskr1 You can tune this if you know your workload, though. Some languages I feel didn’t get the best showing here as the author isn’t an expert in each one, which is understandable.

    • @monad_tcp
      @monad_tcp 1 year ago +2

      @@vitskr1 Exactly what I suspected czcams.com/video/WjKQQAFwrR4/video.html . It's using the Server tuning; I think on Desktop the default is Number of Cores * 2.

    • @monad_tcp
      @monad_tcp 1 year ago +4

      @@vitskr1 512 threads * 512 KB = 256 MB. It's not that big of a deal for servers with lots of cores.

    • @bangonkali
      @bangonkali 8 months ago

      @@monad_tcp I agree. And IRL if you plan to launch 1M concurrent tasks you probably have the RAM to match. I still don't think many people do this in a single process anyway. It's probably better to distribute the workload across multiple servers. I recommend Orleans 7 for C# devs. 😅

  • @Hallo503
    @Hallo503 10 months ago +138

    C# has the lowest memory usage because it is using the threadpool, that recycles blocking threads, like when calling Task.Delay. So there aren’t actually a million threads created but rather they are queued into the threadpool. To avoid this create the threads explicitly

    • @user-qu5cc5oe2h
      @user-qu5cc5oe2h 7 months ago +61

      pff... everyone knows that c# offloads 50% of tasks on Azure servers

    • @dieSpinnt
      @dieSpinnt 7 months ago

      @@user-qu5cc5oe2h ROTFL.
      As a first time viewer I asked myself if ThePrimeTime is always on that level of cocaine?
      Well, its something different than other coding channels. A fresh breeze, so to say .... **g**

    • @muaathasali4509
      @muaathasali4509 6 months ago

      @@user-qu5cc5oe2h free compute hack

    • @qendrimimeri8561
      @qendrimimeri8561 5 months ago

      ​@@user-qu5cc5oe2h😂

    • @gregorymorse8423
      @gregorymorse8423 3 months ago +2

      No shit, Sherlock, all of the languages were using threadpools except Java and Rust with real worker threads. So you've failed to uniquely qualify C# altogether.

  • @SirBearingtonSupporter
    @SirBearingtonSupporter 11 months ago +12

    You actually pointed this out early on. In the Java and C# version, he uses "ArrayList" without specifying the size.
    ArrayList in both these languages hold an actual Array object. It's why the lookup time for "get" is a memory address lookup time.
    When Java needs to expand the array size, it creates a larger array that is twice the size of the current array size. I believe the default is 10.
    Java also doesn't run the garbage collector unless it needs to be run or specifically invoked with System.gc.
    Because the JRE doesn't plan ahead for your bad code, it just looks for a new place to put the object in memory, leaving all the old references that need to be deleted alone - because the GC will deal with it as needed.
    Just to recap there are several arraylist objects each holding an array of size n (below) in memory - and if the JVM is given enough memory, all 11 of these will still be there.
    So that means there are 20510 threads in memory on the test.
    While his approach to joining all the threads was barbaric, it's also the accepted answer on StackOverflow, we are not measuring the speed of the execution, just the memory of it.
    If you were not trying to measure the memory performance of threading in different languages, I would actually give Java more threads to manage the threads (parallel stream).
    Final thoughts:
    We aren't concerned about thread space in production equipment; we are concerned about execution time. If my entire program hangs because one calculation couldn't be done, I'm missing out on something important: it could be a trade, moving a servo for a robot (self-driving cars), or producing an input for a chess game. Collecting the information that I can allows me to implement an algorithm that is capable of making educated guesses based on what was calculated.
    If we do care about thread space, we would be better off doing single threaded applications since we don't have an overhead associated with the effing cost of the thread.
    TL;DR
    Something something short equal something something int because the JVM go fast blah blah addresses blah blah blah 4. (primitive array blah blah addresses, blah blah)
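
The growth behavior described above can be simulated to see how many backing arrays an unsized list discards on its way to a million elements. This is a rough Python sketch, not JVM code. It assumes a default capacity of 10 and OpenJDK's growth rule of old + old/2 (roughly 1.5x, rather than the doubling the comment mentions); doubling is included for comparison. The function name `count_resizes` is made up for illustration.

```python
from typing import Callable


def count_resizes(n: int, initial: int = 10,
                  grow: Callable[[int], int] = lambda c: c + (c >> 1)) -> tuple[int, int, int]:
    """Return (resizes, final_capacity, discarded_slots) after appending n items."""
    capacity, resizes, discarded = initial, 0, 0
    while capacity < n:
        discarded += capacity      # the old backing array becomes garbage
        capacity = grow(capacity)
        resizes += 1
    return resizes, capacity, discarded


if __name__ == "__main__":
    for name, rule in [("1.5x", lambda c: c + (c >> 1)), ("2x", lambda c: c * 2)]:
        resizes, capacity, discarded = count_resizes(1_000_000, grow=rule)
        print(f"{name}: {resizes} resizes, final capacity {capacity}, "
              f"{discarded} slots left for the GC")
```

Either way the discarded arrays hold references (a few bytes per slot), not threads, so pre-sizing the list trims garbage but is unlikely to dominate the measured footprint.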

  • @igordasunddas3377
    @igordasunddas3377 1 year ago +49

    Man I am allergic to empty catch blocks in Java - always. After looking for exceptions that have never been rethrown or really handled, I am really on the fence. Empty catch blocks should not exist or even be allowed...

    • @gregorymorse8423
      @gregorymorse8423 3 months ago

      You are allergic to using your brain, yes we know. Maybe if you knew what checked and unchecked exceptions are and stopped making dumb comments. This is why you should stop the drugs and go back to school, fool

    • @albertmagician8613
      @albertmagician8613 1 month ago

      I have no problems with empty catch blocks, as long as my compiler is allowed to optimize them away.

  • @bahtiyarozdere9303
    @bahtiyarozdere9303 5 months ago +2

    Thank you for sharing and commenting on this one. I would love to see C# with AOT compile. I believe it would make a huge difference.

  • @markusn4614
    @markusn4614 1 year ago +126

    That C# method has 2 extra layers, the code inside the for loop should just be tasks.Add(Task.Delay(TimeSpan.FromSeconds(10)));

    • @Eirenarch
      @Eirenarch 1 year ago +19

      This 👆
      They created threads to run their threads inside

    • @PetrVejchoda
      @PetrVejchoda 11 months ago

      @@Eirenarch No, it should not. If you did it the way you describe, the work (in this case represented by Task.Delay) would not be scheduled on the TaskScheduler and would instead be done on the thread this code is running on, thus blocking it and not using the CPU cores to their fullest.
      If anything, it should be Task task = Task.Run(() => Task.Delay(TimeSpan ...)); tasks.Add(task); This would save some memory while still scheduling the work on worker threads.
      I am not sure whether there would be any benefit if you used TaskFactory and the scheduler directly, or whether it would be more performant, but I highly doubt it.
      Task itself is a glorified coroutine and job child. It's just a promise of an action that can wait for other actions to complete. Task.Delay does not do anything with scheduling or threading. It just writes a timestamp and deposits the Task to run later, when the proper time has come. But it would not start a new thread/virtual thread/Task/coroutine. Since they are trying to figure out how costly scheduling a new thread/virtual thread/Task/coroutine is, this would not do the work.

    • @manpt123
      @manpt123 10 months ago

      c# and you are the 2 most useless stuffs

    • @FilipCordas
      @FilipCordas 10 months ago

      Also I don't see value tasks and the list doesn't have a buffer set.

    • @taqial-faris6421
      @taqial-faris6421 8 months ago +3

      I was looking for this comment. Guy who created that blog clearly knows nothing since he is using chatGPT and chatGPT also knows nothing if it outputs that kind of code... But hey, even my 'senior' coworker used to write async code like that so who am I to judge.

  • @wlockuz4467
    @wlockuz4467 1 year ago +26

    It should've been "To infinity and NaN" as an homage to JavaScript.

  • @Lyynx92
    @Lyynx92 1 year ago +5

    .NET pre-allocates a thread pool at startup, though the memory shouldn't be quite that high. Pretty sure it also uses a work-stealing scheduler under the hood for continuations and its async/await behavior. Also, if you want to further optimize for memory, the ValueTask struct does some caching cleverness to dodge Task allocations if the work is either already done or can be done synchronously. Given how simple the test is, the GC probably won't kick in as it can recycle a lot of those Task objects.

  • @W1ngSMC
    @W1ngSMC 1 year ago +67

    To be fair, Elixir is spawning new processes with their own memory and PID (inside the VM).

    • @isaacyonemoto
      @isaacyonemoto 1 year ago +25

      And also providing stuff for graceful restarts and an entire message queue

    • @BosonCollider
      @BosonCollider 1 year ago +13

      And preemptive scheduling, if any one of them fails or blocks indefinitely it cannot take the rest down with it.

    • @sukidhardarisi4992
      @sukidhardarisi4992 4 months ago +1

      Task.async in Elixir comes with a lot of boilerplate wrapped on top of GenServer. If the test is meant to measure concurrent tasks, one could go with primitives like spawn, send and receive in order to see the true potential. Just my opinion on why Elixir used a lot of memory.

    • @gregorymorse8423
      @gregorymorse8423 3 months ago +1

      It's not doing anything. The erlang process concept has nothing to do with threading. Sure it explains the memory usage, but there are ways to pool it so a maximum amount of processes could be spawned at any time.

  • @chigozie123
    @chigozie123 6 months ago +5

    The go results are not surprising. It's a well-documented feature that each goroutine starts with an initially pre-allocated stack size. Prior to go 1.2, it was 4kb, then it went to 8kb, and I believe it's now at 2kb for go 1.4+.
    So 2kb × 10k means an additional 20mb on start. At 100k, it means a minimum of 200mb on start.
    The math seems pretty consistent with the results we see for go, although they seem to suggest that initial stacksize may be closer to 2.7kb than 2kb.
    We also have to keep in mind that there is a garbage collector running in there, and we didn’t account for how much memory it requires to keep track of everything going on.
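
The arithmetic in this comment is easy to reproduce. The sketch below is plain Python, not Go; it assumes the 2 KiB initial goroutine stack the comment cites for current Go versions, and both function names are made up for illustration. The second helper works backwards from any measured total to an implied per-goroutine cost.

```python
def goroutine_stack_bytes(goroutines: int, stack_bytes: int = 2 * 1024) -> int:
    """Lower bound on stack memory: one initial stack per goroutine."""
    return goroutines * stack_bytes


def implied_stack_bytes(total_bytes: int, goroutines: int) -> float:
    """Work backwards from a measured total to bytes per goroutine."""
    return total_bytes / goroutines


if __name__ == "__main__":
    for n in (10_000, 100_000, 1_000_000):
        mb = goroutine_stack_bytes(n) / 1e6
        print(f"{n:>9,} goroutines -> {mb:,.1f} MB of stacks")
```

This is only a floor: scheduler bookkeeping, the timer each sleeping goroutine holds, and GC metadata all sit on top of the raw stack cost, which may account for the gap the comment notices.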

  • @NameyNames
    @NameyNames 9 months ago +26

    As likely already pointed out, C# uses a thread pool, and will definitely not create a gazillion threads in this test, and the memory required to house all of these insignificant tasks will be very small, which is apparent in the test results.
    I tried it out in LinqPad, but with one additional task whose only purpose was to keep track of the number of simultaneous threads actually in use. For 1 million tasks, the actual active thread count peak never even exceeded 50 on my system (usually much lower). No wonder, when all that the tasks are "doing" is async-waiting on a delay.
    This benchmark is broken in the sense that it doesn't really do what the author thinks it does, i.e. it does NOT create a lot of threads (virtual or otherwise) in all languages/runtimes, and measuring the memory usage is thus close to pointless.
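
The experiment described here (count how many threads are really in use while many small tasks run) can be imitated in Python. This is an analogy using `concurrent.futures.ThreadPoolExecutor`, not the .NET thread pool, and the names `work` and `distinct_workers` are made up for illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def work(_: int) -> int:
    """Pretend to do a little work, then report the worker thread's id."""
    time.sleep(0.001)
    return threading.get_ident()


def distinct_workers(n_tasks: int, pool_size: int) -> int:
    """Run n_tasks on a fixed-size pool and count the threads actually used."""
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        return len(set(pool.map(work, range(n_tasks))))


if __name__ == "__main__":
    print(distinct_workers(n_tasks=1_000, pool_size=8))
```

However many tasks are queued, the number of distinct threads never exceeds the pool size, which is the point being made about measuring "a million threads" on a pooled runtime.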

  • @tofaa3668
    @tofaa3668 8 months ago +2

    The issue with the java threads i feel like is not preallocating the array list, every time an arraylist gets appended it checks for the size and generates a new array. Which in this case would be a whole lot of arrays in memory for the gc to collect.

  • @stevenhe3462
    @stevenhe3462 1 year ago +12

    Elixir reserves 4kiB of RAM for each of its processes. Each process in Elixir has its own separate heap to eliminate the possibility of stop-the-world-GC.

    • @llothar68
      @llothar68 1 year ago +2

      Each Linux kernel thread needs 32kb (28kb of it are non swappable physical kernel stack space) + 1kb for kernel structures.

  • @woolfel
    @woolfel 11 months ago +3

    back in the JDK 1.3 days, the JVM would allocate 1MB per thread, but it was changed around 1.6/1.8, I forget exactly which release they fixed that. It's also important in Java to get the memory used, not memory allocated. The biggest issue with java for me is once the JVM allocates memory, it doesn't release it until you stop the JVM process.

  • @pinoniq
    @pinoniq 11 months ago +4

    If you want Node to actually use multiple threads, you need to tell libuv to use multiple threads. There is an env variable for this: UV_THREADPOOL_SIZE. Like you said, Node has an event loop. That's not multi-threaded. It's single-threaded with callbacks. That's why setTimeout is more of a 'minimum' guideline and not precise at all (under heavy loads). Just make a busy-wait program in Node and you'll see it only filling up a single core on your CPU.
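
The "timers are a minimum, not a guarantee" behavior is a property of single-threaded event loops in general. The sketch below shows it with Python's asyncio rather than Node (so it is an analogy, not libuv itself); `measure_timer_delay` is a made-up helper name. It arms a short timer, then busy-waits on the loop thread and reports when the timer actually fired.

```python
import asyncio
import time


async def measure_timer_delay(requested: float, busy: float) -> float:
    """Schedule a timer, hog the loop with a busy-wait, return the real delay."""
    loop = asyncio.get_running_loop()
    fired = loop.create_future()
    start = time.monotonic()
    loop.call_later(requested, lambda: fired.set_result(time.monotonic() - start))

    # Busy-wait without awaiting: the single loop thread cannot run the timer.
    while time.monotonic() - start < busy:
        pass

    return await fired


if __name__ == "__main__":
    actual = asyncio.run(measure_timer_delay(requested=0.01, busy=0.2))
    print(f"asked for 0.010s, callback ran after {actual:.3f}s")
```

The callback cannot run until the busy-wait returns control to the loop, so the observed delay tracks the busy period rather than the requested timeout.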

  • @madlep
    @madlep 1 year ago +22

    The Elixir solution has a LOT of room to squeeze out. I can get it running in about 990mb with some tweaks. Main thing is the default heap size. Passing `+hms 1` as part of `erl` options sets default size to 1 4-byte word. Also, using plain spawn calls instead of Task (which accumulates results, and adds extra memory and GC and processing overhead) reduces it further.

    • @mennovanlavieren3885
      @mennovanlavieren3885 1 year ago +4

      True, but as long as the "threads" don't actually do anything it is a useless comparison. The constructs on these platform all provide a different feature set, so comparing performance is bogus. I mean a C# Task is just one or a few objects waiting in several queues to be invoked by native threads in the thread pool with a job stealing algorithm. NodeJs and Python are single threaded with a single event loop. I don't know what the others do and give you for free, but this isn't apples to apples.
      (Edit: I automatically type thread with a capital T)

    • @madlep
      @madlep 1 year ago +10

      @@mennovanlavieren3885 Yup. The comparison is pretty meaningless. The "cheap", non-idiomatic Elixir way to do this would be to start 1,000,000 timers and wait for them to finish, effectively doing the same thing as some other platforms. I just tried that: it uses about 200mb of memory in total.
      If all it's doing is starting something that sits there idly for 10 seconds, there isn't much difference.
      No point carting round a whole isolated separate stack and heap for each process, and associated house keeping. Elixir processes are cheap, but they're not *that* cheap.

  • @TizzyD
    @TizzyD 1 year ago +5

    🤔 I concur with you Big P...let's look at some more real use cases. Going outside of the process itself will complicate analysis with other elements (e.g. DB, ORM, etc.) that should be held constant; however, there are good use cases to eliminate as much of the 7 layer stack as we can:
    1. Storage - with the good old random file manipulation, etc.
    2. Network - doing something more like a UDP listener to eliminate possible contamination with socket handling
    3. Memory - malloc, 😮multi-threaded data manipulation, release (to watch garbage collection)
    4. Compute - not all compute operations are math-based, but do some string parsing, concatenation, etc.
    I'm thinking we want to eliminate math computations because most of those operations will come down to the underlying math implementation vs. actual performance (e.g. Fortran being fast, etc.), but network issues could have the same impact. Consider the history of Java IO vs. NIO.

  • @Bourn77
    @Bourn77 1 year ago +58

    C# master race. Lets go.
    .NET team is optimizing the fu*k out of the stack for a few years.
    Hands down the best api backend language to work with. 🥰

    • @reddragon2358
      @reddragon2358 1 year ago +7

      I hope that it becomes so good that it could be used as a full-stack language.

    • @BosonCollider
      @BosonCollider 1 year ago +2

      @@reddragon2358 It does work fairly well together with HTMX

    • @reddragon2358
      @reddragon2358 Před rokem

      @@BosonCollider Oh, glad to hear. Java, for example, can already be used for full-stack development with the help of Java frameworks.

    • @mishikookropiridze5079
      @mishikookropiridze5079 Před rokem +3

      @@reddragon2358 That produces horrendous UI. Could be future using WASM.

    • @reddragon2358
      @reddragon2358 Před rokem +2

      @@mishikookropiridze5079 I heard that C# has UI frameworks. I hope that they get better with time.

  • @Jmcgee1125
    @Jmcgee1125 Před rokem +19

    15:11 Python, by default, only uses one worker thread. When writing asyncio code you do need to be careful that you don't block. My understanding is that each event loop may have only one worker, but I'm not experienced enough to be confident in saying that.
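The "don't block" point above can be illustrated with a minimal sketch. The function names (`cooperative`, `blocking`, `timed`) are made up for this example. It assumes nothing beyond the standard library and only shows that a blocking `time.sleep` inside a coroutine stalls every other task on the same event loop, while `asyncio.sleep` lets them overlap.

```python
import asyncio
import time


async def cooperative(delay: float) -> None:
    # Yields to the event loop while waiting, so other tasks can run.
    await asyncio.sleep(delay)


async def blocking(delay: float) -> None:
    # Blocks the only thread the event loop has; nothing else runs meanwhile.
    time.sleep(delay)


async def timed(worker, n: int, delay: float) -> float:
    # Run n copies of worker concurrently and return the wall-clock time taken.
    start = time.perf_counter()
    await asyncio.gather(*(worker(delay) for _ in range(n)))
    return time.perf_counter() - start


cooperative_elapsed = asyncio.run(timed(cooperative, 5, 0.1))
blocking_elapsed = asyncio.run(timed(blocking, 5, 0.1))
print(f"cooperative: {cooperative_elapsed:.2f}s, blocking: {blocking_elapsed:.2f}s")
```

With five tasks, the cooperative run should take roughly one delay and the blocking run roughly five, because the blocking sleeps execute one after another.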

    • @ShaneFagan
      @ShaneFagan Před 3 měsíci +1

      To expand on this a little more for people:
      1. They used asyncio which is just an event loop, there is no threading, just a loop that does the tasks in FIFO. The memory usage would be just the amount that stores the task information/statuses, it wouldn't have overhead from spawning threads
      2. Virtual threads in Python are in the threading module. They are limited to one core but can run in parallel and independent as you would expect from a thread.
      3. For proper hardware threads you have to use multiprocessing and it works very similar to other languages that use fork but with the added stuff like the ability to spawn a thread pool for batch processing and maybe limit the amount of threads to a number that wouldn't cause stability issues on the system.
      Also in Python3.12 there are some interesting changes related to the GIL which change how concurrency works in general with the ability to run code in basically another instance of Python. That will change mega high performance Python concurrency quite a bit in the future but as of right now it's one of the 3 above I described. Just note the blog post he is talking about is 1 which isn't parallel.
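A small sketch of the difference between option 1 and option 2 above, using only the standard library. The helper names (`run_tasks`, `run_threads`) are invented for illustration. It records which thread each unit of work runs on: asyncio tasks should all report the same thread, while `threading.Thread` objects (which in CPython are backed by OS threads whose bytecode execution is serialized by the GIL) each report their own.

```python
import asyncio
import threading


async def record_thread(seen: set) -> None:
    await asyncio.sleep(0.01)
    seen.add(threading.get_ident())  # thread that resumed this task


async def run_tasks(n: int) -> set:
    seen: set = set()
    await asyncio.gather(*(record_thread(seen) for _ in range(n)))
    return seen


def run_threads(n: int) -> set:
    seen: set = set()
    barrier = threading.Barrier(n)

    def work() -> None:
        # Wait until all n threads are alive so thread ids cannot be reused.
        barrier.wait()
        seen.add(threading.get_ident())

    threads = [threading.Thread(target=work) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen


task_thread_ids = asyncio.run(run_tasks(1000))
worker_thread_ids = run_threads(8)
print(len(task_thread_ids), "thread for 1000 asyncio tasks;",
      len(worker_thread_ids), "threads for 8 threading.Thread objects")
```

This says nothing about memory by itself, but it shows why a benchmark of asyncio tasks is measuring scheduler bookkeeping rather than thread stacks.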

  • @misterkevin_rs4401
    @misterkevin_rs4401 Před 11 měsíci +2

    C# Uses a thread pool behind the scenes with a default config of #X amount of threads depending on the system it's running, it's usually 20 if I remember correctly from my .NET days. What's interesting to me is how it can spin up more if required and scales correctly.

    • @FilipCordas
      @FilipCordas Před 10 měsíci +1

      Should be equal to number of cores you have available on the machine.

  • @Trekiros
    @Trekiros Před rokem +27

    Intro: let's not compare apples to potatoes
    The rest of the video: compares making threads with maintaining an event queue

  • @Mentox2
    @Mentox2 Před rokem +37

    9:30 - In the 19th century the German mathematician Georg Cantor proved that there must be more than one kind of infinity, such as the infinity of the natural numbers and the infinity of the real numbers, and that some infinities are larger than others. The smallest infinity is that of the natural numbers, and it's called Aleph Zero.
    So yes, Buzz can indeed go to infinity and beyond, so long it is mathematical infinity.

    • @ko-Daegu
      @ko-Daegu Před rokem +3

      Pretty cool. I remember studying this part of set theory and how it uses Alef (the first letter of the Arabic alphabet). The idea is that the set of natural numbers (1, 2, 3, ...) has the smallest infinite cardinality, which is denoted Aleph Zero (ℵ₀).

    • @JamieNeubertPedersen
      @JamieNeubertPedersen Před rokem +1

      Thanks. Was thinking the same.

    • @user-zt7gj5ff8n
      @user-zt7gj5ff8n Před rokem +3

      Nothing "and so on". That is not clear. In fact it can neither be proven nor disproven with standard mathematics. It is called the continuum hypothesis.

    • @mykhailonikolaichuk6392
      @mykhailonikolaichuk6392 Před 11 měsíci

      @@user-zt7gj5ff8n The continuum hypothesis is that there are no intermediary infinities between "infinity of integers" and "infinity of reals". It is, indeed, but an axiom. However, the cartesian product of a set with itself ALWAYS yields a set with higher cardinality, so infinitely many distinct infinities can be constructed by the repeated usage of it.

    • @d7ffab979
      @d7ffab979 Před 11 měsíci +1

      @@mykhailonikolaichuk6392 That is just wrong. Infinite cartesian products of natural numbers, for example, are "just" rational numbers.
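For reference on the cardinality claims in this exchange, the standard textbook results can be summarized as follows. This is a summary of well-known set theory, not something taken from the video or the article.

```latex
\begin{align*}
|\mathbb{N}| &= \aleph_0, \\
|\mathbb{N} \times \mathbb{N}| &= |\mathbb{Q}| = \aleph_0
  && \text{(finite products of countable sets stay countable)}, \\
|A| &< |\mathcal{P}(A)| = 2^{|A|}
  && \text{(Cantor's theorem, for every set } A\text{)}, \\
|\mathbb{R}| &= |\mathcal{P}(\mathbb{N})| = |\mathbb{N}^{\mathbb{N}}| = 2^{\aleph_0} > \aleph_0 .
\end{align*}
```

So it is the power set, not the cartesian product of a set with itself, that always produces a strictly larger cardinality. Iterating it gives the chain $\aleph_0 < 2^{\aleph_0} < 2^{2^{\aleph_0}} < \dots$. The continuum hypothesis is the statement that no cardinality lies strictly between $\aleph_0$ and $2^{\aleph_0}$, and it is independent of the ZFC axioms.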

  • @3x10.8_ms
    @3x10.8_ms Před rokem +25

    crab is fast and fox is slow

  • @metaphysicalconifercone182
    @metaphysicalconifercone182 Před rokem +108

    I wonder why Kotlin wasn't included. I guess it does share similarities with Java and Go, but its implementation of coroutines is supposed to be different from Go's. I guess testing it would also have to include both JVM and Native compile targets, because you never know.

    • @avalagum7957
      @avalagum7957 Před rokem +6

      If you include kotlinx library, you should add Scala Actor, ZIO ... too.

    • @DeliOZzz
      @DeliOZzz Před rokem +5

      @@avalagum7957 suspend keyword and channels are part of the standard kotlin library. Coroutines package includes coroutines' builders and stuff like flows.
      For some reason Prime just ignores Kotlin altogether :/ But I'd really like to watch some quality Kotlin roast.

    • @sharkpyro93
      @sharkpyro93 Před 11 měsíci +7

      @@DeliOZzz Because it's not a popular choice for backends. A lot of people still think Kotlin is only for Android, and I'm afraid this stigma will stick around for the time being.

    • @AlanPCS
      @AlanPCS Před 6 měsíci +4

      It runs in the same VM. At most it would be equal to a competent implementation in Java only.

  • @dipi71
    @dipi71 Před rokem +3

    Erlang, a language used in telecommunications, still seems to be the concurrency champion (according to a book by Röhrl and Schmiedl called »Produktiver programmieren«, I've read it in German a while ago).

  • @baxiry.
    @baxiry. Před rokem +9

    There is some important information not mentioned in the article
    Goroutines are compared to threads, either real or virtual.
    It is not compared to event loop
    Go has event loop libraries
    As long as the author of the article used the event loop in other languages, he should use it in Go as well in order for the comparison to be unbiased.
    Other information:
    The advantage of goroutines over threads is that it is portable. It does not depend on the operating system. If your application requires on-the-metal operation such as chips or microcontrollers that do not have an operating system, a goroutine can be run.
    With threads it is not possible. Because the language is not the one who does the job but the operating system. And where there is no operating system, there are no threads.
    One last thing
    When an application uses system threads, the system will reserve memory. The question is: did the author of the article count the memory reserved by the system?

  • @om3galul989
    @om3galul989 Před 11 měsíci +3

    yea node example is not spawning threads, it's just placing tasks on the timeout callback queue of the eventloop to be executed later using the main thread.

  • @jonstewart5525
    @jonstewart5525 Před 3 měsíci +1

    Since this is a Linux system it's using the Completely Fair Scheduler (CFS), which means each thread runs at the same priority (as opposed to the MLFQ (multilevel feedback queue) that Windows uses). The issue then is that the OS is processing at the same priority as each of the threads created, so the computer just freezes up. There's also a minimum time spent in each thread, so you rarely get to execute an action.

  • @nelsonoussahsigha1300
    @nelsonoussahsigha1300 Před rokem +1

    Yes, he could have used workers to create threads for concurrent tasks. By using setTimeout you're still single-threaded, so all those setTimeout calls are queued inside the callback queue.

  • @peppybocan
    @peppybocan Před rokem +43

    So this article is definitely comparing apples to oranges - light threads/proper threads and runtime limitations.
    Go has support for parallelism, but it will only allocate as many threads as there are CPU processors (see GOMAXPROCS env variable) and on those the runtime scheduler runs these tasks.
    Python with its notorious GIL (Global Interpreter Lock) is the main bottleneck, though not visible in this flawed benchmark, as the threads themselves are not doing anything, this looks fine until you actually need to run some code. So Python would very likely burn in throughput benchmark, regardless of the number of threads. (See Python's sys.setswitchinterval).
    NodeJS, as The Prime mentioned, again, massive event loop and timers on it. If you do a computationally heavy work on it, your one poor CPU will go into early retirement....

    • @daasdingo
      @daasdingo Před rokem +1

      The article was using the single-threaded event loop in Python.

    • @peppybocan
      @peppybocan Před rokem

      @@daasdingo still wrong though.

    • @mennovanlavieren3885
      @mennovanlavieren3885 Před rokem +2

      I concur. With IO heavy tasks the NodeJs event loop is okay, and keeps your programming model simple. With computational work you need to use workers on NodeJs as per NodeJs documentation itself. And even with IO tasks you should not use one Node process on a gazillion core machine.
      Also, not all light thread implementations (hate the word green in this context. Green, in practice, means illogically wasteful in the name of virtue signaling) offer the same features out of the box.

    • @ddomen9488
      @ddomen9488 Před měsícem

      ​@@daasdingoalso in nodejs since promises are not actual threads

  • @smallfox8623
    @smallfox8623 Před rokem +82

    i'm ready for the C# arc let's go, it has a really bad reputation that is totally undeserved these days

    • @reddragon2358
      @reddragon2358 Před rokem +6

      True.

    • @MH_VOID
      @MH_VOID Před rokem +1

      My personal hate for it came from the pain of trying to use it in my SW dev course on linux compared to those windoze fags who have first class support for everything, and from missing a bunch of the things I love about Rust when doing C# (e.g. immutable by default, f, u, i (though byte is fine and I guess using "long", "short", etc. isn't really bad. more just personal preference and more efficient), match, traits, enums, macros! True some of these stuff are to a decent extent available in C#, but the.. culture doesn't use them primarily like Rust does). But the language itself genuinely looks pretty nice, and has some nice features and shit even over Rust. I'm definitely comfortable calling the language "better Java", and would be okay programming in it professionally or even hobbyistically.

    • @reddragon2358
      @reddragon2358 Před rokem +1

      @@MH_VOID Yeah. Rust is very intriguing language (excluding the dramas and BS). Also things should be a lot better than before. Although there still is some windows/Microsoft bias in the language.

    • @sohn7767
      @sohn7767 Před rokem +18

      I think C# is great honestly. Not the best in anything, but it’s good in many areas

    • @reddragon2358
      @reddragon2358 Před rokem

      @@sohn7767 Yeah agree. And I think that it is its main strength. That it can be used for everything.

  • @mattymerr701
    @mattymerr701 Před 10 měsíci +1

    C# uses loads of thread pools, and I think the issue is that they likely didn't trim the assemblies etc., so it kept a bunch of unused crap.

  • @Overminddl1
    @Overminddl1 Před rokem +1

    I'm also curious how OCaml's task library would go, as well as rust using a future joiner instead of full tasks just for curiosity, lol

  • @quachhengtony7651
    @quachhengtony7651 Před rokem +10

    C# fan bois are eating good these days

  • @andzagorulko
    @andzagorulko Před rokem +14

    C# has threads. Benchmarking Tasks instead is just confusing, because those aren't theads.

    • @pavelyeremenko4640
      @pavelyeremenko4640 Před rokem +3

      As you may have noticed, he's benchmarking green threads(tasks in c#, goroutines in go, etc.) across the languages.

    • @carlinhos10002
      @carlinhos10002 Před rokem +5

      C# does not have green threads. Tasks are not green threads

    • @pavelyeremenko4640
      @pavelyeremenko4640 Před rokem

      ​@@carlinhos10002 Now that I've re-read the definition of green threads, I'm not sure how they aren't. They are not OS managed. They are lightweight thread-like primitives managed by the runtime. What are they missing?
      Wikipedia also lists them as such on en.wikipedia.org/wiki/Green_thread
      Not sure if this is as important though, every language in the lists was using their concurrency primitive built on top of some managed pool anyway.

    • @metaltyphoon
      @metaltyphoon Před rokem +1

      @@pavelyeremenko4640 he’s just making things up. Most implementations are using some abstraction over OS thread. Only one of Java and Rust versions dont do that.

    • @zephyrprime
      @zephyrprime Před 26 dny

      C# tasks use a threadpool to execute. But one thread can have multiple tasks waiting simultaneously and the code this guy used had each thread sleeping for several seconds

  • @edino1981
    @edino1981 Před 7 měsíci +1

    It seems to me that C# sample is not tested in the release mode but in debug mode, so memory consumption should be smaller as tasks are just light definitions of work that are executed on thread pool.

  • @thekwoka4707
    @thekwoka4707 Před rokem +2

    Why were they using the newest Rust from last month and Node.js from like 4 years ago? AWS doesn't even support the version they used, or the 3 major versions after it.

  • @robfielding8566
    @robfielding8566 Před 11 měsíci +16

    Go is definitely not a memory hog, at least for IO-intensive tasks. The main thing is that the Go libraries are always very careful to stream large inputs rather than buffer them in memory. Java itself doesn't really have major memory issues beyond spawning threads, but in any large Java project the code will be full of things being buffered into arrays rather than being streamed. I tried rewriting netty to make it stop doing dumb things, and then just switched (permanently) to Go. Part of Java's problem is also the legal issues of shipping a JVM, and the existence of Oracle thumb-breakers and lawyers who come to punish you for shipping.

  • @vighnesh153
    @vighnesh153 Před rokem +5

    More interested in seeing Nodejs 20 with worker threads as they claim that there is a lot of perf improvements in Node 20

  • @rahulagarwal968
    @rahulagarwal968 Před 11 měsíci

    For building the backend for a Flutter application or any frontend. Which server side language will you prefer : Go or Node js ?

  • @paklenizmaj
    @paklenizmaj Před rokem +1

    I believe that in the java example, the program will "block" on the first unfinished thread, and when that thread finishes and the dispatcher returns execution to the main thread, the for loop will "fly" to the next unfinished thread and then hand over execution to the next thread.
    As the dispatcher flags the thread when it is finished, the join method simply switches (do not block, just switch) the thread if the finished flag is false. So there is no penalty.

    • @RichardKures
      @RichardKures Před rokem

      The code in java could be done much better:
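      try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
          // The loop bound and the submit call were garbled in the posted comment;
          // numTasks is an assumed placeholder for the task count.
          for (int i = 0; i < numTasks; i++) {
              executor.submit(() -> {
                  try {
                      TimeUnit.SECONDS.sleep(10);
                  } catch (InterruptedException e) {
                      Thread.currentThread().interrupt();
                  }
              });
          }
      }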

    • @paklenizmaj
      @paklenizmaj Před rokem

      ​@@RichardKures Thread pools are great if you don't need long running tasks, if you need long running sockets or drawing gui in a loop you need to use raw threads. It's not just for Java but for any language. Thread pools create a small number of threads and when a task completes, the new task merges with the previous one, so there is no execution on the new task until the first task completes.
      Thread pools are for (parallel) computation, not for long-running tasks.

  • @insylogo
    @insylogo Před rokem +3

    AOT and tree shaking business has come a long way with c#. I would assume actual minimums an order of magnitude or less, but he did say default release configurations.

  • @iforgot669
    @iforgot669 Před rokem +14

    C# now has native aot and would have significantly improved the memory footprint of this

    • @SurvivalGamingyt
      @SurvivalGamingyt Před rokem +8

      Yeah, 7,4mb for just a standalone release mode app.

    • @sgbench
      @sgbench Před rokem +1

      Also trimming

    • @FilipCordas
      @FilipCordas Před 10 měsíci +1

      @@sgbench ValueTasks and adding a buffer size to the list will help.

    • @CeleChaudary
      @CeleChaudary Před 4 měsíci

      @@FilipCordas That's a good point

  • @sciencefirefly837
    @sciencefirefly837 Před 6 měsíci

    Does it also not depend on the type of task which is executed? Usually, it should be some validations and a CRUD in DB.

  • @awilliamwest
    @awilliamwest Před 9 měsíci +2

    I'm sad for F#. Interesting to see Primeagen and others re-excited about OCaml, and perhaps the 5.0 release is one reason, but I was an F# fanatic for several years, and just returned to F# for a recent small project. (I *try* to choose Rust for new projects, but got frustrated with Rust's lack of a REPL and wanted to use Ionide in VS Code for my small project (involving parsing XLS and zips of text files); sometimes it's more about the tooling/IDE than it is the language...) C#'s good performance here makes me think F# might perform equally well; .NET has continued to make impressive optimizations.

  • @kooraiber
    @kooraiber Před rokem +13

    My man hates C# so much, it's hilarious! To be fair though I agree with everything you said and would love to see your benchmarks about this topic.

    • @sanjayidpuganti
      @sanjayidpuganti Před rokem +15

      ​@@cethienI love C# but hate MS. I use Rider and Linux to code in my personal time and I like it. I think it's very good for API development.

    • @DaddyFrosty
      @DaddyFrosty Před rokem +4

      @@cethien VS sucks, Rider rules. I do also hate Microsoft but it’s a good language nonetheless

    • @pavelyeremenko4640
      @pavelyeremenko4640 Před rokem +1

      @@cethien I've been developing c# on linux and macos for a couple of years now using Rider (I just like it more but the Visual Studio is also fully cross platform).
      I don't personally enjoy the language as much nowadays but the tooling is great whatever platform you pick.

    • @DaddyFrosty
      @DaddyFrosty Před rokem

      @@pavelyeremenko4640 last time I used visual studio on mac it was only for Xamarin

    • @ko-Daegu
      @ko-Daegu Před rokem

      @@cethien I loooove writing Razor components 🤓
      // MyComponent.razor
      @using Microsoft.AspNetCore.Components
      @Title
      @Message
      @code {
      [Parameter]
      public string Title { get; set; }
      [Parameter]
      public string Message { get; set; }
      }
      the fuck is this shit

  • @boredstudent9468
    @boredstudent9468 Před rokem +8

    He said he launched 1 Task. As soon as you start one async task, C# (in .NET 6) already sets up all the thread pool stuff and access control. For such simple instances you should use threads in C#. Afaik it greatly improved with .NET 7. But in exchange you are prepared to scale incredibly. Also yeah, the .NET runtime does some incredibly smart magic in the background, e.g. have a look at LINQ performance in .NET 7.

    • @metaltyphoon
      @metaltyphoon Před rokem

      CAS is not a thing anymore in dotnet core world.

    • @sgbench
      @sgbench Před rokem +1

      @@metaltyphoon CAS?

    • @rroscop
      @rroscop Před rokem

      Can you really run 1 million C# threads?

    • @boredstudent9468
      @boredstudent9468 Před rokem +1

      @@rroscop on my hardware no problemo, remember that they are way more like go routines than like hardware threads, so only a dozen is actually working in parallel, the rest is just queued.

    • @rroscop
      @rroscop Před rokem

      @@boredstudent9468 nice. Are you talking about System.Threading.Thread's? Or tasks run via Task.Run()?
      my understanding was that Task.Run() used a thread pool under the hood, but real Threads were more heavyweight. I'm not a C# developer though, just dabbled

  • @geraldmaale
    @geraldmaale Před 11 měsíci

    I am interested in finding out what tool this person used to measure the memory usage for the C# part, as these results appear to be questionable.

  • @bentels5340
    @bentels5340 Před rokem

    Quick correction regarding the Java remark: virtual threads are not a preview in 21, they are done. What *is* a preview is structured concurrency, which handles thread-spawn and rejoin more elegantly.

  • @remrevo3944
    @remrevo3944 Před rokem +9

    12:30 Per default tokio creates worker threads equal to the amount of cpu cores.
    Though thinking about it, if you only use timers having a single threaded runtime would likely be just as fast and more efficient.

    • @llothar68
      @llothar68 Před rokem +1

      Not a good choice. You often have long running threads that also do block. In fact all the systems where the kernel is not controlling the worker threads sucks. This means: Linux,Android and the BSDs. The other systems have kernel driven thread pools for much better handling making sure that IO blocks don't prevent utilisation.

    • @remrevo3944
      @remrevo3944 Před rokem +1

      ​ @llothar68 I explicitly meant that for the case of using only timers, which are neither cpu intensive nor use blocking APIs.
      When using an async runtime like tokio you shouldn't use blocking APIs anyway, and if you have to, there is tokio::task::spawn_blocking, which runs the closure on a dedicated blocking thread pool.

  • @shayvt
    @shayvt Před rokem +4

    C# Task is an abstraction using the threadpool. He should use the Thread class which instantiates a real thread.

    • @DarkOoze123
      @DarkOoze123 Před rokem

      *managed thread

    • @LuaanTi
      @LuaanTi Před rokem +2

      No, C# Task implies no threads whatsoever. It uses the thread pool by default for CPU work, yes, but that can easily be just the part of the job that says "this task is finished" (e.g. handling the async I/O response).
      Creating an explicit thread (_not_ a hardware thread, _not_ an OS thread - you don't have control over those natively in .NET) is something completely different, and very rarely used in modern C#. It negates the whole point of using asynchronous I/O in the first place, which is avoiding the overhead of threads that do nothing but wait for something to complete (whether that's a timer or a HTTP request). Which, let's not forget, was part of the point of the original article - showing how expensive "real" threads are, and that different approaches to handling asynchronous code have vastly different results.
      But that article is very flawed anyway. It would make sense to compare multi-threaded code with other ways of doing asynchronous I/O... but instead, we get an arbitrary choice of one or the other for each platform. You can have promises in any language. Many have commonly used or outright built-in APIs for that. Seeing the difference between, say, Java threads and Java Futures would be a bit illuminating, at least... though it still needs to be noted that you have a lot of control over things that absolutely crush this comparison anyway. The default stack size of a new thread on modern .NET is usually 1 MiB. Windows doesn't really allow you to go very small with thread stack sizes (you're supposed to use a few threads, not thousands). Linux is designed around multiple processes/threads using the same memory for as long as possible, so a thousand threads each with 1 MiB memory can actually occupy just a few megabytes (until you actually start to modify the memory).
      Every performance benchmark needs to have a goal. This one doesn't really seem to have one, apart from a simplistic "weird that memory usage in async stuff can vary wildly"... I mean, pretty much every platform out there allows you to pre-allocate as much unused memory as you want, but it'd be a weird way to compare different platforms, right?
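The point about thread stack sizes being configurable can be sketched in Python, since the standard library exposes the same knob. The helper name `run_with_stack_size` and the 256 KiB value are arbitrary choices for this example. The accepted sizes are platform-dependent (CPython documents a 32 KiB minimum, and some systems require multiples of 4096 bytes).

```python
import threading


def run_with_stack_size(n_threads: int, stack_bytes: int) -> int:
    # threading.stack_size(size) sets the stack size used for threads
    # created afterwards and returns the previous setting (0 = platform default).
    previous = threading.stack_size(stack_bytes)
    try:
        done: list = []
        threads = [threading.Thread(target=lambda: done.append(1))
                   for _ in range(n_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return len(done)
    finally:
        # Restore whatever was configured before this call.
        threading.stack_size(previous)


completed = run_with_stack_size(16, 256 * 1024)
print(f"{completed} threads ran with a 256 KiB stack request")
```

As the comment notes, the requested stack is typically reserved address space rather than memory that is touched, so resident memory per idle thread can be far smaller than the configured size.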

  • @indramal
    @indramal Před 11 měsíci

    So what is the final choice for high traffic? Is memory the only consideration? The number of concurrent connections also matters.

  • @robertwhite3503
    @robertwhite3503 Před rokem

    I was taught at primary school that infinity was the same as infinity plus one. However at college was taught about "marking". If you match all the numbers from one to infinity, then plus one is longer. So at eight years old I was being mis-taught maths?

  • @c4ashley
    @c4ashley Před rokem +6

    The name is the C-sharpagen.

  • @erickmoya1401
    @erickmoya1401 Před rokem +4

    My wife says you yell too much. I tried to prove she is wrong.
    My argument didnt last a second.

  • @wdavid3116
    @wdavid3116 Před rokem +1

    I don't think the thread joins are actually an issue. All that is being measured is memory. The time cost would be real but if you actually have to wait on all those threads the order shouldn't be very meaningful and to get any sort of speedup you'd need an os that supports joining multiple threads at once or you'd have to do something more elaborate to make use of some sort of multiple message capability in the kernel (maybe something with epoll?) If you're waiting on thread 0 and thread 1 quits you'll be sleeping in thread 0 while other threads use the CPU to finish and then once the thread you're joining on ends you'll burn through the finished threads and then repeat the sleep as needed. Syscalls are expensive but not *that* expensive.
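The argument that joining threads one by one costs no extra waiting time can be checked with a small sketch. The function name and the numbers (20 threads, 0.2 s each) are arbitrary. All threads sleep concurrently, so by the time the first `join` returns the others have mostly finished and their joins return almost immediately.

```python
import threading
import time


def join_all_in_order(n: int, delay: float) -> float:
    # Start n threads that each sleep for `delay`, then join them in
    # creation order and return the total wall-clock time.
    threads = [threading.Thread(target=time.sleep, args=(delay,))
               for _ in range(n)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


elapsed = join_all_in_order(20, 0.2)
print(f"joined 20 threads in {elapsed:.2f}s (sequential sleeping would take 4.00s)")
```

The total should be close to a single delay rather than the sum of all delays, which matches the reasoning above: the join order affects when the main thread wakes up, not how long it waits overall.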

  • @luvincste
    @luvincste Před rokem

    due to some bugs in services at work they spawned threads and didn't dispose them, and had easily 100_000 threads on an older windows, like server 2008; when it happened i had to restart it, though, or didn't work well after

  • @_daniel.w
    @_daniel.w Před rokem +5

    I'm curious about C, C++ & Zig.
    Also, I love Go. What happened, why did it end up using so much memory? Kinda sucks

    • @_daniel.w
      @_daniel.w Před rokem

      @nósferratu Oh, alright.
      I was watching chat go by and someone mentioned Go is stackbased or something along those lines.
      Thanks for the info 👍

    • @hvaghani
      @hvaghani Před rokem

      ​@nósferratu right I was going to comment the same and found this

    • @scotter7663
      @scotter7663 Před rokem

      The C# implementation is completely bogus compared to the others. It's using a small thread pool (Task.Run) to set a bunch of timers (Task.Delay), which is why it shows low memory usage. This is not demonstrating concurrency.
      If the implementation did a Thread.Sleep or used real threads, the results would be completely different and probably worse than Java, since C# doesn't have virtual threads.
      In the real world Go runtimes will have considerably less memory overhead than C# or Java

    • @scotter7663
      @scotter7663 Před rokem

      ​@@_daniel.w Go has a delay() function that looks similar to what's used in the C# impl. Rework the Go implementation to use this and I suspect it will perform drastically better

  • @reddragon2358
    @reddragon2358 Před rokem +9

    Let's go C#

  • @TheSwissGabber
    @TheSwissGabber Před 8 měsíci

    in python there is asyncio, thread and multiprocessing. ordered according to their overhead. if you want to use multiple cores you need multiprocessing.
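To put a rough number on the cheapest of those three options, here is a sketch that estimates Python-level memory per idle asyncio task with `tracemalloc`. The helper names are made up, and `tracemalloc` only sees allocations made through Python's allocator, so this is not the resident-set figure the article reports.

```python
import asyncio
import tracemalloc


async def spawn_sleepers(n: int, delay: float) -> int:
    tasks = [asyncio.create_task(asyncio.sleep(delay)) for _ in range(n)]
    await asyncio.sleep(0)  # let every task start and park on its timer
    current, _peak = tracemalloc.get_traced_memory()
    await asyncio.gather(*tasks)
    return current


def bytes_per_task(n: int, delay: float = 0.05) -> float:
    # Traced bytes while n tasks are alive, minus the baseline, divided by n.
    tracemalloc.start()
    try:
        baseline, _ = tracemalloc.get_traced_memory()
        live = asyncio.run(spawn_sleepers(n, delay))
    finally:
        tracemalloc.stop()
    return (live - baseline) / n


per_task = bytes_per_task(10_000)
print(f"about {per_task:.0f} traced bytes per idle asyncio task")
```

The figure includes the task object, its coroutine frame, and the timer handle, plus a small share of event-loop overhead. Repeating the same measurement with `threading.Thread` would not be comparable, because thread stacks are allocated by the OS and are invisible to `tracemalloc`.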

  • @RoccoWocco
    @RoccoWocco Před rokem +1

    C# has a parallel for and foreach for these types of scenarios. You can tell it the degree of parallelism and it'll just do it for you. In no scenario is the way shown in the article correct. That's an anti pattern in 99% of cases. If you do want to do async in your parallel code then there are async versions of the parallel loops.
    You could also just manually make threads

  • @kellybmackenzie
    @kellybmackenzie Před rokem +4

    I would have loved to see Haskell tested like this, it'd be so good

    • @FinnBender
      @FinnBender Před rokem +3

      It's surprisingly bad :(
      1 thread: 5.0 MB
      10 threads: 4.9 MB
      100 threads: 4.9 MB
      1k threads: 8.3 MB
      10k threads: 63.1 MB
      100k threads: 803.8 MB

    • @kellybmackenzie
      @kellybmackenzie Před rokem

      @@FinnBender Aww man! Yeah, that makes sense, Haskell is infamous for its high memory consumption because of thunks and stuff like that. I'm surprised it's that bad for 100k though, damnnn!

  • @casperes0912
    @casperes0912 Před rokem +12

    I will most likely need to use C# as my primary language at my next job

  • @Deletedeletedelete
    @Deletedeletedelete Před 9 měsíci

    Good content. Talented dude!)

  • @nyahhbinghi
    @nyahhbinghi Před 5 měsíci +2

    If you are creating a new Elixir "process" per task, memory will scale up pretty linearly with the number of tasks, which is why it's high. High memory usage is not really a bad thing, per se. The same goes for Go and goroutines, whereas other runtimes with a fixed thread pool, or Node.js with its single event loop, won't keep climbing linearly. I would be more interested in CPU usage. You're welcome for this insight! 🤜🤛

    • @pdgiddie
      @pdgiddie Před 4 měsíci +1

      This. The BEAM VM was designed to prioritise latency and predictable scalability. Copy-on-write and other memory consumption optimisations can produce latency spikes.

  • @SharunKumar
    @SharunKumar Před rokem +7

    I wanna see Nick Chapsas's reaction on this 🤣

  • @R4ngeR4pidz
    @R4ngeR4pidz Před rokem +28

    You're 100% right about the complexity of the task.
    But also, I would have stopped reading after they said they used ChatGPT to come up with the code.
    You need to have these contributed by people that actually write this language and that actually understand this language.
    The ambiguity between what the code was actually doing in all of these was horrible, as other commenters have also pointed out.

  • @Zooiest
    @Zooiest Před rokem

    Well, technically, JS structs can take up as few bytes as any other language, as long as you ignore the sizes of serialization/deserialization definitions and only care about the size of the ArrayBuffer you put data in

  • @berkormanli
    @berkormanli Před 10 měsíci +1

    I think you are right that this benchmark is badly executed; I'm sure those tests could be better for some of these languages. I'm working with Lua right now, and Lua has coroutines, which I think are exceptional and implemented quite well, because they have never failed me. But as I gained experience working with Lua I managed to reduce both the memory consumption and the runtime of my programs. I'd like to see some kind of benchmark from you, that'll be awesome!

  • @autismspirit
    @autismspirit Před rokem +56

    tbh the C# number kind of makes sense, it scales incredibly well, especially in later .NET versions. Some C#-based fancy Unity optimizations can beat out GCC in raw speed and memory.

    • @autismspirit
      @autismspirit Před rokem +6

      Granted, there is probably some optimization going on in Release mode, since it's not doing anything. I'd expect the memory consumption to be higher, but not 4GB high.

    • @marcossidoruk8033
      @marcossidoruk8033 Před rokem +11

      What do you mean by "beating GCC" last I checked GCC was a compiler.

    • @CorvinhoDoMal
      @CorvinhoDoMal Před rokem +6

      ​@@marcossidoruk8033 yeah, the optimizations are made by the compiler. He meant the C language, but specifically with GCC. If you used the microsoft compiler or other options you would have different performances.

    • @marcossidoruk8033
      @marcossidoruk8033 Před rokem +16

      ​​​@@CorvinhoDoMal No way C# is going to beat carefully written C code in any imaginable benchmark ever, its just impossible.
      Plus what he said makes no sense: "Unity optimizations"? How do you compare C# Unity performance with C Unity performance if you can't write Unity scripts in C? Am I going crazy or what?
      And if he means the engine, that is written almost entirely in C++.

    • @janus798
      @janus798 Před rokem +13

      @@marcossidoruk8033 Google the Unity Burst compiler. Faster than GCC in fibonacci and NBody simulation.

  • @maxharmony6994
    @maxharmony6994 Před rokem +5

    Now imagine giving Tom a C#

  • @dickheadrecs
    @dickheadrecs Před rokem +2

    how many threads can you handle?
    GO: “yes”

  • @memespdf
    @memespdf Před 11 měsíci +1

    Ironically, I think it would make sense to start all programs by allocating a static 1GB of memory and keeping it around at the end. This ensures that no preallocated memory can be used

  • @joejazdzewski
    @joejazdzewski Před rokem +4

    Prime will now worship at the altar of Anders (creator of C# and TypeScript) /s

  • @quachhengtony7651
    @quachhengtony7651 Před rokem +5

    Let's rewrite Elasticsearch, Kafka, and Cassandra in C# and get free performance

  • @rian7079
    @rian7079 Před rokem

    10:29 I did my undergrad thesis on a bunch of languages, including C#, and the memory footprint of my program was only 33 MB running on a Raspberry Pi. I don't know wth is going on for C# to have a base memory footprint exceeding 30 MB on x86. And yes, my program includes async processing too, not just waiting for 10 seconds and doing nothing.

  • @zolniu
    @zolniu Před měsícem

    In C#, when you use Tasks with async/await, the default implementation creates a state machine and uses a pre-existing thread pool to schedule execution of your tasks on the pool's threads. Not only that, but a task that never actually has to wait can run synchronously - in that case it won't even end up in the thread pool - it will just execute and return like a normal function call.
    To test how much memory threads consume in C#, you can't use Tasks with async/await - you have to use the Thread class directly - that way you circumvent all of the optimizations done in the runtime and in the task scheduler.
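    The task-versus-thread distinction described above can be sketched in Python. This is an analogy, not C#: it assumes asyncio tasks stand in for async/await Tasks and threading.Thread stands in for the Thread class, and the names sleeper, run_tasks and run_threads are made up for the illustration.

```python
import asyncio
import threading

N = 50  # kept small on purpose; the benchmark went up to 1,000,000


async def sleeper(seen_threads):
    # Record which OS thread runs this task, then yield to the event loop.
    seen_threads.add(threading.get_ident())
    await asyncio.sleep(0.01)


async def run_tasks(n):
    seen = set()
    await asyncio.gather(*(sleeper(seen) for _ in range(n)))
    return seen


def run_threads(n):
    seen = set()
    lock = threading.Lock()
    barrier = threading.Barrier(n)

    def worker():
        # Wait until all n threads exist, so no thread id can be reused.
        barrier.wait()
        with lock:
            seen.add(threading.get_ident())

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen


task_thread_ids = asyncio.run(run_tasks(N))
real_thread_ids = run_threads(N)
print(len(task_thread_ids), len(real_thread_ids))  # prints "1 50"
```

    All 50 tasks share one OS thread, while the 50 Thread objects each get their own, so a memory comparison between the two measures different things.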

  • @ringishpil
    @ringishpil Před rokem +24

    Go's initial stack size is (I think) 4KB per goroutine, and it grows/shrinks as needed. I'm not sure what the minimum stack size is. Therefore the ~2GB in Go is not surprising. So in 3GB of memory you can put 1 million, 10 million, and probably even 20 or 30 million goroutines; they will just shrink in size. With Piotr's example you can probably do even more, since these are very simple routines that consume little memory. But as I said, I'm not sure what the minimum stack size consumed by a goroutine is. It's less than 4KB for sure, though (in your example 2.8GB/1_000_000 = 2.8KB). My guess is that it is not shrinking any further because there is enough memory available.
    Anyway, you put it nicely: this is not a real-world test; TCP/WebSocket connections would be much better.
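    The per-goroutine arithmetic above, as a small Python sketch. The 2.8 GB total is the figure quoted in the comment, read as decimal gigabytes. GO_START_STACK is an assumption added here: as far as I know, Go releases since 1.4 start each goroutine with a 2048-byte stack, which is not what the comment guesses.

```python
def bytes_per_task(total_bytes, task_count):
    """Average memory cost of one task, in bytes."""
    return total_bytes / task_count


TOTAL_BYTES = 2_800_000_000  # the ~2.8 GB quoted above (decimal GB)
TASKS = 1_000_000
GO_START_STACK = 2048  # assumption: starting goroutine stack in current Go

per_task = bytes_per_task(TOTAL_BYTES, TASKS)
beyond_stack = per_task - GO_START_STACK  # memory not explained by the stack

print(per_task)      # 2800.0
print(beyond_stack)  # 752.0
```

    Under that assumption, roughly 2 KB of each goroutine's 2.8 KB is the starting stack, and the remainder would be runtime bookkeeping plus whatever the benchmark itself allocates.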

    • @Rakstawr
      @Rakstawr Před 8 měsíci +1

      Go test here was completely misrepresented by non optimized garbage collection settings and not profiling how much of that was colored for deletion.

  • @tecoberg
    @tecoberg Před měsícem +3

    Where is C++?

  • @HotakaPeter
    @HotakaPeter Před 7 měsíci

    Elixir/Erlang have a lot of services running by default. These can be optimised in the Erlang boot script.

  • @basimal-jawahery5688
    @basimal-jawahery5688 Před měsícem

    Awesome!! :)) extremely funny :) thanks for the video :)

  • @alxizr
    @alxizr Před rokem +3

    The nodejs example is off point. You need to choose worker threads for staying in line with all of the other examples.
    The same goes for the Python AsyncIO example.

  • @istovall2624
    @istovall2624 Před rokem +4

    C# to the moon! Havent finished yet. Drum roll.

  • @sikor02
    @sikor02 Před 9 měsíci +1

    If C# has memory available, it will swallow a lot of it for optimizations. Once I experimented with Docker and performance-tested my simple API endpoint with Bombardier (a tool written in Go), bombarding it with thousands of requests. My app used 1.5 GB of RAM (!). But then I started limiting my container's available memory (the -m parameter), and guess what, I went down to 15 MB and it still worked. The Go equivalent required at least 16 MB to work. With so little memory available, the C# API performed almost the same as when using 1.5 GB anyway. (Go was like 2% faster though, not gonna lie)

  • @guilucasds
    @guilucasds Před 10 měsíci +1

    I really think the threads are not even being spawned in parallel in several of the programs. The problem is that exceptions are being handled poorly, so you don't even know when it breaks.

  • @tedchirvasiu
    @tedchirvasiu Před rokem +4

    Is this the first time in history he turned off the notifications before starting the video?

  • @nacholopezosa
    @nacholopezosa Před rokem +7

    ☝🤓Buzz may be going from aleph-zero to aleph-one infinity. So to infinity and beyond could be correct

  • @ingenium1502
    @ingenium1502 Před rokem

    Yes we would like to know about socket and tcp connection test. Thx for video😀

  • @Hector-bj3ls
    @Hector-bj3ls Před měsícem

    In Rust, the default stack size for an OS thread on all tier 1 platforms is 2MB. Not sure if it's allocated up front, but that probably has something to do with where all the memory went.
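    A rough Python sketch of what a 2 MiB default stack means at one million threads. It assumes an OS that commits stack pages lazily (as Linux does), a 4 KiB page size, and a made-up figure of 2 touched pages per thread; none of these numbers come from the benchmark.

```python
MIB = 1024 * 1024
PAGE = 4096  # assumed page size in bytes; common on x86-64, not universal


def reserved_stack_bytes(threads, stack_size=2 * MIB):
    """Address space reserved if every thread gets a full-size stack."""
    return threads * stack_size


def touched_stack_bytes(threads, pages_touched_per_thread):
    """Rough resident memory if each thread only touches a few stack pages."""
    return threads * pages_touched_per_thread * PAGE


reserved = reserved_stack_bytes(1_000_000)
# 2 pages per thread is a hypothetical value chosen for illustration.
touched = touched_stack_bytes(1_000_000, pages_touched_per_thread=2)

print(f"reserved: {reserved / 2**40:.2f} TiB")
print(f"touched:  {touched / 2**30:.2f} GiB")
```

    The gap between the two numbers is why "is it allocated up front" matters: the reservation is mostly virtual address space, and only the pages a thread actually touches count toward resident memory.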