Why thread pools even exist? and how to implement them?

  • Added on 7 Feb 2024
  • System Design for SDE-2 and above: arpitbhayani.me/masterclass
    System Design for Beginners: arpitbhayani.me/sys-design
    Redis Internals: arpitbhayani.me/redis
    Build Your Own Redis / DNS / BitTorrent / SQLite - with CodeCrafters.
    Sign up and get 40% off - app.codecrafters.io/join?via=...
    In the video, I explained the importance of thread pools in managing multiple concurrent requests efficiently without overwhelming the system. I discussed the drawbacks of spinning up new threads for each request and how thread pools help mitigate these issues by capping the maximum number of threads. I highlighted real-world use cases and provided a quick prototype implementation in Go. Tuning thread pool size based on hardware and workload characteristics was emphasized, along with practical tips for implementation. Watch the video for a deeper understanding of thread pools and their implementation.
    Recommended videos and playlists
    If you liked this video, you will find the following videos and playlists helpful
    System Design: • PostgreSQL connection ...
    Designing Microservices: • Advantages of adopting...
    Database Engineering: • How nested loop, hash,...
    Concurrency In-depth: • How to write efficient...
    Research paper dissections: • The Google File System...
    Outage Dissections: • Dissecting GitHub Outa...
    Hash Table Internals: • Internal Structure of ...
    Bittorrent Internals: • Introduction to BitTor...
    Things you will find amusing
    Knowledge Base: arpitbhayani.me/knowledge-base
    Bookshelf: arpitbhayani.me/bookshelf
    Papershelf: arpitbhayani.me/papershelf
    Other socials
    I keep writing and sharing my practical experience and learnings every day, so if you resonate then follow along. I keep it no fluff.
    LinkedIn: / arpitbhayani
    Twitter: / arpit_bhayani
    Weekly Newsletter: arpit.substack.com
    Thank you for watching and supporting! It means a ton.
    I am on a mission to bring out the best engineering stories from around the world and make you all fall in
    love with engineering. If you resonate with this then follow along, I always keep it no-fluff.
  • Science & Technology

Comments • 72

  • @AsliEngineering  4 months ago +7

    What all topics do you want me to cover? Do add them as a reply to this comment ⚡

    • @PrashantKumar-zz5zc 4 months ago +5

      OAuth

    • @codingvibes3537 4 months ago +2

      Great explanation, sir! Please make the next video on the internals of the Redis database. I'm very curious to dive deep into this topic with you.

    • @arpanmukherjee4625 4 months ago +3

      Implementing an SSTable-based data storage layer from scratch.

    • @dharsanr6504 4 months ago +1

      Sir, if you can, please make a video series on building real-world projects from scratch (like the ones in the BUILD YOUR OWN X repo).
      E.g.:
      * creating our own compiler
      * Git
      * HTTP server
      * Network application without libraries

    • @_ipublic 4 months ago

      Can you bring more such content on threads, covering more bottleneck problems? Thanks in advance!

  • @reudrax 4 months ago +6

    This concept is the same as connection pooling: it saves the time needed to create connections and puts a limit on the number of concurrent DB connections that can be made, thus handling the load effectively.

  • @Chicken_Soy 1 month ago

    Thank you for this insightful explanation Mr armpit

  • @shawkishor 3 months ago +16

    At one of the FAANGs, there was an interview question: one method (class method, REST call, etc.) is being called by 1 million users, and all users get the response with no delay. What is happening behind the scenes? How come that method is getting executed a million times concurrently?

    • @SampathKumarKamati 3 months ago +5

      There must be more than one server handling the requests, which are distributed by a load balancer.

    • @shawkishor 3 months ago +3

      @SampathKumarKamati I gave the same answer, and even mentioned thread pools on the web server too. The interviewer was not satisfied.

    • @freddyv5353 3 months ago +3

      Response with no delay? That's impossible. Did you ask if the response was valid? They may all be getting errors.

    • @shawkishor 3 months ago

      @freddyv5353 How? Suppose thousands of users are ordering different products at the same time. I have an order-create API that is handling the requests concurrently.

    • @sasha751 3 months ago +8

      Resource is cached 😂

  • @burionyt 3 months ago +2

    Hey, awesome video, but here is a tip on "thread pool in Golang".
    In Go, a thread pool is an anti-pattern and you should generally use semaphores instead. One can implement a basic semaphore using a buffered channel. There is a weighted semaphore implementation in the golang/x pkg if you want to check it out, or you could make one yourself (I generally do this).
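
A minimal sketch of the buffered-channel semaphore idea described in this comment. The helper name `runLimited` and the peak-tracking bookkeeping are assumptions made for illustration; they are not from the video. The weighted variant the comment mentions is presumably the `golang.org/x/sync/semaphore` package, which is not used here.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// runLimited (hypothetical helper) starts one goroutine per task but lets
// at most `limit` of them run at the same time. It returns the highest
// number of tasks observed running concurrently.
func runLimited(limit int, tasks []func()) int {
	sem := make(chan struct{}, limit) // buffered channel acting as a counting semaphore
	var wg sync.WaitGroup
	var mu sync.Mutex
	running, peak := 0, 0

	for _, task := range tasks {
		wg.Add(1)
		sem <- struct{}{} // acquire: blocks while `limit` tasks are in flight
		go func(task func()) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when the task is done

			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()

			task()

			mu.Lock()
			running--
			mu.Unlock()
		}(task)
	}
	wg.Wait()
	return peak
}

func main() {
	tasks := make([]func(), 10)
	for i := range tasks {
		tasks[i] = func() { time.Sleep(10 * time.Millisecond) }
	}
	// The reported peak never exceeds the limit of 3.
	fmt.Println("peak concurrency:", runLimited(3, tasks))
}
```

Unlike a worker pool, this starts a fresh goroutine per task and only caps how many run at once, which is cheap in Go because goroutines are lightweight.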

  • @jspnser 1 month ago +1

    For future topics: I wish you could explain the internals of goroutines in more depth (why they're lightweight, etc.). Thank you for the video.

  • @LeoLeo-nx5gi 4 months ago +6

    I had a very important question.
    1. We know that usually, when the run() method of a thread finishes executing, the thread dies automatically after its work is done.
    2. Then why is it said that we put the same thread back into the pool, and how does the ThreadPool do that (i.e. reuse the same thread)? I think we are creating a new thread again and putting that back into the pool, not reusing the original thread (which did the earlier task), right?
    Would be great if you could guide me here. Thanks for the knowledge share!!

    • @AjayKumar2 4 months ago +4

      You are correct in saying that a thread dies after returning from the run() method. The key here is to never return from the run() method. This is an implementation detail of the ThreadPool class (in the Java context). A common pattern is to maintain an array of threads (worker threads) that read tasks from a queue. When a new task arrives, a thread from the pool is assigned to it if one is available; otherwise the task waits. As soon as the thread is done executing the task, it goes back to the queue to get a new task, or waits for one to arrive.
      This way, the lifecycle of the worker threads is managed by the owning ThreadPool object, which is executed in the main thread. Worker threads are alive as long as the ThreadPool object is alive.
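
The "never return from run()" pattern in this reply can be sketched in Go as follows. This is an illustration under assumptions, not the video's prototype or Java's ThreadPool: the names `Pool`, `NewPool`, `Submit` and `Close` are hypothetical, and goroutines stand in for threads.

```go
package main

import (
	"fmt"
	"sync"
)

// Pool is a hypothetical minimal worker pool: a fixed set of goroutines
// that stay alive by looping over a shared task queue.
type Pool struct {
	tasks chan func()
	wg    sync.WaitGroup
}

// NewPool starts `workers` long-lived workers.
func NewPool(workers int) *Pool {
	p := &Pool{tasks: make(chan func())}
	for i := 0; i < workers; i++ {
		p.wg.Add(1)
		go func() {
			defer p.wg.Done()
			// The worker does not exit after one task: it blocks here
			// until a task arrives, runs it, then loops for the next one.
			for task := range p.tasks {
				task()
			}
		}()
	}
	return p
}

// Submit hands a task to a free worker; it blocks while all workers are busy.
func (p *Pool) Submit(task func()) { p.tasks <- task }

// Close stops accepting tasks and waits for the workers to finish and exit.
func (p *Pool) Close() {
	close(p.tasks)
	p.wg.Wait()
}

func main() {
	pool := NewPool(3)
	var mu sync.Mutex
	sum := 0
	for i := 1; i <= 10; i++ {
		n := i
		pool.Submit(func() {
			mu.Lock()
			sum += n
			mu.Unlock()
		})
	}
	pool.Close()
	fmt.Println("sum:", sum) // prints "sum: 55"
}
```

The same three workers execute all ten tasks; a worker only ends when the queue is closed, which mirrors the pool owning the workers' lifecycle.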

  • @t1m3__ 3 months ago

    great video, thanks!

  • @da5m0n 4 months ago

    For our server example there is also another way to handle clients: epoll/kqueue. So my question is, what is the tradeoff of using a thread pool over epoll, or vice versa, especially for I/O-bound tasks?

  • @keerthi1070 4 months ago +9

    Another main reason for thread pools is that "creating a new thread and then deleting it is itself very costly in terms of CPU".
    Also, what is the max number of threads for an n-core CPU?

    • @hiteshbitscs 4 months ago +1

      In my experience, on a special VM (meaning the one where your workload is running), I would say 2 * n should be the size of the thread pool. The reason is that each core supports 2 threads due to hyper-threading in modern chips. So ideally, across your application, the number of threads should not exceed 4 * n cores. Take this with a pinch of salt!

    • @durgaprasadreddy5033 4 months ago

      What is costly?

  • @BhanuReddyputtareddy 4 months ago +1

    What is the font you used in VS Code? It looks nice.

  • @_sudipidus_ 4 months ago +2

    Just a nitpick: you demonstrated with an example of goroutines, which are not actually threads (even using just goroutines without keeping a worker pool would perform better than threads).
    But on the positive side, this was more about the generic pooling concept (pooling of threads, HTTP connections, DB connections, or anything whose creation/bookkeeping is expensive).

    • @AsliEngineering  4 months ago +4

      Yes. I tried to keep it slightly simple. If I had covered the specifics, it would have repelled a lot of folks 😅

  • @durgeshkhot7811 2 months ago +1

    Understood, thanks to your video.

  • @hiteshbitscs 4 months ago +1

    You mentioned that for network-bound workloads we can have more threads, and for CPU-heavy workloads we should have fewer threads. Shouldn't it be the opposite?
    1. For network I/O workloads: because you wait more for network packets to be sent or to arrive, due to millisecond latency, we can have few threads, so that each thread has work to do rather than doing short stuff and waiting.
    2. For CPU-heavy workloads: because you have more work, we have more threads. (Here I guess that with hyper-threading each core has 2 threads, hence the max we can go to is approx. 2 * number of cores.)
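
For reference, a commonly cited rule of thumb (not taken from the video) sizes a pool as roughly cores * (1 + wait time / compute time). It points the way the video states: threads that mostly wait on the network leave the CPU idle, so more of them fit per core. The sketch below only illustrates that arithmetic; the function name `suggestedPoolSize` and the sample timings are assumptions, and real sizing should be measured under load.

```go
package main

import (
	"fmt"
	"runtime"
)

// suggestedPoolSize (hypothetical helper) applies the rule of thumb
//   threads ≈ cores * (1 + waitTime/computeTime)
// where waitTime is time spent blocked (e.g. on network I/O) and
// computeTime is time spent on the CPU, per task.
func suggestedPoolSize(cores int, waitMs, computeMs float64) int {
	if computeMs <= 0 {
		return cores // no compute time given: fall back to one thread per core
	}
	size := int(float64(cores) * (1 + waitMs/computeMs))
	if size < 1 {
		size = 1
	}
	return size
}

func main() {
	cores := runtime.NumCPU()
	// CPU-bound task: almost no waiting, so about one thread per core.
	fmt.Println("CPU-bound:", suggestedPoolSize(cores, 0, 10))
	// I/O-bound task: 90 ms waiting per 10 ms of compute, so 10x the cores.
	fmt.Println("I/O-bound:", suggestedPoolSize(cores, 90, 10))
}
```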

  • @joelphilip2942 4 months ago +1

    If the thread pool size is 5 and the number of tasks is > 5, doesn't it execute in parallel (not concurrently)?

  • @FaintArt 3 months ago +2

    That's damn good handwriting.

  • @MarathiNationOne 4 months ago

    If there are 4 CPU cores and 10 threads, is it worth it? Or can we have at most as many threads as the number of cores?

  • @malaypatel4014 3 months ago

    One of the best content 👌

  • @neo_otaku_gamer 4 months ago

    Thank you for explaining the concept of thread pools. How can we implement a similar concept in Python, for example in serverless functions on the cloud? Python inherently doesn't have support for threads and thread pools due to the GIL.

  • @user-yf1ge4mj4m 4 months ago

    Your videos are Netflix series to me❤

  • @mayankujawane3147 4 months ago

    We have so many server-side languages, so how do we decide which language to choose? What are the factors that can help make the decision, mainly for Golang, Java, JavaScript, Python, and C#?

    • @RahulSsup 3 months ago

      It's a decision you make based on many factors: comfort level of the developers, performance expectations, task domain, etc.
      Personally, I think Go gives a good balance between the two.
      I would choose Python if the work is more data-science focused.
      JavaScript for server-side only if the developers are JavaScript lovers and can't touch anything else.

  • @sharoonaustin551 4 months ago

    Isn't it something similar to batch processing? (Correct me if I am wrong.) For example: let's say you have a slice which contains 40 elements and you want to process 20 of them at a time. You fetch the first 20, iterate over them while adding to the wait group, perform the task using goroutines, and call wg.Wait() outside the loop. This way you're creating 20 threads and running them for a batch, isn't it?

    • @AsliEngineering  4 months ago +1

      You are assuming all tasks will take equal time to complete, which might not be the case.
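
To make the difference concrete, here is a small simulation (an illustrative sketch with made-up task durations, not code from the video). `batchedTime` and `pooledTime` are hypothetical helpers: the first models fixed batches where each batch waits on wg.Wait() for its slowest task, the second models pool workers that pick up the next task as soon as they are free. Both assume a positive batch size or worker count.

```go
package main

import "fmt"

// batchedTime returns the total time when tasks run in fixed-size batches
// and the next batch starts only after the slowest task of the current
// batch has finished.
func batchedTime(durations []int, batchSize int) int {
	total := 0
	for start := 0; start < len(durations); start += batchSize {
		end := start + batchSize
		if end > len(durations) {
			end = len(durations)
		}
		slowest := 0
		for _, d := range durations[start:end] {
			if d > slowest {
				slowest = d
			}
		}
		total += slowest
	}
	return total
}

// pooledTime returns the total time when `workers` workers each take the
// next task from the queue as soon as they become free.
func pooledTime(durations []int, workers int) int {
	free := make([]int, workers) // time at which each worker becomes free
	for _, d := range durations {
		idx := 0
		for i := range free {
			if free[i] < free[idx] {
				idx = i
			}
		}
		free[idx] += d // the earliest-free worker takes the task
	}
	finish := 0
	for _, t := range free {
		if t > finish {
			finish = t
		}
	}
	return finish
}

func main() {
	durations := []int{9, 1, 1, 9} // task durations in arbitrary time units
	fmt.Println("batched:", batchedTime(durations, 2)) // prints "batched: 18"
	fmt.Println("pooled:", pooledTime(durations, 2))   // prints "pooled: 11"
}
```

With equal task durations the two approaches finish at the same time; the gap appears only when durations vary, because a batch leaves its fast goroutines idle until the slow one finishes.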

  • @undergraduate6050 4 months ago

    Programming language agnostic Concurrency control mechanism.

  • @sachinmalik5837 24 days ago

    Hey Arpit, I know it's an old video, but if you do see this and have a second to spare, I have a doubt.
    Why do we have an unbuffered channel here? If we are adding jobs to a workQueue, shouldn't our channel have some buffer? In the above code/scenario, are we blocking the putting of the job on the workQueue because it is unbuffered, so that when I add a job, I have to wait for some thread/goroutine to pick up that job before adding new jobs to the queue?
    I am new to Go, and I am loving what you have to teach. Thanks!

    • @AsliEngineering  23 days ago

      You can use a buffered channel here, but that would not change the concept or the approach. That still remains the same.
      I just kept things simpler for the demo :)
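
The behavioural difference asked about in this thread can be checked with a non-blocking send. This is a standalone sketch, not the video's workQueue code; `trySend` is a hypothetical helper and the capacity of 2 is arbitrary.

```go
package main

import "fmt"

// trySend (hypothetical helper) reports whether v could be handed to ch
// right now without blocking the caller.
func trySend(ch chan int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		return false
	}
}

func main() {
	unbuffered := make(chan int)  // a send needs a receiver that is ready
	buffered := make(chan int, 2) // a send succeeds while the buffer has room

	fmt.Println(trySend(unbuffered, 1)) // false: no worker is waiting to receive
	fmt.Println(trySend(buffered, 1))   // true: queued in the buffer
	fmt.Println(trySend(buffered, 2))   // true: buffer is now full
	fmt.Println(trySend(buffered, 3))   // false: a plain send would block here
}
```

So with an unbuffered queue the producer waits until some worker picks the job up, while a buffer lets it run ahead by up to the buffer's capacity. The number of tasks executing at once is still capped by the number of workers in both cases.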

    • @sachinmalik5837 23 days ago

      @AsliEngineering Awesome. Thanks for getting back to me.

  • @SurajYadav-yd2cu 4 months ago

    Great explanation! Just wanted to ask: will the worker threads created in the pool keep looking for a submitted task, or do they only become active when a task is submitted?

    • @AsliEngineering  4 months ago

      They become active when there is work to do; otherwise they sleep. This is managed either by the OS for kernel threads, or by the language runtime for user threads.

  • @Chakree45 3 months ago

    As Golang is your favorite language, why don't you upload Golang-related videos?

    • @AsliEngineering  3 months ago +1

      If by Golang videos you mean language tutorials, then I am not keen on doing it because I believe language learning can happen the quickest through books and blogs.
      Hence I do use Go in most of my examples to explain a concept or showcase how some problems can be solved.

    • @Chakree45 3 months ago

      @AsliEngineering Thanks for the reply. Could you please suggest which books or blogs to follow to learn Golang better and faster? Thanks in advance :)

    • @AsliEngineering  3 months ago +1

      @Chakree45 I have the books I referred to for Go on my bookshelf: ArpitBhayani.me/bookshelf

    • @Chakree45 3 months ago

      Thanks, Arpit, and all the best to you 🫂

    • @AsliEngineering  3 months ago

      @Chakree45 Thank you 🙌

  • @mallukittens177 4 months ago

    Make Golang tutorials
