Guido van Rossum: Will Python ever remove the GIL? | Lex Fridman Podcast Clips
- Added on 26 Nov 2022
- Lex Fridman Podcast full episode: • Guido van Rossum: Pyth...
Please support this podcast by checking out our sponsors:
- GiveDirectly: givedirectly.org/lex to get gift matched up to $1000
- Eight Sleep: www.eightsleep.com/lex to get special savings
- Fundrise: fundrise.com/lex
- InsideTracker: insidetracker.com/lex to get 20% off
- Athletic Greens: athleticgreens.com/lex to get 1 month of fish oil
GUEST BIO:
Guido van Rossum is the creator of the Python programming language.
PODCAST INFO:
Podcast website: lexfridman.com/podcast
Apple Podcasts: apple.co/2lwqZIr
Spotify: spoti.fi/2nEwCF8
RSS: lexfridman.com/feed/podcast/
Full episodes playlist: • Lex Fridman Podcast
Clips playlist: • Lex Fridman Podcast Clips
SOCIAL:
- Twitter: / lexfridman
- LinkedIn: / lexfridman
- Facebook: / lexfridman
- Instagram: / lexfridman
- Medium: / lexfridman
- Reddit: / lexfridman
- Support on Patreon: / lexfridman - Science and Technology
Full podcast episode: czcams.com/video/-DVyjdw4t9I/video.html
Lex Fridman podcast channel: czcams.com/users/lexfridman
News update - PEP 703 made the GIL optional in Python 3.13 (as an experimental free-threaded build) - It's happening...
Thank goodness
This has been my favorite podcast by Lex so far!
Many moons ago, I worked on a framework responsible for fetching hundreds of gigabytes of data from around 100 database tables every day. The initial solution was written in Python using multithreading, but it caused a lot of problems, such as data ending up in the wrong containers, so I tried multiprocessing and it solved all the bugs we had.
I have written my own boilerplate classes to implement multiprocessing for various projects where real parallel work is needed--projects where I do not need to share data between processes (as you might with multithreading), such as mapping to a process pool a list of thousands of devices to connect to and collect data and write to the file system. Point is, there are very good use cases for the multiprocessing library. The GIL is not a problem in python when you can use multiprocessing.
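The process-pool pattern described in this comment could look roughly like the sketch below. It is only an illustration: `collect_from_device` and `collect_all` are hypothetical names, and the device "connection" is faked with a string so the example stays self-contained. The standard-library `concurrent.futures.ProcessPoolExecutor` is used here; the commenter's own boilerplate classes may well be built differently.

```python
from concurrent.futures import ProcessPoolExecutor


def collect_from_device(device: str) -> str:
    """Hypothetical stand-in for connecting to a device and reading data."""
    # A real implementation would open a connection and write results to disk;
    # here we only build a string so the sketch stays self-contained.
    return f"{device}: ok"


def collect_all(devices: list[str], workers: int = 4) -> list[str]:
    """Map the device list onto a pool of worker processes."""
    # Each worker is a separate interpreter process with its own GIL, so the
    # calls can run in parallel. Arguments and results are pickled in transit.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(collect_from_device, devices))


if __name__ == "__main__":
    # The guard matters on platforms that start workers by re-importing
    # this module instead of forking.
    names = [f"device-{i}" for i in range(8)]
    for line in collect_all(names):
        print(line)
```

Because the workers share nothing, this shape fits the "no need to share data between processes" case the comment describes; it gets harder once workers must exchange large or unpicklable objects.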
It would be interesting to compare this to Node.js, which uses a single thread with cooperative multitasking.
Very helpful Lex.
I've never really looked into how python handles threading since I only ever use it for quick data analysis tasks, but it's interesting to find out that it's not parallelizable. Doesn't really change how I will use it, but does make me want to find out more about how the python interpreter works. Thanks for the upload!
It is parallelizable but via multiprocessing instead of multithreading. It’s not as efficient in terms of data sharing but it still makes use of all your cpu cores and depending on the task, the same speedup. Just a different approach with pros and cons.
Also check out microservices, where you can run multiple Python pods that each use one core.
I found out the hard way about the GIL when I wanted to parallelize some io operations. I looked at htop and learned multithreading was not multithreading like you would do in C. I had to change my approach to multiprocessing, but that increased the complexity of my program nonlinearly. Now, I have two multithreading managers and each has to handle their own errors and one can most efficiently be used if I am writing to disk while the other is more convenient for the application of math on a shared slice of memory (numpy arrays).
I get the author’s point about the interpreter, but it goes against the notion of threads from a hardware perspective.
@kiseitai2 multithreading actually DOES help with I/O operations. Any blocking operation that suspends a thread (I/O, sleep, semaphore, lock...) can be sped up with Python's threading; the exception is CPU-bound tasks.
Still, the overhead of asyncio is MUCH smaller than that of threads, because there is no OS-level context switching and the kernel's scheduler doesn't have to kick in as often.
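A small sketch of the point made in this reply: both a thread pool and asyncio can overlap blocking waits even with the GIL in place. The delay and task count are arbitrary illustrative values, `time.sleep` and `asyncio.sleep` stand in for real I/O, and all function names are made up for the example. It does not measure the relative overhead of threads versus asyncio.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

DELAY = 0.2  # seconds each simulated I/O call waits (illustrative value)
TASKS = 5


def blocking_io(i: int) -> int:
    # time.sleep releases the GIL while waiting, much like a blocking read.
    time.sleep(DELAY)
    return i


async def async_io(i: int) -> int:
    # asyncio.sleep yields to the event loop instead of blocking a thread.
    await asyncio.sleep(DELAY)
    return i


def run_threads() -> tuple[list[int], float]:
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=TASKS) as pool:
        results = list(pool.map(blocking_io, range(TASKS)))
    return results, time.perf_counter() - start


async def _gather_all() -> list[int]:
    return await asyncio.gather(*(async_io(i) for i in range(TASKS)))


def run_asyncio() -> tuple[list[int], float]:
    start = time.perf_counter()
    results = asyncio.run(_gather_all())
    return results, time.perf_counter() - start


if __name__ == "__main__":
    for name, fn in (("threads", run_threads), ("asyncio", run_asyncio)):
        results, elapsed = fn()
        print(f"{name}: {results} in {elapsed:.2f}s")
```

In both variants the waits overlap, so the total time should be close to one delay, not the sum of all of them. Replacing the sleeps with a CPU-bound loop would remove that benefit on a GIL build.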
It DOES matter when you want to find the best hyperparameters for your model. Here the task is obviously parallelizable, but sadly, Python is unable to do so. The commonly advised multiprocessing is the wrong answer, because it requires duplicating the same data structure in memory and adds constraints for (de-)serialization. The latter is often not possible for e.g. PyTorch and JAX models. It is so difficult to deal with this when every computing device is multicore, in the 21st century, my gosh.
Any chance of a re-visit after 3.13 and the potential for the new GIL infrastructure and what future optimizations the JIT will bring?
When I was writing a function which works in another thread, I thought it was literally going to run in a different thread. Actually, what is happening in the case of threads in Python? Is it a single thread running all the code, with the threads sharing the CPU time?
That is very interesting, I didn't know the reason behind the GIL and I always read that it was a bottleneck for python. But it does make a lot of sense now that I watched this video.
Why not use the multiprocessing library?
I really respect Guido, but with the assumption that there aren't enough people interested in a Python without a GIL, he couldn't be more wrong. If you look at the benchmarks of nogil Python, the single-thread slowdown is in the low single-digit percentages. On the other hand, the performance gain from additional threads is near linear. I fail to see how this minuscule slowdown for single-threaded applications, which are rarely speed-critical anyway, could matter more than the enormous speed gains waiting on the other side.
Nogil can actually be twice as slow. It depends on how much you share objects across threads and use synchronization constructs.
It can also be twice as fast if you minimize volatility
Because a lot of Python programs actually use C-based libraries in the background that do the multithreading. You could always move some of your code to the C side of things. Also, he mentioned that the increased complexity would make it more annoying to maintain and grow in the future.
It is a significant chicken and egg problem. CPython, as it sits with the GIL, is too slow for compute-bound tasks that cannot be done via numpy. This is why reposurgeon was rewritten in Go. But because those compute-bound programs are forced to use less friendly languages to get the multithreading speed required, they are then not Python projects that would benefit from proper multithreading. For some programs, the multiprocessing library is good enough for the concurrency required, and for others numpy does the heavy lifting. These projects stay with Python, and also won't benefit much from GIL removal.
What is unseen are the projects that _would_ be written using python, but for the GIL preventing proper scaling.
@yellingintothewind Well, you've got to start somewhere. If you don't start with speeding up Python, you'll never get performance-demanding programs written in it. Or they'll depend on third-party libraries like Numba and so on forever. I'd wish not only for a no-GIL version, but for an optionally fully compiled Python, as soon as you put your type decorators everywhere.
what the hell is this crypto spam in the comments sheesh
So Fing annoying. youtube needs to up their game.
Yeah, how close were they posted in time to each other?
I've been using Python since 1.4 and everything has changed so much. In those days Python was always that second language you used for quick jobs. I was convinced by an article from Eric S. Raymond to make Python my second language, and at some point it became many people's primary tool for coding.
It's a great language. C was my #1 and I traded Java for Python as my #2. But with Python, I can deliver more in less time, so it's now been my #1 for 10 years.
Interesting how many of the key programmers, paradigm shifters, or rather innovators in computer languages hail from the northern European / Scandinavian countries.
5:46 It already happened!!
GIL is a blocker for having a proper realtime audio thread
It’s much clearer now, hearing the story of the GIL.
How is Java doing it?
OK is this guy living 20 years in the past? Goldilocks between no threads and threads? That was appropriate for the transition period to multithreaded, when servers and desktops gained MT 20 years ago. Now it's straightforward, even your freaking phone in your pocket has 4+8 cores CPUs. And this guy is still thinking about exiting the transition period.
Well, commodity systems now have 8 cores (16 hardware threads) and gamers often have much more powerful systems, like Ryzens with 32 cores (64 hardware threads). Even phones have 8 cores nowadays, haha 😁
@linkernick5379 Exactly. How is this guy in charge of a programming tool that should enable the developers to squeeze out the hardware available underneath? No wonder that Facebook guy forked and did it himself.
@Groaznic Guido has the very strange tendency to be wrong about _everything._ There was even a blog post he wrote about how tail call optimization is "unpythonic"!
@Groaznic Well although he still has considerable influence over Python, he's not actually the BDFL (Benevolent Dictator For Life) anymore, Python is now managed by a Steering Council made up of core Python developers, so there's still a chance for a nogil Python in the future.
Holy moly the arrogance is palpable. Please, implement a true free-threaded interpreter for python for us, please.
Rossum's ability to pull off a perfect American accent when he wants to parody the hardware vendors is hilarious
To be fair, when most of the time is spent in network communications, GIL isn't much of a problem. There is also the possibility of creating processes instead of threads, which is safer by the way! And if speed is important, Python is not the best language, C/C++ is your "friend"!
What about the multiprocessing library? Isn’t it free from the GIL or are there some problems with using it?
They launch new processes to achieve true parallelism and ideally make it as seamless as possible from Python perspective.
Several processes sit in a single process with a single GIL and can access the same memory.
With several processes(=multiprocessing) you can have 1gil per thread, so no conflicts.
However, data sharing isn't as easy because you need to communicate between distinct processes.
@user-bb6xb3cz1k you are mixing up processes and threads. I think you wanted to say: all threads within the same process are bottlenecked by the same GIL, but every newly spawned process has its own GIL and address space, which means data can only be shared through pipes, FIFO queues, memory-mapped files, files on disk, or over the network, and therefore you must serialize all data to send it to another process. That makes it a bit harder to share data between processes.
@bernardcrnkovic3769 there are some features that abstract away those complexities.
If your Python objects can be pickled, you can use the multiprocessing Manager class, which has list and dict types that can pass your objects between processes.
Multiprocessing fails on the non-serializable data structures and overall multiprocessing is more expensive than multithreading. Python is way too behind compared to the true concurrent languages/ecosystems like Rust, Go, C#, Java and others.
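The serialization constraint raised in this thread can be checked without starting any processes, since multiprocessing uses `pickle` by default to move objects between them. The sketch below is a minimal illustration; `can_pickle` and `square` are hypothetical helpers, not part of any library.

```python
import pickle
import threading


def square(x: int) -> int:
    return x * x


def can_pickle(obj: object) -> bool:
    """Return True if pickle.dumps accepts obj, False if it refuses."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    candidates = {
        "plain dict": {"rows": [1, 2, 3]},
        "module-level function": square,   # pickled by reference to its name
        "lambda": lambda x: x * x,         # has no importable name
        "lock": threading.Lock(),          # wraps an OS-level resource
    }
    for label, obj in candidates.items():
        print(f"{label}: picklable={can_pickle(obj)}")
```

Objects that fail this check cannot be sent to a worker process as arguments or results, which is the limitation the comments about PyTorch and JAX models and "non-serializable data structures" are pointing at.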
So many ppl use python and the community is so big but he talks like that they have no resources to implement real threading ... don't get it
Which used to be the case. There were many failed attempts and spin-off projects to remove the GIL in the past. A lot of the problems have to do with existing modules written in C. And it is exactly the popularity of Python that contributed in large part to this problem.
Don't hack a programming language for the hell of it is what I get from his talk. If you really need multi-threading use another language where the feature is really baked in. It would be nice if Python got rid of the GIL but that's not "the be all, end all".
Well you don't need to throw away the present version of python. You can still develop a no gil version and let people decide which one to use
Yes, and it doesn't have to be the Python core team that develops it... In fact it doesn't even have to be called Python! I want to know who this person at Facebook (?) is ... and is his/her proper multithreading version FOSS?
That would be too much work for them even though one person from Facebook already did it on his own. 😐
Happened
"I feel that the GIL is actually a pretty nice goldilocks point"
This is what delusion looks like.
Now that Python has become as verbose as Rust because of unchecked type hints, there are fewer and fewer good reasons to use Python every day.
Types are checked by IDE or analyser tool
As for the reason to use Python, it remains the same: speed of development
In those areas where raw speed is not that important, python is still a great solution, for example if your program is I/O bound
@weathercontrol0 I agree Python is good at the role of I/O glue, I'm just saying that Rust is a million times better for rapid development once you take the time to learn the language.
What is needed here is a good set of ML/DL libraries for Rust. I used the PyTorch bindings and had a positive experience (I was able to create a neural network in Rust and train it), but I'd like to have more. 🙄
@linkernick5379 It's going to be a while before plotters can replace matplotlib, polars can replace pandas, tch-rs can replace PyTorch, or TensorFlow Rust is first class, but I'm hopeful for the future. A great thing about Python is that some of the best libraries were written in C, which interoperates well with Rust. :)