I'm not exaggerating when I say this video changed my life.
I went from a guy who did everything upstream in SQL and grudgingly used Pandas to a guy who uses Pandas for everything.
The approach Matt demonstrates also translates generally to PySpark.
I'm now considered the go-to guy for Pandas and PySpark code in my department. There's so much bad code around, often written by people with advanced degrees and MATLAB experience, it seems. I could make a full-time job out of cleaning up bad code.
Dot chain FTW!
Thanks! Glad to help.
Heh, MATLAB and bad coding practices - the two are never far from one another it seems.
90 minutes of pure gold.
Thanks Matt!
Thanks David. 👍🙏 Make sure you check out my book, Effective Pandas, if you appreciated this.
AGREE COMPLETELY! FANTASTIC PRESENTATION! Learned more here than in the past two years.
This is easily the best pandas guide I have ever watched so far.
Thank you!
This is gold! Matt did an amazing job showing best practices when using pandas and a lot of intuition about how pandas functions run under the hood.
This was a ridiculously useful video. I feel like I've watched a lot of Python videos, but I think this might be the most practically useful one for people who are not brand new to pandas -- who use it all the time.
This man is a living data legend.
Mass respect.
Really interesting talk; I was doubtful about chaining at first, but you have converted me :) A very informative talk, thanks.
Thanks for coming around Nick. 😉 Hope you find these techniques useful to you.
Great presentation. As others said, pure gold. If there were a button called "pure gold", I would have clicked it. A simple like is not enough. It also changed my view of code organization. Thanks for sharing.
Matt, big thank you for chaining idea!
By far the best pandas video I have ever seen
This tutorial had so many gems! Thanks Matt
This is mind blowing... Thank you very much!
I can’t wait for you to give another talk on polars!
Excellent Pandas best practices video. I was already a big user of chaining but for some reason hadn't used append much. This is a cleaner way to do things and I will be using it. My next notebook is going to be much easier to maintain and much easier to build. Thanks Matt!
Awesome. Thanks
Thanks for sharing ❤
Thanks Matt, this was an incredible presentation. Came here from the Real Python podcast, just bought the book too!
Thanks for your support
Really interesting, many thanks to Matt and PyData :)
I was looking into how to speed up my pandas operations, since I read that Python itself is faster than R and that pandas should be faster than plain Python; I am happy I came here.
Excellent tips that I am going to experiment with, hopefully achieving quicker output times.
Excellent session nevertheless.
I really love this session and it’s completely changed the way I process data going forward.
Thanks a lot !
Here from your HN comment. Super informative.
Thanks for the wonderful pandas insights, Matt and PyData!
1:18:00 For the specific question being asked (finding duplicates in a primary key) there is a much simpler solution than what Matt Harrison suggested: df.duplicated("primary_key", keep=False). Used as a boolean mask, it selects all rows with non-unique values in the "primary_key" column, i.e., all the rows that are duplicated.
Matt solves the more general problem of "find all rows for which the element in primary_key occurs at least N times". A more concise (though perhaps less readable) solution to this would be something like
(df
 [df.primary_key.map(df.primary_key.value_counts()) > N]
)
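For anyone who wants to run both variants, here is a minimal self-contained sketch on made-up data (the column name and N are just placeholders):

import pandas as pd

df = pd.DataFrame({"primary_key": [1, 2, 2, 3, 3, 3], "val": list("abcdef")})

# keep=False flags every occurrence of a repeated key, not just the later ones
df[df.duplicated("primary_key", keep=False)]   # rows with keys 2, 2, 3, 3, 3

# generalised version: keep rows whose key occurs more than N times
N = 2
df[df.primary_key.map(df.primary_key.value_counts()) > N]   # the three rows with key 3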
An alternative to your approach is to use .transform() with .groupby(), to act effectively like a SQL window function that counts the primary keys, but whose result is the same length as the original data (rather than being collapsed due to aggregation).
Something like:
num_dups = df.groupby('key')['key'].transform('size') # has same index as df
df.loc[num_dups > N]
Really interesting and informative talk.
Thanks
Thank you for this! This is super helpful. I learned so much!
Thanks for your 'rant', Matt - I have your recent books and still realised something that I should be doing with my data. 👌
Thanks Tyrone! Good luck with your Pandas. 😉🐼
Thank you
Just a tip:
At 48:30, when commenting line by line upwards, you can point the mouse at the desired line, then press and hold (I think) ALT; the pointer should switch to a thin-lined cross, and you can then drag up or down across the lines and insert #.
It's like doing a block comment...
Still looking for a way to do that without the mouse, but not sure whether to use something like a Vim extension, if there is one...
Awesome!
I have a problem with aggregations: sometimes when you aggregate two columns and one column has a cell with a NaN, .groupby will ignore it. I know you can keep those NaNs, but I would like to see a use case for when it is a good idea to keep NaNs while using .groupby and when it is not.
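A minimal sketch of both behaviours on made-up data, in case it helps frame the question (column names are placeholders):

import numpy as np
import pandas as pd

df = pd.DataFrame({"region": ["east", "west", np.nan, "east"], "sales": [10, 20, 30, 40]})

df.groupby("region").sales.sum()                 # NaN key is dropped (the default)
df.groupby("region", dropna=False).sales.sum()   # NaN kept as its own group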
Sir, the apply method gave me an error such as "unhashable Series".
How do I fix that?
Great video, I need this data set.
Where can I find it?
Can someone identify the font he uses in JupyterLab? :D
'Lato' I guess
@@JimmieChoi93 I just tried and I don't think it is Lato.
@@pmiron damn. Here's an idea, screenshot it to ChatGPT and ask
@@JimmieChoi93 haha, I actually did try with some screenshots. It recognizes that it's a notebook with a monospace font, but then suggests it might be the default JupyterLab font or Consolas, Menlo, etc. Also tried WhatTheFont and FontSquirrel with no luck.
ty for the video, Matt, this is awesome
Can you explain how you got those numbers @ 57:30 --
6_220 / 125
Thank you!
235.215 is the conversion constant between mpg and L/100 km (L/100 km = 235.215 / mpg). It's a constant the presenter looked up on a search engine ahead of time.
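If it helps, a quick sketch of that conversion (assuming a hypothetical 'mpg' column; the dataset in the talk may name it differently):

235.215 / 25   # 25 mpg is roughly 9.41 L/100 km

df.assign(l_per_100km=lambda df_: 235.215 / df_.mpg)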