RichardOnData
Classification Metrics Explained | Sensitivity, Precision, AUROC, & More
Subscribe to RichardOnData here: czcams.com/channels/KPyg5gsnt6h0aA8EBw3i6A.html
In this video, I go through the different types of binary classification metrics. These include: accuracy, prevalence, confusion matrices, sensitivity (aka recall or true positive rate), specificity (aka true negative rate), precision (aka positive predictive value), F1 score, and the areas under the precision-recall curve and the receiver operating characteristic curve, that is: AUPRC and AUROC. We close with how to implement these using the scikit-learn package in Python, going through a Jupyter notebook.
Code can be found here: github.com/RichardOnData/CZcams/blob/main/Python%20Notebooks/classification_metrics.ipynb
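Separately from the linked notebook (which uses scikit-learn), the metrics listed above can be sketched in plain Python. This is only a minimal illustration, not the code from the video: the function names, the toy labels, and the toy scores are all made up for the example, and the AUROC here is computed by the pairwise-ranking definition rather than by any library routine.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels coded as 0/1."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def safe_div(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else math.nan


def classification_metrics(tp, fp, tn, fn):
    """Derive the common binary classification metrics from the four counts."""
    total = tp + fp + tn + fn
    sensitivity = safe_div(tp, tp + fn)   # recall / true positive rate
    specificity = safe_div(tn, tn + fp)   # true negative rate
    precision = safe_div(tp, tp + fp)     # positive predictive value
    f1 = safe_div(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": safe_div(tp + tn, total),
        "prevalence": safe_div(tp + fn, total),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


def auroc(y_true, scores):
    """AUROC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This is O(n_pos * n_neg), which is
    fine for a toy example but slow for large datasets.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return safe_div(wins, len(pos) * len(neg))


# Toy data (hypothetical): 4 positives and 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.05]

counts = confusion_counts(y_true, y_pred)
for name, value in classification_metrics(*counts).items():
    print(f"{name}: {value:.3f}")
print(f"auroc: {auroc(y_true, scores):.3f}")
```

In practice you would likely reach for scikit-learn's `confusion_matrix`, `precision_score`, `recall_score`, `f1_score`, and `roc_auc_score` instead; the sketch is only meant to make the definitions concrete.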
Patreon: www.patreon.com/richardondata
BTC: 3LM5d1vibhp1F7pcxAFX8Ys1DM6XLUoNVL
ETH: 0x3CfC599C4c1040963B644780a0E62d45999bE9D8
LTC: MH8yPjvSmKvpmRRmufofjRB9hnRAFHfx32
Views: 329

Video

SHAP Values: An Overview
435 views · 1 month ago
In this video, I talk about SHAP values and how these can be used for explainable AI and explaining how features contribute to a machine learning model's predictions for each observation. These are great tools when your goal isn't (only) prediction, but is also inference - that is, understanding the most important featur...
Is ChatGPT-4 Worth It?
662 views · 1 month ago
NOTE: Sorry about the bad audio quality on this one. I switched microphones when I upgraded phones recently, and thought during testing that it would be a lot better than it was here. Looking into a REAL microphone upgrade here. NOTE 2: I didn't talk about DALL-E on this one, which is another feature of GPT-4. The ...
Follow THESE 5 Tips to Get a Data Job
836 views · 3 months ago
In this video I'll break down some tips that I have to get data jobs. This is going to be broad and apply to all types of positions, whether those are data analyst, data science, or data engineering jobs! To summarize: 1) Have a good education in a field like statistics, computer science, math, engineering, business,...
Learn (and Do) Data Science FAST with ChatGPT
987 views · 3 months ago
In this video I show some ways I've used ChatGPT both to learn and to do data science faster. ChatGPT can be an excellent tool if you're responsible with it. It can provide great ideas to help get through creative roadblocks, as well as generate great coding examples that you can turn around and use to learn. You ...
The Data Job Market in 2024
7K views · 4 months ago
My thoughts on the data job market in 2024. I looked at data scientist, data analyst, data engineer, and machine learning engineer jobs. In particular we talk about some broader trends in tech more recently, the recent tech layoffs, and what hiring and salaries are looking like for these positions. Crunchbase: news...
No, AI (Probably) Won’t Take Your Data Job Soon
596 views · 4 months ago
NOTE: The beginning of this video is somewhat tongue in cheek. Certain things, you just have to let yourself have fun with. Some of the articles and videos I reference make very different points, specifically regarding the rise of data engineering and constructing end-to-end machine learning pipelines. Those are va...
R or Python: Which Should You Learn in 2024?
4.1K views · 5 months ago
In this video we're revisiting the R vs Python comparison in the year 2024. How do they stand in recent job reports and in indices like PyPL or the TIOBE index?
Four Data Science Jobs: My Experiences
540 views · 5 months ago
In this video I talk about every data science job I've had, how each job was dramatically different from the others, and how each one sort of led to the next.
10 Python Packages You Should Know (in 2024)
849 views · 5 months ago
In this video I'm going to recommend 10 packages that you should know and focus on to get strong at Python programming, in the context of data science. Recommended book "Python for Data Analysis": amzn.to/3cDXKcE 1. pandas pandas.pydata.org/Pandas_Cheat_Sheet.pdf 2. numpy images.datacamp.com/image/uploa...
What Is Survival Analysis?
499 views · 5 months ago
In this video I cover survival analysis: specifically, what it is, and why it's useful when the time until an event is important and when you have "censored" data. I talk about what censored data is and provide definitions of the survival and hazard functions. This is illustrated visually by showing a Kaplan-Meier c...
How I Would Learn Data Science in 2024 (If I Had to Start Over)
2K views · 5 months ago
ChatGPT: Bri Does AI: czcams.com/video/MnDudvCyWpc/video.html Ryan Scribner: czcams.com/video/X9ksiScY7hM/video.html Statistics: Duke: www.coursera.org/specializations/statistics John Hopkins: www.coursera.org/specializations/jhu-data-science University of Amsterdam: www.coursera.org/specializations/social-science ...
How to Setup Your Python Environment (With VSCode & Anaconda)
4.9K views · 6 months ago
In this video, I walk you through how to set up your Python development environment. If you're a complete beginner, you'll probably be good with just Anaconda/JupyterLab/Jupyter Notebooks. If you're going to be a serious developer, you'll want to use Visual Studio Code and as a best practice set up virtual environm...
How I Passed the Google Cloud Professional ML Engineer Exam
8K views · 6 months ago
'Journey to Become a Google Cloud Machine Learning Engineer': amzn.to/3TjwmYT Exam guide: cloud.google.com/learn/certification/guides/machine-learning-engineer Github compilation: github.com/sathishvj/awesome-gcp-certifications/blob/master/professional-machine-learning-engineer.md Medium articles: towardsdatascienc...
Update | Where I’ve Been
750 views · 7 months ago
Hi everyone. It's been a while.
Tufte's Principles of Graphical Integrity
4.3K views · 2 years ago
Why Is It SO HARD to Get a Data Science Job?
4.8K views · 2 years ago
I Quit My Data Science Job. Here’s Why
7K views · 2 years ago
10 Good Coding Practices for Data Science
3.7K views · 2 years ago
Data Science Advice for College Students
3K views · 2 years ago
The TRUTH About Learning Data Science
4.2K views · 2 years ago
Is the Future of Data Work Remote?
1.5K views · 2 years ago
The State of Data Science in 2021 | Anaconda's Annual Report
2.5K views · 2 years ago
What Is Data Engineering?
2.4K views · 3 years ago
When Should You Use Regression Methods?
5K views · 3 years ago
Tuning hyperparameters and stacking models with "tidymodels" | R Tutorial (2021)
2.3K views · 3 years ago
Evaluating ML Performance, Resampling, and Workflows in "tidymodels" | R Tutorial (2021)
1.9K views · 3 years ago
Intro to machine learning in R with "tidymodels" | R Tutorial (2021)
8K views · 3 years ago
20 R Packages You Should Know
39K views · 3 years ago
Creating ROC curves and ensembling models in R with "caret" | R Tutorial (2021)
4.4K views · 3 years ago

Comments

  • @MindfulInsights4u
    @MindfulInsights4u 1 day ago

    Pls mic fix

  • @adamf5018
    @adamf5018 3 days ago

    I am a PhD candidate in data analytics and have been looking for a job for eight months now. Not even a single interview. It's very tough 😫

  • @wb7779
    @wb7779 8 days ago

    That was a really good explanation. Short and powerful.

  • @matthewson8917
    @matthewson8917 10 days ago

    I used Python during my PhD and ended up shifting to R. Python's statistical packages are lackluster (maybe not surprisingly). I'm not a big fan of dot chains and pandas' index system, and the deal breaker was that it was so sluggish and ran out of memory so often with medium-size data (20GB+), even on a machine with 60GB+ of RAM. I tried Dask, but it's pandas-based and slow; with DuckDB and Polars around, I think the Dask project will become less popular. I picked up tidyverse and data.table from R, and they did the job without a problem, and I kinda regretted learning Python. R has the fixest package, which is really fast for high-dimensional fixed effects regressions, and Python doesn't seem to support large-scale regressions very well.

  • @lorenzopeiyang6934
    @lorenzopeiyang6934 10 days ago

    R is more capable of doing amazing things better than python

  • @yoyo-ue5pf
    @yoyo-ue5pf 13 days ago

    I am AWS ML certified

  • @narayanasrikanthreddyg

    Good to hear that you learnt R and then created the video. I can understand @6:41: after learning C, C++, BASIC, and COBOL (i.e. having a programming background), R really felt funny and weird because there are multiple ways to do the same thing. But later I fell in love with R. I have heard that numpy and pandas were inspired by R data structures. You have computer engineers backing the development and usage of Python, whereas R has a bunch of academics and statisticians. R initially looked like a hotchpotch, but after looking at numpy and pandas alongside basic Python... I just laugh at how my judgements reversed over time. Python seems to be more in line with traditional expectations of OOP syntax. I could go on, but both could have been more streamlined for the data science workflow.

  • @chrishardy2909
    @chrishardy2909 14 days ago

    I can't believe FORTRAN is #12! I programmed my master's thesis project in 1995 in FORTRAN and I thought nobody used it anymore. As a statistician I'm guessing R is the way to go.

  • @1wuniverse675
    @1wuniverse675 18 days ago

    Nice honest and informative video. Thank you.

  • @BSTDeepaneeshRV
    @BSTDeepaneeshRV 20 days ago

    helpful video , thank you sir 🌟

  • @Sonntagssoziologe
    @Sonntagssoziologe 21 days ago

    It remains vague. What exactly can you do with R that is not possible with Python?

    • @narayanasrikanthreddyg
      @narayanasrikanthreddyg 14 days ago

      You mean to say python along with packages numpy , pandas scikit learn etc....

  • @princempungalume8136
    @princempungalume8136 22 days ago

    I like how you present the data/ideas 😂... Thanks for the information ❤

  • @djangoworldwide7925
    @djangoworldwide7925 23 days ago

    I dislike videos that make reproducibility challenging. You could demonstrate the exact same concepts using a simple data frame that can be found in seaborn (or any other imported package for that matter). Nice video otherwise

    • @RichardOnData
      @RichardOnData 22 days ago

      That's a totally fair point. If I'm understanding you correctly here, the issue is basically that this dataset requires an API key and a few steps overall to get ahold of. I do find these concepts easy to understand through the lens of a disease, but I totally see what you mean here. I have a video coming out soon on bagging vs boosting, and I'll use a dataset for that one that's simpler to get your hands on.

    • @djangoworldwide7925
      @djangoworldwide7925 22 days ago

      @@RichardOnData That's cool man. I appreciate you replying and enjoy your overall content (I've been subscribed for quite some time now). To be clear, I didn't "dislike" as in the dislike button; I mean in general that I don't like the idea that [...]. Cheers!

  • @mugomuiruri2313
    @mugomuiruri2313 24 days ago

    good.mugo on data

  • @firstname4337
    @firstname4337 26 days ago

    how do you LEARN this stuff -- I mean really LEARN -- I took a course in biostatistics where we covered this and for every problem I had to keep referring back to a page where I had written all the formulas -- there was no way I could tell you the formula for specificity or sensitivity -- I understood the consequences and reasons for them (telling someone they have diabetes when they don't leads them to spending money on drugs they don't need) -- but as for applying the correct measure and formula to every scenario I was totally lost -- if we weren't allowed to use a page of formulas for the final I would have failed spectacularly

    • @RichardOnData
      @RichardOnData 22 days ago

      There's really no substitute for repetition and experience. Years ago I had to give multiple presentations for a sepsis prediction model and had to use a ton of these metrics and then answer questions. It went from always mixing them up, to being able to rattle them off in my sleep, but it did take a lot of time.

  • @silvertube52
    @silvertube52 26 days ago

    Thanks Richard, that was a good overview of classification metrics.

  • @mugomuiruri2313
    @mugomuiruri2313 29 days ago

    good.mugo on data

  • @dimitrioskioroglou4316
    @dimitrioskioroglou4316 1 month ago

    For me the greatest difference between the two languages is the mentality. R users are taught basic programming fundamentals and learn that for every problem there is a package they can use. Python users are taught programming first and how the language is used to create packages. So R users learn to use the language at a higher level, and when they go deeper, things get messy. Also, in 2024 I wouldn't keep applying labels such as "R for statistics" and "Python for general purpose". These kinds of labels are absolute nonsense.

  • @matteolatinov6630
    @matteolatinov6630 1 month ago

    Nice one! This is a topic I need to start getting into. Would love more content on XAI!

  • @KN-tx7sd
    @KN-tx7sd 1 month ago

    Awesome. Instead of Python, can this be done using R?

  • @JackieReu
    @JackieReu 1 month ago

    Thanks for the great video! Should I revert the RemoteSigned execution policy after finishing the install and setup, or does it have to stay on Y permanently?

  • @rccola362
    @rccola362 1 month ago

    This was awesome. Thanks. I’m gonna go see how I can apply these when presenting to stakeholders

  • @Daniel83021
    @Daniel83021 1 month ago

    You are awesome bro, thanks

  • @mart1484
    @mart1484 1 month ago

    Well damn 1yr left to learn

  • @asdnmr6858
    @asdnmr6858 1 month ago

    How many weeks or months did you take to study for the exam?

  • @Antowan
    @Antowan 1 month ago

    My university economics program uses R. I learned both for obvious reasons

  • @moviezone8130
    @moviezone8130 1 month ago

    Thanks for the great video. It was an awesome comparison. Are you practicing data science? I am looking for a small role in data analysis with R. Do you have any advice? I have a master's degree in environmental science from Addis Ababa University. By the way, are you on LinkedIn? I would like to follow you. Thanks.

  • @moviezone8130
    @moviezone8130 1 month ago

    Hi Sir, thanks for yet another great video. Can you make a video on the most widely used ML tools? I have a background in chemistry and environmental science at a master's level, and I have started learning R through reading books and watching YouTube videos. Do you think I have a future in data science? I'm from Ethiopia.

  • @Kids_zone389
    @Kids_zone389 1 month ago

    Video of my wish😂

  • @emmaccode
    @emmaccode 1 month ago

    That data dictionary does seem useful. I would be curious to see how well it deals with more "disguised" features, such as a categorical feature that seems continuous or the inverse. Sometimes I feel that getting the feature type right can really be a matter of knowing, so it would be impressive if it was very consistent.

  • @arcadevampire
    @arcadevampire 1 month ago

    Stopped using AI, as I stopped learning and got lazy. Now I only reach for it as a last resort.

  • @moviezone8130
    @moviezone8130 1 month ago

    I'm a newbie of R and I like it. Thanks for the great video.

  • @lekanfagbuyi7963
    @lekanfagbuyi7963 1 month ago

    Thanks for this breakdown . I am graduating from a master's in analytics in a few months and this video came at the right time

  • @fuggit4638
    @fuggit4638 1 month ago

    this nigga is a genius

  • @xavierromerocarrion1369
    @xavierromerocarrion1369 1 month ago

    Great and relevant content. Thanks my pal.

  • @kschen1620
    @kschen1620 2 months ago

    i like R

  • @rohitquat
    @rohitquat 2 months ago

    I got admitted to MSDS in MSU. Can you tell me some relevant course work that I should take to get into data engineering?

  • @user-dl5go9tg6g
    @user-dl5go9tg6g 2 months ago

    Thanks

  • @ron101346
    @ron101346 2 months ago

    If you're in healthcare or pharma, the other language to know is SAS. I know, it's old hat, but it has a simple syntax, a rich macro language and it is certified for use in FDA-regulated industries.

    • @chrishardy2909
      @chrishardy2909 14 days ago

      Or SPSS, which is similar but cheaper, and the user interface is much nicer

  • @drakeweissman6499
    @drakeweissman6499 2 months ago

    Has this helped with making any transitions to MLE? Have you noticed companies caring?

  • @EdwardYang-rd6zi
    @EdwardYang-rd6zi 2 months ago

    Thanks, Richard! Great video.

  • @harshithabondhuchandrashek8895

    How to decrease the size of the column?

  • @TalsBadKidney
    @TalsBadKidney 2 months ago

    Julia is the light and the way

  • @pabliuxvm30
    @pabliuxvm30 2 months ago

    Welcome back!

  • @scottparrish7244
    @scottparrish7244 3 months ago

    Thank you for this. Very informative.

  • @truthruster
    @truthruster 3 months ago

    You mention getting a degree in a related area at the initial stages, but what about folks who have education and experience in a different area? For example, what if a designer or lawyer wants to get into the data field? What would your recommendations be for such people to build their knowledge and get their foot in the door?

  • @sdmasroor
    @sdmasroor 3 months ago

    Very good summary. Thank you!

  • @georgechristy11
    @georgechristy11 3 months ago

    please share your linkedin