Missingno Python Library | Visualising Missing Values in Data Prior to Machine Learning
Added on 7 September 2021
Missing data is one of the most common issues when working with real datasets. Data can be missing for many reasons, including sensor failure, data vintage, improper data management, and human error. It can occur as single values, as multiple values within one feature, or as entire features.
It is important that missing data is identified and handled appropriately prior to further data analysis or machine learning. Many machine learning algorithms can’t handle missing data, so any row containing even a single missing value must either be deleted or have that value replaced (imputed) with a new one.
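As a rough illustration of the two options above (deleting rows versus imputing), here is a minimal pandas sketch. The column names and values are made up for the example and are not taken from the FORCE 2020 data, and median imputation is just one possible choice, not necessarily the one used in the video.

```python
import numpy as np
import pandas as pd

# Hypothetical well-log style table; names and values are illustrative only.
df = pd.DataFrame({
    "DEPTH": [1000.0, 1000.5, 1001.0, 1001.5],
    "GR": [85.2, np.nan, 90.1, 88.4],
    "RHOB": [2.45, 2.50, np.nan, 2.48],
})

# Count missing values per column to see where the gaps are.
missing_per_column = df.isna().sum()

# Option 1: delete every row that contains at least one missing value.
dropped = df.dropna()

# Option 2: impute, here by replacing each gap with the column median.
imputed = df.fillna(df.median())

print(missing_per_column)
print(len(dropped), "rows remain after dropna()")
print(imputed)
```

Dropping rows is simple but throws away the valid measurements in those rows, which is why it helps to look at where the gaps are before choosing.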
If you haven't already, make sure you subscribe to the channel: / @andymcdonald42
----
The notebook for this video can be found on my GitHub repository at: github.com/andymcdgeo/Andys_Y...
There is a written version of this video available at: towardsdatascience.com/using-...
Libraries used in this video:
pandas: pandas.pydata.org
missingno: github.com/ResidentMario/miss...
Data Used in this video:
Bormann, Peter, Aursand, Peder, Dilib, Fahad, Manral, Surrender, & Dischington, Peter. (2020). FORCE 2020 Well well log and lithofacies dataset for machine learning competition [Data set]. Zenodo. doi.org/10.5281/zenodo.4351156
Books I Recommend:
As an Amazon Associate I earn from qualifying purchases. By buying through any of the links below I will earn commission at no extra cost to you.
PYTHON FOR DATA ANALYSIS: Data Wrangling with Pandas, NumPy, and IPython
UK: amzn.to/3HNycJ9
US: amzn.to/3DL7qPv
FUNDAMENTALS OF PETROPHYSICS
UK: amzn.to/3l1PgSf
PETROPHYSICS: Theory and Practice of Measuring Reservoir Rock and Fluid Transport Properties
UK: amzn.to/30UNWZS
US: amzn.to/3DNqBbd
WELL LOGGING FOR EARTH SCIENTISTS
UK: amzn.to/3FHsbfn
US: amzn.to/3CILAuE
GEOLOGICAL INTERPRETATION OF WELL LOGS
UK: amzn.to/3l2v2HV
US: amzn.to/30UOTkU
-----
Thanks for watching. If you want to connect, you can find me at the links below:
/ andymcdonaldgeo
/ geoandymcd
/ andymcdonaldgeo
www.andymcdonald.scot/
#missingdata #petrophysics #machinelearning #geoscience #missingno #python - Science & Technology
This is very useful. Concise and clear without digressing into other topics. Thank you.
Glad it was helpful!
It was really helpful. Your explanations were also crystal clear! Thanks
Thanks Nima.
Thanks Andy for useful videos
No problem. I am glad you like them!
I just discovered this package myself while doing a project for my AI/ML cert. Very cool stuff.
It’s great. It’s one of my go-to libraries when doing EDA.
This is a cool library. I’ll check it out. Thanks!
No problem. It is a very small but powerful library
Thank you very much Andy, it's helpful
No problem.
What's the scale on the left of the dendrogram? What does it represent for this data set? Is it a percentage or the actual number of missing values? Or is it a correlation value, as in the heatmap? Kindly explain some more.
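On the dendrogram scale: as far as I understand it, missingno builds the dendrogram with scipy's hierarchical clustering on the 0/1 nullity matrix (1 where a value is missing), so the axis would be a cluster distance rather than a percentage or a correlation. Assuming the default Euclidean metric, the distance between two columns is the square root of the number of rows where one is missing and the other is not. A small sketch of that reading, with made-up columns and a hypothetical helper name:

```python
import numpy as np
import pandas as pd

# Made-up columns, for illustration only.
df = pd.DataFrame({
    "A": [1.0, np.nan, 3.0, np.nan],
    "B": [1.0, np.nan, 3.0, np.nan],  # gaps in the same rows as A
    "C": [np.nan, 2.0, 3.0, 4.0],     # gaps in different rows
})

# 1 where a value is missing, 0 where it is present.
nullity = df.isna().astype(int)

def nullity_distance(col_a, col_b):
    """Euclidean distance between two 0/1 nullity columns (assumed metric)."""
    diff = nullity[col_a].to_numpy() - nullity[col_b].to_numpy()
    return float(np.sqrt(np.sum(diff ** 2)))

print(nullity_distance("A", "B"))
print(nullity_distance("A", "C"))
```

Under this reading, columns that join at zero have identical missingness patterns, and the higher up two branches join, the more rows differ between them.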
good job Bro.
Thanks :)
Can we change the rotation to 90 degrees in msno?
I am not sure if this is possible as all of the plotting is handled by missingno.
@AndyMcDonald42 hopefully this is resolved in a couple of months
Enjoying your youtube video series. Great explanations and very grateful for your generosity sharing code and notebooks.
missingno is certainly very powerful for uncovering the promising correlations! I'm keen to use it but I keep getting an error...
It's perhaps a long shot, but I would be so grateful if you knew how to manage this error when I run msno functions? Or could suggest how I could find out...
I am a beginner at programming and Python, working on Mac.
This one is when I run msno.heatmap(df)
/Users/hb/opt/anaconda3/envs/myenv/lib/python3.9/site-packages/seaborn/matrix.py:305: UserWarning: Attempting to set identical left == right == 0 results in singular transformations; automatically expanding.
ax.set(xlim=(0, self.data.shape[1]), ylim=(0, self.data.shape[0]))
/Users/hb/opt/anaconda3/envs/myenv/lib/python3.9/site-packages/seaborn/matrix.py:305: UserWarning: Attempting to set identical bottom == top == 0 results in singular transformations; automatically expanding.
ax.set(xlim=(0, self.data.shape[1]), ylim=(0, self.data.shape[0]))
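One possible cause of the warning above (a guess, not a confirmed diagnosis): the heatmap has nothing to draw. As I understand it, msno.heatmap leaves out columns that are completely filled or completely empty, so if no column is only partly missing, an empty grid reaches seaborn and the axis limits collapse to 0. This can happen when gaps are stored as a placeholder value rather than NaN. The helper name and the -999 placeholder below are assumptions for illustration.

```python
import numpy as np
import pandas as pd

def partially_missing_columns(df):
    """Return names of columns that have some, but not all, values missing."""
    frac_missing = df.isna().mean()
    return [c for c in df.columns if 0 < frac_missing[c] < 1]

# Hypothetical data where gaps are stored as -999 instead of NaN.
df = pd.DataFrame({"GR": [85.2, -999.0, 90.1], "RHOB": [2.45, 2.50, -999.0]})

# pandas sees no missing values here, so no column qualifies.
print(partially_missing_columns(df))

# Convert the placeholder to NaN so pandas (and missingno) can see the gaps.
cleaned = df.replace(-999.0, np.nan)
print(partially_missing_columns(cleaned))
```

If the first list is empty for your own dataframe, it may be worth checking how missing values are encoded before calling msno.heatmap(df). Note that the message is a UserWarning rather than an error, so the call may still finish but show an empty plot.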