Understanding Data Cleaning | Google Data Analytics Certificate

  • Added 23 Jul 2024
  • Data cleaning is essential for successful analysis. If a piece of data is entered into a spreadsheet or database incorrectly, or if data formats are inconsistent, the result is dirty data. Let's go through why and how to clean data.
    0:00 Getting Started with Data Cleansing
    3:20 Why Data Cleaning is Important
    9:05 Identify Dirty Data
    14:31 Starting the Data Cleansing Process
    20:51 Cleaning Data from Multiple Sources
    26:37 Data Cleaning Features
    34:43 Optimize the Data Cleaning Process
    48:50 Data Perspectives
    59:12 Even More Data Cleaning Techniques
    This video is part of the Google Data Analytics Certificate which teaches learners how to prepare, process, analyze, share, and act on data.
    The program, created by Google employees in the field, is designed to provide you with job-ready skills in about 6 months to start or advance your career in data analytics.
    Take the Certificate HERE: goo.gle/3YZJx1Z
    Subscribe HERE: bit.ly/SubscribeGCC
    #GrowWithGoogle #GoogleCareerCertificate #DataAnalytics
    Why earn a Google Career Certificate?
    ► No experience necessary: Learn job-ready skills, with no college degree required.
    ► Learn at your own pace: Complete the 100% online courses on your own terms.
    ► Stand out to employers: Make your resume competitive with a credential from Google.
    ► A path to in-demand jobs: Connect with top employers who are currently hiring.
  • Science & Technology

Comments • 67

  • @sustainability.s
    @sustainability.s 2 years ago +24

    Concepts are explained clearly, eloquently, and in simple language. Much appreciated; anyone interested in data analytics and related disciplines could learn a lot.

  • @inhlam5909
    @inhlam5909 2 years ago +48

    * Starting the data cleansing process: 18:57
    * Cleaning Data from multiple sources: 25:16, 25:48
    * Data cleaning features: Conditional format rules, Remove duplicates, Split, Concatenate
    * Optimize the Data Cleaning Process: 35:11, 38:27, 39:36, 40:00, 41:18
    (Functions: COUNTIF, LEN, & Conditional formatting, LEFT, RIGHT, MID, TRIM, CONCATENATE,..)
    * Data perspectives: Sorting, Filtering, Pivot Table, VLOOKUP, Plotting
    * Even more data cleaning techniques: Data mapping, Compatibility, Schema, Primary Key, Foreign Key
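For readers who want to try the listed functions outside a spreadsheet, here is a minimal sketch of rough Python stand-ins. The helper names (`trim`, `left`, `right`, `mid`, `countif`, `concatenate`) and the sample values are assumptions made for illustration; they are not taken from the video.

```python
# Rough Python stand-ins for the spreadsheet functions listed above.
# The sample names are made up for illustration.
names = ["  Ana Lopez ", "Bo Chen", "Bo Chen", "Cy  Diaz"]

def trim(text):
    # TRIM: drop leading/trailing spaces and collapse repeated inner spaces
    return " ".join(text.split(" ")).strip() if "  " not in text.strip() else " ".join(text.split())

def left(text, n):
    # LEFT: first n characters
    return text[:n]

def right(text, n):
    # RIGHT: last n characters
    return text[-n:] if n > 0 else ""

def mid(text, start, n):
    # MID: n characters beginning at a 1-based start position
    return text[start - 1:start - 1 + n]

def countif(values, target):
    # COUNTIF with an exact-match criterion
    return sum(1 for v in values if v == target)

def concatenate(*parts):
    # CONCATENATE: join the parts with no separator
    return "".join(parts)

cleaned = [trim(n) for n in names]
lengths = [len(n) for n in cleaned]  # LEN on each cleaned value
```

A count above 1 from `countif(cleaned, "Bo Chen")` flags a possible duplicate, which is the same check the video does with COUNTIF and conditional formatting.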

  • @robinpuerta
    @robinpuerta 2 years ago +8

    Excellent summary of key functions on Spreadsheets for data cleaning!

  • @kakayemi
    @kakayemi 2 years ago +6

    When you are able to clean your data, your analysis begins, and that is where it all starts: data cleaning. Cleaning data is critical to a good analysis.

  • @nanasophia6242
    @nanasophia6242 2 years ago +2

    So easy to understand. Thank you so much!

  • @ksumar
    @ksumar 9 months ago +3

    Nicely explained. Very clear vocabulary and examples. I've not really used Google Sheets but Excel is the King of tools I use. Some great tips were given, thank you!

  • @ivanklful
    @ivanklful 2 years ago +22

    I've gone through the entire course, which I completed between July and November 2021, this year, and I earned the certificate. I would say that the course is really awesome in many aspects, one of which is data cleaning jobs searching!

  • @minjinjargalsaikhan7325
    @minjinjargalsaikhan7325 5 months ago +1

    Thank you so much. I would love to see your videos more. I really needed this well explained content. Thank you. You are the best!

  • @akin242002
    @akin242002 3 years ago +6

    These videos are awesome!!!

  • @ditsaphongthep827
    @ditsaphongthep827 7 months ago +3

    Her section is the best of this course.

  • @taiwoakinosho7982
    @taiwoakinosho7982 8 months ago +3

    Interesting stuff. I don't use conditional formatting to find errors that much. I use filters and inspect the dropdown for anomalies. The dropdown contains unique values anyway.
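The filter-dropdown approach amounts to listing each distinct value in a column and scanning for ones that look out of place. A minimal sketch of the same idea in Python, assuming a made-up column of country values (the data and the "seen only once" threshold are illustrative choices, not something from the video):

```python
from collections import Counter

# Hypothetical column with a few inconsistent entries.
column = ["US", "US", "us", "Canada", "U.S.", "Canada", "", "US"]

# Count how often each distinct value appears, like a filter dropdown
# that also shows frequencies.
counts = Counter(column)

# Print most common first; rare or oddly spelled values end up at the bottom.
for value, n in counts.most_common():
    print(repr(value), n)

# Values seen only once are candidates for a closer look, not proven errors.
rare = sorted(v for v, n in counts.items() if n == 1)
```

Values that appear once are only suspects; a legitimate category can also be rare, so each one still needs a manual check.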

  • @AnneLopezlovesLife
    @AnneLopezlovesLife 2 years ago

    Done watching. Thanks for freely sharing this Google :)

  • @fuehrer_tb5597
    @fuehrer_tb5597 2 years ago +59

    Keep in mind that data cleaning is the hardest and most time-consuming phase in data analysis.

  • @beetelvj10
    @beetelvj10 1 year ago +2

    This session is easy and engaging. It could be more effective if the computer screen filled the whole video and the instructor occupied a small space in the corner. That would solve the problem for those watching on mobile, where the text sometimes becomes hazy and hard to read.

  • @sanjayrajbanshi7698
    @sanjayrajbanshi7698 2 years ago +3

    Beauty with the brain. Clear explanation!

  • @brendamg7298
    @brendamg7298 2 years ago

    Thank you so much!

  • @WidianiLarasati
    @WidianiLarasati 2 months ago

    Easy to understand even for a very beginner like me!

  • @LaCorvier
    @LaCorvier 1 year ago

    Thank you !!!!

  • @fevenasefa1865
    @fevenasefa1865 2 years ago +1

    Thank you so much, I got my certificate last month!!

  • @nicouekuete8842
    @nicouekuete8842 1 year ago

    This is good!

  • @Tanaka-Buchou
    @Tanaka-Buchou 3 years ago +5

    Thank you, Google.

  • @user-si6iz9fj1f
    @user-si6iz9fj1f 1 year ago

    Thank you

  • @Mrroy08657
    @Mrroy08657 11 months ago

    Please provide these data sets so I can access the multiple data sources for data cleaning portfolio projects.

  • @vazdabilo6983
    @vazdabilo6983 4 months ago

    How can we get the data you are cleaning so we can follow along?

  • @shreeshree4270
    @shreeshree4270 9 months ago

    Can we have a video on cleaning a database with multiple column headers, like Subject, Term 1 & 2, and Unit Tests, for the yearly results of a school?

  • @usmaniqbal1836
    @usmaniqbal1836 2 years ago

    Please share practice file..

  • @tabishkhan-qz2vz
    @tabishkhan-qz2vz 3 months ago

    Where are these Excel files? I can't find them anywhere...
    Any help is welcome.

  • @adamyagogreen
    @adamyagogreen 10 months ago

    please provide this dataset

  • @fathimak9434
    @fathimak9434 1 year ago

    How do we deal with false positives?

  • @musicbox6574
    @musicbox6574 11 months ago

    Is this included in the Google Data Analytics Certificate offered by Coursera?

    • @GoogleCareerCertificates
      @GoogleCareerCertificates  11 months ago +1

      Yep! All of our lessons on YouTube are the same as the lessons offered on Coursera.

  • @keylanoslokj1806
    @keylanoslokj1806 8 months ago

    We have an Excel file with 10 columns and 500 rows. The last 4 columns have years and sales. There are blanks there, but they are randomly distributed. We want to delete the rows that have 4 blanks in a row. How would you go about it?

    • @MrEveloff
      @MrEveloff 8 months ago +3

      Hi! Add filters to the columns using the filter button, then filter on blank cells on each of the 4 columns, you should find your empty lines in a second. Hope this helps!

    • @keylanoslokj1806
      @keylanoslokj1806 8 months ago +1

      @MrEveloff How do I isolate, select en masse, and delete only the rows where all 4 cells across the row are blank? In that whole range, rows can have anywhere from 0 to 4 blank cells, randomly distributed.

    • @MrEveloff
      @MrEveloff 8 months ago

      @keylanoslokj1806 Select the columns you want to filter, add a filter by clicking the filter button, then on each column select the blank cell to filter on empty cells. The order does not matter, as you want to select only rows with all 4 cells empty.
      There you have all the empty rows appearing.
      -> If you're using Google Sheets: select the whole lines, then right click and click "Delete the selected lines", which will delete only the lines you selected.
      -> If you're using Excel, you will need to select the filtered lines, then type Alt+; to select only the visible (filtered) lines, then right click and delete the lines.
      This is a quick solution. There are other, more complex ones, but this one seems best suited to the situation you described ;)
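The same "drop a row only when all four trailing cells are blank" rule can also be scripted. The following is a minimal sketch in plain Python rather than a spreadsheet feature; the sample rows, the `is_blank` and `drop_all_blank` helpers, and the choice to treat whitespace-only cells as blank are all assumptions for illustration.

```python
# Hypothetical rows: two ID columns followed by four year/sales columns.
rows = [
    ["A1", "North", "2019", "100", "2020", "120"],
    ["A2", "South", "", "", "", ""],        # all four blank -> dropped
    ["A3", "East", "2019", "", "", "90"],   # only partly blank -> kept
    ["A4", "West", "  ", "", "", ""],       # whitespace only -> dropped
]

def is_blank(cell):
    # Assumption: None, empty strings, and whitespace-only strings are blank.
    return cell is None or str(cell).strip() == ""

def drop_all_blank(rows, last_n=4):
    # Keep a row unless every one of its last `last_n` cells is blank.
    return [row for row in rows if not all(is_blank(c) for c in row[-last_n:])]

kept = drop_all_blank(rows)
```

Rows with one to three blanks survive, which matches the requirement in the question; the original `rows` list is left untouched so the result can be compared before anything is overwritten.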

  • @federicofiorio6996
    @federicofiorio6996 2 years ago +12

    12:50 Rows #27 and #31 are not duplicates; they have different SO numbers and different dollar amounts transferred.

    • @adamyapgoyal
      @adamyapgoyal 2 years ago +1

      Needs more upvotes!

    • @pablovelasquez9806
      @pablovelasquez9806 2 years ago +1

      If you paid more attention you would've heard that the data she was looking for at that point was "How many users you have", in that sense both accounts for Elaine, regardless of expenditure or any other variable, belong to the same user, therefore they are in fact duplicates.

    • @andilensele247
      @andilensele247 2 years ago +1

      @pablovelasquez9806 As long as the column values in those 2 rows are not exactly the same, they are not duplicates. The 2 rows are unique and you cannot simply delete one, because you would be losing some information!!

    • @tsaoh5572
      @tsaoh5572 1 year ago

      @pablovelasquez9806 As someone with an extremely generic name: just because the same ‘names’ occur in a dataset doesn’t mean they’re actually the same people. Even if they share the same address! (Could be a son named after a father.)
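The disagreement in this thread is between two different checks: rows that are identical in every column, and rows that merely share one key such as a name. A minimal sketch of both, using made-up order rows (the names, SO numbers, and amounts below are invented and are not the values shown in the video):

```python
# Hypothetical order rows: (customer name, SO number, amount).
orders = [
    ("Elaine", "SO-27", 250.0),
    ("Elaine", "SO-31", 410.0),   # same name, different order
    ("Marcus", "SO-28", 99.0),
    ("Marcus", "SO-28", 99.0),    # identical in every column
]

def exact_duplicates(rows):
    # Return rows that repeat an earlier row in every column.
    seen, dups = set(), []
    for row in rows:
        if row in seen:
            dups.append(row)
        seen.add(row)
    return dups

def distinct_on(rows, key_index):
    # Distinct values of one column; nothing is deleted from `rows`.
    return {row[key_index] for row in rows}

dups = exact_duplicates(orders)
customers = distinct_on(orders, 0)
```

Counting distinct names answers a "how many users" question without removing any order rows, although, as noted above, a shared name alone does not prove two rows belong to the same person.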

  • @user-in1gh2zb2z
    @user-in1gh2zb2z 1 year ago

    #playlist#view#all

  • @user-in1gh2zb2z
    @user-in1gh2zb2z 1 year ago

    #history#view#all

  • @stevesmith2553
    @stevesmith2553 2 years ago

    That a DBA

  • @TanmoyTheDeadBoy
    @TanmoyTheDeadBoy 9 months ago

    She's too cute to sound this serious. 😶😅🥰

  • @calmmind3160
    @calmmind3160 1 year ago +2

    Why is Google saying to fix problems manually in a spreadsheet? Such a bad practice... Give me the way to do it like a professional using 100k records.

    • @amaebarnes
      @amaebarnes 1 year ago

      That's exactly what I'm saying. What a waste of time. Just do everything manually? Wow thanks for nothing

  • @geelemo
    @geelemo 4 months ago +1

    This course is on YouTube for free?? Why am I paying on Coursera then lol

    • @GoogleCareerCertificates
      @GoogleCareerCertificates  3 months ago +3

      Hi! On our YouTube channel, we offer the videos so you can get a preview of the content before enrolling, but it is only through Coursera that you will receive your certificate for completing the program. The full certificate program on Coursera also includes assessments, readings, and hands-on labs.

  • @pewolo_nyenh
    @pewolo_nyenh 7 months ago +1

    All the errors mentioned here can be avoided using a database!

  • @YlmazDALKIRANscallion
    @YlmazDALKIRANscallion 2 years ago +1

    0:50 Who is she?

  • @DataSet
    @DataSet 1 year ago

    I'm ready for my bath 🛁

  • @mohammadmujeeb6592
    @mohammadmujeeb6592 1 year ago +3

    Is she a robot??

  • @bbayat4093
    @bbayat4093 2 years ago

    Better to say corrupt data.

  • @ubiquitous1212
    @ubiquitous1212 1 year ago +13

    Why is she not blinking her eyes...

  • @user-ne5zz4xd7b
    @user-ne5zz4xd7b 1 month ago

    Is the instructor human??

  • @yameen3448
    @yameen3448 1 year ago +1

    A very poor video on data cleaning by Google. Expected way more from them. Using a spreadsheet is just not acceptable.

  • @annanikolayan931
    @annanikolayan931 6 months ago

    dude just show, stop talking that much