Master Data Cleaning with Power Query in Excel in 9 Minutes
Vložit
- čas přidán 23. 07. 2024
- Change your life with this data cleaning tool built into Excel that most users don't know. It's easy to use and fast to learn. No programming required.
⬇️DOWNLOAD the example file & practice data here: www.myonlinetraininghub.com/g...
🎓 LEARN MORE in my Excel course: www.myonlinetraininghub.com/e...
🔔 SUBSCRIBE if you’d like more tips and tutorials like this.
📢 Please leave me a COMMENT. I read them all!
🎯 CONNECT with me on LinkedIn: / myndatreacy
🎁 SHARE this video and spread the Excel love.
Or if you’re short of time, please click the 👍
💬 EXCEL QUESTIONS: Get help on our Excel Forum: www.myonlinetraininghub.com/e...
⏲ TIMESTAMPS
0:00 Introduction
0:33 Why Excel Users Don't Know This Tool
1:05 What The Tool Is
1:32 Example
2:19 Getting the Data
3:37 Cleaning the Data
8:00 Updating with One Click - Věda a technologie
Everyone that uses Excel should be aware of and using Power Query. I will share this video link.
Well said, Roy! Thanks for sharing this video too.
You never cease to surprise - you make it look so easy and simple! Thanks !
When you know how, it is simple 😊 that said, Power Query is easy to learn.
I've bought your power query, power pivot and dashboards course. Well worth it by far. Not only have I saved a tremendous amount of time, but set a new standard for how we use excel in my company. The courses are packaged in small, easy to digest pieces (2-5 min) that make it really easy to get through and implement right away.
Wow! That's awesome to hear and has made my week! Thank you for your feedback. 🥰
Learning and employing Power Query two years ago was a game-changer! And, your tutorials significantly elevated my knowledge and leverage of the tool to generate efficiencies and accuracy. Thank you!
Wonderful to hear!
I nowadays use PowerQuery and PowerPivot to teach coworkers the system of the ETL process and the functionality of datawarehouse (in the simplest form) as you mentioned in your tutorial. Great stuff - thank you very much indeed!
Awesome to hear!
I used this exact method based on an earlier video from you to gather and track Covid data for my company. It really does help! Thank you for these lessons.❤
Great to hear!
Thanks Mynda! PQ has been one of the best things I ever learned. I use it constantly now and it never ceases to amaze.
Amazing videos and great pace of presentation to learn. Many thanks Mynda. Your inputs are aprreciated and most valuable (and I learned something about the Sunshine Coast conditions!)
Wonderful to hear!
Mynda I personally can admit that your tutorial(s) on data analytics; specially related to Power Query have improved my NEW career in data visualization. You are the best....my SENSAI of business intelligence.
thank you so much as always,
mo
Wow, that's awesome to hear! Congratulations on your hard work and success.
Fantastic video for starting Power Query. Best I’ve seen for sure.
Wow, thanks!
This is a very useful tutorial! Thank you for this information.
Great to hear!
HOLY COW! System analyst for 25 years. This is game-changing!
It sure is! Have fun with it 😊
I had never used a Power Query before watching this video. The videos make is seem so easy - Thank you!
You are so welcome!
Very nicely explained. Thanks for sharing.
So nice of you 🙏
Absolutely, this is one tool that excel users should know. Such a powerful tool that can work miracles with data. Surprising part is that it is 13 years down the line since it was introduced and a vast majority of the people are still not aware of its existence! Hopefully your simple video will induce users to try this power packed tool.
Which tool
@@keylanoslokj1806 PowerQuery !
It has been a long time and it's incredible that everyone doesn't know how amazing this tool is.
Great video - loved the View > Column Distribution, and setting up the Group-by in the Power Query. Super helpful!
Great to hear!
Simple and clean introduction! Great video!
Glad you liked it!
Very nice video 📹 Thank you for sharing your knowledge with us
My pleasure 🙏
Great! Thank you!
Glad you liked it!
your videos are always great!!! I am going to try it soon
What if you have a dataset and you add some calculated columns (i.e sum of total income, satisfaction score etc) and you use the PQ to clean it, when then you are going to replace it with the new dataset from a second round of data collection will it automatically add the new calculated columns added in your masterfile? Thanks
Yes, it will automatically apply the same steps to the new data.
PQ is the best! Thanks Mynda
Indeed, Chris 😊
I love your videos. You are one of the best trainers I have found. I have one caution regarding the automatic "Change Type". I work with data that contains leading zeros. The auto change type believes they should be "number" columns and thus drops the leading zeros. I have made it a habit to remove the auto Change Type and control that myself.
Great suggestion for those working with numbers as text 👍
thanks for the excellent and free example of using power query. as a being an it teacher I really loved it especially grouping columns in pq editor .Thanks again to give us this opportunity to widen our minds.
You're very welcome! 😊
You are amazing! Thank you
You're so welcome!
Thank You, Power Query ATL Extract transforms and loads data with Excel & Power Bi
You are welcome 😊
Great explanation :)
Glad you liked it!
Definitely Excel is exciting with you 😘
Awesome to hear 🙏😊
Awesome 👍👍👍
Thanks so much 🤗
Great. 👏
Thanks! 😃
❤ PQ. It changed the way I use Excel
Great to hear!
I just watched your video on how pros spot amateur charts, so I snickered when you moved the legend to the top of the chart. Even pros sometimes take the easy way :)
Yep, this video is about Power Query, not reports. I didn't want to get side tracked and waste time formatting a chart 😉
Very well explained! :)
What if I have access to a PowerBI report / dashboard and I would like to make my own analysis in Excel using PowerQuery?
I was able to export the result of a filter (but not sure if it can also work unfiltered) and there were about 30k rows in the report and excel was taking forever. I think I either clicked "analyze in Excel" on the dashboard visual or something like that and it came in a form of a pivot table in excel with live connection to the database (which I do not know if I am able to copy the connection using other apps) but I'd like to transofrm it either with PQ or pandas in Python, just need to figure out how to get the raw connection to the database / query..
Would you have any idea?
Analyze in Excel will make a connection to the data model in Power BI for analysis in Excel via PivotTables. If you want to query the data, then copy the query from Power BI into Excel by editing the query in the advanced editor in Power BI and paste it into the advanced query editor in Excel.
Thank you for your interesting video. Wanted to try it for myself but unsuccessfuly.
Is there an option for xls file which is not well organised to filtering and analysis? My document is hard to filter as there are a number of merged cells with text. At the same time, the table is pretty massive and contains around 30 000 lines.
It would be great to know the better way for analysis.
.xls files are a very old format and are a bit flaky with Power Query. Better to open the file and Save As > .xlsx then use Power Query to get the file. I suspect there is some unpivoting required, so this tutorial will help: www.myonlinetraininghub.com/power-query-unpivot
Power Query is a very powerful tool.
Indeed 😊
Amazing Mynda as always. You truly do excel 🫡
Thank you so much!
M'am, I am learning new things from your videos and give new way of thinking Thanks
And i have a question that in Group By function, we select columns which we want to Group By and other columns will Automatically removed so we don't need to take extra step for remove other columns.
I just asking that am i right or wrong..
Yes, correct 😊
Hi @mynda... Have been using Power query but came a situation whereby, the web page ask for login credentials. Does this translate I picked the wrong URL path. Please explore further
I can't speak for the URL, but it does sound odd that it didn't originally ask for login credentials.
Do the same steps apply in power BI as well since it has the same process in power query? im interested when having new excel files for additional new dates and data like what is mentioned around the last part about the automated process, can this be applied same with power BI or is this only for excel power pivot?
Yes, you can do the same steps in Power BI 😊
I used VBA to do this years ago. It worked great for me, but it was difficult to teach others how to use it. Power Query is much easier to use and teach.
Indeed. Power Query is way easier to learn and use!
Very easy to follow steps, great presentation. Thanks for sharing, I am trying to combine multiple worksheets, each work sheets having different header columns. For eg,
Sheet 1 has A,B,C has columns.
Sheet 2 has B,D,E has columns,
Sheet 3 has A,CF has columns,
Sheet 4 has E,G,H has columns,
I wanted to merge all 4 sheets together A,B,C,D,E,F,G,H is this possible in Power Query? Can you please share a technique if there is any?
Thank you! Please see this video on combining files with different column headers: czcams.com/video/tpK_xklbDf0/video.html
Hi, thank you for the video!
Power query is taking forever to refresh data (close to a million rows), do you know how I can improve the performance?
Very hard to say without analysing the query. See this post with tips to speed up queries: www.myonlinetraininghub.com/excel-forum/power-query/any-way-to-speed-up-really-slow-refresh-times-in-power-query
Would it be possible to get all the employees who worked three or more weekends in the row last year ? I have a simple attendance table (Emp_ID/Date) and I have to use power query. Thanks
Probably. If you're stuck, you're welcome to post your question and sample Excel file on our forum where someone can help you further: www.myonlinetraininghub.com/excel-forum
Please i need help, how do i transform a few rows to column on power bi and it becomes as a a stand alone column with data derived from same table
Sharepoint tables is really a tough nut. If I use a sharepoint tables connector the query loads the hole document libary into storage (>100mb) befor it even starts fetching rows. I even tried filtering befor expanding. If I use sharepoint contents I can not see my table :(
Hmm, I'm not familiar with SharePoint Tables. SharePoint Lists and Libraries, but not Tables.
Is there any difference between PQ in Excel and PQ in Power BI? I overheard someone saying that it was more limited in Excel but I've never known that to be the case, at least not in terms of the transformations it can apply - not sure if it's more limited in terms of data sources it can connect to.
It has a few less data sources and you can't do incremental refresh, but otherwise it's pretty much the same.
@@MyOnlineTrainingHub Understood. Thank you!
@MyOnlineTrainingHub Also, isn't there a button 'Extract from Table' or something - I forget what is is called, but I remember seeing it in one of your web scraping videos - which exists in PQ in PowerBI, but not Excel?!
What about a webpage that has multiple pages and the query cuts off after certain lines. Also the webpage has drop downs that need to be entered to get current data? I couldnt get it to work
It depends on the structure of the web page URLs. This video explains it: czcams.com/video/VkDpwSGWPmk/video.html
Hi Minda. As always another excellent tip from you. Just one thing: I do not see where to download the file. Thank you
mee too please add raw data
Oops, sorry. it's here now: www.myonlinetraininghub.com/get-started-with-power-query
@@MyOnlineTrainingHub Thank you a lot.
@@MyOnlineTrainingHub There is no option showing to download the file, can you please share.
PQ has definitely changed how I work. But when something in the new table doesn't jibe with the previous table, then the hunt begins to correct the code or to redo the whole query. 😢
The data sets I get are for 4 states. Each state has their own analyst who created the source data, so sometimes headers are inconsistent or has extra spaces. 🙄
Once i get a partially scrubbed data set, then i need to hunt for IDs that have more than 10 unique addresses. Since sometimes the address is ST vs Street AND the requirement is no more than 10 addresses per ID, it requires me to hunt them all down visually with some added formulas.
All that to say, you got me part way there. And for that, im grateful. Is it wishful thinking to learn that i might be able to use PQ for a final report?
Glad Power Query has somewhat helped. Sounds like you need to go to the source and get them to use a standard layout. Power Query isn't going to produce your final report as it's just the tool for gathering and cleaning the data which you then load to Excel/Power BI for reporting on with PivotTables, formulas and charts.
I love going through your videos, I have learned so much from viewing them. But I have been given a task that is causing me problems. I receive csv files weekly and one column header is "Wk ending 3.3.24". this will change every week. I need to create a new column using this date and then copy it down for every row. I managed to do this but the following week when I received the new data the column heading was different, so my query errored out because it couldn't find the column named "Wk ending 3.3.24". do you have a video that would help?
Hi Andy, Please post your question and sample Excel file on our forum where someone can help you with a tailored solution: www.myonlinetraininghub.com/excel-forum
You live in Caloundra?! Hello from the Glass House Mountains! Practically neighbours!
We’re in Buderim 😉
Hello from Brisbane
Best certification for excel to get hired?
You will definitely stand out if you can save your employer time.
Yes, this is powerful. But, is it available on the Mac? If not, this is a 4th reason why some people don’t use it.
It is available on Mac with limited functionality as it's still being developed for Mac. If you do a lot of data cleaning, then it's worth switching to a PC or putting Windows on Parallels or similar on your Mac so you can make use of it.
When i paste charts and slicers in Power point by excel .charts works fine and show paste link option but when i do the same for slicer it does not work and shows not paste link and embeded option tell me how to add the slicer in Power point
I already answered this duplicate question in the Embed in PowerPoint video comments: czcams.com/video/WgpfiVGs6qc/video.html
Is there a way to stop the PQ preview refreshing each time you add/change a step? I am getting frustrated with how much waiting around I have to do when creating or modifying some queries.
I like the idea of PQ but am starting to lose patience 😢.
May be one of the transformation is causing this...but if your data size is huge. One of the tip I use is to filter the data to keep only 100 rows at the beginning and then perform all my transformation steps. Once done, I go back and delete this step where I filtered to 100 rows..this works for me ..
@@Sumanth1601 thanks very much. I will certainly try this. Is there a particular way you suggest filtering to 100 rows, please? If I can’t think of anything better, I could add an index column and filter by that.
I am usually connecting to database tables with a few million rows when I have this problem. It is a shame there isn’t a built in way of addressing the slow refresh. In Microsoft Query there is a setting to turn off the refresh while editing the query. MQ is so much less powerful than PQ, but this is the main reason I still think of MQ first for doing something simple. I would like to stop using MQ if I could because it feels so outdated! Thank you again.
You can turn off Background Data in the Query Options > Current Workbook: Data Load > Background Data.
Is it safe to assume that in order for you to be able to replace the data and update them; the files must have the same header with the same order?
Since the quality of the data I receive kinda depends on a third-party; there are times where some of the headers are not there or may be on a different order; what would you advise to do in that situation?
Ideally, yes. You can always edit the query if the source data is different to allow for those differences.
@@MyOnlineTrainingHub Well, we all wished for the ideal situation. Unfortunately in my work, rarely will the data quality be ideal.
I would like to have your opinion on the workflow of updating the data on a periodical basis; with the same analysis (power pivot) and Dashboards.
To give a context, I work as an Employee Benefit Consultant; analyzing companies' benefit utilization (insurance claim usage). To analyze them I need 2 data: Claim and Member, which I need to transform, connect them, and analyze them.
These are the perimeters:
1. Periodical Data Updates (every 3 months)
2. The Given Data Field Column may vary in its availability (sometimes naming) but the content for each data field is consistent.
My Current Workflow is:
1. Data Clean-Up and Transform (no Power Query)
2. Power Pivot both data (member and claim)
3. Do the analysis and copy the result manually from Pivot to make the graphs.
4. Repeat 1 - 3 when the new data comes.
My question would be, if I already make a ready-template and dashboard; how do I add the data? at this video you were replacing them, is there a way to just add on top of the existing connection without adding the duplicates?
or,
Transform the Updated Data to match the existing format and just replace them?
1. Use Power Query - If there are only 2 columns of data (or even if there are more), it'd be simple enough to rename the columns so they're the same as the original file the query is attached to. Make a copy of the original query file and point it to the new data file i.e. replacing the data from the original query.
2. The original query file will load the data to Power Pivot and you will have already done the analysis. Refresh All via the Data tab will get the new data and update the PivotTables.
3. Create charts directly from the PivotTables. These will update on the refresh performed at step 2. Either with Pivot Charts, or regular charts that reference the relevant parts of the PivotTables. See this video for creating regular charts from PivotTables: czcams.com/video/5vOqZBmBRos/video.html
😎
🙏😊
? Remember Excel 2.0.. 👍😎✊
Just about 😁
@@MyOnlineTrainingHub I never used VisiCalc, but often used SuperCalc and Lotus 1-2-3 before using Excel.
😊
Thanks for watching!
How could I contact you?
Please post your question and sample Excel file on our forum where someone can help you further: www.myonlinetraininghub.com/excel-forum
In alot of places power query , macros and office scripts are blocked by server policies.
That's disappointing. I can understand macros and maybe scripts, but not Power Query.
@@MyOnlineTrainingHub If your higher up there is usually a group that allows it - Businees Support Administrator level - people who make tools for ordinary admins.
How can i pass the message " we reached the end of the buffer"
Please post your question and sample Excel file on our forum where someone can help you further: www.myonlinetraininghub.com/excel-forum
It's still too much of a pain on Excel for MAC (even though I have the latest version with a 365 subscription)
Yep, it's still being developed for Mac. It's getting there, but slowly.
I want to learn vb sript I don't know coding
I'd learn Power Query before VBA. VBA is being replaced by Office Scripts.
@@MyOnlineTrainingHub I found power query in you playlists will start with it, Thanks Mam for guidance
Just don't tell your bosses you can do data cleansing a lot quicker now unless you see a chance for promotion otherwise claw back the time to relax ... or further develop PQ skills 😄
😁yep, further development is a great way to use that time.
While I am able to bring people to PQ, I still see way too much garbage in-garbage out, including business managers reporting on incorrect or simply irrelevant KPIs. Before bringing ppl into PQ a lot of work has to be done educating professionals about what is relevant data in a report, how and which KPIs to choose and measure, and last but not least maintain one version of the truth with a proper schema construct. Last month’s report becoming next months source is a wrong way of doing things and can be dangerously misleading with multiple versions of your master data floating around.
Yes, good points. Power Query can help you gather and clean data, but it can't stop you giving it the wrong data.
@@MyOnlineTrainingHub that may be reason #4 why ppl may not tackle PQ :)
sum of max height in the pivot is misleading a bit... aggregation is done in PQ, I would change the pivot column headers....
Yes, you can certainly change the column names either in Power Query or the PivotTable. This tutorial was more about getting the data than analysing it 😉