Unstack a Million Uneven Rows to Columns in Power Query
Vložit
- čas přidán 30. 07. 2024
- Check out our newly launched M Language course ↗️ - goodly.co.in/learn-m-powerquery/
In this video, we will discuss a smart & efficient way to unstack a large number of uneven rows in separate columns using UI & M Code in Power Query
===== ONLINE COURSES =====
✔️ Mastering DAX in Power BI -
goodly.co.in/learn-dax-powerbi/
✔️ Power Query Course-
goodly.co.in/learn-power-query/
✔️ Master Excel Step by Step-
goodly.co.in/learn-excel/
✔️ Business Intelligence Dashboards-
goodly.co.in/learn-excel-dash...
===== LINKS 🔗 =====
Blog 📰 - www.goodly.co.in/blog/
Corporate Training 👨🏫 - www.goodly.co.in/training/
Need my help on a Project 💻- www.goodly.co.in/consulting/
Download File ⬇️ - goodly.co.in/unstack-uneven-r...
===== CONTACT 🌐 =====
Twitter - / chandeep2786
LinkedIn - / chandeepchhabra
Email - goodly.wordpress@gmail.com
===== WHO AM I? =====
A lot of people think that my name is Goodly, it's NOT ;)
My name is Chandeep. Goodly is my full-time venture where I share what I learn about Excel and Power BI.
Please browse around, you'd find a ton of interesting videos that I have created :) Cheers! - Věda a technologie
This is 100% EXACTLY what I was looking for, down to the blank rows as separators! Astounding as always.
the challenge with uneven columns in Real life is slightly different, you might have Name and Number as the only two fields available on a record. While your solution is pretty amazing, but if you can address this case it will be real help!!!
In any case, you need a reliable indicator for changing a data record. In Chandeeps example it was as NULL-Cell. If this is not the case on your case then the only way is to identify either it is a name or a number (no text inside) . That approach could be done by check if the datatype could be a number or not.
Great video!!
lambda alternative: VUNSTACK(v) Vector Unstack
=LAMBDA(v,LET(a,(v="")*SEQUENCE(ROWS(v)),b,FILTER(a,a),c,VSTACK(0,DROP(b,-1)),d,b-c-1,e,SEQUENCE(ROWS(b))^0+SEQUENCE(,MAX(d))-1,IF(e
- no need to input nr. of fields, no refresh, calc time instant
Even if we want to have all steps in one step without using Index and Fill Up we can do it in one step as below:
= Table.Combine(Table.Group(Source, {"Data"}, {{"GpData", each Table.FromRows({List.RemoveNulls([Data])})}}, GroupKind.Local, (x, y) => Number.From(y[Data] = null))[GpData])
Now that is impressive! And if the headers are put in the first record of the data you can simply promote the headers…
But "long" way of Chandeep is visually more simple for understanding.
And after some practice we could understand complex code like your.
I created a custom index to do my grouping which performs about the same as your code but yours is much simpler, I can see you are using the result of the logic math 0 or 1 to create a change in the column being grouped on for groupkind local to create buckets, did know you could do that.
let
Pos = List.Buffer( List.PositionOf(Source[Data],null,Occurrence.All) ),
Source = Excel.CurrentWorkbook(){[Name="LargeData"]}[Content],
Custom1 =
[
Co = List.Count(Source[Data]),
St = List.Buffer({0} & List.RemoveLastN(List.Transform(Pos, each _ +1),1) & (if Lst < Co then {Lst} else {})),
En = List.Buffer({ Pos{0}-1 } & List.Transform(List.Skip(Pos,1), each _-1) & (if Lst < Co then {Co -1} else {})),
Lst = List.Last(Pos)+1,
GetPos = List.Transform(List.Zip({St,En}), each {_{0} .. _{1}}),
Grp = List.Combine(List.Transform(List.Buffer(GetPos), each List.Repeat({_{0}},List.Count(_))))
],
Custom2 = Table.RenameColumns(
Table.Combine(
Table.Group(
Table.FromColumns({List.RemoveNulls(Source[Data]),Custom1[Grp]}),
{"Column2"},
{"Record",each Table.FromRows({[Column1]})}
)[Record]
),
Renames
)
in
Custom2
Great examples of how to transform lists into tables and vice versa! Thank you 😊.
Thank you for sharing your knowledge so generously. Super helpful for me!
Very good man... Useful! Thank you!!!
Great!! I have used a mix of your two unstack videos to solve a big problem here. Thanks a lot!!
That was really great, I'd been trying to do something similar, and getting in a bit of a mess, I was trying to use List transform( and then Table from List( , not my favorite function.
Thanks for all your hard work.
Thank you for the video - I have been able to adapt it to my particular problem - like your example - over 700K lines and has largely cleaned the data but most importantly placed the appropriate data on a single line.
Your automating column renaming process is fantastic and simple and useful in many scenarios. Pity it gets hidden at the end of this video as it deserves a video on its own.
You have a new follower! Thank you!
Superb..Hats off to you
Great video...Icwas working on this issue I had with a client's PDF Report that I impoerted into Power Query. Had to do a manual transfer. Wish this video was created sooner. Power Query is a BIG Game Changer for Data Analysis . I have gotten many jobs just from my limited (but growing) knowledge. 😂
Man, this is gold. Chandeep, I have managed to do most of these steps to an irregularly formatted PDF. I have the entire page of each PDF in a single row. Now I need to extract from each the date of an event, the event number which has a specific alphanumeric format, and also extract when there are multiple event numbers same format on same date. I'm thinking using the split functions, but getting stuck here - maybe you can consider this for future videos😅
Great and very good solution
Love the clear way you explain power query. Great teaching method backed up by very intelligent animations! Great job!
Glad you like our work Alexandru!
Super helpful as always Chandeep! Thank you for sharing!
Glad it was helpful Yami!
So love your solutions and approach. Any special prices on your power query course.
Awesome , Thank you so much.
Just an addition : Instead of creating excel table for column names we can create nested list within Power query and use it to rename columns. It will save us from an extra connection of table.
Limitation: It won't be as flexible as Excel table for new users but advance users can do it pretty easily.
Amazing 👏
Super Awesome Video❤
Perfect. Thank you for sharing.
Glad you like this Raimundo!
Love you!
Incredible Goodly !
Thanks Ismael!
Plz make video for Column Sorting (2 Column same time)
super awesome indeed! you're a nerd man...
Thank you Chandeep, they way you explain is amazing! one question how can we do it when we have two columns instead of one. The second column has repeated Headers.
Hi Chandeep... Great video as always..
Can you create a video on connecting power bi to service now and Jira.. Adding to it.. There is a complex procedure to set up a data refresh in workspace... If you can make a video on that... It will help many.. Trust me
Mera to jeewan safal ho gya ye video dekh ke apka bahut bahut abhar
This is perfect timing! I am working on a dirty data set with this exact problem right now!
Your videos are the best Chandeep!
Glad it was helpful Ted!
Great Sir
Very good technique short and sweet... If multiple columns has uneven records can we apply same logic. Please make clips multiple uneven row and more than 10 column having uneven data
Brilliant stuff Chandeep 👏
Thanks Paul!
This is another great video, though, but how would you deal with this little change in scenario? The same data set, but where the employees don’t have all 5 pieces of data, you don’t know that the ones they have are always the first 3, or 4 etc. what I mean is that some incomplete records might be (Name, City and Age) and others might be (Name, Age, Phone). Do you e what I mean? I think this is a more likely scenario. All you can be sure of is that the data will always be supplied with the fields in a specific order, if they exist. I’d love to see this. Thanks
**I see this was answered previously - apologies **
Amazing Goodly ! May a question how do the UNICODE function work in Power query ?
How the heck this guy get this Logic ???.... Its Just awesome. Thanks Chandeep
Thanks for vedio
Excellent view. Thanks for sharing. A query: If I have multiple column unstacking problem (example, data is in multiple rows with each row having a different combination of id, course, grade, university). Is it possible to unstack it and have data reorganized with columns headings id, course, grade, and university)? Thanks,
fantastic
Chandeep - the Power Query GOAT!
Hi @goodly, how come the columns get aligned so nicely? I was not expecting this. Seeing some interesting replies too that need to be studied.
Hello Goodly. I have been watching your videos and they are really helpful. I have a situation that is the inverse of what you did in this video. I have a table with ID, name, and email. What happens is that some rows have 1 email, others 2 and some have 10 emails. I want to generate a list or a table with the ID and individual emails on on top of the other so that I can submit it to a email higiene pass to eliminate hard bounces and invalid emails. Would be really cool if you could help me out. I am faily new to Power Query and sometimes it gets a bit overloaded and freezes me up. Best regards
You're a wizard sir !
Thanks Hicham!
Good trick! btw, in the step of
= Table.Group(#"Filtered Rows", {"Custom"}, { { "Data", each _[Data] } } )
why not just use the following syntax?
=Table.Group(#"Filtered Rows", {"Custom"}, { { "Data", each Table.Transpose( _[[Data]] ) } })
With this and Table.Combine, you only need one or two more steps to get the query done🫡
Thanks!
Thanks for the Tip Alex!
Hi Chandeep, what if the columns missed in the records are not at the end, but in between, For instance if age is missing in one record with all other values available.
very nice, zabar10...
Glad you like this zahoor
Such a wonderful video....But I have a query.....What if I do not have an identifier (space in between each item) as in your example...Could you help me out
Buen video!!! Pero que pasaría en el caso de que en ciertos grupos de datos faltará el número de teléfono, pero en otros faltará la edad o otro dato?
Nice
i saw this trick somewhere - you have data & record # - you group the by the record # and SUM the data value - this causes an error - in the group step manually change the List.Sum function to Text.Combine set your delimiter - split the value and you are done
Could be this one
www.google.com/url?sa=t&source=web&rct=j&url=m.youtube.com/watch%3Fv%3DjLpgt-wptH4&ved=2ahUKEwiblMfs0Nr-AhVX7LsIHUggD2AQtwJ6BAgeEAI&usg=AOvVaw1uRBemalphbDdtFVi5e3h8
at 4:40 Any idea why I keep getting Expression.Error: The column 'Data' of the table wasn't found
{{"Data", each [Data]}})
in this line below (since I have no null rows)
= Table.Group(#"Inserted Integer-Division", {"Integer-Division"}, {{"Data", each [Data]}})
Thanks
You have the data which follows an order. What if we have name & age or name & phone number. What happens then.. how could the data be organized in a proper manner?
Yes, I have the same question
One way to do that would be to detect the column's data type and then define it's name. But it becomes very difficult to solve the problem when you have to same data types. For e.g., Power Query won't understand Peter is a name and Paris is a city since both are text values.
Can u please create content on alteryx ...🙏🏻
5:20 How do I keep the Custom column? In my case I am using a Date as that custom column to group by.
Great work,
Please, can you build your qurey with us step by step rather than showing the applied steps
What if you got more than 3 nulls (2 intersect by a text row, and 1 before the next set)?
The Filled up step did not come out right.
Any ideas is appreciated. Thanks.
I have a pdf containing four pages table. every page has same headers but when i load into power query then last page columns values goes to another cells. how to get rid our of it. could you help me.
I have 1 m record on which I want to apply group by to get max value of each category and then same value should reflect against each category in separate custom column.....I tried it but it will take almost 2 hours and eventually system hanged....any suggestions pls
Thanks dear, please help me how to data export in multiple workbook from data in Power query
Hi Goodly
I have a column in which there is employee id and on a different row there is the bank name.I want to these 2 information on 2 separate columns.please help
Thks
❤
Is there a solution for unpivoting as well?
What if I want to replicate the same process but with additional columns expressing the same data?
Really good insight, however it didn’t work for me 😢 Had to trim the data manually 😢 Thankfully there were only 1,300 entries
I'm getting a Network Error when I try to download the file whether or not I subscribe.
Since I couldn't get the file, I entered data manually, but the video didn't show that there was a fifth field, so when I tried to do the Rename Columns I got an error because there was no fifth column. Luckily I remembered that Table.RenameColumns has a third option which defines how to handle missing fields. They are:
MissingField.Error (default if no third option present), MissingField.Ignore (which left out the 5th column for me), and MissingField.UseNull (which creates the column and fills it with nulls).
Hi Chandeep...great video. I was following successfully until I hit 4:49 to 4:50 after you edited the formula converting a Table to a list. I get Error for every row instead of a list despite the formula being exactly the same. I noticed you cut the video at that exact moment and continued....did you encounter a similar error or why might my Data Column give me Error?
Send me your data and query
goodly.wordpress@gmail.com
I solved my own question: The original column had to be renamed to 'Data' since that is the column being grouped. I had it named Column1. I was confused because the original column and the new column were both named 'Data'. To clear things up I would ensure your original column is renamed Data and the new column is Data2. New code:
= Table.Group(#"Renamed Columns", {"Custom"}, {{"Data2", each _[Data]}})
I also had my source data in two columns, so I used concatenate with a comma , separator and cleaned up the data using Text to Columns once it was back in Excel. To do the same, select both columns in PQ and go to Transform> Text Column> look for a function that allows to merge columns. Once that was done, I was able to follow the steps in the video, but remember to change the column names in the formula to the 'Merged Column name'
Rather than splitting the data into separate four or five columns, how can we combine each (name, contact, etc.) into a single column for each record using space or comma as separator.
i saw this trick somewhere - you have data & record # - you group the by the record and SUM the data value - this causes an error - in the group step manually change the List.Sum function to Text.Combine set your delimiter - split the value and you are done
Hi Chandeep, there is some problem with downlloading the file. Can you please verify it? Thank you very much.
Sir, when I try to load a data into power query with more than 5000 rows it only loads 33 rows. What may be the reason??
how to do if there are NO blank rows between each set of data?
Can we instead do this Without applying M language , just using the User Interface 😢? I am finding hard to learn M.
Thanks sir,
I want to transpose data like this format
Heading
A
B
C
Heading1
A
B
C
D
Heading2
A
B
C
D
Heading3
A
B
C
D
E
Now need to data below format, that is possible sir, if yes could please help
Here is not black row, this an example for data. Actually I have 1 lac row and I want to transpose in this format
Heading A B C
Heading1 A B C D
Heading2 A B C D
Heading3 A B C D E
Hi could you share the M Codes so that we dont have to type in manually
Yes, the file used in this video is available in the description, and the M code can be found there.
1:02 remove empty rows☺