Can't create a relationship? Time for a surrogate key in Power BI!

Sdílet
Vložit
  • čas přidán 7. 07. 2024
  • This looks at a scenario of a hierarchy but the child ID repeats and we can't create a relationship on it. Adam looks at how you could use a surrogate key to get a relationship in place so you can build your Power BI report.
    📢 Become a member: guyinacu.be/membership
    *******************
    Want to take your Power BI skills to the next level? We have training courses available to help you with your journey.
    🎓 Guy in a Cube courses: guyinacu.be/courses
    *******************
    LET'S CONNECT!
    *******************
    -- / guyinacube
    -- / awsaxton
    -- / patrickdba
    -- / guyinacube
    -- / guyinacube
    -- guyinacube.com
    **Gear**
    🛠 Check out my Tools page - guyinacube.com/tools/
    #PowerBI #PowerQuery #GuyInACube
  • Věda a technologie

Komentáře • 44

  • @Naryan17
    @Naryan17 Před 2 měsíci +24

    I usually create a new column in both tables to use for the relationship. Is there a reason why a merged column like EntityID:ChildID for the relationship is worse than the presented Solution?

    • @douglascory
      @douglascory Před 2 měsíci

      Would also be interested to know

    • @yuuzhang
      @yuuzhang Před měsícem

      The compression for a integer (consecutive and Starting from 1) ist much better than for the string.
      Could reduce your dataset size and Refresh significantly (from 10gb to

    • @markwallstrom9994
      @markwallstrom9994 Před měsícem

      @@yuuzhang I am not so sure. Is it worth doing a join operation in PQ instead of just concatenate in each table? PQ is notoriously slow at joining from my experience. Besides, if you join you break the folding, whereas with a concatenated key in a SQL statement, for example, you do not. Your example of 10gb -> 1gb would not be true for a "normal" dimension where this is most common (say a table with 2m records).

  • @kevindoherty8788
    @kevindoherty8788 Před 2 měsíci +12

    But wouldn't it be nice if Semantic Model supported multi-column relationships? We can dream.

  • @j-town4408
    @j-town4408 Před 2 měsíci +27

    I'm confused. Why wouldn't you just combine the entity id and child ID into a unique value/surrogate key (i.e. 1-1, 1-2, 1-3, etc.)?

    • @mrcosmojones880
      @mrcosmojones880 Před 2 měsíci +4

      My initial thoughts too, Definitely would love to know if this isn't the optimal solution!

    • @skv4611
      @skv4611 Před 2 měsíci +1

      Reason could be indexing

    • @Tyberes
      @Tyberes Před 2 měsíci +2

      There's three answers that I can give you here.
      1. It's just good practice for all of your records to have their own unique identifier anyways.
      2. Ideally you'd have this done upstream so when the system create the record it's coming in with a unique key in both the parent and child. (This would be on the software dev team to implement)
      3. You'd need to do the compound key generating steps in every individual table that references the parent in this case, and that can be a hassle.

    • @Tyberes
      @Tyberes Před 2 měsíci +1

      But yea, just the steps of duplicating and then merging all of the relevant "key" columns on both sides of the relationship would achieve the same result.

    • @MrSparkefrostie
      @MrSparkefrostie Před 2 měsíci +2

      Two things in my mind, your key would be hyphenated and your key would be hyphenated, ok first one is just about how it looks and how you can easily say add the key on both sides so a 1 and a 2 is 3 and a 2 and a 1 is a 3 so you will need to pad the child key. So 1 becomes 1000 so now you have 1002 or 2001 in the examples above, the issue is if 999 suddenly is no longer enough. That's the one potential issue, the other issue is the text value, if you keep everything integer then you gain some efficiency, each space only has 10 available characters, for text it's 36 if there are no special characters, but I am sure it's far more with all the special characters. Apologies for the long answer and no bullet point, phone not doing line breaks

  • @peterdaniels3428
    @peterdaniels3428 Před 2 měsíci +2

    Such a clean and effective description, Adam. It gets around using a term like "non-semantic", and clearly shows the importance of retaining the actual "semantic" (or "business") composite key, too. I appreciate your teaching style!

  • @dragobrumen
    @dragobrumen Před 2 měsíci +1

    I tried this approach on a fact table with several million records. Initially, it was too slow and even failed during refreshes. To resolve this, I added an Index column for ProductID in Power Query and applied the changes. Next, I created a calculated column in the fact table using both CategoryID and ProductID to properly integrate ProductID. After setting up these changes and establishing the relationships, the process ran much more smoothly. In fact, this was part of the optimizations I implemented for a customer.

  • @opod84
    @opod84 Před 2 měsíci +2

    For measures, you can also use the TREATAS function for virtual relationships.

  • @Lebkuecher
    @Lebkuecher Před 2 měsíci

    Like several other people have mentioned, I have done stuff like this before, but I would have created a custom column to combine the Entity and Child Ids, using a dash or hyphen or something. This would ensure consistency across the two tables and avoid the Merge step, essentially being more effiecent.....because I'm what...I'm not lazy, I'm efficient, lol. Thanks!

  • @He3nt6
    @He3nt6 Před 2 měsíci

    спасибо вам , что вы есть

  • @Chiz_1
    @Chiz_1 Před 2 měsíci +1

    In similar scenarios I create a new key column in both tables that is a merge of child id and product id. That way all the information is in the actual key and easier to validate.

    • @davejl8982
      @davejl8982 Před 2 měsíci +1

      Definitely. This way, it provides greater data transparency and traceability during data troubleshooting..

  • @alvarorodriguezlasso
    @alvarorodriguezlasso Před 2 měsíci

    Great, regards from Cali-Colombia

  • @Baldur1005
    @Baldur1005 Před 2 měsíci +1

    Well somehow the ChildId and EntityId has made it to the Fact Table, so I guess Star Schema is working. Definitely what is advised here is bad practice (you have your disclaimer in the video). The unique key need to be created for dimension table - period and repopulated in DATA WAREHOUSE during ETL.

  • @Tyberes
    @Tyberes Před 2 měsíci +3

    Gosh I wish they'd let us just do joins on multiple columns at once. Skip this whole process entirely.

  • @wojciechjaniszewski9086
    @wojciechjaniszewski9086 Před 2 měsíci +2

    I didn't know, PQ merge operation supports composite key :) Every day is a school day! Thank you

  • @roshannawaz1977
    @roshannawaz1977 Před 2 měsíci

    if we refresh will the surrogate key update as well or should we update it manually?
    can ayone help

  • @Milhouse77BS
    @Milhouse77BS Před 2 měsíci +1

    Microsoft Fabric will make it easier to do surrogate key generation and integration with fact tables in the "back room" as Kimball would like.

  • @HarshithaNagaraj-lv7vc
    @HarshithaNagaraj-lv7vc Před 2 měsíci

    hi, I am facing issue in refreshing a dataset (api) in services, i am getting SSL certificate issue but in local am able to refresh please if anyone has any idea please help me out

  • @Blog-igorbelovRu
    @Blog-igorbelovRu Před měsícem

    Здравствуйте
    Подскажите пожалуйста, а как можно синхронизировать визуальный элемент в Power BI?
    К примеру я создал фигуру и поместил в нее три фильтра, затем все это сгруппировал. Затем этот элемент я помешаю на другие созданные вкладки и если я вношу изменение в эту фигура, то эти изменения автоматически распространяются на на остальные вкладки где содержится данный объект.
    --
    Hello
    Please tell me, how can I synchronize a visual element in Power BI?
    For example, I created a shape and put three filters in it, then grouped it all. Then I will add this element to other created tabs and if I make a change to this shape, then these changes are automatically propagated to the other tabs where this object is contained.

  • @akki4718
    @akki4718 Před 2 měsíci

    Q: can’t we set parameters rule for postgre sql in deployment pipeline

  • @sauravtomar341
    @sauravtomar341 Před 2 měsíci

    Anyone who reads this comment can reply ......If i am creating an power BI dashboard and publish it as app. Now my user wants to create new graphs or KPIs in the app in a blank page. Is it possible to have this capability in Power BI.

  • @Acheiropoietos
    @Acheiropoietos Před 2 měsíci

    You can do this in your sleep, can’t you? 😂

  • @paullevchuk
    @paullevchuk Před 2 měsíci

    Nothing new