ETL Architecture In-Depth - Advanced Dimensional Modelling

Sdílet
Vložit
  • čas přidán 6. 04. 2015
  • The third is a series of three Presentations on Dimensional Modelling based on the Kimball Group's 4 day course: ETL Architecture In-Depth.
    Chapter numbers and timestamps:
    1. 00:00 Intro
    2. 05:37 Hybrid Slowly Changing Dimensions
    3. 15:07 Mini Dimensions
    4. 28:03 Junk Dimensions
    5. 33:35 Bridge Tables
    6. 41:29 Ragged Hierarchies
    7. 51:38 Coverage Facts
    8. 55:10 Summary and Questions
    The other Presentations are
    Dimensional Modelling 101 : • ETL Architecture In-De...
    Intermediate Dimensional Modelling : • ETL Architecture in De...
    DWS Ltd is an Australian Software Development and Professional Services Company.
    www.dws.com.au

Komentáře • 14

  • @neelred10
    @neelred10 Před 7 měsíci

    Great presentation. For Ragged hierarchy example of salesperson and managers would be highly relatable. Like sales is attributed to a salesperson and many salesperson fall into a specific manger and then senior manager..

  • @DebabrataPatnaik
    @DebabrataPatnaik Před 8 lety +7

    Ross ... Its a pleasure watching your presentation on Dimensional Modeling.(I watched all 3 continuously) I believe going through the 3rd Edition of toolkit, though Ralph strongly opposes urges for normalization and believes in dimensional approach for modeling. In my opinion SCD4/SCD5 are definitely an alternate definition of Normalization to tackle Monster Dimensions. Similarly Junk Dimensions/Out-trigger are concept that is indeed supportive to Normalization. I feel both Bill Inmon & Ralph Kimball's approach are widely used. I am fascinated by Dimensional Modeling, I had been involved in projects with Normalization approach and would love to make foundation more solid in Dimensional Modeling. Thanks for all the insightful video tutorials, enjoyed and relished every part of the session. Thanks for providing it to the larger audience via CZcams.

  • @usmanrahat5254
    @usmanrahat5254 Před 3 lety +3

    Really nice set of presentations. Learned alot, thanks.

  • @satyak1337
    @satyak1337 Před 6 lety

    super helpful and really nice presentation.

  • @Stelios.Posantzis
    @Stelios.Posantzis Před 3 lety

    Awesome intro to DW design.

  • @fishsauce7497
    @fishsauce7497 Před 2 měsíci

    Doesn't Type 5 dimension directing towards Snowflake design of dimension?

  • @jimtonykiller
    @jimtonykiller Před 7 lety

    Hi Ross in the recursive relationship, we can filter also on the basis where parent key = child key?

    • @boganpies
      @boganpies Před 7 lety

      Do you mean, "can we *safely* filter on ...?" And by *safely*, I mean without risk of picking up duplicate rows in the fact. Yes, you can, but it doesn't make much sense. The purpose of using the ragged hierarchy bridge (in my example, the CUSTOMER_BRIDGE) was to find facts that belong to children (or grandchildren, etc) of the chosen customer(s).By filtering CHILD_CUSTOMER_KEY = PARENT_CUSTOMER_KEY, we get the subset of the bridge that includes only level-0 relations (ie. level-2 = self, children and grandghildren; level-1 = self and child; level-0 = self only).The LEVELS_FROM_PARENT attribute provides this same functionality (LEVELS_FROM_PARENT = 0) in a slightly more intuitive fashion, and it also extends to deeper levels (1, 2, etc).But even THAT is not the best solution. If you're not interested in picking up facts for children and grandchildren etc, then don't use the bridge table at all - just join the fact to the CUSTOMER dimension.

    • @ParasDoshiBlog
      @ParasDoshiBlog Před 3 lety +1

      Great series, thank you!

  • @loudravetortoise
    @loudravetortoise Před 2 lety

    Lead and lag functions can do the same thing for the ragged hierarchy

    • @videet
      @videet Před rokem +1

      they can, but adds a complexity interms of querying the data.

  • @jimtonykiller
    @jimtonykiller Před 7 lety

    Hi Ross just thinking, in your treatment-diagnosis example, during ETL load can't I group the rows to have coma separated diagnosis, which then simplifies the architecture and there is no need of any bridge table.
    So you have patient joined to treatment fact on patient key and with diagnosis appearing in treatment fact as comma separated values? We also eliminate the exercise of "constructing" possible "groups of diagnosis", to be more close to real world.
    Also with this architecture then we have no need for weighting factor.... your thoughts please.

    • @boganpies
      @boganpies Před 7 lety +3

      Hi AT. Thanks for the question. Yes, you can indeed have a column containing repeating elements--either comma separated, XML, or some other well-formed scheme for storing such data. But you've only solved one part of the problem - storage. What about querying? After all, that's why we're here.For your comma-separated example, when you want to search for cases with diagnoses including FRACTURE - TIBIA, you can't query on DIAGNOSIS = 'FRACTURE - TIBIA', you need to include wildcards. e.g. DIAGNOSIS LIKE '%FRACTURE - TIBIA%'.This opens up a new world of problems, like how do we treat strings that are a subset of other valid strings. For example, if our medical records system used particularly archaic language, we might record malaria as AGUE, but then DIAGNOSIS LIKE '%AGUE%' might well serve up results of PLAGUE, which would be undesirable.Bridge tables resolve these problems with predictable and intuitive results.