Data Validation Using Pyspark || ColumnPositionComparision ||

  • Added 7 Sep 2024

Comments • 11

  • @MuzicForSoul
    @MuzicForSoul 4 months ago

    Why do we have to do ColumnPositionComparision? Shouldn't the column name comparison you did earlier catch this?
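A minimal sketch of why the two checks can disagree. It assumes the earlier name comparison is set-based, which the thread suggests but does not show. The column lists and helper names are hypothetical; in PySpark the lists would come from `reference_df.columns` and `df.columns`.

```python
# Hypothetical column lists; in PySpark these would be
# reference_df.columns and df.columns (both plain Python lists of str).
reference_cols = ["id", "name", "amount"]
swapped_cols = ["id", "amount", "name"]


def names_match(reference, actual):
    """Order-insensitive check: both sides have the same set of names."""
    return set(reference) == set(actual)


def positions_match(reference, actual):
    """Order-sensitive check: the same name sits at every position."""
    return list(reference) == list(actual)


name_check = names_match(reference_cols, swapped_cols)
position_check = positions_match(reference_cols, swapped_cols)
print(name_check, position_check)  # True False
```

Under that assumption, the name check passes on swapped columns and only the position check notices the swap.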

  • @DuyTran-tx5jq
    @DuyTran-tx5jq 7 months ago +1

    Can you do an end-to-end portfolio project, please?

    • @DataSpark45
      @DataSpark45 7 months ago +2

      We don't have an Azure account as of now, bro. Once we create the account, we will do it for sure. Thank you.

    • @DuyTran-tx5jq
      @DuyTran-tx5jq 7 months ago +1

      @DataSpark45 Sure! Btw, love your content so much.

  • @vinothkannaramsingh8224
    @vinothkannaramsingh8224 7 months ago +1

    Would it be sufficient to sort both the ref and df column names alphabetically and then compare the column names?

    • @DataSpark45
      @DataSpark45 7 months ago

      Certainly. Whatever order is specified in reference_df is the correct order we expect. If we sort the DataFrames' column names alphabetically, there would be a chance of failure. Thank you.
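A small sketch of the point in this exchange: sorting both lists before comparing discards the order that reference_df is meant to define. The column names and the helper `sorted_names_match` are hypothetical, not taken from the video.

```python
def sorted_names_match(reference, actual):
    """Compare names after alphabetical sorting (position is ignored)."""
    return sorted(reference) == sorted(actual)


# Hypothetical example: the DataFrame has its first two columns swapped.
reference_cols = ["id", "name", "amount"]
df_cols = ["name", "id", "amount"]

sorted_equal = sorted_names_match(reference_cols, df_cols)
ordered_equal = reference_cols == df_cols
print(sorted_equal, ordered_equal)  # True False
```

The sorted comparison reports a match even though the order differs from the reference, so it cannot stand in for a position check.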

  • @vamshimerugu6184
    @vamshimerugu6184 4 months ago

    Sir, can you make a video on how to connect ADLS to Databricks using a service principal?

    • @DataSpark45
      @DataSpark45 4 months ago +1

      Thanks for asking, will do that one for sure.

  • @MuzicForSoul
    @MuzicForSoul 4 months ago

    Sir, can you please also show us the run failing? You are only showing the passing case. When I tested by swapping the columns in the DataFrame, it still did not fail, because the set still has them in the same order.

    • @DataSpark45
      @DataSpark45 4 months ago +1

      The set values come from the reference df, so it is always constant.
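The video's code is not reproduced in this thread, so the following is only a sketch of what a failing run could look like, assuming the check walks both column lists by index rather than building a set. The function name `column_position_mismatches` and the column names are hypothetical.

```python
def column_position_mismatches(reference_cols, actual_cols):
    """Return (position, expected, found) for every position that differs.

    A missing column on either side is reported as None.
    """
    mismatches = []
    for i in range(max(len(reference_cols), len(actual_cols))):
        expected = reference_cols[i] if i < len(reference_cols) else None
        found = actual_cols[i] if i < len(actual_cols) else None
        if expected != found:
            mismatches.append((i, expected, found))
    return mismatches


# Hypothetical lists; in PySpark pass reference_df.columns and df.columns.
reference_cols = ["id", "name", "amount"]
swapped_cols = ["id", "amount", "name"]

failing = column_position_mismatches(reference_cols, swapped_cols)
passing = column_position_mismatches(reference_cols, reference_cols)
print(failing)  # [(1, 'name', 'amount'), (2, 'amount', 'name')]
print(passing)  # []
```

A non-empty result is the failing case: each tuple names the position, the column the reference expects there, and the column actually found.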