From Query Plan to Performance: Supercharging your Apache Spark Queries using the Spark UI SQL Tab

  • Added 26 Aug 2024

Comments • 7

  • @viswanathana3759
    7 months ago

    Awesome presentation. Really useful

  • @Sathishkumar-rl7gj
    2 years ago +1

    Thanks very much! Very useful.

  • @anirvansen2941
    3 years ago +1

    Awesome presentation :)

  • @Learn2Share786
    10 months ago

    Is there a repository with real-world examples of badly written vs. well written Spark SQL?

  • @aviyehuda
    3 years ago

    Why is HashMergeJoin not mentioned in the presentation?

  • @aviyehuda
    3 years ago

    Why is a Spark query translated into multiple Spark jobs?

    • @user-mx7mc7sv2q
      2 years ago

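      Every task is a piece of work executed by an executor on the cluster. A query is analyzed and planned, and the work it triggers is submitted as one or more jobs; a single query can launch several, for example a separate job to build the broadcast side of a join. Each job is split into stages at shuffle boundaries, according to the transformations in the query. Every stage is then split into multiple tasks, one per partition, which run in parallel and pipeline the operators within the stage for efficiency.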
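
In Spark's own terminology the hierarchy is job, then stage, then task: a job is cut into stages wherever data must be shuffled, and each stage runs one task per partition. The sketch below is a toy model of that split, not Spark's scheduler. The operator names, the `WIDE` set, and the `split_into_stages` helper are all hypothetical, and it ignores details such as partial aggregation before a shuffle.

```python
from dataclasses import dataclass

# Toy model only: operator names and the WIDE set are illustrative assumptions,
# not Spark's internal classes or scheduler logic.
WIDE = {"groupBy", "join", "orderBy", "repartition"}  # operators that need a shuffle


@dataclass
class Stage:
    operators: list
    num_tasks: int  # one task per partition


def split_into_stages(operators, num_partitions):
    """Cut a linear chain of operators into stages at each shuffle boundary."""
    stages, current = [], []
    for op in operators:
        if op in WIDE and current:
            # A wide operator needs shuffled input, so the running stage ends here.
            stages.append(Stage(current, num_partitions))
            current = []
        current.append(op)
    if current:
        stages.append(Stage(current, num_partitions))
    return stages


# One hypothetical job: narrow operators are pipelined inside a stage.
job = ["scan", "filter", "groupBy", "project", "orderBy"]
stages = split_into_stages(job, num_partitions=4)
for i, s in enumerate(stages):
    print(f"stage {i}: {s.operators} -> {s.num_tasks} tasks")
```

In this model `scan` and `filter` share a stage because neither needs a shuffle, while `groupBy` and `orderBy` each start a new one. In the Spark UI, the SQL tab links each query to the jobs it launched, and the Jobs and Stages tabs show the corresponding stages and tasks.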