Mock Interview for Data Engineers | Spark Optimizations | Real-time Project Challenges and Scenarios

  • Added on 8 July 2024
  • To enhance your career as a Cloud Data Engineer, check trendytech.in/?src=youtube&su... for curated courses developed by me.
    I have trained over 20,000 professionals in the field of Data Engineering in the last 5 years.
    30 INTERVIEWS IN 30 DAYS - BIG DATA INTERVIEW SERIES
    This mock interview series is launched as a community initiative under the Data Engineers Club, aimed at aiding the community's growth and development.
    A highly experienced guest interviewer, Himanshu Mishra, / himanshu-mishra-4796014b conducts an engaging interview covering all the important topics that a Data Engineer should be aware of.
    Our talented guest interviewee, Hamida Bano, / hamida-bano-793804208 answers the interview questions in a simple way with good examples.
    Links to the free SQL and Python series developed by me are given below:
    SQL Playlist - • SQL tutorial for every...
    Python Playlist - • Complete Python By Sum...
    Don't miss out - Subscribe to the channel for more such informative interviews and unlock the secrets to success in this thriving field!
    Social Media Links:
    LinkedIn - / bigdatabysumit
    Twitter - / bigdatasumit
    Instagram - / bigdatabysumit
    Student Testimonials - trendytech.in/#testimonials
    Discussed Questions: Timestamps
    1:40 Introduction
    2:21 Challenges you faced in your project
    4:40 What is your contribution to your project?
    6:20 Which file formats have you worked with in your project?
    7:53 What are wide and narrow transformations?
    9:38 Lazy evaluation in Spark?
    11:25 What is fault tolerance in Spark and MapReduce, and how does it work?
    13:32 Client mode and cluster mode in Spark?
    14:15 Broadcast joins in Spark?
    15:18 Memory management in Spark?
    18:12 In live production, if you face an out-of-memory error, what approach do you follow to debug it?
    19:51 What is data skewness?
    20:16 What is caching?
    21:38 How do you test your Spark code?
    22:17 What performance tuning techniques do you use to tune your Spark jobs?
    23:18 What is coalesce and when should we use it?
    24:54 Managed and external tables with a use case
    26:28 How do you deploy your Spark code?
    27:29 How did you schedule your workflow?
    28:14 What version control tools have you used?
    28:49 What is shuffling and why do we need to think about minimising it?
    29:50 One of the Spark jobs you've developed is experiencing slow performance. How would you go about resolving this issue?
    31:00 What transformations and actions have you performed in the current project?
    32:03 How does Spark work? Explain the Spark architecture.
    33:05 What is lineage in Spark?
    33:50 Different types of joins in Spark? A use case for any one of those joins?
    35:25 What is a Spark session and how do we initialise it?
    36:33 How do you read a Parquet file into a DataFrame?
    37:37 How can you perform filters on a DataFrame?
    39:20 How do you remove duplicates in a DataFrame?
    39:56 Consider a scenario where we want to rename a column in a DataFrame. How will you do this?
    40:40 Usage of withColumn?
    41:27 How do you remove a column from a DataFrame?
    41:50 Have you handled null values in your DataFrame?
    42:37 SQL coding question
    Tags
    #mockinterview #bigdata #career #dataengineering #data #datascience #dataanalysis #productbasedcompanies #interviewquestions #apachespark #google #interview #faang #companies #amazon #walmart #flipkart #microsoft #azure #databricks #jobs

Comments • 25

  • @chetankakkireni8870
    3 months ago +8

    She spoke about user memory, executor memory, and cache memory, which uses off-heap memory that is not managed by the garbage collector. I found that very useful.

  • @PraveenSingh-no8ol
    3 months ago +2

    Sumit Sir, kindly make a video with a person who has transitioned from a non-IT background to a Data Engineering profile. It will be really helpful.

  • @akshaythengane4302
    1 month ago

    This series is too good! Keep em coming!

  • @poojabarawkar1808
    3 months ago +1

    Thanks

  • @prannay19
    3 months ago +2

    Thanks again. I am following these closely and feel that these would be immensely helpful in cracking the interviews. Appreciate it. 👍

  • @_-_Abhinav_-_33
    3 months ago

    This interview is really very helpful. Thank you so much Sir for this entire series.

    • @sumitmittal07
      3 months ago

      Pleasure to share more such content for all my supportive followers!

  • @swapnildande4706
    3 months ago

    Many thanks, sir, for the mock interview playlist 🙏🏻

  • @zaffer2024
    3 months ago

    🙏

  • @sadiqueahmad6781
    3 months ago +3

    Insightful interview 👍

  • @karthikeyanudayakumar9553
    3 months ago +1

    Excellent mock interview 👍

  • @user-rx3vl2en5i
    3 months ago +2

    Hi sir, good morning. This was helpful to us. Please also do some AWS data engineering interviews instead of Azure.

    • @sumitmittal07
      3 months ago +2

      Noted

    • @user-rx3vl2en5i
      3 months ago +1

      Yes please, we need an explanation of the end-to-end data pipeline on the AWS side: where ETL is used, which transfer is used, and so on.

  • @pritamkabiraj7691
    1 month ago

    Hi Sumit Sir,
    I also want to appear for a mock interview. Is there a process involved, and can you help me with it?

  • @umeshpagoti1017
    3 months ago +3

    Sir, please continue the Python videos.

  • @shiprasarwada
    3 months ago +2

    Sir, please hold mock interviews for GCP data engineers.

  • @karthikeyanr1171
    3 months ago +1

    too many questions

  • @telugoons2292
    3 months ago +1

    Thanks