Mock Interview for Data Engineers | Spark Optimizations | Real-time Project Challenges and Scenarios
- Added 8 July 2024
- To enhance your career as a Cloud Data Engineer, check trendytech.in/?src=youtube&su... for curated courses developed by me.
I have trained over 20,000 professionals in the field of Data Engineering in the last 5 years.
30 INTERVIEWS IN 30 DAYS - BIG DATA INTERVIEW SERIES
This mock interview series is launched as a community initiative under Data Engineers Club, aimed at aiding the community's growth and development.
A highly experienced guest interviewer, Himanshu Mishra ( / himanshu-mishra-4796014b ), conducts an engaging interview covering all the important topics that a Data Engineer should be aware of.
Our talented guest interviewee, Hamida Bano ( / hamida-bano-793804208 ), answers the interview questions in a simple way with good examples.
Links to the free SQL & Python series developed by me are given below -
SQL Playlist - • SQL tutorial for every...
Python Playlist - • Complete Python By Sum...
Don't miss out - Subscribe to the channel for more such informative interviews and unlock the secrets to success in this thriving field!
Social Media Links :
LinkedIn - / bigdatabysumit
Twitter - / bigdatasumit
Instagram - / bigdatabysumit
Student Testimonials - trendytech.in/#testimonials
Discussed Questions : Timestamp
1:40 Introduction
2:21 Challenges you faced in your project
4:40 What is your contribution to the project ?
6:20 File formats you have worked on in your project ?
7:53 What are wide and narrow transformations ?
9:38 Lazy evaluation in spark ?
11:25 What is fault tolerance in spark and mapreduce and how does it work ?
13:32 Client mode and Cluster mode in spark ?
14:15 What are broadcast joins in spark ?
15:18 Memory management in spark ?
18:12 In live production, if you are facing an out-of-memory error, what approach do you follow to debug it?
19:51 What is Data skewness ?
20:16 What is Caching ?
21:38 How do you test your spark code ?
22:17 What are the performance tuning techniques that you use to tune your spark job ?
23:18 What is coalesce and when should we use it ?
24:54 Managed and external tables with a use case
26:28 How do you deploy your spark code ?
27:29 How did you schedule your workflow ?
28:14 What are the version control tools you have used ?
28:49 What is shuffling and why do we need to think of minimising it ?
29:50 One of the Spark jobs you've developed is experiencing slow performance. How would you go about resolving this issue?
31:00 What are the transformations and actions you have performed in the current project ?
32:03 How does spark work ? Explain Spark Architecture ?
33:05 What is lineage in spark ?
33:50 Different types of joins in spark ? Use case on any one of those joins ?
35:25 What is a spark session and how do we initialise it ?
36:33 How to read a parquet file into a dataframe ?
37:37 How can you perform filters on a dataframe?
39:20 How to remove duplicates in a dataframe ?
39:56 Consider a scenario where we want to rename a column in a dataframe. How will you do this ?
40:40 Usage of withColumn ?
41:27 How to remove any column from a dataframe ?
41:50 Have you handled any null values in your dataframe ?
42:37 SQL Coding Question
Tags
#mockinterview #bigdata #career #dataengineering #data #datascience #dataanalysis #productbasedcompanies #interviewquestions #apachespark #google #interview #faang #companies #amazon #walmart #flipkart #microsoft #azure #databricks #jobs
She spoke about user memory, executor memory, and cache memory, which uses off-heap memory that is not handled by the garbage collector. I found that very useful.
Sumit Sir, kindly make a video on a person who has transitioned from a non-IT to a Data Engineering profile. It would be really helpful.
This series is too good! Keep em coming!
Thanks
Thanks again. I am following these closely and feel that these would be immensely helpful in cracking the interviews. Appreciate it. 👍
definitely
This interview is really very helpful. Thank you so much Sir for this entire series.
Pleasure to share more such content for all my supportive followers!
Thanks a lot, sir, for the mock interview playlist 🙏🏻
Most welcome
🙏
Insightful interview 👍
thank you
Excellent mock interview 👍
Glad you enjoyed it!
Hi sir, good morning. It was helpful to us. Please also do some AWS data engineering interviews instead of Azure.
Noted
Yes, please. We need an explanation of an end-to-end data pipeline on the AWS side: where ETL is used, which transfers are used, and so on.
Hi Sumit Sir
I also want to appear for a Mock Interview. Is there any process involved, or can you help me with the process to appear?
Sir, please continue the Python videos
yes
Sir, please hold mock interviews for GCP data engineers
sure
too many questions
Thanks