Databricks Certified Associate Developer for Apache Spark Exam Questions Dumps Analysis 2024

  • Added 11 Mar 2024
  • The Databricks Certified Associate Developer for Apache Spark certification exam assesses the understanding of the Spark DataFrame API and the ability to apply the Spark DataFrame API to complete basic data manipulation tasks within a Spark session. These tasks include selecting, renaming and manipulating columns; filtering, dropping, sorting, and aggregating rows; handling missing data; combining, reading, writing and partitioning DataFrames with schemas; and working with UDFs and Spark SQL functions. In addition, the exam will assess the basics of the Spark architecture like execution/deployment modes, the execution hierarchy, fault tolerance, garbage collection, and broadcasting. Individuals who pass this certification exam can be expected to complete basic Spark DataFrame tasks using Python or Scala.

Comments • 52

  • @balajia8376
    @balajia8376 14 days ago +2

    Question 73: option E, storesDF.na.fill(30000, "sqft"), is correct. The col function will not work in na.fill. Thanks for your great work. 🙂

  • @razi9126
    @razi9126 1 month ago +6

    I just passed with 100% today. All questions came from the video

  • @05ariba
    @05ariba 1 day ago

    Thanks!
    Passed today with 85%

  • @KumarDivyesh
    @KumarDivyesh 6 hours ago

    Thank you, I passed with 83%

  • @Ivan-zc9ds
    @Ivan-zc9ds 1 month ago +1

    Thank you for such hard work! In 81 the answer is D: foreach does not change the list, and since we use collect we return the list. Also be careful: if you use Scala and cast, you need to use StringType, not StringType().

  • @KumarDivyesh
    @KumarDivyesh 6 hours ago

    I passed with 83%, Thank you

  • @sebastiano9472
    @sebastiano9472 2 months ago +1

    Question 119: I'd have gone for option D.
    Garbage collection delays are more likely when JVM heap sizes are large because the garbage collector has to check more objects to see if they are still in use. When executors have a larger heap, the garbage collection process can take longer, which leads to more significant pauses.
    Scenario #1 has the largest executor memory (100 GB), which is likely to face longer garbage collection pauses since there's more heap space to go through.

  • @hindutva_knowledge
    @hindutva_knowledge 3 months ago +1

    I think the appropriate answer to question 118 is A, as repartition(n) uses hash partitioning, which will evenly distribute the data since no column name is specified here. And coalesce(n) will always result in uneven partitioning of the data, as it combines the data on the same executor. Option D seems incorrect, as I don't see how we can compare the efficiency of these two functions when n > current partitions. Let me know your thoughts on this.
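
A toy model may help with the repartition/coalesce question above. This is only a sketch in plain Python, not Spark's implementation: `toy_repartition` and `toy_coalesce` are hypothetical helpers, and partitions are modelled as lists of row ids. It assumes that repartition(n) without a column deals rows out round-robin after a full shuffle (rather than hashing them), and that coalesce(n) only merges whole existing partitions and never increases their number. The real coalescer also takes data locality into account, which this sketch ignores.

```python
from typing import List

Partition = List[int]


def toy_repartition(partitions: List[Partition], n: int) -> List[Partition]:
    """Model a full shuffle: deal every row round-robin into n new partitions."""
    rows = [row for part in partitions for row in part]
    out: List[Partition] = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        out[i % n].append(row)
    return out


def toy_coalesce(partitions: List[Partition], n: int) -> List[Partition]:
    """Model a shuffle-free merge: glue whole partitions together.

    Asking for more partitions than currently exist leaves the data unchanged.
    """
    current = len(partitions)
    if n >= current:
        return [list(part) for part in partitions]
    out: List[Partition] = [[] for _ in range(n)]
    for i, part in enumerate(partitions):
        # Contiguous groups of old partitions land in the same new partition.
        out[i * n // current].extend(part)
    return out


def sizes(partitions: List[Partition]) -> List[int]:
    """Number of rows in each partition."""
    return [len(part) for part in partitions]


# Four skewed input partitions holding 7, 1, 1 and 3 rows (made-up data).
skewed = [list(range(0, 7)), [7], [8], [9, 10, 11]]

even = sizes(toy_repartition(skewed, 2))    # rows are spread evenly
merged = sizes(toy_coalesce(skewed, 2))     # existing skew carries over
widened = sizes(toy_coalesce(skewed, 8))    # still only four partitions
```

In this model the repartitioned data comes out balanced while the coalesced data inherits the input skew. That suggests coalesce can produce uneven partitions, not that it always does.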

  • @danialmansoor1035
    @danialmansoor1035 3 months ago +1

    Thank you for the effort. Passed exam with 2 day prep only!

  • @gowtham-reddy0616
    @gowtham-reddy0616 2 months ago +1

    Thank you for the video! I cleared it with 86 percent! It helped me revise the concepts before the exam. I appreciate your efforts!

  • @HarshSingh-cw3pw
    @HarshSingh-cw3pw 1 month ago

    The split() function works with both col("col_nm") and "col_nm" as a str. Even the documentation says Column or str. So both C and D are correct for question 25.

  • @sebastiano9472
    @sebastiano9472 2 months ago +1

    Question 135: according to the docs, it should be E and not D. Please check.

  • @dineshwaran8174
    @dineshwaran8174 3 months ago

    Q112 says to find an error in a code block, but the code block is not given, and the options are also not in line with a "find-the-error" type question.

  • @user-ni2mi1ef2h
    @user-ni2mi1ef2h 3 months ago

    Could you please share the ILT document that you showed in this video?

  • @dineshwaran8174
    @dineshwaran8174 3 months ago

    2:44:30 - is this Python or Scala? I could not find documentation for getAs. Please help.

  • @HarshSingh-cw3pw
    @HarshSingh-cw3pw 1 month ago

    The explode() function works with both col("col_nm") and "col_nm" as a str. Even the documentation says Column or str. So both A and E are correct for question 26. Can you advise which to pick in the exam?

  • @HarshSingh-cw3pw
    @HarshSingh-cw3pw 1 month ago

    mean() can accept both col() and a str. For question 32, option A is wrong and option E is correct, because mean is an alias of avg(), as stated in the documentation provided by Databricks for the exam. So we can calculate the mean using avg as well.

    • @exot4ch
      @exot4ch 3 days ago

      just checked the practice exam provided by Databricks and the correct answer is A

  • @dineshwaran8174
    @dineshwaran8174 3 months ago +2

    @2:08:02 df.fillna() and df.na.fill() are the same, right? So both C and D seem to be correct. Could you please clarify?

    • @kiranchitla
      @kiranchitla 2 months ago

      You're right, both are correct. On the page he shows, it is clearly written in the Spark documentation.

    • @Rxneem
      @Rxneem 2 months ago

      No, it's not the same because df.fillna() is used with pandas data frames and df.na.fill() is used with sql data frames

    • @dineshwaran8174
      @dineshwaran8174 2 months ago

      @Rxneem you should check the PySpark documentation

  • @dineshwaran8174
    @dineshwaran8174 3 months ago

    122 - The format in the question should have been CSV, not JSON, I believe.

  • @Purnimareddy-vv3gg
    @Purnimareddy-vv3gg 1 month ago

    Can we expect the same set of questions for Scala as well?
    Can anyone reply ASAP?

  • @dineshwaran8174
    @dineshwaran8174 3 months ago

    Hi, This is immensely useful. When are you planning to upload the quiz video?

    • @sthithapragnakk
      @sthithapragnakk 3 months ago +1

      Once this reaches at least 500 views

    • @dineshwaran8174
      @dineshwaran8174 3 months ago

      @Atroxx393 you can just go watch the video again: pause at every question and check if you can answer correctly. You don't have to wait for the quiz video.

    • @MrLenzi1983
      @MrLenzi1983 7 days ago

      @sthithapragnakk it's 5.8k already, c'mon =)

  • @neha7502
    @neha7502 2 months ago

    Sir, please also provide the latest Databricks Data Engineer Associate dump.

  • @dineshwaran8174
    @dineshwaran8174 3 months ago +3

    Thanks for your efforts. I have passed the exam today. 52/60.

    • @sthithapragnakk
      @sthithapragnakk 3 months ago

      Congratulations 🎊🎉 and Thank you for the support. Much appreciated 🤛

    • @Atroxx393
      @Atroxx393 3 months ago

      Congratulations, can you tell me how many questions from this video were on your exam?

    • @dineshwaran8174
      @dineshwaran8174 3 months ago +1

      @Atroxx393 almost all of them. They may not be the exact questions but slight variations of the questions included in the video. It is important to understand the concept behind the questions and answers. The Spark documentation provided during the test is also helpful, but it is too small on the screen and not searchable. That makes it less usable.
      If you have seen this video once, the questions will look familiar and you should be able to finish in 40 minutes. Then take the rest of the time to validate any questions that you have marked for review, using the documentation.

    • @user-ni2mi1ef2h
      @user-ni2mi1ef2h 3 months ago

      @dineshwaran8174 What other documentation or videos did you follow?

    • @rafaelpunzel6786
      @rafaelpunzel6786 15 days ago

      Congratulations! Did you choose Python or Scala?

  • @sebastiano9472
    @sebastiano9472 2 months ago

    Question 103: I can't find the method getAs() in the PySpark documentation.

    • @jl45000019
      @jl45000019 1 month ago

      I think the answer is A because it takes a string or integer, as does "first".

  • @Saeed-tu8rg
    @Saeed-tu8rg 2 months ago +1

    Please, are these dumps in Python or Scala?

  • @kavi626
    @kavi626 1 month ago

    Is there a discount that makes this exam free of cost?

    • @sthithapragnakk
      @sthithapragnakk 1 month ago

      Unless your employer sponsors it, I don't think it's going to be free.

  • @kiranchitla
    @kiranchitla 2 months ago +1

    Your answers to Q92 and Q111 conflict with each other. In Q92, you say storeDF("Col1") won't work since it should be storeDF(col("Col1")), which is what option Q111.D is.
    If it doesn't work in Q92, it should not work in Q111. Can you explain?
    Q111.D won't work, since it should be storeDF["Column1"] instead of storeDF("Col1"), so option D will also fail.

    • @sthithapragnakk
      @sthithapragnakk 2 months ago

      You missed the Seq function in 111. When you have Seq you don't need col, but in 92 you don't have Seq; that's why you need col.

    • @kiranchitla
      @kiranchitla 2 months ago

      @sthithapragnakk
      Q111, option D: storesDF("Column1") === employeesDF("Column1") is the same as Q92, option C, i.e. storeDF("storeID") === employeesDF("storeId"). Both are the same.
      In Q92, you said option C will work. But the same option in Q111 doesn't work. Sorry, I don't get why.
      In these two options, i.e. Q111 (D) and Q92 (C), there is no Seq function. So I don't understand: if both options are the same, why does it work in one place and not in the other?
      Kindly explain.
      Please note, I understand Seq is the culprit, but I am not questioning that. My question is simple: is Q92, option C correct? If yes, that means it works in a join. And if that is correct, then how will it work in Q111?

    • @sthithapragnakk
      @sthithapragnakk 2 months ago

      @kiranchitla I think you misread question 92. That question asks which option is wrong, hence we chose C because it is wrong.

  • @gowtham-reddy0616
    @gowtham-reddy0616 2 months ago +1

    Can you please verify question 73?

  • @vikasrajput1957
    @vikasrajput1957 2 months ago

    Well, if you could tweak it a little it's good; otherwise it's illegal, man!