08 Working with Strings, Dates and Null

  • Published 26. 07. 2024
  • Video explains - How to use Case When in Spark? How to manipulate String data in Spark DataFrames? How to cast dates in Spark? How to extract date portions in Spark? How to work with NULL data in Spark?
    Chapters
    00:00 - Introduction
    01:08 - How to use Case When in Spark?
    04:30 - String Regex Replace
    06:00 - How to convert string to date in Spark?
    08:10 - How to add current date or timestamp in Spark?
    10:07 - How to drop NULL records in Spark?
    10:50 - How to transform NULL Columns in Spark?
    12:18 - Fix DataFrame
    14:00 - Bonus Tip
    Local PySpark Jupyter Lab setup - • 03 Data Lakehouse | Da...
    Python Basics - www.learnpython.org/
    GitHub URL for code - github.com/subhamkharwal/pysp...
    Documentation Spark Functions - spark.apache.org/docs/latest/...
    Documentation Date/Timestamp Patterns - spark.apache.org/docs/latest/...
    The series provides a step-by-step guide to learning PySpark, a popular open-source distributed computing framework that is used for big data processing.
    New video every 3 days ❤️
    #spark #pyspark #python #dataengineering

Comments • 17

  • @vijayvavilapalli1002
    @vijayvavilapalli1002 1 year ago +2

    Wonderful.. I've never seen this kind of teaching.. thank you bro!! Please add more videos.

  • @anonymous-ze5fg
    @anonymous-ze5fg 1 year ago +1

    Great content, please keep adding more videos, very helpful.

  • @marimuthukalyanasundram3151
    @marimuthukalyanasundram3151 2 months ago

    You're a very awesome guy. Your explanation is straightforward to understand. I have a few clarifications: why do we have to import the libraries for each function? Is there an option to import the main libraries and achieve the same? For example, for the date conversion you import date_format and to_date. I believe we can use import * instead.

    • @easewithdata
      @easewithdata 2 months ago +1

      Hello, thank you. Please share this with your network over LinkedIn ❤️
      And for the second part: yes, you can import whichever way you prefer. Importing only the required functions keeps the code neater.

    • @marimuthukalyanasundram3151
      @marimuthukalyanasundram3151 2 months ago

      @easewithdata, I will definitely do that. Keep up this energetic training. You have a very bright future in the IT world.

  • @passions9730
    @passions9730 1 year ago

    Good content

    • @easewithdata
      @easewithdata 1 year ago

      Thanks 👍 Please make sure to share with your network 🛜

  • @irannamented9296
    @irannamented9296 17 days ago +1

    Need to understand one thing: why are yyyy and dd not in capital letters? Is there any reason for that?

    • @easewithdata
      @easewithdata 16 days ago

      Spark follows its own datetime pattern format (it mostly resembles Unix formats), and the pattern letters are case-sensitive:
      spark.apache.org/docs/latest/sql-ref-datetime-pattern.html

  • @aryans4519
    @aryans4519 2 months ago

    Can we use na.fill to fill missing values, instead of coalesce?

    • @easewithdata
      @easewithdata 2 months ago

      coalesce is used for conditional handling of nulls (it returns the first non-null value per row). na.fill does a generic fill across the columns.

    • @aryans4519
      @aryans4519 2 months ago

      Thanks, this cleared my doubt 😀

  • @pranavganesh1855
    @pranavganesh1855 7 months ago

    Bro, what is the purpose of using coalesce here??

    • @easewithdata
      @easewithdata 7 months ago

      It is being used to transform null values. It works the same as NVL in SQL; SQL has COALESCE too.
      I know you might be confusing it with the partitioning coalesce, but this one is a column transformation that fixes null values. The partitioning one is applied at the table level.

    • @pranavganesh1855
      @pranavganesh1855 7 months ago

      @easewithdata Thank you..