Deep Dive into the New Features of Apache Spark 3.2 and 3.3

  • Added 26. 08. 2024
  • Apache Spark has become the most widely used engine for executing data engineering, data science, and machine learning workloads on single-node machines or clusters. The number of monthly Maven downloads of Spark has rapidly increased to 20 million.
    We will talk about the higher-level features and improvements in Spark 3.2 and 3.3. The talk also dives deeper into the following features:
    + Introducing pandas API on Apache Spark to unify small data API and big data API.
    + Completing the ANSI SQL compatibility mode to simplify migration of SQL workloads.
    + Productionizing adaptive query execution to speed up Spark SQL at runtime.
    + Introducing RocksDB state store to make state processing more scalable.
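
As a rough illustration of how the features listed above are typically switched on from PySpark, here is a minimal sketch. It is not taken from the talk: the names `SPARK_FEATURE_CONF`, `build_session`, and `pandas_api_demo` are hypothetical helpers introduced for this example, and the two functions assume an environment with pyspark 3.2 or later installed.

```python
# Illustrative sketch only; helper names are hypothetical, not from the talk.
SPARK_FEATURE_CONF = {
    # ANSI SQL compatibility mode (disabled by default in Spark 3.2 and 3.3).
    "spark.sql.ansi.enabled": "true",
    # Adaptive query execution (enabled by default since Spark 3.2).
    "spark.sql.adaptive.enabled": "true",
    # RocksDB-backed state store for Structured Streaming (added in Spark 3.2).
    "spark.sql.streaming.stateStore.providerClass":
        "org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider",
}


def build_session(app_name="spark-3x-features", conf=SPARK_FEATURE_CONF):
    """Create a SparkSession with the settings above (requires pyspark >= 3.2)."""
    # Imported lazily so the configuration dict can be inspected without Spark.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


def pandas_api_demo():
    """Sum a column through the pandas API on Spark (requires pyspark >= 3.2)."""
    import pyspark.pandas as ps

    psdf = ps.DataFrame({"x": [1, 2, 3]})
    return psdf["x"].sum()
```

Calling `build_session()` and then `pandas_api_demo()` on a machine with Spark available exercises all four features at once; in practice each setting would be chosen per workload rather than enabled together.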
    Connect with us:
    Website: databricks.com
    Facebook: / databricksinc
    Twitter: / databricks
    LinkedIn: / data. .
    Instagram: / databricksinc
