Wes McKinney - Leveling Up the Data Stack: Thoughts on the Last 15 Years

  • Added on July 16, 2024
  • Leveling Up the Data Stack: Thoughts on the Last 15 Years by Wes McKinney
    Visit rstats.ai/nyr to learn more.
    Abstract: In this talk, I will discuss some of my observations about data science tools and related computing infrastructure, both where we have come from and where we may be going in the coming years. I will connect these trends to different projects I’ve been involved with, such as pandas, Apache Arrow, Apache Parquet, Ibis, Substrait, and others. A particular focus will be on the themes of modularity and composability of system components. I will also touch on the rapid evolution of storage and computing hardware and how that may direct future development efforts in open source data software.
    Bio: Wes McKinney is an open source software developer focusing on analytical computing. He created the Python pandas project and is a co-creator of Apache Arrow, his current focus. He authored two editions of the book Python for Data Analysis. Wes is a member of The Apache Software Foundation and also a PMC member for Apache Parquet. He is now the CTO of Voltron Data, a new startup working on accelerated computing technologies powered by Apache Arrow.
    Twitter: @wesmckinn
    Presented at the 2023 New York R Conference (July 14, 2023)
  • Science and technology

Comments • 1

  • @derekborba
    11 months ago +3

    Combining R, Arrow, and DuckDB using dplyr syntax is the best - and S3 integration simplifies life. Interesting ADBC implementation too. Thanks!