Neal Richardson | Bigger Data With Ease Using Apache Arrow | RStudio

  • Added on 27. 08. 2024
  • The Apache Arrow project enables data scientists using R, Python, and other languages to work with large datasets efficiently and at interactive speed. Arrow is so fast at some workflows that it seems to defy reality, or at least the limits of R's capabilities. This talk examines the unique characteristics of the Arrow project that enable it to redefine what is possible in R. The talk also highlights some of the latest developments in the arrow R package, including how you can query and manipulate multi-file datasets, and it presents strategies for speeding up workflows by up to 100x.
    About Neal:
    Currently Director of Engineering at Ursa Labs / RStudio. Previously led product and engineering at Crunch.io. Ph.D. in Political Science from the University of California, Berkeley.

Comments • 3

  • @haraldurkarlsson1147 10 months ago

    Interesting.

  • @rvprksh 3 years ago

    Great project.

  • @The1enzo2 3 years ago

    Loving the low-key shade between arrow and data.table, as per data.table's latest update news. All I can ask is to integrate both.