Apache Arrow Explained by Voltron Data's Matt Topol - Subsurface

  • Published on 4 July 2024
  • Apache Arrow is a powerful and efficient tool for data storage and processing. In this video, Matt Topol, the author of the first book on Apache Arrow, explains exactly what it is, why you should use it, and what its best use cases are. Learn when to choose Arrow over message-passing formats like Protobuf or JSON, or storage formats like Apache Parquet, Apache ORC, or CSV. Apache Arrow provides tools to optimize memory usage for better performance.
    If you’re looking for more information about data lakehouse solutions, head to Dremio’s Subsurface page. There you can find content about data lakehouses, data warehouses and data lake engines. If you would like to stay up to date with all of Subsurface’s events and content, join their meetup community at www.meetup.com/subsurface-glo....
    Connect with us!
    Twitter: bit.ly/30pcpE1
    LinkedIn: bit.ly/2PoqsDq
    Facebook: bit.ly/2BV881V
    Community Forum: bit.ly/2ELXT0W
    GitHub: bit.ly/3go4dcM
    Blog: bit.ly/2DgyR9B
    Questions?: bit.ly/30oi8tX
    Website: bit.ly/2XmtEnN
  • Science & Technology

Comments • 6

  • @multitaskprueba1 1 month ago +1

    You are a genius! Fantastic explanations! Thanks!

  • @jchidley 1 year ago +2

    Good explanation of what Arrow is and what it is used for.

  • @chrajeshdagur 4 months ago +2

    Apache Arrow is an in-memory representation of data to avoid copying for repeated serialization and deserialization.

  • @ecmiguel 1 month ago

    Great!!!

  • @rkeval1 1 year ago

    How will filters on data work in Arrow, given that it is a columnar DB?

  • @stripedfbds 1 year ago

    It's a shame, really: the explanation is good and the topic is interesting, but the sound quality is absolutely atrocious. Sometimes it's difficult to even distinguish which tech he is referencing...