Polars is the Pandas killer / Igor Mintz (Viz.ai)

  • Added on 6 Apr 2024
  • While pandas is the de facto dataframe solution for Python, Polars competes head to head on scale, speed, and ease of use.
  • Science & Technology

Comments • 22

  • @z.r.777 3 months ago +9

    Why did they keep interrupting??

  • @moose304 3 months ago +6

    I previously used Pandas, but the speed and syntax are just so much better and more consistent in Polars (after the initial learning curve) that I only switch back to Pandas if I HAVE to (usually for GIS-related work). Also, one update since this talk was given: it was announced that Polars will also be getting GPU support via RAPIDS (the same project that brought GPU support to Pandas).

    • @IgorMintz 3 months ago +3

      You're right! I've just seen the update regarding rapids-polars-GPU a few days after the lecture. Now I need to update the slides lol

  • @gorillaglued 3 months ago +4

    Lazy processing with scan_parquet and sink_parquet is pretty good; I use it all the time. I usually have tens, or even hundreds, of GB of CSVs. I just scan through them and dump to Parquet (still bigger than I could keep in memory), then break it apart with sink/scan and DuckDB.

  • @danielthompson2561 1 month ago

    I wish it could lazy scan from a database. The lazy frame function is excellent, but for data security reasons I’m working from a secure database and not Parquet files.

  • @twentytwentyeight 3 months ago +1

    The Polars learning curve was tougher for me than Dask, but performance is consistent 👍🏾

  • @galaxia_fe 2 months ago

    Great presentation. It seems some of the audience members didn’t quite grasp what Polars is from the questions. Either way, they now have heard of it

  • @Sean_neaS 3 months ago +3

    I was trying to learn pandas just as Polars was coming out and I'm glad I switched to Polars. I was so frustrated with pandas' nonsensical quirks and inconsistencies. It didn't seem to follow any conventions I'd ever seen. Maybe there was some statistics or math standard that I'm not familiar with, but as a programmer, Polars has a beautiful API that does what I expect it to do.

  • @sweealamak628 3 months ago +1

    I'll take Polars seriously when employers ask for certification. Till that happens, I see enterprises unable to uproot themselves from pandas.

    • @ringpolitiet 3 months ago +2

      Are they asking for pandas certification now?

    • @sweealamak628 3 months ago

      @ringpolitiet No, of course not. There is no pandas cert, but there are certs in ML or data analytics that use pandas.

  • @fburton8 3 months ago

    Sectarianism aplenty!

  • @user-bt6pp1dt4w 3 months ago +2

    Video is very basic. If you have tried Polars already, don’t waste your time.

  • @mokus603 3 months ago +7

    Polars might be faster but the syntax is so bad, it's extremely uncomfortable.

    • @davidmas26694 3 months ago +8

      Omg syntax in pandas is way worse

    • @marco_gorelli 3 months ago

      It's beautiful once you get used to it - I'd suggest reading the blog post "The Expressions API in Polars is Amazing", if you search for it you'll find it

    • @ringpolitiet 3 months ago +1

      Do you have an example of something you find uncomfortable in polars that is better in pandas?

    • @patericktran 3 months ago +1

      😂 i think u did not know sql

    • @chobblegobbler6671 2 months ago

      Better than pandas, same shit as PySpark, lesser than SQL.
      Databricks, Spark SQL best.

  • @BUY_YT_VIEWS_m044 3 months ago +2

    Wow, this is the kind of content, keep me visiting youtube.