Vincent Warmerdam - Keynote "Natural Intelligence is All You Need [tm]"

  • Added on 25. 07. 2024
  • In this talk I will try to show you what might happen if you allow yourself the creative freedom to rethink and reinvent common practices once in a while. As it turns out, in order to do that, natural intelligence is all you need. And we may start needing a lot of it in the near future.
    I've met a lot of authoritative people in my field who pass out advice that sounds like this:
    Working on recommenders? Collect all the data! Sessions!
    Working on text classification? That's a solved problem! Bert!
    Working with embeddings? There's a library for that already!
    Working on tabular data? XGBoost for the win! GridSearch!
    In short: "this is how you do data science, don't go and reinvent the wheel".
    If you spend 5 minutes thinking about "the invention of the wheel" though, then you may start to rethink. After all: the wheels on a bike are different from the wheels on an airplane, and both are different from the wheels of a tractor. And for Pete's sake: that's a good thing! If we hadn't reinvented those wheels, we'd be stuck with wooden horse carts.
    So ... what might happen if we take the time to rethink a few things?
    Specifically, this keynote will discuss the following topics:
    text classification
    fraud detection
    product recommenders
    active learning
    embeddings
    I hope you'll join me for some new ideas as well as some live demos.
    Bio:
    Vincent Warmerdam
    Vincent D. Warmerdam is a software developer and senior data person. He currently works at Explosion on data quality tools for developers. He’s also known for creating calmcode.io as well as a bunch of open source projects. You can check out his blog over at koaning.io to learn more about those.
    ===
    www.pydata.org
    PyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.
    PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.
    00:00 Welcome!
    00:10 Help us add time stamps or captions to this video! See the description for details.
    Want to help add timestamps to our CZcams videos to help with discoverability? Find out more here: github.com/numfocus/CZcamsVi...
  • Science and technology

Comments • 12

  • @oniricosoy 1 month ago +1

    very inspiring 😄

  • @alessandroceccarelli6889 7 months ago +6

    His sessions are ALWAYS inspiring! Congrats

  • @omarelsayed247 7 months ago +9

    you know shit is about to be good when this guy talks

    • @patrick_bateman-ty7gp 6 months ago

      Heck yeah! He's amazing! He questions the most basic aspects of data science, I love that about him. He's the one who asks in a crowd, "Why didn't you use simple linear regression? Why use this neural network for everything?!"

  • @pmbaumgartner 8 months ago +5

    Timestamps (Generated by Whisper & GPT-4):
    00:00 - Introduction to Keynote and Talk Preparation
    00:36 - Article Discussion and Smartwatch Data Set Overview
    01:16 - Statistics Course Case Study with the Data Set
    02:01 - Data Visualization and Analysis Methodology
    03:04 - Insights from Data Set and the 'Gorilla' Concept
    03:13 - Real-World Application: Recommender Systems for Used Cars
    05:04 - Shifting Strategies: Classifier Over Recommender
    06:01 - Innovative Approach: Recommender System Reversal
    07:01 - Influence of the Netflix Prize and Kaggle on Problem-Solving Approaches
    08:05 - Concept of Reinventing the Wheel in Data Science
    08:36 - New Data Set on Credit Card Fraud and Algorithmic Approaches
    10:03 - Rethinking Algorithmic Approaches and Visualization Techniques
    11:00 - Demonstration: Analyzing the Credit Card Fraud Data Set
    13:44 - Utilizing Visualization for Predictive Analysis
    14:17 - Interactive Data Exploration and Simplification
    15:18 - Comparing Different Algorithmic Approaches
    16:11 - Rethinking the Use of Random Forests in Fraud Detection
    17:10 - The Importance of Human Learning in Data Analysis
    20:30 - Transition to Word Embeddings and Conceptual Understanding
    23:01 - Advanced Techniques in Natural Language Processing
    25:06 - Exploring Phrase Embeddings for Enhanced Contextual Understanding
    27:07 - The Importance of Rethinking Traditional Approaches
    28:07 - Finding Inspiration in Unconventional Data Sets
    30:10 - Building a Classifier for Novel Data Sets
    32:12 - Rethinking Annotation and Classification Strategies
    33:43 - Innovations in Data Annotation and Model Training
    38:04 - The Optimality Trap in Data Science and Machine Learning
    40:40 - Avoiding Monoculture Thinking in Data Problem Solving
    42:43 - The Role of Doubt in Creative Problem Solving
    44:50 - Encouraging Creativity and Independent Thinking in Data Science
    45:48 - The Future of Data Science: Independence Over Tool Dependence
    46:06 - Final Thoughts and Invitation to Workshop

  • @dirknbr 8 months ago +1

    Great talk and message. I agree we need doubt and curiosity. Always ask your DS, why did you choose this model?

  • @GoingData 8 months ago +7

    I really loved this talk. Man, how could I work with you?

    • @DarjoScn 8 months ago

      You should volunteer for PyData, lots of interesting and smart people to work with :)

    • @josephbolton8092 8 months ago

      I was thinking exactly the same thing!

  • @jincui448 8 months ago +1

    Very interesting session. Any chance you can share the Jupyter Notebook for the credit card dataset?

  • @Mehrdadkh87 8 months ago

    29:00 agricultural photography