Vincent Warmerdam: The profession of solving (the wrong problem) | PyData Amsterdam 2019

  • Added 25. 07. 2024
  • This is a talk on failures of solutions: natural, artificial and "intelligent". It'll be a list of stories. Some of them will include:
    how I got an A+ for making the wrong decision on a thesis
    how a Kaggle competition got hacked
    how DeepLearning nearly ruined a recommender
    how a redefinition of a problem saved people from hunger
    how a time series problem could not be solved until it got ignored
    My goal is to show that in order to solve a data problem, it might be good to take a step back once in a while and to try to see the bigger picture.
    www.pydata.org
    PyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.
    PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.
    0:00 - Welcome
    0:10 - The Profession of Solving (the Wrong Problem)
    0:40 - Let's start with a true story
    1:00 - Introduction - Assignment for a first year course in statistics
    1:25 - Data from a gig at a local theatre
    2:23 - Statistically significant result
    3:04 - Realizing what went wrong
    5:00 - First story - Designing a recommender system for the NPO
    7:00 - Step 1: DrawPictures not DeepLearning
    9:00 - Step 2: A/A test this setup!
    9:51 - Step 3: alg(B) = entropy
    10:41 - Step 4: More Measuring: the order!
    11:37 - Step 5: Is it time for DeepLearning?
    11:55 - Step 5: not DeepLearning, but DumbThinking
    13:13 - Vincent's Lemma: Serendipity (The solution)
    16:14 - Gotta Love Heuristics: Lessons
    17:45 - Second story - 261 days: a billboard lottery
    18:59 - Mathematical formulation of the billboard problem
    20:21 - Rephrasing the billboard problem
    21:21 - Applying rephrasing to a WHO food problem
    25:08 - Third story - Kaggle's Santa Problem
    26:52 - Mathematical formulation of the problem
    27:56 - Everyone was solving the wrong problem
    29:42 - Lessons learnt
    30:55 - Last story: the condom factory
    32:26 - Solution to the condom factory problem
    33:30 - Parting words
    34:40 - Advice
    S/o to github.com/avvorstenbosch for the video timestamps!
    Want to help add timestamps to our YouTube videos to help with discoverability? Find out more here: github.com/numfocus/CZcamsVi...
  • Science and technology

Comments • 3

  • @AMetalheadsJourney 2 years ago +10

    As a senior data scientist who has also been a manager of data science & ML, I can say the following. Oftentimes, people are more caught up in doing the next 'cool' thing and elevating their own ego than in solving the problem. That is, it's cooler to use a newer technique, and they are after that accolade more than a solution. I've often had managers come to me and say, "Yeah, you could use deep learning for this," and my response is, "You do not need deep learning to do this." That would be overkill.

  • @straumits 2 years ago +3

    I'm not convinced the a+s=t trick is correct. The total number of letters t is not a constant: sending more letters increases the total t. The trick only works if other people send fewer letters when you send more, thus keeping t constant. If you assume s

  • @won20529jun 2 years ago

    What a great talk! Love the simple serendipity score