Anomaly detection in time series with Python | Data Science with Marco

  • Added 20 Jul 2024
  • A hands-on lesson on detecting outliers in time series data using Python.
    Full source code: github.com/marcopeix/youtube_...
    Dataset can be found here: github.com/numenta/NAB/blob/m...
    Labels can be found here: github.com/numenta/NAB/blob/m...
    Chapters:
    Introduction - 0:00
    Get the data - 4:11
    Robust Z-score method - 9:08
    Robust Z-score method (code) - 13:12
    Isolation forest - 20:48
    Isolation forest (code) - 22:33
    Local outlier factor - 27:16
    Local outlier factor (code) - 31:21
    Thank you - 34:01
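
As a rough illustration of the isolation forest and local outlier factor chapters listed above, here is a minimal sketch using scikit-learn. It is not the code from the video: the synthetic series, the names `values` and `spike_idx`, and the `contamination=0.01` and `n_neighbors=20` settings are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

# Synthetic stand-in for a real series (assumption): 500 readings
# around 70 with three injected spikes.
rng = np.random.default_rng(0)
values = rng.normal(loc=70.0, scale=2.0, size=500)
spike_idx = [50, 200, 400]
values[spike_idx] = [110.0, 20.0, 120.0]

# scikit-learn estimators expect a 2-D array of shape (n_samples, n_features).
X = values.reshape(-1, 1)

# contamination is the assumed share of anomalies; 0.01 is illustrative only.
iso = IsolationForest(contamination=0.01, random_state=0)
iso_pred = iso.fit_predict(X)  # -1 = anomaly, 1 = normal

lof = LocalOutlierFactor(n_neighbors=20, contamination=0.01)
lof_pred = lof.fit_predict(X)  # -1 = anomaly, 1 = normal

iso_flags = np.where(iso_pred == -1)[0]
lof_flags = np.where(lof_pred == -1)[0]
print("Isolation forest flagged:", iso_flags)
print("Local outlier factor flagged:", lof_flags)
```

Both estimators return -1 for points they consider anomalous. With labeled data, those predictions can be compared against the labels instead of being inspected by eye.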
  • Science & Technology

Comments • 27

  • @user-ms4et7nv9p 6 months ago

    Hi Marco!! Thank you so much for making great videos on "Anomaly detection". Great Great work! Please keep sharing! 🙏🙏🙏🙏

  • @pabloarriagadaojeda6452 9 months ago +2

    Hey Marco!! This is the first time I've watched one of your videos, and five minutes in, I went through your entire channel to look at your content. It's AMAZING! Thank you for all your efforts to share your knowledge with the community. A hug from Chile!!

  • @nathaliegrand4018 10 months ago

    Excellent presentation. Very clear explanation. It would be great to have more info on the impact of the context, and on which of the methods is expected to work best in which context.

  • @joaovict007 4 months ago

    Very interesting content, thank you!

  • @EngMAli-vk3nz 1 year ago +4

    Thanks for this.
    We hope you will make one on multivariate time series anomaly detection.

  • @eladiomendez8226 1 year ago +1

    Great video

  • @pinoyguitartv 1 year ago

    Thanks for this 🤘

  • @user-ou5yb7uk3g 5 months ago

    Hello Marco, thank you so much for such a great video. Could you please make a video on anomaly detection for time series data using PyCaret?

  • @littlepigywigy 4 months ago

    nice and clear

  • @salafghaniedsa669 1 year ago

    awesome!

  • @hoanhvong 5 months ago

    🎉 thank you a lot

  • @code2compass 2 months ago

    Hello!! Quick question: why is the threshold 3.5? Is there a reason for it?
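
On the 3.5 question: a cutoff of 3.5 on the absolute modified Z-score is the value commonly attributed to Iglewicz and Hoaglin's recommendation. Whether the video picks it for that reason is not stated on this page. The sketch below assumes the usual formulation, where the median replaces the mean, the median absolute deviation (MAD) replaces the standard deviation, and the constant 0.6745 makes the score comparable to a standard Z-score for normally distributed data. The function name `robust_z_scores` and the sample readings are hypothetical.

```python
import numpy as np

def robust_z_scores(values):
    """Modified Z-score: 0.6745 * (x - median) / MAD."""
    values = np.asarray(values, dtype=float)
    median = np.median(values)
    # Median absolute deviation from the median.
    mad = np.median(np.abs(values - median))
    if mad == 0:
        raise ValueError("MAD is zero; the robust Z-score is undefined.")
    return 0.6745 * (values - median) / mad

# Hypothetical readings with one obvious spike.
readings = [70.1, 69.8, 70.4, 70.0, 69.9, 95.0, 70.2]
scores = robust_z_scores(readings)

# Flag points whose absolute score exceeds the conventional 3.5 cutoff.
is_anomaly = np.abs(scores) > 3.5
print(is_anomaly)
```

Because the median and the MAD barely move when a few extreme values are present, the spike does not inflate its own baseline the way it would with a mean and standard deviation.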

  • @user-qz3nx4xy8c 4 months ago

    How about random cut forest?

  • @jannoona123 8 months ago

    Hi Marco! I'm working on a project, and this has a lot of the components I need. I noticed the data specification says it is recorded every 5 minutes. Could you create a tutorial on how to retrieve a stream of live data and pass it to the algorithm in a somewhat real-time fashion? I hope this is similar to what I understood from your data collection in the video.

    • @jiretkatharpi1099 7 months ago

      Hi, I wanted to work on the same thing. Did you find anything?

  • @harrishvar7677 9 months ago

    Hey! Is it possible to identify and flag anomalies within a continuous numerical attribute?

    • @datasciencewithmarco 9 months ago

      If by continuous, you mean at a very high frequency, then yes, I don't see why not!

    • @harrishvar7677 9 months ago

      @datasciencewithmarco Thanks! If possible, could you make a video on that? It would be really helpful!

  • @isabelahorta3063 5 months ago

    Hi! Do you recommend any video on pattern-wise anomaly detection?

    • @datasciencewithmarco 5 months ago +1

      I don't know of any, but you can look at the library TOAD for anomaly detection in time series. They do pattern-wise detection, if I remember correctly.

  • @gouthamkarakavalasa4267 7 months ago +1

    Anomaly detection is unsupervised, so how did you know whether a point is an anomaly or not, even before training the model?

    • @datasciencewithmarco 7 months ago +2

      The dataset is labeled. That way, we can measure the performance of each anomaly detection method.

    • @devanshtalapa1416 4 months ago

      We got a few positive labels in cross validation

  • @ananya_kathak 11 months ago +1

    What is the accuracy?

    • @datasciencewithmarco 11 months ago +1

      Here, accuracy is really not a good idea, because there are so few anomalies. A simple baseline could achieve 99% accuracy, even though there is no "learning". That's why we use the confusion matrix here to see if we can actually identify anomalies.
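
The point about accuracy in the reply above can be checked with a small sketch. The labels here are hypothetical (1,000 points, 10 of them anomalies), and the "baseline" is a predictor that never flags anything. `accuracy_score` and `confusion_matrix` are scikit-learn functions.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical ground truth: 1 = anomaly, 0 = normal.
y_true = np.zeros(1000, dtype=int)
y_true[::100] = 1  # 10 anomalies out of 1000 points

# Baseline that never flags anything as an anomaly.
y_pred = np.zeros(1000, dtype=int)

acc = accuracy_score(y_true, y_pred)

# Rows are true classes, columns are predicted classes, in the order [0, 1].
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
tn, fp, fn, tp = cm.ravel()

print(f"accuracy = {acc:.2f}")
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
```

The baseline scores 99% accuracy while catching none of the anomalies, which shows up in the confusion matrix as zero true positives and ten false negatives.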