Two Effective Algorithms for Time Series Forecasting

  • Added 25 Jul 2024
  • InfoQ Dev Summit Boston, a two-day conference of actionable advice from senior software developers hosted by InfoQ, will take place on June 24-25, 2024, in Boston, Massachusetts.
    Deep-dive into 20+ talks from senior software developers over 2 days with parallel breakout sessions. Clarify your immediate dev priorities and get practical advice to make development decisions easier and less risky.
    Register now: bit.ly/47tNEWv
    ----------------------------------------------------------------------------------------------------------------
    In this talk, Danny Yuan explains intuitively fast Fourier transformation and recurrent neural network. He explores how the concepts play critical roles in time series forecasting. Learn what the tools are, the key concepts associated with them, and why they are useful in time series forecasting.
    Danny Yuan is a software engineer at Uber. He’s currently working on streaming systems for Uber’s marketplace platform.
    This video was recorded at QCon.ai 2018: bit.ly/2piRtLl
    For more awesome presentations on innovator and early adopter topics, check InfoQ’s selection of talks from conferences worldwide bit.ly/2tm9loz
    Join a community of over 250,000 senior developers by signing up for InfoQ’s weekly newsletter: bit.ly/2wwKVzu
  • Science & Technology

Comments • 176

  • @sriramsrinivasan2769
    @sriramsrinivasan2769 4 years ago +2

    I like this very much. Short and packed with actionable information. Thank you!

  • @user-fy5go3rh8p
    @user-fy5go3rh8p 3 years ago +4

    Wonderful presentation, very clear, very precise. Thank you!

  • @nicoyou11
    @nicoyou11 4 years ago +35

    I learned so much in 14min! Thank you for sharing your knowledge and experience!

  • @EveryFConcept
    @EveryFConcept 5 years ago +140

    1:14 Decomposition
    3:49 FFT
    14:19 Seq2seq

  • @bilguunbyambajav6252
    @bilguunbyambajav6252 4 years ago +5

    I can't explain it like this. This guy truly explains it. Thanks for the awesome video.

  • @johnhammer8668
    @johnhammer8668 6 years ago +6

    Thanks for the talk. Mind opening.

  • @nz7467
    @nz7467 4 years ago +3

    awesome vid! thank you for posting

  • @ebendaggett704
    @ebendaggett704 4 years ago +2

    Outstanding presentation. Thank you.

  • @TheDawningEclipse
    @TheDawningEclipse 4 years ago +2

    I'm glad I actually watched. This is AMAZING

  • @JordanMiller333
    @JordanMiller333 3 years ago +57

    Using FFT for forecasting the future: that's just like repeating the past with extra steps...

    • @rushthedj306
      @rushthedj306 3 years ago +17

      On behalf of all of us who have tried to use FFTs to forecast the stock market and had a rude awakening: this is spot on.

    • @piotr780
      @piotr780 3 years ago +2

      @@rushthedj306 there is no periodicity in company prices, so I don't see the point...

    • @rushthedj306
      @rushthedj306 3 years ago +1

      @@piotr780 yes that's the point

    • @k98killer
      @k98killer 3 years ago +2

      @Aden Jett why would you post that here in a discussion thread about FFTs and misguided attempts at market prediction?

    • @aadityarajbhattarai5475
      @aadityarajbhattarai5475 3 years ago +8

      Fourier just extends the past, and nothing fits.
      Actually, no algorithm I have used fit, and people go wild if a single tick actually matches.
      I give up on predicting futures 😆

  • @HafidzJazuli
    @HafidzJazuli 6 years ago +5

    Got it, Thank you very much!
    :)

  • @siloquant
    @siloquant 4 years ago +1

    Thank you for the good talk.

  • @usbhakn9129
    @usbhakn9129 4 years ago

    thank you, very well explained

  • @k98killer
    @k98killer 3 years ago +20

    If you want your model to have real validity, you need to find some way to use cross validation, else the model will be over-fit rather than generalized. Without some measurement of expected error ranges and confidence intervals, you're just trusting that a computer is able to replicate the past and therefore can predict the future.
    For time series, you could break it into chunks where the model is fit to the first 70% of the data and the remaining 30% is used for validation to simulate what would have happened had you built the model and used it to predict things from that 70% point forward. Another option is to randomly drop out 30% in segments of contiguous data and use those data as your measurement metric.

    • @k98killer
      @k98killer a year ago +1

      @@LawrenceReitan I have not used TensorFlow, so I do not know. When I do AI/ML stuff, I tend to write everything myself.

    • @stephen7715
      @stephen7715 a year ago

      @@k98killer 💀
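The 70/30 time-ordered split described in this thread can be sketched as follows — a toy illustration in plain Python (a real workflow might use a library utility such as scikit-learn's TimeSeriesSplit; the naive "persistence" forecast here is just a stand-in baseline):

```python
# Hypothetical illustration of a time-ordered 70/30 holdout split.
# Unlike random k-fold CV, the split must respect time order to avoid
# leaking future information into the training set.

def time_series_split(series, train_frac=0.7):
    """Split an ordered series into a training prefix and a validation suffix."""
    cut = int(len(series) * train_frac)
    return series[:cut], series[cut:]

def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Toy data and a naive persistence forecast (repeat the last training value).
data = [10, 12, 11, 13, 14, 13, 15, 16, 15, 17]
train, test = time_series_split(data)
forecast = [train[-1]] * len(test)
mae = mean_absolute_error(test, forecast)
```

Any real model would replace the persistence forecast; the point is only that the validation window simulates forecasting from the 70% mark forward.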

  • @shandou5276
    @shandou5276 3 years ago

    This is very well explained!

  • @mrkzmusic
    @mrkzmusic 4 years ago +2

    love it!

  • @maiarob2
    @maiarob2 4 years ago +2

    Very good tutorial. Thank you for sharing!

  • @DB-MH11
    @DB-MH11 4 years ago

    Great lecture!

  • @VictorSilva-gq3fq
    @VictorSilva-gq3fq 4 years ago +1

    Great video!!

  • @Robson-dh3un
    @Robson-dh3un 3 years ago +3

    Perfect presentation, sir! I'm very interested in learning more. Can you point to literature or code for what you described at 4:21, the outages? How can the problem mentioned at 4:21 be compensated for? I'm trying to figure out how to code it.
    With all respect,
    Robson

  • @beibit-ds
    @beibit-ds 4 years ago +8

    If you can't explain it in simple words, you didn't understand it. This guy nails it so perfectly that even my kid would get it.

  • @jflow5601
    @jflow5601 3 years ago

    Excellent presentation.

  • @mingc3698
    @mingc3698 6 years ago +1

    very nice!

  • @pinakinchaudhari95
    @pinakinchaudhari95 2 years ago

    This is an extremely good example; however, I would like to see one with more irregularities, say a 1-2 year periodicity where we may not have sufficient data, etc.
    Actually, I came across such a problem recently. The time series decomposition had very large values in the error part.

  • @andreashadjiantonis2596
    @andreashadjiantonis2596 4 years ago +1

    Can someone provide the name of the paper from which the very last bit was taken? (The prediction with the encoder-decoder NN)

  • @YazminAbat
    @YazminAbat 3 years ago +1

    This is true Gold!! thx :)))

  • @faridabubakr5412
    @faridabubakr5412 4 years ago +1

    excellent!

  • @ipelengkhule9263
    @ipelengkhule9263 4 years ago

    I recommend this.

  • @tanweermahdihasan4119
    @tanweermahdihasan4119 a year ago +1

    How did you provide that error feedback in the forecast? In order to calculate the error, you need the time series over the future date range. The error feedback will help you get a better decomposition, I agree. But if you have the future time series itself, then what are you forecasting?

  • @nonamenoname2618
    @nonamenoname2618 3 years ago +18

    Would be great to have some code examples for FFT or seq2seq. Much appreciated, if someone can provide them!

    • @Eta_Carinae__
      @Eta_Carinae__ a year ago

      Not code, but Reducible has an excellent video breaking down FFT in detail.
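For the code request above, here is a minimal sketch of FFT-style forecasting in the spirit of the talk — not the speaker's actual code. It keeps the strongest frequency components and evaluates their sum past the observed range, which (as other comments note) simply extends the series periodically. The helper names `dft` and `top_k_forecast` are mine; a real implementation would use `numpy.fft`:

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform (O(n^2)); numpy.fft.fft is the fast path."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def top_k_forecast(x, k, horizon):
    """Keep the k strongest frequency components and extend the series by
    evaluating their sum past the observed range (periodic extension)."""
    n = len(x)
    coeffs = dft(x)
    # Rank frequency bins by coefficient magnitude and keep the strongest k.
    order = sorted(range(n), key=lambda i: -abs(coeffs[i]))[:k]
    def value(t):
        s = sum(coeffs[i] * cmath.exp(2j * cmath.pi * i * t / n) for i in order)
        return (s / n).real
    return [value(t) for t in range(n, n + horizon)]

# Toy periodic series: the "forecast" just repeats the dominant cycle.
series = [0.0, 1.0, 0.0, -1.0] * 4        # period-4 sine-like wave
pred = top_k_forecast(series, k=3, horizon=4)
```

On non-periodic data this periodic extension is exactly the failure mode discussed elsewhere in the thread, so it is best treated as a decomposition/denoising tool, not a standalone forecaster.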

  • @ksrajavel
    @ksrajavel a year ago

    Amazing!

  • @mahadmohamed2748
    @mahadmohamed2748 2 years ago

    Couldn't a wavelet transform be used instead of an FFT to catch high-frequency signals?

  • @azazahmed1842
    @azazahmed1842 2 years ago

    Is there a playlist for all related videos?

  • @dadisuperman3472
    @dadisuperman3472 4 years ago

    Wow!

  • @adamdude
    @adamdude 4 years ago +9

    So in that first example, is there any forecasting actually being done at all? Because it seems like he already has the real data and is just doing an FFT on it and filtering out the high frequency components. Where is the past data and where is the future data? It seems to be all past data. Also, when you are able to take the error between the real data and predicted data, then it's no longer forecasting because you have the real data right? So how can you use the described error technique if, in a real forecasting example, you wouldn't have the real data yet so you wouldn't know the error until later? I think this first algorithm isn't explained in a way where we can use it easily. There's a lot of missing explanation around it.

  • @ahsamv1992
    @ahsamv1992 3 years ago

    If you master this you'll have the edge

  • @bhutanisam
    @bhutanisam 4 years ago +1

    Please help with predicting time cycles in the STOCK MARKET.

  • @ryanptt6834
    @ryanptt6834 4 years ago +4

    5:14
    Is the bottom line the error between the red and black curves?
    The error seems to vary over time, so why does the bottom line look almost horizontal?

    • @ztepler
      @ztepler 4 years ago

      Maybe these are the errors after removing high-frequency noise?
      +1 to the question

    • @Hexanitrobenzene
      @Hexanitrobenzene 4 years ago

      Hm, seems like he applied a threshold to his errors, i.e., error0 = fitted - original; thresholded_error = 0 when abs(error0) < thr, else error0
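The thresholding this comment speculates about can be written out in a few lines. This is only an illustration of the commenter's guess, not anything confirmed by the talk; the function name and threshold value are hypothetical:

```python
def threshold_errors(fitted, original, thr):
    """Zero out residuals whose magnitude is below thr, keep the rest.
    Illustrates the thresholding speculated about above."""
    errors = [f - o for f, o in zip(fitted, original)]
    return [0.0 if abs(e) < thr else e for e in errors]

# Small residuals vanish; the one large residual survives.
res = threshold_errors([1.0, 2.5, 3.0], [1.1, 2.0, 3.0], thr=0.2)
```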

  • @prateekjain3813
    @prateekjain3813 4 years ago +6

    Can we decompose the time series using Seq2Seq?

    • @ORIGASOUP
      @ORIGASOUP 4 years ago +1

      I think it cannot.

    • @ransu9930
      @ransu9930 3 years ago

      Not in the additive way you saw with the FFT. But if you implement attention in the encoder, you could interpret the attention weights similarly to autoregression weights.

  • @MrAlket1999
    @MrAlket1999 4 years ago +2

    When you say "predict for each component of the time series decomposition and then you recombine the results" how exactly do you recombine them?

    • @zhangshaojie9790
      @zhangshaojie9790 4 years ago +1

      Same way they decompose them

    • @MrAlket1999
      @MrAlket1999 4 years ago

      @@zhangshaojie9790 After a bit of reasoning I reached the same conclusion. The method works very well in particular for reducing the maximum forecasting error in a time series. I found that Facebook Prophet does also a good job in reducing the maximum error.

  • @tamirgt25
    @tamirgt25 4 years ago +1

    You deserve the like bro

  • @bhumikalamba186
    @bhumikalamba186 5 years ago +2

    Does anybody understand the part between 13:20-13:58? I don't really understand how the encoder-decoder thing works. How exactly can the historical data in the encoder be used by the decoder?

    • @fupopanda
      @fupopanda 5 years ago

      This lecture series on recurrent neural nets by Andrew Ng should help. It explains encoder and decoder, and many other aspects of recurrent neural net architectures. The lectures assume that you already understand the basics of neural nets (a standard feed-forward neural network).
      czcams.com/play/PLkDaE6sCZn6F6wUI9tvS_Gw1vaFAx6rd6.html

  • @axe863
    @axe863 2 years ago +2

    I feel like this is way too simplistic for modeling financial time series data. Under extreme financial stress and boom regimes, memory is both very short-lived and extremely long-lived, i.e., a function of subsequence proximity to said regimes (stress or boom).

  • @wtan1851
    @wtan1851 2 years ago +1

    The question remains: Why can't it beat the stock market?

  • @Ruhgtfo
    @Ruhgtfo 3 years ago

    How about transformer?

  • @edansw
    @edansw 4 years ago +8

    so the solution is deep learning again

    • @acidtears
      @acidtears 4 years ago +1

      It always will be. There is nothing more ideal than an algorithm that can teach itself, through trial and error, what it's supposed to do. The more precisely you apply the algorithm, the better the results, e.g. using seq2seq for language comprehension.

    • @claudiusandika5366
      @claudiusandika5366 3 years ago

      ikr lol

  • @alexanderoneill6160
    @alexanderoneill6160 3 years ago +2

    It might work somewhat in a sideways market, but otherwise time series data is non-periodic.

    • @paturch7201
      @paturch7201 a year ago

      Yeah, you are right. This is because trending markets are often manipulated by market makers, and such trends usually fall outside a 2-standard-deviation move in price. I am glad that he talked about it. His idea might work in a range-bound market but not in a trending market.

  • @hanst7218
    @hanst7218 a year ago

    Linear regression

  • @lincolnguo9562
    @lincolnguo9562 4 years ago +2

    Well, that's funny. Almost everything advanced (or seemingly advanced) gets labeled 'deep learning'. In my opinion, this is just a state space model or a hidden Markov model, isn't it?

    • @josephgeorge8054
      @josephgeorge8054 3 years ago

      No, Markov decision processes are pretty simplistic and don't utilize all the sequence data to calculate the probabilities.

  • @arthurzhou6069
    @arthurzhou6069 4 years ago

    I finally found a few examples that actually work, but they're not much use in practice.

  • @ewg6200
    @ewg6200 a year ago

    THESE differences, not THIS differences.

  • @posthocprior
    @posthocprior a year ago

    There appears to be a contradiction in the lecture. One advantage of FFT, he says, is that recursion can be used to fit errors into the model. Then, he says it’s difficult to incorporate “new curves” with FFT. So, why isn’t recursive error fitting using FFT a “new curve” - and what exactly is the definition of it?

  • @HK-fq6vh
    @HK-fq6vh 3 years ago

    I'm not understanding anything; I'm new to this field.

  • @teebone2157
    @teebone2157 3 years ago

    Sucks being new to this stuff lol

  • @smsm314
    @smsm314 5 years ago

    Good evening, Professor.
    Please, sir: if we have the series Yt, then to study the stationarity of
    this series, can we do the following decomposition (or filtering):
    Yt = F(t) + Ut, such that F(t) is a continuous function of the
    trend (linear or nonlinear)? And if we find that the series Ut is stationary,
    does it imply that Yt is stationary, and is the converse true?
    B.w.

  • @Mario-gq2xb
    @Mario-gq2xb 4 years ago +7

    These methods are interesting; however, they overcomplicate the forecasting process. A simple SARIMA model would do the trick, maybe even a Holt-Winters seasonal model. If you want to utilize Fourier terms, a dynamic harmonic regression or sinusoidal regression might have been better.

    • @CliRousWriterz
      @CliRousWriterz 4 years ago

      That decomposition is called multiplicative classical time series analysis; it's pretty simple, not too complicated.
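The Holt-Winters model suggested above can be sketched minimally in pure Python. This is a toy implementation of additive triple exponential smoothing, not production code; the smoothing parameters alpha, beta, gamma are hypothetical defaults that would normally be tuned (or a library such as statsmodels used instead):

```python
def holt_winters_additive(x, season_len, alpha=0.3, beta=0.1, gamma=0.2, horizon=4):
    """Minimal additive Holt-Winters (triple exponential smoothing).
    Smoothing parameters here are illustrative, not tuned."""
    # Initialize level, trend, and seasonal components from the first two seasons.
    level = sum(x[:season_len]) / season_len
    trend = (sum(x[season_len:2 * season_len]) - sum(x[:season_len])) / season_len ** 2
    seasonal = [x[i] - level for i in range(season_len)]
    for t in range(season_len, len(x)):
        s = seasonal[t % season_len]          # seasonal estimate from one season ago
        last_level = level
        level = alpha * (x[t] - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - last_level) + (1 - beta) * trend
        seasonal[t % season_len] = gamma * (x[t] - level) + (1 - gamma) * s
    # Forecast = extrapolated level/trend plus the matching seasonal component.
    return [level + (h + 1) * trend + seasonal[(len(x) + h) % season_len]
            for h in range(horizon)]

# Toy series with period-4 seasonality and a mild upward drift.
data = [10, 14, 8, 12, 11, 15, 9, 13, 12, 16, 10, 14]
fc = holt_winters_additive(data, season_len=4)
```

On this toy series the forecast preserves the seasonal shape (second position high, third position low) while continuing the drift.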

  • @fransmulder9326
    @fransmulder9326 5 years ago +26

    I am sorry, my friend, but using an FFT for forecasting is methodological nonsense.
    The implicit assumption of an FFT is that
    the time series is periodic.
    Why would it be?

    • @emiliod90
      @emiliod90 5 years ago +4

      Maybe he is working with data that they can reasonably infer is periodic or has a seasonal trend, unlike perhaps FOREX or stocks? I think he states when to use it @6:02

    • @ColocasiaCorm
      @ColocasiaCorm 4 years ago +3

      Why wouldn’t it be?

    • @MoreFoodNowPlease
      @MoreFoodNowPlease 4 years ago +2

      Yes, it predicts an infinite repetition of the input by definition. That's why you have to remove the mean and trend before applying it. Better to use this technique to remove noise, and then forecast by other means.

    • @bhisal
      @bhisal 4 years ago

      Exactly

    • @MoreFoodNowPlease
      @MoreFoodNowPlease 4 years ago

      @@ColocasiaCorm Because it would violate stationarity.
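The remove-mean-and-trend advice in this thread can be sketched as a least-squares detrend in plain Python (a real pipeline would likely use scipy.signal.detrend or numpy.polyfit; the toy series and function names are mine):

```python
# Sketch of the advice above: remove the mean and linear trend before
# applying Fourier methods, then add them back after denoising/forecasting.

def fit_linear_trend(x):
    """Least-squares line a + b*t through the series (pure Python)."""
    n = len(x)
    t_mean = (n - 1) / 2
    x_mean = sum(x) / n
    b = sum((t - t_mean) * (v - x_mean) for t, v in enumerate(x)) / \
        sum((t - t_mean) ** 2 for t in range(n))
    a = x_mean - b * t_mean
    return a, b

def detrend(x):
    """Return (residual series, (intercept, slope)) after removing the fit."""
    a, b = fit_linear_trend(x)
    return [v - (a + b * t) for t, v in enumerate(x)], (a, b)

series = [2.0 + 0.5 * t for t in range(8)]   # pure trend, no cycle
residual, (a, b) = detrend(series)           # residual should be ~zero
```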

  • @jpsouza
    @jpsouza 4 years ago +7

    learn statistics and stochastic processes, at least

  • @unclemax8797
    @unclemax8797 6 years ago +33

    When I was young, we were supposed to learn ARIMA, ARFIMA, ARCH, ARMAX, state space filters, and all the other tools useful for time series... Nowadays, no need for any skill: just do deep learning, use TensorFlow and/or an LSTM, and all the problems will be fixed (whatever the problem: supply chain, weather, financial forecasting...). And it's the same for multidimensional analysis, econometrics, and so on.
    Sad... really sad.
    I just made a comparison between a state space model and an LSTM... no need to say who the winner was, who gave a result nearly immediately, who did not need too much coding and debugging, who...

    • @swanknightscapt113
      @swanknightscapt113 6 years ago +10

      Are you saying that the various types of neural nets aren't getting the job done as well as the old algorithms, or are you saying that they are inefficient in terms of man-hours (debugging and tuning) and computation? If the former, then we should revive the old techniques; if the latter, then it's completely sensible to keep the trend moving. Apart from nostalgia, I can't say the abacus is a better tool than the calculator on my phone, or that paper books are better for my learning than e-books on my G-Drive. Technology will keep on progressing, and we humans must adapt. If ARMA and GARCH must now live in museums, so be it! We who have had opportunities to work with them can always go visit on our own time.

    • @jacobswe3344
      @jacobswe3344 6 years ago +11

      There are still some definite advantages to the "old" style of model building. With deep learning, all you get is the result. Using techniques like ARIMA, GARCH, and partial differencing, you can get a lot more insight into the behavior of the data: you know what kind of AR and MA models you have, what that implies about the source, etc. With deep learning, all you get is the point forecast and maybe an interval. It might be better for forecasting with a stable information set, but being able to predict the impact of outside factors using explicit weighted coefficients and models is a lot more helpful for less informationally stable data sets.

    • @unclemax8797
      @unclemax8797 6 years ago +3

      I'm saying both.
      For the computational time, I have no doubt that you gave it a try.
      For the accuracy, I try things when I have time... I have no idea how much experience you have with forecasting outside the academic field. When I taught at the university, the first thing I said was that what I was about to show was on data 'calibrated' for students to understand. When you see data in real life (retailers, to name one case), things are different. ERPs and APSs integrate 'forecasting tools' as well... they always give a result... is it useful? Well, it depends.
      Most of the time, when you want a solution, you have to build it on your own, step by step.
      Just for fun, I tried an LSTM model on the tuberculosis data from China, which is a dataset you will find on the web... make the comparison with a model from the sixties (!!!!!), and we will probably discuss nostalgia some more! :-) Even when the LSTM is properly calibrated, the old man runs faster and longer! (OK, not by much...)
      Last but not least, I hope you have the mathematical background... so you know that AI often reinvents the wheel and gives a new name to existing things (OK, OK, I was not talking about CNNs).
      Never forget that the use of genetic algorithms, tabu search, or simulated annealing was also supposed to dramatically improve things... well...
      Cheers, mate

    • @unclemax8797
      @unclemax8797 6 years ago +4

      Definitely. Most people forget that ANNs are black boxes. When they work, it's perfect... but in my professional life I never met anything perfect (well, in more than 20 years, I've had the time!). With an ANN, trying to see where the problems are is impossible; the only thing you can do is change your model (ANN, CNN, Kohonen, counterpropagation, and so on) or change the parameters (learning rate, noise, number of layers, hidden units...).
      With old models, you can see step by step what's happening.
      Last but not least, in real life, when you want to fix a problem, you usually try several kinds of tools... a tool suitable for one problem will not work for another one that seems similar... you can't do that with neural networks.
      As an example, take what you said: your time series is not stable in its parameters (I'm not talking about integrated modelling...).
      Tests will tell you there is something 'to monitor'... nothing like this with an ANN. As new data are forecasted, you can just see that the further you forecast, the bigger the error... and you 'retrain' it 'to learn' (well, well, well!!!).
      Cheers

    • @NaveenVenkataraman
      @NaveenVenkataraman 6 years ago

      Jacob Swe, great insight. What learning materials (books, courses) would you recommend to develop an understanding of time series methods?

  • @jacekwodecki3530
    @jacekwodecki3530 4 years ago +1

    1. To perform Fourier analysis on a dataset, it has to be L1-integrable. In the presented example, the time series is not L1-integrable. This method is good for first-year students, not for serious people. In such an example you should use proper models.
    2. Did he just hugely overcomplicate the idea of autoregressive modeling?

    • @mitalpattni1977
      @mitalpattni1977 4 years ago

      I am trying to understand what L1-integrable means; in this context, are you trying to say that there is no trend?

    • @jacekwodecki3530
      @jacekwodecki3530 4 years ago +1

      @@mitalpattni1977 There is no trend, but that is not what I mean. I'm trying to say that he has no idea that he is not allowed to use Fourier analysis for this type of data, because the data does not fulfill the basic formal conditions required for Fourier analysis, such as L1-integrability (I hope this is the word; English is not my native language :) ). It looks like he has no education in calculus, signal theory, or forecasting, and he is excited about the first data analysis tutorial he found on YouTube, but he doesn't even know that methods cannot be used freely without concern; you have to follow the rules.

    • @mitalpattni1977
      @mitalpattni1977 4 years ago

      @@jacekwodecki3530 Hey Jacek, I believe your vocabulary is correct, I just googled it, but I couldn't find a good explanation beyond some definitions of L1-integrable. It would be great if you could share some material on it.

    • @jacekwodecki3530
      @jacekwodecki3530 4 years ago

      @@mitalpattni1977 You say that you have found definitions of L1-integrability, so what is it that you need beyond that? Happy to help.

    • @mitalpattni1977
      @mitalpattni1977 4 years ago

      @@jacekwodecki3530 What I meant was some legitimate article on it instead of Stack Exchange answers.

  • @danielsckarin574
    @danielsckarin574 a year ago +1

    That's why it doesn't work in trading.

  • @edpowers3764
    @edpowers3764 3 years ago +1

    He forgot the most effective and fastest algorithm for forecasting tabular data

  • @safishajjouz166
    @safishajjouz166 4 years ago

    What's different from Generalized Additive Models, where smoothers are used? What he presents is a special case: not very interesting, nor general. Plus, it is a very different thing to find good techniques for data fitting than to try to understand what drives the stochastic process. Economists are far better at the second, without doing worse than the people who do data fitting for forecasting purposes (call them statisticians, financial engineers, etc.).

    • @stefanosgiampanis4898
      @stefanosgiampanis4898 4 years ago

      Aren't GAMs linear models? seq2seq models are non-linear.

    • @safishajjouz166
      @safishajjouz166 4 years ago

      @@stefanosgiampanis4898 No, they can be non-linear. If they were only linear, I wouldn't have bothered to comment on the video.

    • @stefanosgiampanis4898
      @stefanosgiampanis4898 4 years ago

      @@safishajjouz166 Can you point me to a reference about how a GAM can be non-linear? en.wikipedia.org/wiki/Generalized_additive_model

    • @safishajjouz166
      @safishajjouz166 4 years ago

      @@stefanosgiampanis4898 The well-known book by Hastie. Otherwise, in a GAM model use f(X) = a + bx + b^2x^2 ... as a very simple example of a nonlinear model.

    • @stefanosgiampanis4898
      @stefanosgiampanis4898 4 years ago

      @@safishajjouz166 That makes sense, thank you. The non-linear f(X) is, however, specified (not entirely learned, as in the case of a NN). Not criticizing GAMs or arguing in favor of NNs.

  • @xialishanqing
    @xialishanqing 3 years ago +1

    Overfitting isn't forecasting... it's unnecessary and useless.

  • @yilei1051
    @yilei1051 3 years ago

    This was outdated even at the time of release... seq2seq is no longer the best sequence model: not easy to train, and the results are not accurate enough.

    • @block1086
      @block1086 2 years ago +1

      Attention is all you need

  • @ustcatrice
    @ustcatrice 4 years ago +1

    Time wasted.

  • @mitrus4
    @mitrus4 4 years ago

    RNNs are dead