I Day Traded $1000: Autoregressive (AR) vs. Recurrent Neural Network (RNN)

  • Added 4 Sep 2024

Comments • 103

  • @imperiumgraecum9126
    @imperiumgraecum9126 6 months ago +42

    My guess is that RNNs appear to have lower absolute error for extreme values than AR/ARMA models at 13:27 because financial data, like daily stock returns, aren't normally distributed; rather, they follow heavy-tailed distributions, like the Cauchy.
    Since NNs are universal approximators, they can be used to capture the parameters of any distribution, whereas using AR/ARMA models implies that the data are generated by a stationary Gaussian process; thus they miss out on the extreme values of the true distribution.

    • @ritvikmath
      @ritvikmath 6 months ago +3

      Thanks for that analysis! Really insightful.

    • @josepeeterson6681
      @josepeeterson6681 5 months ago +3

      Hmm, then RNNs might be more suitable to model stocks with higher volatility and AR more suitable for lower volatility stocks.

  • @mquant001
    @mquant001 6 months ago +49

    The vast majority of content available on YouTube applies ML models to price prediction, when those of us who are dedicated to this know perfectly well that the areas where ML is most successfully applied are portfolio rebalancing and the classification of market regimes. It would be great to compare different models such as RNN, GMM, HMM, RF, DBSCAN and KNN + Lorentzian distance, to see which of them best classifies market regimes, using as features the first n principal components extracted from inputs such as volatility, LR slopes, RSI, stochastic, etc.

    • @ritvikmath
      @ritvikmath 6 months ago +6

      Great suggestion!

    • @nikkatalnikov
      @nikkatalnikov 6 months ago

      100% regime shift discovery is the true alpha

    • @darioc3833
      @darioc3833 5 months ago +1

      @ritvikmath Like that idea too!

  • @cornagojar
    @cornagojar 6 months ago +23

    All the predictions are positive. Both models have simply learned that stocks tend to go up. This is a common result when you train your model on stocks listed in the S&P 500, which introduces survivorship bias (you are implicitly introducing lookahead bias, because those are stocks that are still listed, and you ignore those which disappeared).

    • @ritvikmath
      @ritvikmath 6 months ago +7

      Hey so the 10 stocks chosen for each model were indeed chosen because they had predicted positive returns. However, if we look back to the overall distributions of predictions, there are an equal number of positive and negative predicted returns by all the models. I do agree with the point of training on stocks in the sp500 and it might make sense to broaden that set for next time. Thanks for the suggestion!

    • @SchoolBusTrading
      @SchoolBusTrading 6 months ago

      @ritvikmath Getting your hands on a survivorship-bias-free dataset for the S&P 500 is terribly hard to do. There are some websites that offer these sets, but they tend to cost a few thousand $ per year. Gathering them independently is near impossible when you don't have good connections at the NYSE or with other data providers. But if you ever intend to do it, please let me know; I need it, haha.

  • @fhashim
    @fhashim 6 months ago +37

    Videos related to trading are quite informative and really helpful. Would like to see more of such content.

  • @sharks1349
    @sharks1349 6 months ago +14

    I would've liked to see the baseline you usually use: the return that you would've made if you had simply invested $X every day over the course of the experiment. Overall great video though, love your work.

    • @ritvikmath
      @ritvikmath 6 months ago +7

      Hey great callout; I think I need to be better about using the same baselines in all the videos in this series. Glad you liked it!

    • @kanewilliams1653
      @kanewilliams1653 6 months ago

      @ritvikmath Yes, it would be great to see a follow-up video with the baseline. If investing in a random scattershot of 10 stocks nets you a 0.4% increase, then it's evidence the random market hypothesis is true and fitting models to the stock market doesn't really do anything useful!

    • @mgx2077
      @mgx2077 6 months ago

      Exactly, like comparing it to a random sample. Is it better than random?

  • @learning_with_irving4266
    @learning_with_irving4266 6 months ago +10

    Best video on these two models I've ever seen. Simple, clear, concise, with calculus and theory

  • @ntgladd1781
    @ntgladd1781 6 months ago +4

    Only recently discovered your channel and enjoy following your work.
    You show the distribution of predicted returns for the three models. I would have liked seeing another plot showing the actual realized returns. I expect those realized returns would look very much like the AR returns - highly leptokurtic. It appears to me that the RNN models are hugely overestimating the market volatility. To me, it seems an effective predictive model would have to match the distributional properties of the returns for one to have any confidence in its application to individual stocks.
    A possibly interesting experiment would be to generate some artificial stock data with an AR process and see how well those dynamics are captured by the RNN model.
    Keep up the good work.

    • @ritvikmath
      @ritvikmath 5 months ago +1

      Thanks for the analysis and suggestions!

  • @abudhabi9850
    @abudhabi9850 6 months ago +5

    Nice video, love your content! I liked how you aligned the x axis on the video, but I think you forgot to align the y-axis on density.

    • @ritvikmath
      @ritvikmath 6 months ago

      Thanks! That’s a good call out for sure. I usually align the density axis only when I’m plotting multiple distributions on the same chart since, in this case, we mainly wanted to know which was widest. It is more correct though to align the density axis as well to make it more clear these distributions all integrate to 1 and are over the same observational units.

  • @SchoolBusTrading
    @SchoolBusTrading 6 months ago +3

    Great video idea! But I'd love to have a comparison with the index (how much would a direct investment in the S&P have made?) and maybe some metrics for performance (Sharpe, Sortino, ...). Keep up the good work.
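
The performance metrics mentioned in this comment could be computed roughly as below. This is only a minimal sketch: the daily returns are made-up numbers rather than results from the video, a zero risk-free rate and 252 trading days per year are assumed, and the function names are hypothetical.

```python
import math
import statistics

TRADING_DAYS = 252  # assumed number of trading days per year


def sharpe_ratio(returns, risk_free=0.0):
    """Annualized Sharpe ratio: mean excess return over its standard deviation."""
    excess = [r - risk_free for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(TRADING_DAYS)


def sortino_ratio(returns, target=0.0):
    """Annualized Sortino ratio: like Sharpe, but only downside deviation is penalized."""
    excess = [r - target for r in returns]
    downside = math.sqrt(sum(min(e, 0.0) ** 2 for e in excess) / len(excess))
    return statistics.mean(excess) / downside * math.sqrt(TRADING_DAYS)


daily_returns = [0.004, -0.002, 0.001, 0.003, -0.005, 0.002]  # made-up data
print(sharpe_ratio(daily_returns), sortino_ratio(daily_returns))
```

Comparing these numbers against the same metrics for an index investment over the same dates would give the baseline the comment asks for.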

  • @posthocprior
    @posthocprior 6 months ago +2

    There was no mention of the vanishing/exploding gradient in your RNN model. That is, because there was a difference in prediction results between the AR model and the RNN model, it would have helped to see how close to or far from 1 the largest eigenvalue in the activation function is. Intuitively, it would have helped to isolate the two models to one stock's time series to see if the RNN model has either an exploding or vanishing gradient. Specifically, the AR model can be iterated in discrete steps to model the time series. This, then, should be able to be replicated by the RNN model. If the discrete steps veer towards a sink or an attractor, that is, towards infinity or zero, then the problem is inherent in the RNN model and not in your data.

  • @ShaileshAcharya-zj3sj
    @ShaileshAcharya-zj3sj 5 months ago +2

    This is highly informative. More of such videos!

  • @fbf3628
    @fbf3628 5 months ago +1

    Wow, first video I have seen from you, and I really liked the way you explained the concepts in an easy-to-understand yet detailed way!

  • @benwilcox1192
    @benwilcox1192 5 months ago

    Great video! I'm going to college next year for data science + machine learning and your whole channel is so informative!

    • @ritvikmath
      @ritvikmath 5 months ago

      Love to hear it; wishing you well for college!

  • @robertdemka4110
    @robertdemka4110 6 months ago +1

    Would love a video on the actual maths behind estimating the AR model parameters (e.g. Yule-Walker etc.). The model form is easy to grasp but how it actually works - not so much
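
As a rough illustration of the Yule-Walker idea mentioned in this comment: for an AR(2) model, the coefficients can be recovered from the first two sample autocorrelations by solving r1 = phi1 + phi2 * r1 and r2 = phi1 * r1 + phi2. The sketch below simulates a series with known, made-up coefficients and estimates them back; it is not the estimation code used in the video, and the function names are hypothetical.

```python
import random


def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return cov / var


def yule_walker_ar2(x):
    """Solve the two Yule-Walker equations of an AR(2) model for (phi1, phi2)."""
    r1, r2 = autocorr(x, 1), autocorr(x, 2)
    phi1 = r1 * (1 - r2) / (1 - r1 ** 2)
    phi2 = (r2 - r1 ** 2) / (1 - r1 ** 2)
    return phi1, phi2


# Simulate an AR(2) series with known (made-up) coefficients, then recover them.
random.seed(0)
true_phi1, true_phi2 = 0.5, -0.3
series = [0.0, 0.0]
for _ in range(20000):
    series.append(true_phi1 * series[-1] + true_phi2 * series[-2] + random.gauss(0, 1))

est_phi1, est_phi2 = yule_walker_ar2(series)
print(est_phi1, est_phi2)
```

With a long enough series the estimates should land close to the true coefficients; on real stock returns the autocorrelations are typically tiny, so the fitted coefficients would be correspondingly small.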

  • @philwebb59
    @philwebb59 6 months ago +6

    Four things: 1. Would a weighted average of the two models have the same outcome? It looks like GEHC is the only overlap. 2. Would stacking the two models improve the result? 3. Would adding another year or two of training data improve either model? That would probably dampen the response, but could help pick up seasonality. 4. It would be interesting to run a naive regressor (like next value = previous value) on the data to see if there's a general population trend.

    • @ritvikmath
      @ritvikmath 5 months ago +1

      1-3. Interesting idea and something I’d like to experiment with for future videos
      4. Definitely, am trying to be more diligent about using common baselines for future videos in this series

  • @Sam-tg4ii
    @Sam-tg4ii 1 month ago

    If you want to make these much more realistic, get the returns per period (instead of one single avg) and calculate some measure of risk for them, especially maximum drawdown. From my data analysis, my models may not improve my returns over a buy-and-hold strategy that much, but they do decrease the max drawdown significantly.
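
Maximum drawdown from per-period returns, as suggested in this comment, can be sketched as follows. The returns are made-up numbers and the function name is hypothetical; drawdown is measured here as a fraction of the running peak of the compounded equity curve.

```python
def max_drawdown(period_returns):
    """Largest peak-to-trough decline of the cumulative equity curve, as a fraction."""
    equity = 1.0
    peak = 1.0
    worst = 0.0
    for r in period_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst


returns = [0.10, -0.20, 0.05, -0.10, 0.30]  # made-up per-period returns
print(max_drawdown(returns))
```

Running the same function on a buy-and-hold return series over the same periods would give the comparison described above.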

  • @pipertripp
    @pipertripp 6 months ago

    So what I would love to see is your thought process behind the setup. How did you choose the particular AR(IMA) model that you ran against the RNN models? Why did you choose a regularised RNN model? While I'm not interested in stocks, I do find time series stuff really interesting, so thanks for the entire playlist you've been working on. I found that it really supplemented the course content that I was working with.

  • @Ramu_Arjun78965
    @Ramu_Arjun78965 6 months ago

    Hey, it's just marvellous...please upload more! If possible, can you show the above concept by doing it in Python etc.? It will be really wonderful. Thanks a lot, I really learnt a lot from your videos; they are amazing 💯

  • @freek633
    @freek633 5 months ago

    Very nice, I wonder how well this performs in a falling market though, since from March 2023 to March 2024 the S&P 500 increased quite a bit! (And more than 0.4%!)

    • @ritvikmath
      @ritvikmath 5 months ago

      Interesting question! Will keep that angle in mind for future videos in this series

  • @Levnerad
    @Levnerad 6 months ago +2

    Can you please make a video on prediction for cryptos??

  • @fosheimdet
    @fosheimdet 6 months ago +1

    Great video, very interesting and informative! I was wondering if you would mind sharing the code by chance? I want to try something similar but need some inspiration on the coding front.

  • @alvaroromo5885
    @alvaroromo5885 5 months ago

    What was the time horizon used to forecast? As someone else already mentioned, I think it would also be nice if you could explore using tree-based methods (XGB, LGBM, RF) for time series forecasting and compare the results to these kinds of methods (AR, RNN).

  • @danielwiczew
    @danielwiczew 6 months ago

    It seems that AR predicts stocks with low volatility pretty well. Then the question is why even predict, if a basic quant analysis can do it?

  • @christusrex334
    @christusrex334 5 months ago

    Could you possibly do this again for a VAR model or ARIMA model, if you haven't already? I would be interested in seeing a comparison with some more models.

  • @renzomenos2528
    @renzomenos2528 5 months ago

    Well done.

  • @W0genius1
    @W0genius1 5 months ago

    Very nice! I'm not at all adequately equipped to critique or question the RNN model's strengths and weaknesses, so I'll just pose this as a question instead. Wouldn't another weakness be that it inputs previous predictions? Meaning, if previous predictions were poor, this will negatively impact future predictions?

    • @ritvikmath
      @ritvikmath 5 months ago +1

      Very interesting question; this is certainly true when using any kind of time series model to predict more than one time step into the future. The error accumulates as you move through time. This model however predicts just one day ahead at a time and so any error comes from just the current time step. Of course, given it’s the stock market, this error is already very significant.

    • @W0genius1
      @W0genius1 5 months ago

      @ritvikmath Thanks for taking the time!

  • @kaganozdemir4332
    @kaganozdemir4332 5 months ago

    Excellent video! Does the equation on the right apply to LSTMs as well?

  • @AB-zv6dz
    @AB-zv6dz 6 months ago +1

    How long did you run the system on real money for? Many models are positive at the start because the training data resembles the current market; then, as soon as the dynamics or regime change, they fall off a cliff. So I would be skeptical of concluding anything from the fact it was briefly positive, but good vid. Where did you get those models (the equations) from, btw? Also, regarding overfitting, have you heard of k-fold cross-validation? It's super simple: you can do basic cross-validation in 15 minutes in Python.

    • @ritvikmath
      @ritvikmath 6 months ago

      Hey, great intuition about models performing better the closer you are to the validation data. I didn't get into the specifics here, but the method was basically using N-1 days for training and then validating on the next day's return. We roll a window like that forward through time to understand the average performance of these models. The equation for the AR model is the usual one you'd find in any econometrics textbook. The equation for the RNN is the way the SimpleRNN class is implemented in Keras and is a bit different from how you might have seen RNNs usually taught. Specifically, in Keras the hidden layer and output layer are the same, but in more theoretical approaches they are distinct.
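
The rolling scheme described in this reply (train on a window of past days, validate on the next day, then roll forward) might look something like the sketch below. It is not the code from the video: the window length, the made-up returns, and the placeholder `mean_model` (which an AR or RNN fit would replace) are all assumptions for illustration.

```python
import statistics


def walk_forward_errors(returns, window, fit_predict):
    """Roll a fixed-length training window through time and collect the
    absolute one-day-ahead error at each step."""
    errors = []
    for end in range(window, len(returns)):
        train = returns[end - window:end]   # the most recent `window` days
        prediction = fit_predict(train)     # forecast for day `end`
        errors.append(abs(prediction - returns[end]))
    return errors


def mean_model(train):
    """Placeholder model: predict the average of the training window."""
    return statistics.mean(train)


returns = [0.01, -0.02, 0.015, 0.0, 0.005, -0.01, 0.02]  # made-up daily returns
errors = walk_forward_errors(returns, window=3, fit_predict=mean_model)
print(len(errors), statistics.mean(errors))
```

Because each prediction only ever sees data from before the day it forecasts, this avoids the lookahead that ordinary shuffled k-fold cross-validation would introduce on a time series.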

  • @MrMoore0312
    @MrMoore0312 6 months ago

    Love it!

  • @user-zx8oy2jf4d
    @user-zx8oy2jf4d 5 months ago

    The real take-away is that in the US, $4 buys you only half a cup of coffee.

  • @vineetbhagwat4256
    @vineetbhagwat4256 5 months ago

    13:14 Aren't you conditioning on the outcome variable here? The actual return the next day is unknown at the time of prediction. I think it would be more useful to condition on whether the *predicted* return was >1% to know ex-ante as an investor whether the prediction from AR or RNN is more reliable. Informative video though, thank you!

    • @ritvikmath
      @ritvikmath 5 months ago +1

      Interesting question. I think they’re both useful measures in the same way that precision (which calculates correct predictions among all positive predictions) and recall (which calculates correct predictions among all positive labels) are both useful measures. The charts I showed help us understand how well each model does on truly extreme instances and the charts you propose help us understand how well each model does on truly extreme predictions.
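
The two views discussed in this exchange, error on truly extreme days versus error on extreme predictions, can be sketched side by side. The data and the 1% threshold below are illustrative assumptions and the function name is hypothetical; nothing here reproduces the charts from the video.

```python
def conditional_mae(actual, predicted, threshold, condition_on):
    """Mean absolute error over the days where the chosen series exceeds the threshold.

    condition_on="actual"    -> recall-like view (truly extreme days)
    condition_on="predicted" -> precision-like view (extreme forecasts)
    """
    key = actual if condition_on == "actual" else predicted
    errors = [abs(a - p) for a, p, k in zip(actual, predicted, key) if k > threshold]
    return sum(errors) / len(errors) if errors else float("nan")


# Made-up daily returns and forecasts, purely for illustration.
actual = [0.020, 0.003, -0.010, 0.015, 0.001]
predicted = [0.005, 0.012, 0.000, 0.011, 0.002]

print(conditional_mae(actual, predicted, 0.01, "actual"))
print(conditional_mae(actual, predicted, 0.01, "predicted"))
```

Only the second call uses information available at prediction time, which is the point raised in the original comment.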

  • @nastrimarcello
    @nastrimarcello 5 months ago

    The models traded for a single day. I suppose that's not a good way to evaluate their capabilities, right?

  • @Gingeey23
    @Gingeey23 5 months ago

    Great video - I've been using LSTMs for over 2 years to try and create a robust model capable of predicting movements on the market, trained on the SP500 stocks, not dissimilar to your video! Would be interested to see your setup for pre-processing, feature enhancement, model architecture and lookback parameters. Also subscribed!

    • @ritvikmath
      @ritvikmath 5 months ago

      That’s awesome to hear; always love hearing about folks trying to model the market with ML methods! And thanks for the sub 🙏

  • @MatallicA_one
    @MatallicA_one 5 months ago

    .4 is 40%, .04 is 4%, 4% of $1000 is $40. To only make $4, the result would be .004, not .4. Did this do better than you thought, or was there a mistake in the graphic for the video? Because the result of .4 means that you made $400.

    • @ritvikmath
      @ritvikmath 5 months ago

      So it’s 0.4% which means we made $1000 * 0.4 / 100 = $4
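
The confusion in this exchange comes down to whether 0.4 is read as a percentage or as a fraction. A tiny sketch of the arithmetic, with a hypothetical helper name:

```python
def profit_from_percent(capital, return_percent):
    """Convert a return quoted in percent into a dollar profit."""
    return capital * return_percent / 100


print(profit_from_percent(1000, 0.4))   # 0.4% of $1000
print(profit_from_percent(1000, 40))    # 40% of $1000, i.e. a fractional return of 0.4
```

The first call corresponds to the $4 result in the reply; the second corresponds to the $400 reading in the original comment.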

  • @surfingbilly9654
    @surfingbilly9654 5 months ago

    great vid :)

  • @zaurenstoates7306
    @zaurenstoates7306 5 months ago

    I'm curious what kind of training/validation loss you were training to?
    I'm a physics major who has been playing around with different model architectures for my own stock prediction model.
    Mine looks at 120 daily time steps and tries to predict the next two weeks of closing prices. I've seen some good results in training my transformer model, and I've also found that the more training data I add, the better my model performs.
    Takes about 2hrs per epoch on my laptop though 😵‍💫

    • @lennartv.1529
      @lennartv.1529 5 months ago

      You could look into batch normalisation layers and dropout to improve time. And did you try experimenting with batch sizes in general? Sometimes performance doesn't really go down but the training process speeds up a lot when varying parameters like batch size, learning rate, etc. Do you already use your GPU to run TensorFlow? You could also think of using a provider like Google Colab to run it on their infrastructure as well.

  • @MrPyro91
    @MrPyro91 5 months ago

    At 12:00 you said you used regularization, but it sounds like you gave a description of normalization. Can you clarify? Thanks.

    • @ritvikmath
      @ritvikmath 5 months ago

      so the regularized model puts constraints on how big or small the coefficients can get which attempts to stop the model from learning the training data too well

    • @MrPyro91
      @MrPyro91 5 months ago

      @ritvikmath L2 or L1?

  • @muhammadrezwanislam5188
    @muhammadrezwanislam5188 6 months ago +1

    Would it be possible to share the code?

  • @marcogelsomini7655
    @marcogelsomini7655 5 months ago

    I wonder about the performance of the RNN in forecasting weekly returns.

    • @ritvikmath
      @ritvikmath 5 months ago

      Great direction for future videos!

  • @gagangayari5981
    @gagangayari5981 6 months ago +1

    What are the features that you considered? Is it only the stock price itself?

  • @vineetbhagwat4256
    @vineetbhagwat4256 5 months ago

    Also can you share the code you used to make these models?

  • @kylehodgson2182
    @kylehodgson2182 6 months ago

    Is there a mistake in the way you wrote the AR model sum at the start, as the derivative w.r.t x_i wouldn't quite be \beta_i ?

    • @ritvikmath
      @ritvikmath 5 months ago +1

      Thanks for pointing it out, yes the notation needs to be adjusted a bit

  • @MegaMatzzz
    @MegaMatzzz 6 months ago

    could you please do a video on bayesian structural time series!!

  • @TheOlderSoldier
    @TheOlderSoldier 6 months ago

    What would the error look like if you only used the picks where both models concur? I wonder if that would help or hurt your returns in the end 🤔

    • @ritvikmath
      @ritvikmath 6 months ago +1

      Love that idea! I actually just tried it and if we only considered stocks where the AR and RNN both had positive return predictions and the absolute difference between their predictions was less than 0.001, the experiment return would have been basically flat (we lost $0.54 on the $1000 investment). Super awesome idea for future vids!
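
The filter described in this reply (both models predict a positive return, and the predictions differ by less than 0.001) could be written along these lines. The tickers and predicted returns are hypothetical placeholders, as is the function name; only the 0.001 gap comes from the reply.

```python
def consensus_picks(ar_preds, rnn_preds, max_gap=0.001):
    """Tickers where both models predict a positive return and the two
    predictions differ by less than max_gap."""
    return sorted(
        ticker
        for ticker in ar_preds.keys() & rnn_preds.keys()
        if ar_preds[ticker] > 0
        and rnn_preds[ticker] > 0
        and abs(ar_preds[ticker] - rnn_preds[ticker]) < max_gap
    )


# Hypothetical tickers and predicted next-day returns, for illustration only.
ar_preds = {"AAA": 0.0020, "BBB": 0.0010, "CCC": -0.0005, "DDD": 0.0030}
rnn_preds = {"AAA": 0.0025, "BBB": 0.0040, "CCC": -0.0002, "DDD": 0.0031}

print(consensus_picks(ar_preds, rnn_preds))
```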

  • @vazgene
    @vazgene 5 months ago

    Is there any way to message you directly?

  • @_XoR_
    @_XoR_ 6 months ago +1

    Why not marry both and use the deepAR architecture? :)

  • @antoineparadis9959
    @antoineparadis9959 5 months ago +1

    Github code share? 😊 please

  • @notu483
    @notu483 6 months ago +1

    Have you tried transformers?

  • @rmb706
    @rmb706 5 months ago

    Gotta pay taxes on that $4 though 😅😅

    • @ritvikmath
      @ritvikmath 5 months ago +1

      Mannn make that a quarter coffee 😂

  • @joshualogan7345
    @joshualogan7345 5 months ago

    Or maybe, y’know… everything makes money when the market goes up 🤷🏻‍♂️

  • @GoingData
    @GoingData 6 months ago

    Can you tell us which tech stack you used for the automatic trading?

    • @ritvikmath
      @ritvikmath 6 months ago +1

      hey this was coded in python, specifically using the yfinance library to get all the relevant stock price data

    • @bleizthomas4707
      @bleizthomas4707 6 months ago +1

      @ritvikmath But then, when you got the model's outputs, did you place those predictions manually or did you use some application to do it automatically? Surely it must have been the latter, no?

    • @carrier_pigeon214
      @carrier_pigeon214 5 months ago

      @bleizthomas4707 I don't think he did this in an actual live market environment; wasn't it just an experiment? The actual infrastructure you would need to execute this strategy safely is most likely beyond the means of your average retail trader. There are some retail brokerage platforms that will allow you to execute trades with an API, but I would seriously avoid them due to the many risks involved (i.e., not only can normal mishaps with your infrastructure kill you, but it's also highly likely you'll see price slippage that kills your trades as well). You should never execute intraday trading on your own; leave it to the RenTechs, Two Sigmas, etc. of the world that can afford the tools to do this correctly. Just my two cents.