Predicting Stock Prices with LSTMs: One Mistake Everyone Makes (Episode 16)

Lazy Programmer

79,348 views

Added 7 Jul 2024
VIP DISCOUNT for Financial Engineering and Artificial Intelligence in Python: deeplearningcourses.com/c/ai-...
VIP DISCOUNT for PyTorch: Deep Learning and Artificial Intelligence: deeplearningcourses.com/c/pyt...
VOTE for my next course (note: if you already got this survey, you don't have to do it again):
forms.gle/iEWwGjGsYGc4hLaA9
Science & Technology

Comments • 102

@MZak-js7oy 1 year ago ⁺¹⁹
You pinpointed exactly what I was wondering about. As a person who worked in the financial markets for more than a decade and just learned ML, min-max scaling was a big question mark for me. First, you never know the maximum of a price, especially in a market (like gold) that keeps making higher highs. Also, the minimum price is not actually known from the data set, so unless you are a bank with 80-150 years of recorded data, your data set will never reflect the true lows or true highs. As a result, most ML models in YT tutorials simply panic and fail when the price makes new historical highs or lows (relative to the data set they were trained on, not actual historical highs or lows). Scaling and standardization are crucial, no doubt, but the MinMax technique is fundamentally wrong here and reflects ignorance of market dynamics and of the core principles of training an ML model.
@digitalnomad2196 6 months ago ⁺³
Just normalize the data. Also, to his point, if anyone has an actual edge, why post it online? If he has an edge and makes millions, why YouTube and why courses, eh?
@sanchuro7140 5 months ago ⁺¹
On top of your points, some tutorials even fit the min-max scaler on both train and test. This is sooooo wrong: it uses future data to fit the scaler. Most of the tutorials are really just trash.
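The two problems raised in this thread can be seen in a minimal sketch. The prices below are made up for illustration, and `fit_min_max` and `min_max_scale` are hypothetical helper names, not functions from any library. Fitting on the training data alone lets the scaled test values run past 1 when the price makes new highs, while fitting on train plus test leaks the future maximum into the training inputs.

```python
def fit_min_max(values):
    """Return (lo, hi) computed from the given data only."""
    return min(values), max(values)


def min_max_scale(values, lo, hi):
    """Scale values with previously fitted bounds; results are not clipped."""
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical prices: the test period makes new highs.
train_prices = [100.0, 110.0, 105.0, 120.0]
test_prices = [125.0, 140.0, 160.0]

# Fit on the training data only, then apply to both splits.
lo, hi = fit_min_max(train_prices)
scaled_train = min_max_scale(train_prices, lo, hi)
scaled_test = min_max_scale(test_prices, lo, hi)  # every value exceeds 1

# Leaky variant: fitting on train + test uses future information.
leaky_lo, leaky_hi = fit_min_max(train_prices + test_prices)
leaky_train = min_max_scale(train_prices, leaky_lo, leaky_hi)

print(scaled_train)
print(scaled_test)
print(leaky_train)
```

In the leaky variant the training inputs are squeezed into the lower third of the range, which is only possible because the scaler has already seen the future high.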
@adwaitdathanr4040 2 years ago ⁺¹⁷
This video is a gem. I saw a lot of blogs and tutorials repeating the mistakes you had mentioned.
@azrulfyz1162 2 years ago
Could you please add a link to where I could find those topics? Much appreciated. Thanks
@masterquiroga 2 years ago ⁺⁷
This video is so underrated. I happen to see the same errors. I would also advise using a power transform and standardizing instead of using a min-max scaler.
@anoriginalnick 3 years ago ⁺¹⁰
I believe it's important to keep things in the same scale because the algorithms apply the same learning rate to all feature dimensions.
@ahmedhamza3939 6 months ago
It's important for models that estimate their weights using gradients, because you will most likely get low weights for features that have a large range and high weights for the opposite.
@kartikpodugu 2 years ago ⁺¹
Awesome... I saw so many examples with this mistake, and I always felt that what they were doing had some flaw, but I was unable to reason it out myself. Thanks for the clarification.
@plasmaflare5217 7 months ago
Thank you so much for explaining these concepts properly; it is clear that you have a lot of experience in this subject. I started learning machine learning techniques for analyzing economic data, but I could not figure out the best method to forecast stock prices.
@0xVantwoutMaarten 3 years ago
It took me a while to grasp, but thank you a lot. Mistake number 5 should be all over the internet! Everybody: using a training window does not mean you are using a sequence; it is about the sequence of training windows!!!!!
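One possible reading of this comment, together with the "sequence length of 1" remark quoted further down, can be sketched as follows. The return values are made up and `make_windows` is a hypothetical helper. The same window of T lagged values can be laid out either as T time steps with one feature each, or as a single time step with T features. Only the first layout gives a recurrent model a sequence to step through.

```python
def make_windows(series, T):
    """Return (X, y): X[i] holds T consecutive values, y[i] is the value that follows."""
    X, y = [], []
    for start in range(len(series) - T):
        X.append(series[start:start + T])
        y.append(series[start + T])
    return X, y


returns = [0.01, -0.02, 0.005, 0.03, -0.01, 0.02]  # hypothetical daily returns
T = 3
X, y = make_windows(returns, T)

# Sequence layout: N samples x T time steps x 1 feature.
X_seq = [[[v] for v in window] for window in X]

# Sequence-length-1 layout: N samples x 1 time step x T features.
X_flat = [[window] for window in X]

print(len(X_seq), len(X_seq[0]), len(X_seq[0][0]))     # N, T, 1
print(len(X_flat), len(X_flat[0]), len(X_flat[0][0]))  # N, 1, T
```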
@grabngoinfo 2 years ago ⁺¹
LOVE this video! Cannot help laughing when watching the virus part, but it is so true! I am really glad that I didn’t use min max scaler in my time series tutorials. Thank you for your contribution to the machine learning community, sir!
@anandtiwari1281 3 years ago ⁺¹
What happens if we use rolling returns instead of just returns?
@cenobit0815 3 years ago ⁺²
A sequence length of 1 is fine if the LSTM is stateful (the hidden state from the previous period is used as input as well). If the LSTM is stateless, you need to pass the whole sequence (and a zero hidden state as input). So it basically depends on what kind of LSTM you are training (stateful or stateless). But LSTMs are still useless for price prediction, because they tend to output the last price of the input sequence. That's what I learned when playing around with LSTMs for stock price prediction.
@cenobit0815 3 years ago
pytorch.org/tutorials/beginner/nlp/sequence_models_tutorial.html (shows both approaches as well)
@LazyProgrammerOfficial 3 years ago ⁺¹
"Stateful" is just another way of saying that you are passing in a sequence... the hidden state h(t) is derived from the inputs x(1), ..., x(t).
If you are holding the state from past samples then there's something seriously wrong.
@cenobit0815 3 years ago
@@LazyProgrammerOfficial Did you have a chance to evaluate the performance of transformers (multi-head attention) on stock price prediction? I am thinking of giving it another try :)
@gamingsaloon7731 2 years ago ⁺²
You're right, but we can use price and min-max scaling locally to find patterns. I usually apply it locally when sampling data and not on the whole data set.
@digitalnomad2196 6 months ago
ya exactly, there is a local min max if you define a timeframe
@gastonvilches5851 2 years ago
Thank you very much for this video. I was starting to think I was wrong until I saw it. There are tons of mistakes out there, specifically on this topic.
@linknero1 3 years ago ⁺¹
Thank you, I was going to do this as a final project in a course :V Thank you so much, I'll definitely go to the course.
@kyloren2093 1 year ago
Hey, thanks for the great content,
i am using R, do you think it's as good as python for this kind of analysis ?
@emmang2010 2 months ago
Thank you very much. I recommend not saying that the idea of scaling is to make the values "small". People who take you literally might think you mean very small values (e.g. 1.2x10^-20). I would also introduce stationarity at your timestamp for "Stock returns instead of ..." since this is a step towards that.
@hughdbrown 3 years ago ⁺²
In this video you comment that using prices is wrong but using returns is correct. Does using logs of prices have the same problem? (I ask because logs of prices are commonly used in finance because they have the property that adding logs gives the return over a period of time.) Logs of prices have no min or max, so I imagine they are similarly wrong.
@LazyProgrammerOfficial 3 years ago ⁺¹
Yes, the same logic applies to log prices
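The relationship mentioned in the question can be sketched with made-up prices: differences of log prices are log returns, and summing them gives the log return over the whole period. The log prices themselves are still price levels, with no fixed minimum or maximum, which is why the same scaling objection applies to them.

```python
import math

prices = [100.0, 105.0, 99.75, 110.0]  # hypothetical closing prices

# Log prices are still levels: they drift with the price itself.
log_prices = [math.log(p) for p in prices]

# Differences of consecutive log prices are log returns.
log_returns = [b - a for a, b in zip(log_prices, log_prices[1:])]

# Adding the per-period log returns gives the log return over the whole span.
total_log_return = sum(log_returns)
direct_log_return = math.log(prices[-1] / prices[0])

print(total_log_return, direct_log_return)
```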
@linknero1 3 years ago ⁺¹
I have a question: is it possible to use that idea to find patterns in hours instead of days? I mean, there are some observable patterns, like: "some stocks gain or lose right before they close and begin the day up (and lose) or low and increase over the day". Is it possible?
@SimonIslit 12 days ago
Are these courses applied machine learning or advanced machine learning in depth of its working mechanism to object layer?
@BoHorror 1 month ago
Very insightful, very true the bit about using 1 sequence not multiple.
@danieledicesare9447 3 years ago ⁺¹
Nice video. Keep up this series :)
@mastermind2362 1 year ago
Where are the other videos? are they coming?
@alaincheong7275 2 years ago ⁺⁴
This is just a promotion video, if you think carefully.
@LazyProgrammerOfficial 2 years ago
Interesting, please do share
@alaincheong7275 2 years ago ⁺¹
@@LazyProgrammerOfficial I classified it as a promotion video, as it contains various pieces of contradictory information.
@LazyProgrammerOfficial 2 years ago
@@alaincheong7275 I welcome you to elaborate
@onceappuonatime 3 years ago ⁺⁵
Thanks for this video. I finally took action and bought the course on Udemy. I am broke so I usually find a way to get stuff for free so this was a big step for me. I have been trading for more than 2 years now and wanted to apply ML in ways different than what I have seen online. So, thank you for making this course!
@YaShaheed 2 years ago ⁺²
How is trading going with deep learning?
@reedoken6143 1 year ago
@@YaShaheed it's a grift, unfortunately.
@MansourAlAkeel 3 years ago ⁺³
I cannot find your other videos about the other mistakes.
I agree about using the return value instead of price as input. However this will result in input range between -1,1. What activation function would you use then ?
@pimpXBT 4 months ago
use a moving average to continually plot a time series graph. theres indicators, theres greeks to measure total risk. returns aint normal, they are lognormal, so you'll obv have skew. Point of the video is form sequences of multiple models that create your strategy, then feed it in so it evolves over time (ML is basically used for parameter optimization, so you can get the best timeframe for a strategy, or use a timeframe to figure out the best moving avg window, rsi levels bla bla, so sequences are supposed to be multiple dimensional, not a scalar)
@pimpXBT 4 months ago
../and all of these are already derived from the underlying returns and standard deviations, so they are normalized to fit the mean/variance of the underlying position/portfolio. you can just add them to your expected profit at face value, so theres your expected returns.
@LazyProgrammerOfficial 4 months ago
Check my website for a link to all videos. Using returns would not limit the range. The range of returns is unlimited.
@doragababa3433 16 days ago
@@LazyProgrammerOfficial Why is the range of returns unlimited? Doesn't it also have a maximum value in your train data?
@LazyProgrammerOfficial 16 days ago
@@doragababa3433 Any fixed set of data would have a min and a max, that's not what is meant by "unlimited". Unlimited refers to the allowable values.
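A small sketch with made-up prices illustrates the distinction between "allowable values" and the min/max of one sample. `simple_returns` is a hypothetical helper. A simple return cannot fall below -1, since a price cannot go below zero, but it has no upper limit, so a series of returns is not confined to the interval between -1 and 1.

```python
def simple_returns(prices):
    """Simple return for each step: p_t / p_{t-1} - 1."""
    return [b / a - 1.0 for a, b in zip(prices, prices[1:])]


prices = [10.0, 25.0, 5.0, 5.5]  # hypothetical, deliberately volatile
rets = simple_returns(prices)

# The move from 10 to 25 is a +150% return, well outside [-1, 1].
print(rets)

# This particular sample has a max and a min, but nothing stops a later
# return from exceeding the largest one seen so far.
print(max(rets), min(rets))
```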
@VonDutchyy 2 years ago
Really good breakdown, nice one!
@The_Mindful_Scholar 1 year ago
I'm working on imports and exports data. I'm using Time Series Generator-LSTM. My training predictions have an r2 score of 0.99 while the testing predictions have -0.39. What parameters do you suggest for better results on the testing predictions?
@metehan9185 7 months ago
Ask chatgpt
@Pvtmovies4384 2 months ago
Did you find any solution? Facing the same issue.
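For context on what a negative test score means, here is a minimal sketch of the usual R² definition, one minus the ratio of residual to total sum of squares. The numbers are made up and `r2_score` here is a hand-written helper, not the scikit-learn function of the same name. Predicting the mean of the targets scores 0, so a negative value means the predictions are worse than that constant baseline.

```python
def r2_score(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot, with SS_tot taken around the mean of y_true."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0]      # hypothetical targets
close_fit = [1.1, 1.9, 3.1, 3.9]   # small errors
reversed_fit = [4.0, 3.0, 2.0, 1.0]  # trend predicted backwards
mean_baseline = [2.5] * 4          # always predict the mean

print(r2_score(y_true, close_fit))      # close to 1
print(r2_score(y_true, mean_baseline))  # exactly 0
print(r2_score(y_true, reversed_fit))   # negative
```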
@TheDeatheater3 5 months ago
I am a little confused. If I standardize the data, then it is i.i.d. data and no longer sequential. In this case, does it make sense to use an LSTM at all?
@LazyProgrammerOfficial 5 months ago
You can standardize data that is not IID. They are not related.
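A minimal sketch of why the two ideas are unrelated, using a made-up series and hypothetical helper names: standardization subtracts a constant and divides by a constant, so the order of the values and their lag-1 autocorrelation are unchanged. The data is exactly as sequential afterwards as it was before.

```python
import statistics


def standardize(values, mean, std):
    """Subtract a fixed mean and divide by a fixed standard deviation."""
    return [(v - mean) / std for v in values]


def lag1_autocorr(values):
    """Sample lag-1 autocorrelation."""
    m = statistics.fmean(values)
    num = sum((a - m) * (b - m) for a, b in zip(values, values[1:]))
    den = sum((v - m) ** 2 for v in values)
    return num / den


# Hypothetical series with an obvious time dependence (not IID).
series = [1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0, 2.0, 3.0, 4.0]

mean = statistics.fmean(series)
std = statistics.pstdev(series)
z = standardize(series, mean, std)

# Same dependence structure before and after standardizing.
print(lag1_autocorr(series), lag1_autocorr(z))
```

In a real train/test setting the mean and standard deviation would be computed on the training portion only, for the same reason given for the min-max scaler earlier in the thread.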
@M1911Original 1 year ago
Holy shit I just standardized the data on one of my LSTM models and I instantly got over 10x less loss
@ariisichoix5795 3 years ago ⁺²
I really like your video. I can not agree more with all of those Video / Code example they share on youtube like it really works xD. Thank you for this video tho ! I am currently creating a real AI Trading bot using Deep RNN and I wanted to use LSTM Cells and maybe GRU Cells as well but I ended up not having good results during my training process. Hopefully your video will help me understand a little bit more why I am not able to have a better recall. (yes i am doing a classification prediction)
@Yasinzaii 2 years ago
any luck with your project ?
@rob9207 2 years ago
Hi Arii, please reach out to me if you're still working on this project. Would love to talk with you.
@anshanshtiwari8898 2 years ago ⁺¹
Are you planning to explain more about the other mistakes?
@LazyProgrammerOfficial 2 years ago ⁺²
Yes
@priyanshukumawat4142 3 years ago ⁺¹
One of the best mentors I have ever seen !!!!!! RESPECT from INDIA
@axe863 6 months ago
Integer differencing is excessive and may significantly erode memory content. There exists some degree of tempered fractional differencing that has minimum information destruction with "good enough" stationarity.
@AbhishekML 6 months ago
Ha, I think you're trying to sound smart, but first differencing is standard in time series, whilst "tempered fractional differencing" shows not even 1 page of search results.
@axe863 6 months ago
@AbhishekML Overstationarizing is one of the single greatest detriments to predictability, via the reduction of a time series' memory content. Dr. Marcos López de Prado highlights the tradeoff by building models on the weakest degree of fractional differencing that rejects the null of nonstationarity (ADF statistic). The differencing-memory tradeoff is not universal (it doesn't hold for all processes).
@axe863 6 months ago
@AbhishekML Financial time series (especially fragile assets) exhibit semi-long range dependency in the cmeans but especially in cvol even when one accounts for spurious fd via structural breaks. Integer Differencing destroys an excessive amount of predictability to ensure stationarity
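As a rough sketch of the idea under discussion, the weights of the fractional differencing operator (1 - B)^d can be generated with the standard binomial recursion and applied over a fixed-width window. This is plain fractional differencing, not the "tempered" variant the commenter mentions, and `frac_diff_weights`, `frac_diff`, and the window size are illustrative assumptions. With d = 1 the weights reduce to ordinary first differencing. With 0 < d < 1 older observations keep small non-zero weights, which is the "memory" being argued about.

```python
def frac_diff_weights(d, size):
    """First `size` weights of (1 - B)^d: w_0 = 1, w_k = -w_{k-1} * (d - k + 1) / k."""
    weights = [1.0]
    for k in range(1, size):
        weights.append(-weights[-1] * (d - k + 1) / k)
    return weights


def frac_diff(series, d, size):
    """Fixed-width-window fractional differencing; the first size - 1 points are dropped."""
    w = frac_diff_weights(d, size)
    out = []
    for t in range(size - 1, len(series)):
        out.append(sum(w[k] * series[t - k] for k in range(size)))
    return out


# d = 1 recovers first differencing: weights 1, -1, then zeros.
print(frac_diff_weights(1.0, 4))

# d = 0.5 keeps decaying weights on older observations.
print(frac_diff_weights(0.5, 4))

log_prices = [1.0, 3.0, 6.0, 10.0]  # hypothetical values
print(frac_diff(log_prices, 1.0, 2))
```

Choosing d in practice, for example as the smallest value for which an ADF test rejects non-stationarity as described in the comment, is a separate step not shown here.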
@fitybux4664 1 year ago
6:25 "Some people are using a sequence length of 1... Nor is it funny or entertaining"
@saatviksingh 2 years ago ⁺¹
Ooof please post the other videos soon
@muntedme203 1 year ago
Stationarity with heteroskedasticity....LN rets is fine. How can you normalise with a window that extends beyond the lookback being used??? Lol
@BoHorror 1 month ago
For Min Max scaling why not just use Zero Mean Normalization instead
@aravindkolli 6 months ago
Is it the same with lags of prices as inputs?
@LazyProgrammerOfficial 6 months ago
This video is about lagged prices as inputs!
@jonfe 4 months ago
I'd discover a better way of normalize the data for stock prediction.
@russnagel1 3 years ago
Why is this episode 16? Where is episode 15? Is this part of your paid for course?
@LazyProgrammerOfficial 3 years ago
These are not part of a course, these are part of YouTube. You can click on my YouTube channel to see all the videos I've uploaded as usual on YouTube.
@51nibbler 1 year ago
It's not important to keep it in the same scale. But I didn't make a prediction of the next N steps of price. I only do buy, sell or wait in CFD forex :)
Then when you understand, you can normalize the input data. I have 25200 ticks as input data (AUDUSD). But for the normalization I don't use a formula from statistics or the internet; I have my own formula to calculate normalized input data :)
Yes, you are right. Don't copy code.
Understand how it works and write it yourself. And test it. And test it.. and test it.. and when your later versions are better you can make more versions^^ If not, start at the beginning and learn to understand how Q-learning works^^
Greetings from Switzerland and yes my English is bullshit xD
I've been doing it for 1 year as a hobby.. in the first version 23% of all trades were winning trades, and atm 32% of all trades are winning trades, and when I had 34% the AI made a profit with 50 pips TP, 20 pips SL :P
@51nibbler 1 year ago
And NEVER EVER NEVER use the SAME input DATA 2 times!! You don't want your AI to only be able to trade YOUR INPUT data. Look at YT videos: 99.9999% only train with the same INPUT DATA for so long that the AI KNOWS the input data xD that's bullshit
I use test data from AUDUSD at different times, USDCAD, EURUSD, AUDCAD etc etc etc
4 years of data and more... in 1 training step to see whether this version is a version with potential or crap, but you NEVER know how LONG you must TRAIN to KNOW that it works. YOU NEVER KNOW :P
@vinniehuish3987 3 months ago
@@51nibbler Wtf are you saying you Indian.
@125errorz 2 years ago ⁺¹
why arent priests talking about this?
@spinLOL533 3 years ago
6:05 lmao
@dzel774 3 years ago ⁺²
I just discovered your channel and I’m interested in the VIP course.
@LazyProgrammerOfficial 3 years ago ⁺³
Welcome! You can find links to the VIP versions of my courses via my website, lazyprogrammer.me
@dzel774 3 years ago
@@LazyProgrammerOfficial thank you
@jdaniele 3 years ago ⁺¹
50 euros? I will wait for a 9.99 offer, thanks.... :)
@spinLOL533 3 years ago
not every course goes to $9.99 lols
@jdaniele 3 years ago
@@spinLOL533 that's true, as much as, not every course will sell... :)
@datascienceprofessor 2 years ago ⁺¹
@@jdaniele
$9.99 course: predict stock prices with LSTMs!
$50 course: pointing out all the mistakes in the $9.99 course.
I'd rather pay more; at least the instructor is honest ;)
@jdaniele 2 years ago
@@datascienceprofessor
Yes, maybe. So we need a $150 course pointing out "all" the mistakes of the $50 course. 😋
And we'll need a $500 course pointing out "all" the mistakes of the $150 course.... OMG😮.. 😋😋
If you go through the process, it's an asymptotic curve.
Then, if we can afford it, the best is to buy a $10.000 course. hahahah😂
Will it cover all the errors? Who knows....🤔
So a $50 course could still have many errors, right?
If, for example, a $9.99 course offers 85% of right information and a $50 course reaches the 92%, that 7% more (actually 8.2% more if compared to 85%), costs me 500% ($50/$10), a bit too much.
So I should pay +400% to get just +8.2% more. Is it worth it?
Anyway, most of the courses on Udemy that get discounted (to $9.99 for just a few days a year) are usually sold for between $30 and $200.
Following your reasoning, a $200 course should be better than a $50, right?
So I think if we buy a $100-$200 course discounted to $9.99, it has the best value for that money, even if it is not perfect yet, for sure better than a FIXED $50 priced course!
Fixed priced courses just pissed me off... sorry! 😅
@datascienceprofessor 2 years ago ⁺¹
@@jdaniele You're just reaching and making up fictitious examples. Lazy is well known for having actually studied this type of material and applying it day to day. The others are obviously just marketers trying to capitalize on trends like ML and crypto. If you can't tell the difference, then you're probably not the target audience for this kind of course.
@dineshkrishnasamy1628 1 year ago
Hi. How to get VIP materials please
@barrard 2 months ago
Discount?
@LazyProgrammerOfficial 2 months ago
Can be found at my website!
@oberstvontoffel 2 years ago
lstm is old. use transformers
@Rvl734 3 years ago
Sir i need a project of stock price prediction lstm model (back propagation algorithm) and maa website or web app or using streamlit i will pay you reply to this comment
@calendr13 1 year ago
I am in the sector, The only useful video about stock price prediction !
@kilocesar 5 months ago
I'm creating my own library with GPT now, so I don't have to rely on looking for scraps from other coders.
@LazyProgrammerOfficial 5 months ago ⁺¹
Unbeknownst to you, GPTs are trained using Github code and therefore make the exact same mistake. I covered examples in one of my courses.
@kilocesar 5 months ago
@@LazyProgrammerOfficial I use it to implement the initial structure to save time, but I know what you mean.
@kabokbl2412 2 years ago ⁺¹
hmu when he makes the course free, i dont have money to buy it
@petemoss3160 1 year ago
heh heh heh
@MasamuneX 6 months ago
tldr: just use min/max on bounded quantities, like the outputs of some indicators, not on the price itself. Don't use it on price action, because you "cap" it at a maximum value that could easily still be growing.
