i wish my professor had explained it exactly like u just did
Thank you very much for making a vague concept so clear.
This was extremely helpful!! Between my 3 econometrics textbooks (Griffiths, Greene, and Wooldridge), the information on MA models was sparse. This really cleared up the mindset behind this model!
Thank you so much for explaining this so well! My professor and textbook explain this concept very mathematically which is hard to understand for beginners, they should really give a simple example and then dive into the details as you did.
Glad it helped!
Never seen a better explanation of MA models. Immediate subscription!
Same here! I knew I would subscribe one minute into the video. Very clear and very useful video. Thank you very much.
I was stuck on where the "error" term comes from. Now I know... it is the error from the past. You explained it! I wish you were my professor.
Couldn't be expressed so handsomely! Thanks!
Oh damn!! This is wonderful, simplified and explained pretty nicely. Keep spreading your knowledge!!
Thank you! Will do!
Thank you so much for making this fun video! Makes so much more sense now (after struggling through my not-so-crazy professor's stats class)
Thank you Sir. You have a great way of explaining things, something I sadly rarely find from my coding/statistics teachers.
This was the best video on MA. The crazy prof made our life easier 😂😂😂
Gemini 1.5 Pro: This video is about the moving average model in time series analysis. The speaker uses a cupcake example to explain the concept.
The moving average model is a statistical method used to forecast future values based on past values. It is a technique commonly used in time series analysis.
The basic idea of the moving average model is to start from the long-run average of the series and correct it using past forecast errors. This corrected average is then used as the forecast for the next period. There are different variations of moving average models, and the speaker introduces the concept with the moving average one, MA(1), model.
In the video, a grad student is used as an example. The grad student needs to bring cupcakes to a professor's dinner party every month. The number of cupcakes the grad student should bring is the forecast. The professor is known to be crazy and will tell the grad student each month by how many cupcakes he was off. This is the error term.
The moving average model is used to adjust the number of cupcakes the grad student brings based on the error term from the previous month. The coefficient is a weight given to the error term. In the example, the coefficient is 0.5, meaning the grad student will adjust the number of cupcakes he brings by half of the error term from the previous month.
For example, if the grad student brings 10 cupcakes in the first month, and the professor says the grad student brought 2 too many, then the grad student will bring 9 cupcakes in the second month (10 cupcakes - 0.5*2 error term).
The video shows how the moving average model works through a table and graph. The speaker also mentions that there are other variations of moving average models, such as moving average two (MA2) model, which would take into account the error terms from two previous months.
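The cupcake walkthrough can be reproduced in a few lines. This is only a minimal sketch, not the video's own code: the mean of 10 and the coefficient of 0.5 come from the comments, while the error sequence is an assumption, back-solved so that the forecasts match the 10, 9, 10.5, 10, 11 sequence quoted elsewhere in this thread. The function name `ma1_forecasts` is made up for illustration.

```python
MU = 10.0   # mean number of cupcakes (from the comments)
PHI = 0.5   # weight on last month's error (from the comments)

# Assumed errors for months 1-4, chosen so that the forecasts match
# the 10, 9, 10.5, 10, 11 sequence quoted in this thread.
errors = [-2.0, 1.0, 0.0, 2.0]


def ma1_forecasts(mu, phi, errors):
    """Return one-step forecasts for months 1..len(errors)+1 of an MA(1) model."""
    forecasts = [mu]  # month 1: no previous error is known yet
    for e_prev in errors:
        # next month's forecast = mean + phi * (this month's error)
        forecasts.append(mu + phi * e_prev)
    return forecasts


forecasts = ma1_forecasts(MU, PHI, errors)
for month, f_hat in enumerate(forecasts, start=1):
    print(month, f_hat)
```

Month 2 comes out as 10 + 0.5 * (-2) = 9, which is the "2 too many" case from the summary above.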
I really don't know how to thank you for that great demonstration! I've been trying to understand MA process for years!
God Bless You! I needed a fast way to get some concepts on time series forecasting and you saved me.
Easy, Fast, Complete.
You are spectacularly GOOD in the explanation of the ARIMA! Cheers
I appreciate that!
Thank you so much, I have been reading this concept in an Econometric book...but this is easy to comprehend
Glad it was helpful!
Wow! Great explanation. The professor's example was very intuitive. Thanks for the content!
This man's explanation is way better than those profs at university.
Thank you so much for your very intelligent explanation of this model!!! I felt so confused about this model before.
Great explanation! I've learned everything that I looked for. Thank you.
So simple yet easy to understand. Thank you!
A year trying to understand this, and I've just needed 15 minutes. Thx!!
This explanation gives a better understanding of why we need to avoid a unit root in time series predictions.
Fantastic, got too caught up in the math in my macroeconometrics course and had no idea what these things actually were. Super helpful conceptually
I was terrified of the mathematical symbols, but you made it so easy to understand! Thank you!
Simple and clear explanation, thank you !
Many thanks for your clear explanation of the mathematical moving average formula.
of course!
Finally ❤️ a video with an applicable and relevant example ❤️🙏
ALWAYS GRATEFUL, THANK YOU FOR THE WONDERFUL CONTENT
Great video. I think the calculation of the 3rd row is wrong. It should've been 9+0.5 = 9.5
No.. Constant term is 10 not 9
Simple Explanation is a Talent - Thanks for this
Finally understood this, thank you so much. Highly recommend!
OMG, this is brilliant , amazing ,wonderful ,thank you
Awesome explanation! Thank you so much.
Explained with the Cup Cakes it makes perfect sense, thumbs up!
Thank you so much.
Had I watched your series earlier would have saved me $3000 :(
I love this video, so simple but effective
I still don't think this makes sense to me: why does incorporating past error somehow give us a better prediction in the future in this case? Since this crazy professor will randomly choose an acceptable # of cupcakes, your past error shouldn't help in predicting the future any better.
I think the student naively believes the crazy professor will stick to his prior t-1 position (the student is unaware of the professor's craziness)
Everything in time series assumes that you can use past info to predict future info
Even though the professor selects a different number every time, in the end the average is stable. Assume you have a time series of images. Because of the unstable environment they're taken in, or other factors that alter an image, they are not always the same, although they are taken of the same scene. So, what is the goal here? To find the mutual information in the images and ignore the noise. The noise is how crazy the professor is, and the importance of the error is something we can handle through its coefficient. By handling these factors, we can get close to recognising the mutual information. Remember, these are unsupervised models. There are no labels to rely on.
Thanks man. You're doing a superb job.
Great explanation. Keep up the good work!
Nice example super easy to understand the concept!
Thank you very much! Such a clear explanation!
Great video! Thanks for sharing!
Brilliant explanation, thank you!
this is really helpful and so easy to understand!!!
How do we know what the "error" is if there is no "true value" given a random realization of data?
The idea is that you're trying to predict the next value. You get told what the next value is by the professor. If it's random then there is no signal in there & the results are still meaningless.
Let's use an example that is slightly more natural to us -- so here's this crazy professor. :D
Exceptionally useful videos for actuarial exams. Thanks for helping me pass🙂(hopefully)
Observation: 5:32 It's always centered at 10 because the error's mean was 0 (per 1:02), and the error multiplied by Φ will also have a mean of 0.
Feeling a little awkward commenting multiple times. Just trying to understand more by thinking aloud, and that someone may correct my understanding. :)
Great videos!
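The observation above can be written out in one line. This is a sketch that assumes the model from the video has the form f_t = μ + ε_t + Φ ε_{t-1}, with errors that have mean zero (the symbols are chosen here for illustration):

```latex
\begin{align*}
\mathbb{E}[f_t] &= \mu + \mathbb{E}[\varepsilon_t] + \Phi\,\mathbb{E}[\varepsilon_{t-1}] \\
                &= \mu + 0 + \Phi \cdot 0 \\
                &= \mu
\end{align*}
```

With μ = 10 the series is centered at 10 whatever value Φ takes.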
I saw the same thing, think it was just his mistake in calculation
How come some MA(1) formulas have x_t = mu + (phi1) error_t + (phi2) error_t-1..... If you are predicting at time t, then how would you know the error at time t (error_t)? Why are some formulas like this?
How is the average moving though? It was fixed for each prediction! Wouldn't it have to be recalculated each time for it to be moving?
Also we didn't seem to use anything related to the error being normally distributed... is there a reason for that? why was it mentioned in the first place?
Exactly right, I also have the same query: the average is not moving.
Did you find any other source where this is explained clearly?
If a physics student is reading this, just wanna share my intuition that this is exactly like a control system. Whatever error our model is getting, it moves to cover it, a little bit like a PI controller in electrical engineering :) Not sure if it clicks for anyone.
Or a thermostat.
LOVE IT. Thank you.
Of course!
Excellent explanation
Thanks!!! Perfect explanation :)
God Bless you.
Fantastic!
Great Presentation...
Glad you liked it!
Extremely well explained
Great videos, thank you! I have a question. Period 1 value is our mean value but we don't know what is mean since we just started from point 0. How to calculate residual then? We know the true observation and we don't know the mean. Is it just a guess? But when we use any statistical package it does not ask us to input guess mean value.
Hi, great explanation! One question, how do you guess the mu value (the average cupcake you bring) for the first time?
thanks! Really helpful
Amazing explanation man
Greatly explained!!! Thanks
Does the MA model assume e_t (the lagged residuals) are pure white noise? Mean = 0, constant variance, and no autocorrelation of residuals?
So not natural.. that is why you are so good at teaching
THANK YOU SO MUCH
Thank you❤❤❤
Perfect!
Thank you for the video, how should we choose the 0.5 coefficient in front of the error term from last period in the regression model?
Where does the noise in the equation come from? In our data we only have time on the x axis and Y as the target variable. There is no error term. What I mean to ask is does the MA model first regress y on y lag terms like the AR model and then calculate error between the actual and predicted y terms? Then regress y against the calculated error terms(residuals)?
The error is white noise coming from random shocks whose distribution is iid~(0,1). Fitting the MA estimates is more complicated than it is in autoregressive models (AR models), because the lagged error terms are not observable. This means that iterative non-linear fitting procedures need to be used in place of linear least squares. Hope this helps :).
Amazing explanation
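To make "the lagged error terms are not observable" concrete: given candidate values for the mean and the coefficient, the errors can be backed out one at a time from the observations. The sketch below assumes the model y_t = mu + e_t + phi * e_{t-1} and, as a further assumption, sets the error before the first observation to zero. The function name `ma1_residuals` is hypothetical, not taken from any library.

```python
def ma1_residuals(y, mu, phi):
    """Back out the unobserved errors via e_t = y_t - mu - phi * e_{t-1}."""
    residuals = []
    e_prev = 0.0  # assumption: the error before the first observation is 0
    for y_t in y:
        e_t = y_t - mu - phi * e_prev
        residuals.append(e_t)
        e_prev = e_t
    return residuals


# Illustrative observations built from mu = 10, phi = 0.5
# and the errors -2, 1, 0, 2.
y = [8.0, 10.0, 10.5, 12.0]
residuals = ma1_residuals(y, mu=10.0, phi=0.5)
print(residuals)
```

Because each error depends on the previous one, the residuals are a non-linear function of phi, which is why a plain linear regression on "lagged errors" is not available up front.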
you are just amazing
You can see how the crazy professor gets hungrier month by month
Well explained ❤
Thank you 🙂
Wonderful example.
thanks!
THANK you
You're welcome!
Thank you. Love your video tutorials! Just one question: shouldn't the curve at 5'58'' be f_t? And c(10,9,10.5,10,11) be f_(t-1)?
how do we find the coefficient for the moving average model?
Algorithms use the entire time series to get as close as possible to the true value of the coefficient (often with a maximum likelihood estimator).
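As a rough illustration of using the entire series, the sketch below simulates an MA(1) series and then picks the coefficient that minimises the sum of squared backed-out errors. Note the swap: this is a conditional least-squares grid search, a simpler stand-in for the maximum likelihood estimators mentioned above, and every function name here is made up for the example.

```python
import random


def simulate_ma1(n, mu, phi, seed=0):
    """Simulate y_t = mu + e_t + phi * e_{t-1} with standard normal errors."""
    rng = random.Random(seed)
    e_prev = 0.0
    series = []
    for _ in range(n):
        e_t = rng.gauss(0.0, 1.0)
        series.append(mu + e_t + phi * e_prev)
        e_prev = e_t
    return series


def sum_squared_errors(y, mu, phi):
    """Sum of squared errors backed out with e_t = y_t - mu - phi * e_{t-1}."""
    e_prev = 0.0  # assumption: the error before the first observation is 0
    total = 0.0
    for y_t in y:
        e_t = y_t - mu - phi * e_prev
        total += e_t * e_t
        e_prev = e_t
    return total


def fit_ma1_grid(y):
    """Estimate mu by the sample mean and phi by a grid search on (-1, 1)."""
    mu_hat = sum(y) / len(y)
    candidates = [k / 100 for k in range(-99, 100)]
    phi_hat = min(candidates, key=lambda phi: sum_squared_errors(y, mu_hat, phi))
    return mu_hat, phi_hat


y = simulate_ma1(n=2000, mu=10.0, phi=0.5)
mu_hat, phi_hat = fit_ma1_grid(y)
print(mu_hat, phi_hat)
```

With a couple of thousand observations the estimates should land close to the true 10 and 0.5. With only a handful of months, as in the cupcake table, they would be much noisier.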
Hi. The mean of et is not 0. For time interval 5, you need to write -1.
you are too good
Great explanation! Third row shouldn't it be 9.5 rather than 10.5?
No, 10+1/2=10.5
@@wenzhang5879 Yeah, got it. Thanks
Does mu have to be a constant? Can we use a rolling window to calculate the average? Will this yield better predictions?
Hey amazing Content Bravo !
Can you add to that a video talking about random walk ?
That would be great .
how is it possible you can explain this stuff so easily!
What is the difference between taking the average of the first 3 values to calculate the centered average at time period 2, and this method (average + error at t + error at previous time period)?
What you are describing is MA smoothing, which is used to describe the trend-cycle of past data
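For contrast with the MA model, here is a minimal sketch of centered moving-average smoothing. It averages neighbouring observations and uses no error terms or fitted coefficient at all. The data values are illustrative only, and the function name is hypothetical.

```python
def centered_moving_average(values, window=3):
    """Smooth a series with a centered window; the ends are left as None."""
    if window % 2 == 0:
        raise ValueError("window must be odd for a centered average")
    half = window // 2
    smoothed = [None] * len(values)
    for i in range(half, len(values) - half):
        # average the point itself and `half` neighbours on each side
        smoothed[i] = sum(values[i - half:i + half + 1]) / window
    return smoothed


observed = [8.0, 10.0, 10.5, 12.0, 12.0]  # illustrative values only
smoothed = centered_moving_average(observed)
print(smoothed)
```

The value at time period 2 is the average of the first three observations. It describes the past trend rather than forecasting the next point.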
My professor's idea of a monthly party is 5k run 200 pushups 200 squats and 30 pullups
I am trying to get a grip on Moving Average models. Ones I know are:
SMA:
f_{t+1} = (o_{t} + o_{t-1} + ... + o_{t-n+1}) / n
Note: There is no coefficient here, just n.
EMA:
f_{t+1} = α*o_{t} + (1-α)*f_{t} = f_{t} + α(o_{t} - f_{t}) where 0 < α
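The two formulas above translate directly into code. This is a sketch with made-up function names and illustrative data. The upper bound on α is cut off in the comment, so the sketch does not enforce one, and the first EMA forecast has to be supplied by the caller because the formula does not define it.

```python
def sma_forecast(observations, n):
    """SMA: f_{t+1} is the mean of the last n observations."""
    if n <= 0 or n > len(observations):
        raise ValueError("n must be between 1 and the number of observations")
    return sum(observations[-n:]) / n


def ema_forecasts(observations, alpha, first_forecast):
    """EMA: f_{t+1} = f_t + alpha * (o_t - f_t), starting from first_forecast."""
    forecasts = [first_forecast]
    for o_t in observations:
        f_t = forecasts[-1]
        forecasts.append(f_t + alpha * (o_t - f_t))
    return forecasts


observations = [8.0, 10.0, 10.5, 12.0]  # illustrative values only
sma = sma_forecast(observations, n=3)
ema = ema_forecasts(observations, alpha=0.5, first_forecast=10.0)
print(sma)
print(ema)
```

Both of these average observations. The MA(1) model from the video instead adds a weighted past error to a fixed mean, so the shared name is the main thing they have in common.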
God-like!
Thanks 🙏
This is a great explanation, but in many equations they also add the current error (epsilon_t). I just don't get how we are supposed to know our current error if we are trying to forecast a value. Do we simply neglect that current error term for forecasting?
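One way to see it, sketched under the usual assumption that the current error has mean zero and is independent of everything observed before time t (the notation is chosen here for illustration): the equation with ε_t describes how the value is generated, and the forecast is its expectation given the past, in which the current error averages out.

```latex
\begin{align*}
x_t &= \mu + \varepsilon_t + \phi\,\varepsilon_{t-1}
  && \text{(how the value is generated)} \\
\hat{x}_t &= \mathbb{E}[x_t \mid \varepsilon_{t-1}, \varepsilon_{t-2}, \dots]
  = \mu + \phi\,\varepsilon_{t-1}
  && \text{(forecast, since } \mathbb{E}[\varepsilon_t] = 0\text{)} \\
x_t - \hat{x}_t &= \varepsilon_t
  && \text{(the error, known only after } x_t \text{ is observed)}
\end{align*}
```

So the current error is not neglected in the model. It is replaced by its expected value of zero when forecasting.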
thank you so much
thank you so so much
You're welcome!
Great video. Do you always start with the mean as your first guess for f hat? Also, how do you fit an MA(q) model?
Why is it that in some models the prediction (f hat) is the average of the previous f values, but in other models it is the error of the previous predictions that predicts f hat?
I have the same doubt. Sometimes he added half of the error to f, and sometimes to f-hat.
Hi... I have one doubt.. shouldn't you have plotted the values for ft^ instead of ft in the graph?
P.S: Thank you for taking the time to make these videos. It's really helpful.
I was about to ask the same thing but I don't think the instructor responds to questions.
@@isabellaexeoulitze6544 yeah.. I kinda expected that since it's an old video.. nevertheless I commented my doubt, hoping that someone else watching the video might clarify...
He seems to have drawn the f_t line to show that the time series data is kind of centered around the mean, but I also wonder why he didn't draw the predicted f_t along with the real f_t.
Really good explanation!
Maybe I'm stupid for asking this...
If one was to write an MA filter, how do you determine M?
Thanks, this is a really clear explanation. My only question is: when you are calculating your f_t column, why are you including the error from the current time period? Shouldn't you only be including the 0.5*e_{t-1}?