Thank you for your video! Love the analogies with the blind folded hiker and the ball, really makes sense to me now!
Very good video. I learned how optimizers work in just 8 minutes.
Thank you for explaining the concepts so clearly.
Your videos are great. Thanks a lot!
You are the BEST teacher. Thank you!!! All the best for you sir Sreeni.
I love your content!
Thanks
Thank You so Much
Hello there, I really like the way you explained the concept.
YOU ARE A LIFE SAVER !!!
I am glad you think so :)
@@DigitalSreeni it would be great if you could show us how to combine different sets of features, like GLRLM with CNN features or LBP, or how to use multiple classifiers on a specific feature set. Thank you for all the good work; my classmates and I come to your channel whenever we're stuck, and we always learn something from you.
Nice explanation.
Great explanation
Glad you liked it
Excellent explanation. Thank you so much. One question, however: you are saying that when I use the Adam optimizer I don't have to explicitly define the learning rate, right? But what happens when I do optimizer = tf.keras.optimizers.Adam(learning_rate=5e-5)? My understanding is that the Adam optimizer starts with a learning rate of 5e-5 and adapts it from there. Is that so? TIA.
Hi sir, could you upload the slides for all the videos you have posted?
Could you please attach a link to the research paper that introduces the Adam optimizer?
Is hinge a loss function or an optimizer?
Hi Sreeni, thanks for the video. Regarding the default values, in the TensorFlow description of Adam, they wrote "The default value of 1e-7 for epsilon might not be a good default in general. For example, when training an Inception network on ImageNet a current good choice is 1.0 or 0.1". Does it make sense to test several values here?
Also, I wondered whether it makes sense at all to pass a learning rate schedule to Adam?
I am not sure why 1e-7 would not be a good default for epsilon. This hyperparameter is just there to prevent division by zero. A value of 1.0 is quite large and may only make sense in special cases. There are a lot of hyperparameters you can worry about, but for a typical application epsilon is not one of them. If you are engineering your own networks, like coming up with an Inception-like network, you can tune it yourself. Still, I am not sure whether they mentioned why 1.0 was a better value than 1e-7, and if so, by how much it improved their results.
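To make epsilon's role concrete, here is a minimal single-parameter sketch of one Adam update step in plain NumPy (an illustration only, not TensorFlow's actual implementation; the default values shown mirror common ones). Epsilon appears only in the denominator, where it keeps the division stable when the second-moment estimate is near zero:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    """One simplified Adam update for a scalar parameter w."""
    m = beta1 * m + (1 - beta1) * grad           # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2      # second-moment (variance) estimate
    m_hat = m / (1 - beta1 ** t)                 # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)  # eps prevents division by zero
    return w, m, v

w, m, v = 1.0, 0.0, 0.0
for t in range(1, 4):
    grad = 2 * w                                 # gradient of f(w) = w**2
    w, m, v = adam_step(w, grad, m, v, t)
print(w)  # w moves toward 0 by roughly lr per step
```

Because the step size is m_hat / sqrt(v_hat) scaled by the learning rate, changing eps from 1e-7 to 1.0 would noticeably shrink every step, which is why it only makes sense in special cases.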
Or, when we talk about optimization, are we talking about finding the best parameters? E.g., similar to how it's done with hyperparameter tuning for RF, DT, etc.?
Phenomenal
Thanks
2:25 Doesn't TF transform the equations used for the input into their respective derivatives? That's mathematically different from probing two points.
It's a simplification, bro.
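For anyone curious about the difference being discussed: frameworks like TensorFlow build the analytic derivative of the loss via automatic differentiation, while "probing two points" is a finite-difference approximation of that same derivative. A tiny plain-Python sketch (illustrative only, using f(x) = x**2 as a toy loss):

```python
def f(x):
    return x ** 2              # toy loss function

def analytic_grad(x):
    return 2 * x               # exact derivative, what autodiff effectively computes

def finite_diff_grad(x, h=1e-5):
    # Probing two nearby points: a numerical approximation of the derivative.
    return (f(x + h) - f(x - h)) / (2 * h)

x = 3.0
exact = analytic_grad(x)       # 6.0
approx = finite_diff_grad(x)   # very close to 6.0, up to numerical error
print(exact, approx)
```

So the two-points picture is a fine mental model for "which way is downhill," even though the framework computes the exact derivative instead.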
Hello sir, this is very informative for beginners. If possible, please also make a tutorial on stacked denoising autoencoders for intrusion detection.
Noted
Hi Sreeni, I am a beginner with Python, just learning the ropes. I thought every ML model tries to reduce the error anyway (e.g., linear regression by fitting the line and reducing the residuals). So what do we need optimizers for then? I don't get it. Can anyone explain?
Optimizers are what minimize the loss function (error). For example, for linear regression your goal is to minimize the mean squared error. How does the system do the job of minimizing this error? How does it know whether the error is increasing or decreasing when the parameters are changed? You can use the gradient descent optimizer for this task. The optimizer calculates the gradient of the loss function, updates the parameters by taking a step in the opposite direction of the gradient, and repeats the process until convergence or until a maximum number of iterations is reached.
Basically, optimizers use different algorithms to update the model's parameters (e.g., the weights of the neural network).
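To put numbers behind this, here is a minimal sketch of gradient descent fitting a one-parameter linear regression y = w*x by minimizing mean squared error (the data and learning rate are made up for illustration):

```python
import numpy as np

# Toy data generated from y = 2x; the optimizer should recover w close to 2.
X = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * X

w = 0.0       # initial guess for the slope
lr = 0.01     # learning rate (step size)

for _ in range(200):
    y_pred = w * X
    # Gradient of MSE = mean((w*X - y)**2) with respect to w:
    grad = 2 * np.mean((y_pred - y) * X)
    w -= lr * grad          # step in the opposite direction of the gradient
print(round(w, 3))          # converges close to 2.0
```

Each pass, the sign of the gradient tells the optimizer which direction increases the error, and the update moves the opposite way, which is exactly the "blindfolded hiker walking downhill" picture from the video.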
@@DigitalSreeni thanks Sreeni. I understand that optimizers are there for reducing the error whenever the parameters are changed, and that it's done with the gradient descent optimizer in this case. It's just quite theoretical; I always need to see the context and numbers behind it. Anyway, I may just have another look at the video. Thanks!
Sir, please make a tutorial on image processing and segmentation with deep learning.
I have a bunch of videos on deep learning, please look for them in my channel.
Sorry, I don't understand the role of the optimizer. We know the whole objective function is differentiable; I thought we just move in the opposite direction of its derivative. Why did you say that the optimizer keeps testing directions? Thanks!
The role of the optimizer is to adjust the weights and biases such that the loss gets minimized. Maybe this video helps fill some gaps in your understanding? czcams.com/video/KR3l_EfINdw/video.html
@@DigitalSreeni I'll take a look. Thanks for answering