The Kernel Trick in Support Vector Machine (SVM)

Visually Explained

230,179 views

Added 8 May 2022
By default, an SVM can only produce linear boundaries between classes, which is not enough for most machine learning applications. To get nonlinear boundaries, you have to apply a nonlinear transformation to the data beforehand. The kernel trick lets you bypass the need to specify this nonlinear transformation explicitly. Instead, you specify a "kernel" function that directly describes how points relate to each other. Kernels are much more fun to work with and come with important computational benefits.
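As a minimal sketch of that difference, the snippet below fits a linear SVM and an RBF-kernel SVM on two concentric rings. It assumes scikit-learn is installed; the toy dataset (`make_circles`), its parameters, and the variable names are illustrative choices and are not taken from the video.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Toy data: an inner ring of one class surrounded by an outer ring of the other.
# No straight line can separate the two classes.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# Linear SVM: restricted to a straight-line boundary.
linear_svm = SVC(kernel="linear").fit(X, y)

# RBF-kernel SVM: no feature transformation is written down anywhere,
# only the kernel is named.
rbf_svm = SVC(kernel="rbf", gamma="scale").fit(X, y)

linear_acc = linear_svm.score(X, y)  # accuracy on the training points
rbf_acc = rbf_svm.score(X, y)

print(f"linear kernel accuracy: {linear_acc:.2f}")
print(f"RBF kernel accuracy:    {rbf_acc:.2f}")
```

The only change between the two models is the `kernel` argument, which is the practical appeal of the trick: the nonlinear lift never has to be coded by hand.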
---------------
Credit:
🐍 Manim and Python : github.com/3b1b/manim
🐵 Blender3D: www.blender.org/
🗒️ Emacs: www.gnu.org/software/emacs/
This video would not have been possible without the help of Gökçe Dayanıklı.
Science & Technology

Comments • 129

@nicoleta-vw3ql 7 months ago ⁺⁴⁶
I listened to my lecturer and I was convinced that not even she understood something of her lecture...You clarified a lot for me only in three minutes...
@technosapien330 3 months ago ⁺²
More common than you'd like to think, I'm also convinced my lecturers don't understand it either.
@TheKrasTel 2 years ago ⁺⁴⁶
Your videos are absolutely amazing. Please keep making these, eventually the serious view numbers will come!
@PowerhouseCell 1 year ago ⁺²²
Wow! I can't believe I didn't find this channel until now- your videos are amazing! As a creator myself, I understand how much work must go into this, so HUGE props!! Liked and subscribed 💛
@Baldvamp 1 month ago ⁺¹
Incredible video, no messing around with long introductions, not patronising and easy to follow. It should be used as a guide for other people making educational videos!
@jeffguo490 1 year ago ⁺²²
This video is perfectly clear. I learnt SVM in class while I was confused by the lecture, and it is much clearer now.👍
@peaceandlove5855 2 years ago ⁺⁶
seriously !!!! simplest explanation of Kernel in SVM ever seen , just wow
thank you so so much bro for the hard work you are doing to make such great videos ;*
@LH-et7of 1 year ago ⁺¹
The best and most concise tutorial on Kernel tricks and SVM.
@sunritpal9596 2 years ago ⁺¹²
Absolutely incredible explanation 👏
@satyamverma4726 1 year ago ⁺¹
Please keep making these contents. They are so intuitive. I don't understand why such channel don't grow on the other hand shitty contents are growing exponentially.
@judgelaloicmoi 2 years ago ⁺³
so amazingly simple and clear explanation, thank you so much !
@angelo6082 6 days ago
You saved me for my Data mining exam tomorrow
🙏
@tomashorych394 2 years ago ⁺⁴
Thanks a lot! I struggled to understand the difference between conventional dimension lifting and the "trick". Now it's crystal clear! Great explanation.
@VisuallyExplained 2 years ago
Wonderful!!
@srinayan7839 2 months ago
Please don't stop doing videos you are helping a lot of people
@lucaspecht9105 2 years ago ⁺²
excellent video, short but informative!
@vil9386 3 months ago
understood fully. thank you. giving a code sample is like a bonus. Awesome explanation.
@mukul-kr 2 years ago
just subscribed after watching this video only. Hoping to find more good content as these in your channel
@amardeepkumar1267 1 year ago
Great Visuals and explanation. Got it in One go. Thanks
@think-tank6658 10 months ago
I went through class video for 1 hour didn't understand a thing..thank god you taught me in 3 min ..you are a legend bro
@negarmahmoudi-wt5bg 23 days ago
Thank you for this clear explanation.
@0xabaki 3 months ago
super smooth explanation. Thanks!
@0-sumequilibrium421 1 year ago
Very good video! Nice visualization! :)
@jonahturner2969 2 years ago
Wow, I am sharing this everywhere bro. Fantastic videos, we will grow together !!
@5MassaM5 2 years ago ⁺¹
Simply AMAZING! Thanks a lot!!!
@HeduAI 1 year ago
Such an amazing explanation!
@archer9056 2 months ago
Amazing videos by explaining different concepts in simple words.
Please have long videos as well..
@Bibhu_Nanda 2 years ago ⁺²
Keep doing the Good work 👏
@AshishOmGourav 2 years ago
Wow...this is for free.
Amazing visuals!
@Sameer-jv5vx 1 year ago
thankyou so much you have nailed it it is crystal clear about kernel after watching your video thanks again
@batikanboraormanci 1 year ago
Wow, very nicely explained
@jamesr141 1 year ago
Gonna try the Kernel Trick!
@letsplay0711 2 years ago
Amazing Man.. 😍
Just Awesome...
@user-wi7st7ln9p 1 year ago
Wow! Great video, thanks :)
@JI77469 8 months ago ⁺⁴
Fantastic video, but I think you should mention at the end why the "kernel trick" isn't practical with lots of data (i.e. why deep learning is used much more than the "kernel trick" in this age of big data): given a kernel, you need to store the entire matrix of kernel values for every pair of data inputs, which grows quadratically with the number of points. There are ways to mitigate this a bit (for example Random Fourier Features and the Nyström method), but this is still a huge issue that no one seems to have figured out how to fix.
On the other hand, if you have a small amount of complicated data then the kernel trick is very useful! For example, a medical researcher might only have access to the horrifically complicated genomic data of patients at their hospital.
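To make the mitigation mentioned above concrete, here is a rough NumPy sketch of Random Fourier Features for the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2). The function names, data sizes, and the value of gamma are assumptions made for illustration; this is not code from the video or from the commenter.

```python
import numpy as np

def rbf_gram(X, gamma):
    """Exact RBF kernel matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq_norms = np.sum(X**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def random_fourier_features(X, gamma, n_features, rng):
    """Map X to features Z such that Z @ Z.T approximates the RBF Gram matrix."""
    d = X.shape[1]
    # Frequencies are drawn from the Fourier transform of the RBF kernel:
    # a Gaussian with standard deviation sqrt(2 * gamma) in each coordinate.
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))   # 40 points with 3 features (toy sizes)
gamma = 0.5

K_exact = rbf_gram(X, gamma)                              # shape (40, 40)
Z = random_fourier_features(X, gamma, n_features=5000, rng=rng)
K_approx = Z @ Z.T                                        # approximate Gram matrix

max_err = np.max(np.abs(K_exact - K_approx))
print(f"largest entrywise error: {max_err:.3f}")
```

With n data points and D random features, `Z` needs n x D numbers instead of the n x n kernel matrix, and a plain linear model can be trained on `Z`. That only pays off when n is much larger than D, which is the opposite of this toy setting.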
@user-ej1qf3jz1l 1 year ago
incredible explanation!
@gunduqrkdxhdrkd94 1 year ago
I am korean so i am not good at english, but your teaching is very clear and easy to understand. Thank you teacher!
@lip6792 1 month ago
Super good explanation
@cleverclover7 5 months ago
fantastic contribution
@GamerAvi33 6 months ago
Thank you!! Nice video :)
@gama3181 1 year ago
i love it ! thanks for the explanation
@dialgos7574 5 months ago ⁺³
At 2:11 the kernel and the transformation function shown do not match. The transformation function is missing the element 1 as its last component and needs a scaling factor of sqrt(2) in front of the first 3 elements.
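The frame at 2:11 is not reproduced here, so the following is only a numerical check of the identity this comment appears to describe. It assumes 2-D inputs and the kernel k(x, y) = (1 + x^T y)^2; the order of the components in `phi` is a guess based on the comment's wording.

```python
import numpy as np

def phi(x):
    """Explicit degree-2 feature map for a 2-D point (assumed component order)."""
    x1, x2 = x
    s = np.sqrt(2.0)
    # sqrt(2) scales the first three components; the constant 1 comes last.
    return np.array([s * x1, s * x2, s * x1 * x2, x1**2, x2**2, 1.0])

def poly_kernel(x, y):
    """Degree-2 polynomial kernel with the constant term."""
    return (1.0 + np.dot(x, y)) ** 2

rng = np.random.default_rng(0)
x, y = rng.normal(size=2), rng.normal(size=2)

explicit = np.dot(phi(x), phi(y))   # inner product in the lifted 6-D space
implicit = poly_kernel(x, y)        # same number, computed from the 2-D points

print(explicit, implicit)
```

With the sqrt(2) factors and the trailing 1 in place, the two numbers agree up to floating-point rounding for any pair of points.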
@tongwei3527 3 months ago
👍
@harithm8172 9 months ago
Amazing explanation
@lucasw8032 5 months ago
Thank you so much, excellent content.
@sophiaxiao5071 10 months ago
I really like this video. Thanks!~
@MamourBa-dc3fv 1 month ago
Fantastic video
@rokibulhasan2005 9 months ago
Great Video Sir
@marishakrishna 1 year ago
Great.....Great presentation...This is what I mean when I say use Visuals graphics to explain a concepts
@Infinium 2 years ago ⁺¹
Really great video, thanks for sharing! Out of interest, what do you use to produce your videos?
@VisuallyExplained 2 years ago ⁺³
Thank you!! I use Blender3D and manim
@benaoudamamchaoui8384 1 year ago
amazing !! subscribed
@vaishnavi4354 11 months ago
that's so powerful to understand
@faridkhan5498 1 year ago
Something that took days to understand was well understood within a few mins. Cannot emphasize enough the importance of visualizations in MATH.
@rishisingh6111 1 year ago
Awesome! Thanks a ton!
@algiffaryriony 2 years ago
great video!
@waseemahmed7919 1 year ago
great work
@real7876 2 years ago
to the point, concise, easy to understand, and even with code sample
thanks!
@TalGalili 2 years ago
I wish I could like this video twice.
@DarkZeuss 1 year ago
Just fantastic explanation, i was wondering how much time takes to make such a high quality video, and what software he is using to do it. ? ? anyone knows ?
@Mutual_Information 2 years ago
Holy shit these are good videos!
@Mindreaderli 2 years ago
Wonderful clear
@footballistaedit25 2 years ago ⁺²
Thanks, Sir. Your explanation is incredibly amazing
@VisuallyExplained 2 years ago
So nice of you! You are most welcome!
@footballistaedit25 2 years ago
@Visually Explained It helps so much. I'm waiting for the next video
@imalive404 1 year ago
i have never understood kernel trick in SVM better.
@photogyulai 2 months ago
Bravo! Great lecture! How the hell do you do this interactive function animations?
@abdulelahaljeffery6234 2 years ago
Hmmmm!
Grant Sanderson is getting a serious contender right here
@saxonharries9033 2 years ago
Fantastic video, thankyou
@VisuallyExplained 2 years ago ⁺¹
Thank you too!
@man49231 1 year ago
awsm video please make a video explaining K-Nearest Neighbors Algorithm also
@hantiop 7 days ago
Quick question: How do we choose the gamma parameter in the RBF kernel at 3:00? By, say, cross validation?
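Cross-validation is indeed a common way to pick gamma. The sketch below shows one way to do it, assuming scikit-learn is installed; the dataset (`make_moons`), the candidate grid, and the fixed `C=1.0` are illustrative assumptions and are not the settings used in the video.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Toy two-class data with a curved boundary.
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

# Candidate gamma values on a log scale: 0.001, 0.01, ..., 100.
param_grid = {"gamma": np.logspace(-3, 2, 6)}

# 5-fold cross-validation: each gamma is scored on held-out folds,
# and the value with the best mean accuracy is kept.
search = GridSearchCV(SVC(kernel="rbf", C=1.0), param_grid, cv=5)
search.fit(X, y)

best_gamma = search.best_params_["gamma"]
print("best gamma:", best_gamma)
print("mean cross-validated accuracy:", round(search.best_score_, 3))
```

In practice `C` is usually searched together with gamma, since the two interact; it is held fixed here only to keep the sketch short.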
@pomegranate8593 1 year ago
AWESOME!!
@tlchannel2359 1 year ago
i hope u come back , i really like this content , pls
@flo7968 1 year ago
perfect video
@fortunebezetshali7468 2 years ago
really a superbe video ! Thank you.
@VisuallyExplained 2 years ago
So nice of you :-)
@raven5165 2 years ago ⁺¹
Amazing video! How do you animate your videos?
@VisuallyExplained 2 years ago ⁺²
Thanks! I use manim and Blender3D
@JulianHarris 6 months ago
Thanks! How do you know which gamma to use?
@PortesNew 2 years ago
Excellent
@Gibson-xn8xk 1 year ago ⁺¹
THANK YOU!!!
@leogacitua1926 1 year ago
Great video! How does this differ from kernel ridge regression?
@JI77469 8 months ago
Kernel ridge regression is just that: regression using the kernel trick. Namely, instead of fitting a hyperplane of best fit directly, you use the kernel trick to implicitly map the data nonlinearly into a high-dimensional (sometimes infinite-dimensional) space. But as with SVM, we don't need this map; we just need the kernel matrix at all the data points to run the algorithm in practice.
"Ridge" just refers to adding an l2 penalty to avoid overfitting. "Lasso" refers to an l1 penalty, and I think in practice people even use l1+l2 penalties.
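A small NumPy sketch of kernel ridge regression as described in this reply: the fit only ever touches the kernel matrix, never an explicit feature map. The RBF kernel, the values of `gamma` and `lam`, and the noise-free sine data are assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """K[i, j] = exp(-gamma * ||A[i] - B[j]||^2)."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

# Training data: noise-free samples of sin(x) on [-3, 3].
X_train = np.linspace(-3.0, 3.0, 40)[:, None]
y_train = np.sin(X_train).ravel()

gamma, lam = 0.5, 1e-3   # kernel width and l2 ("ridge") penalty strength

# Fit: solve (K + lam * I) alpha = y using only the n x n kernel matrix.
K = rbf_kernel(X_train, X_train, gamma)
alpha = np.linalg.solve(K + lam * np.eye(len(X_train)), y_train)

# Predict: a weighted sum of kernel values between new and training points.
X_test = np.linspace(-2.5, 2.5, 11)[:, None]
y_pred = rbf_kernel(X_test, X_train, gamma) @ alpha

max_err = np.max(np.abs(y_pred - np.sin(X_test).ravel()))
print(f"largest prediction error: {max_err:.4f}")
```

The main difference from an SVM is the loss: here it is squared error with a closed-form solution, whereas an SVM uses a margin-based loss and typically ends up with a sparse set of support vectors.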
@light_rays 2 years ago
wow! that's so cool
@michele23 2 months ago
Can you make a video on unique game conjecture / unique label cover ? That would be very helpful
@chaddoomslayer4721 2 years ago
What kind of software did you use to create such beautiful illustrations?
@VisuallyExplained 2 years ago
I have added a list to the video description
@chaddoomslayer4721 2 years ago
@@VisuallyExplained appreciated!
@imanelamnaoir590 6 months ago
Amazing
@selvabalan5464 3 months ago
excellent
@julienblanchon6082 2 years ago ⁺²
What did you use for visualization ?
@VisuallyExplained 2 years ago
Blender, manim library, and after effect
@sisialg1 2 years ago
you are amazing, you saved me !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
@VisuallyExplained 2 years ago
YAYY!
@fortherest8490 1 year ago
Im confused, how to display the hyperplane with polynomial kernel? Help me please?
@fizipcfx 2 years ago ⁺¹
did you use the Unity to produce this video?
@alexradu1921 2 years ago
he said blender
@fredericnataf7927 2 years ago
Crystal clear
@pacomermela6497 2 years ago
Thank you
@poulamibanerjee3040 1 year ago
So easy to visualize
@ProfBeckmann 7 months ago
thanks!
@-________9732 29 days ago
In witch programmulka can doing this
@rohanjamakhandi5461 7 months ago
How to I use this when my decision boundary needs to be spiral?
@CrispySmiths 1 year ago ⁺¹
What software do you use to make these videos? and why have you stopped making videos!? and why did you start?
@user-wr4yl7tx3w 7 months ago ⁺¹
why did you have "1 + " term in your polynomial kernel?
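One way to see what the constant does is to expand both versions of the degree-2 kernel. This is a short general derivation, not the author's own answer:

```latex
(x^\top y)^2 = \sum_{i,j} x_i x_j \, y_i y_j
\qquad \text{(degree-2 monomials only)}

(1 + x^\top y)^2 = 1 + 2\, x^\top y + (x^\top y)^2
\qquad \text{(constant, linear, and degree-2 terms)}
```

So with the "1 +" the implicit feature space contains every monomial up to degree 2, and the learned boundary can include linear terms as well as quadratic ones. Without it, only the homogeneous quadratic terms are available.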
@BlueBirdgg 2 years ago
Ty!
@lancelotdsouza4705 2 years ago ⁺¹
Loved your video..can you Mentor me?
@ttiigogo5897 1 year ago
Still don't really understand, but I'm closer, thanks!
@peskarr 1 year ago
i dream about this visualisations
@owonubijobsunday4764 1 year ago
I'd like to subscribe again
@AmitYadav-ss7hj 2 years ago
Great visualization.
@cy-ti8ln 1 month ago
As I see, kernel-based regression is a type of symbolic regression
@frederickaltrock5018 1 year ago
I really would have liked to know why you claim that we only need to compute inner products. Does it arise from the dual problem? If I remember correctly, that problem features such scalar products. And why is that better?
@VisuallyExplained 1 year ago ⁺²
If you write the dual problem (e.g., page 4 of www.robots.ox.ac.uk/~az/lectures/ml/lect3.pdf) you can see indeed that it only depends on inner products of the training points. There is, however, a more "intuitive" way to see why SVM only cares about the inner products without actually writing the dual problem. If I give you the matrix X = (xi^Txj) of pair-wise inner products of n points, you can recover the coordinates of the n points "up to rotation" by computing the matrix square root of X, for example. In other words, all sets of n points that have the inner product matrix X are obtained by rotating the n points whose coordinates are the columns of sqrtm(X). Now, if you want to separate the n points with a hyperplane, your problem doesn't fundamentally change if all the points are rotated in some arbitrary way (because you can just rotate the separating hyperplane in the same way). So the knowledge of the matrix of inner products X is sufficient for SVM to do its job.
As to why that's helpful, let's say you have 100 points, and each point has a million features (which can easily happen if you "lift" the data). That's 100 million numbers you need to store. However, the matrix of inner products will be 100x100 only, which is a huge saving!
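The argument in this reply can be checked numerically. The sketch below uses NumPy with small random points; the sizes and variable names are arbitrary assumptions, and the symmetric square root is computed through an eigendecomposition as one possible stand-in for `sqrtm`.

```python
import numpy as np

rng = np.random.default_rng(0)
n_points, n_features = 6, 4

# Columns of P are the points; G is the matrix of pairwise inner products.
P = rng.normal(size=(n_features, n_points))
G = P.T @ P                                    # shape (6, 6)

# Applying the same orthogonal map Q (a rotation or reflection) to every
# point leaves all pairwise inner products unchanged.
Q, _ = np.linalg.qr(rng.normal(size=(n_features, n_features)))
G_rotated = (Q @ P).T @ (Q @ P)

# Symmetric square root of G. Its columns are one valid set of point
# coordinates that reproduces exactly the same inner products.
eigvals, eigvecs = np.linalg.eigh(G)
eigvals = np.clip(eigvals, 0.0, None)          # guard against tiny negatives
S = (eigvecs * np.sqrt(eigvals)) @ eigvecs.T
G_from_sqrt = S.T @ S

# Storage comparison from the reply: 100 points with a million features each.
lifted_numbers = 100 * 1_000_000               # explicit lifted coordinates
gram_numbers = 100 * 100                       # matrix of inner products

print(np.allclose(G, G_rotated), np.allclose(G, G_from_sqrt))
print(lifted_numbers, gram_numbers)
```

Both checks compare Gram matrices only, which mirrors the point being made: any algorithm that depends on the data solely through inner products cannot tell these point sets apart.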
@crush3dices 1 year ago
@@VisuallyExplained this is my other account. The first one is for work, didn't realize I used that one.
Thanks for taking the time to answer :). Interesting, and not quite trivial, that the matrix contains all relevant information.
I have already read some more sources on it now. Since you seem to understand the math involved, maybe you can help me with another question. The question is why exactly we use the kernel trick instead of simply using a usual transformation into another vector space and then using usual linear SVMs. This seems like it would work, so there has to be a motivation for the kernel trick. I already read that this has better performance. But even the book "hands on machine learning" only says that it "makes the whole process more efficient", which says practically nothing about the motivation. One thing one can easily notice is that since the dual problem optimizes only for the Lagrange multipliers, we have to calculate the kernel only once before training. This also seems to be the reason why the kernel trick only works for the dual problem. But I was wondering whether this is the whole motivation or if there is some more magic that I missed here?
@VisuallyExplained 1 year ago ⁺²
@@crush3dices There are basically two main reasons. The first, which you already alluded to, is performance. It's more efficient to compute k(x,x') than the transformation f(x) if f is very high dimensional (or worse, infinite dimensional). The second reason is practical: sometimes, it is easier to choose a kernel than a transformation f
@crush3dices 1 year ago
@@VisuallyExplained alright thanks.
@benmoyal6107 11 months ago
wow!
