The Beauty of Linear Regression (How to Fit a Line to your Data)

  • Uploaded 25 Nov 2022
  • In this video, we'll explore the concepts surrounding linear regression. Linear regression is very useful in math, science, and engineering, and is a gateway to other kinds of regression, and optimization problems in general.
    Download the Linear Regression Example Code here: pastebin.com/7cgh951s
    Thanks to fesliyanstudios.com for the background music! :)

Comments • 256

  • @RichBehiel
    @RichBehiel  1 year ago +42

    Hi everyone, this video has been getting a lot of views lately so I just wanted to say thank you, and I really appreciate all the positive feedback. It’s great to see such a positive response, and I’m glad that so many people are enjoying linear regression! :)
    I also appreciate the constructive criticism! A few of you have pointed out that the music is distracting, the motion is too repetitive, and the pace is a bit slow. I didn’t see that when posting the video, but I can totally see where you’re coming from, so I’ll definitely take that into account when making future videos. This was one of my earlier videos and I was still figuring things out. So I really appreciate your feedback, and I hope these videos will get better over time.

    • @myetis1990
      @myetis1990 1 year ago +2

      You are not only teaching math stuff but also teaching how to think, thank you very much for the great video.
      Really inspiring, glad I discovered this channel. Waiting for the videos about the Jacobian, translation, rotation, quaternions

    • @ehfik
      @ehfik 8 months ago

      the constant animation loop gets a bit annoying. reversing, stopping and changing the animation from time to time would be a solution (and your newer videos are even better anyway!)

    • @RichBehiel
      @RichBehiel  8 months ago +1

      I agree. Honestly I look back on this video and cringe at a few of the details, like how the animation loop goes on and on and is a bit nauseating, and music is too loud. But you live and learn! 😅 When I first started making these videos I really had no idea what I was doing.

    • @phenixorbitall3917
      @phenixorbitall3917 6 months ago

      @RichBehiel 18:19 on the left-hand side you used the Laplace symbol instead of the nabla symbol.
      But apart from that => great video! 👍

    • @atticmuse3749
      @atticmuse3749 2 months ago +1

      With regards to pacing, I want to say that I really enjoy your general presentation style. You're not simply reading a script and getting the perfect take, you're actually doing a "live" presentation and I really appreciate the way you ad lib or go off on little tangents. I burst out laughing in your buoyancy video when you read the integral "zndS" phonetically.

  • @patricktanoeyjaya4430
    @patricktanoeyjaya4430 1 year ago +69

    I really love how calmly you speak and how the lines you say feel unscripted. Makes it feel very personal.
    You also speak so clearly and concisely. I was able to get the gist of this with only high school calculus!
    This is making me like math again.

    • @RichBehiel
      @RichBehiel  1 year ago +2

      I’m very glad to hear that! :)

  • @TheRiverNyle
    @TheRiverNyle 1 year ago +70

    As an Applied Math (Stats/Probability Theory focused) major, this really got me excited!

  • @user-pw5do6tu7i
    @user-pw5do6tu7i 1 year ago +20

    Unbelievably crisp explanation of gradient descent. It is remarkable to see it play out in those dimensions. Thank you

    • @whannabi
      @whannabi 1 year ago +2

      And he repeats the animation so we can assimilate what's going on instead of quickly switching to the next thing. Very relaxed explanation which is nice.

  • @matteokimura1449
    @matteokimura1449 1 year ago +28

    Another beautiful way to get the linear regression formula is to take the vector space of all real-valued functions defined on the x values, choose the hypothetical ideal function that maps each x to its y, and orthogonally project that hypothetical function onto the subspace of linear functions. By defining the inner product as the Cartesian dot product between the outputs of the functions at the x values, you'll see that the distance the projection minimizes is the error between the linear function and the ideal function.
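A minimal numpy sketch of this projection picture (made-up data; `np.linalg.lstsq` performs the projection onto the subspace spanned by the constant and identity functions):

```python
import numpy as np

# Hypothetical data; y is exactly 2x + 1, so the projection is easy to check.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

# Basis of the subspace of linear functions, evaluated at the x values.
A = np.column_stack([np.ones_like(x), x])

# Orthogonal projection of y onto col(A) under the ordinary dot product.
coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # [intercept, slope]
y_hat = A @ coef

print(coef)                # [1. 2.]
print(A.T @ (y - y_hat))   # residual is orthogonal to the subspace: ~[0 0]
```

The second print is the geometric content of the comment: the residual vector is perpendicular to every function in the subspace, which is exactly the least-squares condition.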

  • @andreiimbru6835
    @andreiimbru6835 1 year ago +16

    As an Econ major, you have no idea how much this helped me understand the behind-the-scenes of regression lines and everything I've done in Statistics this semester. I've learned so many new techniques with equation manipulation, so thank you!

  • @zeyogoat
    @zeyogoat 1 year ago +9

    A rare video that's technically adept and, most importantly, not condescending or pedantic! Well done, from a chemist and educator =)

  • @mroygl
    @mroygl 1 month ago

    This is a piece of art, a captivating blend of deep understanding of the matter, beauty of plain graphics, voice acting, matrices, and "simple" software.

  • @MattHudsonAtx
    @MattHudsonAtx 2 months ago +1

    I saw the calculus approach coming a mile away but it's great to see the linear algebra done so clearly. I need to take that again.

  • @johnstuder847
    @johnstuder847 6 months ago +3

    Thank you! This is definitely one of YouTube's math gems! It ties so many ideas together. I would love for you to do a video on Fourier epicycles. For reference, GoldPlatedGoofs' ‘Fourier for the rest of us’ is a great starting point. I'm sure you could do a beautifully refined version showing how the inner product, Fourier, QM, function spaces and art all come together in a beautiful way.
    Thank you so much for sharing your videos!

    • @RichBehiel
      @RichBehiel  6 months ago

      Thanks for the kind comment, John! :) I touch on Fourier analysis in my upcoming video on relativistic QM, the Klein-Gordon equation. Hoping to upload it within a week.

  • @tommyproductions891
    @tommyproductions891 1 year ago +12

    great video! I love how at the start you explain the equation of a straight line and by the end it's multivariable vector calculus

  • @Liberty5_3000
    @Liberty5_3000 1 year ago +23

    It's so beautiful! Thank you a lot! I hope your channel is gonna grow fast soon

  • @berndkopera7723
    @berndkopera7723 1 year ago +7

    Absolutely beautiful visualization! Simple, smart and intuitive.

  • @M.KRISHNAKANTACHARY
    @M.KRISHNAKANTACHARY 1 month ago +1

    Thanks a lot for clearly explaining the concept of fitting a linear regression so beautifully.

  • @simonleonard5431
    @simonleonard5431 1 year ago +4

    Thank you! I've been playing with a spherical geometry problem and there's so much I've forgotten from my school days. This video reminded me of so many things, including ways of expanding my approaches to problem solving. Brilliant 👌

  • @atticmuse3749
    @atticmuse3749 2 months ago +1

    12:16 "it should keep you up at night"
    Very apropos considering it's almost 4:30 am right now and I've been watching your videos for hours 😅

  • @jiadong2246
    @jiadong2246 1 year ago +1

    Great work! Thank you, and I'm looking forward to the linear regression and gradient descent videos you mentioned at the end of the video

  • @ivopfaffen
    @ivopfaffen 1 year ago +2

    Sooo cool! As a cs major struggling with a numerical analysis class, this helped me understand linear regression so much better.
    Thanks man!

  • @xxge
    @xxge 1 year ago +2

    Great video! Coming from a linear-algebra-heavy background, I still think taking the singular value decomposition of X, inverting it, and multiplying by y to find b is a more elegant and simple approach, especially for multiple linear regression, but I imagine if you have more experience with physics this approach is more familiar and easier to digest. Keep these videos coming!

  • @sujalgvs987
    @sujalgvs987 1 year ago +4

    I absolutely loved this video. Please do more videos on regression and machine learning as a whole.

  • @ehfik
    @ehfik 8 months ago +1

    this was SO satisfying! hope to see many more explanations, such a great execution!

  • @Ayesha_F
    @Ayesha_F 1 year ago +3

    Oh this was so SATISFYING! I don't think I have ever seen regression explained this way. It's like parts of how I understand it are being so wonderfully articulated by someone who obviously knows the subject matter well. I have had to teach myself mathematics and statistics, and I've always been drawn to this intuitive and philosophical way of understanding it. Thank you for this!

    • @RichBehiel
      @RichBehiel  1 year ago +2

      Thanks for the kind comment, and I’m glad you enjoyed the video! :)

  • @dadamczyk
    @dadamczyk 1 year ago +3

    Great video! With those animations it would be wonderful to see an essay about Bayesian linear regression, since it is a quite different and powerful approach to a similar topic.

  • @anthonyrojas9989
    @anthonyrojas9989 1 year ago +2

    This was amazing! So fun to watch and appreciate this concept.

    • @RichBehiel
      @RichBehiel  1 year ago

      Thanks, glad you enjoyed the video! :)

  • @enricolucarelli816
    @enricolucarelli816 2 months ago +1

    Wow! This is perfection explaining/visualizing complexity and its beauty! ❤❤❤❤ 👏👏👏👏👏

  • @tesstera
    @tesstera 1 year ago +1

    Amazing! Thanks for showing us how to solve a maths problem in a physics way. Even though this method is already used in today's AI, it is still very interesting to see it work outside AI. The conceptual journey you've taken reminds me of my attempts at machine proving (ATP), and it does a lot to take the intimidation out of numerical analysis. Thanks!

  • @benwinstanleymusic
    @benwinstanleymusic 1 year ago +1

    Really enjoyed this, you're great at explaining stuff

  • @jwilliams8210
    @jwilliams8210 1 year ago +2

    Fantastic presentation!

  • @alexkushnir8073
    @alexkushnir8073 1 year ago +1

    Cool music Richard, it opens my mind and makes me understand things better! It's like combining hypnosis and a class ;-) I wish my math teacher at school had explained it to us that way 🙂

  • @TheScepticalChymist
    @TheScepticalChymist 1 year ago +1

    I cannot finish the video because your voice is SO charming and comforting and makes me feel so safe, I just cannot pay attention to the maths

  • @bernard2735
    @bernard2735 1 year ago +1

    Beautifully explained, thank you. Liked and subscribed and looking forward to more.

  • @Cristi4n_Ariel
    @Cristi4n_Ariel 1 year ago +1

    This was interesting! Thanks for sharing.

  • @davidandrewthomas
    @davidandrewthomas 1 year ago +1

    This is beautifully put together! What a great explanation!

  • @levimillerfandom
    @levimillerfandom 1 year ago +1

    I was really stuck on a practical. I had to make a graph of my readings; the book stated that I should get a straight line, but instead I got curves, which was really stressful. Thankfully I found your video.
    It really helped ❤
    Thanks again

  • @Aziqfajar
    @Aziqfajar 1 year ago +1

    This is beautifully explained and visualized! I'm glad to be on the first wagon for the ride of this video.

    • @RichBehiel
      @RichBehiel  1 year ago +1

      Thanks, I’m glad you liked the video! It’s one of my favorite mathematical concepts, so it’s great to see others enjoying it too :)

  • @user-hl8sv1if7j
    @user-hl8sv1if7j 1 year ago +1

    wow. So well explained. Thank you

  • @coreymonsta7505
    @coreymonsta7505 1 year ago +1

    I love code and taught Calc 3 a couple of times, which is my favorite class, but I never learned about this topic in school (only heard its name a lot). That was really interesting

  • @benjaminshropshire2900
    @benjaminshropshire2900 1 year ago +2

    IIRC there *is* a way to leverage that outer product observation: If D is a matrix where each column is [xᵢ 1] and Y is another matrix where each row is [yᵢ] then the entire left Σ becomes DDᵀ and the entire right Σ becomes DY.
    also (I think) this actually generalizes to linear equations with more terms by adding the data as more rows in D. And the data can also be functions of existing simpler terms (e.g. Nth powers of x to get polynomial fits, sin(nx)/cos(nx) to get discrete Fourier transforms, etc.).
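A quick check of that observation (made-up data; the quadratic extension follows the commenter's "more rows in D" idea):

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])   # exactly y = 2x + 1

# D: one column [x_i, 1] per data point; Y: one row [y_i] per data point.
D = np.vstack([x, np.ones_like(x)])   # shape (2, N)
Y = y.reshape(-1, 1)                  # shape (N, 1)

# The two sums in the normal equations collapse into matrix products:
# (D D^T) [a, b]^T = D Y
ab = np.linalg.solve(D @ D.T, D @ Y)
print(ab.ravel())                     # [2. 1.]  -> slope 2, intercept 1

# More rows in D give more basis terms, e.g. a quadratic fit:
D2 = np.vstack([x ** 2, x, np.ones_like(x)])
coeffs = np.linalg.solve(D2 @ D2.T, D2 @ Y)
```

Since this data lies exactly on a line, the quadratic fit returns a zero leading coefficient, which is a nice consistency check on the generalization.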

  • @andytroo
    @andytroo 1 year ago +4

    Introducing the Jacobian could be a nice extension. The shape of best fit is an ellipse, which can make converging towards the best solution hard, as many of the gradient directions in the top half of your example are not pointed towards the best solution, simply towards that valley of best fit. Reshaping the gradients to make that ellipse a circle allows much quicker convergence

    • @RichBehiel
      @RichBehiel  1 year ago

      Great idea! I’d love to do a video on that someday.

  • @CarlosHlavacek
    @CarlosHlavacek 1 year ago +1

    Really beautiful class.

  • @TranquilSeaOfMath
    @TranquilSeaOfMath 1 year ago

    I really like all you put into this video. It helps connect ideas in interesting ways. Thank you for including the Python code.

  • @Lado916
    @Lado916 1 year ago +1

    Great video! I absolutely love the visual and dynamical proofs in math.
    I just wanted to add that there is a beautiful point-line duality between the two spaces:
    While a dot in parameter space corresponds to a line in real space, a line in parameter space defines a family of curves in real space that intersect at the same point.
    Moreover, if you map your datapoints to their corresponding dual lines, the center of mass of these lines will be a dual point to the best fit line of the data!
    Hope you find this as cool as I do.

    • @RichBehiel
      @RichBehiel  1 year ago +1

      That’s really cool! I’ve read about that kind of thing in an intro to differential geometry book, but hadn’t connected the dots in the context of this video. Thanks for a very interesting comment :)

  • @wishIKnewHowToLove
    @wishIKnewHowToLove 1 year ago +1

    he just dropped the most beautiful linear regression video and thought we wouldn't notice

  • @zeb4827
    @zeb4827 1 year ago +1

    very cool video, this connected some dots that I've been struggling to reconcile

  • @marktahu2932
    @marktahu2932 1 year ago +1

    Really very helpful - and I'm no professional in any of these fields, just an old technician being reminded of all those brain neurons that have lain dormant for decades.

  • @AfroNyokki
    @AfroNyokki 1 year ago +1

    Great explanation, loving it so far. I'm majoring in applied math with a focus in numerical analysis, so this stuff is always fascinating haha. I noticed around 18:20, you started using delta instead of del. Thought it might be a typo but just wanted to check!

    • @RichBehiel
      @RichBehiel  1 year ago

      Yeah that’s a typo, sorry! 😅 Thanks for pointing that out.

  • @kalaiselvan6907
    @kalaiselvan6907 1 year ago +1

    ❤️❤️❤️This is Gold ❤️❤️❤️ Thank you

  • @chrislau9835
    @chrislau9835 1 year ago +1

    Very good explanation 👍🏻👍🏻

  • @RocaSeba
    @RocaSeba 1 year ago +1

    This video is genius. Subscribed.

  • @8megabitz706
    @8megabitz706 1 year ago +1

    I've been waiting for this for too long 10:17

  • @IAmTheFuhrminator
    @IAmTheFuhrminator 1 year ago +1

    Such a great video! I had a lecture about this years ago in my engineering analysis class in undergrad, but I took such poor notes that I was never able to reproduce this function. Now as homework I'm going to take your process and solve for other functions like parabolas or cubics which will require me to use 3 and 4 dimensional parameter spaces. Thanks again for the great video!

    • @RichBehiel
      @RichBehiel  1 year ago +1

      That’s awesome, I love to hear that! Challenge for you: can you solve it for a general N-degree polynomial? Like with some kind of recursive algorithm. I actually don’t know if this is possible but it seems like a fun puzzle!

    • @IAmTheFuhrminator
      @IAmTheFuhrminator 1 year ago +1

      @@RichBehiel that would be a fun problem to solve! And even if it can't be solved, I'm sure proving or disproving the possibility of a solution would make a great paper!

  • @mskiptr
    @mskiptr 1 year ago +1

    The parameter space is a super powerful concept. Especially in computer vision, where you can take a bunch of pixels and quickly detect all the lines they approximately form
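The line-detection trick alluded to here is the Hough transform, where each pixel votes for every (ρ, θ) line that could pass through it, and votes pile up at the parameters of real lines. A toy sketch (made-up pixel data, coarse integer binning):

```python
import numpy as np

points = [(i, 2 * i + 1) for i in range(10)]   # pixels lying on y = 2x + 1

thetas = np.deg2rad(np.arange(180))            # sampled line angles
rho_max = 30                                    # bound on |rho| for these points
acc = np.zeros((2 * rho_max, thetas.size), dtype=int)

for px, py in points:
    # Normal form of a line: rho = x cos(theta) + y sin(theta).
    rho = np.round(px * np.cos(thetas) + py * np.sin(thetas)).astype(int)
    acc[rho + rho_max, np.arange(thetas.size)] += 1

# All ten pixels vote for a common (rho, theta) cell: one detected line.
print(acc.max())   # 10
```

The accumulator peak is exactly the "dot in parameter space" idea from the video, just with votes instead of an error landscape.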

  • @maxfitzkin9422
    @maxfitzkin9422 1 year ago +1

    I really loved how you put this video together! What did you use to animate and edit everything? It was really clean!

    • @RichBehiel
      @RichBehiel  1 year ago

      Thanks! :) I used matplotlib in Python.

  • @michahejman6712
    @michahejman6712 1 year ago +1

    Great video! 30 minutes felt like 5 :) Thanks!!!

    • @RichBehiel
      @RichBehiel  1 year ago +1

      Thanks, glad you enjoyed the video! :)

  • @ABKW119
    @ABKW119 1 year ago +4

    Why do your videos only get recommended to me at 1am, they send me straight down a rabbit hole 😂

  • @micahwithabeard
    @micahwithabeard 1 year ago +1

    I just liked, subbed and commented :D I don't think I can be any more "violently complimentary" than that. This was excellent, thanks!

  • @pickle.taesan
    @pickle.taesan 1 year ago +1

    Great video! I'd never thought of parameter space with an 'Error Force'.

  • @user-pn1lm3pi6p
    @user-pn1lm3pi6p 1 year ago +1

    Very good!

  • @ydl6832
    @ydl6832 1 year ago +1

    Yeah, this is a nice explanation. A neural network is just a more sophisticated version of line fitting with more parameters.

  • @GradientAscent_
    @GradientAscent_ 1 year ago +1

    Very cool animations

  • @williamfurtado1555
    @williamfurtado1555 1 year ago +4

    This video is wonderful. How did you create the interactive visualization with the "Parameter Space" and "Real Space" subplots? I'd love to be able to create one on my own.

    • @RichBehiel
      @RichBehiel  1 year ago +8

      Thanks William! :) For this video I used Python, specifically matplotlib. You can use that by downloading Anaconda, which will install Python and some scientific modules, then call “from matplotlib import pyplot as plt”. After calling that line, you can use things like plt.figure() and plt.plot() to make a figure and plot things. In this case the parameter space and real space are two subplots in a figure. They’re refreshing at 60 frames per second in a loop which sets the dot’s position in the parameter space while making the line in the real space, based on the current a and b values. To turn on the error landscape, I also added some code to evaluate the error metric (objective function) at all points in the parameter space for each a and b. Then for the error force I calculated and plotted the negative gradient of that. For the part where the dot descends down the gradient, I used F = ma - kv with mass parameter m and friction-ish parameter k to make the dot roll down the hill and then stop at the optimal point.
      I’ll be more careful in future videos to post the source code of the animations too. Well, at least for videos after the one I’m going to post this week; for that one, and the previous videos, I was very sloppy with the code and it wouldn’t be too helpful to see them. But there have been a few comments now about how these animations were made, so I figure the best answer is the code itself. In the future I’ll be better about writing cleaner animation codes and sharing them.
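The rolling-ball descent described above can be sketched in a few lines (a loose reconstruction with made-up data and hand-picked m, k, and time step, not the video's actual code):

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])            # exactly y = 2x + 1

def grad(a, b):
    # Gradient of the error E(a, b) = sum_i (a x_i + b - y_i)^2.
    r = a * x + b - y
    return np.array([2.0 * np.sum(r * x), 2.0 * np.sum(r)])

p = np.array([5.0, -5.0])   # start somewhere in parameter space
v = np.zeros(2)
m, k, dt = 1.0, 4.0, 0.01   # mass, friction-ish damping, time step

for _ in range(5000):
    accel = (-grad(*p) - k * v) / m   # force = minus the gradient, minus damping
    v += accel * dt
    p += v * dt

print(np.round(p, 3))   # [2. 1.] -> the ball settles at the best-fit (a, b)
```

The friction term is what makes the dot stop at the bottom of the valley instead of oscillating around it forever.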

    • @rocknroll909
      @rocknroll909 1 year ago

      ​@@RichBehiel wow, you're awesome for such an in-depth reply to this. Thank you, I might try this on my own

  • @scienceuser4014
    @scienceuser4014 1 year ago +1

    Perfect video

  • @nooks12
    @nooks12 1 year ago +1

    Satisfying video. Took me back to University.

  • @torquencol
    @torquencol 1 year ago +1

    Lmao thank you for this, this video came into my recommendations right when I needed it most: I've been stressed these last few days doing laboratory reports, where I have to use the regression line a lot 🛌 It made me hate it less

  • @rouninph6349
    @rouninph6349 1 year ago +1

    It looks like you are trying to hypnotize your listener. 😂 Great explanation btw. Using physical arguments to explain a mathematical concept, I like that.

  • @Osniel02
    @Osniel02 1 year ago +1

    just gorgeous!!!

  • @alexander_adnan
    @alexander_adnan 1 year ago +1

    Thank you 🙏 ❤❤❤

  • @PatrickDoolittle
    @PatrickDoolittle 1 year ago

    Like Sujal Gupta, I watched this video because I am studying machine learning. I have been studying simple linear regression for the past couple of weeks now! Just yesterday I started to think about how the Moore-Penrose pseudoinverse generalizes the idea of an inverse to situations where the matrix is not square. I call linear maps to a higher-dimensional space "embeddings" and linear maps to a lower-dimensional space "projections". For a square matrix, which is neither an embedding nor a projection but a linear operator in the same dimension, we can undo the linear mapping by finding the inverse X^-1. In the case of projections, there are many high-dimensional vectors that can be projected down to a given low-dimensional vector, so there is no unique inverse. However, we can solve the system Xb=y for b using the Moore-Penrose *pseudo* inverse: (X^T X)^-1 X^T. When we apply the Moore-Penrose pseudoinverse to the vector of response variables y, we project y onto the column space of X; the fitted vector Xb is the closest point in that subspace to y, and the coefficients b are its coordinates in the basis of X's columns. That is the beauty of the Moore-Penrose pseudoinverse!
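A two-line sanity check of that formula (made-up data; for a full-column-rank X, `np.linalg.pinv` agrees with (X^T X)^-1 X^T):

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.2, 2.9, 5.1, 6.8])             # roughly y = 2x + 1

X = np.column_stack([np.ones_like(x), x])      # tall (N x 2) design matrix

b_formula = np.linalg.solve(X.T @ X, X.T @ y)  # (X^T X)^-1 X^T y
b_pinv = np.linalg.pinv(X) @ y                 # Moore-Penrose pseudoinverse

print(np.allclose(b_formula, b_pinv))          # True
# X @ b_formula is the orthogonal projection of y onto the column space of X.
```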

    • @davidmurphy563
      @davidmurphy563 1 year ago

      I code DNNs too. Um. I understood your words but not your point. Genuinely curious here.
      So we can calculate the inverse matrix. For a 2x2, take the reciprocal of the determinant and multiply it by the matrix with the diagonal swapped and the off-diagonal entries negated. This spits out a new matrix with the property that if you multiply it by the original you get the identity (assuming linear independence).
      Ok fine, all very useful. But what's that got to do with the price of fish?

  • @account4345
    @account4345 1 year ago +1

    Just gotta remind myself this is why I must master linear algebra.

    • @RichBehiel
      @RichBehiel  1 year ago

      Mastering linear algebra is a great and enduring source of spiritual fulfillment 🙏

  • @brianli3493
    @brianli3493 1 year ago +1

    electric potential actually helped me understand this omg

  • @DavidCaveperson
    @DavidCaveperson 1 year ago +1

    Nice video on OLS. I've often wondered, though, why lessons on regression focus on OLS rather than Deming regression, as OLS seems objectively inferior; with so many projections based on the inferior model, we are shooting our research methods in the foot from the start

    • @RichBehiel
      @RichBehiel  1 year ago +1

      Good point. Frankly I think it’s because OLS is easier, and gets the job done in most situations. But I agree that there are times when Deming regression is better. Although someone who uses Deming would presumably have learned OLS first. OLS is also conceptually ideal for explaining how calculus can be used to minimize fit error, so it’s a good go-to image to have in mind when solving fancier optimization problems.

    • @DavidCaveperson
      @DavidCaveperson 1 year ago

      @@RichBehiel I completely understand. In fact, this subject is making me think about applied mathematics, because if we go deeper, it's not like linear regression in any form is the best way to actually model most data, so I'm thinking about dividing a function into splines to create a good fit. You can go too far and smoothly fit every point into a function, but then your function is skewed towards the data set, losing the ability to make good projections. It's an interesting puzzle (and I hated applied mathematics in college)

    • @turun_ambartanen
      @turun_ambartanen 1 year ago +2

      Well, there are quite a few advantages of OLS compared to a total least squares fit.
      For one, for every measurement where x is tightly controlled and y is the thing you want to learn about, OLS is the right tool. Because there are no or only negligible errors in x, the horizontal distance of datapoints to the prediction, dx, doesn't matter and must not be included in the fit.
      It also works much better with arbitrary functions than total least squares. For an arbitrary function I don't think there even is _any_ way to calculate the total least squares error. Only well-behaved functions work, and even then you have to define the derivative to perform a total least squares fit.
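The two fits can be contrasted on simulated data (a sketch; assumes equal noise in x and y, which is the case where a total least squares fit is appropriate):

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 200)                   # true x values
x = t + rng.normal(0.0, 0.5, t.size)              # x measured with error
y = 2.0 * t + 1.0 + rng.normal(0.0, 0.5, t.size)  # y measured with error

# Ordinary least squares (errors assumed to be only in y):
a_ols = np.polyfit(x, y, 1)[0]

# Total least squares: the line direction is the first right singular
# vector of the centered data (minimizes perpendicular distances).
xc, yc = x - x.mean(), y - y.mean()
_, _, Vt = np.linalg.svd(np.column_stack([xc, yc]), full_matrices=False)
dx, dy = Vt[0]
a_tls = dy / dx
b_tls = y.mean() - a_tls * x.mean()
# Noise in x attenuates the OLS slope toward zero; TLS does not (on average).
```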

  • @StudyEnggFocus
    @StudyEnggFocus 2 months ago

    Hello, Richard! Could you explain what you meant by error metric? Thanks

  • @SD-ni9jh
    @SD-ni9jh 1 year ago +1

    beautiful vid

  • @tylerbakeman
    @tylerbakeman 1 year ago +1

    Instead of calculating Δy, it might be better to calculate the distance a point is from the line (especially for smaller data sets, where Δy could be large but in fact the line could be very close).

  • @kummer45
    @kummer45 1 year ago

    Imagine you have a surface with a magnet. That's a game changer.
    Understanding the concepts of statistics by doing physics is the correct way of UNDERSTANDING mathematics and PHYSICS. However, physics has nothing to do with mathematics and mathematics has nothing to do with physics.
    The magic of this is MODELING. Linear regression, the average, and the Gauss curve are concepts of fundamental use in statistical mechanics. Eventually higher mathematical physics will launch the student into the field of MODEL MAKING.

  • @trustnoone81
    @trustnoone81 1 year ago +1

    Do I understand correctly that the "valley" in the error landscape is the set of all lines that pass through the point (x-bar, y-bar)?

    • @RichBehiel
      @RichBehiel  1 year ago

      Great question, and I’m actually not sure. Anyone know the answer?
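For any fixed slope a, setting dE/db = 0 gives b = y_bar - a*x_bar, so the valley floor should be exactly the set of lines through (x_bar, y_bar), one for each slope. A quick numerical check of that (a sketch with made-up data, not from the video):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=20)
y = 2.0 * x + 1.0 + rng.normal(size=20)

bs = np.linspace(-10.0, 10.0, 20001)
for a in (-3.0, 0.0, 2.0, 5.0):
    # Error along a vertical slice of parameter space (fixed slope a).
    E = np.sum((a * x[:, None] + bs - y[:, None]) ** 2, axis=0)
    b_best = bs[E.argmin()]
    # Valley floor: the best line for this slope passes through the centroid.
    assert abs(b_best - (y.mean() - a * x.mean())) < 1e-3
print("valley floor: b = y_bar - a * x_bar")
```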

  • @denisbaranov1367
    @denisbaranov1367 8 months ago +1

    The beauty of: Linear Regression

  • @guslackner9270
    @guslackner9270 1 year ago +13

    This video is a wonderful explainer! You've listed in the description that linear regression is "very useful in math, science, and engineering" to which I would like to add economics, which is what I am studying. This video and Jazon Jiao's work (czcams.com/video/3g-e2aiRfbU/video.html) are the best explanations of the concept that I have seen in video, lecture, or textbook form. I look forward to seeing what else you share on this channel!

  • @einsteingonzalez4336
    @einsteingonzalez4336 1 year ago

    That’s awesome! But what happens if we let N approach infinity where the data points are in a finite domain?

  • @peterwolf8092
    @peterwolf8092 1 year ago +1

    😂 I really love this and wish my high school students would understand it so I could share it with them.

  • @flexeos
    @flexeos 1 year ago +1

    There is always something that bothers me when linear regression is approached that way: from the start you consider that x and y are of a different nature, with the value of x known perfectly and the error on y. This is a pretty strong constraint. I am a metrology engineer, and I saw in the comments that you are a metrology engineer too, so you are well aware that in the real world there are errors on both x and y. In that case the error could be, for example, the distance from the data point to the line

    • @RichBehiel
      @RichBehiel  1 year ago

      That’s true! And there are ways of doing regression with ds rather than dy. Although often x is more precise than y, for example if you have a sensor array or are sampling data at a fast and precise rate relative to the change in your signal.
      For example, if we’re looking at a trend in some signal that drifts linearly over an hour, and sampling one datapoint per second, with error on the order of microseconds, then x is very precise in that context.
      But you’re right that there are some cases where x and y might be similarly varying.

    • @flexeos
      @flexeos 1 year ago +1

      @@RichBehiel my world is more the relation between 2 voltages at different locations in an analog network, so the noise on both is of the same nature.

    • @angelmendez-rivera351
      @angelmendez-rivera351 5 months ago +1

      @@flexeos I think you are missing the big picture. In most of these data sets (in practice), x(i) is the data set corresponding to the independent variable, the one which you can actually control for much more easily, and y(i) is the data set corresponding to the dependent variable, and you want to understand y as a function of x, not the other way around, because the other way around is (in every scenario I have seen physicists, engineers, and any other applied S.T.E.M. worker deal with) very impractical and not useful. Now, are there circumstances which are more complicated? Of course there are, but they are the exception, and in those circumstances, the complexities involved are of such a nature that dealing with residuals, as the video does, is not the practical approach anyway.

    • @flexeos
      @flexeos 5 months ago

      @@angelmendez-rivera351 That is not my experience in practice. Let's say that you want to measure a resistor. You inject a current I that you "control", usually using a digital-to-analog converter, and you measure the voltage V across the resistor, and V/I is your resistance. Because the world is not perfect, if you want a better result you do the measurement with a bunch of Is, and the resistance is then the slope of the best line through the cloud of points (V, I). To have a better idea of the exact value of I, while you set it digitally, you have to measure its actual value, as the translation between the digital value and the actual current is anything but perfect. So in practice you have a cloud of points (V, I) with the same kind of error (noise, offset, non-linearity...) on both V and I. If you assume that I is an independent variable you will end up with a bias. There was a math paper on that bias effect almost 100 years ago that I read, but I cannot find the reference just right now. If an electronic example seems too specific, look at a typical example given to students, like annual income vs age in years. Age looks like an independent variable, but in reality by definition there is a 1-year uncertainty on it, which is not too good, as the relative error bar is not even constant. Of course in such an example the required precision is not a big problem, so you can forget about those subtleties. But in metrology you are tracking a few parts per million. Not taking that into account would be like trying to design GPS without taking general relativistic effects into account (accuracy on location becomes > 10 km). My 2 cents

  • @nofalldamage
    @nofalldamage 1 year ago +1

    Great video.
    Is the matrix at the end always invertible?

    • @RichBehiel
      @RichBehiel  1 year ago

      Great question! It’s invertible as long as its determinant isn’t zero. It has the form [A, B; B, N], where A and B are real numbers and N is a positive integer, so its determinant is AN - B^2. For this to be zero would require that AN = B^2, in other words for N times the sum of x_i^2 to equal (the sum of x_i)^2. I’m not sure if this can happen, feels like it can be proven one way or the other without a ton of work, but I’ve gotta go. So I leave that as an exercise to the reader! :)

    • @nofalldamage
      @nofalldamage a year ago

      @@RichBehiel I think one of the cases where the matrix is not invertible is if all the points are on a vertical line. Kind of makes sense since then the form y = ax + b doesn't really work.
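That intuition checks out: by the Cauchy–Schwarz inequality, N·Σx_i² ≥ (Σx_i)², with equality exactly when all x_i are equal, so identical x values (a vertical line of points) are the only degenerate case. A quick numerical sketch of the determinant (my own illustration, not code from the video):

```python
import numpy as np

def normal_matrix(x):
    """Build the 2x2 least-squares normal matrix [[sum x^2, sum x], [sum x, n]]."""
    x = np.asarray(x, dtype=float)
    return np.array([[np.sum(x**2), np.sum(x)],
                     [np.sum(x), len(x)]])

# Distinct x values: determinant is positive, so the matrix is invertible.
d1 = np.linalg.det(normal_matrix([1.0, 2.0, 3.0]))

# All points on a vertical line (identical x): determinant collapses to zero.
d2 = np.linalg.det(normal_matrix([2.0, 2.0, 2.0]))
```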

  • @JOHNSMITH-ve3rq
    @JOHNSMITH-ve3rq a year ago +1

    Wow. Seen so many videos, read so many papers and books - but this one takes the cake. Would love to see you doing this but for more complex models with fixed effects and all sorts of other bells and whistles. Impressive!!

  • @jursamaj
    @jursamaj 8 months ago +1

    And you can fit other curves with simple transforms of one or both axes, like log or exp.
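For example, y = A·e^(bx) becomes a straight line after taking the log of the y axis, so ordinary linear regression recovers both parameters. A minimal sketch with noiseless synthetic data (note the transform also reweights any noise, which matters when the data aren't clean):

```python
import numpy as np

# Noiseless data from y = 2 * exp(0.5 x); taking log(y) turns the exponential
# fit into an ordinary linear regression of log(y) against x.
x = np.linspace(0.0, 4.0, 20)
y = 2.0 * np.exp(0.5 * x)

b, log_a = np.polyfit(x, np.log(y), 1)  # slope and intercept of log(y) = b*x + log(A)
a = np.exp(log_a)
```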

  • @potatochipbirdkite659
    @potatochipbirdkite659 a year ago +1

    Do you have the blue dot following a Lissajous curve?

    • @RichBehiel
      @RichBehiel  a year ago

      I forget what I did for that; I think I just had some sines and cosines of different frequencies in x and y.
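For anyone curious, a generic recipe along those lines does trace a Lissajous figure (the exact frequencies used in the video are a guess here):

```python
import numpy as np

# Sines/cosines at different frequencies in x and y trace a Lissajous curve
# as the parameter t advances; the frequency ratio sets the pattern's shape.
t = np.linspace(0.0, 2.0 * np.pi, 500)
x = np.cos(3.0 * t)  # frequency 3 in x
y = np.sin(2.0 * t)  # frequency 2 in y
```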

  • @bronga645
    @bronga645 a year ago +1

    Subbed, liked, and commented for your effort. Even if you don't make much on YT, you are a great mathematician! And I am sure you will make it in life and be a help to humanity as a whole. Thank you.

  • @kennethtrimble5144
    @kennethtrimble5144 11 days ago +1

    Excellent

  • @zukofire6424
    @zukofire6424 a year ago +1

    Beautiful, and I'm surprised I never knew some of what you explained. I wanna add something irrelevant: you are so handsome!

  • @akidnag
    @akidnag a year ago +1

    Great vid, thank you!
    I'm struggling to figure out how you visualize the "parameter space" in Python.

    • @akidnag
      @akidnag a year ago +1

      I did a mesh grid for a and b from -5 to 5 with 100 points, X, Y. Then I calculated the modulus as Z = the sum of the sqrt of each equation squared and did a contourf(X, Y, Z), but no luck :/

    • @akidnag
      @akidnag a year ago

      I think the quiver plot is OK, as quiver(X, Y, eq1, eq2).

    • @RichBehiel
      @RichBehiel  a year ago +1

      I did a contourf and a quiver. If the contourf isn’t working, it’s possible the color limits are off? Oh, actually come to think of it, I might have taken the log or sqrt of the error, to flatten out the landscape so it’s easier to see. Basically applying a nonlinear colormap.
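A sketch of that kind of visualization (my own reconstruction, not the video's code): V is the sum of squared residuals over a grid of candidate (a, b), contoured on a log scale to flatten the valley, with quiver arrows along the negative gradient, i.e. the descent direction:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = np.linspace(-2.0, 2.0, 50)
y = 1.5 * x - 0.5 + 0.2 * rng.standard_normal(x.size)  # true a = 1.5, b = -0.5

# Sum-of-squares error V(a, b) over a grid of candidate slopes and intercepts.
a, b = np.meshgrid(np.linspace(-5, 5, 100), np.linspace(-5, 5, 100))
r = y - (a[..., None] * x + b[..., None])  # residuals, shape (100, 100, 50)
V = (r ** 2).sum(axis=-1)

fig, ax = plt.subplots()
ax.contourf(a, b, np.log(V), levels=30)  # log flattens the bowl so contours show up

# Gradient of V; quiver the negative gradient, which points downhill toward the fit.
dVda = (-2.0 * x * r).sum(axis=-1)
dVdb = (-2.0 * r).sum(axis=-1)
s = (slice(None, None, 10), slice(None, None, 10))  # thin out the arrow grid
ax.quiver(a[s], b[s], -dVda[s], -dVdb[s])
fig.savefig("parameter_space.png")
```

The grid point where V is smallest should sit near the true (a, b) of the synthetic data.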

    • @akidnag
      @akidnag a year ago

      @@RichBehiel Thanks a lot! Keep up the great work!

    • @akidnag
      @akidnag a year ago

      Still no good, I'm sorry.
      So in contourf it's V (or log(V) or sqrt(V)), and in quiver it's Fa, Fb, with a and b spanned, right?
      Sorry to bother you, but I feel like I understand, yet not getting the same results makes me doubt what I'm doing wrong :/
      Would it be too much to ask you to share the code for visualizing the parameter space?

  • @sarthakjain1824
    @sarthakjain1824 a year ago +2

    That was on the level of 3Blue1Brown's videos.

    • @RichBehiel
      @RichBehiel  a year ago

      Thanks! :) Grant is a role model for sure. The aesthetics of his videos are much better than mine though 😅 But I’ll get better over time.

  • @jamesmcfarlane3469
    @jamesmcfarlane3469 a year ago +1

    Is this method, or something similar, applicable to nonlinear least squares? I did a project over Christmas using nonlinear least squares regression, and this would've been super helpful 😅

    • @RichBehiel
      @RichBehiel  a year ago +1

      The same concept of minimizing a least squares objective function by setting the gradient to zero applies to nonlinear least squares, but there are also extra steps involved.
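One common version of those extra steps is Gauss–Newton: linearize the model around the current parameter guess, solve the resulting linear least-squares problem (the same normal equations as the linear case), and repeat. A minimal sketch for fitting y = A·e^(bx), my own illustration; note it needs a reasonable starting guess to converge:

```python
import numpy as np

def gauss_newton_exp(x, y, p0, iters=30):
    """Fit y = A * exp(b * x) by Gauss-Newton: linearize, solve, repeat."""
    A, b = p0
    for _ in range(iters):
        model = A * np.exp(b * x)
        r = y - model                      # residuals at the current guess
        # Jacobian of the model w.r.t. (A, b), one row per data point.
        J = np.column_stack([np.exp(b * x), A * x * np.exp(b * x)])
        # Normal equations of the linearized problem (cf. the linear case).
        dA, db = np.linalg.solve(J.T @ J, J.T @ r)
        A, b = A + dA, b + db
    return A, b

x = np.linspace(0.0, 2.0, 30)
y = 3.0 * np.exp(0.7 * x)                  # noiseless data: true A = 3, b = 0.7
A_fit, b_fit = gauss_newton_exp(x, y, p0=(2.5, 0.5))
```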

  • @peterwolf8092
    @peterwolf8092 a year ago +1

    Is it possible to get a "second best" valley? A pseudo-best solution?

    • @RichBehiel
      @RichBehiel  a year ago +1

      Not for linear regression, but for fits with more parameters yes. Gradient descent can sometimes get stuck in a local minimum, a valley other than the best one. If there’s an analytic solution, it might involve the roots of a polynomial or something, so you can have multiple values which are locally optimal. In that situation, the height of the objective function at each optimum can be quickly compared, since the list should be pretty short.

  • @badermuteb1012
    @badermuteb1012 6 months ago

    How did you code these interactive plots? Thanks

  • @benandrew9852
    @benandrew9852 a year ago +1

    holy shit
    I have genuinely never even come close to thinking about it like this
    top marks, no notes

  • @BrunoJMR
    @BrunoJMR a year ago +1

    When calculating the zero gradient, how do you avoid the local minimum problem? They are also zeroes of the gradient

    • @RichBehiel
      @RichBehiel  a year ago +1

      True! For more complicated fits, the parameter space becomes more textured and you’ll often have multiple local minima. But with an analytic solution, these minima can be quickly calculated, for example as roots of a polynomial. Then there’s just a small list of points at which the objective function can be evaluated and compared, and the minimum can be chosen from the list.

    • @BrunoJMR
      @BrunoJMR a year ago +1

      @@RichBehiel Thanks! So the analytic solution gives us all the minima and we then can just check which one is the lowest. Cool

    • @RichBehiel
      @RichBehiel  a year ago +1

      Yup. There may be some maxima and saddle points in there too, since those also have zero gradient, but those can either be filtered out analytically by solving some second derivatives for additional constraints, or just kept in the list and they won’t be the minimum so it doesn’t matter. In practice, people almost always do the latter. The only exception would be if the data rate is very high and there’s some benefit to solving those equations in exchange for a marginally faster routine. So in super high performance scenarios, the second derivatives are worth looking at.
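A toy illustration of that "short list" idea, using a made-up one-parameter objective whose stationary points are the roots of a cubic; the local maximum rides along in the list but never wins the comparison:

```python
import numpy as np

# Toy objective with two valleys and one hump: V(t) = t^4 - 3 t^2 + t.
V = np.polynomial.Polynomial([0.0, 1.0, -3.0, 0.0, 1.0])

crit = V.deriv().roots()            # every stationary point: minima, maxima, all
crit = crit[np.isreal(crit)].real   # keep real roots only
best = crit[np.argmin(V(crit))]     # evaluate V on the short list, keep the lowest
```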

  • @willbedeadsoon
    @willbedeadsoon a year ago

    When I run the code in VS Code it shows nothing, but if I debug line by line, at "plt.subplots(figsize=[8, 4.5])" it shows the matplotlib window. It's weird to me. What's going on here?

    • @RichBehiel
      @RichBehiel  a year ago

      Hmm… I’m not sure, tbh. Do you have all the modules installed? I’d recommend installing Anaconda, then running the code in Spyder (it comes with Anaconda). That way you’ll have a lot of mathematical and scientific modules already installed. Plus, Spyder looks cool.

    • @SkrtlIl
      @SkrtlIl a year ago

      Not sure why you get the window in debugging mode, but for normal Python scripts you usually have to call plt.show() manually, while notebooks trigger the plots inside the corresponding cell. So you could also change your .py to .ipynb and run that in VS Code.
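A minimal script version of that fix (the Agg backend lines are only there so this snippet runs headless; drop them in normal interactive use):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend just so this snippet runs anywhere
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=[8, 4.5])
ax.plot([0, 1, 2, 3], [0, 1, 4, 9], "o-")
plt.show()  # plain .py scripts need this call; notebooks render cells automatically
```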

  • @sgtreckless5183
    @sgtreckless5183 a year ago +1

    Is the direct product in the final formula always non-singular, and so always has an inverse?

    • @RichBehiel
      @RichBehiel  a year ago

      I believe so, but I’m not 100% sure actually. As a good exercise in math, you can explore if it might be noninvertible under some conditions, just set the determinant to zero and see what a dataset would have to be like in order for that to happen.
      I’ve done millions, maybe billions, of linear regressions (on data streams) and have never run into this problem though.

    • @sgtreckless5183
      @sgtreckless5183 a year ago +1

      @@RichBehiel Doing just the quickest bit of working-out with a dataset of 3 values, I think the sum of outer products is only singular if all the x values are the same, which obviously isn't going to happen. It's fairly easy to show that for such a dataset the matrix is singular (the first row of the matrix is just the second multiplied by x_i), though I'm not sure how you'd prove the converse (i.e. that the matrix is non-singular in all other cases).

    • @RichBehiel
      @RichBehiel  a year ago

      That makes sense! Btw, these equations are equivalent to a force and torque balance, if the residuals are imagined as elastic springs, so physically it makes sense that it would only be singular if the x values are all the same, or something like that.

  • @agentdarkboote
    @agentdarkboote 9 months ago

    I would love it if you could show why the pseudoinverse recovers this method!

  • @PrismaticCatastrophism
    @PrismaticCatastrophism a year ago +1

    Could you make a similar video about parabolic graphs?

    • @RichBehiel
      @RichBehiel  a year ago

      I’d like to someday! The procedure is very similar, but ax^2 + bx + c instead of ax + b. It’s a 3D parameter space, but the same techniques work.
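A sketch of that parabola fit with the same normal-equations machinery, just a 3×3 system (synthetic noiseless data, my own example):

```python
import numpy as np

# Noiseless samples of y = 2 x^2 - x + 0.5, so the fit should recover (2, -1, 0.5).
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = 2.0 * x**2 - 1.0 * x + 0.5

# Design matrix with columns [x^2, x, 1]: one extra column vs. the linear case.
X = np.column_stack([x**2, x, np.ones_like(x)])
a, b, c = np.linalg.solve(X.T @ X, X.T @ y)  # 3x3 normal equations
```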

  • @m9l0m6nmelkior7
    @m9l0m6nmelkior7 2 months ago

    But is that matrix invertible if there is more than one extremum?