6.4 Laplace Approximation | Machine Learning

  • Published 7 Oct 2022
  • LAPLACE APPROXIMATION
    One strategy
    Pick a distribution to approximate p(w | x, y). We will say
    p(w | x, y) ≈ Normal(µ, Σ).
    Now we need a method for setting µ and Σ.
    Laplace approximations
    Using condensed notation, notice from Bayes' rule that
    p(w | x, y) = p(y, w | x) / p(y | x) = e^{ln p(y, w | x)} / ∫ e^{ln p(y, w | x)} dw,
    since the evidence is p(y | x) = ∫ p(y, w | x) dw.
    We will approximate ln p(y, w | x) in both the numerator and the denominator.
    LAPLACE APPROXIMATION
    Let’s define f(w) = ln p(y, w | x).
    Taylor expansions
    We can approximate f(w) with a second-order Taylor expansion.
    Recall that w ∈ R^{d+1}. For any point z ∈ R^{d+1},
    f(w) ≈ f(z) + (w − z)^T ∇f(z) + (1/2) (w − z)^T (∇²f(z)) (w − z).
    The notation ∇f(z) is short for ∇_w f(w) evaluated at z, and similarly for
    the matrix of second derivatives. We just need to pick z.
    The Laplace approximation defines z = w_MAP.
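    To make the expansion concrete, here is a minimal numpy sketch (the test
    function and the expansion point are illustrative choices, not from the
    video) comparing f(w) with its second-order Taylor expansion around z:

        import numpy as np

        # Illustrative test function: f(w) = -(1/4)||w||^4 - w[0]
        def f(w):
            return -0.25 * np.dot(w, w) ** 2 - w[0]

        def grad_f(w):
            return -np.dot(w, w) * w - np.array([1.0, 0.0])

        def hess_f(w):
            return -np.dot(w, w) * np.eye(len(w)) - 2.0 * np.outer(w, w)

        z = np.array([0.5, -0.3])        # expansion point
        w = z + np.array([0.05, 0.02])   # a nearby point

        g, H = grad_f(z), hess_f(z)
        taylor2 = f(z) + (w - z) @ g + 0.5 * (w - z) @ H @ (w - z)
        print(f(w), taylor2)  # nearly equal for w close to z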
    LAPLACE APPROXIMATION (SOLVING)
    Recall f(w) = ln p(y, w | x) and z = w_MAP. From Bayes' rule and the Laplace
    approximation we now have
    p(w | x, y) = e^{f(w)} / ∫ e^{f(w)} dw
                ≈ e^{f(z) + (w−z)^T ∇f(z) + (1/2)(w−z)^T (∇²f(z)) (w−z)}
                  / ∫ e^{f(z) + (w−z)^T ∇f(z) + (1/2)(w−z)^T (∇²f(z)) (w−z)} dw.
    This can be simplified in two ways:
    1. The term e^{f(w_MAP)} in the numerator and denominator can be viewed as a
    constant since it doesn’t vary in w. It therefore cancels out.
    2. By definition of how we find w_MAP, the gradient ∇_w ln p(y, w | x)
    evaluated at w_MAP equals 0, so the first-order term vanishes as well.
    LAPLACE APPROXIMATION (SOLVING)
    We’re therefore left with the approximation
    p(w | x, y) ≈ e^{−(1/2)(w − w_MAP)^T (−∇² ln p(y, w_MAP | x)) (w − w_MAP)}
                  / ∫ e^{−(1/2)(w − w_MAP)^T (−∇² ln p(y, w_MAP | x)) (w − w_MAP)} dw.
    The solution comes by observing that this is a multivariate normal,
    p(w | x, y) ≈ Normal(µ, Σ),
    where
    µ = w_MAP,   Σ = (−∇² ln p(y, w_MAP | x))^{−1}.
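    As a concrete illustration, here is a minimal sketch for Bayesian logistic
    regression. The synthetic data, labels y_i ∈ {−1, +1}, and the prior
    w ~ Normal(0, λ⁻¹ I) are assumptions made for this example, not specifics
    from the video. It finds µ = w_MAP by optimization and builds Σ from a
    finite-difference Hessian of f; the analytic Hessian follows on the next
    slide.

        import numpy as np
        from scipy.optimize import minimize

        def sigma(t):
            return 1.0 / (1.0 + np.exp(-t))

        # Synthetic data (assumed for illustration): labels y_i in {-1, +1}
        rng = np.random.default_rng(0)
        n, d, lam = 200, 3, 1.0                     # lam: assumed prior precision
        X = rng.normal(size=(n, d))
        w_true = rng.normal(size=d)                 # ground truth used only to simulate y
        y = np.where(rng.random(n) < sigma(X @ w_true), 1.0, -1.0)

        # f(w) = ln p(y, w | x) = sum_i ln sigma(y_i x_i^T w) - (lam/2)||w||^2 + const
        def neg_f(w):
            m = y * (X @ w)
            return np.sum(np.logaddexp(0.0, -m)) + 0.5 * lam * (w @ w)

        def neg_grad_f(w):
            m = y * (X @ w)
            return -X.T @ (y * sigma(-m)) + lam * w

        # mu = w_MAP: maximize f by minimizing -f
        w_map = minimize(neg_f, np.zeros(d), jac=neg_grad_f, method="L-BFGS-B").x
        print(np.linalg.norm(neg_grad_f(w_map)))    # ~ 0, matching simplification 2

        # Sigma = (-Hessian of f at w_MAP)^{-1}; Hessian via central finite differences
        eps = 1e-5
        neg_hess = np.zeros((d, d))
        for j in range(d):
            e = np.zeros(d)
            e[j] = eps
            neg_hess[:, j] = (neg_grad_f(w_map + e) - neg_grad_f(w_map - e)) / (2 * eps)
        Sigma = np.linalg.inv(neg_hess)
        print(w_map, Sigma, sep="\n")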
    We can take the second derivative (Hessian) of the log joint likelihood to find
    ∇² ln p(y, w_MAP | x) = −λI − Σ_{i=1}^{n} σ(y_i · x_i^T w_MAP) (1 − σ(y_i · x_i^T w_MAP)) x_i x_i^T.
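    Continuing the sketch above, this analytic form can replace the
    finite-difference Hessian, and the two should agree to numerical precision:

        # Analytic negative Hessian per the formula above:
        # lam*I + sum_i s_i (1 - s_i) x_i x_i^T, with s_i = sigma(y_i x_i^T w_MAP)
        s = sigma(y * (X @ w_map))
        neg_hess_exact = lam * np.eye(d) + X.T @ (X * (s * (1 - s))[:, None])
        print(np.max(np.abs(neg_hess_exact - neg_hess)))   # small, ~1e-6 or less
        Sigma = np.linalg.inv(neg_hess_exact)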
  • Science & Technology

Comments • 5

  • @guacamole3109 · 4 months ago · +1

    Great explanation, better than my lecture notes

  • @salkban2066 · a year ago

    Thanks for the clear and concise explanation!

  • @YuchengWang-xh5fw · 3 months ago

    thx, it helps!

  • @caocyan4369 · a year ago

    Fantastic explanation! Can’t understand why no one has found this video.

    • @kumarpython · a year ago · +1

      Thank you ... discoverability will take more time, I suppose