Probabilistic Programming - FOUNDATIONS & COMPREHENSIVE REVIEW!

  • Published 7 Jul 2024
  • This tutorial explains what probabilistic programming is & reviews 5 frameworks (PPLs), using an example taken from Chapter 4 of Statistical Rethinking by Dr. Richard McElreath (a rough sketch of this kind of model is included just after the video details below).
    The frameworks (PPLs) reviewed are:
    Stan (mc-stan.org/)
    PyMC3 (docs.pymc.io/)
    Tensorflow Probability (www.tensorflow.org/probability)
    Pyro/NumPyro (pyro.ai/)
    Turing.jl (turing.ml/stable/)
    I also give a brief review of a great library called arviz (arviz-devs.github.io/arviz/), which can be used with all of the above-mentioned PPLs to do exploratory data analysis of Bayesian models.
    Here is the link to the notebook in which I have implemented the example model using the above frameworks/PPLs:
    colab.research.google.com/dri...
    The Turing.jl implementation was taken from here:
    shmuma.github.io/rethinking-2...
    The home page for Statistical Rethinking by Dr. McElreath, where you will find links to the implementations of his book in various PPLs:
    xcelab.net/rm/statistical-ret...
    #ProbabilisticProgramming
    #BayesianStatistics
    0:00 Introduction
    1:11 Why do we need Probabilistic Programming?
    2:13 What is a PPL?
    6:09 Dataset and Model Description
    9:13 Stan
    13:44 arviz
    16:20 PyMC3
    18:27 Tensorflow Probability
    21:44 Pyro & NumPyro
    24:14 Turing.jl (Julia)
    26:22 Recommendations on how to choose a framework (PPL)
  • Science & Technology
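
    A minimal PyMC3-style sketch of the kind of Chapter 4 linear regression reviewed in the video (the data below are simulated stand-ins for the book's Howell1 data and the priors are only illustrative; the notebook linked above is the authoritative version):

    ```python
    import numpy as np
    import pymc3 as pm

    # Simulated stand-in for the Howell1 height/weight data used in Chapter 4.
    rng = np.random.default_rng(0)
    weight = rng.normal(45, 6, size=200)
    height = 150 + 0.9 * (weight - weight.mean()) + rng.normal(0, 5, size=200)

    with pm.Model() as linear_model:
        a = pm.Normal("a", mu=178, sigma=20)            # intercept prior
        b = pm.Lognormal("b", mu=0, sigma=1)            # positive slope prior
        sigma = pm.Uniform("sigma", lower=0, upper=50)  # noise scale prior
        mu = a + b * (weight - weight.mean())
        pm.Normal("height", mu=mu, sigma=sigma, observed=height)
        trace = pm.sample(1000, tune=1000, chains=2, random_seed=0)
    ```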

Comments • 27

  • @MeshRoun
    @MeshRoun 2 years ago +3

    What a helpful overview. Thank you infinitely!

  • @eugeneL_N1E104
    @eugeneL_N1E104 1 year ago +3

    I vote for TFP and agree with your comments, especially the Python Zen point: many ways to do the same thing is not always a good idea ("There should be one, and preferably only one, obvious way to do it!").
    I tried to learn TFP, but it looks like it needs more effort than my time budget allowed, so I had to finish a small project using PyMC. But to be serious, we really need TFP.
    Industry applications rely heavily on Python & TF (the server cannot 'conda' at all), but tutorials and standards for TFP are not well established at the moment.
    It would be great if you could walk us through TFP properly (and show us what the preferred only one way is XD)...
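
    A minimal sketch of one common way to express this kind of model in TFP, using tfd.JointDistributionSequential (priors, variable names, and toy data are illustrative assumptions, not taken from the video or notebook):

    ```python
    import tensorflow as tf
    import tensorflow_probability as tfp

    tfd = tfp.distributions

    # Toy predictor; in practice this would be the observed weights.
    weight = tf.constant([43.0, 48.0, 52.0, 57.0])

    # Each lambda receives the previously defined variables in reverse order.
    model = tfd.JointDistributionSequential([
        tfd.Normal(loc=178.0, scale=20.0),              # alpha
        tfd.Normal(loc=0.0, scale=10.0),                # beta
        tfd.Exponential(rate=1.0),                      # sigma
        lambda sigma, beta, alpha: tfd.Independent(     # height | alpha, beta, sigma
            tfd.Normal(loc=alpha + beta * weight, scale=sigma),
            reinterpreted_batch_ndims=1),
    ])

    sample = model.sample()        # one prior draw of [alpha, beta, sigma, heights]
    print(model.log_prob(sample))  # joint log-density, the quantity MCMC works with
    ```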

  • @anuragkumar1015
    @anuragkumar1015 2 years ago +2

    Great content. Please keep making videos

  • @amitk3474
    @amitk3474 2 years ago +2

    Great stuff! Can you do a video on your journey as a developer, books, experiences, self-learning, to inspire those on similar paths?

    • @KapilSachdeva
      @KapilSachdeva  2 years ago +1

      Thanks Amit.
      Interesting question. I have never given it much thought, but maybe one day! 🙏

  • @shankar2chari
    @shankar2chari 2 years ago +1

    @13:44 - "The next thing to worry about is..." Kapil, why should we have to worry about anything when you are at the other end simplifying all the esoterica? This is cool.

  • @user-or7ji5hv8y
    @user-or7ji5hv8y 2 years ago +1

    Great topic

    • @KapilSachdeva
      @KapilSachdeva  2 years ago

      🙏

    • @KapilSachdeva
      @KapilSachdeva  2 years ago

      Hi C, I saw a comment notification from you but it does not appear here in the feed.
      I can see a portion of your question, and I believe you were asking if it is possible to integrate neural networks when using NumPyro.
      The answer is yes. NumPyro uses JAX as a backend, and you can use a few different neural network frameworks that are built on top of JAX.
      Not sure if you deleted the comment or YouTube did. I have seen it happen a few times now. Possibly a bug in YouTube!
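
      A minimal sketch of that integration using flax through numpyro.contrib.module.random_flax_module (the network, priors, and toy data are illustrative assumptions, not from the video):

      ```python
      import jax.numpy as jnp
      from jax import random
      import numpyro
      import numpyro.distributions as dist
      from flax import linen as nn
      from numpyro.contrib.module import random_flax_module
      from numpyro.infer import MCMC, NUTS

      class MLP(nn.Module):
          # A tiny flax network whose weights will get Bayesian priors below.
          @nn.compact
          def __call__(self, x):
              x = nn.relu(nn.Dense(8)(x))
              return nn.Dense(1)(x)

      def model(x, y=None):
          # Wrap the flax module so every weight gets a Normal(0, 1) prior.
          net = random_flax_module("mlp", MLP(), dist.Normal(0.0, 1.0), input_shape=(1,))
          sigma = numpyro.sample("sigma", dist.Exponential(1.0))
          mu = net(x).squeeze(-1)
          numpyro.sample("y", dist.Normal(mu, sigma), obs=y)

      # Toy 1-D regression data.
      x = jnp.linspace(-1.0, 1.0, 50)[:, None]
      y = jnp.sin(3.0 * x[:, 0]) + 0.1 * random.normal(random.PRNGKey(0), (50,))

      mcmc = MCMC(NUTS(model), num_warmup=500, num_samples=500)
      mcmc.run(random.PRNGKey(1), x, y=y)
      mcmc.print_summary()
      ```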

  • @arvinddhupkar5158
    @arvinddhupkar5158 1 year ago +1

    Your tutorials are amazing! Complex topics made clear and simple! Can I ask a doubt? Which of the packages handles it best when everything is not conjugate, say when the likelihood is non-Gaussian and the prior is non-Gaussian?

    • @KapilSachdeva
      @KapilSachdeva  1 year ago +1

      Thanks Arvind.
      All of these packages implement many MCMC methods with auto-tuning of their various aspects.
      Which one is best? The notion of "best" here could be:
      A) Is the package (PPL) easy to use?
      B) The speed and accuracy of the inference algorithm implementations.
      Unfortunately there is not much data. There is an effort from Facebook to create a benchmark: ai.facebook.com/blog/ppl-bench-creating-a-standard-for-benchmarking-probabilistic-programming-languages/
      That said, IMO, in terms of the expressiveness of your model, PyMC3 and NumPyro are quite good.
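
      To make the non-conjugate case concrete, here is a minimal NumPyro sketch with Laplace priors, an Exponential prior on the scale, and a Student-t likelihood, sampled with NUTS (model and data are illustrative only, not from the video):

      ```python
      import jax.numpy as jnp
      from jax import random
      import numpyro
      import numpyro.distributions as dist
      from numpyro.infer import MCMC, NUTS

      def model(x, y=None):
          # Deliberately non-conjugate: nothing here has a closed-form posterior.
          a = numpyro.sample("a", dist.Laplace(0.0, 1.0))
          b = numpyro.sample("b", dist.Laplace(0.0, 1.0))
          sigma = numpyro.sample("sigma", dist.Exponential(1.0))
          numpyro.sample("y", dist.StudentT(4.0, a + b * x, sigma), obs=y)

      # Toy data with a linear trend plus noise.
      x = jnp.linspace(-1.0, 1.0, 100)
      y = 0.5 + 2.0 * x + 0.3 * random.normal(random.PRNGKey(0), (100,))

      mcmc = MCMC(NUTS(model), num_warmup=500, num_samples=1000, num_chains=2)
      mcmc.run(random.PRNGKey(1), x, y=y)
      mcmc.print_summary()
      ```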

  • @user-wr4yl7tx3w
    @user-wr4yl7tx3w 2 years ago +2

    A future video idea on Numpyro?

  • @RealMcDudu
    @RealMcDudu 2 years ago +1

    At 18:00 you say 2 chains on each parameter - are you sure it's on each parameter? How do you run MCMC on each parameter on its own? Even if you focus on a, you have to know the values of b and sigma to calculate the joint, which is needed for MCMC...

    • @KapilSachdeva
      @KapilSachdeva  2 years ago +1

      Thanks for pointing this out. Indeed, saying "you are running or creating 2 chains per parameter" can be confusing. Let me try to clarify.
      As you figured out yourself, you do need the other parameters to compute the joint probability, which means the MCMC algorithm (of your choice) is "sampling" from a multidimensional space.
      Now, you do not typically run the sampling procedure over the parameter space only once; rather, you run multiple processes (with different starting points). Let's say you ran the sampling procedure 4 times; then "we say you have 4 chains per parameter".
      But as mentioned above, and as rightly pointed out by you, what you have is a trace of samples per parameter that you "separately" plot for visual analysis and diagnostics.
      Hope this helps!
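
      A minimal arviz sketch of what "chains per parameter" looks like in practice, using the example posterior that ships with arviz (4 chains over a handful of parameters):

      ```python
      import arviz as az
      import matplotlib.pyplot as plt

      # Bundled example posterior sampled with 4 chains.
      idata = az.load_arviz_data("centered_eight")

      az.plot_trace(idata, var_names=["mu", "tau"])      # one row per parameter; each line is one chain
      print(az.summary(idata, var_names=["mu", "tau"]))  # r_hat close to 1 means the chains agree
      plt.show()
      ```

      The same two calls work on your own results once they are converted to InferenceData, e.g. via az.from_pymc3, az.from_numpyro, or az.from_pystan.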

  • @ssshukla26
    @ssshukla26 2 years ago +1

    I suppose some of your videos will let me crack ML interviews. Tysm.

    • @KapilSachdeva
      @KapilSachdeva  2 years ago +1

      Ha ha …glad u find these tutorials helpful. Thanks for always giving a word of encouragement. Means a lot to me.

  • @Maciek17PL
    @Maciek17PL 1 year ago

    Once I have posterior distributions of the parameters, how do I use them for predictions?

    • @KapilSachdeva
      @KapilSachdeva  1 year ago +1

      You would use them to get the "posterior predictive distribution". See czcams.com/video/Kz7YbxHkVI0/video.html

    • @Maciek17PL
      @Maciek17PL 1 year ago

      @@KapilSachdeva yeah, I saw that video, but there was only one distribution on W; now there are 3

    • @KapilSachdeva
      @KapilSachdeva  1 year ago

      I am not sure if I fully understood your concern, but based on a gut feeling of what may be bothering you:
      When we say posterior distribution, it is not only about "one" parameter. It could be over many parameters; we would then have a multivariate posterior distribution.
      If the above is not what concerns you, maybe rephrase or elaborate and I will try to answer.

    • @Maciek17PL
      @Maciek17PL 1 year ago

      @@KapilSachdeva ok, so the posterior distribution is multivariate, I think I get it. But how do I deal with the integral in the posterior predictive? When approximating the posterior with Metropolis-Hastings the integrals canceled each other out, but in the posterior predictive there is only an integral, so I'm unable to calculate alpha

    • @KapilSachdeva
      @KapilSachdeva  1 year ago +2

      Good question.
      For the posterior predictive, you would "estimate" it using the law of large numbers.
      An expected value (expectation) can be estimated using the law of large numbers: you sample from the posterior distribution instead of computing the integral.
      The above statement assumes that you have a good understanding of the expected value and the law of large numbers.
      PS:
      Perhaps I should create a concrete but simple (code) example (without using any framework) that illustrates the workflow end to end.
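
      A framework-free sketch of that workflow (the posterior draws below are faked purely for illustration; in practice they would come from your MCMC trace):

      ```python
      import numpy as np

      rng = np.random.default_rng(0)

      # Pretend these are posterior draws of (alpha, beta, sigma) from MCMC.
      alpha = rng.normal(1.0, 0.1, size=4000)
      beta = rng.normal(2.0, 0.1, size=4000)
      sigma = np.abs(rng.normal(0.3, 0.05, size=4000))

      x_new = 0.5  # a new input we want a prediction for

      # Posterior predictive: p(y_new | data) = integral of p(y_new | theta) p(theta | data) dtheta.
      # By the law of large numbers, drawing one y_new per posterior draw of theta
      # approximates that integral without ever computing it analytically.
      y_new = rng.normal(alpha + beta * x_new, sigma)

      print("posterior predictive mean:", y_new.mean())
      print("94% interval:", np.quantile(y_new, [0.03, 0.97]))
      ```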