A Simple Introduction to Copulas

  • Added 12 March 2021
  • A no-formulas, graphical introduction to Copulas and why they are useful, all using simple Python libraries.
    Join the discussion: dirtyquant.com​
    Github link to the notebook: github.com/tinoproductions/Di...

Comments • 140

  • @johneagle4384 2 years ago +3

    Videos like yours are very useful. They provide an easy-to-follow introduction to seemingly complicated subjects. Thank you!!

  • @TheAndiii007 2 years ago +19

    The production and editing quality is astonishing and very refreshing and in my opinion, rather uncommon for quant finance videos. I hope your channel will grow!

    • @dirtyquant 2 years ago +1

      That is one of the nicest things anyone has ever said to me. Thank you very much.
      Glad someone appreciates the hard work. :-)

  • @shubhangipassi2593 3 years ago +4

    You've got a subscriber here for sure. I am in the actuarial field and your videos are very succinct and informative. Helped me quite a bit, thanks

    • @dirtyquant 3 years ago

      Thanks so much. Glad you found the video useful

  • @ArnauViaM 2 years ago +10

    Hi! Thank you for your work! Let me summarize the process to see if I got it. You start from two sets of data points which you know have some dependence (but not what sort of dependence); you do not know the distribution of each data set by itself either. You look at each one and try to figure out the distribution that best approximates its behavior. Once you have settled on a particular distribution, you use its CDF to transform each data set into uniformly distributed observations. Then you try to figure out the correlation between both uniform datasets, but as a final twist, you use a function instead of a constant value for the correlation, and that function is the copula. Hope I got at least some of it right, thank you!

    • @dirtyquant 2 years ago +2

      That is 100% correct! Perfect summary
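
The steps summarized in this exchange can be sketched in a few lines. This is a minimal illustration and not the notebook from the video: the gamma and beta marginals, their parameters, and the names `time_spent` and `money_spent` are assumptions chosen for the example, and the "observed" data is simulated so the block is self-contained.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Stand-in for observed data (hypothetical): two dependent samples whose
# marginals are gamma ("time spent") and beta ("money spent").
z = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=5_000)
u_true = stats.norm.cdf(z)
time_spent = stats.gamma(a=2.0, scale=10.0).ppf(u_true[:, 0])
money_spent = stats.beta(a=2.0, b=5.0).ppf(u_true[:, 1])

# Step 1: choose a family for each marginal and fit its parameters.
a_hat, _, scale_hat = stats.gamma.fit(time_spent, floc=0)
b1_hat, b2_hat, _, _ = stats.beta.fit(money_spent, floc=0, fscale=1)

# Step 2: push each sample through its own fitted CDF to get uniforms.
u_time = stats.gamma(a_hat, scale=scale_hat).cdf(time_spent)
u_money = stats.beta(b1_hat, b2_hat).cdf(money_spent)

# Step 3: the dependence that remains between the uniforms is what the
# copula describes; a single correlation number is only a summary of it.
dependence = np.corrcoef(u_time, u_money)[0, 1]
print(f"fitted gamma shape: {a_hat:.2f}, correlation of uniforms: {dependence:.2f}")
```

With real data, step 1 is the judgment call: the choice of family for each marginal has to come from looking at the data.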

  • @souvikchakraborty3095 1 year ago +1

    I discovered your channel today and I sincerely hope that you never stop making videos :)

    • @dirtyquant 1 year ago +1

      Thank you. That is very flattering. I hope to make many more

  • @JustCheckingMusic 3 years ago +4

    Really nice explanatory video of the concept of copulas! Thank you for creating this content.

    • @dirtyquant 3 years ago +1

      Glad you found it useful. Appreciate you commenting. Will keep making content!

  • @silvera1109 4 months ago +1

    Great video. I loved "loading word" and "let's play a game" - hilarious! Thanks 🤣👌

  • @kalmanfilter1224 3 years ago +4

    Thank you so much for this video ! I had some hard time trying to understand copulas, wish I had seen your video earlier

  • @ghostwhowalks5623 3 years ago +2

    WOW - been searching for this for YEARS!!!!!!!!! Thanks for sharing!

    • @dirtyquant 3 years ago

      Glad you found it useful Chet!

    • @ghostwhowalks5623 3 years ago +1

      @@dirtyquant - def! So quick Q - can this be used to generate correlated bernoulli variables? Left more details on your main website(in the forum). Would appreciate your thoughts. thanks!

    • @dirtyquant 3 years ago +1

      @@ghostwhowalks5623 hi Chet, I saw your message in the forum. Let me reply there.

    • @ghostwhowalks5623 3 years ago

      @@dirtyquant - no rush sir! Thanks!!

  • @chinhhoang454 4 months ago

    The channel name is quite misleading, it should be CLEAN Quant!!! Thank you so much for your work!

  • @munozariasjm 3 years ago +4

    It was a really good introduction, keep up the excellent work!

  • @lade_edal 1 year ago +1

    Such brilliant production quality!

  • @mehdiagunaou4424 3 years ago +4

    Good video !! you helped me a lot. You deserve more views.

    • @dirtyquant 3 years ago +1

      Thanks for watching. Glad you are enjoying the videos.

  • @rossic7702 3 years ago +2

    thanks a lot, I've been trying to understand copulas to apply the COPOD() model of ML and now, wow

    • @dirtyquant 3 years ago +2

      Makes me very happy to hear this! Well done

  • @corradoforza 3 years ago +3

    Thank you so much, this was very clear and useful! Actually this video saved me a lot of time! Keep going 💪🏻💪🏻💪🏻

    • @dirtyquant 3 years ago +1

      Very happy you found it useful. Just trying to make copulas more widely used as they aren’t complex once you see them under an easy light.

  • @Sseyedsalehi 2 years ago +8

    Thank you! that was excellent. I wish you could also put Sklar's theorem
    somewhere so we can relate things to that.

    • @dirtyquant 2 years ago +1

      Glad you enjoyed it. I try to keep things simple and formula free. Might do a video just in Sklar. Cheers!
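
For readers who want the formula this request refers to, Sklar's theorem for two variables can be written as below. This is the standard statement, not something shown in the video; the symbols $H$, $F$, $G$ and $C$ are notation chosen here.

```latex
% H: joint CDF of (X, Y); F, G: marginal CDFs of X and Y; C: the copula.
H(x, y) = C\bigl(F(x),\, G(y)\bigr)
% If F and G are continuous, C is unique and can be recovered as
C(u, v) = H\bigl(F^{-1}(u),\, G^{-1}(v)\bigr), \qquad u, v \in [0, 1]
```

Read against the video: $F$ and $G$ are the two marginals, $F(x)$ and $G(y)$ are the transformed uniform values, and $C$ is whatever dependence is left between them.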

  • @anlsen5371 3 years ago +4

    Thanks for your great effort. I hope you will reach the audience that appreciates your work

    • @dirtyquant 3 years ago

      Thanks for your kind words! I hope so too :-)

  • @bjoernaagaard 2 years ago +1

    This is my new favorite channel

  • @KohsheenTiku 1 year ago +1

    Brilliant explanation. I spent a long, long time trying to understand this from other material, but this explanation was wow!

    • @dirtyquant 1 year ago

      Ha! Glad it helped you out. I always struggled too but one day it clicked. Now I want to tell people about it :-)

  • @shirishsam08410447 3 years ago +2

    You got a subscriber, nicely explained :)

    • @dirtyquant 3 years ago

      Thank you! Glad you found it useful

  • @vl30.7 3 years ago +4

    Definitely deserves more views!

    • @dirtyquant 3 years ago

      Appreciate it! Hopefully the views will come

    • @17teacmrocks 3 years ago +1

      yes, but most ppl will not understand. it's just like there have been real trader channels on youtube that went nowhere. meanwhile fake gurus have millions of subs

    • @dirtyquant 3 years ago

      Indeed. It’s a real shame.

  • @HeyCurlyBoy 2 years ago +1

    Thank you so much ! this was so well explained!

  • @sylvesterrokuman7520 2 years ago +4

    Thank you for your video! It was very informative.
    I'm in the civil engineering field and doing hydrology studies for my master's thesis, and like you said in the beginning of your video, most textbooks I came across jump straight into the deep mathematical concepts without giving an overview and intuitive understanding of the subject as you have done here.
    I am currently using R to do this and I haven't learned Python very well, so do you have an R version of your example, and could you share it please?
    Thank you!

    • @dirtyquant 2 years ago +1

      Hi mate. Glad you found the video useful.
      Plenty of places to get dizzy with maths, few places to explain it in plain English.
      Sorry but I don’t use R. It’s well used in the stats community so I think you should be able to take my example in Python and translate it to R.
      If you use Jupyter you can run Python and R so that would be handy!
      Best of luck in your studies mate

  • @Anushdf1 3 years ago +2

    Your videos are super helpful. Thank you very much for your time!
    I have been looking for a good book on ML in finance. Do you recommend the one that is on your desk?

    • @dirtyquant 3 years ago +1

      Hi, I would recommend sebastianraschka.com/books/
      It's not finance specific, but it's really well written and easy to understand each of the models.
      Advances in Financial Machine Learning by López de Prado is very hard, and in my view not that practical for most people, but you can take a look if you like.

  • @tiga7659 9 months ago +1

    Amazing video! Thank you so much

  • @qlofficial550 2 years ago +1

    Thank you! This is so helpful!

  • @RohitKaran007 3 years ago +2

    Very nice and simple explanation

  • @taotaotan5671 2 years ago +2

    Thank you for your excellent video!
    One small question here: the correlation between gamma and beta distribution changed after the transformation. We can only indirectly control for the correlation by specifying the covariance matrix of the multi-Gaussian distribution. I am wondering if there is a way to directly control this correlation.
    Thanks again for providing great resources to the internet!

    • @dirtyquant 2 years ago

      Hi Taotao, I am happy you found the content useful.
      What I am trying to show is that by using correlation, we are assuming a linear relationship between the 2 datasets. This is where we get the corr of 0.72, while actually it's 0.8.
      If you are happy with this, and understand that the data has a unique structure, then you can stop there.
      What copulas allow us to do is to use a universal language, the transformation from whatever distribution to the uniform, so we can apply our special copula (which might have strong tail dependence etc).
      It's just a tool to make our life easier. If you are happy with just corr between beta and gamma, then happy days!

    • @taotaotan5671 2 years ago

      @@dirtyquant Thanks for replying! I see what you mean: Pearson correlation, which assumes a linear relationship, isn’t a good measure for distributions like gamma and beta. Other measures, like Spearman correlation, are more appropriate here.
      Thanks again! Love this channel!
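
The point about Pearson versus Spearman in this exchange can be checked numerically. The sketch below is an illustration under assumptions: the gamma and beta parameters and the target value 0.7 are made up, and it relies on the known Gaussian-copula relation $\rho = 2\sin(\pi \rho_S / 6)$ between the Gaussian correlation $\rho$ and Spearman's $\rho_S$. That relation gives one way to control the rank correlation of the final data directly, whatever the marginals are.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Pick the Spearman correlation we want in the final data (illustrative),
# then convert it to the Gaussian correlation that produces it.
target_spearman = 0.7
rho = 2.0 * np.sin(np.pi * target_spearman / 6.0)

z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=50_000)
u = stats.norm.cdf(z)

# Hypothetical marginals: gamma for one variable, beta for the other.
x = stats.gamma(a=2.0, scale=10.0).ppf(u[:, 0])
y = stats.beta(a=2.0, b=5.0).ppf(u[:, 1])

pearson_z = np.corrcoef(z[:, 0], z[:, 1])[0, 1]
pearson_xy = np.corrcoef(x, y)[0, 1]
spearman_z, _ = stats.spearmanr(z[:, 0], z[:, 1])
spearman_xy, _ = stats.spearmanr(x, y)

print(f"Pearson:  normals {pearson_z:.3f} -> gamma/beta {pearson_xy:.3f}")
print(f"Spearman: normals {spearman_z:.3f} -> gamma/beta {spearman_xy:.3f}")
```

Pearson drops after the transformation because the relationship is no longer linear, while Spearman only depends on ranks and so is unchanged by the monotone CDF and inverse-CDF steps.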

  • @giovannistephens3107 3 years ago +4

    Thanks, man! Please, create the next (more advanced) copula video!

    • @dirtyquant 3 years ago +2

      Noted. Will do!

    • @vector2789 3 years ago +1

      @@dirtyquant Yes please!

    • @dirtyquant 3 years ago

      Let me think of a good next step. Maybe some copula simulations. I love simulations :-)

    • @giovannistephens3107 3 years ago +1

      @@dirtyquant Maybe something like a DCC-GARCH vs. a Copula-GARCH. I would love to see both applied to a portfolio optimisation and backtest.

    • @dirtyquant 3 years ago +1

      Now you're talking my language. Let me see what I can do

  • @yingfeng9404 3 years ago +2

    Thank you for your video :)

  • @devaughntaylor9121 1 year ago +1

    Really helpful!

  • @kanejiang2938 3 years ago +2

    Thank you Dirty Quant, you are a nice man. Could you make a video about how to select a copula model?

    • @dirtyquant 3 years ago

      Thank you! Good idea, let me look into that

  • @404username 2 years ago +2

    Hi, could you possibly elaborate why and how you transformed the beta distribution into UDD?

    • @dirtyquant 2 years ago +2

      Hi,
      We transformed the Beta distribution to uniform because uniform becomes the common language that lets us use copulas. Whatever distribution your data is in, there should "hopefully" be a way to translate it to uniform; that way we have more flexibility in using multivariate distributions. Most multivariate distributions assume that each of the marginals (the individual pieces of data) follows the same distribution.
      If Data A and Data B are both Beta distributed, then you can go ahead and use the Dirichlet distribution, which is the multivariate Beta. But what if Data A is Beta and Data B is normal? Copulas are the answer. To make it happen, we need to translate the data into a format that is common to both of them. That's where uniform comes in.
      To transform it we just use the Cumulative Distribution Function (CDF) of that distribution. Simple as that.
      The skill is to know WHAT distribution our data is in.
      Hope that helps
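
A small sketch of the CDF step described in this reply, and of why picking the right distribution matters. The Beta(2, 5) parameters and the sample are assumptions made up for the illustration. The same data is pushed through the correct Beta CDF and through a deliberately wrong normal CDF, and a Kolmogorov-Smirnov statistic measures how far each result is from uniform.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical data that really is Beta(2, 5) distributed.
data = rng.beta(2.0, 5.0, size=10_000)

# Correct choice: the CDF of the distribution the data actually follows.
u_right = stats.beta(2.0, 5.0).cdf(data)

# Wrong choice: a normal distribution with matching mean and spread.
u_wrong = stats.norm(loc=data.mean(), scale=data.std()).cdf(data)

# KS statistic against Uniform(0, 1): smaller means closer to uniform.
ks_right = stats.kstest(u_right, "uniform").statistic
ks_wrong = stats.kstest(u_wrong, "uniform").statistic

print(f"KS distance with the Beta CDF:   {ks_right:.4f}")
print(f"KS distance with the normal CDF: {ks_wrong:.4f}")
```

Only the correct CDF gives values that look uniform, which is why identifying the distribution of each dataset is the skilled part.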

  • @humarani51 3 years ago +2

    Thank you so much Sir, it's really, really helpful... The way you explained it was superb.
    Well, can you please tell me about the data you generated? I mean, is this R? Or some other software... Please help me. I want to replicate it the same way you did. Thank you

    • @dirtyquant 3 years ago

      Hi. This is all done in Python, using Jupyter notebooks. You can get my code on my GitHub page, link in the video description.
      Glad you enjoyed the video :-)

  • @c0mmment 3 years ago +2

    Question - you talk about "fitting" the copula near the end of the video. What do you mean by that? In your example there is no fitting, you just plot one CDF vs another CDF.

    • @dirtyquant 3 years ago

      Hi, so with the gaussian copula there is very little to "fit", the correlation value/correlation matrix is all you get out, but it's not fitted in the traditional sense. With Gaussian all you have is the CDF in the uniform space, and from that extract a correlation. So yes, you are indeed correct, but the method is the same for more parametric copulas, the initial steps are the same. Thanks for watching and commenting. Cheers!
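
To make concrete what "fitting" a Gaussian copula amounts to, here is a minimal sketch. It swaps in a different technique from the video: instead of parametric CDFs it uses rank-based pseudo-observations, so no marginal distribution has to be assumed. The function name `fit_gaussian_copula` and the test data are hypothetical.

```python
import numpy as np
from scipy import stats


def fit_gaussian_copula(x, y):
    """Estimate the correlation matrix of a Gaussian copula from two samples."""
    n = len(x)
    # Pseudo-observations: ranks scaled into (0, 1), one per marginal.
    u = stats.rankdata(x) / (n + 1)
    v = stats.rankdata(y) / (n + 1)
    # Map the uniforms to normal scores; their correlation matrix is the
    # only parameter a Gaussian copula has.
    scores = np.column_stack([stats.norm.ppf(u), stats.norm.ppf(v)])
    return np.corrcoef(scores, rowvar=False)


rng = np.random.default_rng(7)
true_rho = 0.8
z = rng.multivariate_normal([0.0, 0.0], [[1.0, true_rho], [true_rho, 1.0]], size=20_000)

# Distort both marginals with increasing transformations; the copula is unchanged.
x = np.exp(z[:, 0])
y = z[:, 1] ** 3

corr = fit_gaussian_copula(x, y)
rho_hat = corr[0, 1]
print(f"estimated copula correlation: {rho_hat:.3f}")
```

Copula families with more parameters follow the same first steps and then estimate their parameters from the uniforms, typically by maximum likelihood.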

  • @tareqahmed181 2 years ago +1

    Thank you for the very nice introduction to marginals and copula! Is there a way to get in touch with you regarding a problem that I have at hand?

    • @dirtyquant 2 years ago +1

      Sure. Write a post on dirtyquant.com
      Glad you liked the video

  • @coco-il4gr 2 years ago +1

    Thank you so much !

  • @luzianlechner1884 1 year ago +1

    amazing thanks!

  • @kubetail12 8 months ago

    From what I am reading, I am getting an idea of what copulas do, but figuring out how to apply copulas to my problems is the tough part. Plus, I am thinking of publishing my work in a journal that is more measurement-science oriented, so I really want to get it right. I am good at math until it comes to proofs, and I notice statistics tends to use a more mathematician-style presentation than, say, engineering and physics math.

  • @minecraftscienceINC 2 years ago +1

    Interesting video. You assign the scatter plot in the video to be of the Gaussian type - but what about the clustering around (0,0) and (1,1)? Shouldn't the Gaussian copula have a larger grouping at (0.5,0.5)? I am a little confused about that at least.

    • @dirtyquant 2 years ago +1

      Hi, can you point to where you are seeing this? I did this video a year ago and I can’t remember it all.
      Thanks for watching

    • @minecraftscienceINC 2 years ago +1

      @@dirtyquant It is at about 13:36 where you show the plot, and at 14:00 you say that it was essentially Gaussian copula-like behaviour

    • @dirtyquant 2 years ago +1

      @@minecraftscienceINC I think our eyes are deceiving us then.
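
The question raised in this thread can be checked with a quick simulation rather than by eye. The sketch below assumes a Gaussian copula with correlation 0.8 (the value mentioned elsewhere in the comments); the box sizes and the helper `share_in_box` are arbitrary choices for the illustration. It counts how many points fall near the two corners and near the centre of the unit square.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
rho = 0.8

# Sample from a Gaussian copula: correlated normals pushed through the normal CDF.
z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=200_000)
u = stats.norm.cdf(z)


def share_in_box(points, lo, hi):
    """Fraction of points with both coordinates in [lo, hi)."""
    inside = np.all((points >= lo) & (points < hi), axis=1)
    return inside.mean()


corner_low = share_in_box(u, 0.0, 0.1)     # near (0, 0)
corner_high = share_in_box(u, 0.9, 1.0)    # near (1, 1)
centre = share_in_box(u, 0.45, 0.55)       # near (0.5, 0.5)
independent = 0.1 * 0.1                    # share of each box under independence

print(f"near (0,0): {corner_low:.4f}  near (1,1): {corner_high:.4f}")
print(f"near (0.5,0.5): {centre:.4f}  independence baseline: {independent:.4f}")
```

Under these assumptions the corner boxes hold more points than the centre box, so clustering near (0,0) and (1,1) is what a positively correlated Gaussian copula looks like in uniform space. Each marginal is flat on its own, so there is no reason to expect a pile-up at (0.5, 0.5) as there would be for the underlying normals.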

  • @HrUssing 3 years ago +2

    The probability integral transform is what makes the CDF uniform, right?

    • @dirtyquant 3 years ago

      Yes, you take the original data and transform it into uniform using the CDF of that distribution, which is the probability integral transform.

    • @HrUssing 3 years ago +1

      @@dirtyquant Alright, pretty cool method. Thank you for the video, definitely the best material i have found on this subject!

    • @dirtyquant 3 years ago

      That’s great to hear. Trying to make this subject accessible was my goal.

  • @FinianAllen4 2 months ago

    great thanks man

  • @ekolytih 3 years ago +3

    good stuff - but can't believe you are running this without a github portal for the notebooks ;)

    • @dirtyquant 3 years ago +4

      I do! It’s in the video description. Link to Github for all projects used in the channel. Thanks for watching!

    • @ekolytih 3 years ago +2

      @@dirtyquant who on earth would click 'show more' and find the link lol. That being said, keep the good videos coming :D

    • @dirtyquant 3 years ago +2

      @@ekolytih haha! Indeed. I will mention it on my next video so people are aware. Thanks for that

  • @lukeweidner9110 1 year ago +1

    Wow that plot showing the CDF uniform distribution totally clicked for me. I had read about that before but it didn't make any sense until now

    • @dirtyquant 1 year ago

      Very happy to hear you found some value Luke

  • @spearchew 1 year ago

    enjoy the channel but personal preference would be to reduce the background music as you get deeper into the video, it can be a little distracting.

  • @pinakibhattacharyya7853 2 years ago +1

    That's Halperin and Dixon's recent book on your desk.

  • @barsraday9531 1 year ago +1

    👏👏👏

  • @user-or7ji5hv8y 3 years ago +2

    Cool

  • @user-wc7em8kf9d 2 years ago +1

    Woow !!!

  • @Megal0lbanana 9 months ago +1

    Excuse me, but what is the presenter referring to by marginals?

    • @dirtyquant 9 months ago

      Hi mate, marginals is just a fancy word for the distribution of the 2 separate datasets. So time can have a certain distribution and money spent a totally different one. These 2 datasets are the marginals.

  • @Blaze098890 3 years ago +1

    Not sure what we are trying to model here. My understanding is that we are looking to better capture relationship between the variables time spent and money spent. What we are doing here seems to me is trying to model the Normals that were used to generate the uniform seeds. Why do we care about the correlation returning back to 0.8 when the correlation of 0.72 captures the data (albeit, doesn't catch non-linearity) more accurately? I just don't see what the transformation to uniform gives us. It is simply a long-winded way for calculating the correlations between the CDFs of the two Gaussians we started with. I see that in the real world we observe the time spent and money spent and then we can use that to find the correlation between the CDFs of the Gaussians but I don't see how that is useful. What am I missing?

    • @dirtyquant 3 years ago +3

      Thanks so much for your attentive reply. The core reason to use copulas is to allow you to use different distributions for the marginals, i.e. the individual data sets, and once we have those, allow us to have a non-Gaussian relationship between them if we want. In my example we have a non-linear correlation between the 2 variables, time and money, but by identifying the type of distribution in each, and transforming them to uniform, we now have a linear relationship. We are using a Gaussian copula here, because that is all we need. But it could be the case that the dependence is really strong in the tails, so big spenders spend a lot of time on the site, far more than your average, and now a Gaussian copula isn't sufficient any more.
      As you say, "it's a long winded way to get the correlation of CDF", yes indeed, that is copulas.
      The reason for finding the correlation of the CDFs is that you then have the true, non-linear relationship between them, which allows you to simulate data and find probability distributions, e.g. when someone spends 30 mins on the site, how much are they likely to spend?
      The next step after this is to have many variables, each with their own distributions, and then be able to pick the most suitable copula, or type of relationship between the transformed variables.
      Hope that clears it up. This is a basic example, without using formulas, as copula maths can be brutal for newcomers.
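
The "30 minutes on the site" question in this reply can be sketched as a conditional simulation. Everything specific here is an assumption for illustration: the gamma marginal for time in minutes, the beta marginal scaled to a 0 to 100 spend, the correlation of 0.8, and the helper name `simulate_money_given_time`. The sketch uses the standard fact that in a Gaussian copula the second normal score, given the first, is normal with mean `rho * z1` and variance `1 - rho**2`.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
rho = 0.8

# Hypothetical marginals: minutes on site and money spent (0 to 100).
time_dist = stats.gamma(a=2.0, scale=10.0)
money_dist = stats.beta(a=2.0, b=5.0, scale=100.0)


def simulate_money_given_time(minutes, n=10_000):
    """Draw money-spent values conditional on a given time spent."""
    # Time -> uniform -> normal score.
    z_time = stats.norm.ppf(time_dist.cdf(minutes))
    # Conditional distribution of the other normal score.
    z_money = rng.normal(loc=rho * z_time, scale=np.sqrt(1.0 - rho**2), size=n)
    # Normal score -> uniform -> money, via the money marginal's inverse CDF.
    return money_dist.ppf(stats.norm.cdf(z_money))


spend_30 = simulate_money_given_time(30.0)
spend_5 = simulate_money_given_time(5.0)

low, median, high = np.percentile(spend_30, [5, 50, 95])
print(f"spend after 30 min: median {median:.1f}, 90% range {low:.1f} to {high:.1f}")
print(f"spend after 5 min:  median {np.median(spend_5):.1f}")
```

The output is a full distribution of likely spend for a given time on site, not a single predicted number. A copula with heavier tail dependence would change only the conditional sampling step.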

  • @samimocni7477 2 years ago +1

    Very good explanation!
    Ps. The music in the background is not needed 😀

    • @dirtyquant 2 years ago +1

      Glad you found it useful.
      Gotta have beats! That’s the Dirty Quant style!

    • @samimocni7477 2 years ago +1

      @@dirtyquant Looking forward to next clips with and without music 😀

    • @dirtyquant 2 years ago

      @@samimocni7477 Keep everyone happy!

  • @drachenschlachter6946

    9:26 brain afk...

  • @LunakSocioAnthroLinguist

    okay, this is not semantics in linguistics, cool video though

  • @vonGameTheory 1 year ago +1

    Would love to understand copulas but couldn't watch this video with the music (and even talking DJ) overwhelming it -- how am I supposed to concentrate on what you're saying? I thought it must be an accident but it was apparently intentional. I guess all your videos are like that? Very strange choice -- unwatchable.

    • @dirtyquant 1 year ago

      Haha. Sorry I can’t please everyone.
      Merry Xmas :-)

  • @neoepicurean3772 3 years ago

    Please, cut out the background music. For people who watch at a sped-up rate it is a nightmare. I'm trying to watch at x2 and it just sounds like someone rattling spoons.

    • @dirtyquant 3 years ago +2

      Hi,
      I have looked at the stats and it’s such a small % of people who listen to the video at a sped up rate that I would rather keep my style, with music in the background.
      But thanks for watching.

    • @neoepicurean3772 3 years ago

      @@dirtyquant It's no surprise that people who use sped up function aren't watching, as people that want to watch at x2 will find a video without background music. That's why I was letting you know, as otherwise your content is excellent.
      With all the online teaching over the past year the speed up function is very widely used now. Most people that I know that use videos as a learning resource only watch videos at sped up rates. It's a life hack that is catching on. My university has gone up to x3 now which is amazing for getting through recorded lectures.

    • @dirtyquant 3 years ago

      Totally understand, but I feel like videos without music are so damn boring, that I would rather lose the speed up crew than bore the rest to death.
      Thanks for the feedback. Maybe I will upload 2 versions, one with music and one without, so you can choose.

    • @neoepicurean3772 3 years ago

      @@dirtyquant Check the stats for yourself, but it's something like 26% of users now watching in sped up mode in 2019. Up 10% from 2018 and expected to be around 50% about now. I study 'futurism', and this is certainly not going to be something that goes away any time soon, People are even watching dramatic content sped up now after Netflix introduced the feature due to popular demand. It's only music that ruins my sped up life! Anyway, do whatever you want. All the best.

    • @dirtyquant 3 years ago

      Cool. I will post both and see which one gets more traction. YouTube should introduce that feature, where you can choose the audio track for people that want 2X. Have a good one!

  • @drachenschlachter6946 2 years ago

    Simple introduction.... 16 min video.... There you see how complicated this shit is

  • @drachenschlachter6946

    Sorry, but it is explained so, so badly... really. It was easier to understand the literature than this video...

  • @drachenschlachter6946 2 years ago

    Sorry, but your video didn't help at all....