Diffie Hellman -the Mathematics bit- Computerphile
Vložit
- čas přidán 19. 12. 2017
- Correction : as oodles of commenters have pointed out, the clock face should go from 0 to n-1. Also, worth reminding people that Mike has simplified the notation in this video (as he mentions).
Mike explains the mathematics behind one of the most important pieces of computer security. (Simplified version with colour mixing analogy linked below)
Secret key Exchange (Colour Mixing) Video: • Secret Key Exchange (D...
/ computerphile
/ computer_phile
This video was filmed and edited by Sean Riley.
Computer Science at the University of Nottingham: bit.ly/nottscomputer
Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com
I think the fact that addition, multiplication, and exponentiation maintain their properties under modulo arithmetics was worth mentioning.
could you please link an explanation to this? i've been trying to understand how g^a and g^b were made public when in actuality it was those numbers _mod n_ that were made public.
also don't understand how you can calculate (g^a)b mod n without first having g^a, which again, wasn't made public because of the modulo term
@@znefas I never saw an explanation myself, but consider the following: imagine we are working in base N, and not binary or decimal. Then, modN just keeps the last digit.
So, since we know that it's possible to start by calculating the last digit when doing long addition/multiplication, as the carry only travels left, we can discard the left digits at any intermediate steps and still have the same result as if we only did it at the end.
Signed addition and negative powers (division) break this, but wrapping subtraction of unsigned integers actually survives afaik.
This is the same reason why we can know the last digit of 7^44444 will be 1, without having to calculate the whole number.
@@qm3ster the only thing i understood is how the carry only traveling left may be useful when trying to calculate a number smaller than the entire expression; i don't understand how the whole mod N thing works
i also hate modular arithmetic notation because "13 = 5 mod 8" is so much more annoying than "13 % 8 = 5" for me as a programmer
@@znefas for all unsigned integers a, b, c, n:
(a + b)%n == (a%n + b%n)%n
(a * b)%n == ((a%n)*(b%n))%n
((a ^ b) ^c)%n == (((a%n)^b)%n)^c)%n
As long as at the end you were dropping the stuff left of n, the result won't be changed by dropping it earlier, as it cannot affect what is on the right.
The intermediate values will in fact be different, which is what makes this a computational saving.
But outside cryptography and adjacent subjects like checksumming you usually want the most significant bits, not the least significant, they're called that for a reason, so this doesn't get used *too* often.
@@znefas g^a and g^b wasn't made public, only g^a%n and g^b%n - Mike got a bit sloppy when presenting (just as we do when we write down the maths). If we knew g^a, and since we know g, it would be trivial to work out a (and similarly for b).
I just took a crypto class in university and my professor got so wrapped up in the specifics of group theory that I didn't even understand how Diffie-Hellman worked, but this made everything so clear without really being any less mathematical.
Alice and Bob only communicate in Pub. Alice and Bob are probably undergrads.
love the way this guy explains.. never imagined i could binge watch cryptography videos
5:02 THIS! IS! CRYPTOGRAPHY! *_*kicks you down a 4096-bit deep well*_*
It’s amazing how beautifully elegant and simple cryptography really is.
I actually understood the math for once!
yeah, modular arithmetics is easy. :D ...
And theeeeres the party pooper.
Didn't work for me a=11 b=13 g=7 n=199
The math is easier to understand, because it's more technical.
Jyoti Das lol.
Great clock analogy. This should be the main video
A Parker clock, one might say!
It should not, as they used the colour analogy there.
The clock is not strictly an analogy, it's a common learning technique for modular arithmetic
It explains why n needs to be a big value.
The clock analogy I think is one of the best ways to explain modulo arithmetic...
Came in expecting some incomprehensible esoteric math but this was very easy to understand. Reminds me of that Doctor Strange quote "It's a simple spell but quite unbreakable"
If you watched Attack on Titan ,one of the character used. Only his words and hacked the entire cosmos to make it seem real at that moment
awesome video, I see some comments point out its actually (((g^a) mod n) ^ g)) mod n, so here is why its the same as (g^a)^b mod n:
lets say:
g^a = k*n+x
where x is an unknown number
because: (u + v) mod n =(u mod n + v mod n) mod n
and because: k*n mod n = 0
we get:
(g^a) mod n = (k*n+x) mod n = x
((g^a) mod n)^b mod n = x^b mod n
i used the binomial theorem nCr to get:
(g^a)^b = (k*n+x)^b = (k*n)^b + (nC1)*(k*n)^(b-1)*x + ... + (nC(b-1))*( (k*n)^(1)*x^(b-1) + x^b
since n appears in every part except in x^b every other part is set to 0 when we take the modulo:
(g^a)^b mod n = x^b mod n
so therefore (g^a)^b mod n = ((g^a) mod n)^b mod n
Yup.
Thanks for this, felt like a logical leap without this proof.
At the beginning there, did you put a g, where you meant a b?
@@nelsblair2667
He did, yes.
Interesting proof. But a few corrections:
First what you are trying to prove is that (g^a)^b mod n == (g^a mod n)^b
You made a mistake, in (((g^a) mod n) ^ g)) mod n . You put an extra "g", it's a "b"
Also to simplify...or clarify your proof, and make it more understandable, organized:
1) supposing: g^a = n*k+x
( *WHERE "x" IS NOT A MULTIPLE OF "n"* , in other words "x" is the "remainder" of the division by "n" )
This is an important purposeful "setting up" so that: n*k mod n == 0 and x mod n == x
2) mod n of any number which is a multiple of n will equal zero --> n*k mod n == 0
3) Proof that: (g^a mod n)^b == x^b
a) Which would mean --> (g^a mod n) == x
a) substitute "n*k+x" for "g^a", from step "1"
(n*k + x) mod n == x --> given that: n*k mod n == 0 (any multiple of "n" will be zero)
b) If (n*k + x) mod n == x, then (g^a mod n) == x,
c) meaning --> (g^a mod n)^b == x^b
4) (g^a)^b mod n
a) substitute (n*k+x) for g^a, given in step "1" --> (n*k+x)^b mod n
b) Expanding (n*k+x)^b, using binomial theorem,
*IMPORTANT MAGIC* : *all terms will be multiples of "n"* except the last term which is x^b
c) Mod n of all terms that are multiples of "n" will be zero, leaving only the last term: x^b
d) Thus (g^a)^b mod n = x^b
5) (g^a)^b mod n == (g^a mod n)^b , as both equal to x^b
.
Wow. I've always thought D-H was some esoteric math magic. It is extremely clever and elegant.
There is quite some math behind it about rings, fields and other group theory. But the plain formulas in crypto are pretty nice and easy. The computation is quite a mess as numbers get pretty big, often even bigger than your memory and power and multiply are not the chips best friend.
I find there are a lot of these types of algorithms; Radix Sort comes to mind... It seems like voodoo, then you look at the algorithm, and probably at first think 'that can't possibly work...?' but viola...
@@MrHaggyythat's still somewhat elegant, square-multiply-modulo is a genius and simple solution to easily calculate such an otherwise gargantuant number. I find the truly hard (and amazing) part is the theory behind which numbers are safe to use, even more fascinatingly and terrifyingly genius are the ways people find to crack it
@@justalonelypoteto square-multiply is a genius algorithm as it has view operations,. Lattice and Russian Peasant are awesome too. A lot more operations but most of them are cheap additions that can be done in parallel. My interest is hard-realtime systems. We want algorithms with fixed calculation times. Square-multiply can vary depending on the numbers. And on cheap and small hardware you might be able to eavedrop with an oscilloscope.
@@MrHaggyy Montgomery's ladder might be up your alley, it performs both operations each time with the only distinguishable difference being a possible timing variation because the needed operation affects which memory address is read, apparently this is fixed by scattering the data around to ensure cache misses
Wow, that is way more simple than I was expecting.
Wouldn't a mod n only return values from 0 to _n_-1? You wrote _n_ on the clock.
Correct that was an error on their part since of course if a divides by n exactly the remainder is 0 not n.
Technically you could have n on the clock but you'd have to omit 0. You can have n or you can have 0, but not both. Having n on the clock would actually be more like counting units of 1 up to n and then starting back at 1 again for the next unit until you run out of units. Same result (specifically in this case because the number itself doesn't matter), but a different perspective on it.
Having a 0 instead of the n is very useful when you work with such a construct though. This clock face is what mathematicians call a cyclic group, and its "neutral element" is 0 (or n, they are the same in the cyclical group). That means that (a+0)mod n=a mod n, so the 0 doesn't change an element by being added.
That is very obvious if you actually write it as a 0 instead of an n.
And multiply by zero also give nice properties. :-)
But clocks are 1-based.
The idea of diffie hellman is actually really clever. Its actually pretty hard to come up with a way to do that, and not have anyone be able to reverse how they did it or needing to share something that can be reversed. I can see why we still use this today it would be very difficult to replace it, and its very ingenious how it works and its not incredibly complicated either.
Lmao everyone calling him out for not stopping at n-1. Give the dude a break!
☞ nope.avi
they need to feel smart
If you're not familiar with what he's explaining, this is enough to confuse you and this is not the only thing he says that can create confusion.
he kinda of implicitly stopped at n-1 when he mentioned there is the element 0 in mod operations
😂
For some reason I understand all his videos but not many of the other tutors on Comphile.. Quite a teacher
You are why I am confident I won't fail my Safety and Security module. Mike is exceptional at communicating these concepts!
This is the most interesting channel about computer science. I’ll never be as much grateful as I would like to be. Thanks a lot for sharing these.
All these years of hearing how "diffie helman is extremely complex" yet such a simple and easily understandable algorithm. Thank you so much for this!
I love these math versions. They help me understand it so much better than just the computer science bits. Please do more of these!
It looks really simple, but it unfortunately doesn't explain whether "(g^a mod n)^b" and "(g^a)^b mod n" are the same. I mean obviously it must be the same for it to work.
Fabian Neundorf (g^a mod n)^b mod n is the same as (g^a)^b mod n, I believe he also mentioned it briefly in the video.
The simple answer as to why they're the same is that Modulo arithmetic=Finite fields, and Finite fields are consistent. Proving that finite fields are consistent would probably be another couple videos...
Short answer: It's because the product in modulo/clock arithmetic is defined that way. If you wanna know more I'll explain it in more detail:
Maybe it's easier if you think of g^a mod n just as a number between 0 and n-1 (actually if we use this notation we should start from 1 and go up to n-1, but it's not extremely important so let's forget about it). Think of the multiplication between two such numbers as a standard multiplication, but if the result is bigger than n-1 we take the remainder.
So basically the product is just:
(a mod n)*(b mod n) = a*b mod n (THIS IS THE DEFINITION OF PRODUCT IN MODULO ARITHMETIC)
Elevating g to the power a is just repeated multiplication, so what you do is apply the definition of product:
(g mod n)^a = (g mod n)*(g mod n)*...*(g mod n) = g*g*g*...*g mod n
You can easly see that we get a similar thing if we elevate g^a to the power b:
(g^a mod n)*(g^a mod n)*...*(g^a mod n) = (g^a)*(g^a)*...*(g^a) mod n = (g^a)^b mod n
So yeah, (g^a mod n)^b = (g^a)^b mod n
Hope it's clear now ;)
The "mod n" isn't an operation in this case. Appending "mod n" to a term just signals that you're calculating in clock arithmetic, meaning you always stay between 0 and n-1. If you do 2+3 in "mod 4", the answer is 5, or 1, or even 9, because in "mod 4" 1 is equal to 5 by definition. Calculating a modulo is not an operation that needs to be performed if you define yourself to be working in clock arithmetic
Actually g^(ab) mod n = (g^a mod n)^ b mod n, but yes, it is the same. The proof is just long and kinda gross which is why he didn't get into it. Here it is if you want to see it though.
Proof:
Any number m mod n is defined as the remainder left after dividing m by n, so we have m=q*n+r where q is the whole number of times n goes into m, and m mod n=r, which is the remainder.
So g^a=q_a*n+r_a, and g^a mod n=r_a. (I'm using the underscore to denote a subscript.)
(g^a mod n)^b=r_a^b=q_1*n+r_1, and (g^a mod n)^b mod n=r_1.
g^(a*b)=q_(ab)*n+r_(ab), and g^(a*b) mod n =r_(ab).
But we also know g^(a*b)=(g^a)^b, and g^a=q_a*n+r_a, so g^(a*b)=(q_a*n+r_a)^b.
Expanding that out into a sum we get
(q_a*n+r_a)^b=sum((q_a*n)^(b-j)*r_a^j,j=0..b)
The last term in this sum is r_a^b and every other term has at least one factor of n, so we can rewrite it as
(q_a*n+r_a)^b=n*sum(q_a^(b-j)*n^(b-j-1)*r_a^j , j=0..b-1)+r_a^b.
Now let q_2=sum(q_a^(b-j)*n^(b-j-1)*r_a^j , j=0..b-1), so we have
(q_a*n+r_a)^b=q_2*n+r_a^b.
Remember that this is equal to g^(a*b). So we have
g^(a*b)=q_2*n+r_a^b.
We already have r_a^b=q_1*n+r_1, so we have
g^(a*b)=(q_1+q_2)*n+r_1
and g^(a*b) mod n = r_1.
We already had that (g^a mod n)^b mod n = r_1 so we have now proved that
g^(a*b) mod n = (g^a mod n)^b mod n.
Hi Mike, Alice or Bob never shared "g to the power of a or b" but they shared "g to the power of a (b) mod n". But as we continued the example we said they shared the power of a or b. Can you please help explain that?
For anyone who got confused with this, we start with the left-hand side of the equation:
(g^a mod p)^b mod p
= (g^ab mod p) mod p [by the property of modular exponentiation]
Now, we move on to the right-hand side of the equation:
(g^b mod p)^a mod p
= (g^ba mod p) mod p [by the property of modular exponentiation]
Since (g^ab mod p) mod p = (g^ba mod p) mod p, we can say that:
(g^a mod p)^b mod p = (g^b mod p)^a mod p
Hope this helps.
The simplicity of this algorithm is so amazing!
It would be awesome if you guys did some podcasts. I learn so much from you guys and it would be great to listen while in the car or at school. Thanks guys.
I thought "solving the discrete log problem" was what happens when you clog the toilet at your in-laws' house?
...Sorry, I'll show myself out.
...yeah you should really log off
You didn’t get the video.... tbh he actually did explain it well
@@ivanarabome4172 and you didn't get the joke
@Jesse Duncan yup, been watching on instaflixxer for since november myself =)
Nice
This is the best explanation I've ever found about it. Incredible concise and very illustrative. Thank you guys.
Great explanations folks, keep pumping these videos out!
This made so much sense and was super easy to grasp. Mike pound really is a great teacher. Also props to Diffie Hellman for creating such an easy but quite unbreakable algorithm.
Fantastic explanation of the mathematics behind this encryption!
Thanks for explaining, you do such a great job. Always my favourite computerphile presenter.
The whole time I am thinking about the clock-drawing test, after seeing those clockface numbers beeing squished together.
PS. I find this video even simpler to understand than the original "simple" version with the food colouring.
Same. With the color one I was like "what do you mean I can't figure out what the other color was? We just subtract the public color from it!"
The actual process is more difficult as instead of adding, it's multiplication to the power of another colour. Imagine Yellow^red x blue.
Me too. It's so easy to understand math when it doesn't have weird symbols that I have no idea what the heck they mean.
Yes, the thing with math is that it is a language which you need to learn. Once you do, the concepts behind the maths become not that complicated, when you know them. Implementation could still be complicated though. Devils in the details...
The clockface only goes up to n-1, (n mod n = 0). He forgot to change that too when he added the zero afterwards
Thomas Wigele Yeah I came down to point it out as well :)
not only that, but the primality of g is completely irrelevant - the n needs to be prime if you want the n in the 2 to 4k range (and DH to remain secure), best if n is actually a safe prime (then g can be any number but 0, 1 and n-1) - a prime for which (n-1)/2 is also prime
666Tomato666 I don't get why n would have to be prime at all?
+Π. Καράπας This way you are 100% sure that the remainder is in [0, n-1].
Александър Савев but doesn't modulo always return something in the range of [0,n-1]?
This is the video I've been waiting for!
Thanks Guys, awesome job!
It's interesting that he doesn't mention that n should be prime in addition of being large. That is an aspect equally as important as its size which could easily break the encryption (if n isn't prime). There was even a defcon panel (2016 I think) that discussed this very problem and how it is often ignored.
In the previous video regarding this topic, with the color mixing, he mentions that n is a prime number.
I really was hoping he would explain why n has to be prime. I'm thinking the reason is that the search space of [numbers] mod n would be much smaller than n if it weren't a prime number, but that's just my intuition and I'm not sure if I'm right.
@@tissuepaper9962 The goal is to have the denominator and numerator being coprime i.e. their greatest common divisor is 1 as otherwise, you'd end up getting 0 as the result. The only way to guarantee that is with a prime number which is naturally coprime to any number.
Watched both videos. Beautifully explained.
It's beautiful how simple and elegant it is.
thank you for putting this video up. Such a good teacher and elegant explanation of public private key at its core. There are 100s of videos super complicated.. this is very nice, one can implement themselves right off the this video..
Please do a video on Elliptic-curve cryptography and Elliptic-curve Diffie-Hellman , pretty please :D
That really is a beautiful solution to exchanging keys. I really love that. Fantastic video as always!
beautiful and simplified explanation
Thank you so very much for these videos. I find them invaluable in understanding the ins and outs of cryptography.
After watching all three videos, its much easier to understand and glad you used 'n' here that you didn't use in the last video with colors. This was an amazing series but if you label them as part of a series or make 1 full video combining all three or the first and last video, more viewers might decide to stick to the end. Anyway, Thanks a lot for this.
Amazing explanation with the circle!
I really loved this vid. Great work guys!
This is a really nice explanation and thank you for talking about it. I wish you talked a bit about man in the middle weakness and the ElGamal improvement.
beautifully explained!!! I love this channel. Thank you!
Yeah! Best explanation on key exchange ever. Thanks to both videos I understood it instantaneously. A test on paper and mental arithmetic worked like a charm.
Thanks :-)
Your videos are excellent. Thank you very much guys.
Absolutely amazing explanation! Keep it up
I like the videos with that guy, he knows his stuff and knows how to explain it!
It shouldn't be 'n' on the clock but 'n-1' because 'n'==0 in modolo
I'm not the only one who is botherd by this ^^
This is very true. Easy mistake to make, though.
same thought here - was about to write a comment.
There are 3 hard things in programming:
1 off by one errors and
2 naming things
I was about to write the same xd
great video, just to clarify modulo math, ((g^a)^b) mod n = ( ( (g^a) mod n )^b )mod n,
by using this we can generate same value from both side
This is the most important fact that makes this explanation work.
Wonderful explanation, thank you
Thank you so much, this really helped me get to grips with SSH!
Excellent explanation. Thanks a million!!
Superb explanation!
Excellent explanation!👍
I can’t believe people come up with this kind of stuff 😂
Such an amazing teacher he is! 🤩
Awesome explanation, thanks!
g must be the generator of cyclic group of n (the set 0..n-1) by repeatedly applying the group operation (exponent, modulo)
So simple yet so beautiful!!
Very well explained video.
This video is a paradox, in which your head embodies the problem and the solution at the same time.
Is it weird that I found this easier to understand than the one with the colors?
The clock-face analogy is really powerful as it really helps explain why it is not very reversible. "Undoing" g^a%n is effectively going backwards in steps of n and checking to see if g factors into this number an integer number of times. Since a is generally (very) large, we'd have to go backwards an awful lot of steps. There can be more efficient ways of doing it, but conceptually it's the same.
Note 2048 bits --> N < 2^2048 ~ 10^616. The world's best supercomputer has a power of ~1exaflop or 10^18 floating point calculations per second. Assuming every attempt could be achieved in 1 operation, that would take 10^598 sec = *10^591 years*
I always find it difficult to understand how "difficult" something is when we say p is a really big number, until it is put into a time-to-calculate number.
Great explanation!
Finally a decent video on computerphile... all of them should be AT LEAST this caliber instead of resorting to oversimplified (and hence incorrect) analogies.
Awesome explanation. Thanks
thanks for making this so simple to understand!
great explanation!
Just to caveat, in modulo we deal with numbers from 0 to n-1. We exclude n since the remainder can't be equal n - it must be less than n, otherwise when we're dividing by n and we see that there's a remainder n it means that n could fit one more time in that number leaving us with the remainder 0.
Another thing that I noticed (but doesn't matter), is that actually, Bob can't compute (g^a)^b mod n, because he has to calculate (g^a mod n)^b mod n. The beauty of modulo arithmetic is that the result is the same.
Thank you, this explained the matter (at least to me) a lot better than wikipedia.
please do more mathematical videos like this.
Fantastic interesting video, thank you for uploadinf
Great video honestly!!
Somehow this is the first time I've seen the clock representation of modulo. Really helpful to conceptualize when you're rushing through a problem!
Really? I've never seen it explained without this clock representation.
i really love you man, great explanations =)
The missing part is why (g^a mod n) is fast to compute even though the reverse is not.
Tl;dr; g^a can be computed by starting with x_0=1 and and defining x_1=a_0^2 or x_1=x_0^2 * g and then doing the same for x_2 and so on. Making the right choices at each step (which is basically reading out the bits of a) you get g^a in at most log2(a)+1 steps.
This video is amazing thanks!
Great video!
This is extremely elegant.
N-1. Great video. Thank you!
Technical question. Does the value of a, b, g, n are permanent, or they get changed periodically.
g needs to be a primitive root of n for this to work.
Is that true? After thinking about this (admittedly only for a few minutes), it is obvious to me that for a given size of n, selecting g to be a primitive root is the most secure because you'll get the maximal cycle length of n-1. But choosing a non-(primitive root) doesn't seem to actually break the algorithm.
@@trogdorstrngbd yes is doesn't break the algorithm but you will need to exhaust fewer values
Q:
If you had a 100 petaflop supercomputer and you discovered some workarounds in how diffie hellman key exchange can be sped up, say 5x, how long would it take on average to discover the private key given the size of 4096 bit RSA? You can make some assumptions about how many operations it would take to compute one key pair.
4:05 - To be more accurate, it's the same as long as g, a, and b are all positive. Throw in negative numbers and it doesn't always hold.
I'm pretty sure a and b (the private keys) should be between 2 and n-2 inclusive, as a^1 (mod n) is just the value of a itself (mod n), and a^n-1 (mod n) will be 1 by Fermat's Little Theorem, so this would be easy to spot for an attacker.
Impressive is the right word! Outstanding really, thanks!
The Mike Pound Show
Very well done!
Yeah but you never talked about if (g^{a} mod n)^{b} is the same as (g^{b} mod n)^{a}. As in, does g^{a^{b}} = g^{b^{a}} hold in the modulo field.
Yeah, I wanted to hear that too, but considering the technique wouldn't work otherwise, it probably is. It also kind of makes intuitive sense in the terms of significant bits. Modulo tosses away all most-significant bits, and only keeps the least significant bits. Multiplication will push the least significant bits into the most significant bits (two 32 bit numbers multiplied for example, gives a 64 bit number). So tossing away those most significant bits (which would've been moved to even higher significant bits) shouldn't matter.
The modulo just takes the remainder of the division: being the multiplication commutative, g^(ab) = g^(ba), you'll get the same number. You can then do modulo n, but the number will be the same anyways.
Let's say g^{a} = X + Y, where X is divisible by n and Y < n (therefore g^{a} mod n = Y). Then
g^{ab} is (X^b + c1*X{b-1}Y + c2*X{b-2}Y^{2} + ... + Y^b). Since X is a multiple of n, all but the last term will be zero mod n, so we are left with Y^b mod n = ( g^{a} mod n)^b mod n after we take the modulo. Therefore g^{ab} mod n = g^{a} mod n)^b mod n. I hope that's clear enough.
This is why he put this in the Maths video and not the colour video. As you should know this basic thing to know the theory. And he mentioned it in the video, very short.
Very well explained.
It would be useful to explain the shortcomings of Diffie-Hellman, and why we have switched to Elliptic Curve Diffie Hellman (whose math is more complex)
At 1:51 you don't need the n as you have 0. n (mod n) = 0, so the clock would only go from 0, to n-1
I was wondering why Diffie-Hellman hasn't been deprecated in favor of a better algorithm since it's almost 50 years old now. Now I know why. You could always increase the bit count of N to compensate for faster computers and still get a shared secret in as low as 2 messages. Neat!
I imagine there are not enoguh cases to make a whole new channel
but as a maths fan and a programmer I really did enjoy... I actually I understood it a little better than my cryptography profesor explanation
Can you (Dr. Mike Pound) please do a video on Linux security, and its flaws for being open-source?
AGAIN : YOU ARE THE BEST !