SHA: Secure Hashing Algorithm - Computerphile
- Added 12 June 2024
- Secure Hashing Algorithm (SHA1) explained. Dr Mike Pound explains how files are used to generate seemingly random hash strings.
EXTRA BITS: • EXTRA BITS - SHA1 Prob...
Tom Scott on Hash Algorithms: • Hashing Algorithms and...
This video was filmed and edited by Sean Riley.
Computer Science at the University of Nottingham: bit.ly/nottscomputer
Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com
Mike Pound is by far my favorite person on this channel... he has the most interesting subjects, shines with crazy knowledge while still keeping the video fresh and dynamic.
I like him and his topics too, though the AI topics are interesting and the person explaining them is good too
he has great body language, tries to use it as much as possible
And a fair looker.
And the same accent as the 11th Doctor (Matt Smith)! :-D Where is that accent from?
Absolutely agree, Tom Scott is my second favourite, that guy is hilarious
This is too much work, can’t we just trust each other?
That, my friend, is the real problem
How can I trust other people when I can't even trust myself
@Mohamed Seid GodisGood666!
Don't trust, verify
No Way!!!
I could sit and watch videos from this guy all day long, so informative and laid back
wrg
Love how these videos get STRAIGHT to the point.
Been watching a whole bunch of Mike's videos as a complement to my introductory module on Security and Authentication. One of the best teachers I have come across!
I'd been trying to understand the concept for 3 days from the slides my teacher covered and the book she shared, and only ended up more confused; this video gave me a clear understanding in 10 minutes. Great job!
Mike Pound is the best! I love hearing him explain things - keep em coming!
Roses are red
Violets are blue
Unexpected { on line 32
coding joke
A poetic compiler? I like that idea
Unresolved external symbol
Felt that on a spiritual level
Violets are blue
Roses are red
Your code isn't thread-safe
Use locks instead
This is my favorite guy on this channel. I just love stuff like this.
I am at a hackathon in Chicago, Illinois, at the Illinois Institute of Technology, and I have to use SHA-1 on some facts before I pass them to an API so I can make a project for the hackathon. You did a wonderful job telling me what SHA-1 was so I could understand the cryptic API documentation. Thank you very much.
Thanks, Dr Pound (if you read this). I find your demeanour easy to engage with, and you set me off on the journey of understanding fully (with much work!).
My dealer need this.
😂😂😂😂😂
😆
🤣
I've always loved your videos and now I study computer science and can watch your videos for studying, it's amazing
Hmm, so far this is fairly straightforward, but the interesting part would be how exactly these compression functions work. Will there be a follow-up video on that?
In essence, it generates 80 32-bit words derived from the bits of the message block; then, in each round, the state goes through left circular shifts (rotations), some XORs, some bitwise ANDs, addition with the round word and round constant, and then a permutation of the state variables
@@liljuan206 thanks, this really helped clearing things up
He isn't describing compression, he is describing hashing, which is not the same thing as encryption. Hashing is what SHA does (notice the S stands for Secure).
@@liljuan206 how do they make it so it can't be reversed?
In essence, SHA-2 uses 6 primary functions: Choice and Majority, and S0, S1, E0, and E1, all of which move and permute bits around during compression
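For anyone who wants to see the "washing machine" of SHA-1 spelled out, here is a minimal educational sketch in Python. It follows the public SHA-1 description (message schedule of 80 words, 80 rounds, left rotations, then adding the result back into the state). The function names `rotl`, `sha1_compress`, and `sha1` are illustrative choices, not anything from the video, and SHA-1 itself should not be used where collision resistance matters.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF  # keep every value within 32 bits


def rotl(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK


def sha1_compress(state: tuple, block: bytes) -> tuple:
    """Mix one 64-byte block into the five-word state (a, b, c, d, e)."""
    # Message schedule: 16 words from the block, expanded to 80 words.
    w = list(struct.unpack(">16I", block))
    for t in range(16, 80):
        w.append(rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))

    a, b, c, d, e = state
    for t in range(80):
        if t < 20:
            f, k = (b & c) | (~b & d), 0x5A827999            # "choice"
        elif t < 40:
            f, k = b ^ c ^ d, 0x6ED9EBA1                     # parity
        elif t < 60:
            f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC   # "majority"
        else:
            f, k = b ^ c ^ d, 0xCA62C1D6                     # parity
        temp = (rotl(a, 5) + f + e + k + w[t]) & MASK
        a, b, c, d, e = temp, a, rotl(b, 30), c, d

    # Add the scrambled values back into the incoming state.
    return tuple((s + v) & MASK for s, v in zip(state, (a, b, c, d, e)))


def sha1(message: bytes) -> str:
    """Pad the message, then run the compression function block by block."""
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    padded += struct.pack(">Q", len(message) * 8)  # 64-bit length in bits

    state = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)
    for i in range(0, len(padded), 64):
        state = sha1_compress(state, padded[i:i + 64])
    return "".join(f"{h:08x}" for h in state)


print(sha1(b"abc"))  # a9993e364706816aba3e25717850c26c9cd0d89d
```

The result can be cross-checked against `hashlib.sha1(...).hexdigest()` from the standard library, which is what you would use in practice.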
Note to self: Don't use a regular monitor as a touch screen
It's a university Flatron monitor, probably expendable.
I love this channel so much...
I love these videos when Dr. Mike Pound is in them.
The washing machine example really helped seal in this topic I was trying to understand and helped me on my final project. Thank you!!!
Mike, you are my favourite person to appear on this channel. I enjoy your clear explanations and like the quite recent topics like Google Deep Dream, Dijkstra and so on.
Would you please explain the workings of the "washing machine"? ;-) I.e. the compression functions?
Thanks. I'll give this snippet a look. :-)
SHA Hashing Algorithm?
Secure Hashing Algorithm Hashing Algorithm
ATM Machine
RAS Syndrome
LAN Network
GNU's Not Unix...wait a minute
LCD Display
Dr Mike Pound is the best! More videos with him please
I always wondered how these things work. Great video
Thought I was following until 9:35
He describes a way of padding that will produce the same padding string for messages with the same length - then says it's important that messages with the same length don't have the same padding string. Did something important end up on the editing room floor?
I'll check with Mike but I think it was just a slip of the tongue - ie The padding would be the same for messages of the same length but the messages would be different if they are different >Sean
No, "0010110" padded would be "0010110100000...", but "001011000" would be "001011000100000...", so the 1 (first bit of padding) would be later.
+Mat2095 He obviously meant if you just pad them with zeros.
How does the padding work if a block is 511 bits long?
aullik Considering almost all real-world data is stored as a stream of bytes (8-bit values), that's incredibly unlikely to ever come up.
It could be 504 bits, but 511 is highly improbable.
If your padding has to add at least 8 bits (one byte), then the thing he described works fine.
Remember working with individual bits is almost unheard of in computing.
If you have to store individual bits for storage efficiency, you pack them into bytes.
(Similarly, if you store 7-bit values, you either store them in 8 bits and ignore a bit, or you pack them such that you store, say, 56-bit blocks: 7 x 8, e.g. 8 sets of 7 bits stored in 7 bytes.)
aullik: Exactly the question that came to my mind too :-) Since there aren't necessarily enough bits left in the block to include the length of the actual message.
You could add another block of 512 bits to the end to make it work.
+KuraIthys
Going with bytes, the longest message that could still be padded would be 496 bits long. 504 wouldn't work as you'd only have 8 bits left but 504 in binary is already 9 bits long.
+Kuralthys
I know that we usually work with bytes. But even if we say we have 512 - 8 = 504 bits, we then add one '1' bit to start the padding and now we only have 7 bits left. The message is 504 bits long, but 7 bits can only hold values up to 127.
The only answer is that we expand to 1024 bits. But the question would be how we expand. What is the "syntax", for lack of a better word?
Thank you so much. I had a hard time finding someone to explain it well
What I want to know, for no particular reason, is if there are cases where a hash of a hash equals itself, of course sticking with one particular algorithm and hash length.
pound for pound Mike pound is the best narrator on computerphile
Re watched it at least 10 times. Thank you for this explanation
Love the Schildt on your wall!
How do you know the "1000000..." padding bits are for padding purposes, and not part of the actual data/plaintext itself?
I kinda want to make my own hashing algorithm now. It wouldn't be very good, it would just be some random jostling around of bits until it looks weird.
Can you talk about the colliding prefix issue? As I understand it once I find a collision with a file, I can continue to create collisions by appending the same thing to both files, and some how this allows me to create two meaningful files each with the same hash value where one might expect that any collision which might be found would be obviously fake because it would have to be made up of a bunch of random bits.
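The "append the same thing to both files" part can be illustrated with a toy model. The sketch below is NOT SHA-1: it is a deliberately tiny Merkle-Damgard style hash with a 16-bit state (the names `compress`, `toy_hash`, and `find_one_block_collision`, the 4-byte block size, and the 1-byte length field are all made up for the demo) so that a collision can be brute-forced in a fraction of a second. Once two equal-length blocks drive the internal state to the same value, every block processed afterwards sees identical input, so any shared suffix keeps the hashes equal.

```python
import hashlib

BLOCK = 4         # toy block size in bytes
IV = b"\x00\x00"  # toy 16-bit initial state


def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function with a 16-bit output (borrowed from SHA-256)."""
    return hashlib.sha256(state + block).digest()[:2]


def toy_hash(message: bytes) -> bytes:
    """Merkle-Damgard style: pad with 0x80, zeros, and a 1-byte length field."""
    padded = message + b"\x80"
    padded += b"\x00" * ((BLOCK - 1 - len(padded) % BLOCK) % BLOCK)
    padded += bytes([len(message) % 256])
    state = IV
    for i in range(0, len(padded), BLOCK):
        state = compress(state, padded[i:i + BLOCK])
    return state


def find_one_block_collision():
    """Brute-force two different blocks that lead to the same internal state."""
    seen = {}
    for n in range(2 ** 17):  # more inputs than possible 16-bit states
        block = n.to_bytes(BLOCK, "big")
        state = compress(IV, block)
        if state in seen:
            return seen[state], block
        seen[state] = block
    raise RuntimeError("no collision found")


a, b = find_one_block_collision()
suffix = b" plus the same tail on both files"
print(a != b, toy_hash(a) == toy_hash(b))
print(toy_hash(a + suffix) == toy_hash(b + suffix))
```

How real attacks arrange for the two colliding files to both look meaningful is a separate question that this toy does not address.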
Excellent as usual, good learning resource
I would love to see a video about the compression function! :)
The video's shots are like Modern Family, and that makes me happy! Also the information, so thanks!
never been this early for a computerphile, dope
Thank you very much for this video :) It was very helpful and educational!
Good job! Your videos are excellent.
Thank you! Made hashing much clearer for me now :)
Loved the washing machine demonstration!
You explained everything except for the part that actually matters. :(
You may as well have said, sha works by shaing things.
Exactly my thought :/
That they explain complicated things in an easier to understand manner. Sorta like every other video they make.
Ah, I see now...it's a washing machine with some knobs that does the sha'ing.
The compression function of SHA is where it gets quite complicated, and I don't think it would've fit into the scope of one video, as explaining it to someone with no prior knowledge isn't trivial, there's quite a bit of complicated math involved, and very few people actually understand the details of it.
YES exactly this..
What's amazing is the Tom Scott "rocket" animation didn't show up on a video from Dr. Pound
Another video explaining SHA-256 would be awesome.
It would be amazing to have a video on how you can get tracked, for example by IP, MAC, canvas, HD serial number, etc.
Thanks for your great work!
keeps me engaged great explanation
Elegant explanation. Thank you, Thank you, Thank you 😊👍
Oh nice, string hashing via SHA1 is something I've been interested in.
What happens if your message is, say, 509 bits in length? How do you pad it if the length won't fit?
Excellent, finally a video with subtitles :)
Superb video! Understood it even better with a lefty teaching me ;)
easy-going video which explains just enough about SHA algo to keep it simple. The details are better learnt once you "get" the basic idea.
I feel like a genius learning everything here!
Nice! Could you make a video about post-quantum cryptography please? It will be a great opportunity to learn more about this stuff
That 011001011 he wrote down is actually the start of the SHA hash value for "abd". I wonder if that was intentional, because the odds of that happening randomly are less than one percent.
Isn't padding used even if the message is already a multiple of 512 bits, to avoid attacks?
Since SHA is deterministic, even though it is non-reversible, it is still possible to guess the hashes of some reasonably short messages. For example, string 'abc' ALWAYS produces ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad. If I have a large enough database plus computational power, I could probably guess some short messages, although not the entire novel.
That's exactly how most cracking is done. Hashed database against hashed database lol
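A minimal sketch of that lookup idea in Python, using the standard `hashlib` module. The candidate list and the helper name `build_lookup` are made up for illustration; real cracking tools use far larger word lists, and salting (not shown here) is the usual defence against precomputed tables.

```python
import hashlib


def build_lookup(candidates):
    """Map SHA-256 hex digest -> original string for every candidate."""
    return {hashlib.sha256(c.encode()).hexdigest(): c for c in candidates}


# Hypothetical tiny "database" of guesses.
table = build_lookup(["abc", "password", "letmein"])

target = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
print(table.get(target))  # abc
```

The hash is never reversed here; the attacker only hashes guesses forward and compares.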
The thumbnail made me think "OSHA" with the O as Dr Pound's head.
It'd be amazing to see Dr.Pound reviewing some books from his collection. Get to know his technical interests apart from image analysis.
Love these videos.
9:49 captions about Merkle-Damgard Construction are hilarious
3:17 And the reasons why the NSA came out with SHA-1 to replace the earlier SHA-0 (or just plain “SHA”) were not revealed publicly. But the weaknesses in the original SHA were discovered independently a few years later. This was part of a sequence of evidence indicating that the gap between public, unclassified crypto technology and what the NSA has was narrowing, and may not be significant any more.
I think it's widening. Look at Pegasus: with Pegasus 2.0 you only need a phone number to target a victim.
And Pegasus is a joint project between Israel and the USA. Imagine what the NSA would have kept to themselves.
It is a common understanding in the computer security field that if the government wants you, they have you.
What if the message is only a few bits shy of a block, not enough room for padding bits as described?
If there's less than 65 bits of space left in the final block for padding, you just pad toward an extra block. For example if your message is 480 bits, you add a one-bit, 479 zero-bits, and the 64-bit length, giving total length 1024 bits = 2 blocks.
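The rule above can be sketched in a few lines of Python. This version works on strings of '0'/'1' characters so that odd bit lengths like 511 are easy to try; the function name `sha1_pad_bits` is an illustrative choice, and real implementations work on bytes instead.

```python
def sha1_pad_bits(message_bits: str) -> str:
    """Pad a bit string the way SHA-1 does: a '1', zeros, 64-bit length."""
    length = len(message_bits)
    # Choose the zero count so that length + 1 + zeros + 64 is a multiple of 512.
    zeros = (447 - length) % 512
    return message_bits + "1" + "0" * zeros + format(length, "064b")


print(len(sha1_pad_bits("1" * 480)))  # 1024: the padding spills into a second block
print(len(sha1_pad_bits("1" * 447)))  # 512: everything just fits in one block
print(len(sha1_pad_bits("1" * 511)))  # 1024
print(len(sha1_pad_bits("1" * 512)))  # 1024: a full block still gets padded
```

Because the '1' marker and the length field are always appended, even to messages that already fill a block, the end of the real data can always be located unambiguously.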
Matthijs van Duin thanks
9:40 I didn't quite understand how that padding scheme guarantees that messages with the same size would not share the same padding.
Thanks for the video :-). Maybe someone can help me with this question: what determines the resulting hash? On the one hand it looks totally random, on the other hand it is consistent. Is it a super huge, complex formula, so that it is better to guess randomly than to solve the formula? Or is the NSA the only one who has the formula?
what happens if I feed 511 bits? it's not a multiple for 512 but the space left is too short to save the length
Really interesting videos !
How would the padding work if the final block of the message was long enough that you don't have enough padding room to say the number of bits in the message? So if the final block contained 510 bits you would have to pad in 9 bits (111111110) to say that the message is 510 bits, but you would end up with more than 512 bits.
The length field has a fixed size (64 bits, which is more than sufficient) and is not optional. The number of bits in the 10...0 run is chosen with the length field taken into account, so the padding can spill over into the next block if required.
Thank you computerphile:-)...
So the padding is only denoted by the last one with a trail of zeroes and a length at the end? That is not a prefix and without some other way of indicating that padding is present it is indistinguishable from data.
After a quick google search it appears that the padding is always present so it doesn't need to be a prefix.
This was very informative!
Question: Is there any significance to the initialization constants
h0 = 0x67452301
h1 = 0xEFCDAB89
h2 = 0x98BADCFE
h3 = 0x10325476
h4 = 0xC3D2E1F0
Or are they chosen "randomly"?
Thanks!
No, they could be any numbers. But the cryptographic community is very sceptical of numbers that come out of nowhere.
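For what it's worth, the five values stop looking arbitrary once their bytes are viewed in little-endian order: they simply count up and back down in hex digits. A quick Python sketch to see the pattern (the list name `H` is just for illustration):

```python
import struct

# SHA-1 initial state words, as quoted in the question above.
H = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]

# Pack each 32-bit word little-endian and show the resulting byte order.
patterns = [struct.pack("<I", h).hex() for h in H]
for h, p in zip(H, patterns):
    print(f"{h:#010x} -> {p}")
# 0x67452301 -> 01234567
# 0xefcdab89 -> 89abcdef
# 0x98badcfe -> fedcba98
# 0x10325476 -> 76543210
# 0xc3d2e1f0 -> f0e1d2c3
```

Choosing such an obviously simple pattern is one way to reassure people that the constants were not picked to hide a weakness.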
What would be the padding if the final chunk of message is only 502 - 511 bits?
Some people speak terrible not understandable english, he is one of them. Even whole words were not completely spoken.
Anyone notice the 'hacking' book on the shelf behind?
It doesn't look like anything to me
Hacking: The Art of Exploitation is a great book by Jon Erickson, which teaches you the basics of reverse engineering, code flow, basic C programming, the stack, networks and other things to get you started on binary exploitation. It's a great book, I recommend it to anyone who's willing to invest time in learning how to hack properly.
lol
cyancoyote is knowledge of a programming language required?
cyancoyote Thanks for the reply. I've heard from many people that C is a very hard language to learn though... do you have any recommendations for introductory books on learning assembly?
So basically it's a randomization function that is seeded with the data you give it, right?
The key idea that I got from this video is that hashing is not encryption and there is a difference between the two, although it's easy to confuse them.
I remember when SHA1 was actually still secure, and people could get away with MD5 (although it was starting to be frowned upon). Now I feel old.
Apple once tried to get away with MD4.
5:50 summarised the subject in 1 sentence ;-)
hi, please explain how you get new A B C D E? When you put 512 bits with initial A B C D E, you get new 512 bits, is it right?
Can you do one of these for bcrypt as well?
are the initial values important? any recommended readings on this?
What happens if a message is smaller than 512 bits but long enough for the padding part to not have any space left to store the length of the message?
Then you pad to 1024 bits (including the message length)
Mike is the best
Haven't seen that computer pyjama paper you are writing on in quite a while. Is it still used or is that just redundant stock?
So can two different strings output the same result after going through the hashing function?
awesome awesome awesome great explanation! ty
I've always wondered what those books are. Would someone please tell me the names of the books on the shelf and their authors?
I'm confused: what does "abcde" stand for, and why is the loop done 80 times?
And the text is 512 bits long, right? How do I convert it into H0-H4, which is 160 bits in total?
thanks
Actually, that process involves the XOR function. You can look up online how the abcde is changed into a different abcde; it is pretty interesting.
Isn't it unsafe to have a padding scheme that leads to pre-image collision? E.g., h(msg) = h(pad(msg)).
So a hash function can protect against doctoring a message.
How do you prevent the insertion or deletion of a message in stream of messages? Each can be hashed, but you could create a new message, hash it, send it and its deemed good.
Do you have a secure cryptographic sequence number that can be embedded in any way?
"How do you prevent the insertion or deletion of a message in stream of messages?"
Before sha'ing you just append a shared secret. That way someone intercepting the message en route won't be able to produce a valid hash for an altered message. The recipient verifies the integrity of the message by sha'ing the message with the shared secret appended to it.
"Do you have a secure cryptographic sequence number that can be embedded in any way?"
If you mean some "sequence" number that appears to change randomly from one message to another, yet is known/anticipated by the recipient, then that's basically their shared secret, except it's not static.
However, in this scenario getting out of sync would mean that all the following messages would fail their integrity checks, until some sort of reset. That makes it trivial to do a DoS attack on the protocol/exchange. One common way to counter this is to reset every minute or two, but then the communication would have to be (close to) real-time.
Such a sequence can be any sufficiently random pseudo-random number generator sequence.
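As a hedged sketch of the shared-secret idea: rather than gluing the secret onto the message by hand, the usual standardised construction is HMAC, available in Python's standard `hmac` module. Note this swaps in HMAC-SHA-256 instead of the plain "hash of message plus secret" described above. The helper names `tag_message` and `verify_message`, the 8-byte sequence number, and the example secret are assumptions made for the demo; the sequence number is included so that a dropped, replayed, or reordered message fails verification.

```python
import hashlib
import hmac


def tag_message(secret: bytes, seq: int, message: bytes) -> bytes:
    """Authenticate the sequence number and message with HMAC-SHA-256."""
    data = seq.to_bytes(8, "big") + message
    return hmac.new(secret, data, hashlib.sha256).digest()


def verify_message(secret: bytes, seq: int, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(tag_message(secret, seq, message), tag)


secret = b"example shared secret"  # hypothetical key known to both parties
tag = tag_message(secret, 1, b"pay Bob 10")

print(verify_message(secret, 1, b"pay Bob 10", tag))    # True
print(verify_message(secret, 1, b"pay Bob 1000", tag))  # False: message altered
print(verify_message(secret, 2, b"pay Bob 10", tag))    # False: wrong position in the stream
```

A plain counter is enough here because the secret key, not the unpredictability of the sequence number, is what stops an attacker from forging tags.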
Really great! Thanks a lot
Is it possible to superpose pseudo random number generators to increase the levels of randomness?
Still a bit too confusing for me........ Can you make a video on Hashing VS Encryption? When is what used? If the hash always has less information than the actual file, why would you ever need to hash something in the first place?
Encryption is reversible, hashing is not. With hashing, the receiver only needs confirmation that the data is valid.
One example is password authentication. For security reasons, the server does not store a copy of the user's password; it only stores a hash of the password. When a user tries to log in, the server hashes the submitted password and compares it to the stored hash. Meanwhile, if the database gets breached, people can't use the password hash to find out the original password (other than by brute-forcing it).
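A minimal sketch of that login flow in Python. It uses a random per-user salt and PBKDF2 from the standard library rather than a single bare SHA call, since a slow, salted function makes brute-forcing a leaked database much more expensive. The function names and the iteration count below are assumptions chosen for the example, not a recommendation from the video.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor for this demo; tune for real systems


def hash_password(password: str) -> tuple:
    """Return (salt, digest); the server stores both, never the password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, stored: bytes) -> bool:
    """Hash the login attempt with the stored salt and compare in constant time."""
    attempt = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(attempt, stored)


salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))    # False
```

The salt means two users with the same password get different stored hashes, so one precomputed table cannot be reused across accounts.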
tell me which sha to use when finding duplicate files
You teach this better than my professor
I know you're not 'languagephile', but is there a real reason for nought and zero being so stark in contrast?
Also: if you have a message between 502 and 511 bits (inclusive), the padding would try to tack on 10 extra bits. How is that resolved? (10 bits because of the 1, then the number of bits, which is 9 in length.)
He’s a very knowledgeable guy, what are his qualifications ?
I'm curious about the books on the shelf whose titles I can't read. They are the 4th, 6th,7th, and 11th books from the left. I don't think I care so much about the 12th book from the left. Does anyone know the titles of those books? I think I want those books.
This man has forgotten more about IT security than I will ever learn
At 0:34, my mind went dirty.