Essentials: Brian Kernighan on Associative Arrays - Computerphile

Computerphile

127,618 views


Added on 10 Aug 2017
The 'Swiss Army Knife' of data structures, Professor Brian Kernighan talks about the associative array with beer & pizza.
EXTRA BITS: • EXTRA BITS: Essentials...
"Code" Books: • "Code" Books (Prof Bri...
Many thanks to Microsoft Research UK for their support with the 'Essentials' mini-series.
/ computerphile
/ computer_phile
This video was filmed and edited by Sean Riley.
Computer Science at the University of Nottingham: bit.ly/nottscomputer
Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com

Comments • 332

@ThePandaGuitar 5 years ago ⁺⁷²
Pizza: 10 POUNDS!
Beer: 20 POUNDS!
Coffee: 2 POUNDS!
Beer: 20 POUNDS!
You go Kernighan, that's the spirit!
@DavidChipman 6 years ago ⁺¹⁰⁶
I see a Computerphile video featuring Brian Kernighan, I must drop everything and watch and "thumb-up". I'm a simple guy.
@eugenesarce3997 4 years ago
tg
@leninalopez2912 5 years ago ⁺⁸
I love Brian's voice, and how gentle and methodical he is when explaining things
@Hyreia 4 years ago ⁺³³
I definitely fell in love with associative arrays in my Data Structures class in college. Between these and linked lists you can build just about everything.
@jerichaux9219 1 year ago
@auroraborealis5565 That’s an imported library.
Built under the hood with associative arrays and linked lists.
@498fun 6 years ago ⁺⁸
It makes me so happy to get some more lectures from my favorite prof even all this time after graduating. Not many people can be this entertaining and this informative at the same time!
@brucewaters1617 6 years ago ⁺³²⁵
This guy's shopping list:
Beer
Pizza
Coffee
Beer
@Riff.Wraith 6 years ago ⁺¹⁶
£134 worth of coffee at that, hooooly
@498fun 6 years ago ⁺¹⁷
Classic Kernighan examples :D
@knife_wizard 6 years ago ⁺²¹
Eh. Sounds like your average programmer.
@noredine 6 years ago ⁺¹²
you forgot beer
@jeffirwin7862 6 years ago ⁺⁵
+noredine Sorry I forgot, I'm blaming this one on the beer.
@AvailableUsernameTed 6 years ago ⁺³¹
Larry Wall: Doing linear scans over an associative array is like trying to club someone to death with a loaded Uzi.
@landspide 6 years ago ⁺¹⁵
A legend that truly understands 'the programmer'
@sebschrader 6 years ago ⁺¹⁷³
Map is also a common name for this data structure
@JugglingGamer 6 years ago
Sebastian Schrader he did mention that associative arrays can be referred to as [Hash]maps.
@sebschrader 6 years ago ⁺⁶
I just rewatched it and didn't hear him say it. He mentioned only hash table, hash and dictionary.
@JugglingGamer 6 years ago
Sebastian Schrader my bad, I must've misheard!
@unvergebeneid 6 years ago ⁺²²
Although for C++, it's important to remember that map is usually some form of binary search tree and unordered_map is a hash map.
@sofia.eris.bauhaus 6 years ago
object. B)
@linuxelf 6 years ago ⁺⁴
Very interesting. I'd never studied how these structures were stored internally, and now I finally understand why data stored in a hash is stored in a somewhat random looking order.
@allluckyseven 5 years ago
One thing in common between most if not all of these videos is that it is such a delight to listen to these experts talking about things in their respective areas.
@jan_harald 6 years ago ⁺⁵
40th
I love this legendary man...
truly legendary, too bad I'm never gonna meet him in person...
@JahMusicTube 6 years ago ⁺¹²
I'd have loved to have him as a professor! Very clear explanation :)
@loadedfries5764 6 years ago ⁺²¹
This was a really great video! The way I get it, the value of a hash table is that it's flexible and, as Professor Kernighan noted, has almost constant access time. You can use any type of data as the indexing element, thanks to the hashing function, and you almost always go through the same number of steps to access any data in the array, which is very different from, for example, a search function. And it's probably easier to read and understand in code. The only downside I see is that a hash table can be inefficient in terms of how much memory is used.
@tscoffey1 6 years ago ⁺¹⁵
It is the classic "cpu time" versus "memory used" trade-off in computer science.
@dmitripogosian5084 1 year ago
Access time in terms of caching seems inefficient as well
@azerotrlz 6 years ago ⁺⁵
Hashmaps are one of many ways to implement the associative array abstract data type. Some of the most famous alternatives would be tree maps, implemented using self-balancing or unbalanced binary search trees, or association lists, implemented using linked lists.
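To make the point about alternative implementations concrete, here is a minimal Python sketch of an ordered map. It is not a tree: it keeps two parallel sorted lists and uses binary search from the standard `bisect` module, standing in for a tree map because it offers the same interface, logarithmic-time lookup, and sorted iteration. The class name `SortedMap` and its methods are hypothetical, chosen only for illustration.

```python
import bisect


class SortedMap:
    """Associative array backed by sorted parallel lists (a stand-in for a tree map)."""

    def __init__(self):
        self._keys = []    # kept in ascending order
        self._values = []  # _values[i] belongs to _keys[i]

    def put(self, key, value):
        # Binary search for the position where the key is, or should be.
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._values[i] = value          # key exists: overwrite
        else:
            self._keys.insert(i, key)        # O(n) shift; a real tree avoids this
            self._values.insert(i, value)

    def get(self, key):
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        raise KeyError(key)

    def items(self):
        # Unlike a hash table, iteration comes out in key order.
        return list(zip(self._keys, self._values))


m = SortedMap()
for name, price in [("pizza", 10), ("beer", 20), ("coffee", 2)]:
    m.put(name, price)
```

Same interface as a hash map from the caller's side, but `m.items()` yields the keys alphabetically, which is the usual reason to pick an ordered structure over a hash table.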
@andresilveirah 6 years ago
Awesome idea to bring this "Essentials" series, especially for us who have seen all this some time ago at University.
@gggfx4144 5 years ago ⁺¹²
When he was talking about pounds, I initially wasn't sure if he meant weight or currency, so I was thinking "he buys 20lb of beer and pizza?!; programmer for life"
@grn1 3 years ago
Maybe that's my problem, I don't like beer.
@DJstarrfish 6 years ago ⁺⁵⁹
Too bad this series came out too late to interview Dennis Ritchie. RIP.
@treyquattro 5 years ago ⁺⁵
Ken Thompson is still with us...
@nicolareiman9687 5 years ago ⁺⁴
@treyquattro Ken doesn't like interviews.
@oysteinsoreide4323 4 years ago ⁺¹
I'm using hash tables all the time in my code. In C# they are called Dictionary. Very useful collection type indeed.
@dustinjohnson6302 6 years ago ⁺³²
he's a young Dumbledore of programming wizardry
@sowellmemo 4 years ago ⁺⁴
" Maybe beer collides with pizza. I mean they go well together! "
@donaldkjenstad1129 6 years ago ⁺³
We were doing this type of algorithm back in the early 80's to manage memory allocation for paging systems.
@-rikishi- 3 years ago ⁺¹
Let's make a hash table JESSY!! -Misteeeer Kernighan, this is the purest blue linted code i've seen!
@Syntax753 1 year ago
Loved this!
@CakeIllusion 6 years ago ⁺¹⁰
Can Kernighan please explain the Lin-Kernighan heuristic?
@afelps9515 6 years ago ⁺⁴
An episode about character sets and encoding algorithms would be interesting.
@VishiVish01 1 year ago
Great video!
@AmnonSadeh 6 years ago ⁺¹¹
7:35 the marker pen makes its sound even when not being used :-o
@dec2 6 years ago ⁺²
How the heck did you catch that!? Please teach me how to sorcerer.
@X_Baron 6 years ago ⁺⁸
Are you claiming that it's a magic marker?
@MegaGreenLightning 6 years ago ⁺²
Please also make a video on Open Addressing, which is another way to implement associative arrays.
@haczyk84 6 years ago
Thank you!
@ThunderAppeal 5 years ago ⁺³
When BK looks into the camera I feel as if he's speaking directly to me.
As if I'm Neo from The Matrix.
@jasonmathew33 6 years ago ⁺²
The foundation of many efficient algorithms :)
@reallyWyrd 6 years ago ⁺⁴
In Perl it's actually '%', not '#'. '#' is for comments instead.
But yeah perl has hash tables as a basic data type. That always seemed very weird to me, but now I get it. Up until now, I simply could not understand how something seemingly so elaborate could be said to be efficient or quick. I get it now.
@rrp2600 5 years ago ⁺²
I have used PERL hashes before but I don't think I really grasped the inner workings of them until watching this 10 minute video.
@typograf62 6 years ago ⁺¹
Some "administrative" programming languages have "temporary database tables". They are not committed to disk, they are private, they do not bother much about the overhead of behaving like a database table. But they do such a job just fine (or better) and you do not have to invent a hash function or copy data when things get crowded.
@absurdengineering 4 years ago
typograf62 These days all languages that have sqlite bindings automatically get “temporary database tables”. In .net you also get DataSet.
@idivideby0096 6 years ago
I work with these every day. Very common in the medical industry.
@patricknelson 3 years ago ⁺¹
As a programmer myself, I figured I might not learn much, but I didn’t realize hash tables utilized linked lists under the hood.
@kmac499 6 years ago
Key-value pairs, oft derided by comp sci and database guys, are a natural way to handle data.
@jabuci 4 years ago
What assoc. array library should I use for C? If I don't want to implement it each time, what do you suggest?
@timc3600 1 year ago
I wish I had a tenth of his knowledge.
I came across hashes in PERL and thought wow as they are so logical but I never thought about how they worked under the hood.
@PixelOutlaw 6 years ago
It's one step further when your associative array can have different types of key. At that point you can model OOP at some level. :)
Not that it's the most efficient way to do it. But it's a fun diversion.
@XBOXLivexyab 6 years ago ⁺²
Beer, Pizza, Coffee, and Chips... A programmer's grocery list for sure!
@eliastandel 5 years ago ⁺¹
These are so essential that in Lua hash tables (called tables in the language) are the only data structuring mechanism, i.e. there are no lists, sets etc., only hash tables.
@mdmenzel 6 years ago ⁺¹
Are tuples implemented in the same way by programming languages that have them?
@franciscogerardohernandezr4788 5 years ago
The master has spoken: associative array it is.
@mescobar12me 6 years ago
You guys should do a video on, Network on Chip! :P
@zoranhacker 5 years ago
We only know that the value for pizza is in some location because the hash of pizza gives the "address" (not sure if it's literally the address), right? So if there is a collision with another value and we expand the linked list how exactly would we differentiate between the two values?
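One common answer to the question above is that each entry in the collision list stores the key alongside the value, so a lookup hashes to a bucket and then compares keys within it. The Python sketch below illustrates that idea under stated assumptions: the class name `ChainedTable`, the four-bucket size, and the deliberately weak hash (sum of character codes) are all made up for illustration and are not the scheme shown in the video.

```python
class ChainedTable:
    """Toy hash table using separate chaining (illustrative only)."""

    def __init__(self, n_buckets=4):
        # Each bucket is a list of (key, value) pairs: the "collision list".
        self.buckets = [[] for _ in range(n_buckets)]

    def _index(self, key):
        # Weak hash on purpose, so that collisions are easy to provoke.
        return sum(ord(c) for c in key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:          # same key: overwrite the value
                bucket[i] = (key, value)
                return
        bucket.append((key, value))          # new key: extend the chain

    def get(self, key):
        # The hash only selects the bucket; the stored key settles which entry.
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        raise KeyError(key)


table = ChainedTable()
table.put("beer", 20)
table.put("pizza", 10)
```

With this particular hash and four buckets, "beer" and "pizza" land in the same bucket, yet `table.get("beer")` and `table.get("pizza")` still return their own values because the keys are compared inside the chain.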
@bobi97bg 6 years ago
For anyone just getting into the Java world, if you are going to use a Hashtable somewhere, it's probably better to use a HashMap instead. More details can be provided by google/stackoverflow.
@subliminalvibes 6 years ago
So, in real-world applications for a layman like me, would this kind of hash be behind such functions as, "users also bought" and "suggested for you"? Or is it more useful for 'categorising' items like "baking utensils" which can have *multiple* other categories like "cutlery", "bowl", "glassware", and more.
@keybizzoneg5209 5 years ago
Coffee is essential. I like this guy :D
@bsvenss2 2 years ago
02:15 I love the £0 spent on juice! *lol*
@IceMetalPunk 6 years ago
In Java, they're called HashMap. In Javascript, plain, anonymous objects are used for this purpose. (Also, fun fact: in Ruby, the operator that associates a key with a value, =>, is called a "hash rocket".)
@BrianFrichette 6 years ago ⁺¹
IceMetalPunk in JavaScript, there's been Map and WeakMap for a couple years.
@MissPiggyM976 6 years ago
Thanks. While debugging a Java Hashtable I saw the effect of collisions, but I didn't recognize it for what it was; I believed it was a strange Eclipse bug!
@skyepyro7104 6 years ago ⁺²⁸³
That small hesitation before 'Javascript "programmer"' makes me giggle.
@Croxmata 6 years ago ⁺³¹
Are you trying to tell me that HTML is not a programming language?
Hmmm!
@weepinghomonculus4887 6 years ago ⁺²⁰
Shared the same sentiment, until I started to program in React + Redux. It's as sophisticated as anything else really :)
@BeCurieUs 6 years ago ⁺¹⁷
People who use things like C++ and such hate to call people who use "scripting languages" like JavaScript actual programmers.
@knife_wizard 6 years ago ⁺¹³
Yeah... I was on a group project in college that managed to, in one semester, add a whole 7 lines to node.js
that was a mistake... Javascript is hellish, and I feel sorry for the people that have to look at it for their jobs.
@sofia.eris.bauhaus 6 years ago ⁺⁷
the only thing wrong with javascript is the few remnants of java in it. :P
@blueluelueluelue2343 6 years ago ⁺¹⁵
how do you loop through an associative array?........like in a traditional array, you can just start a for loop as (i=0;i
@animowany111 6 years ago ⁺²⁸
You use an iterator, as you can't index memory sequentially like with arrays.
Something like
iter = map.keys(); // or values directly
while((elem = iter.next()) != null) {...}
The details differ slightly between languages, but this is in general the way to do it.
@Multigor96 6 years ago ⁺³
Blueluelueluelue It depends on how you implement the hash function. Usually the hash function takes a key and produces a number that corresponds to that key. The table is then just a normal array of n elements, where insertion happens at the index that corresponds to the key. That means the developer of the table can walk through the whole array like you just said, but the user can't.
@wi1h 6 years ago
Blueluelueluelue they're typically linked lists i believe. or you can also just use an iterator
@nerdy_crawfish 6 years ago ⁺¹¹
The correct way is to use a foreach loop if your language supports it. It should automatically get the iterator for you and iterate through each element in the array.
@airjuri 6 years ago ⁺⁵
foreach() is the easiest, IMO, way to loop through associative array. And by using associative arrays you don't have to loop through it to find the one you are looking for. For example if you need to find price of coffee, you just use that associative index. echo $data_array['coffee'];
php example follows:
foreach($data_array as $key => $data) {
// your code here
}
Inside that foreach loop, there are two variables, $key and $data, $key is the current array index and $data is anything that current index of $data_array holds. It can be anything that variable can be, another array perhaps :D
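For comparison with the iterator pseudocode and the PHP `foreach` in this thread, here is roughly how the same three operations look in Python, as a small sketch. The dictionary name `prices` and the sample values are made up for illustration.

```python
prices = {"pizza": 10, "beer": 20, "coffee": 2}

# "foreach" style: dict.items() yields (key, value) pairs,
# playing the role of PHP's `$key => $data`.
lines = [f"{item}: {price}" for item, price in prices.items()]

# Explicit iterator style: iter() returns an iterator over the keys,
# and next() advances it one key at a time.
key_iterator = iter(prices)
first_key = next(key_iterator)

# Direct lookup by key needs no loop at all.
coffee_price = prices["coffee"]
```

Python dictionaries have preserved insertion order since version 3.7, so the loop visits "pizza", "beer", "coffee" in that order; many other hash table implementations make no such promise.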
@LeonardoBaracat 2 years ago
This is THE Brian Kernighan. 27 dislikes?! Are those people nuts?!
@leonhrad 6 years ago
If you care about performance you should consider not using collision lists, but keep the array flat (each element contains the actual (key,value) pair instead of a pointer to a list) and use linear probing. It's usually faster. You only need to be careful where to insert new elements and how to remove elements.
You can then even separate the (key,value) array in two arrays, one for the keys and one for the values which is especially useful if you're iterating a lot and you're mostly interested in the keys for example.
@cad97 6 years ago ⁺¹
(Or even better, just use the builtin)
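The flat layout with linear probing described in this thread can be sketched in Python as follows. This is a teaching sketch under stated assumptions, not a tuned implementation: the class name `ProbingTable`, the fixed capacity of 8, and the use of Python's built-in `hash()` are choices made here for illustration. It keeps keys and values in two separate arrays, as suggested above, and leaves out deletion, which needs extra care (typically tombstone markers) so that later lookups do not stop early.

```python
class ProbingTable:
    """Toy open-addressing hash table with linear probing (no deletion, no growth)."""

    _EMPTY = object()  # sentinel marking a slot that has never been used

    def __init__(self, capacity=8):
        self.keys = [self._EMPTY] * capacity   # flat array of keys
        self.values = [None] * capacity        # parallel flat array of values

    def _slots(self, key):
        # Start at the hashed slot, then try each following slot, wrapping around.
        start = hash(key) % len(self.keys)
        for step in range(len(self.keys)):
            yield (start + step) % len(self.keys)

    def put(self, key, value):
        for i in self._slots(key):
            if self.keys[i] is self._EMPTY or self.keys[i] == key:
                self.keys[i] = key
                self.values[i] = value
                return
        raise RuntimeError("table full; a real implementation would grow here")

    def get(self, key):
        for i in self._slots(key):
            if self.keys[i] is self._EMPTY:
                break                          # hit a never-used slot: key is absent
            if self.keys[i] == key:
                return self.values[i]
        raise KeyError(key)


table = ProbingTable()
table.put(1, "first")
table.put(9, "second")   # in CPython 9 % 8 == 1, so this collides with key 1
```

In CPython small integers hash to themselves, so key 9 finds slot 1 occupied and settles in slot 2. As the reply above says, in everyday code the built-in `dict` is the better choice.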
@kpjVideo 6 years ago
Associative arrays are especially useful when trying to conserve time and space.
Otherwise, you'd be enumerating local variables quite a bit
@brittanymarie8523 5 years ago
Oh my.
@cmdlp4178 6 years ago
I think you should write specific hashing functions for specific applications, like making a hash out of a string by adding only the position of each letter in the alphabet instead of its Unicode code point.
Why don't you split associative arrays into an associative key array and a data array, so that you can reuse the key array with other data arrays? It's like defining a struct in C(++): the key array used to access a specific member (which the compiler "inlines" into the code) is not stored within the struct.
@vitalspark6288 6 years ago ⁺¹⁴
The Perl sigil for hash tables is %, not #.
@briansmith8967 2 years ago
I first learned about associative arrays when I learned Tcl and I thought, "that's magic!"
@davesextraneousinformation9807 5 years ago
Why is the symbol for "pound" that (strange to Americans) upside-down 7 with a line through it?
@vladomaimun 6 years ago ⁺⁶
For some reason I always thought associative arrays would be complicated to implement.
@tscoffey1 6 years ago ⁺⁴
The complexity is in making them efficient for the maximal numbers of use cases. An associative array that only expected strings as keys can be optimized better than one that has to handle many disparate kinds of keys.
@MrSlowestD16 6 years ago ⁺¹
The problem with them is choosing the number of buckets. Choose too many and you have wasted space. Choose too few and you have long lookup times. Then adjusting the number of buckets, as Brian talked about, takes a fair bit of work, so it's not something you want to do often.
@DDranks 6 years ago ⁺⁷
At their simplest, they are simple. But then there's the implementation choices and optimisations about the hash function, numbers of buckets, re-allocation strategies etc., and they suddenly become complicated.
@styleisaweapon 6 years ago
The most complicated of them minimize overhead either in the space-complexity sense, or the time-complexity sense. The simple implementations fall right in the middle.
@schok51 6 years ago
Topic suggestion: persistent data structures.
@liamsutton6202 5 years ago ⁺¹
Brian could describe his breakfast for 2 hours and it would still be interesting
@ernststravoblofeld 1 year ago
How many interviewees learn the crews' names? Cool guy.
@blvnktek 6 years ago ⁺¹
for some reason hearing that marker really kills me inside
@pleappleappleap 2 years ago
Why use a linked list to deal with collisions? Why not use a second-level hashtable with a different hash function? The chance that two items will collide under two hash functions is vanishingly small.
@DarthAthar 6 years ago
This video is more about Hash tables than associative arrays, and even then it only looks at one way of doing collision resolution.
@AndrejPodzimek 2 years ago
5:55 Well, in that case you might need to *undrink* some beer, diplomatically speaking. That was a very common occurrence during my university years.
@sizzlebread23 6 years ago ⁺¹
how did he get my shopping list
@limew 6 years ago ⁺⁴
in bash you can create an associative array with:
declare -A array
array[pizza]=20
@SteinGauslaaStrindhaug 3 years ago
Weirdly I refer to them as hashmaps or just maps when talking about them in general, even though my two main languages calls them dictionaries (Python) and objects (JavaScript)...
Wonder where I got that from, maybe back in programming class... Are they called hashmaps in C++ maybe?
@w0mblemania 2 years ago ⁺¹
The advantage of calling them "Dictionaries" or "Arrays" is that you abstract the problem away. After all, whether a Collection uses a fixed array or a hash table should be entirely an implementation detail, usually dependent on the number of elements in the collection, and whether uniqueness is required. The programmer typically shouldn't care about the implementation detail, only the boilerplate description, and big O characteristics.
@shaylempert9994 6 years ago
What's the algorithm that decides when to increase N?
@tscoffey1 6 years ago ⁺¹
It can vary. The associative array would keep track internally of both how many table "slots" are used, and also how long the longest collision list is for any one hash slot. When some cost function (which combines the two in some way) reaches some cutoff value, a growth process occurs. Bear in mind that growing these tables is expensive though, as each table entry must be rehashed. So the cost/benefit between growing and not growing (but having longer search lookups) can be tricky to get right. (If you grow too often, you waste cpu growing unnecessarily. If you don't grow enough, you waste cpu on table lookups due to more collisions).
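The reply above describes a cost function that combines slot usage and chain length. The Python sketch below swaps in a simpler and widely used rule instead: grow when the load factor (entries divided by buckets) passes a threshold. The class name `GrowingTable`, the 0.75 threshold, and the doubling strategy are assumptions chosen for illustration. It does show the expensive part mentioned above, which is that every existing entry must be rehashed when the table grows.

```python
class GrowingTable:
    """Toy chained hash table that doubles its bucket count when it gets crowded."""

    MAX_LOAD = 0.75  # assumed threshold: entries per bucket before growing

    def __init__(self, n_buckets=4):
        self.buckets = [[] for _ in range(n_buckets)]
        self.count = 0

    def put(self, key, value):
        bucket = self.buckets[hash(key) % len(self.buckets)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)     # overwrite; entry count is unchanged
                return
        bucket.append((key, value))
        self.count += 1
        if self.count / len(self.buckets) > self.MAX_LOAD:
            self._grow()

    def _grow(self):
        old_buckets = self.buckets
        self.buckets = [[] for _ in range(2 * len(old_buckets))]
        # The costly step: every entry is rehashed into the larger table.
        for bucket in old_buckets:
            for key, value in bucket:
                self.buckets[hash(key) % len(self.buckets)].append((key, value))

    def get(self, key):
        for existing_key, value in self.buckets[hash(key) % len(self.buckets)]:
            if existing_key == key:
                return value
        raise KeyError(key)


table = GrowingTable()
for n in range(10):
    table.put(n, n * n)
```

Starting from 4 buckets, inserting ten entries triggers two doublings (at the 4th and 7th insert), so the table ends with 16 buckets. Doubling keeps the average cost per insert low, because each growth step is paid for by the many cheap inserts before it.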
@MrSlowestD16 6 years ago ⁺¹
"Buying beer and, pizza, and coffee, and chips" - yeah, 100% programmer confirmed, lol.
@balorprice 6 years ago ⁺³
Did Brian Kernighan just make an off-by-one array length error??? So... Much... Irony...
@lels3618 6 years ago
Why aren't there just two arrays: one with the keys (so on an access you loop through until you find the index where the key was) and another with the values (which you would access by the index where the key was in its array)?
@daviddupoise6443 6 years ago ⁺²
The "mission" critical issue which Brian didn't really get around to is reducing the lookup for any one element. You don't want your algorithm to have to traverse the entire structure in order to find what 'could' be the 'last' element in a very very long list. Too inefficient. So the modified hash table is superior to an array or standard linked list or doubly linked list.
@SuperManitu1 6 years ago ⁺²
Yeah, your two array solution is O(n) to access an element, a hashmap is O(1)
@tscoffey1 6 years ago
The search cost for table lookups with that approach is very expensive. For string keys (as in this video), you end up comparing the strings for equality to find the matching key in the table. With a table of size K, you can expect to have to check K/2 keys on average to find a match. With hashing you still have to scan the lookup key string once to produce a hash value, but you then only have to search a much smaller subset of keys (the collisions), trying to match the key. Much, much faster. However, as with all things, there is a worst case scenario, the one where ALL keys collide to one hash slot, which then requires checking the same K/2 key strings against the search key string (as above). But this is very unlikely to occur in practice.
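The two-array idea from the question and the cost argument in the replies can be made concrete with a short Python sketch. The function name `lookup_parallel` and the sample prices are invented for illustration; the function counts key comparisons so the linear cost is visible.

```python
def lookup_parallel(keys, values, wanted):
    """Linear scan over parallel arrays; returns (value, key_comparisons_made)."""
    for comparisons, key in enumerate(keys, start=1):
        if key == wanted:
            # values[i] belongs to keys[i]; comparisons is i + 1.
            return values[comparisons - 1], comparisons
    raise KeyError(wanted)


keys = ["pizza", "beer", "coffee", "chips"]
values = [10, 20, 2, 3]

# Worst case for the scan: the wanted key is last, so all four keys are compared.
value, comparisons = lookup_parallel(keys, values, "chips")

# The same data in a hash-based dict: one hash computation, then typically
# a single key comparison, however many entries there are.
prices = dict(zip(keys, values))
```

With four entries the difference is invisible, but the scan grows with the number of keys while the dictionary lookup stays roughly constant, which is the O(n) versus O(1) point made above.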
@TheMagicToyChest 6 years ago
Is Dr. Kernighan in Nottingham or something?
@ValeUmCanal 3 years ago
Good
@welltypedwitch 5 years ago
Well... C# has both Dictionaries and Hashtables
I'm confused now
@computerscientist5953 5 years ago ⁺¹
they essentially serve the same purpose, but have some internal differences
@TheTurnipKing 5 years ago ⁺¹
In essence, an array without index numbers?
@zss123456789 5 years ago
In essence, an array with index numbers converted from actual keys.
@MerthanE 6 years ago
How about doing some YouTube magic and making "Essentials" an actual YouTube series, like Tom Scott did with the fizzbuzz video recently +Computerphile? Anyways, nice miniseries.
@midqualitygaming3498 5 years ago
3:25 *THE CAP*
@HazeAnderson 5 years ago
0:58 he almost said Perl. He did. What happened to Perl is a damn tragedy.
@Joe-ud1de 4 years ago
What happened to Perl?
@deltakid0 6 years ago
Could you please add English subtitles??? It's very hard for non-native English speakers like me to understand everything you say. I've seen other videos from this channel supporting this feature, or at least allowing Google auto-captioning.
@styleisaweapon 6 years ago
It's probably queued up to be auto-captioned by Google. Likely it depends on the number of views a video has before it gets put into the queue.
@ImSquiggs 6 years ago
I thought hash collisions were exceptionally rare... do they really come up that much in associative arrays?
@MadocComadrin 4 years ago ⁺¹
It depends on your hash function. It needs to be rare for cryptographic hash functions, but hash functions for hash tables only really need to be balanced--- infrequent collisions are okay if your hash values are spread out over the entire table.
@ficolas2 6 years ago
2:15, why is juice free?
@BlenderDumbass 6 years ago
In Python: dictionaries. It explains itself.
@jern6224 5 years ago ⁺¹
i feel uncomfortable of the sound of the marker grinding on paper, :
@boudreauxbroletariat3959 3 years ago
just noticed that Dr. Kernighan is a lefty -- why am i not surprised haha
@BrianFrichette 6 years ago ⁺¹
I've only ever heard "associative array" used in PHP, a language I try to stay as far away from as possible.
@GCOSBenbow 6 years ago
web devs (cringes)
@recklessroges 6 years ago ⁺¹
Avoiding PHP is to your credit. It was the 90s version of node.js ;-)
@kensmith5694 6 years ago ⁺¹
The first one I saw was written in IBM-360 ASM. They are very useful when making compilers and interpreters. Programmers are notorious for using variable names that are similar.
@hansvetter8653 2 years ago
The only way for humans to express meaning is language, which uses words as its building blocks. So instead of a meaningless number, an address in memory or a numeric position in a list, you use meaningful words in associative arrays ... very easy to use ... enhancing readability of code greatly ... BUT ... they are hard to implement in any programming language in terms of compilers ...
@birdman4274 5 years ago
3:47 X["beer"] += 15
@kittyrules 6 years ago ⁺¹
Clicked because of the cans on the thumbnail
@gz6616 6 years ago
really?
@daiharr 6 years ago ⁺¹
"I take pizza and run it through a hash..." Yeah, nobody will eat that pizza anymore...
@garrettkajmowicz 6 years ago ⁺²
No love for C++ map?
@lukenukem8028 4 years ago
C# Dictionary Yay
@lukenukem8028 4 years ago
They Are Awesome
@sidharthjha6916 4 years ago ⁺¹
IIITD SHOW UP!
@zee63976 6 years ago ⁺¹
> Coffee is essential
SAVE 418!!
