The Knapsack Problem & Genetic Algorithms - Computerphile
Vložit
- čas přidán 30. 09. 2020
- Tournament selection, roulette selection, mutation, crossover - all processes used in genetic algorithms. Dr Alex Turner explains using the Knapsack Problem.
/ computerphile
/ computer_phile
This video was filmed and edited by Sean Riley.
Computer Science at the University of Nottingham: bit.ly/nottscomputer
Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com
Finally I can work out what loot to take in Skyrim.
Skyrim? Son, if I knew about this in 1999 I'd SOJs on SOJs in D2...
Shiniest stuff first. Always. After that then apply a genetic algorithm, obviously, but not until the shiny stuff is all taken, even if that means you have to walk all the way back to Whiterun.
15 forks, a wheel of cheese, some clothes you cant wear, and a cabbage.
This problem is simpler than skyrim tho. With skyrim, you won't know all the items ahead of time, rather you get item and have to keep or discard without knowing future items.
Take the calipers, only the calipers; leave everything else. Problem solved. (Still looking for calipers in Skyrim.)
Usually it takes me two hours to pack my bags
Now it'll take hundreds of generations
you are Bruno Ribeiro
hahahahahhah
@@opkp yyyup.....?
I was taught by Alex in my first year at the University of Hull; he was always such a kind, interesting, and intelligent lecturer. I'm glad to see his prowess shown off to the world here.
5:18 that dying sound :,D
Right?
"You have a knapsack"
?
"Which is like a rucksack"
???
"Or a Bergen"
?????????????!???!
Backpack
It's like a cloth portmanteau that closes with a zip rather than buckles. ;-)
@@violet_flower Heyooooo!
Don't worry, I sent him a note informing him of a more general, if somewhat obscure term: "Bag"
It's like a haversack.
Bergen: a special backpack used by the Brittish military. Looks like a daypack. From the name of the manufacturer. "As for the nickname, “Bergan” is an adaptation of the name of the Norwegian backpack manufacturer Bergans,"
I needed this
Thank you! I re-listened to that bit like 5 times trying to figure out what he was saying.
Thank you.
Ah! The famous Bergan. Never heard of it, so thanks!
source?
Genetic algorithms are cool. I wrote one in college to find patterns in Conway's game of life that resulted in the densest and longest lasting sequences. It's horribly slow, even on 10 threads, and I've never seen it reach the ideal on anything bigger than a 15x15 grid, but it always produces fun results.
evolution also takes millions of years . GA will converge to a global optimum , given enough time and a very big population
@@gerritgovaerts8443 hm
@@gerritgovaerts8443 hm
Oh my I am litterally trying exactly that right now, I didn't imagine anyone else had the same idea before
@@gerritgovaerts8443 not really, if the mutation rate and crossover is not well selected it might reach a local optimum and never reach a global optimum
It'd be nice to have a second episode on this discussing GA and local maxima.
The main problem with all these heuristic algorithms is the vast number of metaparameters that need to be adjusted for them to be efficient and the fact that there is no a priori way to make an informed decision on which initial values are likely to be ok for the specific problem at hand.
You are right but incidentally you are also defining a problem for which a genetic algorithm would be excellent solution (assuming that it was faster to compute, perhaps not a genetic algorithm itself).
I was wondering about genetic algorithms this week! Glad to see a computerphile episode about it now!
8:57 - Sean took the words out of my mouth (or thought out of my head 🤔); this makes more sense when dealing with a large number of items and variables, otherwise it's more efficient to just brute-force the permutations. Back in the day, when I was trying to figure out the best way to put files on floppy disks (and later, CDs) to minimize wasted space, I just did it manually.
I mean, the standard solution is to use dynamic programming and a memo. The worst case is still no better than brute force, but it usually does pretty well.
Knapsack? You can do it with Dynamic Programming. Also, computerphile you need to do an entire series on Dynamic Programming
I thought he was about to write out a table and do the classic dynamic solution to the napsack problem lol
I thought same
You can also tell whos a compsci undergrad, prolly 3rd-4th year with this problem. They all scream DYNAMIC PROGRAMMING!!! 😂
He picked a problem that worked well for the type of solution he provided. The point of this isn't "how to solve the knapsack problem"
Great presentation. You are so right about ga’s. They are fun to work with and sometimes can find interesting solutions to hard problems. Around 20 years ago I published a series of articles developing ga’s to break ciphers. One was a paper on using ga’s to break the knapsack cipher which true to form showed some promising results.
This was GREAT! I'm currently teaching Binary Search & Bitwise operations - tgis seems an IDEAL problem to hack in Python before attempting it in Assembly then attempting to establish its time complexity.
One of the best explanation so far. Thanks a lot for your time and efforts.
Alex, legend. Best lecturer I've been taught by.
This sounds like an alternative for what I learned on optimizing course for mechanical engineers. Simplex algorithm which conveniently matlab was happy to do for me if I presented a couple of base functions like objective function. Very interesting stuff.
I'll believe anything that a man holding a Crayola pen tells me.
so funny abd simple
That tournament sound effect, love it :D
00:16
"I think that's pretty NEAT"
I see what you did there ;)
NEAT is such a cool thing!
Self brag here. Got 100% grade in Alex's AI module last year. Was super fun, had to program our very own GA to address the coupled inverted pendulum stabilization problem. Looking forward to going to a PG level and studying more AI!
Why would you not calvulate a value density for each one, order them from highest to lowest then pack it from the top down? For instance the 1:7 has the highest value per unit of mass, then 2:4, 7:5 and 9:2 last. Pack them in that order until you are just under the limit then reorder and recalculate from there? Eliminate any that are over the remainder, then from there down pair any that are between .5 and 1.0x of the remainder and calculate the density of the pairings, iterate over this process until there are no more under the remainder.
this guy is a masterful explainer. great work
I did my dissertation project on evolutionary algorithms in python and I can say this video is very well done thank you guys 😁
What's your opinion about using Python for this kind of algorithm? I'm just starting a research (as an undergraduate) on GA and I'm picking up a project were my colleague left off, but is is in Python.
It's a very slow language and my research will involve parallelizing some algorithms to make a comparative study.
Hey Alex! Glad to see your doing well in nottingham, missing you here in hull! Best wishes Alex, from Alex :P
A very nice video and explanation of GAs. I am so happy and proud to have Alex as the co-supervisor of my PhD :-) Thanks for the brilliant tutorial Alex and Computerphile.
Nice to see GAs being used. I used them in my dissertation back in 1994, never used them since.
This is a wonderful explanation, thank you.
Nicely explained. I did use GA in my master's thesis.
Interesting what did it include?
Hey Alex 👋👋 Cheers for helping me with my dissertation 🎉🎉
Why do tournament selection instead of picking the two randomly generated solutions with the best scores?
I think it generally leads to more diversity in the population. Otherwise you might end up with only very similar individuals in the population, which then have very similar ofspring. This is bad because you get "stuck" in a small area of the solution space, while the best solution might be in another area of the solution space entirely. If you only take the best individuals you never select mutations which temporarily give bad solutions but might lead to better ones in the long run. Hope that makes sense.
Tassle So is it basically to avoid the equivalent of inbreeding?
@@gingeh1 You could frame it that way (except that inbreeding in this context doesn't really produce worse results, but can prevent producing better ones)
Me being in aladdin's cave and just sitting down to do maths...
Hopefully, you have access to, and time on, a supercomputer to calculate it all out.
It's Alibaba
Wonderfully well explained
I remember, ages ago, some softwares that I used to fill the floppy disks at their maximum capacity. They used the Knapsack algorithm. Nice memories!
Knapsack is a problem, not an algorithm. There are loads of algorithms that can approximate an answer to the knapsack problem.
I would love if you guys did a video explaining CHAP!
Fascinating and somehow related to simulated annealing - at least in how neighbors are created by manipulaing random bits of the candidate solution
. I found that the latter gives a solution within 3% of the optimal ones for big knapsack problems with 1000+ items
Hi I really enjoyed this video and was wondering if you could point me to any resources you would recommend to get a better understanding of some real-world applications. Thanks!
How does the convergence rate of the genetic algorithm compare with a straight mutation algorithm in which a random bit is flipped (small change) with occasional multiple bits flipped (big change) where the tournament is between the parent and child? I am thinking about data vectors of size > 1000.
Thanks for the wonderful presentation. .. I got to the point that from example you considered the initial population and each iteration you went generating 8 or 16 children observing increasing in fitness values of population...What I didn't get was what was the final results? what is that your bag was finally filed with (for instance considering example you took which one items ended up in the bag)?
Nice Animations Brady!
I see Dr. Ferrante Neri is at Nottingham now. I did a course with him on Optimisation when I was at DMU, you should ask him about memetic algorithms!
@9:00 that feeling when the guy just said a problem which is provably as hard as any NP-complete problem is "trivial".
I think it's the ost fun algorithm I've learned so far. Thanks :)
- This problem is trivial
DP Students: 😭
What's DP? I heard my mum mention it to my dad and uncle.
@@TVIDS123 dynamic programming
@@xfxxgj7086 He was joking hahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahahaha
I once worked with a developer whose boss had asked him not to use the word "trivial" because it was giving clients the wrong idea.
@@xfxxgj7086 Actually... I think DP in this case is Diploma. The IB DP computer science exam is on genetic algorithms next year.
thank you, it helped me a lot 🙏🏻🙏🏻🙏🏻🙏🏻🙏🏻
This video makes me want to revisit Boxcar2D :)
Thank you for perfect explanation!)
I've been struggling a lot with the minimum set cover problem...If anyone has another fast and efficient algorithm, except the greedy one, I will be happy to see it.
yay for combinatorial optimization! :DD
How is GA vs particle swarm optimization? Is NSGA ii still recommended? Mostly for multi objective optimization usage :) .
Is there a technique that is more deterministic yet reliable enough to avoid local minima? How about one that is computationally efficient for quick and dirty estimates?
Imagine getting the birds and the bees talk from this guy. 'Well you see, there's a knapsack problem.... and through an evolutionary ranking, weighting, and robustness selection system, parents with the best scores are chosen, then half of the genetic ones and zeros from each parent is taken and combined to form a child, toss in a little random mutation here or there and repeat the process until you reach an optimal point in the population distribution curve and BOOM, out pops a baby from "Aladdin's cave'.
Superbly lucid!
But in which scenarios should we use it?
Omg what a clear explanation
excellent! thank you
Amazing thanks
Is it correct for crossover you need the input parameters to be independent?
Now I want to hear more about it.
Please please please do a video on other population based optimization algorithms such as particle swarm optimization, differential evolution and artificial immune system!
This guy is great.
How would you formally explain why this problem should be solved with GA and not a neural network?
Would you agree it's because the fitness function is not a continuous function?
Also, can you recommend a book on GAs?
the animations help make it more comprehensible
QUESTION
Wouldn't it be the same as we just type algorithm that calculate every value one by one combination and compare with the variable MAX?
Could this be done with the 3D Bin Packing Problem as well? Seems like a better solution than brute forcing
Struggling with this, in the given scenario wouldn't simple loops checking the actual criteria allow us to score the "Loot" with absolute accuracy?
So im curious what is best solution to this problem being explained?
1110 is the best, with a value of 16 and a weight of 10.
We can't have all 4 pieces (that would be 19 kg), but if we remove the least valuable piece, we are already under 15 kg.
dynamic programming
I think the "solution" is the generalized algorithm - solving this SPECIFIC problem is not really the point.
Is it possible to simulate a simple rna organism with code?
Try it! ;-)
It's probably pretty easy for a knowledgeable person. With some approximations, of course.
Yes
You might be interested in artificial life, the CS field that aims to emulate life in computers
If you're talking about a RNA virus that makes a bunch of proteins that then need to deliver the RNA payload into new cells to infect, you'll need some way to estimate how well those proteins work. You might need to find a solution to or a way to bypass the "folding problem" which is that a protein with a specific sequence of amino acids in water always folds into the same shape, but predicting the shape it will take out of all possible shapes it could take is very hard. The shape it folds into determines how well the protein works.
You would think ideas like this would be applied to computer language compilers to create better machine language code to reduce file sizes. However it could loose some abstraction.
What's a fast alternative to this algorithm? Something suitable for near real time sorting? Great explanation otherwise, you summarised in a few minutes an entire class I took for a semester.
This type of algorithm is used when there is no obvious alternative - meaning that there is no known 'faster' way, and also if a brute force search is impossible due to the huge number of possible solutions (aka 'combinatorial explosion').
@ambassador - a simple loop that requires an astronomical number of iterations isn't feasible, unless you have an astronomical amount of time.
This problem is known as the combinatorial explosion.
awesome, quite interesting
They are definitely great for blocking!
Is there a special name for the kind of paper used? I love it😍
Look up pictures of "fan fold paper" and "line printers". Oh, the memories. Nottingham seems to have infinite supplies of this paper. :)
hazy or is it like no sompile is not hey dough?
So how does this approach compare to Dynamic programming solution in terms of complexity?
it may be because of the simple example but why is this genetic algorithm 'better' than just creating a list with all possible permutations and then finding the best one in there?
It works well for even more complicated problems, where listing all permutations is not possible. Stuff with many continuous variables. Check out CaryKH's evolution simulator on CZcams for another example
They're only useful when it's not practical to do a brute force search (i.e., when evaluating every possible permutation would take a very long time).
Remember the last part of the video. Even for 100 boxes, each solution is 100 bits long and as such there are 2^100 solutions to search through. That's already a massive amount of data to search through by brute force. Also consider that the solution space for this problem was discrete and finite, it could be made infinitely big by a very small change in the problem. If you had liquids instead of boxes, you would need to consider taking 0.5 units or pi units or 2.2852 units of liquid. Basically it would be fundamentally impossible to list all solutions.
Usually, knapsack problems are solved with dynamic programming which caches previous computations to reduce the time complexity to polynomial time. Idk how it works with random mutations
It's NP-hard.
What is the alternative to no crossover? Randomly select a parent and make the child as a copy?
If that's all you're going to do, then you're done on the 1st round. You'd never generate any *new* combinations, so the best solution on the 1st round is the best you'll ever have.
@@jursamaj yea, I implemented a random pivot point for my crossover. I was just curious because he mentions a crossover rate and I was assuming that meant a % chance for crossover to occur.
@@kyleallain5923 Possibly that, or a random point along the length for crossover, which is more like what happens with real genetics. Every chromosome gets at least 1 crossover per reproduction, and can get more.
yo why not just runing it a while loop that reads preinstalled binaries of the cases ..... if the nmber is max it break if not it continues processing the all the possible cases
The problem with this sort of algorithms is that sometimes most efficient solution is not necessarily the most complex one
But the algorithm isn't looking for complexity…
@@jursamaj i know, its just that if you're looking for complex solutions to problems - this might not be it.
I always search to work in company who mic Genetic + computer. Thanks for the video, it's helpful
Utterly irrelant to your lovely video but... I have tried like a mad Google-searching thing and I cannot find proper 123-column line-printer paper, preferably with the kindo of 'music-ruled' stuff on every other line like you were using in this! Pretty-please tell me where I can buy that? I want to educate some of my younger colleagues about debugging from a hex dump using just a source listing, a core dump, and a highighter pen - and we all know you can only do that on 'proper listing paper' :)
s/irrelant/irrelevant/ sorry!
In the UK I just search "music lined tractor feed printer paper" or "continuous stationery" - last two batches were bought from a company called Paperstone hth -Sean
@@Computerphile Hah! Just found them, and I think it's wonderful that the paper size is "11 inch x 362mm"!! Get your units sorted out people! (And thank you for the pointer)
Just call it a bag!
Mentally this is similar to picking the best first car to begin with in a racing game.
can't this problem be solved using nomral optimization algorithms?
Yes, and probably done better than with a GA.
Guys, he’s talking about what ASML is
Gotta love the 'No Views'
Kudos to the animators and voice over 😂
have you ever calculated specific energies, ie, kwh/kg, maximize that
minimize the weight, dont try to fill the knapsack
sorting a set of new (pca) variables, 0.72, 2, 7, 0.22, should give you a preferred order of selected items of [3,2,1,4], then just add up weights, it will be (breadth first search, quaranteed best solution fast, ie, always take the best item you can fit, then skip over if it does not fit, ie, always the best), in order of [3,2,1,4] w=0->1->3->10->19, so this algorithm gives a selection of set [1,2,3], with a weight of 10, which is under 15, but is it the optimum
the first suggested algorithm gives value of 16, not NP complete, at all, mostly radix ratios
try solving go with local focus zones, having a tractable 2^n size, times the total number of same size focus zones (like visual neuron focus zones), so not 2^n in one focus zone, but N*2^n is approximately linear with the number N size of the zones that fit the go game board
pca the other dimensions, down to one sortable number, ie, value against weight and size, ie k-sort-value = value/(weight*size)
Who could not fully focus on the tutorial because the professor is TOO handsome? By the way, I really like the knapsack simulation.
9:48 what can be useful?
Ah, so that's how you stack your items in Diablo most efficiently. Thanks.
How does github work???
lol, and now everyone's talking about the problem that was picked to show how the algorithm works instead of the algorithm itself...
It would have reeeaaaaallly helped if you said the complexity comparison with other algorithms like dynamic programming on knapsack problem.
It's probably much worse in simple cases like with just weight and value, but I can imagine that if the problem becomes more complex, it will get more and more difficult to solve it with dynamic programming
How do you know when to stop?
When to stop:
Keep track of the overall population average fitness, and if several generations pass without the population fitness improving, call it quits and present the currently fittest solution as the final answer.
8:52 ...
Isn't there VOLUME to consider? Something might make the weight value limit but not FIT inside the backpack. There needs to another variable or use another example truly only having 2 values.
8:58 answers this
this problem seems like a bad example for the method being discussed to me? I imagine a much better solution would be to find value divided by weight for each item and choose the highest until you run out of room.
It is a great example. What you have proposed is called a greedy heuristic, and is not guaranteed to provide an optimal solution for the knapsack problem. Knapsack problems (it leads to many applications, think about any decision under capacity or budget constraints) are highly complex and there are no known algorithms or heuristics that are just "the best" to solve all instances of them.
@@lucasgabrielmalheiros5589thank you for the clarification that makes sense, i can see how multiple parameters would make it a lot harder
Can we have a video on "off the record messaging"? This was the technology used in EncroChat, the cryptographic communications system used by criminals and cracked by law enforcement agencies.
It's a quick read. I believe the Signal protocol was based on it, or at least inspired by it
How do you assure that your solutions respect the weight constraint? Do you eliminate children that violate it?
Yep, they're culled.
They would be the Darwinistic equivalent of dying before reproducing.
UNI OF HULL COMPSCI
first you had to mention about the reason for not using a better algorithm that gives optimal solution.
it appears I was already working on the knapsack problem without knowing it while playing skyrim. You just loot only gold itself, gemstones and jewellery )
That would optimise the total value of one haul, but maybe you should make it more complex and take "effort to bring this home or sell it" into it as well. If there is for example cheese or a bucket to be looted next to your house or a vendor that could be worth taking, but if it's a longer trip then that additional effort would not make it worthwhile.
My way of doing it was always to only pick up items with a gold/weight ratio of over 10, which was easy to calculate in my head. Remove the last digit from the value, if it's greater than the weight, keep it. Once I got full, I'd make more room by dropping the items with the lowest value/weight ratio until I could pick up the new item. If I wanted to optimize it completely for value, then I shouldn't have discriminated against ratios of less than 10, but then you spent to much time picking up and dropping things, so I optimized partially for my quality of life haha.