Implications of solving the ARC benchmark
- Added 10 June 2024
- ARC Prize is a $1,000,000+ public competition to beat and open source a solution to the ARC-AGI benchmark.
Hosted by Mike Knoop (Co-founder, Zapier) and François Chollet (Creator of ARC-AGI, Keras).
--
Website: arcprize.org/
Twitter/X: / arcprize
Newsletter: Signup @ arcprize.org/
Discord: / discord
Try your first ARC-AGI tasks: arcprize.org/play
can be a big thing! let's see after playing a bit
Here is a small observation I made while playing with ARC: it depends heavily on the assumed geometry of the grid. If we imagine each cell as a node of a graph and we do not know how the graph is connected, we cannot build a program that describes the transformations of a particular task. So we have to assume some way the cells are connected, i.e. a geometry of the grid. The geometry differs from task to task, and depending on how well we guessed it, the program that ‘solves’ the task may be either simple or complicated. Several consequences follow. First, each task may have a huge number of "correct" solutions that live on unfamiliar geometries yet have very simple generating programs. Such solutions will most likely not be accepted as correct. Second, to solve ARC, an AI has to be aligned with human expectations of what a proper task looks like and how it might be solved. And lastly, most ARC-like tasks generated by an algorithm that assumes a random grid geometry and a simple program would be unsolvable for almost any human.
So my main point is that ARC-like tasks exist in a much broader space than we tend to assume. The only reason they are solvable for us at all is that the subset created by François reminds us of tasks we have already encountered, so we can reliably guess what François had in mind while creating them.
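The geometry argument in the comment above can be made concrete with a small sketch. It is only an illustration under stated assumptions: a "geometry" is modelled as a hypothetical bijection between the cells we see and the cells of a hidden layout, and the "simple program" is a cyclic right shift of every row. The names `make_geometry`, `to_hidden`, `to_displayed`, and `apply_in_geometry` are made up for this example and do not come from ARC itself.

```python
import random


def shift_right(grid):
    """The 'simple program': cyclically shift every row one cell to the right."""
    return [row[-1:] + row[:-1] for row in grid]


def make_geometry(height, width, seed):
    """Hypothetical geometry: a random bijection from displayed cells to hidden cells."""
    cells = [(r, c) for r in range(height) for c in range(width)]
    shuffled = cells[:]
    random.Random(seed).shuffle(shuffled)
    return dict(zip(cells, shuffled))


def to_hidden(grid, geometry):
    """Re-lay the displayed grid out according to the hidden geometry."""
    hidden = [[0] * len(grid[0]) for _ in grid]
    for (r, c), (hr, hc) in geometry.items():
        hidden[hr][hc] = grid[r][c]
    return hidden


def to_displayed(hidden, geometry):
    """Inverse of to_hidden: map the hidden layout back to what we see."""
    grid = [[0] * len(hidden[0]) for _ in hidden]
    for (r, c), (hr, hc) in geometry.items():
        grid[r][c] = hidden[hr][hc]
    return grid


def apply_in_geometry(grid, geometry, program):
    """Run the same simple program, but inside the given geometry."""
    return to_displayed(program(to_hidden(grid, geometry)), geometry)


grid = [
    [1, 0, 0, 0],
    [0, 2, 0, 0],
    [0, 0, 3, 0],
]
identity = {(r, c): (r, c) for r in range(3) for c in range(4)}
scrambled = make_geometry(3, 4, seed=0)

familiar = apply_in_geometry(grid, identity, shift_right)
unfamiliar = apply_in_geometry(grid, scrambled, shift_right)

for row in familiar:
    print(row)
print()
for row in unfamiliar:
    print(row)
```

In the identity geometry the output is an obvious diagonal moved one step to the right. In the scrambled geometry the generating program is exactly as short, but the displayed result will typically look arbitrary to a human who assumes the usual row and column adjacency.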
let's see if it scales, I think it can
IMO, solving ARC is necessary for AGI, though it may not be sufficient
It's not obvious to me whether a solution to ARC would generalize to other problems
Like Francois mentioned, no benchmark is perfect, and there may be ways to "cheese" ARC!
If people just keep throwing augmentation at it, then yeah, stupid LLM memorization works and no AGI for us
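For readers unfamiliar with the term, here is a rough sketch of what "augmentation" could mean for grid tasks. It assumes the commenter has in mind the usual geometric and color transformations (rotations, flips, color relabeling) applied to training pairs; the comment does not say which augmentations are meant, and the function names below are hypothetical.

```python
def rotate90(grid):
    """Rotate a grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def flip_horizontal(grid):
    """Mirror a grid left to right."""
    return [row[::-1] for row in grid]


def permute_colors(grid, mapping):
    """Relabel colors; values missing from the mapping are left unchanged."""
    return [[mapping.get(value, value) for value in row] for row in grid]


def augment_pair(inp, out):
    """Return 8 variants of one (input, output) pair: 4 rotations, with and without a flip.

    The same transformation is applied to both grids. Whether the task's rule
    survives this depends on the task, so this is an assumption, not a guarantee.
    """
    variants = []
    for flip in (False, True):
        a = flip_horizontal(inp) if flip else inp
        b = flip_horizontal(out) if flip else out
        for _ in range(4):
            variants.append((a, b))
            a, b = rotate90(a), rotate90(b)
    return variants


pairs = augment_pair([[1, 2], [3, 4]], [[4, 3], [2, 1]])
print(len(pairs))
```

Each original pair becomes eight, and color relabeling multiplies that further, which is how a small task set can be inflated into a much larger training set.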
Is this related to P vs NP in an interesting way?
No
If some really smart model is made, then it could find a faster solution to any NP problem so that it becomes a P problem, which would make the model the proof that P = NP