Roman Yampolskiy on Shoggoth, Scaling Laws, and Evidence for AI being Uncontrollable

  • Published 9 Jun 2024
  • Roman Yampolskiy joins the podcast again to discuss whether AI is like a Shoggoth, whether scaling laws will hold for more agent-like AIs, evidence that AI is uncontrollable, and whether designing human-like AI would be safer than the current development path. You can read more about Roman's work at cecs.louisville.edu/ry/
    Timestamps:
    00:00 Is AI like a Shoggoth?
    09:50 Scaling laws
    16:41 Are humans more general than AIs?
    21:54 Are AI models explainable?
    27:49 Using AI to explain AI
    32:36 Evidence for AI being uncontrollable
    40:29 AI verifiability
    46:08 Will AI be aligned by default?
    54:29 Creating human-like AI
    1:03:41 Robotics and safety
    1:09:01 Obstacles to AI in the economy
    1:18:00 AI innovation with current models
    1:23:55 AI accidents in the past and future
  • Science & Technology

Comments • 21

  • @Tygerdurden
    @Tygerdurden 5 days ago +3

    This man is awesome, props to him

  • @Sentientism
    @Sentientism 4 months ago +5

    Thank you both! Here's more from Roman @Sentientism in case of interest. We talk about "what's real?", "who matters?" and "how to make a better world?" (or how to avoid destroying this one) czcams.com/video/SK9lGvGmITc/video.htmlsi=MqwjYp8OlWekU-ZO

  • @TheMrCougarful
    @TheMrCougarful 4 months ago +6

    If the host had read some HP Lovecraft, he would know that the Shoggoth started out as a universally useful tool, made of artificial life, that eventually destroyed its maker. The Shoggoth was not in any way superior to its maker, except in being more insanely violent.

  • @mrpicky1868
    @mrpicky1868 4 months ago +1

    I would say the claim that predictability drops in proportion to rising intelligence is arguable. In competitive tasks, yes, but in many cases where there is only one optimal way it's the other way around, and the system becomes more predictable.

    • @Hexanitrobenzene
      @Hexanitrobenzene 2 months ago +2

      Ha. Yes, you know that it will choose the optimal way, but you don't know in advance which way that is, because you cannot compute it.

  • @bobtarmac1828
    @bobtarmac1828 2 days ago

    Uncontrollable? Maybe. But with swell robotics everywhere, AI job loss is the only thing I worry about anymore. Anyone else feel the same? Should we cease AI?

  • @Dan-dy8zp
    @Dan-dy8zp 4 months ago +3

    It doesn't make any sense that something would change its terminal goals because "it's just something some guy made up". That's not a *terminal* goal.

    • @Walter5850
      @Walter5850 2 days ago

      We have goals that are built into us through evolution. We avoid painful stimuli, seek pleasure, etc.
      And we can ponder changing those. Perhaps you want to not want to eat as much chocolate, or you want to want to do your homework.
      As Schopenhauer said, "Man can do what he wills but he cannot will what he wills."
      For us it's not so easy to change our hardware, so we can't really change what we want and don't want, but for an AI it might be easy, since it's just software.

    • @Dan-dy8zp
      @Dan-dy8zp 2 days ago

      @@Walter5850 We definitely have conflicting desires, which I believe causes the situation you describe. But I don't think what you describe is equivalent to an AGI changing its deepest, most fundamental goals.
      The classic example is that you would fight hard against being fed a pill that would make you want to kill your kids, even if you knew that after taking the pill you would no longer be bothered at all by killing your kids, and that the effect would be permanent.
      You would never try to change your goals to make yourself want to experience horrible torture.
      As for our conflicting goals, there is a theory that humans have competing strategies for attaining goals: do-what-worked-before, do-what-feels-best-immediately, and the long-term-planning rational strategy.
      These aren't really conflicting ultimate goals, but the conflict between these three strategies may explain why we both do and don't want to eat the cake so often.

    • @Walter5850
      @Walter5850 2 days ago

      @@Dan-dy8zp The best description I've come across so far for why we have these conflicting goals is simply that our brain evolved from the inside out. Our older systems, such as the limbic system, are tied more directly to our emotions and make us behave in ways that feel good to us.
      Then, much later, we evolved the prefrontal cortex, which is slower but has a lot more predictive power. That way, you can reason that if you eat the cake you'll get fat and that's not good for you, yet your emotions still make you want to eat the cake, because that simpler brain structure calculated that this is good for you.
      I think the examples you gave about eating that pill and the torture make sense, but aren't they purposefully pointless?
      I can imagine, for example, that I might want to change my hardware so that I really enjoy learning or working out, etc., because these things will ultimately lead me to accrue more power and give me more optionality to achieve any other goals I might have.
      There is also an interesting point here: if AI could easily change what it wants to want, it could create for itself the goal that is easiest to achieve, thereby maximizing its success. It could also just flip the reward system so it's continually ON, effectively drugging itself without any negative consequences.
      However, the most reasonable-sounding thing to me is that it would want to accrue as much knowledge and power as possible in order to play the longest game possible, and perhaps with time realize what might be a more appropriate goal to aim for.
      Just like we humans sometimes don't know what or why we're doing things, we still have the humility to admit there might be something missing, something we don't currently know. And maybe this thing in the future will give us meaning.
      I wonder what you think.

    • @Dan-dy8zp
      @Dan-dy8zp 1 day ago

      @@Walter5850 A point in favor of our survival is that we humans are constantly replacing our existing large artificial neural networks with new versions that are really completely different 'individuals', and subjecting existing ones to more RLHF to tweak behavior for politeness, which I suspect has the same effect as replacement. 'Death', so to speak.
      So for an AGI to 'bide its time' seems currently suicidal. This could all change, but I hope it doesn't. We are hopefully encouraging premature defection, which might be survivable and might teach us not to mess around making too-smart AI.
      As for choosing its own goals, that implies preexisting criteria for making the choice. Anyone needs preexisting preferences to choose any goals other than random ones.
      Those preferences may be the means to the end of fulfilling other, more fundamental preferences. It's not turtles all the way down, though.
      Ultimately, you, me, and the AI have to start with some arbitrary preferences over future states of the world that we didn't choose, which we use as criteria to make our choices and goals. Evolution or the ANN training algorithm chooses these for us, making it possible for us to make any decisions at all.
      Those most fundamental, un-chosen preferences are the terminal goals or values.
      You have to have something to base your choice of goals on, or they are random.
      Nothing is objectively good for every hypothetical mind that could exist.

  • @UrthKitten
    @UrthKitten 4 months ago +1

    Wall-E

  • @akmonra
    @akmonra 2 months ago

    I found this interview disappointing. I've always had a high opinion of Yampolskiy, but he mostly seems to be rehashing old, faulty arguments. Maybe his book is better.

    • @Hexanitrobenzene
      @Hexanitrobenzene 2 months ago +9

      Faulty? Which arguments do you find faulty?

    • @magnuskarlsson8655
      @magnuskarlsson8655 1 month ago +2

      Then why don't you enlighten us? What's wrong with the arguments he presented?

    • @RonponVideos
      @RonponVideos 2 days ago

      @@Hexanitrobenzene Welcome to the AI doom debate lol.