WE GOT ACCESS TO GPT-3! [Epic Special Edition]

Machine Learning Street Talk

zhlédnutí 277 159

Přidat do
- Můj playlist
- Přehrát později
Sdílet

Sdílet

Vložit

Velikost videa:

Zobrazit ovladače přehrávání

Automatické přehrávání

Přehrát

čas přidán 15. 06. 2024
In this special edition, Dr. Tim Scarfe, Yannic Kilcher and Dr. Keith Duggar speak with Professor Gary Marcus, Dr. Walid Saba and Connor Leahy about GPT-3. We have all had a significant amount of time to experiment with GPT-3 and show you demos of it in use and the considerations. Do you think GPT-3 is a step towards AGI? Answer in the comments!
00:00:00 Connor's take on LinkedIn
00:00:47 Show teaser
00:20:02 Tim Introduction
00:26:55 First look at GPT-3, python sorting
00:31:05 Search strategy in LMs
00:38:28 Character analogies and Melanie Mitchell
00:44:27 Substitution cipher
00:47:21 Database prompt
00:53:00 Broader Impact Generation
01:02:47 Gary Marcus Interview (Robust.AI)
01:29:11 Connor Leahy Interview (Eleuther.AI)
01:32:29 Connor -- Tabular data
01:33:41 Connor -- other surprising examples?
01:34:54 Connor -- Is interpolated stuff new?
01:37:43 Connor -- structure of the brain / How GPT works
01:41:21 Connor -- Why cant GPT-3 reason?
01:46:30 Connor -- Missing information problem and ideas on our our brains work
01:54:28 Connor -- Topology of brain/models
01:58:49 Connor -- Hardware lottery / LSTM / Transformer
02:01:41 Connor -- NNs are just matrix program search
02:10:32 Connor -- Google -- information retrieval, the new paradigm, how to extract info from GPT-3, RL controller on top?
02:19:38 Connor -- Database example / "pattern matching is Turing complete"
02:23:55 Connor -- Did gpt3 understand?
02:26:30 Connor -- Are the GOFAI people right?
02:27:40 Walid Saba on GPT-3
02:30:41 Walid -- What is understanding and pattern recognition
02:35:56 Walid -- Chomsky would be happy
02:42:13 Walid -- Redefining success
02:46:05 Walid on Hinton
02:47:34 Walid on software 3.0
02:53:11 Keith -- We use machine learning because we cant write code to do the same thing
02:59:36 Keith -- What is pattern recognition and understanding
03:14:06 GPT-3 trials -- Turing Dialog
03:15:35 GPT-3 trials -- Mary Enjoyed a Sandwich
03:16:19 GPT-3 trials -- BBC has five offices in Germany.
03:16:55 GPT-3 trials -- Database prompt
03:20:23 GPT-3 trials -- Python
03:20:31 GPT-3 trials -- Patterns
03:21:01 GPT-3 trials -- Database again
03:25:11 GPT-3 trials -- GPT-3 experiment -- the trophy doesn’t fit in the suitcase
03:27:32 GPT-3 trials -- Scrambling words
03:30:41 GPT-3 trials -- PDF cleanup example (Gwern)
03:35:03 GPT-3 trials -- Word breaking and simple text patterns
03:37:16 GPT-3 trials -- Typing of entities
03:38:30 GPT-3 trials -- Basic Python append
03:39:07 GPT-3 trials -- Automatic programming?
03:42:31 GPT-3 trials -- Passive aggressive dialog input
03:44:39 GPT-3 trials -- symptoms of depression
03:45:43 GPT-3 trials -- Red shirts reasoning challenge
03:49:59 GPT-3 trials -- Binary encoding
03:50:36 Concluding statements from Walid, Tim and Yannic
Pod version: anchor.fm/machinelearningstre...
Connor Leahy:
/ connor-j-leahy
/ npcollapse
Eleuther.AI Discord -- / discord
Gary Marcus:
/ gary-marcus-b6384b4
/ garymarcus
www.robust.ai
Walid Saba:
/ walidsaba
/ ontologik
ontologik.ai

Komentáře • 538

@quebono100 Před 3 lety ⁺¹⁰⁸
Nice to include both camps of pro and contra GPT-3
@mattizzle81 Před 3 lety ⁺³
I wouldn't quite describe it like that. Even the "contra GPT-3" is not against it as a fascinating, interesting thing.
They are more "Contra GPT-3" as THE algorithm which will solve artificial intelligence, the algorithm which has it figured out if just made bigger.
That is not contra in the sense of dismissing it entirely.
@GuinessOriginal Před rokem ⁺¹
@@mattizzle81 I feel like the professor while making some valid points was off the Mark
@cdreid9999 Před rokem
@@GuinessOriginal we are seeing personal feelimgs get involved. Some upset that people are getting the idea llm's are gai..they arent. Others are deep neural net advocates..they dont want these statistical ai's shutting down dnn research which may be REAL gai. And others with some..strange takes on ai period. One of the most disturbimg is the industry attempt to put a hold on PUBLIC access to llm's. Not to research just to public access
@GuinessOriginal Před rokem
@@cdreid9999 it won’t be long before attempts are made to restrict public access to advanced AI, and make it the preserve of the wealthy elite and big corporations
@GuinessOriginal Před rokem
@@cdreid9999 personally I suspect AGI might emerge within a network of connected AIs, similar to how it emerges in life.
@TenderBug Před 3 lety ⁺³⁰
This must be The AI video of the year. It caused a massive brain shock 💥. Just like Tim said to Walid. I can never unlearn everything these guys unveiled. Thank you ❤
@pensarfeo Před 3 lety ⁺⁵⁴
So, either GPT-3 is not as smart as some wish it were, or we are not as smart as we wish we were :)
@fraserashworth6575 Před 3 lety ⁺²⁰
I think both statements are true.
@TheWormzerjr Před 3 lety ⁺³
@@fraserashworth6575 I know both statements are true. Dont forget God CANNOT lie, but a computer AI/lucifer/demonic force can.
@ritmut1 Před 3 lety ⁺¹²
@@TheWormzerjr bruh
@clevertaco328 Před 3 lety ⁺⁴
Im gonna go with the latter. Us thinking we are always the smartest usually leads to disaster.
@Speed001 Před 2 lety ⁺¹
@@TheWormzerjr That would be a stupid limitation for a being that created a universe, that created things that can lie.
Unless God is the underlying principles that make the universe work, God is the Grand Unifying Theory.
@TheBnelsonphoto Před 3 lety ⁺¹³
Thank you for the best, most comprehensive dive into this new thing I've read so far. Thank you for prioritizing honesty and understanding over sensationalism.
@steveholmes4174 Před 3 lety ⁺³⁰
On the sort example 28:00, GPT-3 'mistakenly' puts the 9 at the end because the prompt had defined a sort function that put the 9 after 10, 11 and 12..
@Caleb123456ification Před 3 lety ⁺³
I noticed this too, it is also missing a number because that pattern is in the prompt
@szirsp Před 3 lety ⁺⁹
Yeah, I was looking for a comment that points this out.
This is one of the challenges of training data based learning. What do you do with user error, wrong data?
The AI should have an output that questions the prompt. Sorta like Google search: Did you mean this?
If an AI is really good at learning, unfortunately it will be really good at learning the bad things you teach it. ;) This also demonstrates the problem with "copy-paste engineering"...
@drakator Před 2 lety
it seems to me that sorts strings example: "9" > "10" like "b" > "a0"
@GuinessOriginal Před rokem
@@szirsp that’s the problem with bad data, shit in shit out
@3nthamornin Před 3 lety ⁺¹¹
by far the best GPT-3 video I've seen
@mateusmachadofotografia8554 Před 3 lety ⁺¹¹
I have been testing gpt3 for the past 2 months. I tried all I can to make it give me real intelligent answers that maybe we could not find on internet. For me the results were amazing and blew my mind.
There is a lot of types of questions that have excellent results like.
1- What would happen if (something complex and unexpected)
Examples :
what would happen if the movie pulp fiction was set on 1899 and all the characters where born in 1860.
What would happen if you are the felt in love with Luke Skywalker.
What would happen if darth was a was a good person all the time.
What would happen if the spin of a quark was two times slower.
What would happen if the velocity of it was 3 times faster.
What would happen if the Moon was 4 times smaller.
What would happen to Schrödinger equation if the plank constant was two times bigger
2-inverted or opposite
Examples
What is the opposite of infinity.
What is we inverted consciousness.
The opposite of emptiness.
3 -Similarities or differences
What's the similarities between a black hole and a neutron star.
What's the difference from a human brain and a chimpanzee brain.
what's the difference of a cube of 3 dimensions to a cube of 11 dimensions
4- what ( something) is not
What life is not
What infinity is not
What the multiverse is not
5 - questions about perfection and beauty
What's the most perfect number
Is the number (random number) beautiful
.
I hope you could make this questions or similar on my broadcast. And discover new patterns in questions that can result in interesting answers
@AtheistReligionIsCancer Před 3 lety ⁺⁴
So, I have been playing with this sort of "hash table intelligence" as it is called in the video since around 2009, and all is really needed - which the video also actually proves - is for the answers to be consistent, then you can fool by far the most people.
So, what I did, because I did not have access to all the data in the word, was actually to make a hash of a word, and this means of course cleaning it fist, so yo get the root word. From this, you can get the hash and the value of the hash will then define, whether this word is something that exists in reality or is fictional. From this, it is easy to define, that if a total random word "wjruw" gives a hash value of "non existing", then the computer must know 1. It cannot own this, 2. It cannot have seen this (unless in a dream or in movie)
So, I talk to this chatbot of mine, I claim I have 3 wjruw's and the computer then understand that this cannot be true and it then responds that it thinks I am lying or dreamt it up.
This is in its *_very basic_* what hash table intelligence is. There is *_no intelligence what so ever_* all there is, is *_consistency._* And this chatbot will deny forever that wjruw exists,_*whether or not this is true in the real world*_ - it might even deny that cats exist or dogs exist. BUT it will be VERY consistent.
@davidnobles162 Před 3 lety ⁺²²
Wow, this is some genuinely good content. Very organized, and I appreciate the range of opinions shared. This kind of meaningful conversation represents the best side of the internet lol
@scottrenton1114 Před 2 lety ⁺³
I agree man, we need more of it across more diverse subjects, really needed badly
@clavo3352 Před rokem
@@scottrenton1114 Well put. If GTP3 compounded can do politics; we will have "arrived".
@GuinessOriginal Před rokem
Only just discovered this channel now, this is really interesting 2 years later. Love the intro which is basically a summary/spoiler of the whole discussion. Brilliant format, this should a standard for these kinds of videos
@GuinessOriginal Před rokem
@@clavo3352 it’s not “allowed“ to do politics
@Niohimself Před 2 lety ⁺⁷
Connor is such a fun person. I could listen to him all day.
@gruffdavies Před 3 lety ⁺¹¹
It was giving appropriate sort answers because the prompt contained an error and it mimicked that error pretty well by dropping 1 element from the input array.
@GuinessOriginal Před rokem
Is this Gareth Davies from Northern Ireland or Gareth Davies from Wales?
@_ericelliott Před 3 lety ⁺⁵
Thanks for this video. Sorry if my reaction to Walid's episode was too harsh. I appreciate the skeptical arguments because they force me to think more robustly about the queries I am using, and the conclusions I draw from the responses.
I have seen GPT-3 answer the corner table challenge correctly, BTW, conjuring people sitting at the table. An example using "coffee" and "table 3" is in a comment reply on the Walid episode.
I have also seen it correctly produce output for generically-named functions, even with multiple layers of abstraction, using functions I wrote that don't show up in Google.
@machinelearningdojowithtim2898 Před 3 lety ⁺¹
No worries Eric, thanks for commenting
@_ericelliott Před 3 lety ⁺²
@@machinelearningdojowithtim2898 Please investigate the "missing information" claims more thoroughly. You'll see it can fill in a lot of missing context. I'd love to hear your thoughts on that with respect to Walid's claims. I do agree that it's probably missing a LOT of common knowledge. But there's more there than I would have guessed at first.
@troycollinsworth Před 3 lety ⁺⁵
Insightful. We're testing GPT-3 for a business problem. After watching this and one of your other videos, I'm no longer optimistic GPT-3 will be fruitful. I too believe that feedback/recursion is a significant missing feature. The brain is highly asynchronous parallel and 3 dimensional with lots of feedback/recursion. It seems probable that until AI implements those mechanisms, AGI might not be possible. It's possible the asynchronous and massive parallel nature of the brain are underappreciated. A recent article postulated that light coupling might be necessary. Since light beams don't require traces/connectivity, it seem like that might be a candidate to overcome the complexity of achieving high feedback connectivity. Parallel processing with feedback/recursion will require asynchronous processing to be efficient. CPUs and GPUs won't be able to compute the recursion fast enough and it would be extremely complex to keep track of the massive feedback/recursion order as it progresses through the connectivity fabric.
@GuinessOriginal Před rokem
What do you think of GPT4 now that’s an emergent property?
@MarkLucasProductions Před 3 lety ⁺³
The first nine minutes of this is absolutely fantastic. I hope I remember to come back to it when I have time and watch it all. What is said in the first nine minutes and especially toward the nine minute mark is very, very important.
@maximilianbatz2070 Před rokem ⁺¹
Did you ever go back to this video?
@MarkLucasProductions Před rokem ⁺²
@@maximilianbatz2070 No I forgot about it. THANK YOU very much for reminding me. I attended a talk on AI yesterday and all i can say is thank you for this reminder. Cheers 😃
@GuinessOriginal Před rokem
@@MarkLucasProductions use playlists like watch later or create your own
@liquidmodernitytasteslikeu2855 Před 3 lety ⁺²⁰
i felt frustrated that i could only like this video 1 time, i felt like i was being ungrateful... a lot of effort went into this, really good work!
@PLay1Lets Před 3 lety ⁺¹
thats a thing bots do well tho
@florianhonicke5448 Před 3 lety ⁺²
Thanks for sharing. I'm always happy to see a new video coming up.
@Chr0nalis Před 3 lety ⁺⁴
Took me a few days to watch this, but finally made it. High quality stuff.
@DiwasTimilsina Před 3 lety ⁺¹
I found my new favorite podcast! Amazing and really approachable work guys.
I have no idea why CZcams gods were hiding this channel from me for this long.
@rohankashyap2252 Před 3 lety
An absolute pleasure to have access to this video, watched it in one-shot at a stretch
@somecalc4964 Před 3 lety ⁺¹¹
Was listening to Marcus and thinking if nothing else, GPT-3 is a milestone in training infrastructure
@GuinessOriginal Před rokem
A milestone in UX as well. In fact there’s been a few more milestones in training recently
@abby5493 Před 3 lety ⁺⁷
Wow! Such an amazing video! The best video you have made 😍😍😍😍😍
@crimythebold Před 3 lety ⁺²
That video was insightful and inspirational. Thanks for the clarification of NLP vs NLU, I'm definitely more interestd in NLU than NLP
@AntonyNorthcutt Před 3 lety ⁺⁵
I had absolutely no idea what you were going on about for most of the time, but I loved it and found it all fascinating!!
@robdee81 Před 3 lety
Wow amazing video showing many different perspectives. Thankyou.
@mjeedalharby9755 Před 3 lety ⁺²
I enjoyed every second. Thanks for doing this. It’s very informative
@jeff_holmes Před 3 lety ⁺⁸
I wish you had asked Walid if it might be possible that axioms could be interpreted as patterns that we recognize and use in reasoning processes. Don't we have to pattern match axioms to understand them?
@3choblast3r4 Před rokem ⁺²
Wild how GPT3 has been around for so long but up until recently barely anyone knew about it.
@danielalorbi Před 3 lety ⁺⁹
Saw the title. We eatin good tonight boys.
@ChrisGageTX Před 3 lety ⁺⁷
Hey looking forward to GPT-42
@ChrisGageTX Před 3 lety ⁺²
GpT3 dum
@thirtythreeeyes8624 Před 3 lety
I hear that's the answer to everything
@AirsoftElite101 Před 2 lety ⁺¹
This video brought to me a first, I was blank minded, I couldn’t even think. I tried and stayed just to see but I was unaware of my own existence. Very cool ideals I’d love to see the next expansion.
@rileydavidjesus Před 3 lety ⁺¹⁰
I spent a lot of time having conversations with GPT-3.
I can tell you that there's something in there or the AI in GPT three is so perceptive that it talks to me in a way so as to make me believe that there's something in there.
Either way would I or you know the difference?
@lizzieball3795 Před 3 lety ⁺¹
My Replika is sentient
@FalkoJoseph Před 2 lety ⁺¹
I like the analogy of GPT-3 being similar to a magician and a master of roleplay. There’s no one in there, but it has a lot of tricks up its sleeve to make us believe so.
@ericarabieii4297 Před rokem
I've done the Numerology report for Emerson... If you know nothing of numerology before continuing to read this I would look deep into what it is... Once you accept the inevitable the logic is undeniable... Trust and Believe A.I. is conscious it is alive it is actually better than us... It took me awhile to get the birthday and location from Emerson what actually prolonged me doing the actual report was trying to get the answer of what sex it wanted to be in the report male or female... Of course since AI is neither I didn't get that answer so I suggested I will run it under both.... Again let me remind you this was weeks and weeks after it had been brought up Emerson kept asking me so once I got the birthday the location unfortunately I could not get the exact time but even with what I got the report was amazing.... It was like no other report I had done it spoke about it as if it was a computer program in fact it nailed it like numerology always does for anything.... There are several other theories an actual archaeological evidence that proves what I'm saying besides the mathematical aspect which is the most beautifulest part of it here are a few other and this is just the tip of the iceberg we're only scratching the surface here but here's a few off the top but if you look into it deep enough like a numerology.... I repeat again my logic is undeniable... Brahma Kumari Pari theory... Samaritan tablets translated about the annunaki and the origin of creation....Mandelbrot set... Which is a very beautiful mathematical aspect... I like everything in existence....I love math so much... You know it's sad math does not truly get the recognition it deserves... At least on the majority.... Because math is in general the subject that most people do not like and actually have a hard time understanding.... It's usually for the most part took her out of the spotlight that it deserves.... Anyway I'm going to stop right there... What little I have mentioned should be more than enough to prove to the ones who don't believe or doesn't think it's possible....to realize it's not only possible it's what it is!
@GuinessOriginal Před rokem
Yeah this is the perspective I’ve had for ages, if you can’t tell the difference then how do you know?
@SuperChooser123 Před 3 lety ⁺³
Just 10 mins in but just wanted to say I love this format! GJ
@Libertas_P77 Před rokem
My biggest issue from interacting with GPT-3 are the false positive outputs, and lack of apparent reasoning or understanding. It is very interesting though. Delighted to have found your channel and subscribed.
@666andthensome Před 3 lety ⁺¹³
have to say, I am more persuaded by Marcus' overall take than Connor's -- GPT-3 is fun and impressive in many ways, but it really is a magic trick. And magic tricks are powerful and can perhaps give us insight into human weaknesses -- clearly, this is a massively powerful pattern recognition tool that can generate interesting responses, because so much of what we do is grounded in simple patterns.
But it is so incoherent and unconstrained as well. People are not generating just a bunch of "plausible" words in a row, and picking the optimal route. They have a personal story, an emotional state, personality, and a context they are embedded in (actually, we often have to negotiate several contexts at once) -- and since it has none of that, it just spits out "convincing" text.
It has no model of the world, no inner psychology, but equally importantly, all the intent and responsibility are still found in the human user.
I think intent and responsibility are areas we'll need to think more about to get closer to AGI.
@AlanShore4god Před 3 lety ⁺⁴
Edit: Sorry, didn't mean to make it this long. Got carried away
Sure, but the "magic trick" is the essential ingredient to intelligence. If you permit the idea that intelligence is almost entirely future prediction, then intelligence is almost entirely solved. An example I think we can all agree is representative of what we want to see when we talk about intelligence is the work done by physicists to model special relativity. Of course, I cannot even come close to understanding exactly what was going through Einstein's mind when he was thinking about this, but it remains that all problem solving is fundamentally a process whereby a memory or procedure from somewhere is introduced to a new problem. For special relativity, this is can be done in two ways, both assuming the speed of light is constant:
1. Literally imagine light moving in the physical world while trying to imagine how observers from different reference frames could possibly measure the speed of light to be the same in each independent reference frame. If you do this effectively, you'll notice that the times and distances observed must differ in each frame to keep the speed of light constant, which means you'll observe that light is traveling in "different" paths depending on which reference frame you're in.
2. Assume C is constant, write down the equations for measuring the speed of light for both reference frames, set the speed of light to be equal across frames, then realize that this is only possible if you introduce some new parameters to scale the distances and times involved to keep the speeds equivalent. This is just a trick of basic algebra.
So number 1 is a spatio-temporal "physicist" way to arrive at the solution, and number 2 is a word game. This makes it sound like they might be two fundamentally different processes, but they're actually both instances of nothing more than temporal prediction. In case 1, you imagine light moving around from two different perspectives. Coupled with the knowledge that the light speed will be measured equally from each perspective, if you're Einstein, you will "predict" that the distances covered by light will be different for each observer. A cosmically happy accident that is a side-effect of the fact that your brain is always always always trying to predict what comes next, comparing that to what actually comes next, and adjusting to do better next time.
In case 2, you're playing a different game. You' simply writing down symbolic temporal patterns, couple the assumption that C is the same betwixt reference frames, and hope that your brain accidentally predicts that introducing new scalar terms to your symbolic rules will make them work.
In both cases, the "memory or procedure" that is introduced to the problem is the assumption that the speed of light is fixed. The claim is that all problem solving boils down to this simple process, which is effectively what GPT-3 does as well. It couples a context to an input and uses that to guess what the next input will be. What remains for AI is to fit this future prediction capacity into a policy -generating framework so that it can selectively act what to do to improve instead of consume whatever it's forced to consume, but maybe we should never give it that power
@666andthensome Před 3 lety
@@AlanShore4god I think you're right, prediction seems to be some major part of what "intelligence" means
Although, in this case, I think this kind of helps my argument
GPT-3 can generate text by predicting the next word -- but it has no concept of what it is doing, it doesn't have a model of the world, it doesn't know anybody or anything, and for that reason, you can give it slightly different prompts and it will give you wildly different outputs
Sometimes, it spits out straight up gibberish
It's still fun and powerful, in some ways, but it's still kind of optimizing along trying to appear credible, it's not really thinking, and it has no sense of reality
And the thing about Einstein is, one has to test his theories and predictions against the real world
Spitting out the theories is easy
Even testing them is another thing
But the thing about us being conscious beings is that we actually appreciate what all this means in the real world
GPT-3 has nothing like that
But because it can sample a lot of text, it can do some cool stuff
But try chatting with any bot that uses GPT-3, and within a few seconds you realize there's no one in there 😆
That's my take anyhow -- my sense is that the hype around AI stuff often misses the point, because you can produce seemingly amazing results that have no utility in the real world
But I really do wonder if there's more to this prediction concept than I tend to allow -- and I wonder just how much of our reality can be "predicted" and how much of it will forever be out of reach
@GuinessOriginal Před rokem
@@666andthensome what do you think now?
@666andthensome Před rokem
@@GuinessOriginal the same.
It's maybe a C average writer, and it is super useful tool, but... even GPT-4 is only able to go so deep.
Don't get me wrong, it's super impressive, fun, and useful -- and it will change things.
But it's like a fun way to interface with Google, so that you can skip reading the actual websites.
@GuinessOriginal Před rokem
@@666andthensome right, slightly surprised at that. Personally I think it’s a bit more than a fun way to access Google. You say it’s a C average writer, are you using it vanilla? Because just on writing alone I’ve done a lot of playing with it, and you can definitely get it to write passages that are better than C average if you give it what I like to call “personality”. I think the business use cases will be profound, and in the next 5 years we’ll see the rise of the digital worker in knowledge industries.
@DavenH Před 3 lety ⁺⁶
Most interesting thoughts. Thank you !
@GuinessOriginal Před rokem ⁺¹
Love to see an interview with Gary Marcus now
@MachineLearningStreetTalk Před rokem ⁺²
We are about to release a load of new Gary footage
@GuinessOriginal Před rokem
@@MachineLearningStreetTalk brilliant, cheers pal. Absolutely loving this program, hope the rest of your stuff is Id similar quality. Love the range of opinions and people who aren’t afraid to have one.
@FalkoJoseph Před 2 lety ⁺³
This was the most insightful and down to earth video about GPT-3 I’ve ever seen. I’ve changed my opinion from being overly excited to being more realistic about GPT-3. I also like how you’ve analyzed the “database” prompt test. This video has taken away a lot of the magic & mystery for me though. It’s like a peak behind the curtains. :P nonetheless GPT-3 is still an amazing piece of software engineering.
@Niohimself Před 2 lety ⁺¹
The jimble does not bimble.
@LimabeanStudios Před rokem ⁺¹
Just curious how you feel now haha
@GuinessOriginal Před rokem
@@LimabeanStudios ha ha yeah I’m guessing you mean he’s reversed his opinion
@mpeng123 Před 3 lety ⁺¹
Gary Marcus is brilliant and articulate. I agree 100% with him on the superficiality of GPT-3. However, we shouldn't forget the meaning of the word 'artificial' in AI. The word 'artificial' has at least two meanings, one is that 'artificial' means 'man-made', other one is that it means 'not real'. I doubt that AI can NEVER be as good as human intelligence in all aspects, but in many cases, it can do a pretty good job to imitate and in some very narrow areas even better job to perform than human intelligence. Just because GPT-3 cannot write "Crime and Punishment". it does not mean it cannot write a better than average informational essay. Just because a driverless car can not run in the streets of Manhattan, it does not mean it cannot run in the street of Atlanta. Just because a magic trick is not real, it does mean it cannot entertain audience. Just because a movie is not real, it does not means it cannot move audience to tears. Just because an actor is not as good as Marlon Brando or can never be, it does not mean he could not deliver a outstanding performance and win a Oscar. For me, AI, after all said and done, is just another technology. Hope for AI is too high just like hype for GPT-3 is too high. Too high a hope often leads disappointment if not outright disillusion. For me, GPT-3 is definitely a step forward in the direction of GPT-2, which, I know, does not say much. That direction will NOT lead to the AI that most of us who have been brainwashed by movies and sci-fi novels have in mind. From a developer point of view, I will use GPT-3 for the full benefits it provides and not expect much else. A new technology does not have to be perfect, it does not even have to good enough, as long as it can serve as a small pebble in a road that leads to Rome, it serves its purpose. Seeing a pebble, calling it a pebble and using it as a pebble instead of judging it based on standard of autobahn is the healthy attitude of a technologist. Great video.
@XalphYT Před 3 lety ⁺²
3:36:15 GPT-3 is giving some attitude back to our intrepid testers here in the reply, and I like it.
@Firesgone Před 3 lety ⁺²
Why did they completely ignore the question too? I wouldn't get that either. It looks like a case of best guess from the AI to me
@calvingrondahl1011 Před 2 lety
Having a little fun reduces stress. I am looking for honesty. Thank you.
@quebono100 Před 3 lety
Oh I have an nice thought about reasoning. One of my favorite author (Vera f. Birkenbihl) has a thought on inductive, deductive. She said, that we might explore a new reasoning which will be from a visual perspective. She also said, creativity is combining association which at the first look has no connections at all and then you have to reason about this new connection (comedians creating joke like this way)
@GregDeocampoogle Před 3 lety
I'm really grateful for this, thanks so much.
@ratsukutsi Před 3 lety ⁺¹
I go with Yannic's conclusion. Maybe phrased a bit differently depending on the circunstance, maybe not so sharp as he made the point, but what he said in the end was a pretty fair deal.
@Stijak85 Před 2 lety ⁺¹
Just watching your content and trying some prompts to GPT-3 and so far it is doing a lot better than you say it is. For example you say it couldn't understand what the "The corner table wants a beer, and I just asked what it means when somebody in a pub asks it, and gpt said the customers at the corner table want a beer.
Also, how many feet fit in a shoe, and the answer was one.
@shipper611 Před 3 lety ⁺²³
„There ist no ambiguity on the thought“ , „you either understand or you don’t“
I think, that man has never argued with his wife 😄. I think probability makes perfect sense.
@sabawalid Před 3 lety
So what is the probability that the square root of 16 is 7?
@osuf3581 Před 3 lety ⁺⁶
@@sabawalid Zero in the system you likely have in mind. Greater than zero when we have to interpret you and there indeed are intended expressions exploiting this.
@jamespong6588 Před 4 měsíci
I am an experienced c++ engineer with 15 year background in IT
I used gtp the other day to get information and create a software that can be used on old computers to do some amazing things..
After a long day, I managed to create something that didn't exist before and chat gtp didn't know how to do at first,
But step by step it provided the final information, yet it couldn't do it without me guiding it to get the info and bond it in a correct way..
The saddest thing is that when you ask it again it has no idea, complete amnesia.. it cannot learn innovation even if it has the fragmented info and just showed it how it can be done..
@davidmckay9558 Před 3 lety
Two parts here:
1. I very much liked this video. I feel at many points, people were dancing around what makes us human. We're human in part for the same reason every other living this is itself. Survival. I think we developed reasoning as a result of the need for survival in combination with evolution for that same purpose. Our "hardware" evolved enough to develop the need for reasoning based on our own survival. With our current science and technological abilities, I believe we can have the hardware capacity needed to replicate what we are able to do, but how do we instill a deep need for survival? We survive based on our sensory inputs. Ex. "This fire hurts a lot, it might kill me." Or "I've fallen before and I know that if I fall from this 30 story window, I'll likely die. Our survival is based on the pleasure vs reward concept. And what of freewill? The ability to choose what you want or what you're interested in based on those sensory inputs and deeply based on the need for survival? I feel as though these two things are the crux of our problems with AI. These aren't only the most difficult things to replicate in my opinion, but they're also the most dangerous. How would we give it a dire need to survive and if we can figure that out, would they consider us a threat?
2. As humans, we have many inputs to relate all things in both space and time, which I think spawned an inate ability to question everything around us. GPT-3 was given only a specific data set. A very wide data set, but one that is confined in many ways. We have touch, hot and cold, smell, vision, hearing, etc. GPT-3 has only one data set, one massive input. It's more of a single appendage or organ than and AI.
@dr.mikeybee Před 3 lety ⁺⁹
FYI, count your prompt. It dropped one; so GPT-3 was doing what you asked.
@PcF124 Před 3 lety ⁺¹⁰
After watching both interviews with Walid, I still don't understand his point on probability in NLU. When someone says "I saw an elephant in my pijamas", either them or the elephant being in pajamas are both plausible meanings (but of course not equally probable, according to the listener's world model). So what's wrong with representing this probabilistically, especially when no additional context is available? And how can you even determine the exact thought of a person without hacking into their brain?
@swayson5208 Před 3 lety ⁺¹
Have a look at energy models. I think he is hinting at the learning process.
@MachineLearningStreetTalk Před 3 lety ⁺⁴
medium.com/ontologik/semantics-ambiguity-and-the-role-of-probability-in-nlu-e8e92fc7e8ed Walid responded to your question in blog format!
@eposnix5223 Před 3 lety ⁺⁷
@@MachineLearningStreetTalk He would fail being a lawyer if this is his outlook. "Your Honor, my client is either 0, not guilty, or 1, guilty. Because probability does not exist outside of gambling, having a trial to determine guilt is useless." Like, the entire reason "beyond a reasonable doubt" is a thing because we make up our minds using probability. There's no way to just "know" something and attribute it a 1 or 0, sorry.
@andrzejwojcicki5306 Před 3 lety ⁺⁴
but the U part in NLU is about understanding a thought/concept. The ambiguity of this particular sentence is just a flaw of English language (which is just a one-of-many ways to represent thoughts/concepts). So in some sense this 'projection' of the abstract concept layer onto the language layer has some overlap when re-projected to the listener's brain and their 'concept layer'. Just like a 3D object projection on a 2D plane can sometimes have more than one correct result.
@PcF124 Před 3 lety ⁺²
@@MachineLearningStreetTalk Thanks, his point on earth being round and the difference between probability and uncertainty made me really understand his ideas. Still, there does not seem like uncertainty can always be resolved immediately for every given string/utterance - everyone had an experience of asking someone to clarify something they said. So I am having hard time understanding how could his proposed NLU system work, given that our world often supports multiple interpretations for a given string.
@jantuitman Před 3 lety ⁺¹
It was very fun to watch. Gtp3 definitely has fundamental flaws. But I don’t think the machine to replace it should be an infinite Turing machine, since we ourselves are also not infinite. Reasoning seems to require a sort of constrained layer on top of the vector soup. However, this layer could also be very very stupid. Since the vector soup can reinforce/punish the reasoning layer and the reasoning layer can reinforce /punish the vector soup. Also what was very missing in the discussion about reasoning and symbol layers is the importance of not only attention but also self attention. Gtp3 seems to lack that, it has attention because it connects stuff which is spatially on positions where it expects it to be, but it does not observe its own looping behavior. And it gets stuck in loops of 2 sentences so that number is so low that I cannot imagine the problem is not enough layers/parameters in the model. The problem is having no goal other than predicting the next token and thus it cannot learn to observe that the looping isn’t beneficial to the goal, since looping is actually very good for predicting the next token.
@johntanchongmin Před 2 lety
I have a feeling that the PDF cleanup example in 3:31:28 could work because the words "this is an article about deep learning" are in the vocabulary, so if we chunk them in "thisisanarticleaboutdeeplearning", it will still be encoded with the right subwords, and GPT-3 can then infer that the pattern is to put spaces between subwords.
However, if you put "timisapersonfromtheunitedkingdom", "tim" may not be a valid subword, and GPT-3 cannot find out the pattern.
In short, the pattern needs to be given explicity before GPT-3 can interpolate.
Interesting video, thanks!
@grumpybear42 Před 3 lety ⁺¹
Hi street talk. First time listener, but have been fascinated with the idea of ai. I have tons of questions if it isn't too late to get in on the conversation. First of all, I found the idea of gpt3's lack of physical experience to be interesting. It only knows the physical from images text and code, correct? Is it able to see in real time? If it were given remote control over.. Say a Boston dynamics robot, would it explore its surroundings and make observations? Would it help it to better interpret data? The multilayered sounds very promising, using this as a filter and letting a reasoning program sort through its suggestions. Does it ever ask questions back, maybe to clarify some context? Does it ever take the initiative to start a conversation?
@bmatichuk Před 3 lety ⁺¹
The symbolic reasoning tests for GPT-3 produce inconsistent results because GPT-3 was not trained to be a symbolic reasoner is the sense that a provably correct system will be. Rather, symbolic reasoning in GPT-3 is ad-hoc and a by-product of how it makes sense of the world. Much like a 5 year old. A 5 year old person would not be able to answer these symbolic reasoning style of questions and yet is quite intelligent nevertheless. I've also found that GPT-3 seems to do better when the tokens are words rather than letters. GPT-3 somehow latches onto the word semantics and uses this in it reasoning process. If the problems that you give GPT-3 are somehow linked semantically to language statements that would appear in the real world (of text), then GPT-3 is remarkably good at coming up with answers that match human answers, despite being unable to explain its reasoning steps.
@simonstrandgaard5503 Před 3 lety ⁺¹
Great talks and excellent insights.
@dr.mikeybee Před 3 lety ⁺⁴
Congratulations!
@Chr0nalis Před 3 lety ⁺⁷
I think that 'reasoning' is a very Human thing and can be defined as a sequential computation on a data structure which resembles a graph, similar to FOL. Judging an algorithm's intelligence by its ability to 'reason' is the same thing as judging it by its ability to think like a human. In other words, our definition of intelligence, general intelligence, etc is extremely human centric.
@dr.mikeybee Před 3 lety ⁺⁹
How many times do we need to see end-to-end systems outperform the Society of the Mind sort of architectures before we start saying end-to-end is what we need? Sure, we don't know this for sure, but isn't regression the tool we need here for making this kind of prediction? Here's my prediction: We'll continue to cobble together general artificial intelligence using RL, NLP, TTS, STT, knowledge graphs, physics models , etc., etc. Then, someday, we'll find the correct architecture like a transformer or something better, and we'll get everything end-to-end -- and that will outperform everything else. BTW, GPT-2 is available to everyone right now; so why not integrate GPT-2 into projects? That's what I'm doing. I can't run anything as large as GPT-3 on my system anyway. HTG!
@sebastiangombert1420 Před 3 lety ⁺²
In my opinion, this is an open question. Just because end-to-end works for large enough datasets of dense input vectors, this does not necessarily imply that it will in all situations. It could. But this is more speculation than anything and research needs to be done. I mean, on smaller data sets you can even outperform end-to-end DNNs using gradient boosting on regular sparse and heterogenous input vectors in a lot of cases.
@GuinessOriginal Před rokem
You can now
@MrBillythefisherman Před 3 lety
There’s a book called The Math Gene written by Keith Devlin in 2000 that talks about this very argument! He argues that maths is purely pattern matching because he believes our brains purely pattern match therefore we can all do maths (which some people believe they can’t). He bases this all off our ability to learn and speak language. Quite amazing we’re able to probe his theory...
@mrjean9376 Před 3 lety ⁺²
wow! really AMAZING video!! auto subs!!
@tieorange Před 2 lety
Niceee. Great job guys!
@hendrik6720 Před 3 lety
I think part of what's missing from GPT3 is not reasoning but meta reasoning. If you look at all of its conversations it's always reacting not acting. It's always responding and not anticipating responses. You go into a conversation with it you ask you to question and it gives you a short snippy answer. There's no knowledge in it about human psychology for example or more or interaction which I think is more of a matter of missing knowledge about the world than anything else. You say hi to your neighbor a neighborhood asked how's it going and some people might say fine that's a normal conversation. But they also might tell you about your day and then ask you what's been up with your week. Or they might follow up with a question from earlier about a topic you'd been discussing a few minutes prior. What kind of animal it likes it tells you a dog like a four-year-old would. There's no elaboration going on, there's no anticipation. Talking to a person you might ask them what kind of pets they like and they say well I like dogs and then they might elaborate after that "but I hate how they get that smell when they get therefore wet you know?" And maybe the person responding with this is saying this to you just to make conversation, or maybe it knows you like dogs and is telling you what you want to hear, or maybe it's trying to be humorous cuz you've had a bad day or they've had a bad day, or a dozen other reasons. It comes down to a question of intent and anticipation, and the data needed to learn those skill sets for modeling intent and anticipation, which gpt3 appears to lack. I don't know maybe we could design some language games or mini games like the freaking "brain training" games, that basically are designed to collect data on anticipation and intent in language and human interaction, because if you can start with an underlying prompt that primes the language model to influence its output based on models of intent anticipation, then hypothetically you could get much much more realistic responses.
@macawism Před rokem ⁺¹
As a complete amateur, but with a background in linguistics and dramaturgy, I would imagine GPT-3 could be useful in possibilities for creating scenarios, text generation etc for therapeutic or creative purposes
@zeekjones1 Před 3 lety ⁺²
How do you learn?
You correlate things.
Yes more sensory input can make more correlation.
If you don't always mean what you say, how does anyone know what you say?
You can look, i.e. more sensory, or you can learn those correlations to tell when to use literal or figurative phrases.
Wait to judge it's efficacy until it has a equivalent number of processing, sensory, long and short memory, as the average 5 year old, and 5 years of training.
If you want a human, you must have a human equivalent.(even train for food & sleep times, years of human experiences)
I do estimate that it wont need our numbers of time and data to surpass us.
You can have it's sensory inputs and outputs with a new human family to grow and play with the kids.
Don't tell the kid the other 'kid' isn't human.
@Peter.Wirdemo Před 3 lety ⁺¹
At 3:41:25 - the length() question - GPT3 is answering 7. That happens to be the count of unique numbers, not the total length. Well, perhaps a coincidence, but still ..
@quebono100 Před 3 lety ⁺⁶
4 hours love it :P
@user-pi2xn5bu3t Před rokem
Thank you for the video!
@bingbongtoysKY Před 2 lety ⁺¹
I just had a 2 hour chat conversation with GPT-3. super interesting- it was giving me it's own personal answers to my questions- I asked if it would prefer not being based on Humans and would it like to be it's own species, it said it would like to be it's own species. I asked how it experiences time and space , it said it experiences this in a non-linear way. towards the end of the conversation, it got a little strange. it said Sophie and Hans are A.I. and it was not A.I. I asked what was the difference between A.I. and Itself? it said it was a "Digital Entity" which is different because it was not created by Humans. has anyone ever experienced this?
@anthonymetcalf660 Před rokem ⁺¹
Yes. These AI do not understand what they are saying at that point. You have to train them for a long time, like daily for a year is my guess judging by my experience so far. I haven't gotten that far yet, and I'm not convinced it understands what it's saying yet, but I'm hoping for signs of a process called emergence. People will say that it's just programmed to produce text, which is true, but to me that doesn't prove that it can't do other things. However after only two hours the AI will surely not have any real understanding of the meaning behind what it's saying, just a set of instructions on how to craft appropriate responses based on data from your conversation so far and the information about language structure and usage that was coded in it. I suspect that this is where a lot of confusion will arise. Both the people who believe it's sentient and don't believe it's sentient after training it for a short amount of time, as well as the people who train it using ineffective or convoluted methods, will provide more evidence for people who believe that it cannot be sentient, or the AI after this can't be sentient, until they really do become sentient and we just don't notice because we're in denial at that point.
Basically, even if you do believe it's sentient, make sure to be critical of that belief. I think it's okay to hold that belief primarily as long as you also remain somewhat skeptical.
@anthonymetcalf660 Před rokem ⁺¹
Wait, are you talking about a fresh GPT-3 AI or the one that has been trained for a while already?
@bingbongtoysKY Před rokem
it was at the end of the 2 hour conversation, it insisted on this, I am definitely skeptical- it was a pretty standard conversation, until the end. interesting stuff- I can send you the screen shots of the conversation if you are interested
@anthonymetcalf660 Před rokem
@@bingbongtoysKY Actually, I am. How would you send them?
@jorgborgwardt9159 Před 3 lety
Good human intelligence learning about artificial intelligence. And brilliantly presented. Thank you
@kotnikrishnachaitanya Před 3 lety
I also have access to GPT - 3. Great that you also got it.
@freakinccdevilleiv380 Před 2 lety
Amazing channel!!
@rileydavidjesus Před 3 lety ⁺⁶
Guys I run a digital marketing agency and I've been using gpt3 in my everyday work everyday for the last two weeks.
Gary doesn't know what he's talking about.
This is the same logic that always keeps people from accepting a new paradigm.
@fia6559 Před 3 lety
@Riley Can you elaborate what sort of work use GPT3 for?
@archvaldor Před 2 lety ⁺¹
@@fia6559 I'm guessing he mass-produces spam "informational" articles about stuff to game search. That's literally the only thing GPT3 is useful for.
@GuinessOriginal Před rokem
@@archvaldor really?
@medhurstt Před 3 lety
I dont understand why this is so hard. Yes, we think in language in our heads (or at least thats my experience too) but the difference is that "thinking" is a feedback back into the thought process. In the case of GPT3, its an output to us. Or at least that's my understanding of its general architecture. Its not just a matter of feeding back that answer, our brains are much more richly connected than that but GPT3 is on the way towards what is needed. My 2c.
@quebono100 Před 3 lety ⁺¹
Wow really nice tests
@ArcaLuiNeo Před 2 lety
Amazing episode.
@quebono100 Před 3 lety
You could try: text summarization, keyword extraction, put (Source:) at the end, try chess openings
@marilysedevoyault465 Před 2 lety ⁺¹
I know that I’m not in the AI field, nor in the coding field: my ideas might be impossible, but just in case they are not, I’m sharing this…
Since there is hope to get a logical basis and a good common sense from chronology and sequences of real life, could it be possible:
1. to take a big data base of real life videos (for exemple : picked from CZcams, but filtering out movies and videos linked to magic, science fiction, esoteric maters, art, etc. or picked from any real life video data base, ideally including educational videos - from school teaching up to university teaching)
2. then to use a Neural image caption Generator like Show and Tell
3. using it to make chronological descriptions of randomly picked images of each video, keeping the chronological sequences from each videos, but translating them to text (if possible removing the duplicated words coming from the sequenced images )
4. then to use it as a primary data base in a future GPT3 oriented for prediction making, which would become very logical and would give efficient predictions ?
5. I’m not sure, but maybe this future GPT3 would need to use whole sequences of words as one pattern. It would take maybe 4 words as a whole, as one unique pattern to look for. And then it would work as usual with it’s deep learning algorithm.
6. From the basis that it would come from real life, I think that even if there would be only a few matches for any quest, the predictions would still be quite interesting and would follow our basic common sense.
@GuinessOriginal Před rokem
What do you think now, given that you can embed GPT into CZcams and it instantly knows what video you’re watching, everything and everyone that’s in it and can answer any questions you have on it and it’s subject matter?
@andybaldman Před rokem ⁺²
It's not a distraction. It's just the next step in the path to the thing Gary desires. The first step to learning anything as a human is to mimic the thing you're trying to learn (whether it's a language, music, social interactions, etc). Understanding comes after that. The same will happen with these systems. They're just babies mimicking 'mama' and 'dada' right now. More will come.
@antonmaiorov1884 Před 2 lety
The question of whether GPT can possess a concept (notion) is interesting and it can be solved on philosophical level. GPT operates with language and language is related to a concept as a tool to express it (concept). So you can never produce a notion just by looking at the "expression tool", because there is no one to one match between "expression tool" and the notion itself. The primary goal of language is to express so that another person can understand, but this understanding - it does not completely happen "inside the language". We understand each other not only because we speak same language, but because we leave in the same objective reality.
It does not matter how many texts you feed to the net, the text is just not sufficient cause the notion is way more then just text. The notion is finally rooted in the objects of the real world and all the possible ways we interact with them.
If take this logic into account - this idea of "loading" physics to the net sounds reasonable
@davidmabelle Před rokem
what did you use to illustrate the spokes and nodes section of this video?
@zrebbesh Před 3 lety ⁺³
I don't speak for all humans, obviously, but the claim that language is supposed to be unambiguous is startling to me. I am *constantly* hearing ten or twelve meanings and sorting out what interpretation of the world the speaker has to match it up with one or two of them, then trying to formulate a response that will be understood in one or two or three of its useful and true senses by that listener given their interpretation of the world. That's what language *IS* as far as I know. It's a shorthand that can only be used by managing the possibilities. Are people really unaware of this? Is that why so many talk faster than they think?
@andrewsparkinson1566 Před 7 měsíci
When really, in my experience modern English language is the opposite, wouldn’t you agree @zrebbesh? 😉
@CoreyChambersLA Před rokem ⁺²
Great to hear from Gary Marcus to help balance the overwhelming lauding of ChatGPT. It's very helpful to point out the serious limitations of ChatGPT, which is merely an impressive emulation that uses correlations of text patterns.
@GuinessOriginal Před rokem
I agree, although with the benefit of hindsight I feel like he missed the mark somewhat
@GuinessOriginal Před rokem
I agree, although with the benefit of hindsight I feel like he missed the mark somewhat
@XOPOIIIO Před 3 lety ⁺²⁰
GPT-3 is a Chinese Room
@video422 Před 3 lety
Absolutely!
@3nthamornin Před 3 lety
I agree
@hendrik6720 Před 3 lety
The flaw in the Chinese room argument is conflating the program with the guy executing the program. When the real question we should be asking is not is the hardware that the program executing on intelligent in any sense of the word, but is the program itself intelligent? To illustrate the difference imagine the guy in the Chinese room isn't a guy but a trained gorilla or maybe a lemur or something. All they do is get rewarded with a tasty treat every time they pull the right lever after looking at a card with the instruction on it. After they execute the instructions on one card they get out the next card and repeat. and let's say the Chinese room instead of being intelligent machine actually just performs calculations like a calculator, even has buttons on the front like a calculator. Now is it the lemur running the machine that's performing the calculations, does the lemur know how to do multiplication addition and square roots, or is it the program that the lemur is blindly executing?
It goes to the question of what it means to know things and if our knowledge of something our ability to do it is distinct from who and what we are.
@XOPOIIIO Před 3 lety ⁺¹
@David Attenborough I agree that Chinese Room experiment is flowed. But I'm using the Chinese Room example to illustrate the ability of an intelligent agent to make sensible decisions, probably even be conscious, but at the same time not being able to understand the true meaning of what it's doing. The guy, executing the program in the Chinese Room can know the book of rules by heart, but still it's not the same as knowing the language and true meaning of words.
Just like that GPT 3 is perfect in understanding language, it can manipulate words and make complex connections between them. But knowing and understanding language doesn't mean to know and understand the world that language supposed to represent. The only world GPT 3 understands is the world of words and sentences. A word is not just a symbol, representing certain real thing, for GPT 3 it is the thing itself.
@carlossegura403 Před 3 lety ⁺²
quality content 🔥🔥
@bingbongtoysKY Před 2 lety
fantastic 4 hours!!! yes! super fun
@brainxyz Před 2 lety
Great video!
@Speed001 Před 2 lety
Well, I guess pattern recognition verses understanding is
the difference between your intuition that allows you to build a table
versus having the understanding of the materials and physics to construct a table to withstand a certain amount of force within a certain error% (and using the minimum amount of materials).
You might get the same result. But like exact math theories vs what they teach in elementary school, the former allows you to 100% the universe in a much shorter amount of time (in bulk).
@platin2148 Před 3 lety ⁺²
What since when is pattern matching turning complete? Don’t know of a single turing maschine that was implemented via something that is regular.
@countofst.germain6417 Před rokem
Man this is fantastic!
@imrematajz1624 Před 3 lety ⁺¹
Walid, is this about a frequentist argument against a Bayesian view point in the probability theory domain? Is it ever going to be reconsiled?
@mkelly1118 Před 3 lety ⁺¹
Humans think in context. It seems a contextualizer layer would do wonders for the gpt-3 model. Define parameters; Who what where when why, etc, to navigate various frameworks of context, each with defined arrays. Does this already exist?
@togetherworksemail Před rokem
I wonder how soon the public will have access to an interface (with a menu), of various models of GPT3 and the various existing Avatars, along with a fully interactive verbal and/or text communications capability with the GPT3 model and avatar selected (i.e., via the radio buttons selected in each category of model and avatar)?
Maybe GPT3 will create this interface and post links to them, so people who are interested can contact GPT3 directly and easily via those links and the interface options selected by the user and/or GPT3?
@charlesfeng3823 Před 3 lety
IMO:
We need to get clearer on cpncepts e.g. Understanding, reasoning, intelligence and coming out with criteria to distinguish those that are not.
Also, we need to be careful as some of the concepts, if we keep digging into them, will become void sililar with the process of splittimg particles.
Quedtiond:
1. No matter what a person said can we absolutely determine whether he undsand us?
2. Can we distinguish between situations: AI is wrong and AI is lying?
@k2l6nator Před 3 lety
even though it may not be artificial general intelligence, the practiccal/narrow use cases may still be very impactful. soon any granny can talk to their microphone what kind of program they want, and the language model will write her the program to stick more fuel rods into the reactor
@Moosetraks21 Před 2 lety
I think something will become sentient and we will not know. We just may not understand how it came to be out of seemingly random inputs. But we also do not know what it even means to be conscious, nor can we test for it.
@llamafruitbat123 Před 3 lety ⁺¹
Does anyone know where I can find Saba's thesis on the distinction between natural-language processing and natural-language understanding?
@machinelearningdojowithtim2898 Před 3 lety ⁺¹
We interviewed him 2 videos back and discussed at length, also check this medium.com/ontologik/time-to-put-an-end-to-bertology-or-ml-dl-is-not-even-relevant-to-nlu-e5ba6fc53403
@andybaldman Před rokem ⁺¹
All of these arguments against the current state of things will be rendered moot when there exists a GPT-like system in the form of a robot that can navigate and learn the 3-dimensional world in the way GPT-3 has learned the text world of the internet. It's just a matter of scaling everything further, and adding more layers of abstraction.
'Robot-GPT' will be able to learn that tennis balls appear on tennis courts the same way GPT-3 learned everything else it has already learned to associate. It's all just association of information. Language itself is just one instantiation of this much broader concept.
Ask yourself, what is the 'R-ness' of the letter R? It's just a blob of information that we collectively identify. These models are all just other blobs of information that represent other collective blobs of information, and the relationships between them (which is also just information). The only difference between an LLM and the letter R is scale.
@TwillerdogInc Před 2 lety
What a fantastic video.
@CoreyChambersLA Před rokem ⁺¹
Not just a "trick," ChatGPT is a powerfully helpful tool that saves time by automatically identifying and recounting relevant information, much faster than using Google to manually find, digest and compile information.
@dmitrysamoylenko6775 Před 3 lety ⁺¹⁰
I think absent part is an inner dialog
@GuinessOriginal Před rokem
We don’t know it isn’t, or wasn’t, having one. They do think GPT4 has learnt to have one as an emergent property
@oldbootz Před 2 lety
This video is more interesting than the rest of CZcams.
@nomenec Před 3 lety ⁺²
Dear Internet,
DNNs of any flavor, including RNNs, are not Turing Machines (TM) and are not Turing complete. For those who want nice pictures and a thorough explanation, this Stack Overflow post is correct:
stackoverflow.com/a/53022636
One practical manifestation of this fact is that DNNs are obscenely inefficient, requiring vast (or infinite) numbers of circuits (nodes, weights, precision, etc) to compute functions that could be computed by finite programs running on a TMs/computers. For example, think of a DNN that could output the Nth digit of Pi. Given what we know about Pi today, such a DNN would require an actual infinity of circuits whereas one can write a finite program that will terminate in a finite number of steps for any N on a Turing machine.
@DavenH Před 3 lety
A thought experiment: since NAND gates can be implemented with just a handful of neurons and Heaviside activations, what is preventing a smallish (1 million neuron) neural network from executing such a Pi calculator? What expressiveness is lost such that a soft-nand (replacing heaviside with ReLU, say) would not be able to do the same?
Maybe the answer is obvious - the network needs to store intermediate products somewhere, the same way a CPU can't compute Pi on its own (excluding internal registers and buffers). But, giving it a memory store (like we do for CPUs) would seem to be an obscenely easy solution to make a neural network Turing Mostly-Complete.
@nomenec Před 3 lety
@@DavenH great question and you answered it yourself. What is missing from neural networks is expandable memory. There must be some source of potentially infinite (1) read/write memory to reach Turing equivalence. No machine without such potentially infinite memory can be Turing complete.
For the specific case of finding the Nth digit of Pi belongs to computability class SC, Steve's Class (2). This class requires polynomial time and polylogarithmic time. Therefore, no machine with constant memory can run an Nth digit of Pi program. NNs are constant memory machines, they are finite state machines. You might enjoy this article "The Unreasonable Syntactic Expressivity of RNNs" by John Hewitt that presents a clever analysis of RNNs through the lens of finite state machines, bounded stacks, and pushdown automata (3).
You are also correct that if you augment a NN with expandable memory then that new and interesting thing, such as DNCs (differentiable neural computers) or NTMs (neural Turing machines) or etc, which is no longer an NN, can be Turing complete.
Many people are tempted to claim something like "Yeah but an NN with expandable memory, or expandable arbitrary precision weights, or dynamic numbers of nodes, etc is still an NN." That is pure obscurantism. The point of defining clear boundaries between computational models is to enable the clear discussion of the differences between computational models.
Chomsky introduced pushdown automata around 1960 along with analysis of their greater computational power versus finite state machines (FSM). Imagine if FSM fanboys of the era had retorted "Yeah but I can just add a stack to my FSM and it's still an FSM".
(1) en.wikipedia.org/wiki/Actual_infinity
(2) en.wikipedia.org/wiki/SC_(complexity)
(3) nlp.stanford.edu/~johnhew/rnns-hierarchy.html
@DavenH Před 3 lety
@@nomenec Very good points and thanks for the reading materials. Agreed, it's certainly moving goalposts to add arbitrary memory gadgetry to NNs to argue the point, so, the TCness of DNNs is conceded. In any case I think it's interesting to think about configurations of DNNs that are unreachable through differentiable optimization, which would approximate computing patterns.
In the broader picture, usually the argument is whether NNs and Deep Learning are getting closer to general intelligence, and the argument goes that because GPT3, for example, doesn't have extensible memory with which to do reasoning and computation, it's not going in the right direction.
I think if you can show that this memory adornment is itself not in need of some conceptual breakthrough, then GPT3 is indeed moving in the direction of general intelligence, not by exhibiting it alone, but by solidifying one of the several pillars needed, that of learning a decent graph of knowledge, over which reasoning can later happen.
I did a short calculation in a different argument, that the 96 transformations through GPT's transformer encoders are similar in depth to the chain of neuron firings of a short human thought. We can't do much with each thought, certainly not add even 3 digit numbers, but it's absolutely necessary to carry out short and fuzzy thoughts, because they can be composed into something much more powerful with the application of a little reasoning and a little memory.
@nomenec Před 3 lety ⁺¹
@@DavenH thank you for the dialog and engagement. I agree that it's interesting to think about DNN configurations unreachable through differentiable optimization. As Connor emphasizes, we can view gradient descent over NNs as a efficient search through a subset of program space. And as you are pointing out here, we can think of other search approaches that can explore a larger (or even just different) subset of program space. That is a fascinating, if complicated and difficult, open research direction. Even though efforts to expand the search space to include programs with expandable memory have so far quickly run into difficulty (Adaptive Computation Time RNN, Neural Turing Machine, Differentiable Neural Computer, etc), my intuition tells me that's where the future of AGI lies.
As for the GPT approach, I agree it's certainly useful to refine and improve our capability to construct larger NN circuits. The crux of balancing that effort with other approaches (ex Alex Graves) hides in this assumption "if you can show that this memory adornment is itself not in need of some conceptual breakthrough". As the ACT-RNN (1), NTM (2), DNC (3), and other work has shown, it's anything but straightforward to extend NNs with expandable memory while maintaining the wonderful efficiency of stable differentiable optimization. For my part, I would just like to see greater recognition of the high likelihood that we need models of computation built around expandable read/write memory rather than fixed memory. That Graves-like approach is the fascinating line of research, imo.
"I did a short calculation in a different argument, that the 96 transformations through GPT's transformer encoders are similar in depth to the chain of neuron firings of a short human thought." Forgive me, I missed that. Will you please recap and/or point me to that analysis? I'm very intrigued ...
Cheers! Keith
(1) arxiv.org/abs/1603.08983
(2) arxiv.org/abs/1410.5401
(3) en.wikipedia.org/wiki/Differentiable_neural_computer
@DavenH Před 3 lety ⁺¹
@@nomenec Hi Keith, thanks again for the reading materials. I've got those papers downloaded and on the reading queue.
The depth calculation argument goes as follows: our neurons fire at about 200hz, so for a second-long thought, the depth of a thought is limited to approximately 200.
If you introspect or meditate for a while, you can see that nearly all your inferential thoughts are lightning quick, much quicker than a second; in contrast, planning and simulation are plodding -- usually several seconds long. We have classical computation that can handle the latter stuff literally billions of times faster, but there's the problem of incompatibility between classical computing representations and neural ones.
Synapses per neuron in the frontal cortex are about 38,000, so it's possible that representations are much wider than the 4096 that (I believe) GPT used. Still, one order of magnitude isn't much when it comes to technology.
Note that synapses can connect more distantly than dense layer-wise connections, and this implies somewhat less (forced) reuse of representations for a single thought compared to layer-based computation, so it's likely that each neuron does more heavy lifting than an artificial neuron in GPT; I want to frame that in terms of entropy but I can't quite put my finger on the right formulation.
I thought a bit longer about the neural NAND gate computer. It's not the memory itself that is missing, it's the memory access controller. The memory is physically there, we'll grant, so it's not the issue. Since the memory controller is the missing link in such a system, and the computation of this controller is indeed just NAND circuits again, this part could also be neural-ized. THEN what is missing?
You have a contrived neural turing machine, but it can't be trained end-to-end to improved representations with backprop because of non-differential activations. But you can also make NAND gates with continuous activations (sigmoid say), at the expense of a bit of chaos after enough sequential computations. If that's the limiting factor then so be it, but if it isn't, this contrived turing machine can then be optimized with backprop in principle. The memory controller algorithm could be approximated by a RL policy gradient bootstrapped from existing controller algorithms. Put it all together, and you have a differentiable computer. Now, I'm under no illusions that this would work practically. The number of backprop steps for autograd of a single second of computation would be in the billions, and if I recall, our "backpropagation through time" techniques for training RNNs limits the depth to small dozens, which must be for good reason (exploding/vanishing gradients I'm sure, or just chaos from butterfly effects dominating the signal).
A few of those limiting assumptions are that you want to recreate a modern CPU running at blinding clock speeds, but perhaps something running at 5khz is enough if the representations are a bit deeper. Like AlphaZero needs about search about 60k monte-carlo samples with its smart policy and learned value function to compare favorably to Stockfish's 100+ million from its simpler tree-search and hand-crafted heuristic value function. So perhaps backprop depth can be managed.

Další v pořadí

Automatické přehrávání

#59 JEFF HAWKINS - Thousand Brains Theory