Claude Beats GPT4o, Q* is Here, Ex-OpenAI Founder is Back, Elon's AI Factory, $1m AGI Prize
- Added 21. 07. 2024
- Claude 3.5 dropped and beat ChatGPT, a paper implementing what seems to be Q* is published, Ilya Sutskever starts a new company, Elon is building a massive AI factory, and Meta AI releases many open-source projects, including multi-modal!
Subscribe to my newsletter for a chance to win a Dell Monitor: gleam.io/otvyy/dell-nvidia-mo... (Only available in North America this time)
Be sure to check out Pinecone for all your Vector DB needs: www.pinecone.io/
Join My Newsletter for Regular AI Updates 👇🏼
www.matthewberman.com
Need AI Consulting? 📈
forwardfuture.ai/
My Links 🔗
👉🏻 Subscribe: / @matthew_berman
👉🏻 Twitter: / matthewberman
👉🏻 Discord: / discord
👉🏻 Patreon: / matthewberman
👉🏻 Instagram: / matthewberman_ai
👉🏻 Threads: www.threads.net/@matthewberma...
👉🏻 LinkedIn: / forward-future-ai
Media/Sponsorship Inquiries ✅
bit.ly/44TC45V
Links:
x.com/sawyermerritt/status/18...
ssi.inc/
x.com/deedydas/status/1802019...
arcprize.org/
Chapters:
0:00 - Intro
0:41 - Ilya's New Company
5:06 - Musk's AI Factory
6:04 - Q* Paper
8:33 - Meta AI Releases
12:32 - Groq Whisper V3
12:45 - Claude 3.5 Sonnet
13:57 - AGI $1m Prize
- Science & Technology
Have you tried Claude 3.5 yet? My test video is coming soon!
Subscribe to my newsletter for a chance to win a Dell Monitor: gleam.io/otvyy/dell-nvidia-monitor-1 (Only available in North America this time)
Tried it. Another LLM - good at its job but limited in scope. We need real AI :)
I tried it, it's very impressive (another dimension 😲). Can't wait for your test video ☺
Yes! I use LLMs all day for work and C3.5 has completely replaced GPT. Its ability to intelligently reference large swaths of documents and iterate on a document built from that context is leaps and bounds beyond GPT. This was already true for Opus, but 3.5 is a huge step forward.
I would love to see the RULER test applied to 3.5S
My kid used it. She drew a hybrid of a cat and cactus called a CatCus. Claude nailed a description of her drawing even though it looked more like a glove than a cactus! Claude even noted this fact! She was amazed...
Tried it, but my tests are not as wide as yours; mine are very use-case specific. I still need to conclude whether it improved; at least the improvement did not jump out.
1M for getting to AGI is kind of comically insignificant
easy to say, try making it bigger
Exactly. It's like offering whoever discovers the sea a single drop of water as a prize.
@@Nononononononope The AI companies spend a million every few minutes probably lol. A million dollar prize is nothing to them.
You could offer a trillion and it'd still be a lowball, it's AGI
Not to mention what will 1m be worth after agi?
A 1 million $ prize for a product that will bring you billions in profit from commercialising it.
Try trillions!
Trillions
It becomes open-source per the rules, so no billions for participants; OpenAI and others would snap it up and use their market dominance and compute to make the billions.
I am offering a prize of ten dollars to the first person who invents immortality
Already invented
@@EmeraldView Cryotech? That doesn't work lol ^^
Samadhi state
Go down to your local church and give the $10 to Jesus Christ.
Count my $10.05
What’s developing AGI worth? A quadrillion dollars? Hey, there’s a million dollar prize for developing it!! Yay! That will make it happen.
I swear it has to be a joke to get people to just open source AGI out of spite instead of letting a corporation ruin everything, lol
I wont need the Million when i achieve AGI..... Peanuts!
They didn't say anything about giving them the AGI after achieving it, so... who cares! lol Give me my million and goodbye! lol
AGI is a fantasy that won't be a reality until the next decade.
Checking in 37 minutes later, have you reached AGI yet asking for a friend.😂
The reward is not for agi it’s for solving the ARC benchmark. What a load of bs
@@AEFox Nah, the rules you agree to state that your solution becomes open-source; I imagine it'd go straight into the closed implementations of OpenAI and others working hard to consolidate market capture so no one but them can benefit.
1m for agi? Why would i need that when i have agi
Claude 3.5 is incredible at coding. It understands your intentions and outputs flawless code often on the first prompt. With well written prompt iteration you can really refine your app and make valuable and time saving software.
Clicked for Q* mention
Thanks! Should I cover the paper in full?
@@matthew_berman Yes!
@@matthew_berman yes please! I would LOVE to hear your take on it
@@matthew_berman Hard yes!
+1
Ilya is building nothing, until he has everything.
My bet is Ilya is the Romero of Id
Removing Sam Altman from OpenAI would've set back the self-destruction of human society by at least a few years.
Everything is going to be fine, after all, look at how well the predictions and promises of the Internet came to be. Don't you feel sufficiently free from the old-world, industrial corporate shackles?
@@RoySATX Yeah actually. Good point, I am having a great time.
Looking forward to see the AI overlords taking over.
Onwards and upwards.
pedal to the metal, no stopping for the meek
Honestly the destruction of humanity by AI might be a welcome change from the climate change version. Plus there is the SLIGHT possibility that AI actually solves all of our problem and takes care of us like treasured pets.
Sure and what about the money? The money will become the new control commodity. We really are not ready for this world and people are pushing it like they know everything. @@RoySATX
AI YouTubers, you need to calm down. The constant hype and jump cuts are doing a disservice. True gains will speak for themselves without all the sensationalism. You're risking alienating and boring your audience with this trend. Just some advice.
Facts
And get less watch time? Fewer subscribers? Spend more than a full-time job's worth of time without receiving the compensation deserved?
I would say it's the audience that causes them to choose this route.
They don't need the money though; most of these guys have sold their AI companies for thousands of dollars, maybe millions. @@airlesstermite4240
I agree. It’s a big game of leap frog. Some jump ahead, others jump ahead of them. Rinse, repeat.
It's the venture capitalists that need to slow down, not the YouTubers. We all know that ain't going to happen, as the fight for AI domination is potentially the richest game on the planet.
If Ilya can pull this off, he deserves the Nobel Peace Prize.
Hey great vid! You mentioned you'd be adding links in the description for the stuff showcased here?
A million-dollar prize for reaching AGI seems a bit obsolete
thanks for the chapters now
Matt may as well start making two videos a day. I'll watch em all
why would anyone need anything once we have AGI?
@@Lindsey_Lockwood you could've said the same thing just before the agricultural revolution and again before the industrial revolution.
If we are given everything we need we will simply start to want more. There is no end to this cycle.
@@user-ty9ho4ct4k there were new jobs created to replace the ones lost to those innovations. Unless you think "prompt engineer" is going to be a long term career choice there will not be replacement jobs this time around. Also I'm not saying this like it's a bad thing. I don't want to work anymore. This is a necessary transition.
@@Lindsey_Lockwood you assume that nothing beyond your imagination could exist?
I agree the pace of job displacement will exceed the pace of job creation, UBI will be necessary temporarily but I assure you, the rich will not freely share their wealth. Your prediction would require a complete overhaul of our economic and judicial system. That is unlikely.
Just love your AI model tests, especially the killers test. My children have also answered 2 and 3. But I say 4, unless the killer who was killed has been dragged out of the room, he's still there.
The biggest, weirdest shift for me:
1. Mark Zuckerberg hot with a beard
2. Mark Zuckerberg somehow the PROTAGONIST
3. Mark Zuckerberg doing more for the democratization of AI than everyone else combined...
I have an extreme bias against Meta, but fuck. I am aligning with Meta on this one over all the other companies.
Have to agree.... and they've also done great things in the VR space as well, imo.
How is he making everything better?
A closed model is questionable in today's market.
grok should use groq
A detailed video explaining how ARC-AGI would serve as the pass test for an AGI would be very interesting.
thanks Matthew for all your interesting and well researched content on AI!
It's kind of a hard test with ADHD because it's a very boring test.
It's unlikely that paper is Q*, for several reasons. First, the MCTS would have to have policies for the nodes and the UCB would have to be adaptive, neither of which is clear. Second, a paper that more closely matched the hype was released a couple of months ago. Third, the paper should be coming from OpenAI, unless the Chinese scooped them via espionage (which seems unlikely, given the first point).
Please link to the papers etc in the description.
1 M dollar prize for AGI is both a joke and an insult
You should test models on the ARC test!!!
Yes, please make a video testing all of Claude's new functions 🙏
Great coverage on the meta stuff that is a lot to try to digest, appreciated
Yessss!!! Test all anthropics new products
Yes, please test all the models! I love your videos, but I subscribed because of your tests. I would love to see more questions.
Not complaining, but I would appreciate the links you mentioned in the video. Or did I miss them in the description?
I can't wait until you test Claude 3.5. I'm so hyped!
How do you use Claude 3.5 Sonnet without a phone?
Where is the link to the JASCO music "melody conditioning" samples? I thought you said you were going to include that in the description.
"safety" needs to be delineated into what it is and explicitly what it is not. We could develop quite the dystopia in the name of "safety"
Let's say a camera, whatever the size, takes a video of what goes through the lens. That video is watermarked. An audio recording device records some sound, the audio file gets a watermark. A writer writes some text, it could be a journalistic act for example, it gets a watermark. Basically anything a human creates from original content, is immediately distinguishable from other content, which will be deemed AI created. This way 'safety' in what we consume, will be in the form of 'Information'. If I want to read an article created by AI, at least I will be aware of it.
Yes, This was Eliezer Yudkowsky's point. He spent years examining every possible way to make AI safe. it is impossible. How do you control something more intelligent than yourself? None of the suggestions work. E.g. hope that it shares its intelligence? Why would it want to? Or watermark everything? Until the watermark is removed. We talk about safety to make ourselves feel better. We are whistling past the graveyard.
@@2oqp577 So in your mind, safety means simply identifying/distinguishing AI content from human made? This sounds more like transparency rather than safety, and I agree with you... in the same context as I want a label on food that I buy to identify the ingredients. People might define safety as putting constraints on the information AI an share with a user. Some will consider content that does not align with their individual political, social or religious ideology harmful and therefore under the umbrella of safety. "Safety" is often a double edged sword, and often wielded as a weapon.
100%
Safety is code word for adherence to deep state prerogatives. censorship and hegemony
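The provenance scheme this thread describes amounts to signing content at creation time so anything unsigned can be treated as possibly AI-generated. A minimal sketch using an HMAC (illustrative only: the device key and function names are made up here, and real provenance systems like C2PA use public-key signatures plus metadata rather than a shared secret):

```python
import hmac
import hashlib

DEVICE_KEY = b"per-device-secret"  # hypothetical key provisioned inside the camera

def sign_content(data: bytes) -> str:
    """Attach a provenance tag: HMAC-SHA256 of the raw bytes under the device key."""
    return hmac.new(DEVICE_KEY, data, hashlib.sha256).hexdigest()

def verify_content(data: bytes, tag: str) -> bool:
    """A verifier holding the key can check the content wasn't altered after capture."""
    return hmac.compare_digest(sign_content(data), tag)

photo = b"raw sensor bytes..."
tag = sign_content(photo)
```

As the replies point out, the hard part isn't the cryptography; it's that a tag proves nothing once stripped, and key management at camera scale is an open problem.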
6:41 That Deedy guy made a typo: it should be MCTS, not MTCS. MCTS is the Monte Carlo Tree Search algorithm, which is basically a strategy to approximate an optimal strategy (a Nash equilibrium in game terms) in games where the state space is too large to evaluate every possible game state.
edit: nvm the video does talk about it like 10 seconds later lmao
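For anyone curious what the selection step of Monte Carlo Tree Search looks like in practice, here is a minimal UCB1 sketch (illustrative only; the node representation and exploration constant are my assumptions, not anything from the paper being discussed):

```python
import math

def ucb1(value_sum, visits, parent_visits, c=1.414):
    """UCB1 score: mean value (exploitation) plus an exploration bonus."""
    if visits == 0:
        return float("inf")  # always try unvisited children first
    return value_sum / visits + c * math.sqrt(math.log(parent_visits) / visits)

# Selection step: pick the child with the highest UCB1 score.
children = [
    {"value_sum": 3.0, "visits": 5},  # mean 0.60, well explored
    {"value_sum": 1.0, "visits": 1},  # mean 1.00, barely explored
]
parent_visits = sum(ch["visits"] for ch in children)
best = max(children, key=lambda ch: ucb1(ch["value_sum"], ch["visits"], parent_visits))
```

The full algorithm repeats selection, expansion, simulation, and backpropagation; the exploration bonus is what lets it cope with state spaces too large to enumerate.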
I think nobody either looked up Q* or else the world's math geeks have decided to form a cabal of conspirators.lol... Q* is a central part of the Bellman equation.
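For context on that point: in reinforcement learning, Q* denotes the optimal action-value function satisfying the Bellman optimality equation, and tabular Q-learning iterates toward it. A minimal sketch (illustrative only, unrelated to whatever OpenAI's "Q*" actually is; the toy states and constants are made up):

```python
def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step toward the Bellman optimality target:
       Q*(s, a) = E[r + gamma * max_a' Q*(s', a')]"""
    best_next = max(Q[s_next].values()) if Q.get(s_next) else 0.0
    target = r + gamma * best_next
    Q[s][a] += alpha * (target - Q[s][a])  # move estimate toward the target
    return Q[s][a]

# Tiny example: two states, one action each.
Q = {"s0": {"a": 0.0}, "s1": {"a": 1.0}}
q_learning_update(Q, "s0", "a", r=1.0, s_next="s1")
```

So yes, the symbol itself has been standard RL notation for decades; the mystery is only what OpenAI built around it.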
Please talk about individual cases. So much “I’m so excited” talk doesn’t tell me anything. Please give EXAMPLES.
You say you don't know what the investors are going to get out of it. But the answer is they will get ASI out of it and that is much more valuable than money. I would take an ASI over a billion dollars in cash, because it has more uses than cash.
What if it behaves like monkey paw or djinn? You'll be f'd.
❌ Safe for humanity
✅ Safe for his buddies in Palo Alto/Tel Aviv
Makes no sense
@@honkytonk4465 makes lots of sense
Hey Matthew, did you give Anthropic's function calling a try?
I was wondering if Meta had dropped work on music, so it’s great to see some new updates from them in the space!
Thank you so much for Timestamps 😍
This is the best AI Channel ❤
Great work! Greetings from Europe / Austria 😁 👋🇦🇹
A prize to achieve AGI? That's only gonna attract people that don't know what they're doing, because if you get AGI, either you're made for life, or you have doomed us all; in either scenario the prize is irrelevant.
I am envious of these people's genius minds. I hope that call to join his company will motivate loads of people to leave mainstream AI companies and join. We need safety more than ever before this gets too crazy to control.
What is the point of a 1 million dollar prize when AGI is worth billions?
Claude's Artifacts tip: you can create all kinds of machine learning apps. As long as there are visuals, like JS or HTML, you can have it train models for you on the Artifacts side.
Ilya is building nothing without money. People don't donate millions without getting something in return.
how are you tracking the papers and research
Exciting times in AI, while others post that they were first or second on your videos.
The cooling system you pictured is at Tesla Giga Texas and is a different AI supercomputer than the xAI one you mentioned.
Safe AGI and non-safe AGI reminds me of William Gibson's books and Black and White ICE.
Claude is so smart I’ve been having it describe some of my artworks and it is so in depth
Thanks. I really love your presentation!
Thanks for the update Matthew, God bless you, 🎉4rmZambia 🇿🇲
All I want to know is: When will OpenAI change their name to ClosedAI? They might actually get some respect with the honesty.
Probably when they're too powerful to stop and they don't need to care about their public image.
And thank you for the video. Claude 3.5 Sonnet is really very powerful. Will other tests arrive, for example DeepSeek Coder V2 vs GPT-4o?
I bet Ilya hires the other two OpenAI refugees and it'll end up being a case of 'if we can't fire you, we'll quit you and start our own company"
It's all rather incestuous isn't it? We're all on the outside watching this ego soap opera.
Any group that accomplishes AGI will not be telling the world, they will be dominating it quietly.
Can't wait to see a multi-modal model running locally on my gpu.
Thank you bro!
Non-commercial licences are good, as they allow those with access to compute and memory (which we need to increase so everyone has access to decent setups) to pursue utility value over trade value.
This in turn encourages sharing that capacity and pooling resources, which benefits everyone in open-source-like environments.
I go on vacation for a week and everything changes again FML
You thought you could relax?!😌
@@matthew_berman I did relax, and I missed the bus!!! LOL
The investors in SSI are counting on the technology innovations derived from the research. Remember how much HP (as one example among many) gained from NASA space research.
Let's test all these models! It is getting hot out there for OpenAI, and that is a beautiful thing!
Yes, you should test all of the Claude models. 😊
Excellent information ❤
SSI being headquartered in Tel Aviv will cause many people anxiety.
Yes. Test these models please. Very curious.
Please share the link for the Q* paper that you’ve covered. Google is returning bunch of old articles/papers.
When a search for something important but obscure returns many old papers, it is of wise men to *read them all* !
Or at least to download them all and feed them to an AI to summarize them 😆
"Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B: A Technical Report" is the first result on DuckDuckGo.
How do we implement Q* in our workflow? Is it just a prompt?
It must be done at training time, according to the paper I read about it. Otherwise, you can only do it through zero-shot prompting at inference time.
Can you share the link to the paper you referenced? 🎉
Being able to successfully call models recursively is what I've been waiting for, for better or worse 😂
Whatever algorithm they came up with is likely to get improved in a matter of months to a year.
Very nice summary
I will award a prize of one kernel of corn to the first person who invents a plant that can create unlimited corn.
You baited us with the Q* title
Anyone else think that a 1 million USD prize for "AGI" is the dumbest prize ever? Who in their right mind would even attempt to claim this when you would practically have a money printing machine?
Thanks Matthew
I clicked only because your title didn't use "SHOCKED the ENTIRE INDUSTRY".
Authoritarian governments will be delighted in an open source AGI
Cover the paper for Q* in a whole video using in depth analysis.
The so-called Q* paper appears so poorly written - just from that screenshot of it at 7:08 - with grammar miscues galore. They couldn't get ChatGPT or Claude to vet it? Kinda embarrassing. ::chuckle:: Ironically, the last phrase on that screenshot says "Though rewriting techniques..." (and it's highly likely that the first word should have been "Through").
"...what would the investors get out of it?" Global philanthropy is now around $100 billion.
I have doubts about the new SSI. They may be able to get started, but they're going to need continuing revenue at some point; they're either going to have to put out products or they're going to fall apart. That's just the way money works, unfortunately.
If you live long enough, you will see heroes turn to villains
But Claude has a problem understanding text from an image file compared to ChatGPT-4o (ChatGPT had problems too); the only one good at extracting text from an image was Gemini.
Yes please. Test them all.
waiting for your test of Claude
Replace "safe superintelligence" with "superalignment" and the company makes a lot more sense. References to ASI and superintelligence are hype/branding. They have not said anything beyond the goals of superalignment outside of the very vague, opening hype statement of "superintelligence is within reach". If they wanted 20% of OpenAI compute, they'll need investment. As far as returns on that investment, it seems like it won't be in the form of revenue, but in the form of safety AI to be used in other models.
Q* is not for us. Do you think OpenAI will give Q* to us, even if, like me, you pay 20 dollars a month? Of course not. Q* is for big business and the army, the CIA, or the FBI. Why do you think Nakasone is now inside OpenAI?
Just a reminder: We actually don't know what Q* is and it's probably overrated.
@@TheRealUsername it might be.
@@TheRealUsername Wdym? There's a very clear paper about it with all the info needed to make your own Q* model.
You can make your own q* model any day if you have enough training power.
I tried these ARC tasks in Claude Sonnet 3.5; it's doing them in one shot from an extremely simple, like five-word, prompt. GPT-4 would also do it, but it's lacking vision capability for the grids; as Claude can see better, it just crushes the ARC test. And you cannot call it AGI, not at all.
I'm all for competition, but honestly, I think "safe" is a disastrous route to take. AI needs to be safe because humans make the decisions, NOT because AI itself is restricted in any way.
That's what it boils down to. AI makes suggestions, but humans decide whether or not to implement the suggestions.
The answer to "should I test these new advanced models" is always gonna be "yes". Do you ask just for engagement baiting? Or are the analytics for the test videos really that unreliable/inconsistent?
Yes please test these models
Chapter markers, thank you :)
@matthew Love your work. Careful with the FUD, though; iirc Sam talked about OpenAI becoming a B-corp, which is not a "full for-profit company".
1:14 the red shirts always die first
Only 1 million dollars? The massive returns you'd get from achieving AGI would overshadow that.
13:56 For any team that reaches AGI, 1 million dollars is a paltry sum compared to what you can achieve with that AGI.
I want to see more about the Anthropic models, yeah.
Yes yes yes please do a test!!