The First AI Processing Unit is a BIG Deal.
- Published 11. 05. 2024
- AI News!
▼ Link(s) From Today’s Video:
Elevenlabs sound generation: / 1759240084342059260
Carlos on Groq: / 1759941976927924682
Groq: groq.com/
Matt on 1.5 Gemini: / 1759988776128921604
Matt Shumer's twitter: / mattshumer_
Mistral Next: chat.lmsys.org/
► MattVidPro Discord: / discord
► Follow Me on Twitter: / mattvidpro
-------------------------------------------------
▼ Extra Links of Interest:
✩ AI LINKS MASTER LIST: www.futurepedia.io/
✩ General AI Playlist: • General MattVidPro AI ...
✩ AI I use to edit videos: www.descript.com/?lmref=nA4fDg
✩ Instagram: mattvidpro
✩ Tiktok: tiktok.com/@mattvidpro
✩ Second Channel: / @matt_pie
-------------------------------------------------
Thanks for watching Matt Video Productions! I make all sorts of videos here on YouTube! Technology, Tutorials, and Reviews! Enjoy your stay here, and subscribe!
All Suggestions, Thoughts And Comments Are Greatly Appreciated… Because I Actually Read Them.
-------------------------------------------------
► Business Contact: MattVidProSecond@gmail.com - Science & Technology
Note that Google and Amazon have both had TPUs (Tensor Processing Units, i.e. AI chips) for several years; however, they were only used internally. You can rent time on them through AWS or Google Cloud, but you can't purchase them.
TPUs have been in all the new Google phones since the Pixel 6 (past three years) and Google also released the EdgeTPU for home use in 2019. I'm sure none of those are as fancy as the TPUs sitting in their data centers, but it's something.
Matt's title is still misleading, since processing units used for AI ARE purchasable; see Nvidia's line of A100s, H100s, A800s, etc.
@@px43 I guess it's got something to do with the fact that a few phones are already equipped with Google Gemini Nano (a local and offline model).
even then openAI still clears
@@midprogramming The A100 and H100 are general-purpose GPUs that are great for AI, since they support 80 GB of RAM, but they can also work well for physics simulations or even mining cryptocurrencies (though they cost too much for crypto). As an example, they support 64-bit floating-point arithmetic, which is more precision than neural networks need.
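A quick sketch of the precision point above (using NumPy purely as an illustration): parameter memory scales with bytes per value, and LLMs are routinely served at fp16 or int8 rather than fp64.

```python
import numpy as np

# Bytes per parameter at different precisions; neural nets are
# commonly served at float16/bfloat16 or int8, not float64.
for dt in (np.float64, np.float32, np.float16, np.int8):
    print(np.dtype(dt).name, np.dtype(dt).itemsize, "bytes/param")

# How many parameters fit in an 80 GB A100 at each precision:
vram = 80 * 1024**3
print("fp64 capacity:", vram // 8, "params")  # ~10.7B
print("fp16 capacity:", vram // 2, "params")  # ~42.9B
```

So on the same card, halving the precision quadruples (fp64 to fp16) how many weights fit, with no meaningful loss for inference.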
The @AnastasiInTech YouTube channel has details about Cerebras, Dojo, and various other chips, as well as the H100, ASML, TSMC, and other non-AI topics. She's a chip designer. It's a better channel for the latest AI news.
Even if it's expensive, the fact that a small company is increasing supply in an industry that has previously been dominated by a few companies that deliberately constrain supply is a sign of perhaps lower costs across the industry in the future.
Yes, because they won't just buy the small company... like Meta did several times... in fact, Meta would either buy them, or if they didn't sell, just copy them.
Jonathan Ross, the inventor of the Google Tensor Processing Unit (TPU), founded Groq.
That explains a great deal. I was wondering how these compared to a TPU or NPU.
It won't be long now before we have a ~$1K machine that has CPUs, GPUs, and a third slot for an AI-specific unit.
Now that's mass adoption of local AI tools.
And since people don't like to be told what they can and can't do:
mass adoption of uncensored, open-source local AI tools.
For now I would settle for a relatively cheap, sort of mid-range GPU, but with a metric shit-load of VRAM
@@cacogenicist Yes. An RTX 5060 20G would be an excellent holdover until the new hardware comes out.
Hardware will continue to grow on the server side, but I believe AI will minimize the hardware specs required at the local level.
About Mistral Next being open source: Mistral's CEO Mensch said during a French radio interview in December that Mistral would be releasing an open source GPT-4 level model in 2024, which is probably Mistral Next (or something better). The hype is real. 😳
No, they did not say it would be open source. In fact, their current best model, "Mistral-medium", is already closed behind an API and not open source, so why would their even better model be open-sourced?
Key point: "in December." Since then, Mistral sold out to Microsoft and has scrubbed their website of any talk about open source, including removing the part where they said they were committed to open source, lol. I wouldn't bet on them releasing anything again. In fact, they even renamed all of the models they already released as "open" models, while all the unreleased stuff is named differently.
@@default2826 Yes he did. Are you gonna say "no they didn't"? Are you 4?
10:30 reminds me of the last part of the video History of the world i guess. .. “Let’s invent a thing inventor said the thing inventor!” 😅
Wow, Matt! I'm impressed. With precision and clarity, you've unpacked the significance of 11 Labs' groundbreaking sound generation AI, offering us a front-row seat to the innovations that are redefining our digital landscape. I'm so glad I subscribed to your channel. Great timing!
Full movies are coming soon
We're still far behind for that lol. Just imagine if the AI could remember and regenerate every face, location, etc.; then I could see a movie. Now that would be crazy.
It's bigger than movies, it will understand how the physical world works 👍🏻
@@unkind6070 that is currently physically impossible because of the sheer amount of data out there and the cost to train it on all of the data.
@@CatfoodChronicles6737 it's not impossible ?? What are you talking about
@@lllllRoguelllllX Sora isn't only text-to-video, but image-to-video and video-to-video (and there were a few more things I forgot it could do). You could just have it generate from where it left off; it might take somebody 60-90 generations since each is 1 minute, but they could string a full movie together much more easily with that than with the other stuff we have out. Full movies are still coming, just not in a single prompt, though even that is potentially soon lol (if you consider a full movie to be like 5-20 minutes, or some sort of web sitcom episode).
That groq compute unit is insane!
Hey Matt, you always find the most excellent new AI stuff! Thanks for sharing this awesome update - I love geeking out over how fast this tech is moving.
cool background man! :) (i remember last time you wondered what to do with it, now it's a very cool setup :)
Wow, the volume of the train changed when it went next to the building. WOW!
My guess is that dedicated Ai chips are going to allow real-time processing of video, so you'll eventually be able to completely rework the graphics/theme of a game or movie whilst you are playing/watching (with imperceptible latency), similar to how we currently use Reshade and shader models to add some effects on the fly. You'll perhaps be able to take an old game and make it next gen level based on your description or sample input of what you want to see. Exciting prospects for both game developers and endusers.
Great video, Matt. Thanks for what you do.
Intel Movidius Neural Compute Stick was an AI inference accelerator in the form of a USB dongle.
Groq is certainly not the first company to venture into the AI accelerator business.
Hey Matt hey everyone 🍋
this is gonna be seen like a GT 710 in a few years, if not a decade
This is so exciting! :D
7:00 If their chips are so good, I expect to see the AI super-enhanced in not only speed, but all-around performance and intelligence too. Would be interested to check if it changes how the models do on reasoning or certain benchmarks. If it really enhances them more than speed it will be a huge game changer for me. I don't mind waiting 2-3 extra seconds. But I can still imagine how useful it will be for people that rely on speed...
because most of the compute is for pumping out a better model, not running inference.
I don't see the chips inherently improving reasoning, but this allows you to run way larger models at the speeds we are used to even if they aren't super fast.
Spending more compute per response "upgrades" model capabilities. If the card lowers the time and power needed, that alone can make AI smarter.
Keep in mind that demo had to use around 568 LPUs in order to fit both Mixtral and Llama-70B in memory, so you're looking at about 11 million USD if you want to do this at home, going by the current price of $20,000 per LPU
yeah... there's room for improvement. I naively thought one card would be good, but nope.
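Taking the thread's numbers at face value (568 LPUs at $20,000 each; both figures come from the comment above, not official pricing), the back-of-envelope works out like this:

```python
# Back-of-envelope cost of replicating the Groq demo at home,
# using the (unofficial) figures quoted in the thread.
lpus = 568
price_per_lpu = 20_000          # USD, per the comment above
total = lpus * price_per_lpu
print(f"${total:,}")            # $11,360,000 -- "about 11 million USD"
```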
I'm dreaming of a world where large companies wouldn't have a monopoly on these LLMs.
You should have at least 10x the number of subscribers, your videos on AI are simply awesome ❤
Elevenlabs sound effects + Sora will speed up the workflow indeed.
Matt, you got me excited again! 🔥🍷🤟🍷🔥
One of the cons of those cards is that they have only 200MB of memory
230, but I think it works differently; you can't load an LLM in 230 MB, so it may use system memory instead
230MB is what they have on the die, in the processor... It's A LOT, a pro, not a con.
"230 MB of on-die memory delivers large globally sharable SRAM for high-bandwidth, low-latency access to model parameters without the need for external memory."
wow.groq.com/wp-content/uploads/2022/10/GroqChip™-Processor-Product-Brief-v1.5.pdf
Weird, but somehow it still runs the massive models fast so
@@apache937 because they use 500 chips to inference Mixtral
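A rough sanity check of the chip counts in this thread, assuming Mixtral 8x7B's ~46.7B total parameters at fp16 and 230 MB of SRAM per chip, and ignoring activations, KV cache, and any replication for throughput:

```python
# Minimum chips needed just to hold Mixtral's weights in on-die SRAM.
params = 46.7e9            # Mixtral 8x7B total parameters (approx.)
bytes_per_param = 2        # fp16
sram_per_chip = 230e6      # 230 MB on-die, per Groq's product brief
chips = params * bytes_per_param / sram_per_chip
print(round(chips))        # ~406 -- same ballpark as the ~500 quoted above
```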
Well that didn't take long. Go to the bathroom and look what came out after Sora 😁
Grok is a word coined in an early Robert Heinlein science fiction novel called Stranger in a Strange Land. That's where both of these names come from. In the book it meant to understand completely and fully.
I don’t see the 11 labs link for early access. Thanks for the info. I’m building a new machine so I’m loving this stuff 👏🦋 very helpful
HII! This is amazing!
MattVidPro AI Fans: "Give us powerful AI graphics cards."
Nvidia: "New GPU that can render 32k resolution shadows! The most realistic shadows ever seen!"
MattVidPro AI Fans:
OpenAI and Elevenlabs tag team would literally end all competition. But again OpenAI has been working on their music generation model Jukebox behind the scenes. I wonder what they’re cooking up in that regards. Dev Day is gonna be mind blowing this year!
Ayoo love u bro
I think when we get together for movie night (even if getting together is virtual), we'll just type in the plot, and a random movie will be generated based on our prompt, complete with speech, background audio and sound effects.
Intrigued by this bit of hardware. Does this work side by side with graphics cards, or is it a separate build solely for AI use?
NVIDIA's dominance lasted 8 months, 22 days, 6 hours, 45 minutes and 23 seconds.
You hit the nail on the head, the “computer is out the bottle”…..
In terms of AI text-to-video with sound, the real mind-blower will be when it's possible to write some dialogue and have an AI-generated actor speak that dialogue with perfect lip-syncing using something like ElevenLabs' text to speech technology. I have no idea why Apple or Google have not already bought ElevenLabs.
Pure magic
Not the only two. Definitely need great generation for music and score as well.
Groq and Mistral Next... an incredible leap ahead!
I mean, I may be wrong about this, but I read their paper: it took 512 of these chips together to run Llama 70B. And each chip costs $30k, so idk how practical this can really be at all. A single A100 can run a quantized 70B for $15-20k.
The morning of the AI age with dedicated chips; can't wait for noon.
Interestingly, mistral-next was the only model who could answer this correctly: "I have 3 apples. Yesterday I ate 2 apples. How many apples do I have? Reason first then answer." Even ChatGPT4 and Gemini were wrong.
I'm just waiting for someone to come out with an AI text-to-video game engine. Or someone to make an AI GPU game console to run AI-generative video games, one that can run characters or generate images and objects in real time.
Sora is getting close.. just need better control and faster image gen speeds.
Someone is completely forgetting about NPU’s and TPU’s.
I like that the chip literally looks like a sieve.
Watch out, they might lobotomize it near the release date.
Can't wait for AI to realise that constant pan induces nausea.
From a sound designer: not bad overall, but the puppies in snow, with a normal-speed dog bark over slow-mo video, is a no-no.
Make more of these videos please 👍
Isn't this what Google's TPU is also trying to achieve?
iPhones already have a neural network (AI) chip for doing face recognition, among other things.
Thinking of investing into the manufacturing of a LGTCPU to make everyone happy.
I'm not convinced about the audio; almost everything is off. The car engine is a starting engine, no ambiance, no sync. The train atmosphere lacks the interior echo, and the Japanese train soundscape is very different; same for the footsteps and crowd. I'd be curious to know whether they used the same Sora prompts, used the video itself, or used video-to-description as a prompt.
Google has TPUs specifically for AI for quite a while now
not as fast
@@apache937 that's not the point...
Let's find a name other than AI Processing Unit for this; some people may start to abbreviate it to APU, which is already the abbreviation for an Accelerated Processing Unit.
Sure, groq calls them LPUs
Layer Processing Unit it is.
IPU
Intelligence Processing Units
@@quinnherden That could also work. Can't think of any IT or computer terms off the top of my head that abbreviate to IPU.
ChatGPT has some ideas:
AI Optimization Processor (AOP): This name emphasizes the optimization for AI tasks, making it clear that the processor is specialized for artificial intelligence applications.
Intelligence Processing Unit (IPU): Highlighting the "intelligence" aspect, this name suggests a focus on smart processing capabilities, distinguishing it from general-purpose and accelerated processors.
Neural Computation Processor (NCP): This name references neural networks, a foundational technology in AI, suggesting a specialized processor for neural network computations.
Smart Compute Module (SCM): A broader term, this suggests flexibility and intelligence, indicating a module designed for smart computing tasks, including but not limited to AI.
Cognitive Computing Engine (CCE): This emphasizes the cognitive aspect of AI, suggesting a processor designed to mimic the way human brains process information.
AI Specialized Processor (ASP): Straightforward and to the point, this name indicates specialization in AI tasks without stepping on the toes of existing acronyms.
Deep Learning Processor (DLP): Focusing on deep learning, a subset of AI, this name suggests a processor optimized for deep neural network tasks.
Marvelous, Carlos E. Perez compares a special AI chip to an NVIDIA GPGPU. While such comparisons would have been valid 10 years ago, NVIDIA now has special computation chips like the H100, which are largely oriented toward AI. And a great advantage of NVIDIA is that its GPGPUs are able to run the same software stack as its top-of-the-line AI products, which puts NVIDIA miles ahead of competitors.
The real test will be how much you need to do to get these sounds. Does it figure it out by itself? What sort of prompt, if any, is necessary? How do you change it? What if you’re going to add your own robot sounds, but need the other stuff? What if this is generating lots of layers and they’re going ahead and layering them in in Cubase like regular sound designers do?
Sounds like we can finally use that extra empty PCIe 5.0 slot on our motherboards, so you'll have your normal GPU and a Groq in your PC 🖥 🤔.
NVIDIA sat too long on their antiquated GTX line, like Google sat too long on their antiquated search engine.
Both are superseded by a "nasty surprise" 😂
i am waiting
How do you think the economic worth would change if, for example, it were revealed that Taylor Swift was entirely artificial? Do you think public perception of her performances would continue to hold weight? And would there continue to be any value in promoting and investing in her 'public' appearances? Now extend that concept and scale it up to macroscopic proportions. What emergent properties, then, do you imagine will unfurl against the world?
Back decades ago, chess playing programs suddenly got a lot better at beating humans.
Nothing had changed in the algorithms, but computers had increased in speed and memory.
That enabled the programs to look a little further in the game tree.
Just being able to look a move further ahead than the human greatly improved its results.
So yes, scaling up speed and memory capacity of an AI system should improve its performance too.
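The chess point above can be made concrete with branching-factor arithmetic (chess averages roughly 35 legal moves per position; the exact figure varies, so treat this as an illustration):

```python
# Game-tree size grows as b**d: each extra ply of lookahead
# multiplies the work by the branching factor b (~35 in chess).
b = 35
for d in range(1, 7):
    print(f"depth {d}: ~{b**d:,} positions")
# One extra ply costs ~35x more nodes, which is why raw speed
# gains showed up directly as playing strength.
```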
This may be the point where AI can become super intelligent
ClosedAI will not be able to catch up, it's just Gemini 1.5, not even Gemini 2.0 😂😂
Google's TPU is optimized for TensorFlow only
quality fully ai 30min movie by end of year
It's not mind-blowing; it's moving fast because they had this tech way before now. They're just releasing it now, since they probably have something far more powerful than this AI.
Video to sound ai will be key. Time to go fire up my pytorch
Do you have any source to back up your claim at 13:21? The previous model Mistral released (Mistral Medium) has not been open source, and it's been locked behind a paywall for months with no release date in sight. It makes me doubt they will release their newest model to the public, or any other models.
The trade-off for compute is always speed, until the chips accelerate in speed 18 times! Now you can get 18 times more compute in the same time?
Whoa, can the Groq AI CPU drive Sora to run in complete real time (30 FPS at least) at some point in the future? (So it would run locally in real time and would allow real-time simulations, video games, or game-engine/3D-software-like environments, and everything real-time related.) Edit: Oh! It might also allow us to run Sora in a VR/MR setting in full 3D depth (with another AI model that would split the single frames into two angles).
The main question I want an answer for is - Cure for Cancer.
Solution for poverty
Not if big pharma gets involved, gotta protect its business model
Which cancer? All cancers?
Gemini 1 does not come close to GPT-4. Hallucinates more than a drunken sailor. 😊
Most of what I want regarding AI is a GPT-4 level AI running locally in dedicated hardware. Imagine games using that stuff.
Might need to adjust your camera mount to prevent the shake.
The Mixtral model is not so smart. I asked it in what sequence one should put a nut, bolt, spring washer, and washer. It said bolt, washer, nut, tighten the nut, then put on the spring washer. Er... no.
We’ll end up having them in all computers like graphics cards. Expensive at first, big. Then it will get smaller and more powerful. You’ll have it in your phone and it will be standard
Do you remember, one decade ago (more or less), how RAM chips were so appreciated and desired by users? Why? Because at that time software evolved to the point that new hardware had to be created and embedded in the systems of the day. It's the same today. AI's continuous growth will at some point require a new AI plug-in card with a specifically designed microchip: internal, plugged into a slot, or external, connected to the machine via USB or TB.
I tried Groq; it's crazy fast, but unfortunately it still has the same drawbacks that GPU-based systems have, like the absolute inability to do even basic arithmetic, and ridiculous though very funny answers to some queries that even a six-year-old can answer correctly. For example, I asked it "what has four wheels and flies?" Its answer was, "a garbage truck, because it has four wheels, and it goes flying down the road causing bits of garbage to fly out the back". You have to give it an "A" for originality for that one!
The issue though is that their price isn't really affordable. You need a bunch of these cards to serve one instance of a model. With TPUs/GPUs you don't need that much, as they have a much higher memory than these LPUs currently have (200 or so MB).
4:05 I wouldn't be surprised if Elon sues these guys over such a similar name, or buys them out or something.
Local AI is the future.
Apple knows that ....
Mistral is not based on Llama 2, as was claimed in the video.
Damn
A bit expensive for indies, but this seems to have a ton of potential for a medium-sized developer to offer real-time AI dialogue. I know there are already games that do this, but they tend to have some awkward latency; it seems like this could get it fast enough that it would be no different than the latency of a standard video call.
Maybe. I just think pregenerating millions of lines of dialogue for every NPC could be a better initial option.
@@apache937 With text-to-speech being good enough, that would be viable, if a bit inflexible to novel interactions. Trying to store that many sound files, not so much.
@@apache937 Perhaps, but if you let the player give any input they like, then you need some logic to match a particular input to a particular output, which is essentially what we have LLMs for to begin with. So you still need the LLM; you just might be able to avoid some inference time.
My suggestion was more for random background NPC stuff, while still having the main story be as it currently is. It wouldn't be hard to add to current games as an update to make the world feel much more alive. @@user-on6uf6om7s
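A minimal sketch of the hybrid this thread lands on: canned lines for background NPCs, with a model call only as a fallback. The `(role, intent)` cache key and the `llm` callable are hypothetical placeholders, not any real game or API.

```python
# Canned dialogue for background NPCs, keyed by (role, intent);
# fall back to a (hypothetical) LLM callable only on a cache miss.
pregenerated = {
    ("guard", "greeting"): "Halt! State your business.",
    ("merchant", "greeting"): "Fresh wares, friend!",
}

def npc_reply(role, intent, llm=None):
    line = pregenerated.get((role, intent))
    if line is not None:
        return line                   # zero inference latency
    if llm is not None:
        return llm(role, intent)      # novel input: pay the LLM cost
    return "..."                      # background NPC just shrugs

print(npc_reply("guard", "greeting"))
print(npc_reply("bard", "insult", llm=lambda r, i: f"[{r} improvises]"))
```

Main-story characters would route through the `llm` path; background crowds stay in the cache, which is the latency split the thread describes.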
It is 2024 and every time someone says "mindblowing" it still blows my mind..
When will the livestream be made?
What irritates me is that even if you pay for OpenAI's subscription they are still adding watermarks to generated images, and since they do it with images it would stand to reason that they'll do it with video too. Personally, I have no interest in something I pay for having watermarks.
Hey Matt, did you hear anything about Elon Musk and Midjourney? Anything about Reddit selling their data to an AI company? I wanted to ask you to verify it, since you know more about the AI community than me.
How much will it cost vs H100 system?
We AI enthusiasts should notice that people outside our bubble are going to be upset about AI, and especially angry about Sora and its abilities (look at X). We are moving fast, but the rest of civilization is not. This can get pretty ugly. We should talk about this too and be aware of it.
First we got Grok, now we have Groq
I'm waiting for grocc
Groq was first. Elno thievo
12:07 did he even watch the MrBeast video? So many things are wrong with that summary
For costs, do we think these new AI processing units are going to be cheap or expensive? And are we talking consumer cheap ($500), consumer expensive ($2000), commercial cheap ($4000), or commercial expensive ($20,000+)?
Home users should just get a GPU, which can play games and run open-source LLMs too.
So AI-specific processing chips are nice and all, but neuromorphic processing chips are where it's at. DeepSouth is going to be insane. Hopefully not literally insane. That would be bad.
Still waiting for the development of analog computing to do AI. That will be, like... idk, massive. GPT-5 on a smartphone.
Every foley artist and company in Hollywood just died.
The Harry Potter one doesn't impress me that much. You could Ctrl+F the sentence and then scan the context of the few paragraphs around it to figure it out, and it wouldn't take that many tokens. If you asked it to write an essay based on that sentence and a different sentence in a different spot in the book, and asked it to tie it all together thematically with the entire book, then I'd be impressed and understand why it took so many tokens.
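The "Ctrl+F instead of a million-token context" idea could be sketched like this (a toy illustration of the commenter's point, not how Gemini actually works):

```python
# "Ctrl+F" retrieval: find the needle sentence and keep only a
# window of surrounding text to send to the model.
def grab_context(book, needle, window=300):
    i = book.find(needle)
    if i == -1:
        return None                       # sentence not in the book
    start = max(0, i - window)
    return book[start:i + len(needle) + window]

book = "filler " * 500 + "The secret phrase lives here." + " filler" * 500
print(grab_context(book, "secret phrase", window=40))
```

This retrieves a needle cheaply; what it can't do is the whole-book thematic synthesis the comment says would actually justify the token count.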
Yo what up!