I built a MONSTER AI Pi with 8 Neural Processors!

Level 2 Jeff

100,165 views

Added 27 June 2024
The thumbnail about says it all. It works! But what does that mean, exactly?
Some of the things I mentioned in this video:
- Alftel 12x PCIe Card: pipci.jeffgeerling.com/cards_...
- ThirdReality Zigbee Smart Plug: amzn.to/3X97MvC (affiliate link)
- Pi AI Kit: www.raspberrypi.com/products/...
- Hailo-8: hailo.ai/products/ai-accelera...
- MidnightLink's helpful Coral TPU setup instructions: github.com/geerlingguy/raspbe...
- CodeProject.AI: www.codeproject.com
- Raspberry Pi Linux PCIe issue: github.com/raspberrypi/linux/...
Today's YouTuber outro tribute: Chris Titus Tech
Support me on Patreon: / geerlingguy
Sponsor me on GitHub: github.com/sponsors/geerlingguy
Merch: www.redshirtjeff.com
Main Channel: / @jeffgeerling
2nd Channel: / @geerlingengineering
Contents:
00:00 - You built a WHAT?
00:45 - The hardware stack
02:26 - Power monitoring
03:06 - Getting Coral and Hailo to work
05:44 - 55 TOPS!
06:30 - Not so fast...
Science & Technology

Comments • 202

@electrofreak0 22 days ago ⁺²²¹
Can't wait for a decade from now when they're packing 1024 TOPS into "entry-level" devices claiming "you definitely need all this power for current models"
@heblushabus 22 days ago ⁺¹⁵
1 BOPS?
@Level2Jeff 22 days ago ⁺⁷²
640 TOPS ought to be good enough for anyone
@tuqe 22 days ago ⁺⁵
@heblushabus T = trillion, so B would be billion. Next step is Pflops for Petaflops
@heblushabus 22 days ago ⁺¹¹
@tuqe oh, right. bops sounded funny tho. so, POPS?
@Draggeta 22 days ago ⁺¹
@tuqe isn't T tera in this case?
@jonathantribble7013 22 days ago ⁺⁴²
It's nice to know that you could run multiple neural networks on independent NPUs! Like one for facial expressions, another for voice recognition, and another for text-to-speech!
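A minimal sketch of that idea, assuming each NPU can be driven independently from its own host thread. The three model functions and the device names (`npu0` and so on) are hypothetical stand-ins, not calls from any real NPU runtime:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for three unrelated networks.
# In a real setup each would call into the runtime of one specific NPU.
def facial_expression(frame):
    return f"expression({frame})"

def voice_recognition(audio):
    return f"transcript({audio})"

def text_to_speech(text):
    return f"audio({text})"

# One model per device: nothing is shared, so the NPUs never need to coordinate.
ASSIGNMENTS = {
    "npu0": (facial_expression, "frame-1"),
    "npu1": (voice_recognition, "clip-1"),
    "npu2": (text_to_speech, "hello"),
}

def run_all(assignments):
    """Submit every (model, input) pair on its own thread and collect the results."""
    with ThreadPoolExecutor(max_workers=len(assignments)) as pool:
        futures = {dev: pool.submit(fn, arg) for dev, (fn, arg) in assignments.items()}
        return {dev: fut.result() for dev, fut in futures.items()}

results = run_all(ASSIGNMENTS)
```

This is the easy case for multi-NPU setups, because each network fits on one device and no work has to be split across them.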
@Rostol 22 days ago ⁺³¹
seriously tho, 26 TOPS on that small thing is impressive by itself. combining a card with a few of those 26ers with something with PCIe Gen 4 x16 would make for impressive AI processing in a small package.
@Level2Jeff 22 days ago ⁺⁶
Yeah, Hailo makes a 'Century' card that does just that. Hopefully as multi-NPU becomes more popular, programming for it also becomes more popular!
@tinkerscustom9568 22 days ago ⁺³⁷
i would love a video on the home assistant power consumption!
@OmarMekkawy 22 days ago
+1 Yes, me too
@bluesquadron593 22 days ago
Well that would be a 1 minute video 🤣
@Rushil69420 22 days ago ⁺²¹
Oh thank god, I’ve been itching ever since you showed that b-roll
@Petra-yd1fi 22 days ago ⁺⁶
This reminds me of the Friends meme where Chandler gets a fancy expensive computer, and when asked what he is going to use it for he just says "Idk, games and stuff".
The infra is still fun even if you don't actually need the compute lol.
@RaineyPeng 22 days ago ⁺²⁴
My favorite part of this video is definitely when the box gets identified as a cell phone and he holds it up to his ear 😅
@TheJonathanc82 22 days ago ⁺³
Never stop doing what you do Jeff. Love the content, love the experimentation.
@markaphillips14 22 days ago ⁺⁴
Jeff!!! The increase in content per week has been amazing. Don’t overdo it but man I’m loving it
@ewasteredux 22 days ago
Bravo Jeff! That was a lot of work on your part. Again, congrats and thanks for all the hard work you do for us!
@DiamondMaster115 22 days ago ⁺⁵
I had no idea this channel existed, this is awesome!
@Level2Jeff 22 days ago ⁺³
and now you do, ha! this is the channel where things get crazy
@gu9838 3 days ago
hehe a pi mad scientist cobbling together contraptions no one thought of. love it! great scott!
@fridje 22 days ago ⁺¹⁶
🤗 Accelerate might be able to manage using all the NPUs at once. It's meant for using multiple GPUs for a single inference or training task but it might support NPUs too
@zepesh 22 days ago ⁺⁵⁴
That was fast
@inferno14142 22 days ago ⁺³
I was about to say that
@Level2Jeff 22 days ago ⁺¹⁸
My AI predicted it
@neb_setabed 22 days ago ⁺⁵
that's what she said!
@harriet-x.x 22 days ago
@Level2Jeff u okey? u look a little red :p
@LeeZhiWei8219 22 days ago ⁺¹
Hey man! Wow. Saw this on Instagram and thought this would be on the main channel 😂. But awesome job man! This is so great 😂
@realandrewhatfield 22 days ago ⁺⁷
OMG!!! Who are you going to be from now until the next video?!?! End of an era...
@Level2Jeff 22 days ago ⁺¹
Ha!
@the_hetman 22 days ago ⁺⁶
It’s always fun to see you pushing the limits on single board computers. I’m sure it helps the hardware makers consider unusual use cases.
Jeff, have you seen the stories on the new research into IBD which suggests a genetic link and that certain cancer drugs might be effective? I saw the story on the BBC news site. Obviously I can’t post the link here, but I thought you might be interested.
@Level2Jeff 22 days ago ⁺¹
Yes, I've been following that research. It's still 5+ years out from being practical, but it's promising. The key is to find drugs that have fewer side effects than current TNF-blockers (which have similar mechanisms but target the immune system more broadly).
@ur1friend437 21 days ago
Love these kinds of videos, so keep 'em coming
@dfgdfg_ 22 days ago ⁺¹
"I've created a monster!
No one wants to see Marshall no more,
They want Jeff.
I'm like chopped liver"
@prince3121 22 days ago
Jeff the mad RPi guy! Love these builds when you push the envelope! 😎🤣
@stillblazinkush 18 days ago ⁺¹
Level 2 Jeff is truly on another level.
@Level2Jeff 17 days ago
The second one!
@JTB_Computers 20 days ago
Here before 20k subscribers! Keep up the great work Jeff
@tinkerscustom9568 22 days ago ⁺¹
thanks for the video!
@Argent_99 22 days ago ⁺¹
I’m amazed by the things you are doing with the humble RPi.
Not as amazed as that fractal north case for the pi5 they are showing off at computex, but still amazed.
@Bill_the_Red_Lichtie 22 days ago ⁺³
I love the crazy/nuts thing that is this set up! And, even if the "Coral Dual TPU" only shows up as one of them, I'll bet that it is still cheaper and faster than the USB version. Now, where was that AM power meter that your Dad has . . . ? ;-)
@MaxHeadroomGPT 19 days ago
Jeff, I too would love to see a video on how to set up a Home Assistant Dashboard for Power Monitoring. You sir, are a Wizard!
@sysfried 21 days ago
We love you for your tinkering!
@LifeOfTylerHughes 22 days ago ⁺¹
I'd love to see a version of this with the CM5 when it comes out. I think it would be cool to build an ITX-ish size ARM AI computer that you could use for all kinds of AI projects.
@jeremybarber2837 22 days ago
I would love to see a video going over your choice of power metering smart plugs & the integration into HomeAssistant.
@TheMostOrdinaryPersonOnEarth 22 days ago ⁺¹
Please make a power outlet video, I'm just getting into HASS now and power monitoring is next on my list - Also great video cheers!
@asksearchknock 22 days ago
Great video 😊 - slightly off topic, but do you have any thoughts on using the overlay file system with the Raspberry Pi, and whether it will really protect Pi OS when the Pi is powered off without shutting down? It would perhaps make a cool video to hook up a Pi to a relay and have it turn on and off hundreds of times to see if and when the OS corrupts. Have you done anything with the overlay / read-only file system before?
@higon99 22 days ago ⁺¹
Oh, man. That's a visually menacing pi XD
Yesterday, you tried and SPAGHETTIBLY FAILED chaining NPU and now this. lol
Thank you for the attempts otherwise I would have tried myself. I think 2 NPUs can easily work with running 1 neural network on each NPU. This kind of configuration can realize many real world applications I have been dreaming of so many years. Thank you again.
@Level2Jeff 22 days ago ⁺¹
Definitely! I think on the Pi 5 at least, that would probably be the ideal number of NPUs. You could stretch it to 4 okay too, but at that point the cost/build could point you to something a bit beefier like Jetson Orin.
@ws01212 21 days ago
I really like your videos; they always bring novel information.
Do you know when it will support running NVMe SSD and Hailo AI Kit simultaneously? (with the NVMe SSD used as the system boot disk)
@AerialWaviator 19 days ago
Being able to run two Hailo-8s would be a cool project, as in theory it would offer 2x 26, or up to 52 TOPS. Combined with dual Pi cameras, it would offer high frame rate stereo depth of field, or other fun video processing.
BTW: the Home Assistant monitoring is fascinating. Would be interested in hearing more details.
@catsupchutney 22 days ago
Jeff, you have to write a book with a chapter on each of these types of Pi mods.
@BrianMaddox 22 days ago ⁺²
I appreciate that there are so many different TPUs/NPUs on the market, I’m just frustrated that we’re all beholden to nVidia when it comes to actually training models and running a lot of things.
@Level2Jeff 22 days ago ⁺³
Ditto. Wish at least AMD could offer something that would take the bottom out on either price or efficiency, but right now it is what it is :(
@BrianMaddox 22 days ago
@@Level2Jeff I’ve got a used Tesla P40 with a water cooler in addition to another rtx 2060. Now that Intel works with Tensorflow and PyTorch, I’ve seriously considered just getting a 16 gig Arc 770 and paying for cloud computing if I need to train a model that needs more memory.
@omersalem73 21 days ago ⁺¹
Hi Jeff, just wanted to let you know that you CAN use multiple Hailo cores together, using the VDevice API - it automatically identifies the cores (granted it will only work with Hailo chips).
@Level2Jeff 21 days ago
HailoRT seems to have some multi-core configuration too... definitely some fun to be had here!
@redactedofficial 22 days ago
Hell yeah, finally you don't just tease😂 i was waiting soo badd, am i rpi addicted🤨
@afaulconbridge 22 days ago ⁺¹
This would be useful to run Frigate AI detection from multiple cameras (e.g. PoE RTSP streams). Even this many devices is less power hungry than a 4080 to run 24x7, and compared to the cost of a good IP camera it's not unreasonable.
@worfrozhenko4032 22 days ago
0:04 Just flappin' in the breeze
@LockonKubi 22 days ago ⁺¹⁴
"Why?" well to quote The Doctor from The Waters of Mars, "Fun"
@TT-it9gg 22 days ago
Nice!
So, can other SBCs with PCIe Gen 3 x4 leverage the full power of all the AI chips installed?
@oasismike2905 21 days ago
Thanks for trying AND explaining the problems!
Can't wait for someone to write a "Kitchen Sink Aggregator" patch that'll normalize or make modules generically access -- ahh, I just wanted to use kitchen sink in a sentence.
@Sintrania 16 days ago
We have level 1 tech, level 2 jeff, what’s next, level 3 steve 😂
@jacquesdupontd 21 days ago
Hey, thanks a lot for your videos. I'm wondering about something that I'm sure has been addressed, but I can't find a definitive answer on it. Do TPUs accelerate local LLM answer generation (Ollama, for example)? Thanks
@ingusmant 21 days ago ⁺⁶
300 TOPS sounds like a lot until you realize even an aging RTX 3080 has about 7400 TOPS. Sure the power use is different but point is you won't be running any big LLM models on this.
@pieterboots8566 12 days ago
Is this really correct? I don't believe this.
@MisiSzucs 9 days ago
@pieterboots8566 I don't think that this is correct.
@DathCoco 22 days ago ⁺³
AI models can run in pipeline mode (if you have multiple tasks): for example, YOLO to draw a bounding box around a face, a second model that checks your expression, a third that correlates your face with known faces, and so on. I doubt it is possible to run these LLMs on these TPUs, unless they somehow split their model into n models.
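The pipeline mode described here can be sketched roughly as below, with one worker thread per stage standing in for one accelerator per model. The three stage functions are hypothetical placeholders (no real YOLO or face-recognition model is called), and the queue-based hand-off is just one possible way to chain the stages:

```python
import queue
import threading

SENTINEL = object()  # marks the end of the frame stream

def stage(fn, inbox, outbox):
    """Pull items from inbox, apply fn, push results to outbox; forward the stop marker."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)
            return
        outbox.put(fn(item))

# Hypothetical stage functions, each standing in for a model on its own NPU.
def detect_face(frame):
    return {"frame": frame, "box": (0, 0, 64, 64)}

def classify_expression(det):
    return {**det, "expression": "neutral"}

def match_identity(det):
    return {**det, "identity": "unknown"}

def run_pipeline(frames, stages):
    """Chain the stages with FIFO queues so different frames are in flight at once."""
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=stage, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(stages)
    ]
    for t in threads:
        t.start()
    for frame in frames:
        queues[0].put(frame)
    queues[0].put(SENTINEL)
    results = []
    while True:
        item = queues[-1].get()
        if item is SENTINEL:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results

out = run_pipeline(["f0", "f1"], [detect_face, classify_expression, match_identity])
```

While the third stage is matching the identity for one frame, the first stage can already be detecting faces in the next one, which is where the throughput gain comes from.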
@MystikIncarnate 22 days ago
Hey Jeff! I'm still pretty curious about everything "AI". I'm just not sure how to take advantage of any of it right now. I'd love to see a video going over a bunch of different AI projects that these can be used for, either here or on the main channel. Obviously frigate is one, I've also seen some self hosted AI chat bots, though I'm not sure how well any of them would fare on a pi.
I know you're the "pi" YouTuber, but I'm also curious about other applications of such accelerators. I wonder if an AI chat bot would work decently well on a 1L PC (or some similar micro x86 system) using something like the Hailo for processing rather than trying to cram a GPU in a small system like that.
If you know of a YouTuber who is doing that sort of thing, I'm happy to check them out, just let me know.
Keep up all the cool videos. Cheers!
@stratos7755 22 days ago ⁺³
If you want 200 TOPS, just buy the Hailo-8 Century. It has 208 TOPS and is a single PCIe card. Although a normal person probably can't buy it...
@Level2Jeff 22 days ago ⁺¹
I'll just do it myself! Haha, so far I haven't found a reliable way to buy Hailo-8 outside of being an OEM partner. Hopefully they open up more individual sales at some point.
@w13rdguy 20 days ago
I know you won't tribute aVe, but, that would be hilarious!😂
@Flare1107 22 days ago ⁺¹
How do GPU bit miners support the core splits? Some setups are looking at spreading a single calculation over thousands of cores. Maybe there's a way to port a miner task divider to NPU tasks. But I also wonder if we are still limited to running at each TPU's individual floating-point rating? Or could we run full 32-bit models?
@stevanastardust8487 19 days ago ⁺¹
I actually met the alftel guy and he is a GENIUS with RF stuff.
@Level2Jeff 19 days ago
I don't doubt it. The contraptions he makes...
@glabifrons 22 days ago ⁺¹
At first i was thinking the tribute was to Major Hardware, but the intonation and rhythm is completely wrong... it's closer to Explaining Computers, but the line doesn't match.
Ya got me on that one! @Level2Jeff !
@egrinant2 22 days ago ⁺¹
I also recognized the tone as Explaining Computers, but the message doesn't match.
@Level2Jeff 21 days ago ⁺³
Check the description ;)
I've started to put the tribute into the bottom of the description to make it a little easier :)
@glabifrons 21 days ago
@Level2Jeff I skimmed and totally missed it! Guess I should've ^F-d instead! 😄
@saiyantwan 22 days ago
think you might be able to look over the NanoPi boards since most have a built-in NPU? I know the NanoPi R5S has one. Not that much of one but still there
@awetmore 21 days ago
The trick to using those NPUs in parallel will be building a pipeline across them, where each NPU is supporting a subset of the layers in a model. This is a common technique in both training and inference, though I'm not sure if TensorFlow Lite supports it.
A pipeline would allow you to partition the model weights and compute across all of the NPUs, giving you a chance to run larger models than you could on a single NPU. Your PCIe setup is very low bandwidth, but that is less of a concern here because pipeline parallel is only sending the activations (relatively small input tensors in inference) between the NPUs, not the larger weights.
Based on the limited information that I can find about Hailo and their sample hardware (a few of them have many Hailo chips on a single PCIe card) it looks like their software may support this.
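To make the weights-versus-activations point concrete, here is a rough sketch that splits a model's layers into contiguous groups (one per NPU) and compares the weight bytes that stay put with the activation bytes that cross a device boundary on each inference. The layer sizes are made-up illustrative numbers, and `partition_layers` is a hypothetical helper, not part of TensorFlow Lite or HailoRT:

```python
# Each layer: (name, weight_bytes, output_activation_bytes). Sizes are illustrative only.
LAYERS = [
    ("block1", 4_000_000, 200_000),
    ("block2", 4_000_000, 100_000),
    ("block3", 4_000_000, 50_000),
    ("block4", 4_000_000, 4_000),
]

def partition_layers(layers, n_devices):
    """Split layers into contiguous groups with roughly equal weight bytes."""
    if n_devices < 1 or len(layers) < n_devices:
        raise ValueError("need at least one layer per device")
    target = sum(w for _, w, _ in layers) / n_devices
    groups, current, acc = [], [], 0
    for i, layer in enumerate(layers):
        current.append(layer)
        acc += layer[1]
        layers_left = len(layers) - i - 1
        groups_left = n_devices - len(groups) - 1
        # Close the group once it reaches the target size, while always
        # leaving at least one layer for every group still to be filled.
        if groups_left > 0 and (acc >= target or layers_left == groups_left):
            groups.append(current)
            current, acc = [], 0
    groups.append(current)
    return groups

def boundary_traffic(groups):
    """Bytes of activations handed from one device to the next per inference."""
    return sum(group[-1][2] for group in groups[:-1])

groups = partition_layers(LAYERS, 2)
weights_total = sum(w for _, w, _ in LAYERS)
traffic = boundary_traffic(groups)
```

With these made-up numbers, 16 MB of weights are loaded once and stay on the two devices, while only 100 KB of activations move between them per inference. On a setup like the one in the video, that hand-off would presumably still travel through the host and the shared PCIe switch.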
@yorkan213swd6 22 days ago
Great video, does this also work with the Pi 4? Please also mention in future whether projects work with the Pi 4 where possible, since not everybody has the latest and greatest :-)
@Level2Jeff 21 days ago
It works with CM4, but not Pi 4 directly since the Pi 4 doesn't expose PCIe without a pretty serious hardware hack.
@HamguyBacon 18 days ago
Can this be used for LLMs and Stable Diffusion, or is it only useful for video tracking?
one day graphics cards will be replaced by AI accelerator cards, and all you'll need is a low power gpu.
@farzadb82 21 days ago
@level2Jeff Where did you buy the Hailo-8 card? On their website they only provide an option for product enquiry, not for purchase.
@GameDesignerJDG 22 days ago ⁺¹
I wonder how hard it would be for a Pi to orchestrate all these TPUs together. Could it offload some orchestration onto another TPU? I kind of want to write some code for this thing to see how it performs with multiple processors.
@tonysheerness2427 22 days ago
As prices drop for NPUs, people will use them more, and then developers will write the software for them. That is what the Raspberry Pi was designed for: learning in a fun way.
@justinknash 22 days ago
Do these NPUs / TPUs work with ollama and the llama3 model? Currently running ollama and llama3 on my gaming PC which has a 4060ti.
@sirkingjamz101 22 days ago
You chose to switch "geerlings" and it seems it paid off for the better :)
@gamereditor59ner22 22 days ago ⁺¹
Interesting, but cool!
@CoreDreamStudios 21 days ago
Great information in the video. Is there a PCI-e board like this that would work in a desktop, without buying the Nvidia RTX 40 series?
@Level2Jeff 21 days ago
Hailo makes a 200+ TOPS 'Century' card that straps a bunch together and would fit inside a desktop case (full height PCIe card).
@CoreDreamStudios 21 days ago
@Level2Jeff Thank you so much. 🙂
@OriNachum 20 days ago
Can you show us actual runs? Also would love to see the 10H version when it comes to your hands.
@avibank 22 days ago
Might be interesting for scientific computing. Can you send instructions via MPI or something?
@ssteele1812 22 days ago
Hello Jeff. Odd question here. I have a Lenovo X230 that I will be repurposing to "play" with local AI, specifically LLMs. It is going to struggle with the onboard GPU and I will eventually be moving everything to a different machine that will let me add a proper GPU card. Until then, is there a way that I could use a PCIe slot to NVMe/M.2 adapter to put one of these little M.2 AI chips in the laptop? Since the machine has USB3 ports on it already, there isn't really anything useful I can put in the slot. If the adapter actually worked, would the extra AI board do me any good without custom software to utilize it?
@Level2Jeff 22 days ago ⁺¹
Yes, at least under Linux. Not sure about Windows support for these things.
@andre-le-bone-aparte 22 days ago
Question: Oobabooga (text-generation) Web GUI supports multiple GPUs and TensorFlow (TPU) with mixing + matching - would that work for your setup?
@phischtv4497 15 days ago
What's the best low-cost solution today for running local image-detection CNNs on a Pi4 or Pi5? Those USB-TPUs?
@RobertFabiano 22 days ago
thanks for tinkering
@TheHillsideStudio 7 days ago
r u gonna do it w/ the Hailo?
@isaacyonemoto 17 days ago
Aren't they all sharing memory bandwidth? Are there any AI M.2s with onboard memory?
@jobasti 1 day ago
@Level2Jeff - Power Monitor Dashboard all the things - YES PLEASE
@jeremybarber2837 22 days ago
Oh man… this makes me think of how great a CM5 board akin to the CM3588 NAS board from FriendlyElec but for NPUs would be. Wait… could you just use that board for NPUs as is?
@RickySupriyadi 22 days ago
So the Google TPU might be an LPU arranged in a certain way? Doesn't seem so, because LPUs have huge memory chips... well, the TOPS seem to match. I wonder where this experiment will go. This is really interesting. Really interesting indeed!
@anonymousshoe842 21 days ago
Btw what's the SSD and WiFi card for?
@Chapbook 22 days ago
What is the best combo for pi hat AI NPU and also running an NVMe SSD all together?
@Level2Jeff 22 days ago
probably either the NVMe BASE duo from Pimoroni or the dual NVMe board from Pineboards right now.
@thegreeneyej 22 days ago
Yes, a dual M.2 hat with a boot and storage SSD, and that Hailo card for Frigate, object detection, write to a DB that is searchable (events) in plain text…😅
@tuttocrafting 22 days ago ⁺⁴
We need cheap PLX chips that take a newer standard bus in uplink and can provide many downlinks at lower speed.
Also on consumer mainboards it is starting to be annoying: 20 or more PCIe Gen 5 lanes, while almost all hardware is Gen 3 or 4.
Lot of throughput lost.
@Level2Jeff 22 days ago ⁺²
Couldn't agree more. Though those of us needing all that PCIe goodness are a slightly rare breed... and the answer till now is usually buy a big server CPU that gobbles up like 120W idle :D
@tuttocrafting 22 days ago
@Level2Jeff Even if we are the minority, I think the approach Apple took with the Mac Pro, using a PLX switch and many lanes/slots, is something that other HEDT/workstation OEMs should start to consider.
New AM5 EPYC has been announced, but a lot of bandwidth is lost if you just plug a single HBA card into the x16 slot on today's mainboards.
@Xamz_pok 11 days ago ⁺¹
Can these boards be plugged into a full-fledged PC?
@azertyQ 21 days ago
There's a Qualcomm AI accelerator that uses dual M.2 slots; that board could be perfect for it (I've never been able to find the spec for dual M.2, so the spacing might be off). Just one of those could get 200 TOPS/25W
Good luck finding one though...
@Megaflare47 22 days ago
I think the name is probably "Micro PC city" and the u is used as a substitute for mu because it wouldn't be recognizable at that size.
@turnkit 22 days ago
before the WHAT it's compelling to know the WHY.
waited for the payoff but I guess I don't get enough of why I'd want to use all these TOPS
I'm excited about using new tech but using it is the key. The excitement about the build comes after the excitement about the functional power. So I guess I don't get this video too much.
@guy_autordie 22 days ago
You need that 12x M.2 card with an x16 connector, to put in your Ampere workstation.
@echobucket 22 days ago
Can you run LLMs on these NPUs?
@alexlandherr 22 days ago
Maybe call it “Zora” after the AI on the USS Discovery?
@balazsfitz7517 22 days ago
Well, technically, _you_ don't see _us_ but _we_ see _you_ ... ;-)
@jameswarnock5655 22 days ago ⁺²
I had trouble getting my Coral to work with CodeProject.AI long-term. It would work for a few hours but then stop responding. I don't know where the issue was on that. I mostly just gave up on it.
@Level2Jeff 22 days ago ⁺²
I had a similar issue in my testing, though didn't take too much time to debug it. One time it seemed to lock up the frontend of the Pi, had to force-poweroff!
@BaldyMacbeard 19 days ago
It's cool to see that working _technically_, but in terms of inference, that's kinda the equivalent of having 1000 CPU cores running on a single, shared stick of DDR4 memory. Do you hypothetically have enough cores to run a ton of applications at the same time? Yeah. But how much you'll be able to do with the system will depend on the throughput of your memory. And it's the same with these little compute sticks. They rely on your Pi's memory, so while you have a lot of TOPS on paper, running any task that operates on a reasonable amount of data or live streaming data will quickly saturate your onboard memory and PCIe bus and prevent you from doing anything useful with multiple NPUs
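A back-of-envelope version of that bottleneck, as a sketch only. It assumes every NPU sits behind one shared PCIe x1 uplink (Gen 2 at 5 GT/s with 8b/10b encoding, or Gen 3 at 8 GT/s with 128b/130b), ignores packet and protocol overhead, and uses an uncompressed 640x640 RGB frame as a hypothetical model input:

```python
def link_bytes_per_s(gt_per_s, encoding_efficiency, lanes=1):
    """Theoretical payload rate of a PCIe link in bytes/s, before protocol overhead."""
    return gt_per_s * 1e9 * encoding_efficiency / 8 * lanes

GEN2_X1 = link_bytes_per_s(5.0, 8 / 10)     # about 500 MB/s
GEN3_X1 = link_bytes_per_s(8.0, 128 / 130)  # about 985 MB/s

def max_fps_per_device(link_bps, frame_bytes, n_devices):
    """Upper bound on input frames/s per device when all devices share one uplink."""
    return link_bps / frame_bytes / n_devices

FRAME_BYTES = 640 * 640 * 3  # hypothetical uint8 RGB input tensor

fps_gen2 = max_fps_per_device(GEN2_X1, FRAME_BYTES, n_devices=8)
fps_gen3 = max_fps_per_device(GEN3_X1, FRAME_BYTES, n_devices=8)
print(f"Gen 2 x1 shared by 8 NPUs: ~{fps_gen2:.0f} frames/s each at best")
print(f"Gen 3 x1 shared by 8 NPUs: ~{fps_gen3:.0f} frames/s each at best")
```

Real numbers would be lower once overhead and result traffic are counted, so the shared uplink rather than the combined TOPS figure is the likelier ceiling for streaming workloads.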
@AbdelkaderBoudih 22 days ago ⁺⁹
Stop posting those AI-generated videos. I know it's not you because you didn't recompile the kernel.
@Level2Jeff 22 days ago ⁺⁴
AI Recompile, go!
@AbdelkaderBoudih 22 days ago
@Level2Jeff You mean AI ReTrain.
@GrahamCantin 22 days ago
Marco Reps at the end?
@nekomakhea9440 22 days ago
I wonder how feasible it is to put an abstraction layer in front of the 25 Coral TPUs so that the software only sees one big TPU?
That's more-or-less what RAID drivers for ZFS and BTRFS do so clusters of 25+ drives appear as one block device. And what CPUs do to make scalar code think it's still being executed in-order as the only process on a single-threaded CPU in a virtual memory space, despite out-of-order execution, superscalar execution, hyperthreading, multithreading, branch prediction, and more, all happening in the background.
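The simplest form of such a layer is a dispatcher that exposes one `infer` call and spreads requests across the devices behind it, loosely like striping requests across drives. The sketch below uses a hypothetical `FakeTPU` class in place of any real Coral or Hailo runtime:

```python
import itertools

class FakeTPU:
    """Hypothetical stand-in for one accelerator; a real one would wrap a vendor runtime."""
    def __init__(self, name):
        self.name = name
        self.calls = 0

    def infer(self, x):
        self.calls += 1
        return (self.name, x * 2)

class PooledAccelerator:
    """Presents many devices as one: callers see a single infer() method."""
    def __init__(self, devices):
        self._devices = list(devices)
        if not self._devices:
            raise ValueError("need at least one device")
        self._next = itertools.cycle(self._devices)

    def infer(self, x):
        # Round-robin: each request goes whole to the next device in turn.
        return next(self._next).infer(x)

devices = [FakeTPU(f"tpu{i}") for i in range(3)]
pool = PooledAccelerator(devices)
outputs = [pool.infer(i) for i in range(6)]
```

This kind of pooling raises throughput for many small requests, but it does not make the pool behave like one larger TPU: a model too big for a single device still has to be split across devices explicitly.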
@knifekitty_ls 22 days ago
time to find some splitters for converting m.2 x4 into 4 m.2 x1 plugs for those dual-edge tpus
@ryamelp 22 days ago ⁺¹
I know midnightlink irl
@hyperverbal 22 days ago ⁺¹
I wonder if Bend would work with this config ❤
@davesaquarium4825 3 days ago
Have you ever worked with a jetson nano by nvidia?
@thesimplicitylifestyle 11 days ago
Yes! 😎🤖
@HiddenPalm 18 days ago
Soooo does it like tell the time?
@truckerallikatuk 22 days ago
Jeff doing a Jeff impersonation? Tell me it isn't so!
@novantha1 22 days ago
So like, objectively I know that the Hailo isn't really meant for inference of production grade LLMs or anything.
...But like, I still want to see if a person couldn't do something silly like a bespoke MoE architecture with 128M active parameters and still get okay quality and speed.
