Every Developer Needs a Raspberry Pi

Run your own AI (but private)

The moment we stopped understanding AI [AlexNet]

Co to ti kluci zase vymýšlí?🤭😅

TOHLE JSEM FAKT NEPOTŘEBOVAL VĚDĚT 😅

Секрет фокусника! #shorts

Llamafile: bringing AI to the masses with fast CPU inference: Stephen Hood and Justine Tunney

AI Engineer

zhlédnutí 39 177

Přidat do
- Můj playlist
- Přehrát později
Sdílet

Sdílet

Vložit

Velikost videa:

Zobrazit ovladače přehrávání

Automatické přehrávání

Přehrát

čas přidán 6. 09. 2024
Mozilla's Llamafile open source project democratizes access to AI not only by making open models easier to use, but also by making them run fast on consumer CPUs. Lead developer Justine Tunney will share the insights, tricks, and hacks that she and the project community are using to deliver these performance breakthroughs, and project leader Stephen Hood will discuss Mozilla's approach to supporting open source AI.
Recorded live in San Francisco at the AI Engineer World's Fair. See the full schedule of talks at www.ai.enginee... & join us at the AI Engineer World's Fair in 2025! Get your tickets today at ai.engineer/2025
About Stephen
Open source AI at Mozilla. Formerly of del.icio.us, Yahoo Search. Co-founder of Storium (AI-assisted storytelling game) and Blockboard.
About Justine
Justine is a founder of Mozilla’s LLaMAfile project, a Google Brain alumni, and the owner of the Cosmopolitan C Library. She's focusing on democratizing access to open source AI software while elevating its performance and quality.

Komentáře • 75

@Neltharion2k Před měsícem ⁺⁴³
This is so awesome! Just tried out llava 1.5 7b llamafile and it worked out of the box running on my CPU, without eating all of my RAM! The token generation speed was good enough for me! And my CPU is ~8 years old. Holy cow!
@bigglyguy8429 Před měsícem
Where gguf?
@geomorillo Před měsícem
where?
@dskbiswas Před měsícem ⁺²⁷
What did I just watch ...mindblowing! Finally someone took the initiative of going against the tide while giving CPUs some attention that they have lost to the GPU madness!
@Viewable11 Před měsícem ⁺⁸
Llamafile now supports OpenAI API and non-AVX CPUs. Finally! Having the OpenAI API is a must.
@longboarderanonymous5718 Před měsícem ⁺⁸
These individuals are pioneers of the Personal AI. Efficient, Universal, and Economical.
@navodpeiris9054 Před měsícem ⁺¹⁴
loving the llamafile already. this is how i deploy local LLMs now!
@LeftBoot Před měsícem ⁺¹
Local for yourself or clients?
@aeu126 Před měsícem ⁺¹⁵
This was my favorite presentation!
@deadlokIV Před měsícem ⁺⁷
Justine just shifted the timeline 💥🔀
@gunnarasmussen207 Před měsícem ⁺²
Well, what I'm suppossed to say but: awesome...running local AI on normal consumer hardware without any worries about privacy seemed impossible just months ago. All the computational work in GPT, Gemini and others is done in the cloud on the companies servers. So you don't know, what they are doing with your data. Even if you have nothing to hide - I'm sure erveryone has certain things, he/she wants to stay private...this seems to be the right way of implementing AI in a private manner. And doing such a great afford without any commercial Interests is nothing but mindblowing. Keep up the good work, please!
@tejaslotlikar3573 Před měsícem ⁺⁵
Now this is called achievement. Meanwhile the so-called "open"AI is looting people. You guys are awesome
@OranCollins Před měsícem ⁺¹
omg i love Justine Tunney! they are amazing!
@indylawi5021 Před měsícem ⁺³
This is fantastic! I can't wait to try it out.
@LaHoraMaker Před měsícem ⁺²
I really like the idea of a Threadripper configuration but... does anyone have a reference machine configuration for that? I'd like to compare the price to existing alternatives like the dual RTX4090 setup that is mentioned!
@dbreardon Před měsícem ⁺¹²
He said,, "Who remembers using the original Netscape Navigator?" ........to that I say, who remembers using the original Mosaic browser? And then telnet before the graphical internet?
@WoodyWilliams Před měsícem ⁺¹
[raises hand] Doh!
@tinkerman1790 Před měsícem ⁺²
“Who remembers the handshaking tone in dial-up process” 😂
@smthngsmthngsmthngdarkside Před měsícem ⁺¹
Who remembers the original smoke signals?
@Atonsha Před měsícem ⁺¹
How about BTX?
@vncstudio Před měsícem
We do! and Gopher!
@craigscott4205 Před měsícem ⁺⁵
Justine an absolute champion!
@FirstNameLastName-fv4eu Před měsícem
These cloud companies trying their best to keep the valuation high!!! This guy is the new CDO manager!!
@delq Před měsícem
Awesome, exactly what I have been looking for, no more virtual heavy environments, no more heavy nvidea cuda drivers ! Lets fricking go !!!
@Alice_Fumo Před 27 dny
well... I just took a look at the repo for the llama 3 70b llamafile repo and found this info about performance:
"AMD Threadripper Pro 7995WX ($10k) does a good job too at 5.9 tok/sec eval with Q4_0 (49 tok/sec prompt). With F16 weights the prompt eval goes 65 tok/sec."
70b would be the lower bound for model I would enjoy using, but getting like 6 tokens per second output on a 10k$ CPU... At that point I could just as well build a GPU machine...
So, even though I think this is in concept an amazing project, either it or hardware in general has a long way to go still before it is in my opinion usable for an average person such as myself..
(I'm assuming the performance data on the huggingface repo are at least somewhat accurate and not outdated)
@constantinegeist1854 Před měsícem ⁺¹
All of this was already possible before... Already back in early 2023. What they did was just save you 15 minutes (otherwise you'd have to download an inference program and weights separately)
@RomuloMagalhaesAutoTOPO Před 25 dny
Amazing. Thank you.
@spookymv Před 28 dny
it was the first time I had the chance to listen to one of his speeches. bro i like this guy. D:
@rayhere7925 Před měsícem ⁺³
This is a game-changing breakthrough. Can't underplay this any other way.
@leejacksondev Před měsícem
This is utterly brilliant. What a fantastic presentation. Amazing project.
@aiforsocialbenefit Před měsícem ⁺¹
Awesome. Great project and presenters!
@GandalfTheBrown117 Před měsícem ⁺¹
Justine is a GOAT
@raiumair7494 Před měsícem ⁺¹
Refreshing indeed - tokens per seconds is one measure and I like eval speed but what and how do you measure that?
@tollington9414 Před měsícem ⁺¹
Absolutely fascinating and totally genius
@Jason_RA Před měsícem
This is absolutely amazing!
@KevinKreger Před měsícem
Amazing❤
@eggmaster88 Před měsícem ⁺¹
Awesome work!
@omercelebi2012 Před měsícem
What about quality trade-off? Did they mention about that?
@NeXTOoOoOoO Před měsícem
Wow! Really great work!
@CaptainSpoonsAlot Před měsícem
this is just fantastic.
@john_blues Před měsícem
Is there a way to get Windows to run llamafiles bigger than 4Gb? Without being able to do that, it is very limiting in the models you can run.
@XEQUTE Před měsícem
Love it!!
@johnkost2514 Před měsícem
This is better than the Nvidia NIM solution (which is just containerization). Way better ..
@masbuba Před měsícem ⁺¹
Oh shit, CPU prices is going to hike
@Charles-Darwin Před měsícem
Awesomesauce
@ShieldsWebDesign Před 26 dny
Why is no one talking about this?
@philly_eddie Před měsícem
very cool
@Godkidz7 Před měsícem
Freedom and Justices are more expensive than Money and Power. No one live and rule forever.
Respects and Salute to you guys...
@7T7Soulz Před měsícem
this is future
@cholst1 Před měsícem
*checking on RAM prices*
@pandoraeeris7860 Před měsícem
The Singularity is here.
@GandalfTheBrown117 Před měsícem
Tired -> wired around @9:30 😂
@romanbauer Před měsícem
👏🏻👏🏻👏🏻
@erb34 Před měsícem
Don't forget the browser.
@JohnnysaidWhat Před měsícem
this guy is a fkn rockstar on stage I was totally blown away 🎉
@hope42 Před 29 dny
Am I the only one that someone AI generated Matt Perry?
@timchapman8539 Před měsícem
I need an AI that can access the files on my hard drive. Does anyone have a suggestion? I don't want to upload them to the AI. I want the AI to access them directly.
@bigglyguy8429 Před měsícem
ChatGPT4all has RAG
@ravishmahajan9314 Před 21 dnem
NVIDIA has hired CIA agents to make sure this technology is not reaching in hands of public. Be safe sir !😝
@snow8725 Před měsícem
Fuck yeah!!!
@fkxfkx Před měsícem ⁺¹
well this feels like something out of left field.🤷‍♂️
Seems too good to be true. What are the catches?
@projectsspecial9224 Před měsícem
As an AI Design Engineer and developer of original works in Unified Language Models (predecessor to LLMs) for over 20 years, this compact framework, GPU or custom hardware independence, and resource efficient methodology is the correct approach. 😊
@fkxfkx Před měsícem
⁠”a” correct approach but maybe not “the” correct approach. It’s not clear what downsides there are yet.
@bigglyguy8429 Před měsícem
@@fkxfkx I'm not sure how you're supposed to run it? GGUF I can run but what the heck is the 14GB "llamafile" thing?
@maxd3946 Před měsícem
@@bigglyguy8429 actually, you don't need a 14GB llamafile. It's even unable to be run on windows (4GB max executable size limit). You can keep a llamafile without embedding any model in it and call it with the -m parameter to specify the model file to load.
@JimAmos Před měsícem
Hats off for the engineering feat. But in terms of application, we are still just talking about text summarization. And the image generation in your own demo was just as disappointing as ever. There's no killer app for LLMs yet even though we keep throwing money and science at it. What are we even doing?
@TalsBadKidney Před měsícem ⁺²
let's go to the gym
@bobtarmac1828 Před měsícem
Free candy, I mean, Free open source Ai for everyone. It’s a like a trick. Don’t fall for it. Cease Ai.
@WenRolland Před měsícem
Great work!

Další v pořadí

Automatické přehrávání

Every Developer Needs a Raspberry Pi

Every Developer Needs a Raspberry Pi

Run your own AI (but private)

Run your own AI (but private)

The moment we stopped understanding AI [AlexNet]

The moment we stopped understanding AI [AlexNet]

Co to ti kluci zase vymýšlí?🤭😅

Co to ti kluci zase vymýšlí?🤭😅

TOHLE JSEM FAKT NEPOTŘEBOVAL VĚDĚT 😅

TOHLE JSEM FAKT NEPOTŘEBOVAL VĚDĚT 😅

Секрет фокусника! #shorts

Секрет фокусника! #shorts

Sergei Barracuda - Ostrava (OFFICIAL VIDEO)

Sergei Barracuda - Ostrava (OFFICIAL VIDEO)

The Future of Knowledge Assistants: Jerry Liu

The Future of Knowledge Assistants: Jerry Liu

AI Pioneer Shows The Power of AI AGENTS - "The Future Is Agentic"

AI Pioneer Shows The Power of AI AGENTS - "The Future Is Agentic"

"I want Llama3.1 to perform 10x with my private knowledge" - Self learning Local Llama3.1 405B

"I want Llama3.1 to perform 10x with my private knowledge" - Self learning Local Llama3.1 405B

A Day in the Life of a Machine Learning Engineer (at a *small* startup)

A Day in the Life of a Machine Learning Engineer (at a *small* startup)

Why Nvidia, Tesla, Amazon And More Are Betting Big On AI-Powered Humanoid Robots

Why Nvidia, Tesla, Amazon And More Are Betting Big On AI-Powered Humanoid Robots

Official PyTorch Documentary: Powering the AI Revolution

Official PyTorch Documentary: Powering the AI Revolution

GraphRAG: The Marriage of Knowledge Graphs and RAG: Emil Eifrem

GraphRAG: The Marriage of Knowledge Graphs and RAG: Emil Eifrem

I Built a CoPilot+ AI PC (without Windows)

I Built a CoPilot+ AI PC (without Windows)

AMD’s New Laptops: Dumb Name, Wicked Performance - Strix Point Ryzen AI 300 Review

AMD’s New Laptops: Dumb Name, Wicked Performance - Strix Point Ryzen AI 300 Review

Přilepili si tetování na druhou ruku.

Přilepili si tetování na druhou ruku.

Finding out a genie's loopholes in advance

Finding out a genie's loopholes in advance

This Famous Athlete Shocked the Olympics 👟

This Famous Athlete Shocked the Olympics 👟

Když se Netrefíš, Tento Elixír Musíš Vypít!

Když se Netrefíš, Tento Elixír Musíš Vypít!

KONEC PRÁZDNIN…

KONEC PRÁZDNIN…

NEJLEPŠÍ KVÍZ NA YOUTUBE @Duklock @EvilBender47

NEJLEPŠÍ KVÍZ NA YOUTUBE @Duklock @EvilBender47

Wait for it… 😱 #shorts

Wait for it… 😱 #shorts

Gender reveal 🤰🩵 #hannahstocking #shorts

Gender reveal 🤰🩵 #hannahstocking #shorts