Zero to Hero LLMs with M3 Max BEAST

Alex Ziskind

108,730 views

Added June 1, 2024
M3 Max is a Machine Learning BEAST. So I took it for a spin with some LLMs running locally.
I also show how to do GGUF quantizations with llama.cpp
Temperature/fan on your Mac: www.tunabellysoftware.com/tgp... (affiliate link)
Run Windows on a Mac: prf.hn/click/camref:1100libNI (affiliate)
Use COUPON: ZISKIND10
🛒 Gear Links 🛒
* 🍏💥 New MacBook Air M1 Deal: amzn.to/3S59ID8
* 💻🔄 Renewed MacBook Air M1 Deal: amzn.to/45K1Gmk
* 🎧⚡ Great 40Gbps T4 enclosure: amzn.to/3JNwBGW
* 🛠️🚀 My nvme ssd: amzn.to/3YLEySo
* 📦🎮 My gear: www.amazon.com/shop/alexziskind
🎥 Related Videos 🎥
* 🌗 RAM torture test on Mac - • TRUTH about RAM vs SSD...
* 🛠️ Set up Conda on Mac - • python environment set...
* 👨‍💻 15" MacBook Air | developer's dream - • 15" MacBook Air | deve...
* 🤖 INSANE Machine Learning on Neural Engine - • INSANE Machine Learnin...
* 💻 M2 MacBook Air and temps - • Why SILVER is FASTER
* 💰 This is what spending more on a MacBook Pro gets you - • Spend MORE on a MacBoo...
* 🛠️ Developer productivity Playlist - • Developer Productivity
🔗 AI for Coding Playlist: 📚 - • AI
Timestamps
00:00 Intro
00:40 Build from scratch - manual
09:44 Bonus script - automated
11:21 LM Studio - one handed
Repo
github.com/ggerganov/llama.cpp/
Commands
//assuming you already have a conda environment set up, and dev tools installed (see videos above for instructions)
Part 1 - manual
brew install git-lfs
git lfs install
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
pip install -r requirements.txt
make
git clone huggingface.co/teknium/OpenHe... openhermes-7b-v2.5
mv openhermes-7b-v2.5 models/
python3 convert.py ./models/openhermes-7b-v2.5 --outfile ./models/openhermes-7b-v2.5/ggml-model-f16.gguf --outtype f16
./quantize ./models/openhermes-7b-v2.5/ggml-model-f16.gguf ./models/openhermes-7b-v2.5/ggml-model-q8_0.gguf q8_0
./quantize ./models/openhermes-7b-v2.5/ggml-model-f16.gguf ./models/openhermes-7b-v2.5/ggml-model-q4_k.gguf q4_k
./batched-bench ./models/openhermes-7b-v2.5/ggml-model-f16.gguf 4096 0 99 0 2048 128,512 1,2,3,4
./server -m models/openhermes-7b-v2.5/ggml-model-q4_k.gguf --port 8888 --host 0.0.0.0 --ctx-size 10240 --parallel 4 -ngl 99 -n 512
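The two ./quantize calls above follow the same pattern, so they can be generated in a loop. The sketch below is only an illustration: `MODEL_DIR`, `QUANTS`, `DRY_RUN`, `quant_path` and `quantize_cmd` are hypothetical names introduced here, not part of llama.cpp. It assumes you are in the llama.cpp directory and that the f16 file from convert.py already exists. By default it only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/usr/bin/env bash
# Sketch: build the ./quantize commands from Part 1 for several quant types.
# MODEL_DIR, QUANTS, DRY_RUN, quant_path and quantize_cmd are illustrative names.

MODEL_DIR="${MODEL_DIR:-./models/openhermes-7b-v2.5}"
QUANTS="${QUANTS:-q8_0 q4_k}"   # the two quant types used in the commands above
DRY_RUN="${DRY_RUN:-1}"         # 1 = only print the commands, 0 = run them

# Output file name for a given model directory and quant type,
# matching the ggml-model-<type>.gguf naming used above.
quant_path() {
  printf '%s/ggml-model-%s.gguf' "$1" "$2"
}

# Full ./quantize command line for one quant type.
quantize_cmd() {
  printf './quantize %s %s %s' "$(quant_path "$1" f16)" "$(quant_path "$1" "$2")" "$2"
}

for q in $QUANTS; do
  cmd="$(quantize_cmd "$MODEL_DIR" "$q")"
  echo "$cmd"
  if [ "$DRY_RUN" = "0" ]; then
    # Unquoted on purpose so the string splits into arguments;
    # this assumes the paths contain no spaces.
    $cmd
  fi
done
```

Running it as-is prints one ./quantize line per quant type, which makes it easy to check the paths before committing to a long quantization run.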
Part 2 - auto
bash -c "$(curl -s ggml.ai/server-llm.sh)"
💻 MacBooks in this video
M2 Max 16" MacBook Pro 64GB/2TB
- - - - - - - - -
❤️ SUBSCRIBE TO MY YouTube CHANNEL 📺
Click here to subscribe: / @azisk
- - - - - - - - -
Join this channel to get access to perks:
/ @azisk
#m3max #macbook #macbookpro
- - - - - - - - -
📱 ALEX ON X: / digitalix
Science & Technology

Comments • 312

@AZisk 1 month ago
JOIN: youtube.com/@azisk/join
@AdamsTaiwan 12 days ago
Just tried LM Studio on my desktop. Was able to connect my 8-year-old notebook's VS Code to it using Code GPT. Pretty nice, but still looking for a solution that can scan my whole VS solution and tell me where to fix my problems.
@giovannimazzocco499 6 months ago ⁺¹²
Excellent stuff. I searched YouTube for weeks to find benchmarks of DNN models on M3. This is the first and only one I've found so far. There is a ton of videos on video editing, graphics, gaming and music production on M3s. But as far as fresh material about machine learning on Apple Silicon is concerned, I'm pretty convinced you're the only game in town. Keep it up. Looking forward to seeing more benchmarks.
@Kevin-hx3ci 4 months ago ⁺⁸
Alex I am so happy I found your videos on YouTube because I had been looking for someone to help tutor me on tech stuff on Mac. Can’t express how helpful this has been for me.
@atldeadhead 5 months ago ⁺²
I enjoy all your videos but this one was particularly interesting. I look forward to future videos that explore machine learning leveraging the power of the M3 Max. Fantastic stuff, Alex. Thank you!
@catarinamoreira4805 6 months ago ⁺⁵
This is fantastic! Thank you so much! More content on LLMs, please!
@MaxTechOfficial 6 months ago ⁺¹¹⁰
Keep up the good hustle, Alex! -Vadim
@AZisk 6 months ago ⁺¹¹
Thanks Vadim!
@univera1111 6 months ago ⁺¹
@AZisk if I may ask, can you replicate this on Linux or Windows and see which is easier for users? Or you can just say here.
@zt9233 6 months ago
@univera1111 also benchmarks
@abhishekjha9041 6 months ago
@AZisk sir, please make a video on MacBook Pro specifications for machine learning. I'm so confused about what to buy: the 16-inch with 30 cores and 96GB RAM, the 16-inch with 40 cores and 64GB RAM, or an M3 Pro with 18 cores and 36GB RAM. Other people are as confused as I am, so please make a separate video on that. It's a request.
@abhishekjha9041 6 months ago
@AZisk And I have a question: I did some research and found out that MacBook Pros in Delaware have zero sales tax, which means that if I buy a MacBook Pro for 2500 dollars I don't have to pay any tax on it. Is that true, sir?
@anthonyzheng7274 6 months ago
You are awesome! This is great, I bought an M3 Max several days ago and I'm really having a great time playing around with LLMs.
@JohnSmith762A11B 4 months ago ⁺¹
Excellent. Many thanks for putting this together! 🥂
@juangarcia-wp2zr 6 months ago ⁺²
Very cool content, thanks. I feel very curious now to try out some of these LLMs.
@SebastianWerner82 5 months ago ⁺¹
Great to see you creating videos with this type of content as well.
@facepalmmute3619 6 months ago ⁺¹
the bass in your voice on the MBP speakers is phenomenal
@RadAlzyoud 6 months ago ⁺²
Brilliant. Thanks for sharing.
@suburbanflyer 5 months ago
Thanks for this Alex! Just got an M3 Max so it'll be great to try out some new things on it, this definitely looks interesting!
@bdarla 6 months ago
Super helpful! I hope you will continue with further relevant videos!
@joshgarzaBI 2 months ago ⁺¹
Awesome video here. I'm bummed I didn't do it sooner. I have never seen my M1 (16GB) freeze before. Great teaching here!
@nikolamar 6 months ago ⁺¹
Alex this is AWESOME!!! Thank you!
@ismatsamadov 5 months ago ⁺¹
I subscribed a few months ago, but I have never seen such quality content. Thanks, Alex! Keep going.
@AZisk 5 months ago
thx 🙏
@mr.w7803 5 months ago ⁺¹
Dang!! Dude, this video sold me on that M3 Max configuration… this is EXACTLY what I want to do on my machine
@estebanguillen8110 5 months ago
Great video, looking forward to the LLM fine-tuning video.
@tonbii 3 months ago ⁺¹
I bought an M1 Max with 64GB 3 years ago to do this kind of work. I am so happy to find this video.
@_mansoor 4 months ago
Awesome, Thank you.
Halo Alex!!!🎉🎉
@bawbee27 3 months ago ⁺³
Incredibly helpful - this is the video everyone with an Apple Silicon machine trying to do LLMs should see!
@DivineZeal 2 months ago
Great video! Thinking about getting the MBP M3 for LLMs
@scosee2u 5 months ago ⁺⁵
I really love your videos and how you explain these cutting edge concepts! Would you consider researching or interviewing someone to make a video about quantizing options and how they impact using LLMs for coding? Thanks again for all you do!
@AZisk 5 months ago ⁺²
Possibly!
@jorgeluengo9774 1 month ago ⁺¹
Thank You Alex, this is an amazing video. I will look into the software development tools installation.
@AZisk 14 days ago
Awesome! Thanks
@dennisBZC 20 days ago
Hey Alex,
I’ve been watching many of your videos, mostly for comedy - as I find you hilarious the way you explain things to a non-tech mortal, but occasionally, try to copy your instructions and try my luck to test out a few things for fun. I’m not one for cutting code, but I still watched the whole thing, just to get to the LM Studio to download a model to try out on my M3 Max. I tried the Phi3, thinking Microsoft might be better than the others.
I don’t have a clue what I’m doing, but it seems to work a little.
You are a LEGEND!
Keep up the great work. Love to see how you train your AI in due course.
I keep shouting at it to “sit”…my MacBook hasn’t moved, so I guess, it is quite obedient.
@jameshancock 6 months ago ⁺²
Nice! Thanks!
FYI when you change the preset you’re changing how it inputs into the LLM, which caused it to go nuts.
@gargarism 6 months ago ⁺¹²
I think the very first thing I will try out on my already ordered M3 Max will be to follow what you did. The whole reason I bought the M3 Max is to work with machine learning. So thanks a lot!
@AZisk 6 months ago ⁺¹
Good choice!
@zt9233 6 months ago ⁺¹
@AZisk is M3 Max as good as NVIDIA for this?
@pec8377 6 months ago ⁺²
@zt9233 No, it's not. Unless you want to run large models that won't fit into NVIDIA's cards, they will always beat the M3 GPU. Maybe not when the ANE is activated, but none of the tools presented here support Core ML.
@MikeBtraveling 6 months ago
@zt9233 If you are looking for a laptop to work with LLMs on, you can't really beat the Mac for models larger than 7B when you want them to run locally.
@camsand6109 6 months ago
Glad I subscribed. You've been on a roll lately (new subscriber).
@MikeBtraveling 6 months ago
Very interested in the topic and would love to see you do more in this space.
@SergeyZarin 6 months ago ⁺¹
Thanks, great video explaining it!
@AZisk 6 months ago
Glad it was helpful!
@LukeBarousse 6 months ago ⁺³
Interesting, I didn't know about LM Studio; that makes things A LOT cleaner
@geog8964 5 months ago
Thanks, Alex.
@devdeal4146 3 months ago ⁺¹
Just got the M3 Max with 48GB RAM. Excited to see how it works with your tutorial. Thanks!
@TimHulse 1 month ago
That's great, thanks!
@Mrloganphillips1 2 months ago
I had so much fun with this project. I just got an M3 Max and wanted a project to work on. After I got llama running, I made a bash script to run the command and trigger a second bash script that opens a browser window to the IP address after a 5s delay, to let the server get up and running first. Then I made a Shortcuts button to run it. Now I have an on-demand LLM with an easy-to-use on/off button.
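A minimal sketch of the kind of launcher this comment describes, merged into one script rather than two: start the server in the background, wait 5 seconds, then open the browser. The ./server flags are copied from the video description; `server_url`, `HOST`, `PORT`, `DELAY` and `DRY_RUN` are hypothetical names, and the macOS `open` command is assumed for launching the browser. This is not the commenter's actual script. By default it only prints what it would do.

```shell
#!/usr/bin/env bash
# Sketch: start the llama.cpp server, wait, then open it in the browser.
# HOST, PORT, DELAY, DRY_RUN and server_url are illustrative names.

HOST="${HOST:-localhost}"
PORT="${PORT:-8888}"      # same port as the ./server command in the description
DELAY="${DELAY:-5}"       # seconds to give the server to come up
DRY_RUN="${DRY_RUN:-1}"   # 1 = only describe the steps, 0 = run them
MODEL="models/openhermes-7b-v2.5/ggml-model-q4_k.gguf"

# Address the browser should open for a given host and port.
server_url() {
  printf 'http://%s:%s' "$1" "$2"
}

if [ "$DRY_RUN" = "1" ]; then
  echo "would start: ./server -m $MODEL --port $PORT -ngl 99"
  echo "would open: $(server_url "$HOST" "$PORT") after ${DELAY}s"
else
  # Run the server in the background so the script can continue.
  ./server -m "$MODEL" --port "$PORT" -ngl 99 &
  server_pid=$!
  sleep "$DELAY"
  open "$(server_url "$HOST" "$PORT")"   # macOS: opens the default browser
  wait "$server_pid"                     # keep the script alive with the server
fi
```

Stopping the script (Ctrl+C) also stops the server, which gives roughly the on/off behaviour described above.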
@aimademerich 2 months ago
Thank you for the GPU setting in LM Studio at 15:00!! Can you do more videos on proper GPU setup for LLMs on M1-3?
@ChitrakGupta 6 months ago
That was really good. I learnt something and it was fun to run on the new M3 Max
@JasonHorsnell 6 months ago ⁺⁵
Just got myself an M3 Max and found your videos. You’ve saved me SO MUCH TIME…..
Very much appreciated…..
@danieljohnmorris 2 months ago
How much ram?
@JasonHorsnell 2 months ago
36GB max base. More than enough for my purposes atm.
@TimHulse 1 month ago
Same here!
@theperfguy 6 months ago ⁺¹¹
I have to commend you for your effort.
I haven't seen any other reviewer showing any use case other than media consumption, synthetic benchmarks, and video encoding and editing.
You are perhaps the only YouTuber I know who tries out other things like code compile times and ML workloads, which is what is going to run on the majority of the high-end machines.
@AZisk 6 months ago
Glad it was helpful!
@eldee8704 3 months ago
Awesome tutorial! I bought the 14" MacBook Pro M3 Max base model for this to try out.. lol
@user-kj4ik3qm9d 6 months ago
Thank you so much for making this video, it was really helpful. Please do more of these coding videos and tests on the M3 MacBook, and push it to the limits. I think you are the best channel for this because you have the knowledge and the intention to do these things, and it will be a win-win situation for both of us.
@sujithkumar8261 5 months ago
Are you using the base variant of the M3 MacBook?
@MikeBtraveling 6 months ago ⁺⁴
I bought a maxed-out M3 Max to do this. Please run the larger models with ollama. When using LM Studio you need to make sure you are using the correct prompt template for the model; I think that was your issue.
@kman41000 6 months ago ⁺¹
Awesome video man!
@AZisk 6 months ago
Glad you enjoyed it
@kingmargie1182 5 months ago
Great job!
@astrohgamingZero 1 month ago
Looks good. I use text-generation-webui and the chat/chat-instruct modes or input presets can make or break some models.
@SimoneFolador 6 months ago
Thanks for the video, man! I loved it and it helped me a lot since I wanted to try some models on my machine. What's your experience with the fans on the M3 Max machine? I've read that they are pretty noisy and that it gets pretty hot as well. I still have an Intel machine (last generation) with 64GB RAM and a 2TB drive, but I wanted to buy a new M3 Max.
@XNaos 6 months ago ⁺¹
Finally, I waited for this
@juliana.2120 6 months ago ⁺¹
Ohh, I love that you use conda here because it really helps me keep my hard drive clean with all those different AIs :D I'm an absolute beginner so I'm afraid of installing stuff I can't find later on.
Some people say it's "outdated" and runs into errors too often, but I can't really judge that. Is that true?
@BenWann 1 month ago ⁺¹
I couldn’t agree more - I wanted to really sink my teeth into ML since it’s been a while - and I bought an MBP M3 Max after seeing your comparisons. Sorry I couldn’t use an affiliate code - Micro Center had a killer deal on it :(. I look for your videos to drop now, and look forward to what you come up with next.
@justisabelll 6 months ago ⁺⁴
Great video, really looking forward to the next few ML related ones. You might have had better results with LM Studio though if you disabled mlock after enabling Metal GPU. Also the model output looks nicer if you enable markdown in the settings as well.
@yinoussaadagolodjo4549 5 months ago
How do I disable mlock? Can't find it!
@joshbarron7406 6 months ago ⁺¹¹
I would love to see a token/second benchmark between M2 Max and M3 Max. Trying to decide if I should upgrade
@abhinav9058 4 months ago
Hey did you upgrade?
@tomdonaldson8140 6 months ago ⁺²
Love it! Looking forward to the training video(s). Now I want a Mac Studio M3 Ultra! Oh, no such thing yet? Come on Apple! We’re waiting!!!
@jigyansunanda 6 months ago
looking forward to your training models video
@AliHussain-jh3iq 1 month ago
Insightful video. Planning to get a MacBook Pro M3 Max for LLM work. Should I go for 1TB or 2TB, 14 or 16-core CPU, and 64GB or 128GB RAM? Thanks for your insight!
@mercadolibreventas 5 months ago
Keep it up! Good job! Can you do a video on getting Llama Factory set up on the M3? Thanks!
@keithdow8327 6 months ago ⁺²
Thanks!
@AZisk 6 months ago
🤩 thanks!
@amermoosa 6 months ago ⁺¹
amazing. just shrinking the whole second grade of engineering college in 17 minutes. incredible 😊
@stephenthumb2912 6 months ago ⁺¹
Thanks for testing. It's interesting that even with enough memory, there is still some slowness on the bigger model quants. My base M2 8GB can barely run the q4 7Bs.... I prefer ollama using the CLI, which will run at usable tps. It's sort of OK with LM Studio, but generally I need to run 3Bs or below with q4 quants. Orca-mini 3B is sort of the default test standard for me on 8GB Macs incl. the mac metal air. Can confirm, using the Mac Metal checkbox causes runaways. Textgen funnily runs fine with Mac Metal support as well.
@user-th8rb5gz3p 6 months ago ⁺¹
Alex, thanks.
@AZisk 6 months ago
You bet!
@pbdivyesh 6 months ago ⁺¹
You're a good lad, thank you!🎉😅
@JunYamog 6 months ago
Thanks for this content, more of a dev tilt, which is useful for me. I am contemplating getting an MBP after giving my old dev MBP to my niece. Based on your video, it seems it would be best to buy as much RAM as I can afford, and roughly 30B models would need 32GB of shared RAM. Possibly better than a PC, as that is limited by VRAM? Also, I wonder how practical it is compared to using a cheaper MBP and renting cloud GPUs for the occasional big models? Got more budget constraints.
@theoldknowledge6778 6 months ago
This LM Studio is Lit 🔥
@stanchan 5 months ago ⁺³
The performance of the M3 is amazing. Waiting for the refreshed Studio, as the M3 Ultra will be a beast. Hoping it will have the 256GB RAM as predicted.
@timelesscoding 4 months ago ⁺¹
Interesting stuff, I wish I could understand a little more. Thanks
@saitaro 6 months ago
Thanks for the video, Alex! How does M3 Max compare to M2 Max for ML?
@uninoma 4 months ago
cool thank you !!!!🤟
@Xilefx7 6 months ago ⁺¹
Can you test the LLM performance in low power mode? I believe Apple needs to optimize how they handle the thermals of the MacBook Pro with the M3 Max.
@marabgol 5 months ago
Thanks Alex! Great videos, watched 2 so far. Do you have videos, or plan to make videos, on how to fine-tune Llama2 models on Metal?
@AZisk 5 months ago ⁺¹
Not yet! but considering digging more into this area on the channel
@innocent7048 6 months ago ⁺¹
Very interesting article. I will try this :-)
@AZisk 6 months ago
🤩 thanks so much!
@chillymanny714 6 months ago ⁺¹
This is a great video. I think if you were to make videos teaching intro/intermediate data analysts how to build LLMs, or a series of videos creating different applications on Macs with M chips, it would be a big hit. I will try to replicate your approach
@syedanas2083 6 months ago
I look forward to that
@TheMetalMag 6 months ago
it's written MBP 2 on your terminal on your MBP3? that's a big job again. You're well into that dev stuff. well done
@AZisk 6 months ago
yep. i have to properly name my machines :)
@user-wg3rr9jh9h 1 month ago
Best LLM build video on YouTube ❤. I’m buying my 36GB MacBook Pro M3 Max with the 14-core CPU and 30-core GPU. Planning on launching a YouTube AI/ML channel soon 🧐.
@hamiltonwmr189 5 months ago ⁺¹
If you are going to do any intensive task on a MacBook, then keep it charged at 80% using AlDente. Don't run the models on battery, as churning through cycles will damage its health; keep it on the power adapter and at 80% charge. I did some intensive training on my M1 Pro and it went from 100% to 96% battery health in 1 year.
@CitAllHearItAll 3 months ago
4% loss in 1 year is normal. I'm at 2+ years on M1 Pro with 86% battery health. You're either trippin or trollin.
@salahidin 6 months ago
Yesss he did it!!!
@paulmiller591 2 months ago
Great video. Any chance you could revisit LM Studio now? Is it better at supporting M3? I am considering swapping out my old Intel MacBook Pro and I do Generative AI development work.
@justingarcia500 6 months ago ⁺¹
Hey, could you do a low power mode test on the M3 Max as you did with your M1 Max a while back?
@jakubjan44 5 months ago
good stuff!
@TarunKumar-yf8sj 6 months ago
Can you please tell how well this kind of LLM work would run on the 16-inch M3 Pro MacBook Pro?
@davidpsp89 6 months ago ⁺³
Super interesting and useful. I take this opportunity to ask about Matlab again and its real performance, since what Apple shows on its page is not real
@Renzsu 1 month ago
Hey Alex, can you please do a video on Stable Diffusion on Mac? I'm on the fence on getting one.. the shared memory is tempting, it allows for more than using a discrete GPU. But I wonder how the speed is..
@aalhaimi 2 months ago
Alex, thanks so much for this.. Quick Question: When running the batched-bench command, I noticed that there are no benchmarks printed under the corresponding table. The table is coming out empty. Everything else seems good.. Any idea why?
@abhinav23045 6 months ago ⁺¹
That fan noise is like feel the power of AGI.
@AZisk 6 months ago
😆
@juliana.2120 6 months ago
Have you used LocalAI yet, and would you recommend it if so? As far as I understand, it uses the same API format as GPT, so it works with a lot of already existing GPT tools
@anastassogoldschmied 6 months ago
Can you run LLMs like Llama or Mistral on the Apple Silicon NPU?
I think it was possible with stable-diffusion but that is a completely different thing.
@AmpiroMax 6 months ago ⁺¹
Please, can you compare the token generation speed of any llama-like model on M3, M3 Pro, and M3 Max?
@user-ob7fd8hv4t 5 months ago ⁺²
Is it the 96GB version of the M2 Max? I want to deploy my own 13B model locally (train the model with some relatively sensitive data), or even have it become my 'digital clone'. Do you think the 38c 96GB M2 Max is a suitable choice?
@radnaut 5 months ago
So very awesome 😎
@Stewz66 5 months ago ⁺²
If you had the M3 Max, 128GB/4TB, and you wanted to do data analysis and visualization in Python, which LLM would you use?
@redgenAI 4 months ago
@Alex Should the M3 Max with 128GB be able to run a 70B model? And what do you think is the largest model it could fine-tune with QLoRA?
@ericadar 5 months ago
I don't have a local machine with the right specs. What do you recommend for running studio LLM on a cloud instance?
@christopherr8441 5 months ago ⁺³
If only we could directly access and use the Apple Neural Engine for doing things like this. Imagine the speed and performance gains.
@muhammadyounis7090 29 days ago
Hi Alex,
Thanks for the great content. Currently I'm planning to buy a new MacBook M3 Max for AI work, and I'm hesitating between the M3 Max 14/30 with 96GB RAM and the 16/40 with 64GB RAM. They both have a similar price tag, so I'm not sure whether to go for the extra 10 GPU cores or the extra 32GB of RAM. Note that I'm not a pro user (yet), but I want to be able to run new models locally and train or fine-tune my own models with ease, and of course I plan to keep the machine for the next 5 years..
What should I choose?
Thank you!
@randomdude6205 5 months ago
Awesome video! Any comments on how it compares with some modern NVIDIA GPUs? Also, what about training times for small-ish models?
@tacorevenge87 4 months ago
Training models locally isn’t efficient even if you do it on a PC with the latest NVIDIA. Why not do it in the cloud?
@martinluker-brown6569 5 months ago
Is there any way to run a local model via a plugin for JetBrains or VS Code? JetBrains has an AI assistant that uses OpenAI, however from an IP standpoint that’s a no-no; using a local model would be acceptable.
@liquathrushbane2003 6 months ago
Following along using LM Studio on Windows, using the same model (7B), and it appears to be broken. Just typing in "hello" at the prompt and the response shows the AI writing out a question (presumably from the forum where the data was scraped) followed by the answer to that question. Completely ignores me. Tried the 13B and it seems better.
@ClckLabs 5 months ago
Hi, have you tried on the 48GB M3 Max base model?
@bern_stock8946 5 months ago
Can you please do a TF installation video for M3?
@Jorge-ls9po 2 months ago
Nice vid. Now, with the M3 Max, should I stick to 64 GB of unified RAM for these sorts of tasks? A jump to 128 GB will cost me a thousand bucks more. Cheers!
@DavidCampero26 4 months ago ⁺¹
Hi Alex! I would love to see a comparison between M3 Max 14/30 and M3 Max 16/40 with the same processes for LLMs. I read that many people are going with the base model M3 Max and I would like to see how much difference there is. If you know of someone who did it, please let me know!! I want to buy a laptop as soon as possible!! Thanks!!
