LLMs with 8GB / 16GB

  • Added 5 Sep 2024
  • Can a modern LLM like Llama 2 or Llama 3 run on older MacBooks like the MacBook Air M1, M2, and Intel Core i5? Sort of, and it depends on which model.
    #machinelearning #llm #softwaredevelopment

Comments • 157

  • @TechGameDev
    @TechGameDev 3 months ago +66

    10:05 I believe that, for machine learning, it uses VRAM. Intel Macs do not use unified memory and do not share RAM with the graphics card, as the graphics card has its own dedicated RAM called VRAM. In contrast, Apple Silicon shares RAM with the graphics card, so the memory available to the GPU is limited only by the system RAM.

    • @laloreta798
      @laloreta798 3 months ago +12

      I doubt a MacBook Air has a discrete GPU, so the memory is still shared, just not unified. Unified memory means that both the GPU and CPU can access the data stored in it at the same time, while a regular CPU/GPU configuration just divides the memory, e.g. 8GB is divided into 6GB for the CPU and 2GB for the GPU, and neither is aware of what the other has stored in its partition.

    • @AlixAxel
      @AlixAxel 3 months ago

      This 👆

    • @aptfx
      @aptfx 3 months ago +3

      @@laloreta798 Apple Silicon is a unified memory architecture

    • @ganderthepanda8146
      @ganderthepanda8146 3 months ago +2

      I was going to answer along these lines. The vram on the intel chip is probably 1-2gb.
      The models are designed to use the gpu.
      Hence system memory isn’t being used

    • @la6188
      @la6188 3 months ago

      This

  • @nommchompsky
    @nommchompsky 3 months ago +23

    Me and my 16gb M1 Air are thankful for this video

  • @QuantumCanvas07
    @QuantumCanvas07 3 months ago +15

    Kind of stuff I was searching for. Thanks Alex

  • @burprobrox9134
    @burprobrox9134 3 months ago +3

    I had a powermac in the 90s with 16 slots for ram, my last new Mac was the first gen Air, and a core duo mini. I was working for Apple at the time and got a crazy discount. I really miss the old days and Jobs was the best boss ever. I’ll never forget his goodbye email to employees, we were literally all tearing up. Feels like a different universe since then.

  • @SvenReinck
    @SvenReinck 3 months ago +8

    Q4 means the weights of the model are saved as 4 bits. The original is in FP16, which is floating-point numbers with 16 bits.

    • @jmunkki
      @jmunkki 3 months ago +4

      What I wonder is if Alex actually knows this but doesn't explain it because he thinks it's too technical, or if he doesn't know it and makes something up. A YouTuber making videos about AI should know, but I'm guessing he doesn't. He doesn't even explain the drawback of quantization.

    • @GreatForest-mh7sl
      @GreatForest-mh7sl 1 month ago +1

      @@jmunkki well, at least the vid saves us the time of testing mem usage on our own lol. and that's the most valuable thing about this kind of vid
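
As a back-of-the-envelope illustration of the 4-bit versus 16-bit point in this thread, the sketch below estimates the memory taken by the weights alone. The 8-billion parameter count is an assumed example, the function name is hypothetical, and real quantized files come out somewhat larger than the pure 4-bit figure because they also store scaling factors; context and runtime overhead are not counted either.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    bytes_total = n_params * bits_per_weight / 8
    return bytes_total / 2**30


# Assumed example: a model with 8 billion parameters.
n_params = 8e9

fp16_gib = weight_memory_gib(n_params, 16)  # original 16-bit floats
q4_gib = weight_memory_gib(n_params, 4)     # idealised 4-bit weights

print(f"FP16 weights:  {fp16_gib:.1f} GiB")
print(f"4-bit weights: {q4_gib:.1f} GiB")
```

On these assumptions the 16-bit weights alone would not fit in 8GB of RAM, while the 4-bit version would, which matches the general pattern discussed in the comments.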

  • @dmug
    @dmug 3 months ago +3

    I compared a M1 Mini vs a 2013 Mac Pro, and one of the tests I did was with Ollama. It was one of the very few tests that the Mac Pro 2013 had the clear advantage thanks to the 64 GB of ram

  • @0xCUBE
    @0xCUBE 3 months ago +7

    great videos! You should do a video comparing the various 7B-16B models

  • @halycano
    @halycano 3 months ago +7

    Since you have an older Mac, it would be interesting to see trying to do modern dev work on these older unsupported Macs. If you could do it on a Mac that used OCLP, that would probably be a more interesting video.

  • @propavangameryt405
    @propavangameryt405 2 months ago +2

    0:20 😅 i still own a MacBook Air 2015 core i5 model, it still works perfectly fine for regular browsing watching movies & stuff but 😂 I don’t have to keep it plugged omg

  • @ZaidAjani
    @ZaidAjani 3 months ago +5

    watching this on 2017 macbook air :)

  • @whoadog8725
    @whoadog8725 3 months ago +1

    I have a complete oddball m2 Mac Mini with 24gb of ram that I got as a refurb from Apple. I need to try some of the new models out.

  • @synen
    @synen 3 months ago +16

    For the price Apple charges for RAM upgrades you can get years of OpenAI API tokens, no need for localized LLMs.

    • @shapelessed
      @shapelessed 3 months ago +15

      The point of local LLMs is exactly what you'd think (or maybe not, since you didn't get it) - They are *local*.
      Many prefer wasting 300GB of their disk space to host a bunch of their own LLMs simply because they don't want to send off every single speck of data they can to who knows how many affiliates and data brokers.

    • @synen
      @synen 3 months ago

      @@shapelessed Another condescending post on the Internet.
      Don't assume please, you make an ass out of u and me.

    • @tacorevenge87
      @tacorevenge87 3 months ago +3

      Security and privacy

    • @andyH_England
      @andyH_England 3 months ago +3

      With cloud LLMs, you are sharing your life with who knows what. I do not recommend it, especially for anything personal, business or educational. On-device LLMs are the way to go. And if you need to use LLMs then the extra cost of RAM is moot as it will probably pay for itself in no time.

  • @miacodesswift
    @miacodesswift 3 months ago +4

    I’ll try on a mid 2020 macbook air with a 5700XT egpu

    • @TheStopwatchGod
      @TheStopwatchGod 3 months ago +1

      It won't work under macOS because Ollama only supports Apple Silicon GPUs

  • @RichWithTech
    @RichWithTech 3 months ago +4

    Can you do more amd / Apple arm/ snapdragon Comparisons pls

  • @aflury
    @aflury 3 months ago +3

    Quantized models are trained on less data? I thought they were just reduced precision representing the same training. Like turning up lossy compression, it gets pixelated.

    • @AZisk
      @AZisk 3 months ago +1

      yes exactly. reduced precision not less data

    • @3750gustavo
      @3750gustavo 3 months ago

      Quantization just makes the model a tiny bit less sure of the next token. The less sure the model already is of the next token on a topic, the greater the chance that heavy quantization will affect its performance on that topic.
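
To make the "reduced precision, not less data" point from this thread concrete, here is a minimal sketch of a 4-bit round trip in plain Python. It uses a single shared scale for all values, which is a deliberate simplification and not the scheme any particular model file format uses; the function names and sample weights are hypothetical.

```python
def quantize_4bit(weights):
    """Symmetric 4-bit quantization with one shared scale (illustrative only)."""
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale
    scale = max_abs / 7                            # signed codes in [-7, 7]
    codes = [max(-7, min(7, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map the integer codes back to approximate floating-point weights."""
    return [c * scale for c in codes]


# Hypothetical weights, made up for the example.
weights = [0.12, -0.53, 0.98, -0.07, 0.31, -0.88]

codes, scale = quantize_4bit(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

for w, c, r in zip(weights, codes, restored):
    print(f"{w:+.2f} -> code {c:+d} -> {r:+.3f}")
print(f"largest rounding error: {max_error:.3f}")
```

Every weight is still present after the round trip; each one has simply been snapped to the nearest of a small number of levels, which is the "pixelation" described above.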

  • @peterihimire
    @peterihimire 3 months ago +69

    Is 8gb RAM enough in 2024? Apple Yes, others No.

    • @TamasKiss-yk4st
      @TamasKiss-yk4st 3 months ago +3

      Apple even used only 4GB of RAM when Android had 16GB models (S20U/S21U..), and guess what, those 4GB models still got upgrades while the S20U with 16GB was flagged as not strong enough to get upgrades. Windows is the same: it is made for hundreds of different machines, while Apple uses its OS on far fewer models. Just because others can't make their OS better, don't demand the same from everyone else. Or do you also demand 4 doors on sports cars because 2 doors are not enough for you? Simply buy the version that fits you. (You don't need to buy their cheapest model. If you want a cheap model, there are cheaper ones from other manufacturers; if nobody buys their laptops, that is feedback from the customers.)

    • @peterihimire
      @peterihimire 3 months ago +14

      @@TamasKiss-yk4st Well, I agree and disagree at the same time. Their operating systems are optimized for their hardware. Good point.
      But Apple knew 4GB sucks in a modern smartphone; that’s why they didn’t continue with it and had to upgrade. Apps are no longer simple to develop, and they need all those resources to run efficiently.
      These generations of MacBooks will become e-waste a lot faster than the older generations. Those older generations lasted because they were upgradable; the same is true for all those upgradable Windows laptops.
      Most people, especially developers, buy MacBooks because without one it’s difficult to develop for Apple platforms.
      In my opinion, Apple isn’t doing a lot of magic. The reason things seem to be pretty good on their end is that they have fewer configurations and less hardware to work on, so it’s easier for them to optimize, while Windows and Android don’t enjoy that.

    • @kaustubhkumar797
      @kaustubhkumar797 3 months ago +10

      Not enough, even for Apple. Even with everyday lightweight multitasking, MacBooks start using swap memory, which shows that they are short on RAM.

    • @XashA12Musk
      @XashA12Musk 3 months ago +5

      The main question is Why Apple charges 200$ for 8 gigs of ram

    • @andyH_England
      @andyH_England 3 months ago +2

      Apple is knocking $200 off the 512GB/16GB MB Airs, which are cheaper than 95% of premium Intel Windows ultrabooks. So you just be sensible and ignore the 8GB option and realise that ATM the MB Air is a valuable alternative to Intel premium ultrabooks. However, I suspect Windows OEMs will start slashing the prices of their Core Ultra machines, with Apple undercutting them and the new kid on the block making waves.

  • @lalitsharma3137
    @lalitsharma3137 3 months ago +1

    Please do a Mac mini review when it gets upgraded.

  • @3750gustavo
    @3750gustavo 3 months ago +1

    Should have mentioned to avoid quants that don’t have a K_M, K_S or XS suffix; q4_0, for example, is worse and slower than q3_xs

    • @husanaaulia4717
      @husanaaulia4717 2 months ago

      What does that mean

    • @3750gustavo
      @3750gustavo 2 months ago

      @@husanaaulia4717 that’s only relevant for those downloading the GGUF version of the file. On Hugging Face, most models have a table explaining which quants are best for each size. Never download directly from the Meta website

  • @broccoloodle
    @broccoloodle 3 months ago +15

    Now we need to activate the special Apple spell to make 8GB become 16GB.

  • @disgustingdust1584
    @disgustingdust1584 3 months ago

    Just thought I would say, I'm running Ollama on my 2013 Mac Pro with 64GB of RAM and it runs fine.

  • @marsnotoshi
    @marsnotoshi 3 months ago +1

    Did you use AI to generate all of the sound fx (RAM going up then down) ? They're dope ! :D

  • @froggy5967
    @froggy5967 3 months ago +1

    Alex, could you please share your new keyboard and sound test?😅

  • @AntonioDellElceUK
    @AntonioDellElceUK 3 months ago

    You should do some comparisons with the 24GB Air... it is the only Air I would buy, and I believe many others who do real memory-intensive work (containers, etc.) would pick the 24GB version.

    • @MultiNakir
      @MultiNakir 1 month ago

      I can't recommend that anyone but basic users buy a fanless device that cooks the almost impossible to replace, serialized, soldered RAM and SSD ... also 60Hz is kinda bad for the price they ask for it ... I'd rather get a 14" M3 Pro with 18GB RAM if 36 is too pricey, and have a working fan in there, especially since the M3 lineup is pretty unhinged and gets hot fast

  • @philipgeorge301
    @philipgeorge301 3 months ago +1

    Well, it’s not macOS as much as it is the architecture of the chips. I have a 2019 MBP 16 that behaves exactly like your old MacBook. I’m no computer expert, so this is just theoretical at the end of the day

  • @TheFredFred33
    @TheFredFred33 3 months ago

    Nice 👍 no… sorry 😅 a very smart video, Alex! Very useful to show that an LLM is not the same thing as an ML algorithm. The RAM needs are very high and the workflow is very different too. When Tim Cook talks about AI, it’s mainly ML, as with Srouji, Millet or Ternus. I saw many questions about the role of the ANE in an LLM context. Apple hardware designers and Core ML devs are not really sharp on this subject. The ANE is better than the GPU for specific and « small » ML algorithms. Very useful but limited, from my point of view. No doubt GPUs are necessary for LLMs and can be optimized, as can memory management, as you said. This memory management and data prioritization concerns the SoC hardware implementation and the OS software too. Lots of work for Apple teams, but at this time only 1 LLM is provided in the Core ML library: a Google BERT.

  • @tls_9920
    @tls_9920 17 days ago

    This may be a silly question, but here goes. Would it be possible to load one of these models onto a high-speed SSD? I understand the model will still need to be loaded into RAM, so your RAM would still need to be enough to hold the model while running. But for “long-term” storage, or if you wanted to have multiple models available offline and had little device storage, would it be viable to store them this way?

  • @matthewstott3493
    @matthewstott3493 3 months ago

    WWDC is next week and we know Apple has been working on A.I., so there should be some very interesting new LLM features. The M4 SoC has an improved neural engine and CPU / GPU AI acceleration. It will be interesting to see how that shakes out between now and the end of 2025, when all Macs should be refreshed to come with an M4-based SoC. Curious about the comparison to the Qualcomm Snapdragon X Elite SoC, which seems to be copying many of the Apple Silicon capabilities.

    • @TechGameDev
      @TechGameDev 2 months ago

      The neural improvements of the M4, according to benchmarks, are somewhat weak compared to the M1 and M3, where there was nearly a doubling in performance.

  • @verim
    @verim 2 months ago

    The problem I see with quantisation of models is that llama3 8B Q4_0 is completely useless compared to llama3 8B without quantisation. llama3 8B Q4_0 completely fails to follow instructions in the prompt that llama3 8B executes without any problem. If we just want to talk to the model, Q4_0 is no problem; if we want to build some solution locally and need a high level of prompt understanding, we are left using Llama3 8B-instruct-fp16

  • @OmPatel1211
    @OmPatel1211 3 months ago +2

    You say M2 and using M3 MacBook Air!

  • @aadarshunniwilson8517
    @aadarshunniwilson8517 3 months ago +1

    If you have a MacBook Pro 2019, it has an AMD GPU. Could you test on it?

  • @vinayakbhosale7750
    @vinayakbhosale7750 3 months ago +1

    How do I know which model to pick that would work best for my use case? Is there a recommendation catalogue somewhere? Or have the community used them and shared their experience in terms of benchmarks somewhere which I can refer ?

  • @TheStallion1319
    @TheStallion1319 3 months ago +1

    How does this compare to a cloud solution? Would the experience of running and using them be the same?

  • @muffitytuffity5083
    @muffitytuffity5083 2 months ago

    4bit quantized llama 3 is pretty bad. Llama 3 doesn't quantize as well as lots of other open source models. It was trained for so long that the weights are really saturated and losing precision hurts.

  • @drill_fiend1097
    @drill_fiend1097 3 months ago

    I have a feeling MS's Phi could run well on Air laptops with low ram. What's your thought?

  • @PrPappia
    @PrPappia 3 months ago

    I installed an LLM on my Air M3 16GB; the Llama 7B ... it was enough but if I get a bigger one I don't think the Mac will like it.

    • @PrPappia
      @PrPappia 3 months ago

      Mainly because of the lack of fans for cooling, as RAM is fine, but the CPU heated up quite quickly 🥲

    • @andyH_England
      @andyH_England 3 months ago +1

      The general consensus is that 32 GB of RAM is the optimal amount of RAM to run an LLM as it will allow multitasking while the LLM stays in memory. So, there are better choices than the MB Air. If you can afford the M3 Pro or Max, wait for the M4 with the Gen2 3nm.

    • @PrPappia
      @PrPappia 3 months ago

      @@andyH_England It was mainly out of curiosity that I installed the LLM (and when I'm working in areas with no connections). But thank you very much for your message, it's really interesting!

  • @cryshot8071
    @cryshot8071 22 days ago

    It is concerning to see that Apple is offering 8GB RAM as base since 2017 and it hasn't changed at all.

  • @Hairyson-g5j
    @Hairyson-g5j 3 months ago +1

    I’m quite curious as to whether 64gb upgrade on Mac Studio is worth it, or I could spend the money on 512gb-1tb upgrade

    • @TechGameDev
      @TechGameDev 2 months ago

      Better to buy 64 GB of RAM and an external SSD enclosure with an NVMe SSD that you can upgrade. I would advise you to wait for the M4, because the M2 is really outdated.

    • @Hairyson-g5j
      @Hairyson-g5j 2 months ago

      @@TechGameDev yeah, it's really frustrating that an M3/M4 Mac Studio was not announced at WWDC this year

  • @manoharmeka999
    @manoharmeka999 3 months ago +1

    Question-2: How often do you see swap memory being used with 16GB, 32GB, etc.? Even with light work, would you see swap being used all the time?

    • @chri-k
      @chri-k 3 months ago +2

      You would see swap being used 100% of the time on all memory sizes, but that's just because macOS always uses/reserves some swap.

    • @manoharmeka999
      @manoharmeka999 3 months ago

      @@chri-k Then will that cause same amount of damage to the SSD in the long term meaning we can't leave the things on there without taking backup? Do you think the swap will kill the SSD in 10-15 years on regular usage?

    • @andyH_England
      @andyH_England 3 months ago

      @@manoharmeka999 Apple uses disk RAM for swap. That is usually 1-2GB of DDR4 per RAM module, so as Apple is using two modules at 256GB and 512GB, that equates to 2-4GB of swap being stored in disk RAM. The flash storage is therefore rarely touched, so the long-term effect on SSD life is negligible for average users. If you are a pro using RAM-intensive apps, you must buy hardwired RAM for your needs.

    • @zapomnij2126
      @zapomnij2126 3 months ago

      I have an M2 Air with 16GB and it rarely uses swap. Even when something is loaded into it, it is usually a one-off request and the OS simply doesn't remove it from swap.

    • @chri-k
      @chri-k 3 months ago

      @@manoharmeka999 SSD damage due to writes is not something the average person should think about.
      The main factor causing SSD damage isn't even writes, it's firstly the physical damage to the drive caused by random unwanted chemistry occurring over time, and secondly manufacturing defects which cause some select blocks, or in rare cases the entire drive to fail much faster than normal.
      And according to a couple of studies, including an internal one by Google, SSDs do not actually fail faster than hard drives do under use.
      ( but unlike HDDs, SSDs will degrade even when not actively in use, which is why people still use hard drives as external storage, eg for "archiving" movies )
      That said, the swap won't matter much, since that swap isn't being _actively_ used.

  • @kakaaika3302
    @kakaaika3302 3 months ago +1

    so which is better on LLMs between 200GB/s M2 Pro and 150GB/s M3 Pro?

    • @TechGameDev
      @TechGameDev 2 months ago +1

      The M2 Pro has more graphical power than the M3 Pro, which seems strange since the M3 Pro has fewer transistors. In any case, the M3 Pro will be more performant in 3D applications because it has several accelerations such as mesh shading and ray tracing. However, for LLM, it mainly requires pure graphical power, so the M2 Pro is better than the M3 Pro. Additionally, the M2 Pro has better bandwidth. I advise you to wait 4 months for the M4 Pro MacBook Pro, which will be much more powerful and catch up with the M3 Pro.
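
One rough way to reason about the bandwidth question in this thread: when generating text, a model has to read essentially all of its weights for each new token, so memory bandwidth divided by model size gives an approximate ceiling on tokens per second. The sketch below applies that rule of thumb to the 200GB/s and 150GB/s figures quoted above. The 4.5 GB model size is an assumed example (roughly an 8B model at about 4.5 bits per weight), the function name is hypothetical, and real throughput will be lower than this ceiling.

```python
def max_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Rough upper bound, assuming every weight is read once per generated token."""
    return bandwidth_gb_s / model_size_gb


# Assumed model size in GB; not a measured value.
model_size_gb = 4.5

# Bandwidth figures as quoted in the question above.
chips = {"M2 Pro": 200.0, "M3 Pro": 150.0}

for name, bandwidth in chips.items():
    ceiling = max_tokens_per_second(bandwidth, model_size_gb)
    print(f"{name}: at most ~{ceiling:.0f} tokens/s")
```

By this estimate the higher-bandwidth chip has the higher ceiling regardless of model size, since the ratio between the two stays at 200/150.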

  • @manoharmeka999
    @manoharmeka999 3 months ago

    Glad your 8GB one didn't blast

  • @pweddy1
    @pweddy1 3 months ago +1

    Could you do a comparison of AI on a PC vs this?

  • @christianfelix1869
    @christianfelix1869 2 months ago

    hi guys, it might be a little OOT here, but here goes nothing:
    so, I'm considering buying an M1 MacBook Air base model (8/256) to do some machine learning tasks (mostly CSV files; I don't do LLMs or anything that needs a lot of storage for datasets).
    If someday I need to free up storage space for much bigger and more complex datasets, how should I manage my storage efficiently?
    do you think using external hard drives would solve this issue? or is there a more efficient way to tackle this problem?
    thanks for answering!

  • @user-pp3dl8id7r
    @user-pp3dl8id7r 3 months ago +2

    Great content. Let's say an LLM is installed and running in the background and another AI program is running simultaneously (Opera browser or Gemini for example). What happens when they " bump" into one another at the same time? Do either of them work? Do they compete with each other for RAM, CPU, GPU and neural engine resources? What do you think? Has anyone figured this out?

    • @AZisk
      @AZisk 3 months ago

      thx!

    • @user-pp3dl8id7r
      @user-pp3dl8id7r 3 months ago +2

      @@AZisk any comment on my questions?

    • @lbgstzockt8493
      @lbgstzockt8493 3 months ago +1

      They do compete for resources, but they don’t collide, as that would be a massive security problem. As far as one program is concerned, the others don’t exist. It is the job of the scheduler to grant access to resources to different programs over time.

  • @GabrielThaArchAngel
    @GabrielThaArchAngel 3 months ago

    In your opinion what is the best model to use that will have the best results when returning/fixing JS, HTML, and CSS?

    • @AZisk
      @AZisk 3 months ago

      i still use github copilot for that, mostly

  • @ritammukherjee2385
    @ritammukherjee2385 3 months ago

    Works fine on my zephyrus g14 2021 16gb in silent mode on battery power... Similar speeds as m1 and m2.....
    x86 still has some hope 😂

    • @zapomnij2126
      @zapomnij2126 3 months ago

      Disconnect it from the power source and then you'll realize that it doesn't have any hope.

    • @ritammukherjee2385
      @ritammukherjee2385 3 months ago

      @@zapomnij2126 I have written battery power... Read again ..get some hope😃

    • @SimonVaIe
      @SimonVaIe 3 months ago

      @@zapomnij2126 maybe they added it in after your reply, but it says "in silent mode on battery power"

    • @zapomnij2126
      @zapomnij2126 3 months ago

      @@SimonVaIe oh i didn't see it 💀

    • @zeppelin0110
      @zeppelin0110 3 months ago

      @@zapomnij2126 Why would anyone do that? With x86/x64, it's a given that if you want the full performance, you have to be plugged in.

  • @dansanger5340
    @dansanger5340 3 months ago

    Apple's RAM policy is going to have to change if they want developers to continue to use Macs. Running LLMs takes a ton of memory, and there's a difference between spending a few hundred extra and a few thousand extra to get a Mac.

    • @andyH_England
      @andyH_England 3 months ago

      At least Apple offers 128GB RAM laptops, which are rare in Windows land. Most businesses using LLMs can afford Apple prices because that is how they make their living. Currently, average users download the LLM that fits their RAM configuration. However, Apple will announce its own LLM at WWDC, so things will change as it is better optimised and uses the neural engine.

    • @zapomnij2126
      @zapomnij2126 3 months ago

      They want apple kids to use Macs. Developers are the second class which they don't really care about.

    • @horsecrow6258
      @horsecrow6258 3 months ago

      They probably want people to use OpenELM models, much smaller

    • @JosepCrespoSantacreu
      @JosepCrespoSantacreu 3 months ago

      You will probably win the prize for the most stupid opinion of the year. Congrats…

    • @dansanger5340
      @dansanger5340 3 months ago

      @@JosepCrespoSantacreu Please elaborate.

  • @scottfranco1962
    @scottfranco1962 2 months ago

    These things need to be in RAM and not disk?

  • @whatever1538
    @whatever1538 3 months ago

    @AZisk Does locally running a LLM affect the lifespan of the RAM?

    • @TechGameDev
      @TechGameDev 2 months ago

      No the RAM is designed to last a very, very long time. In the next 15 years, you might start to notice some wear, but by then, you will likely have changed your computer.

  • @chetangupta2612
    @chetangupta2612 3 months ago

    How much storage is enough for this

  • @coldspring22
    @coldspring22 1 month ago

    Seems very odd you want to run LLM on non memory upgradable macs. My old dell 7480 has 64GB ram and cost just $150 including 64GB of ram. Memory is the king as far as running large memory intensive programs are concerned.

    • @AZisk
      @AZisk 1 month ago

      what’s your bandwidth though

  • @madeniran
    @madeniran 3 months ago

    I have 16GB MBP from 2013 (1.5GB VRAM Iris Pro & 2GB VRAM NVIDIA GT 750m), it can’t run any MacOS beyond Big Sur 😂

  • @RSV9
    @RSV9 3 months ago

    I am surprised by the low speed of your internet. I thought you would have at least 1Gbps.
    Good job

  • @aibi5532
    @aibi5532 3 months ago

    make a video for Windows

  • @The_s_d
    @The_s_d 3 months ago

    I hope that Apple AI gives me a better approach to Shortcuts. Shortcuts are way too complicated and too restricted. For example, me: „Hey Apple AI, create a shortcut for the camera with a 10 second slo-mo video in landscape, and I want to have access with the action button“
    Apple AI: „Ok“

  • @whatever1538
    @whatever1538 3 months ago

    Isn't that a M3 and not a M2?

  • @AKBARESFAHANI
    @AKBARESFAHANI 3 months ago

    Why not use Phi3?

  • @comrade_rahul_1
    @comrade_rahul_1 3 months ago

    Can't we transfer these onto the NPU (In Apple terms, Neural Engine) instead of GPU?

    • @TheFredFred33
      @TheFredFred33 3 months ago

      No, the NPU is suited to small machine learning algorithms, not a huge LLM. It is a small component of the SoC, optimized for specific AI tasks

    • @comrade_rahul_1
      @comrade_rahul_1 3 months ago +1

      @@TheFredFred33 NPUs aren't limited to ML tasks as far as I know. LLMs can be run on the NPU cores I would say. If that's not possible, 38 and 40 TFLOPS are actually of no use. They aren't any gimmick and as far as my understanding goes, they are totally capable of running at least medium-sized LLMs.

    • @TheFredFred33
      @TheFredFred33 3 months ago

      @@comrade_rahul_1 ok, but I’ve never seen a recent LLM used with Ollama or MLX activate the ANE 🤷🏻‍♂️

  • @vishwamartur
    @vishwamartur 3 months ago

    M1 with 8GB RAM hangs and runs llama3 8b slowly

  • @user-cf9ir4gw2c
    @user-cf9ir4gw2c 3 months ago

    X Elite laptop with 32 GB incoming 🙂

  • @MrSamPhoenix
    @MrSamPhoenix 3 months ago

    Moral of the story… get more RAM for your Mac.

  • @manoharmeka999
    @manoharmeka999 3 months ago

    Question: do you advise people to go for the M2 Pro on offer, or wait for the M4 Pro or Max? What does the new M4 series bring with respect to power and neural processing? How much of an improvement could we expect?

    • @chri-k
      @chri-k 3 months ago

      The M4 isn't that much better in general benchmarks than the previous ones ( although i haven't seen any AI-focused ones ), however i wouldn't trust the benchmarks until full release.
      It's likely not going to be much of an improvement since it's an extremely heavy redesign, (although that might be different for AI specifically)
      I don't see any reason to not wait for M4 if you do want to buy a new one though, since even if you don't buy it, M4 fully releasing will make the previous models cheaper.

    • @andyH_England
      @andyH_England 3 months ago

      Wait for the M4 models, as they are significantly better than the previous models. They are the first actual chip cycle upgrade since the M1.

  • @kyrsid
    @kyrsid 3 months ago

    RISC vs CISC

  • @jerickojamestano626
    @jerickojamestano626 3 months ago

    is that a keychron q1 max there? 0:30

    • @AZisk
      @AZisk 3 months ago

      yes. what do you think of it?

    • @jerickojamestano626
      @jerickojamestano626 3 months ago

      @@AZisk I love the keycaps. I will buy it later haha!

  • @jj-yb3no
    @jj-yb3no 2 months ago

    haha, obviously don't try fine tuning in any of these

  • @yorkan213swd6
    @yorkan213swd6 3 months ago

    Why not using the npu ?

    • @TheFredFred33
      @TheFredFred33 3 months ago

      Because the NPU doesn’t have enough resources to host all the neuron structures. It is better for self-contained machine learning algorithms, but it has some limits. Modern LLMs mainly need GPU power and a bunch of RAM!

    • @yorkan213swd6
      @yorkan213swd6 3 months ago

      @@TheFredFred33 Don't understand. In my Mac mini the NPU has also access to the RAM.

    • @TheFredFred33
      @TheFredFred33 3 months ago

      @@yorkan213swd6 The NPU is not an unlimited engine: there are SRAM memories associated with the Neural Engine cores and a limited number of compute units. An NPU is optimized to work with a range of AI algorithms, but it is a small IP block compared with the GPU IP block of an Apple chip. Devs do not address the NPU directly; they use Core ML, which routes to the right unit: CPU-AMX, ANE or GPU.

  • @g4vI7
    @g4vI7 Před 3 měsíci

    Download speed: 62 mb/s.
    Wow.😐

    • @AZisk
      @AZisk 3 months ago +1

      a lot of people must be downloading those models :)

  • @EsquireR
    @EsquireR 1 month ago

    1 upvote = 1 MB of extra RAM, 1 comment to download your RAM

  • @leomogiano27
    @leomogiano27 3 months ago +1

    First comment 🤯

  • @dinoscheidt
    @dinoscheidt 3 months ago

    Can’t wait for the noobs coming in when Apple offers 0GB of RAM. Me, as an ML engineer, I don’t care about RAM. I care about memory bandwidth to the cores. And Apple has done a ton of work to basically make the disk the RAM. The speed from disk to GPU is already insane. Stuff x86 machines simply can’t do. But PC noobs won’t understand that 😮‍💨

  • @bobbastian760
    @bobbastian760 3 months ago

    Yeah but we want the non woke uncensored truth based models…