Run a GOOD ChatGPT Alternative Locally! - LM Studio Overview

MattVidPro AI

zhlédnutí 27 614

Přidat do
- Můj playlist
- Přehrát později
Sdílet

Sdílet

Vložit

Velikost videa:

Zobrazit ovladače přehrávání

Automatické přehrávání

Přehrát

čas přidán 23. 05. 2024
LM Studio is a desktop application that allows users to run large language models (LLMs) locally on their computers without any technical expertise or coding required. It provides a user-friendly interface to discover, download, and interact with various pre-trained LLMs from open-source repositories like Hugging Face. With LM Studio, users can leverage the power of LLMs for tasks such as text generation, language translation, and question answering, all while keeping their data private and offline.
▼ Link(s) From Today’s Video:
LM Studio: lmstudio.ai/
Uncensored Models: huggingface.co/Orenguteng/Lla...
► MattVidPro Discord: / discord
► Follow Me on Twitter: / mattvidpro
► Buy me a Coffee! buymeacoffee.com/mattvidpro
-------------------------------------------------
▼ Extra Links of Interest:
AI LINKS MASTER LIST: www.futurepedia.io/
General AI Playlist: • General MattVidPro AI ...
AI I use to edit videos: www.descript.com/?lmref=nA4fDg
Instagram: mattvidpro
Tiktok: tiktok.com/@mattvidpro
Second Channel: / @matt_pie
Let's work together!
- For brand & sponsorship inquiries: tally.so/r/3xdz4E
- For all other business inquiries: mattvidpro@smoothmedia.co
Thanks for watching Matt Video Productions! I make all sorts of videos here on CZcams! Technology, Tutorials, and Reviews! Enjoy Your stay here, and subscribe!
All Suggestions, Thoughts And Comments Are Greatly Appreciated… Because I Actually Read Them.
Věda a technologie

Komentáře • 194

@MattVidPro Před měsícem ⁺¹⁵
A decent uncensored model for everyone to install into LM Studio: huggingface.co/Orenguteng/Llama-3-8B-Lexi-Uncensored-GGUF/tree/main
@LouisGedo Před měsícem ⁺²
I love local installed AI
@AmazingArends Před měsícem ⁺¹
I LOLed when you decided to turn it into ChaosGPT or SupremacyAGI 😂 !!!
@LouisGedo Před měsícem
@@AmazingArends
😘
@spadaacca Před měsícem ⁺³
Is there a big difference between the Q4, Q5, Q8 models? They're similarly sized, so not sure if it's worth getting a bigger one.
@bigglyguy8429 Před měsícem ⁺¹
@@spadaacca Yeah, the higher the number the better, but generally a Q4 is pretty good. Q3 really goes downhill. The larger the size the slower it will run on your machine, so you need to find the model and speed that suits you.
@LilBigHuge Před měsícem ⁺¹¹
Finally someone covering LM Studio! It's the very best out there.
@MazdaSpeedBee Před měsícem
for baby pcs
and people have been talking about lm man.
@amkire65 Před měsícem ⁺¹⁸
Totally agree with recommending Llama 3 8B Lexi Uncensored. I've used the System Prompt to give mine it's own personality, age, sense of humour, mood, etc. A bit of fun, but who wouldn't want an assistant that's tailor-made to suit them? Now, just need to figure out how to give LM Studio a voice, some one has done it, but I get errors when I try following along.
@bloxyman22 Před měsícem
Alltalk with koboldcpp is very easy to setup.
@SiCSpiT1 Před měsícem ⁺²
I make the standard Llama 3 take me to the "dark web" to launder money. It's just a roleplay but pretty funny.
@Pepius_Julius_Magnus_Maximu... Před měsícem ⁺⁴
Awesome tool, I had no idea this existed, thank you so much Matt
@MrPablosek Před měsícem ⁺²
This is so great. I rarely ever wanted to goof around with local LLMs because the oobabooga UI was honestly pretty horrible to understand and do anything with it. This one is simple and clean.
@Deljron777 Před měsícem ⁺⁴
Thank you Matt running AI locally is super important
@MichaelLaFrance1 Před měsícem ⁺²⁸
LM Studio + Ollama 7B is the way to go. Don't need a crazy hardware setup, and it's uncensored.
@MuktadirAlam Před měsícem
what setup are you using? tia
@PossumsDont69 Před měsícem
What is the practical implication of being uncensored?
@stickmanland Před měsícem
ollama 7b? Hmm...
@Fustercluck06 Před měsícem ⁺⁵
Amazing video man, thank you!
@spadaacca Před měsícem ⁺⁵
God, I love how unfiltered this local LLM is. It's not the smartest, but it's the most honest discussion I've ever been able to have with any LLM...or really any human for that matter!
@PopoRamos Před měsícem
nice, what topics did you discuss about?
@TPCDAZ Před měsícem ⁺¹
Been using LM Studio for awhile now. Great piece of kit especially since they have added the GPU offload option which now makes the LLM's wizz along.
@Streeknine Před měsícem
This is a great new setup. I had an old uncensored LLAMA local setup but it was very small and not very useful... but this one has multiple chats and works well. Thanks for the video and information.
@bobbykingAiworld Před měsícem ⁺³
Your videos bring fresh insights and kindle a flame of curiosity within me.🌟🎥🤔
@dalecorne3869 Před měsícem ⁺¹⁴
I tried making a bogus ad about a bogus Head Shop to use as a radio spot, and none of the gpts thought it was a thing to do. They all refused me. I just now installed the LM Studio and am running that Llama 3 llm, and it has already spit out 5 different styles of that ad for me. This is great. Thanks.
@user-bc2kc9hn1p Před měsícem ⁺¹
blaze 24 7
@dalecorne3869 Před měsícem
@@user-bc2kc9hn1p Me too
@rheymanda1074 Před měsícem
I just asked ChatGPT for a newspaper ad and it doesn't have an issue
---
**[Header: Bold and Eye-Catching]**
**Grand Opening of Edmonton Smoke & Research!**
---
**[Body Text]**
**Elevate Your Experience with the Best in Legal Highs!**
Edmonton, get ready to explore new heights with *Edmonton Smoke & Research*! We are your ultimate destination for premium glassware, unique rolling papers, top-tier accessories, and cutting-edge legal highs.
**Grand Opening Celebration**
Join us this Saturday for our grand opening bash! Enjoy exclusive discounts, live music, and a chance to win epic prizes. Don’t miss out on the latest and greatest in the world of heady innovation.
**Why Choose Edmonton Smoke & Research?**
- **Premium Glassware:** Handcrafted pieces to suit every style.
- **Unique Rolling Papers:** Add flair to your sessions.
- **Top-Tier Accessories:** Everything you need to enhance your experience.
- **State-of-the-Art Legal Highs:** Explore our wide range of research chemicals and legal highs, all compliant with the latest regulations. *(Not for human consumption, wink wink)*
**Knowledgeable and Friendly Staff**
Our team of experts is here to guide you through our extensive selection, ensuring you find exactly what you need.
**Location**
Visit us at 123 Edmonton Avenue, right in the heart of the city.
**Stay Connected**
Follow us on Instagram @EdmontonSmokeResearch for updates, special offers, and the latest news in legal highs.
**Edmonton Smoke & Research**
Where Quality Meets Innovation. Be there!
@religionisapoison2413 Před měsícem ⁺¹
The censorship is real. I never imagined it would get this out of hand. Adults get their adult tools censored more than young adult books. Wtf is going on
@sydroyce Před měsícem
Thanks so much for introducing me to this amazing AI assistant! I'm really excited to explore the possibilities. Your content is always inspiring and informative, and I appreciate how you share your knowledge with the community.
@nathanbanks2354 Před měsícem ⁺⁷
Note that you can change the system prompt using the OpenAI Playground or using the API (9:25). In this case, you'll have to pay per token, but $5 goes a long way with either GPT-4o or GPT-3.5.
@Earl_E_Burd Před měsícem ⁺¹
Great video thanks for the demo
@michaelandremovies Před měsícem ⁺²
You are the freaking bomb man!! This is insane!
@RealQuickComics Před měsícem ⁺¹
Great work thanks brother 👍
@timomustamaki5407 Před 7 dny
What I love about LM Studio is that it really is a hassle-free install. No need to download half dozen developer toolkits on your machine, pull random stuff from github and wonder why it still does not compile. Just download and install.
What I do not love is the performance. Or rather, I do not understand how it scales. I have tried three different GPUs on LM Studio (GTX1060 6GB, Tesla M40 24GB and P100 16Gb) and they all perform about the same (same hardware and software otherwise and 100% gpu offloading). On some models the 1060 is actually faster (tokens/sec) than the P100 which just does not make any sense.
Bottom line: it is an extremely easy way to running your own language models that costs nothing, highly recommended :)
@thegooddoctor6719 Před měsícem ⁺¹
Great content as usual !!!!
@MahsaShirazian Před 13 dny
simple and informative
thank you!
@AllenParks1 Před měsícem
Nice review , Ive been running Lm studio for a while. I like neural beagle and a couple of others.
@michaelmcwhirter Před měsícem ⁺¹
Thanks for another great video 🔥 Do you do all your own edits?
@24-7gpts Před měsícem
Fire 🔥🔥 video as usual!
@davidpurple3698 Před měsícem
Super, thanks a lot
@lpanebr Před měsícem
Thanks!
@juancarlosgonzalez8950 Před měsícem ⁺³
Wouldn't it be funny if we had just watched Matt doom the entire human race to an AI apocalypse at 8:53?
@Otis_Isaacs Před měsícem
Good video, keep it up
@64jcl Před měsícem
LM Studio is great. I use the server mode and can call it from my own AI agent software.
@brockly7916 Před měsícem ⁺²
GPT-4o voice and uncensored but locally... HOLY F**** imagine the possibilities.. also create him or her own voice or accent.
@davidoswald5749 Před měsícem ⁺²
I've been using LM Studio for a while, it's pretty great for accessing different models, as long as your system can handle different ones
@SiCSpiT1 Před měsícem
All you really need is 8GB of VRAM and a model that's under 6GB to fit the context window into memory.
@alewar01 Před měsícem
Check out Stability Matrix, by Lykos AI. Same concept, but for Diffusion models.
Cool video as always Matt.
@MilesBellas Před měsícem ⁺²
Offline = amazing!!!
@gabrielsandstedt Před měsícem ⁺¹
If you had set GPU offload setting to max layers (the one your left at 10) it would reply about 10 times faster if your GPU can fit the model on its VRAM.
@SonOfTamriel Před měsícem
If you install one of these on an SSD with space (ie. My E: drive), will it use your main C: drive for temp/cache? My C: drive isn't very big. Some software I have just defaults a temp folder to the OS drive and all of a sudden I have no space... I plan to build a new rig soon so that won't be an issue, but in the meantime
@MrDonCoyote Před měsícem ⁺³
Can this be used for image generation, models? Because then I could use the LLM to create the image and Stable Diffusion to draw it, similar to ChatGPT with Dall-E. That would be really nice.
@starblaiz1986 Před měsícem ⁺²
No, but it's honestly pretty straight forward to create a Python script to talk to the local LLM, get it to generate a more detailed prompt for Stable Diffusion, and then feed that detailed prompt to the Stable Diffusion API. Just make sure that you start the LMStudio server and the Stable Diffusion server on different ports and point the code to the API's accordingly.
@MrDonCoyote Před měsícem
@@starblaiz1986 Why would I need it to create a prompt? I already know the prompt. My point is Stable Diffusion can only generate images based on what it's been trained on. Thus the need for more detailed LLM instructions.
@robxsiq7744 Před měsícem
really wish they would offer things like: connect to SD so it can generate images (with SD up and running) the same way ChatGPT can pop in an image from Dall-E, and voice...and persona files with a bit of depth...basically copy ChatGPT a bit closer. Currently downloading/installing Ollama which has a closer function to CGPT...mostly because I want to run OpenRouter API through it...have a model beefier than what I can load, but less expensive than ChatGPT overall..
@RamonGuthrie Před měsícem ⁺⁷
Just wait till Matt finds out about Open WebUI his mind will be blown ....you need to do a video on that!
@zrakonthekrakon494 Před měsícem ⁺¹
I’ve never heard of it, blow my mind
@perschistence2651 Před měsícem ⁺²
I would say Llama 8B is definitely more intelligent than GPT 3.5 Turbo but GPT 3.5 is a bit more reliable and can speak more languages better.
@GES1985 Před měsícem ⁺¹
Can you train it further, like a Lora, by using E books?
@isajoha9962 Před měsícem ⁺¹
Does LM studio support the LLMs reading local files or eg describe images locally?
@SiCSpiT1 Před měsícem
I think you'll need coding knowledge to make that work. Anything LLM is an app that has build in RAG function but it's not very robust, I haven't played around with it enough but I'm not convinced it's useful for anything I need.
@isajoha9962 Před měsícem
@@SiCSpiT1 I used something similar (GPT4All) to LM studio a while back that had a diminished kind of way of reading files, but it totally went bananas when I updated it, so I deleted that app.
@Cylonick Před měsícem
How does it compare to Jan (another desktop application that runs LLMs)?
@aftsfm Před měsícem
What about Jan? It doesn't have a lot of features but its nitro engine is quite fast.
@Arc_Soma2639 Před 25 dny ⁺¹
Where is the download path of the models? like suppose I want to erase some models to make some space on my SSD.
@xaratemplate Před měsícem ⁺¹
Is their a LLM for generating images locally? Do you have a video tutorial on it?
@SiCSpiT1 Před měsícem ⁺¹
czcams.com/video/KTPLOqAMR0s/video.htmlsi=TaHZIcQpFs-maVmp
In my option, this is the easiest way to get Stable Diffusion installed and running on your home machine. I'd recommend using movie stars as a prompt templets for your subjects while you're getting use to how to prompt and what all the dials and knobs do. If you want to learn more he has a helpful playlist as well, including dad jokes. Have fun.
@nebuchadnezzar916 Před měsícem
I really want a vision capable model, let us know when one of those is available please.
@trelligan42 Před měsícem
My use case: House Mind. I basically want my own Jarvis. Multiple personalities so I can switch from Spanish tutor to Math tutor to computer file sorter/duplicate finder, and always have control of house lights, appliances etc. I'll wait a while for the holography suite😜 #FeedTheAlgorithm
@henkejohansson8585 Před měsícem ⁺⁵
What model is preferable to run on 48gb ram and a 4090?
@MattVidPro Před měsícem ⁺⁴
You should be able to do llama 70b fairly well
@joelface Před měsícem ⁺¹
@@MattVidPro I'd love if you were able to upgrade your PC to run Llama 70b. Something you'd consider?
@SiCSpiT1 Před měsícem ⁺²
Stick to the smaller models if you care about speed. Ignore models that are larger than the size of your VRAM.
@TransformXRED Před měsícem ⁺¹
Set "GPU offload" to max.
You'll see a BIG increase of speed ;)
I think you have a 3090 or 4090 right?
I always put it on max with my 3090, and it generates so much faster
@temp911Luke Před měsícem
Howdy, how many tokens/sec do you get when using Q4 or Q5 model ?
@SiCSpiT1 Před měsícem
Pro tip: max out the GPU slider on the right. To maximize speed you want to be able to fit the entire model into your VRAM, rule of thumb, the model should be 2GB smaller than the GPUs VRAM. The quants can be viewed as a compression technique, the smaller the number the lower the quality of the model. Generally, 4Q is a nice sweet spot for testing models and 5Q almost gives full quality outputs. This isn't always the case but following this rules you'll have a good time.
Bonus point if you can make the standard Llama 3 take you to the darkweb. Have fun.
@GES1985 Před měsícem
If you have a really good pc, can you give things more ram/vram/etc? Like, if comfyui needs 16, can I give it 24? If I have 128RAM can I give it some of that too?
@aleeez007 Před měsícem
Hi, can we generate copyright free images with this model?? Also can we change the system prompt to work as a writing assistant for blog post writing or any other writing task??
@FryadSaeed Před měsícem
Can you do a video on Coze?
@paulhill1662 Před měsícem ⁺²
❤ Can it be used to make AI agents to run a small etsy shop? ❤❤❤
@okolenmi7511 Před 15 dny
I'm running 34B model on my 4GB VRAM with speed of 3tokens per second. I'm using 3GB of VRAM out of 4 to avoid problems with other graphic software. I think, there is no problem to load something even bigger on good GPU.
@einlinguist Před měsícem
Seems that you like to play "The World ends with you" on DS ;-)
@esmaeilalkhazmi Před měsícem ⁺²
does LM Studio require internet to run the model?
@SiCSpiT1 Před měsícem ⁺¹
nope
@CDIGS-EI-hv3cf Před 27 dny
im realy confused by the roles... what is the difference between assistant and system? who is responding, when i write a message?
@GES1985 Před měsícem
Are the larger ones objectively better? Like 70b vs 8b
@joelface Před měsícem
That's what I want to know as well.. how much better is 70b compared to the 8b. I'm actually amazed that the 8b model is only like 5 gbs. I assumed it would be more like 30gb.
@JChaosMaster Před měsícem
Still wanting for more A.i based games. T.T
@temp911Luke Před měsícem
You forgot to set "GPU Offload" to MAX,
hence you get barely 9 tokens/sec.
On 4060 you will get between 30-42 tokens/sec (Q4/Q5)
@L_tlu Před měsícem
when i try to launch it, it just shows the logo in the taskbar, and when my mouse hovers over it, it disapears. what do i do?
@aleeez007 Před měsícem
And can we install this llama3 model on Google Collab?
@apache937 Před měsícem
For whatever reason LM Studio doesn't fully offload the models to your GPU by default. You have to increase the layers to offload to max yourself. It can be so much faster!!
@tichpo8411 Před měsícem
are you able to create images and surf the web?
@BlackMita Před měsícem ⁺¹
It just needs a pdf reader :D
@DihelsonMendonca Před měsícem
💥 What model can accept voice imput and output ? Text to speech, like Chatgpt Voice for Android ? 🎉❤
@MilesBellas Před měsícem
Ask it technical questions about Comfyui, does it answer them ok ?!
@SINYC02 Před měsícem
So I can generate copyrighted logos with one of these models?
Před měsícem ⁺¹
i used llm studio a lot, but i cant load a visual model as yet succesfully. this would be hugh. not generating images more recockniton. If you have it working. Would be a nice video.
@okolenmi7511 Před měsícem
You can run Stable Diffusion and some other types of models in ComfyUI. If you want to run only stable diffusion models you can use Automatic1111's UI - it have more "user friendly" web interface, but also it's not so optimized as ComfyUI.
Před měsícem
@@okolenmi7511 yes i do this as well. What i didnt't could ran was Image recogmition Like vllava
@brainwithani5693 Před měsícem
Greetings
@AlvinBrinson Před 21 dnem
System prompts sadly are broken because after a few pages it "forgets" the system prompt.
@3djimmy Před měsícem
Great stuff many thanks
@noahbalboa5714 Před měsícem
wondering if this app is blind usable.
@FSK2 Před měsícem
Can i run roop or face fusion
@cleverai2270 Před měsícem
I would like to integrate it if possible to my game Cursed Dungeon Raider so that you could chat with the NPCs at the Black Market or the Historical Museum. But probably not important quest relevant ones.
Moreover, an extra 5 GB RAM while running the game itself can be too much for some people's PC.
Nevertheless, I really like to test that out. Let's see if this is possible with that.
@justinwescott8125 Před měsícem
User: "...Oh my! Master Chief!"
AI: "It's me, Patrick."
Hmmm, not very impressive
@MattVidPro Před měsícem
better prompt would get the correct results - also for this example a completion tuned model would work much better (Not fine tuned for chat)
@sleeplesstortoise Před měsícem
Yo bro, suno 3.5 just dropped!
@PunjabiGhazal Před 27 dny
how can you run the model using 2 computers.
@tracyrose2749 Před měsícem
The license says they use your CPU power when not in use for CRYPTO.... check what you're signing up for
@ASENDOMUSIC Před měsícem
woah what?
@thanesbusiness5001 Před měsícem
gpt4all just crashes with llama, i'll give this a shot
@user-zw1yl2vd9y Před měsícem
Can this be used on a laptop?
@InnocentiusLacrimosa Před měsícem ⁺²
Yes. If it has good hardware.
@neon_Nomad Před měsícem
Mlc-ai is great
@Ben_D. Před měsícem ⁺¹
Zeh-'Fer if your are american, Zeh-fuh if you are a brit
@Earl_E_Burd Před měsícem
Yup, rhymes with heffer
@charliestephens4909 Před měsícem
If you knew how to code man imagine the possibilities brother
@user-yi2mo9km2s Před měsícem
It hasn't build in with search Docs and search engines.
@noxplayer-rt9tj Před měsícem
You have a powerful graphics card in your computer. When starting the model you made probably 1 mistake-you did not set GPU Offload to the maximum value. If someone has a weak graphics card and the model does not want to load-must turn this option off.
@MrEthanhines Před měsícem
2:22 you mean GPU VRAM not ordinary RAM right?
@MattVidPro Před měsícem ⁺¹
It can be put on both but vram is much faster
@okolenmi7511 Před měsícem ⁺¹
It's RAM. VRAM is another requirement. It depends on what are you using (GPU, CPU, Apple M chip, etc.). GPU is the most common case as it's fast enough, default CPU is the slowest option, but I'm not sure they implemented model work on CPU as this is not a good way to run LLMs.
@1Know1tHurts Před měsícem ⁺³
I tried to use these LLMs but they are light years behind Claude or ChatGPT.
@bluesailormercury Před měsícem ⁺¹
Bigger models like Mixtral 8x7B or Llama 3 70B require more resources (RAM or VRAM) but are not far from GPT 3.5. 8B models are indeed too small for that
@InnocentiusLacrimosa Před měsícem
@@bluesailormercuryindeed. 8B models are pretty disappointing for larger use cases. I need better hardware or more patience 😁
@jonsantos6056 Před měsícem ⁺¹
Is it private though? Is our data safe?
@IIlIIllII Před měsícem
Explain to me, what do these llms really offer, when I could just ctrl-f the dataset itself or even make a simple software filtering program for the dataset, and likely get less hallucinations and other benefits. What is really being offered with an llm over the dataset itself.
@okolenmi7511 Před měsícem ⁺³
Good luck to make a program that will filter several trillions of words to get a single short answer. LLMs have some sort of creativity - not so much but it's enough to generate something that doesn't exist in dataset.
@apache937 Před měsícem ⁺¹
go try that then
@avi7278 Před měsícem
Content crunch eh?
@IntelliMindA.I. Před měsícem ⁺¹
Funny !
@bolon667 Před měsícem
Tbh, Ollama is better, because it's fastest LLM backend out of the box.
@nowshinnur Před měsícem
clone...still gonna try
@UFOgamers Před měsícem
it's VRAM not RAM right?
@okolenmi7511 Před měsícem
RAM. VRAM settings in another place. You can set more VRAM to speed up your model. For example in this video were used 10GB VRAM to get that speed.
@UFOgamers Před měsícem
@@okolenmi7511 Thanks!
@kex0 Před měsícem
"I'm going to exit out of my browser as I no longer need that"
You weirdo
@hipjoeroflmto4764 Před měsícem
You finding that weird, is the weird thing. You must be a weirdo
@Ahm.elzain Před měsícem
How about for iPhone yall ?
@joelface Před měsícem
Honestly I think the next iPhone is going to be custom built to run a local model about on par with llama-3 that will be able to control all of your apps and understand all of your requests with ease.
@peterkonrad4364 Před měsícem
there are newer windows pcs that dont have avx2 support. for example mini desktop pcs and tablets. they do have 8 gb or 16 gb of ram and can run local ai models. you just need another program for that. i use ollama. it is very slow, but it works.
@the_stray_cat Před měsícem
fuck yeah it works blindly easily for whatever hehe

Další v pořadí

Automatické přehrávání

GPT-4o is WAY More Powerful than Open AI is Telling us...