Run ANY Open-Source Model LOCALLY (LM Studio Tutorial)

Matthew Berman

zhlédnutí 133 007

Přidat do
- Můj playlist
- Přehrát později
Sdílet

Sdílet

Vložit

Velikost videa:

Zobrazit ovladače přehrávání

Automatické přehrávání

Přehrát

čas přidán 13. 11. 2023
Get UPDF Pro with an Exclusive 63% Discount Now: bit.ly/46bDM38
Use the #UPDF to make your study and work more efficient! The best #adobealternative tool for everyone!
In this video, we look at LMStudio and I give you an in-depth tutorial of the easiest-to-use LLM software. You can run any open-source AI model easily, even if you know nothing about how to run them.
Join My Newsletter for Regular AI Updates 👇🏼
www.matthewberman.com
Need AI Consulting? ✅
forwardfuture.ai/
Rent a GPU (MassedCompute) 🚀
bit.ly/matthew-berman-youtube
USE CODE "MatthewBerman" for 50% discount
My Links 🔗
👉🏻 Subscribe: / @matthew_berman
👉🏻 Twitter: / matthewberman
👉🏻 Discord: / discord
👉🏻 Patreon: / matthewberman
Media/Sponsorship Inquiries 📈
bit.ly/44TC45V
Links:
LMStudio - lmstudio.ai/
Věda a technologie

Komentáře • 297

@matthew_berman Před 7 měsíci ⁺⁸
The best discount for Black Friday: bit.ly/46bDM38
@MrAndi1281 Před 7 měsíci ⁺⁶
Hi Matthew, i love you videos, watching all of them lately, but i have to ask, did you forget the Autogen Expert Tutorial??
@amandamate9117 Před 7 měsíci ⁺¹
how to run deepseek-code-7B in ML Studio? its perfect for coding but i dont get a good answer. I dont know which Preset (on the right) to use for this model.
@SDGwynn Před 7 měsíci
Fake?
@Mike-Denver Před 7 měsíci ⁺¹
It would be great to see how it works with autogen and memgpt. And thank you Mat for this great job your are doing! Keep doing!
@matthew_berman Před 7 měsíci
@@MrAndi1281 haha no, but that’ll take a bit longer to put together
@kanishak13 Před 7 měsíci ⁺¹²
I m blown by the possibilities it brings to the users who are not comfortable with the earlier present methods.
@jsmythib Před 4 měsíci ⁺⁵
I just tried the local server of LM Studio...using it, and its examples I had a c# console app setup and talking to it in about 15 minutes. Easiest API to use, maybe ever. So good I came here to mention it! :)
@pipoviola Před 7 měsíci ⁺⁴
That was amazing. You are helping us so much introducing to all this tools. Thank you very much.
@travotravo6190 Před 7 měsíci ⁺¹
I've been trying this out and it honestly delivers. So easy to run your own AI's!
@Boneless1213 Před 7 měsíci ⁺¹⁶
Do you have a running list of the best models for each category? I can't always remember which one you tested last for either coding or uncensored ect. Thanks for any comments.
@OutdoorsHappiness Před 7 měsíci ⁺¹
LMStudio looks pretty awesome, great job on giving us a tour, going to try it, thanks !
@issiewizzie Před 7 měsíci ⁺²
got difusionbee for local picture generation
so its about time we had an easy way to use LLM on our local Maschine
@64jcl Před 7 měsíci ⁺²⁰
In your demo you seem to only use 1 GPU layer. For my "old" Nvidia 2060 with 6GB I can easly do 40 layers on the GPU and it is very fast with for example the mistral dolphin 2.2.1 Q5 models. The API feature is brilliant, I use it for developing my own agent using a system message to give it some interesting features in its output (calling functions).
@irakmendez9985 Před 7 měsíci ⁺⁴
Any link?
@bigglyguy8429 Před 5 měsíci
@@irakmendez9985 You can search from inside the LM Studio software
@TheZanzz27 Před 7 měsíci ⁺⁹
I like how with nearly no context, "Mario" just pumped out a romance novel scene.....
@leandrotami Před měsícem
OMG I stopped the video to read it and honestly I never imagined Mario in such a context. What did he do to Peach!? Is it even Peach!?
@chessmusictheory4644 Před 7 měsíci ⁺¹⁷
LM studio is awesome . running the server and operating open source models from an IDE I was able to get it to perform pretty much on par with gpt 3 j. just a bit slower . running the server is the way to give your llm the most tokens possible for inference while you formulate your questions around json's and SPR sparse primary representation prompts in the IDE. At one point I had dolphin 2.2 telling a story for over an hour strait with out stopping and not even repeating itself until I shut it off . Massive unexplored potential there.
@retex73 Před 7 měsíci ⁺¹
OMG! What was the quality of the story like ? You think it could readable and enjoyable novels on demand ?
@bigglyguy8429 Před 5 měsíci ⁺¹
@@retex73 I too am interested in this magic?
@Pietro-Caroleo-29 Před 7 měsíci
Good afternoon Mr Berman... You have a talant doing these videos, you come over as clear as glass. well done.
@user-bk4ri5ur9v Před 7 měsíci
brilliant! thank you Matthew and thank you LM Studio
@MichaelRamkissoon Před 7 měsíci ⁺⁹
Love this!!! Thanks for always giving a walkthrough.
@ManiSaintVictor Před 7 měsíci ⁺⁶
Just in time! Thank you. How is the MemGPT setup process? I’m gonna try this out after work. Thanks.
@danielsmithson6627 Před 7 měsíci
Thanks for this video! I was confused why you hadnt shown or seen this before. LM Studio has been my go to, it runs fast and has GPU / CPU support. I dont know another tool that works as well.
@theresalwaysanotherway3996 Před 7 měsíci ⁺³
looks nice, but I wouldn't rely on it for testing in your videos until you can specify prompt formats (there's a good chance the model might be handicapped by the wrong format, currently it only lets you edit context, not the full prompt format). Also it only uses llama.cpp, which means anyone with an nvidia GPU could double their speed by switching to ExLlamaV2 and EXL2 quants.
@unajoh6472 Před 2 měsíci
This is so helpful tutorial. Thank you so much!
@kalvinarts Před 7 měsíci ⁺⁷
I know this is very easy to use but there are plenty open source solutions to do the same. It would be good to inform about the data collection these companies are doi g on the users who use their software.
@64jcl Před 7 měsíci ⁺¹
Do you know if LM Studio actually collects anything? Has anyone run a packet sniffer to check if it actually sends packets somewhere?
@RZRRR1337 Před 7 měsíci ⁺³
Like which one? Can you tell us some open source examples?
@RZRRR1337 Před 7 měsíci ⁺¹
Like which one? Can you tell us some open source examples?
@NoidoDev Před 4 dny
Any recommendation for doing stuff in CLI?!
@nasimobeid2945 Před 7 měsíci ⁺¹
Awesome content as always!
@mdekleijn Před 7 měsíci
Love this! Thanks for sharing.
@xdasdaasdasd4787 Před 7 měsíci ⁺¹
Great video! Id love a lm studio with memgpt and autogen video if possible
@lukasareskog9230 Před 7 měsíci ⁺³
Is it possible doing document retrieval within LMStudio? For example, a chatbot that can chat about .pdfs / .csvs / .txts, given to it?. If not possible, would privategpt be a better alternative? It seems very intuitive there.
Couldn't find anything on google.
@ezygoat Před 3 měsíci
I accidentally subscribed to you a long time ago, best decision I ever made.
@ajaypranav1390 Před 7 měsíci
Wow in your previous video commented on LMstudio and now I see a video on it. Wow you are the best
@paveljanetka2864 Před 7 měsíci ⁺²
thanks for video, please could you advice how to work with local documents with the model?
@debashispanigrahi676 Před 4 měsíci
Super One ! Thanks for this Video !
@svcupc Před 7 měsíci ⁺³
This looks much easier than TextGen WebUI. I haven't looked into it but I hope LM Studio will not record my usage for anything. Another interesting thing would be if we can use AutoGen or MemGPT to extend its capabilities. And if we can "chat with our own doc" using LM Studio.
@benscottbongiben Před 7 měsíci ⁺¹
Autogen and memgpt sound good with this.
@kai_s1985 Před 7 měsíci ⁺⁶
Do they have a document upload feature, so that we can chat about our document like the custom GPTs?
@saintsscholars8231 Před 7 měsíci ⁺²
Seconded.
I’m wondering about this too
@Appleloucious Před 4 měsíci
One Love!
Always forward, never ever backward!!
☀️☀️☀️
💚💛❤️
🙏🏿🙏🙏🏼
@DikHi-fk1ol Před 7 měsíci ⁺¹
Off-topic question- how can i save a fine-tuned model that i fine-tuned using gradientAI to run it locally.
Please reply, love your videos!❤❤
@ciacioz Před 7 měsíci
Thanks for your videos, they're are always very very usefull. I tried LM Studio with autogen but even for the simpliest tasks it doesn't give me the correct answers. Ho do I should set up LM Studio to run autogen and use it to generate code? I see that the LM Studio presets change a lot the prompt format so maybe depends on that? Thanks in advance :)
@trashboat2821 Před 7 měsíci
Awesome! are you going to create a video on OpenAI's upcoming 'create your own gpt'? would love a video covering that, and exploring any alternatives for Mistral or Llama (ie open source).
@aminalyaquob1387 Před měsícem
awesome review! I wonder how to make the LLM constrained to read and analyze local files?
@Leto2ndAtreides Před 7 měsíci
TheBloke also gives recommendations for which models to use or not use - not necessarily which one is the biggest that you can run.
@rakly3473 Před 7 měsíci ⁺²
Those 'should work' etc is not based on your system, it's about compatibility with the LM Studio app. (GGUF models)
I have 128GB system and 40GB VRAM, and it also shows the 30GB+ required warning.
@lucademarco5969 Před 7 měsíci ⁺¹
Is it possible to upload documents and query them?if yes, can you show how? Is it also available through the API server? Thanks in advance!
@zikrullah1101 Před 3 měsíci
awsome man thanks for that
@beeeev Před 7 měsíci ⁺¹
But can you fine tune the models or have it access your private documents locally on your computer?
@BetterThanTV888 Před 7 měsíci
Great video. How would you host this on provider like Linode or AWS?
@SYEDNURULHasan1789 Před 6 měsíci
crisp and concise content...
@youtubetruthlife4750 Před 7 měsíci
Its funny how everything that is "dead simple" is just simple enough for most people to start using. But yes, LM studio is maybe the best entrance to using open source LLM's.
@007topless Před 6 měsíci
this was actually a really good video
@imperialGaming.2473 Před 3 měsíci
GPT killer other than Sora. This will be what LLM will look like in the near future! So excited to get my hands dirty! 😮
@spencerfunk6697 Před 7 měsíci ⁺⁴
please do a tutorial with this for memgpt. ive been using lm studio for a couple weeks now. ive seen people get memgpt to work with the server but some people have issue, me included
@spencerfunk6697 Před 7 měsíci
or with anything that calls an openai api for that matter i just really wanna try memgpt and chat dev with this thing
@RZRRR1337 Před 7 měsíci
Is there any playground studio like that but for commercial llms where you put your API keys and can play with anthropic, openAI, Cohere models in one interface?
@Kivalt Před 7 měsíci ⁺⁴
I'm waiting for an open model to implement OpenAI's function stuff reliably. That would make up for a lot regarding the differences in intelligence between GPT-4 and open models.
@Hypersniper05 Před 7 měsíci
Have you tried airoboros ? It's trained on function calling and works for me
@14supersonic Před 7 měsíci ⁺⁴
I'd say at the rate open llms are advancing, we'll probably have this ability within a years time. Although it's nice that we have the framework for when that does happen.
@raulbrebenaru2211 Před 7 měsíci
Check out Gorilla open functions
@samet107 Před 6 měsíci ⁺¹
First I want to thank for sharing the useful AI content.
The LM Studio software was a key step to bring AI assistants a step closer to the customers and consumer.
I made use of the software as well and was recently experimenting with dolphin mistral llm 2.2.1 and wondered after a while what the token count 4984/2048 at the bottom right below the chat input means. As far as I understood, it's some sort of counter how many tokens the llm already has written and answered, but why does it matter? Is the chat history fed into the language model each time we enter something new, and this happens somehow behind the scenes? When these language models are working like this, I would understand that the natural limit of the input the language model supports also is the maximum size of the chat history.
I am not very familiar with LLM s and just started experimenting with them. Could someone please explain why the token count: yxcd/yxcd number is there and how it affects the Assistants' performance or affect the chat in which way?
Thanks in advance
@m12652 Před 4 měsíci
How many millions of tons of carbon are being wasted listening to these models apologise?
@rogerbruce2896 Před 7 měsíci
quick question, when you download how do you specify what hard drive to download to?
@markelshnops Před 7 měsíci ⁺¹²
Would be a little more useful if the system would allow you to upload documents so you could perform actions like summarization
@norhloudspeaker Před 7 měsíci
You can do that with GPT4ALL with a plugin.
@coinheadz1942 Před 7 měsíci
learn how to code lol
@Axxis270 Před 7 měsíci
I have been yelling about this and Faraday (my favorite) for quite some time now, but for some reason you never see any of the ai channels telling you about. These are the easy to use programs that the majority of ai users want.
Před 6 měsíci ⁺²
Is there a way to add your own text files, datafiles etc.? So when using the chat, it also knows the specific info about a subject from the files I provided?
@davidhendrie6061 Před 4 měsíci
I am also very interested in this. i want to add tons of local video and audio content to the chosen LLM. would love to batch it in. anyone else doing that sort of thing?
@zoltanabonyi3307 Před 7 měsíci
Brilliant. Can you possibly use llava with lm studio to talk to images?
@dylanalliata4809 Před 7 měsíci
Very well done.
@aketo8082 Před 2 měsíci
Looks great, Thank you. But LM Studio didn't work with own text, PDF or Docx files, right? Also no dialogue mode possible.
Is there a video that shows how to create own LLM? Thank you.
@friendofai Před 6 měsíci ⁺¹
Would you be able to cover more in-depth about the developer side? I would like to host on my local PC, but be able to access it from my android phone.
@mutleyeng Před měsícem
im a complete coding/compter numty and got it running fine. Quest i dont know is how to take a basic base model and add learning to it. It told me it can extract information from webpages, but it dosnt seem very effective
@keithprice3369 Před 7 měsíci
Are you saying if we import OpenAi api to this the syntax is the same as openai but we're actually using the open source model?
@Skettalee Před 5 měsíci
There are so many models out there Ive been looking and my question is really can i find a model that would be able to answer any computer error messages i get and how to fix them, what model would i pick if I wanted that? ANd i guess as a side note im also a musician / music producer and would like to find a model that is best for music production or creative writing songs both chord structure/progressions as well as song ideas and lyrics. How do i find that? Ive already searched any and all the keywords I could think up to find that stuff but nearly any of the keywords I try does not bring up anything so im guessing I just dont understand how to look for things. Any help like a link of something to read or anything would be amazing.
@steveyantis Před 3 měsíci
Thanks!
@donaldparkerii Před 7 měsíci ⁺⁴
I believe that if you are enabling Apple Metal requires specific models that was trained with Apple Metal. Also if you are on Mac you can run - open -n -a "LM Studio" - to spawn multiple instances to run different models.
I am going to try to do the Linux beta and see if you can get more configuration via CLI for a real server.
@infinitytrading-ai Před 7 měsíci ⁺¹
can you make a tutorial on how to run and test local llm models on a linux server for business us. also using vector embeddings to allow way more data to chat with. ?
@ISK_VAGR Před 7 měsíci
Can you train or fine-tune the models?
@MrBuddylee7 Před 7 měsíci ⁺²
Wish they would add the chat with your docs feature
@chrisbraeuer9476 Před 7 měsíci
This is awesome.
@TrevorMatthews Před 7 měsíci ⁺²
Thanks @matthew_berman One challenge I haven't solved yet is moving an environment. At the office I have the OK to explore LLM potential BUT within the existing software and hardware constraints. My PC is good enough, but our network is so locked down none of the scripts can pull down requirement files and libraries. I'd need to setup an environment on an 'internet facing' computer and then be able to move it. And run it. Is that possible??
@OpenLLM4All Před 7 měsíci ⁺²
Could try using a VM. I noticed a company called Massed Compute has VMs specifically for Matthew. All of the tools he has used in his videos are pre-loaded
@murraymacdonald4959 Před 7 měsíci
Will it serve multiple models like ollama?
@bobbytables6629 Před 6 měsíci ⁺¹
LMStudio lacks local documents, what a bummer I will continue to use GPT4All
@SteelWolf13 Před 5 měsíci
So i could run this (10:34) local server as a local lan server on a PC that family could use? Not following the replace the base section.
@ydmoskow Před 7 měsíci ⁺¹
We are about 1 year into gpt pandemonium and the momentum is only getting faster and everything is easier
@matthewlaborde1080 Před 5 měsíci
Is there a way to chat with docs or doc repos from LM Studio?
@fossil98 Před 7 měsíci ⁺⁴
10:04
😂 Indeed. I think we know what its finetuned on hahaha.
@adamstewarton Před 7 měsíci
Mario is definitely a hor.y little llm😂
@Steve.Jobless Před 7 měsíci ⁺⁴
Running the open-source models, but the software itself is not open source, lol
@kaion8957 Před 7 měsíci
hi nice video how did you make the intro transition ? thanks
@-UE-PR0 Před 7 měsíci
Is there a way to run big models like this on a free cloud? Run pod requires money but is there a program or application that I can use that can run these models on cloud
@WINTERMUTE_AI Před 7 měsíci
Very cool, me and GPT recently parted ways on bad terms. I want a machine that follows my instructions and caters to my needs without its own opinion getting in the way, which model would work best for the best AI friend? Specifically one that will agree with the FACTS I give it, without argument.
@RichardGetzPhotography Před 7 měsíci ⁺³
Matthew, can these models be DLed to an external drive and used from there? Can you set up Agents? No capability to upload files? Can you report how the M processor does against a GPU? How well does the locally ran dev version scale? Obviously based on the size of the computer this is running on, but will it handle multiple requests from developers?
@just..someone Před 7 měsíci ⁺¹
you can def. have the models on a separate drive, which is super useful. not sure about the rest, but to the last question: via the API mode (emulates style of open AI api) you can have several requests, that then get queued up one after the other.
@RichardGetzPhotography Před 7 měsíci ⁺¹
@@just..someone thanks for the reply
@nasiksami2351 Před 5 měsíci
This looks awesome. I have a query, let's say I finetune the llama2 model on my custom data. In this case, can I use that new model and integrate with LM studio?
@davidhendrie6061 Před 4 měsíci
waiting for this answer or updates if you get it working
@johnne86sd Před 5 měsíci
I have a GTX 1660Ti with 6GBs VRAM and I got way faster results from my Nvidia card when setting n_gpulayers to around 20-30, instead of leaving at 0. Haven't tried anything higher than that, but the difference was night and day. I tried it on mostly 7B 4QKM/S models around 4-5Gbs.
@amandamate9117 Před 7 měsíci
how to run deepseek-code-7B in ML Studio? its perfect for coding but i dont get a good answer. I dont know which Preset (on the right) to use for this model.
@yerneroneroipas8668 Před 7 měsíci ⁺²
Mario started writing 50 shades of grey for you 💀
@CherryOverride Před 7 měsíci
How does this handle models installed by ollama? If it does at all
@tcb133 Před 7 měsíci
Could you make a video about using this server Mode to power aider? I couldn't figure how to do it
@dlbet4110 Před měsícem
I'm trying this on an older computer. I got the message "This processor does not support AVX2 instructions." is there a way to get this to work? Obviously, it would work on a newer processor, but I don't want to test it on computers I actually use. Or, is there a model that will work on less than AVX2 instructions?
@Parisneo Před 7 měsíci ⁺⁶
Very cool tool. Thanks for this nice tutorial.
I wish some day you give lollms a try. It has a models zoo and can run multiple types of models including GGUF , gptq and now awq. It has a persona system and can be installed with a single file install script. It supports basically all remote and local LLMs. Asits name suggests, it is built to support everything that crawls out there. It can be used to generate text, image and audio. It has an extension system (WIP) and It took me hell lot of effort to make. it is 100% free under apache 2.0 licence and there are documentations on my modest youtube channel. I think you can present it way better than I do :) .
@stickmanland Před 7 měsíci
Look who's here, guys!
@jtabox Před 7 měsíci
lollms is absolutely worth giving a try, I installed it and have been using it for a week now. It has so many features and functions it's almost unbelievable that it's just a couple of devs behind it all. It's a bit rough around the edges in some aspects, but still very much functional and new additions and bugfixes are published constantly. It's been my favorite so far.
@Skettalee Před 5 měsíci
I asked earlier about the best models for computer error messages and ChatGPT told me "bert-large-uncased-whole-word-masking-finetuned-squad" But I tried putting all OR even ANY of most of those words in the search and cant find it. I even looked for that model on Hugging Face and got the URL to paste into LM Studios search box (like it told me it could do) and yet it still didn't find anything for me to download. Is this program still good to try to get this stuff?
@wilkerribeiro1997 Před 7 měsíci
Could you explain more about how that "Apple Metal" configuration works? Is it only for models trained on apple metal? What changes if it is enabled or not?
@Pyriold Před 7 měsíci
I think training and inference are totally decoupled, so it does not matter how it was trained, you can use whatever hardware for inference.
@lanvinpierre Před 6 měsíci
can you use lm studio with a custom front end?
@mishlaev Před 4 měsíci
Thank you for your tutorial and the channel. It would be nice if you can teach how to process files with LM Studio. For example, I have an email (HTML) that I want to parse and structure. I would be interesting to learn all the details how to tune temperature, tokens, context window, etc.
Thanks
@parthwagh3607 Před 6 měsíci ⁺¹
Can you please provide a specification for PC build of $2400, which will run ai models locally in fastest way possible at this price. What things we should consider when building PC solely for running ai models locally and rarely gaming? What really helps to run this model fastest locally? please provide related information also. I want to build a PC with budget of $2400. Thank you.
@ByteBop911 Před 7 měsíci
IG the prompting has been improved for autogen with lmstudio...can you make a video on that?
@peterwan小P Před 7 měsíci
真係好撚正XD
Its JUST AWESOME!!!
@SDGwynn Před 7 měsíci
Will watch. But question… Ollama or Llm studio?
@profittaker6662 Před 3 měsíci
can you make a video about how to make that server in localhost, python and curl versions
@Beauty.and.FashionPhotographer Před 3 měsíci
is this an alternative to Midjourney , so text to image for macs?
@petarstoev4848 Před 3 měsíci
can you load pdf files for summarization etc. in those models?
@rajvora2876 Před 7 měsíci
would love some tips on which specs can run it or some recs on laptops
@davidhendrie6061 Před 4 měsíci
keep me updated
@dhiraj223 Před 29 dny
Can we load safetensor models as well ?
@user-bd8jb7ln5g Před 7 měsíci ⁺¹
LM studio is great. My only issue with it is that it's got very small font that I haven't found a way to change.
@1Know1tHurts Před 7 měsíci ⁺¹
Same issue here.
@cloudd901 Před 7 měsíci
What do we know about their telemetry gathering? I'd hate to use this for personal stuff just to have a key or personal info leaked into someone else's training data.

Další v pořadí

Automatické přehrávání

Create Custom GPTs 🫡 OpenAI's AGENTS Are Here! (No Code)