How to Create LOCAL Chatbots with GPT4All and LangChain [Full Guide]
- Added: 26. 07. 2024
- 📚 My Free Resource Hub & Skool Community: bit.ly/3uRIRB3 (Check the “YouTube Resources” tab for any mentioned resources!)
🤝 Need AI Solutions Built? Work with me: bit.ly/3K3L4gN
📈 Find out how we help industry experts and entrepreneurs build and scale their AI Agency: bit.ly/skoolmain
In this video I show you how to set up and install GPT4All and create local chatbots with GPT4All and LangChain! Privacy concerns around sending customer and organizational data to OpenAI's APIs are a huge issue in AI right now, but using local models like GPT4All, LLaMa, Alpaca, etc. can be a viable alternative.
I did some research and figured out how to make your own version of ChatGPT trained on your own data (PDFs, docs, etc.) using only open source models like GPT4All, which is based on GPT-J or LLaMa depending on the version you use. LangChain has great support for models like these, so in this video we use LangChain to integrate LLaMa embeddings with GPT4All and a FAISS local vector database to store our documents.
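As a rough sketch, the pipeline described above looks something like the following. This is a hedged illustration, not the exact code from the video: module paths follow older (pre-0.1) LangChain releases, and the file paths reuse names that appear later in the comments (`shortened_sotu.txt`, `gpt4all-converted.bin`, `ggml-model-q5_1.bin`), so treat all of them as placeholders.

```python
def build_qa_chain(doc_path="./docs/shortened_sotu.txt",
                   llm_path="./models/gpt4all-converted.bin",
                   embed_path="./models/ggml-model-q5_1.bin"):
    """Build a local retrieval-QA chain: load a text file, embed its
    chunks into a FAISS index with LLaMa embeddings, answer with GPT4All."""
    # Imports are kept inside the function so the sketch can be read
    # (and the function inspected) without the heavy dependencies installed.
    from langchain.document_loaders import TextLoader
    from langchain.text_splitter import CharacterTextSplitter
    from langchain.embeddings import LlamaCppEmbeddings
    from langchain.vectorstores import FAISS
    from langchain.llms import GPT4All
    from langchain.chains import RetrievalQA

    docs = TextLoader(doc_path, encoding="utf-8").load()
    splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=50)
    chunks = splitter.split_documents(docs)
    index = FAISS.from_documents(chunks, LlamaCppEmbeddings(model_path=embed_path))
    llm = GPT4All(model=llm_path)
    return RetrievalQA.from_chain_type(llm=llm, retriever=index.as_retriever())

# Usage (requires langchain, llama-cpp-python, faiss-cpu, and the model
# files downloaded locally — several GB):
#   qa = build_qa_chain()
#   print(qa.run("What did the speaker say about the economy?"))
```

Nothing heavy runs at import time; the multi-GB models are only loaded when the chain is actually built.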
Mentioned in the video:
Code: github.com/wombyz/gpt4all_lan...
GPT4ALL (Pre-Converted):
huggingface.co/mrgaang/aira/b...
Embedding Model:
huggingface.co/Pi3141/alpaca-...
Mac User Troubleshooting: docs.google.com/document/d/1J...
Timestamps:
0:00 - What is GPT4All?
2:28 - Setup & Install
7:13 - LangChain + GPT4All
11:42 - Custom Knowledge Query
19:42 - Chatbot App
24:19 - My Thoughts on GPT4All
Leave your questions below! 😎
📚 My Free Skool Community: bit.ly/3uRIRB3
🤝 Work With Me: www.morningside.ai/
📈 My AI Agency Accelerator: bit.ly/3wxLubP
This is an extremely important issue that many people are forgetting.
Using ChatGPT means training ChatGPT, which means your data can be used in the future by OpenAI or anyone else accessing ChatGPT. Not something you want to do with your own market or confidential data!
Thanks a lot for this very important information!!
I just subscribed.
Agreed! Thanks for the support mate 💪🏼
How can we make it private??
@@studywhatever4063 For OpenAI you can now either opt out of history and training, or you can pay for the API or Plus (or the soon-to-launch business model). You can't disable the training portion without disabling history... Almost like they are punishing us... hmmm
@@neogeo9117 thanks for your reply
Cue the Samsung engineering document leaks as to why your privacy is important! Don't ChatGPT at work!
Thank you for constantly looking into this topic, preparing demos, and presenting them on YouTube. I very much enjoy all your work.
I CANNOT BELIEVE LIAM IS GIVING THIS AWAY WTF, in-depth tutorials on cutting edge stuff for anyone to watch and learn. Amazing
Thanks for making this video. Really enjoyed following along. I think it's got just the right amount to get experimenting before diving into more specific topic areas. I did have some issues with slight changes to the latest releases but a bit of research solved these. I think this is because things are just moving so quickly. I am still struggling to convert old ggml models to new ones though.
You are doing an amazing job Liam, thanks!
My pleasure, thanks for watching!
Been waiting for this!
💪🏼
Good stuff, thanks for the effort you put into this Liam!
Appreciate the support mate 🤙🏼
great job Liam! keep on producing value!
Great Video! Even though its a difficult topic, you make it quite easy to understand!!! :)
Thanks mate, I appreciate the feedback!
Great video! Clear instructions. Cheers Liam 💪🏽
💪🏼
Thank you so much Liam, this helps me so much to understand LangChain and NLP-related stuff. Thanks!
Thank you very much for this helpful video. I was struggling with M1 chip related problems and I succeeded in solving them thanks to your explanations.
Amazing content. Thanks Liam!
That's insane! I truly love you!
Good tutorial mate, merci beaucoup !
I hope everyone reading this becomes successful!
💪🏼
I think I can speak for all of your subs: we really appreciate your work and effort on this! I will need to view this presentation several times before trying it myself. A few questions:
1.) You mentioned that _Your GPT4ALL Instance_ (That sounds so good just writing that it's *_Yours_* ) is slow. What hardware are you running on? CPU/RAM/type of storage/GPU(s)?
2.) Would the basic local setup for GPT4ALL be pretty much the same as for other open source models?
3.) You're in Colombia?? What happened to Dubai? 😮
Thanks again!
Hi mate!
1. I have a MacBook Pro M1 Max with 64GB, so it should be able to run fairly well
2. Yes, very similar, especially if you're using LangChain. You just need to swap out the LLM name and import it correctly
3. Most people don't stay in Dubai year round; it gets too hot about now. I've started my travels for the year, starting in South America!
Thanks for the support!!
5:24 appreciate this model source!
Been trying to find a converted version that aligns with the default documentation/github
Liam, you're doing a super great job, amigo. I love all your videos, for real. This one in particular. Bad thing is that I haven't been able to install Chromadb no matter what I do. I have read a thousand files regarding the issue and nothing can lead me to a solution. Error: Building wheel for hnswlib (pyproject.toml) did not run successfully. But if you can help... you will help not only me because seemingly there's a lot of people with the same issue. Keep it up, Liam. Awesome work!
I think local models are the future, they have a lot of possibilities.
Thanks Liam for ur job from Italy!
My pleasure mate thanks for watching!
Great video Liam! Could you make a similar video but with access to websites, not text files.
Thankyou so much
That's a nice solution, but one you can't use for commercial applications.
Good content, by the way.
A better alternative is to use models like T5 for long-context reasoning, or UL2 for more general-purpose use, with multi-chains and fact-checking to prevent hallucinations.
Ask us if you need any help with that 🎉
Will take a look!
I’d love more info
Which python version would you recommend for both Langchain and GPT4All? Great tutorial by the way.
Great video! Would you be able to make a similar one but with Semantic Kernel?
Hey, nice work. Explained it in simple way and very effectively. How about GPU processing? I know this can be a little tedious task. But, I can't even find the GPU documentation anymore on github. Could you help in providing the documentation link?
Amazing Tutorial, thank you so much.
The custom knowledge part is very slow on my Mac M1 16GB (10 minutes per answer). Would fine-tuning be a solution? How could we achieve that? Or would it be faster on a public cloud?
Thank you again
God bless this man
What an awesome video! Could you do a video on how to integrate this to slack? Like deploying it to a cloud server and then connecting it to slack. Thanks!
yes please
Great video. Thank you. I got it to work on my M1 MacBook using your instructions. Wondering what type of Mac did you use? My M1 was excruciatingly slow. What was your speed?
M1 Max, 64GB. Very painful, I know. For the custom knowledge chatbot it would take a minute or two to respond. It would be good to try getting it working on the graphics card instead of the CPU, but I'm not sure there's support for Mac graphics yet
Hi great video and good steps! I am on a mac and am currently experiencing issues even getting the gpt4all to complete the once upon a time prompt. Following the document right now for Mac troubleshooting and it seems like the cloning repository link does not work? if possible could you provide the updated cloning repository link or step? Thank you!
Liam, yours is an amazing job, thanks a lot. I was looking for this; honestly, with so many amazing free models around (Vicuna and Koala too) it was a shame that no one (except you) was even giving it a try!
BRAVOOOO
I run all of this on my MacBook Pro, no M1, 16GB RAM; a bit slow but no issues.
Some more dependencies to install for the community: pip install llama-cpp-python, pip install faiss-cpu, and for your last code sample with the chat history we also need transformers: pip install transformers
- In case you do a video with Vicuna, I will be there for you.
- Can I refer to your channel in a Medium article?
thanks again
Hi mate, thanks for the kind words! I'll take a look into Vicuna today, and feel free to mention in a medium article!!
Is a custom knowledge base different from fine-tuning? I saw your videos about fine-tuning with the NBA stats example... With this LangChain knowledge base example, does it achieve fine-tuning of the LLM as well?
Really interesting topic and precisely explained. I wonder whether it would be possible to query from a knowledge graph (JSON, etc.) and how this could boost the interpretability and explainability of the model's responses
Worth investigating for sure. I'll take a look
What alternatives might we try instead of the LLaMa model you used for the index? I am wondering if we could use that in a production environment, as LLaMa is still not open, is it?
How much time does it take on your mac?
Do we need gpu on Linux server for vector embedding phase?
How much space we need on hdd?
I made 5 bucks and I am pumped, thank you very much
I would be interested in how you can set this up on a cloud server and create an endpoint that I can hit with requests from my web app
hi boss, which software are you using for those burnt subtitles please
Is this being ran in the CPU of your MAC or are you using GPU acceleration for inference?
Great video!
CPU, can't imagine the nightmare of getting this setup on a mac GPU 😂
Could we not use an mpnet embedding from Hugging Face instead of the LLaMa embedding? Also, in my experience, gpt4all-j has better performance. Have you tried a LangChain integration with "J" models?
Can you also put up a requirements file so all the dependencies are listed in a single place?
That's a fantastic video; still working through the pieces. I did use a Windows environment after you scared me with the Mac issues. But why is nomic needed (I'm still getting the not-supported-on-Windows message)? With GPT4All itself it works. In your custom knowledge query code I got some errors on line 23: loader = TextLoader('./docs/shortened_sotu.txt', encoding='utf-8') -> added the encoding argument, and I also needed to run: pip install --upgrade llama-cpp-python
okay, break is finished, need to go to the next lesson from you
Did you get it working in the end?
@@LiamOttley Yes, of course, it works even in German. Now playing with other embeddings and chromadb on top of a nicer interface in Streamlit :-) And desperately waiting for your next videos.
What did you have to do to get it working in windows?
I am using Hugging Face to load GPT4All; how can I change the input token limit?
Hi,
1. What will the answer be if you ask about something not covered in the text?
2. Is there any option for a confidence score?
3. About the speed: I believe you ran on CPU. On GPU it would be faster, right?
Thank u for your great video!
When I try to run it, I face the below error:
./models/gpt4all-converted.bin: invalid model file (bad magic [got 0x4f44213c want 0x67676a74])
Segmentation fault (core dumped)
Could u please guide me on how to solve it?
Other links mentioned in the video are also missing, for example the link to PyPI. I was able to see the URL in your video and copy it, but don't say you put something in the description if you didn't.
Otherwise I do enjoy your videos and have learned a lot, so thanks :)
To increase speed, can we use the GPU on Windows?
Will I have to delete the pickle file if I am using another PDF?
Could you do a video on this for pretty much a complete beginner in coding, like what apps you'll need, are there any ways to do it all completely for free, etc.
Can you also make a video on how to connect GPT4ALL with chrome gpt ?
Hey, I know this might come off as a completely stupid question, but what is the IDE you're using to read and edit code? Totally new to this but fully invested, so please let me know!
What is the config of your computer? Custom knowledge query is dead slow in my computer.
Can generating an index of a CSV with 300 rows using the LangChain CSVLoader take more than 3 hours? I'm running into that situation right now.
How can I make the Langchain LLM only print out the answer, and not the prompt again in the terminal?
Can you show example on csv or excel files? Thanks for the demo.
Does it need to be a local chatbot or using APIs is ok?
@@LiamOttley local chatbot without openAI api
Can you please upload a new tutorial on deploying this with Streamlit?
Does this process require GPU? Or can it be run using CPU alone?
Would it be better (and potentially cheaper) to run it in private cloud instances?
I have Alpaca running on my MacBook (Intel Mac, a few years old), but it’s struggling a bit & I’m weighing up building a rig vs. using cloud compute.
Worth a shot, these models are quite big so would need to have plenty of storage
I am getting an error while installing nomic using pip on Windows; it says it is not supported. What should I do?
Can a chatbot be made where it has a custom knowledge base to query from, and at the same time, have internet access to answer questions or finish tasks?
Yep! You can use LangChain tools.
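As a hedged sketch of what the reply above suggests — pairing a local knowledge base with internet access via LangChain's agent/tool interface. Class names follow older LangChain releases, and the `retriever` and `web_search` arguments are assumed to be built elsewhere (e.g. a FAISS retriever and a search wrapper's `run` method):

```python
def build_agent(llm, retriever, web_search):
    """Wrap a local retriever and a web-search callable as agent tools."""
    # Heavy import kept local; names follow the classic LangChain agent API.
    from langchain.agents import AgentType, Tool, initialize_agent

    tools = [
        Tool(name="KnowledgeBase",
             func=lambda q: str(retriever.get_relevant_documents(q)),
             description="Look up facts in the local document index."),
        Tool(name="WebSearch",
             func=web_search,  # e.g. a DuckDuckGo search wrapper, if installed
             description="Search the internet for current information."),
    ]
    # The agent decides per-question which tool to call.
    return initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION)
```

Whether a small local model can reliably drive the ReAct tool-selection loop is a separate question; these agents were designed around stronger hosted models.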
Am I missing something? There is no link to the GPT4All GitHub; you mention it in the video but there's no link,
and if I search gpt4all on GitHub the page I get is different from what's in the video
A master class in telling you just enough so you can't do it without contacting the maestro, Sad! Effort still appreciated though
What? 😂😂
@@LiamOttley Bro, give more details so the noobs can replicate. If you want I'll send the instruction on how to make it happen on Linux.
This is a great video. Trying to make this work. I installed this on a Mac M1 with 16GB. The GPT4All process loads and then stops abruptly. I have not used the Python bindings yet. Is this because the memory is low?
Could be a million things. Just use the bindings as I do in the vid
@@LiamOttley yep. trying the bindings now. :) will let you know.
Got it working. There were bugs where I had to match the versions for pyllamacpp and langchain etc. Finally got it working. Thanks for your video.
I'd like to hear you say more about privacy, but in relation to how you comprehend what only a local representation of chatgpt is? How are we compromising an otherwise 100B parameter model into an offline version? And how come AI tokenization isn't even touched by the top AI channels on CZcams? If we knew better how to plan, then we might not be so misled about the actual practicality of realistically implementing anything on Githubs top monthly trending. There's character calculators laying about, but no one's making a dent in an auto-token calculator or token budget tool. What is everyone doing? Checking back to the hard limit setting on their token settings every five minutes?
How did you convert the model to working format?
I found one that was already converted, see description
Could not load Llama model from path: ggml-model-q5_1.bin. Received error (type=value_error). Please help me with this error, I have installed llama-cpp-python==0.1.48
If possible I'd love to see a version of this that runs on the GPU version, I'm running into so many hiccups and am at a crossroads, can't seem to find others with these problems and using GPT to try and solve the problem isn't working out well either.
GPU versions are a whole different level of pain unfortunately. I’d say near impossible on Mac at this point. Best to wait for the infrastructure to catch up a bit I think
Hi, I have GPT4All on my system but it does not run?
I am getting an assertion error after submitting a question. How can I solve this?
result = qa({"question": query, "chat_history": chat_history})
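For reference, the conversational chain in the video expects `chat_history` as a list of (question, answer) pairs that grows each turn. A minimal, dependency-free sketch of that loop (the `qa` chain itself is assumed to be built elsewhere, e.g. with ConversationalRetrievalChain):

```python
def ask(qa, query, chat_history):
    """One turn against a ConversationalRetrievalChain-style callable.

    `qa` takes {"question", "chat_history"} and returns {"answer": ...};
    the (question, answer) pair is appended so follow-ups keep context.
    """
    result = qa({"question": query, "chat_history": chat_history})
    chat_history.append((query, result["answer"]))
    return result["answer"]

# Demo with a stub chain, so the calling pattern is visible without a model:
if __name__ == "__main__":
    stub = lambda inputs: {"answer": "echo: " + inputs["question"]}
    history = []
    print(ask(stub, "What is GPT4All?", history))
    print(history)
```

Passing `chat_history` in the wrong shape (e.g. a string instead of a list of tuples) is one plausible source of errors at this call site, though the assertion error above could have other causes.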
For you to create a high quality plugin, would it require money for the plugins?
You mean for users or for you as the developer?
@@LiamOttley well I mean someone that’s wants to try and make Money from these plugins?
How do you teach gpt4all on your dataset?
Will look into this
Hi, I got this error: "NotImplementedError: Your platform is not supported: Windows-10-10.0.19045-SP0. Current binaries supported are x86 Linux and ARM Macs." My PC is an x86 Intel(R) Core(TM) i5-10310U CPU @ 1.70GHz 2.21 GHz, 16.0 GB RAM (15.7 GB usable), x64-based processor. Is it possible to use it with this specification?
Same error. Found any solution?
How can u get access to the ChatGpt Plugins?
Apply and wait!
Please provide all the links you mentioned in the video. Thanks!
Please make a video on autogpt!
Working on it!!
@@LiamOttley Lets gooo
@@LiamOttley autogpt with gpt4all as the model would be awesome. i've been trying to figure that out for a few days with limited success. and thanks for this video, super helpful!
Even with 8 threads on M2 it's unfortunately too slow to be useful for me. Are other models faster than the one you provided, or is an 8GB M2 Mac just too slow?
Yep, need GPU support I think ASAP then these will be usable
Also, the quality of responses and context is very poor compared to OpenAI's models.
Great concept, but the model needs a lot more improvement and possibly tuning capabilities.
Maybe I spoke too soon... With the default embedding model all-MiniLM-L6-v2 the results were not great, but the quality of response and context improved substantially when I changed it to all-MiniLM-L12-v2. In fact, for some of the questions I felt the response was more accurate than what I received with OpenAI's ChatGPT 3.5.
It will be interesting to learn how to tune the models and embeddings further with a custom corpus.
It is slow, but for my experimentation it is acceptable.
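The embedding swap described in the comment above is a one-line change. A hedged sketch, assuming the `HuggingFaceEmbeddings` wrapper from older LangChain releases and the sentence-transformers checkpoint names the commenter mentions:

```python
def make_embeddings(model_name="sentence-transformers/all-MiniLM-L12-v2"):
    """Return an embedding object; the default here is the L12 model the
    commenter found noticeably better than the smaller all-MiniLM-L6-v2."""
    # Heavy import kept local so the sketch is readable without the package.
    from langchain.embeddings import HuggingFaceEmbeddings
    return HuggingFaceEmbeddings(model_name=model_name)

# The result can be dropped in anywhere an embedding object is expected,
# e.g.: FAISS.from_documents(chunks, make_embeddings())
```

Since the vector index stores the embeddings, changing the embedding model means rebuilding the index (and deleting any pickled/cached copy of it).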
Does it work offline?
Yes, that's the point!
got this error right off when trying to run nomic setup: NotImplementedError: Your platform is not supported: Windows-10-10.0.22621-SP0. Current binaries supported are x86 Linux and ARM Macs.
same error. Found any solution?
Good information from what I saw. I do have one comment to give you some viewer insight if you’d like: Your audio goes super quiet and then normal, probably due to your distance from the mic changing. It makes the video a little bothersome to listen to for me personally. I hope I don’t sound rude, just providing some viewer feedback❤
If I create LOCAL chatbots with GPT4All and LangChain, what can I do with them? Is that the same as GPT-4?
Getting this error?
NotImplementedError: Your platform is not supported: Windows-10-10.0.22621-SP0. Current binaries supported are x86 Linux and ARM Macs.
Hello. Do you, Liam Ottley, or anyone else watching this video / reading this comment, know of a way to log the chat conversations to a .txt or a .log file from gpt4all?
Just so I understand, this uses local version of GPT and costs zero as there are no openai calls?
Yes, it only costs the compute power of your laptop or server and requires no OpenAI calls, but the important thing is that this is NOT GPT. It is an open source language model trained on a similar but different dataset. Not the same, but similar
@@LiamOttley I tried this code and other utilities to create a chatbot for querying PDFs. I have to say the tools are in their infancy for customization, but still promising. I'd give it 1-2 years before these AI tools are useful in the engineering world. They have lots of errors and need to become robust
Does this work on windows?
Yep! Should be easier than mac
I am trying to use Replit. Has anyone seen this error?:
Traceback (most recent call last):
File "main.py", line 2, in
m = GPT4All()
File "/home/runner/customLocalGPT/venv/lib/python3.10/site-packages/nomic/gpt4all/gpt4all.py", line 81, in __init__
self._download_model()
File "/home/runner/customLocalGPT/venv/lib/python3.10/site-packages/nomic/gpt4all/gpt4all.py", line 144, in _download_model
f.write(chunk)
OSError: [Errno 122] Disk quota exceeded
These models are 4gb or so, way too big for replit. You won’t be able to do it on replit
Hi Liam, I'm stuck on the nomic install. I have an x64 processor and I get: raise NotImplementedError(f"Your platform is not supported: {plat}. Current binaries supported are x86 Linux and ARM Macs.")
NotImplementedError: Your platform is not supported: Windows-10-10.0.19042-SP0. Current binaries supported are x86 Linux and ARM Macs. Could you please shed some light here?
Thanks, awesome vids
Looks like you need to follow the same steps as Mac users had to. Check the doc in the description and figure out how to convert the commands to Windows commands
@@LiamOttley thanks Liam
@@LiamOttley Don't see which steps can be followed for Windows.
Same problem. Tried on 2 separate machines.
I'm getting 4-5 minutes to generate a response to a prompt.
Why has nobody built a ready made software that does all of this?
I got lost again. I think someone needs to do an overview of the framework of what is actually needed to run a GPT option. A video with multiple options for multiple OSs is just confusing
Are you able to share a Colab? It helps us no-coders :). Thanks for the great vids. PC user
Just added the GitHub repo to the description! Unfortunately, because these models are a few GB in size, it may not run on Colab :(
@@LiamOttley Thank you! I didn't know that 😀
I encourage everyone to take a look at section 7 of their terms of service
The libraries and files used in this tutorial are mostly deprecated and outdated. sad :(
Hi, I'm very interested in your videos, like this one. But not everyone knows English; they have to read the translation via subtitles, and that's a problem because you go too fast. It's a pity, because you could grow your views much more
Hi mate, thanks for the feedback.
@@LiamOttley ❤️🤟
Thanks for the guide, great info! But that haircut looks AI generated.
that was in 20 minutes
Very nice video in which you describe how to earn more money in this business