Build and Run a Medical Chatbot using Llama 2 on CPU Machine: All Open Source
- uploaded 22 Jul 2023
- In this tutorial video, I'll show you how to build a sophisticated Medical Chatbot using powerful open-source technologies. Learn how to use Sentence Transformers for embeddings, Faiss CPU for vector storage, and integrate Llama 2, a large language model, using the Chainlit library for a conversational interface. Follow along as we guide you step-by-step to create an intelligent and efficient Medical Chatbot, all while using freely accessible tools. No prior experience required - dive into the world of conversational AI and healthcare innovation today! 🤖💬
Llama 2 Model (quantized by TheBloke): huggingface.co/TheBloke/Llama...
Llama 2 HF Model (Original One): huggingface.co/meta-llama
Chainlit docs: github.com/Chainlit/chainlit
Faiss GitHub: github.com/facebookresearch/f...
AI Anytime: github.com/AIAnytime
Langchain Docs: python.langchain.com/docs/get...
Sentence Transformers Hugging Face: huggingface.co/sentence-trans...
CTransformers GitHub: github.com/marella/ctransformers
LLM Playlist: • Large Language Models
WhatsApp Group:
#ai #generativeai #llama - Science & Technology
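The stack described above can be sketched roughly as follows. This is a minimal sketch, not the exact code from the video: library import paths, the model filename, chunk sizes, and directory paths are all assumptions, so verify them against your installed LangChain version, since these APIs change often.

```python
# Rough sketch of the RAG pipeline: PDF -> chunks -> Sentence Transformer
# embeddings -> FAISS index -> quantized Llama 2 (GGML via CTransformers)
# -> retrieval QA. All names below are illustrative assumptions.

def build_qa_chain(pdf_dir="data/", db_path="vectorstore/db_faiss"):
    # Imports are local so the sketch is readable even without the
    # heavyweight dependencies installed.
    from langchain.document_loaders import DirectoryLoader, PyPDFLoader
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    from langchain.embeddings import HuggingFaceEmbeddings
    from langchain.vectorstores import FAISS
    from langchain.llms import CTransformers
    from langchain.chains import RetrievalQA

    # 1. Load and split the medical PDFs into overlapping chunks.
    docs = DirectoryLoader(pdf_dir, glob="*.pdf", loader_cls=PyPDFLoader).load()
    chunks = RecursiveCharacterTextSplitter(
        chunk_size=500, chunk_overlap=50
    ).split_documents(docs)

    # 2. Embed the chunks and store them in a local FAISS index.
    embeddings = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-MiniLM-L6-v2"
    )
    db = FAISS.from_documents(chunks, embeddings)
    db.save_local(db_path)

    # 3. Load the CPU-friendly quantized Llama 2 model.
    llm = CTransformers(
        model="llama-2-7b-chat.ggmlv3.q8_0.bin",  # assumed local GGML file
        model_type="llama",
        config={"max_new_tokens": 512, "temperature": 0.5},
    )

    # 4. Wire retrieval and generation together.
    return RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",
        retriever=db.as_retriever(search_kwargs={"k": 2}),
        return_source_documents=True,
    )
```

Chainlit then wraps the returned chain in a chat UI; see the repo linked above for the full version.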
Hey Sonu - this is one of the first YT tutorials with a thorough explanation I've seen in a while. I got this running the first time 'out of the box'; it did ask me to pip install ctransformers, but after that it came up just fine. I am going to experiment with other documents. Some people don't like to sit through writing code, but it's good for us! Especially when you mention other tools we could try and why you picked what you use. Excellent!
What a fantastic video! Probably the only one that goes into complete details!
Glad you liked it!
Thank you man so much! I am very grateful for your content. I appreciate your passion for open source ai and your teachings are helping bring this technology into my reach. I was so happy when this ran! :) Excited to see your future videos.
Glad you like them! I have many videos already. More coming soon. Pls stay tuned 🙏
You did semantic search, with no fine-tuning involved. Is this accurate?
Wow. You packed a lot here - very helpful, thanks.
Glad it was helpful! Thank you 🙏
Hey, somehow I ended up on this extremely underrated channel and I gotta say I love it! I loved each and every part of this tutorial - something I had been looking for for quite some days now. Thank you so much, deffo subscribing and looking forward to such content.
Regards,
Anas from Pakistan
Loved the comment. Thanks
Thanks!!! Great presentation, super useful, amazing that you had the energy to do this while sick : )
Glad you enjoyed it!
Thank you for a smart and precise explanation of such a difficult topic
Just wanted to drop in and say congrats on your YouTube tutorial! 🎉🎥
Seriously, I'm so impressed with your content! Keep up the fantastic work!
Best wishes,
Rafael from Belgium
Hi Rafael, thanks for your lovely comment. Let's connect if you feel like..... Best, Sonu!!
Amazing video, you have saved the time of a lot of people. Keep up the excellent work.
Glad it helped... plz look at the LLM playlist.
Great video! thank you for sharing your expertise. Keep up the good work!
You are incredible professor!!! Thank you so much for your tutorial, i got very good insights. Best regards for you
Loved the content, it was beautifully explained. Thank you :)
Glad it helped!
Thanks for the great tutorial. It is really helpful.
A hint for anyone stuck with some errors in model.py, here are some fixes (original -> fix):
chain = cl.user_session.set("chain")-> chain = cl.user_session.get("chain")
res = await chain.acall(message, callables=[cb]) -> res = await chain.acall(message.content, callbacks=[cb])
thanks man
thank you,
another addition - under the #QA model function, change db = FAISS.load_local(DB_FAISS_PATH, embeddings) -> db = FAISS.load_local(DB_FAISS_PATH, embeddings, allow_dangerous_deserialization=True)
The best channel for LLMs.. thanks
Well done and appreciate the efforts. You have made my weekend interesting !
Glad to hear that! Please subscribe and check out the other videos too.
@@AIAnytime Sure, thanks. Got stuck at "could not reach server". 😞
Simply amazing ! this video can help a lot to,who wants to start working with LlaMa 2. thanks for sharing this.
Glad it was helpful! Please consider subscribing if you like other videos as well.
Amazing video, thank you - I wanted to build a similar chatbot based on an open-source model; now it will be easier to do it.
Thank you for your comment! As I am new on YT, your support can help me grow and create more such videos.
You may be new, but not for long. Soon videos like these are going to rock @@AIAnytime
Many thanks for a great video. Fantastic tutorial!
Glad it was helpful!
Thank you for creating this video. It was really helpful 😃
Nice video and great learning. Liked your confidence and knowledge. Going to build this bot on over the weekend and hopefully should be a breeze by looking into your code base and video.
Glad it was helpful! Thanks.
bro, what are your PC specs? And plz tell the minimum system requirements for deploying Llama on a computer
@@AIAnytime how to make it run on gpu too ??
Wow... This is what I was looking for 😇
Thanks, It's very useful. Upload more videos like that
Thanks for your comment! Please check my LLM playlist.
Great work, this video was really informative.
Glad it was helpful!
Thank you for the video and I learn a lot from you.
Glad to hear that!
What an amazing video... Thank you.
Fantastic video! A 1080p quality video would make the watching/learning experience much better. Just a candid suggestion.
Thanks for the tip! My recent videos have improved. Share your feedback on those if you have any.
Wow thanks for this video... Really helpful
Glad it was helpful!
Great tutorial. I am looking to learn these skills soon to take on a new role.
You can do it! Best of luck.....
Thank you very much sir amazing video, very knowledgeable amazing teaching ❤
Thanks and welcome
Excellent!!!
Good job, thank you
Thank you for your detailed explanation. Your classes are quite interesting and are building confidence to move further forward. I need some suggestions: I saw a medical chatbot using Llama 2 on a CPU machine, which was all open source. Similarly, I need to build an image-to-text multimodal model on a CPU using all open-source tools. Please provide your suggestions.
Awesome content! When is it adequate to fine-tune an LLM instead of, or as a complement to, the Botpress knowledge base?
Good one. Fantastic
Glad you liked it
Outstanding video... Thank you
Glad you enjoyed it!
Amazing Video!
Glad you enjoyed it
awesome channel man..more power!
Thanks for the visit!
Thanks
Amazing video thank you,
I had a question.
1. I'm unable to retrieve answers for questions about content outside the PDF. If an answer isn't found in the PDF, how do I configure it to fall back to the pretrained model?
This was a really well put together Tutorial thank you so much. Just one question what all needs to change to run this on GPU instead of CPU. Thank you so much for your time. Keep up the awesome work!!!!
Pick a GPU LLM model from TheBloke instead of a CPU model. GPU models usually have GPTQ in their name.
Quality content, thank you very much
Very welcome
Very Helpful
Glad it helped
Hi Sir, thank you so much for the tutorial. Do you know how to enable GPU support for this model ?
Very Good Video
This is amazing Thank you
Glad you like it!
Thanks, Open source AI Advocate
good work. keep it up
Thanks a lot. It's really very helpful.
Glad it was helpful!
Thank you for the efforts to explain this in very simple way.
I'm new to LLMs. I tried your GitHub code; when I ask a question it gives the error "Async generation not implemented for this LLM." Could you please help with a workaround?
Appreciate the great work!
Most of the tutorials out there are just trying these LLMs on Colab notebooks, makes you eager for more.
Would appreciate if you can also cover the deployment part, thank you :)
Glad you like them! There are a few deployment videos on my channel. Please check them out.
Really, it was a wonderful video!! Can I train this model in Google Colab or any other cloud GPU's??
Great content thank you!
Glad you liked it!
Hi sir, thank you so much for the video we are looking for the same type of video. ( I have one request- can you please make a video for data extraction from different types of invoice data with the help of open source model or libraries.)
Great video
Thanks for the visit
Great job dude, even a non-technical person can understand your explanation. Thanks and respect for sharing open-source AI. I have one question: how can I restrict this chatbot from answering any question outside of the document/PDF? For example, if I ask the chatbot what Python is, it gives the answer, but this information is not present in the PDF. How can I restrict it and make it a PDF-specific bot?
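One common way to keep the bot PDF-specific is a stricter prompt that tells the model to refuse when the retrieved context lacks the answer. The template below is an assumption, not the exact one from the video, and prompt-only guardrails are a mitigation, not a guarantee - models can still ignore instructions.

```python
# Hypothetical stricter prompt template for the RetrievalQA chain.
RESTRICTIVE_PROMPT = """Use ONLY the context below to answer the question.
If the answer is not contained in the context, reply exactly:
"Sorry, I couldn't find that in the document."

Context: {context}
Question: {question}

Helpful answer:"""


def render_prompt(context: str, question: str) -> str:
    """Fill the template the same way LangChain's PromptTemplate would."""
    return RESTRICTIVE_PROMPT.format(context=context, question=question)
```

With LangChain you would wrap this in PromptTemplate(template=RESTRICTIVE_PROMPT, input_variables=["context", "question"]) and pass it through the chain's chain_type_kwargs.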
Amazing Content
Thank you.
Quality!
Good video!
Glad you enjoyed it
super work, thx
Thank you !
Hey thanks for this, how would one separate the model from chainlit UI? i.e separating concerns and running in two containers if possible.
Very helpful bro
Glad it helped
thanks for sharing
Thanks for watching!
Thank you so much for the video genuinely I learned something from this 1 hour , Just one question for GPU we have to change just cpu to gpu or any other package to be updated. Once again great video
Not much of a change - use the CUDA kernels instead of the CPU. A couple of changes, of course. You can also use the original model for better performance.
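For those asking about GPU, a minimal sketch of GPU offload with CTransformers, assuming a CUDA-enabled build of ctransformers. The `gpu_layers` knob moves that many transformer layers onto the GPU; the model filename and other values here are illustrative assumptions, so verify them against your version.

```python
# Hypothetical GPU offload for a CTransformers-based loader.
# gpu_layers controls how many layers run on the GPU; 0 keeps
# everything on CPU. Requires a CUDA build of ctransformers.

def load_llm(gpu_layers: int = 50):
    from langchain.llms import CTransformers  # local import: heavy dependency

    return CTransformers(
        model="llama-2-7b-chat.ggmlv3.q8_0.bin",  # assumed local model file
        model_type="llama",
        config={
            "max_new_tokens": 512,
            "temperature": 0.5,
            "gpu_layers": gpu_layers,  # set 0 for CPU-only
        },
    )
```

Alternatively, as mentioned elsewhere in this thread, pick a GPTQ model from TheBloke and a GPU-oriented loader instead of the GGML/CPU path.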
Hello from Portugal! Thanks for your video, Sir. Could you make a follow-up video on how to run it on GPU? As you see, there are many viewers interested in it. Being a non-programmer, it would be nice to see a video showing what and where to change in the code. I was able to follow this video and make it work even though I don't know coding at all, so I believe you would make a great video for GPU usage too. Maybe something like a follow-up video. Thanks, Sir!
Hi Pedro, thanks for your lovely comment! I will create a video soon for the GPU as well. Stay tuned....
@@AIAnytime thanks. Looking forward to it
Very good content, it is very helpful
On a Mac these developments cannot be run because of the GPU situation. However, I understand it could be carried out in Google Colab, right?
Have you made a follow-on video showing how to incorporate GPU acceleration (CUDA for Nvdia) into your codebase?
Beautifully explained each step, Would you like to confirm, what GPU is best for llama-2 (7B and 13B) model on PC/Laptop.
Get anything which has 24GB VRAM if that's in your budget.
Subscribed bro
Good work. How do I get a streamed response like ChatGPT and output it word by word as soon as it arrives? If possible, reply with a code example based on the current video.
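A sketch of token-by-token streaming with Chainlit's LangChain callback. This assumes a Chainlit version from the era of the video; the handler name and its options may differ in newer releases, so check your installed version.

```python
# Hypothetical streaming handler: Chainlit's AsyncLangchainCallbackHandler
# can forward tokens to the UI as the LLM produces them, instead of
# waiting for the full completion.

async def answer(chain, message_content: str) -> str:
    import chainlit as cl  # local import: UI dependency

    cb = cl.AsyncLangchainCallbackHandler(
        stream_final_answer=True,                 # push tokens as they arrive
        answer_prefix_tokens=["FINAL", "ANSWER"],  # marks where streaming starts
    )
    res = await chain.acall(message_content, callbacks=[cb])
    return res["result"]
```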
Hello! I am curious... Do you know if I could use the ChatOpenAI class for Llama2.cpp deployed on another server? I have OpenAI working already, but Llama 2 seems impossible without using Hugging Face or loading the model path (I don't have access...)
Hi great work thank you 🎉
Thankyou.
legend!
I wonder how you could get it so that you type in a bunch of symptoms and it asks follow up and then gives you possible diagnosis.
POWERFUL!!!!
Thanks
For some reason, from the second/third question in a session I am getting misconstructed answers with lots of word repetition, and in the terminal window I can see "Number of tokens (638) exceeded maximum context length (512)." Any idea what this is and how to prevent it? I can try increasing the max tokens, but first I want to better understand the issue.
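On that "exceeded maximum context length (512)" message: the prompt (template + retrieved chunks + question, plus any accumulated chat history in a conversational chain) together with the tokens to be generated must fit in the model's context window, and 512 is a common default in ctransformers. A back-of-envelope sketch (the numbers are illustrative; raising `context_length` in the CTransformers config, trimming history, or retrieving fewer/shorter chunks are the usual fixes, though the exact parameter name should be verified against your version):

```python
# Why the warning appears: prompt tokens + tokens to be generated must
# fit inside the model's context window (512 by default here).

def fits_context(prompt_tokens: int, max_new_tokens: int,
                 context_length: int = 512) -> bool:
    """True if a request fits the model's context window."""
    return prompt_tokens + max_new_tokens <= context_length

# Example: two 250-token retrieved chunks plus ~138 tokens of template
# and question = 638 prompt tokens, which already overflows a 512-token
# window before generation even starts - matching the error above.
```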
Amazing! Can you explain the problems with Langchain in production and provide alternatives for Langchain?
Fantastic question. Let me answer it. Harrison Chase and team have done a great job with LangChain, but at the moment it isn't enterprise-ready:
1. It allows arbitrary code execution and is prone to prompt injection.
2. Edge-case issues have been identified with integrations.
3. High compute costs due to CPU and memory spikes.
4. Many other vulnerabilities
Let's give Harrison some time on it.
Many other developments are happening. Stay tuned.
@AIAnytime perfect one! As we saw, responses took up to 2 mins, and the LLMChain consumed most of that time. How can we further tune it to speed up while staying on CPU (with respect to both hardware specs and parameter config/code tuning)? And what's a good hardware config to run this solution and get ChatGPT-like response times?
Thanks!
Welcome! Thank you for the support.
GENIUS
Thanks
This video is too good, thank you. I have two questions - one, sometimes the answers are very short, is there a way to get longer, more friendly communication? The second question is, can you use chainlit on huggingface spaces, or do you have to stick to streamlit? Thank you so much!
Thanks for your comment. Increase the max tokens length - maybe 1048. I recommend deploying this on AWS or Azure. Is there anything specific you want to deploy on HF Spaces? I don't think it's currently supported automatically, but there are workarounds.
Great video, I have a question for you, what model can I use to do it in Spanish, or does it work with the same one?
It would be nice to see how we could actually stream the responses. Also, that quantized version you are using is old, the new quantized versions that have a "K" are better.
How do you know? Did you use it?
Hi. While in a venv and trying to install requirements.txt, I'm facing issues installing torch on a Mac M2 Air. Is there any solution for this?
Very Nice .
Thank you! Cheers!
Nice video! Just one minor question: why not use LlaMA's own encoder layer for the embeddings?
what's LlaMA's own encoder embedding layer?
Hey! Thanks for the video. I am running it on a Macbook Pro with M2 chip but it is taking ages for even a single response to come in. Any suggestions?
Thank you for the tutorial, So how to get streaming chat response?
Great tutorial. Could you please tell how to fine-tune the model? Is it possible?
Amazing video.
Is it possible to add a translation feature to the response using an LLM model? If it is possible, can you tell me how to do it?
Amazing! This pipeline doesn't work well with CSV files though. Could you make a video explaining how to use csvs with these open-source models?
Great suggestion! Will come up with something...
@@AIAnytime Could you please suggest videos or websites I could use to create a CSV chatbot using Llama?
Hi nysa, find this: czcams.com/video/MUADZ97GgZA/video.html
In the video at 4:39 you run a command in the terminal. Where should we run it on the Windows operating system, and how? Please give an explanation.
But does this also work without a network? I could see you were having issues with the server at one point in the video.
Hello, I want to use the model epfl-llm/meditron-7b instead of Llama, but what should I put for the model_type? Thx
Thank you very much for this great tutorial.
I have an error and I am struggling to solve, maybe could you help me?
The error is:
ModuleNotFoundError("No module named 'faiss.swigfaiss_avx2'")
and I have already tried to uninstall, downgrade, and upgrade versions of faiss-cpu but it does not solve this issue.
Hi @AI anytime, thanks for creating this video. It's very informative and simple to understand. I have a simple question, can I create a chatbot like this which could display images as well with the related text. I know there is GPT 4 which is capable of it but I want to know whether it is possible with LLama or any other models?
I am no expert, but I have been looking into the same thing for quite some time, and the simple answer is no. But the thing is, the larger LLaMA 2 models like the 70B-parameter one are very capable. It beats GPT-3.5 Turbo in benchmarks but falls shy of GPT-4. And currently PaLM 2 does kinda equal GPT-4 in performance, but it is a little bit worse (my personal experience). But the main advantage of such models (LLaMA 2) is that, obviously, they are free, and you can fine-tune them for your own use case, making them more efficient than models that do better generally (like GPT-4). This is especially true for easier use cases like this chatbot, which would not even utilize the entire power of GPT-3, so we wouldn't even consider using the much, much costlier GPT-4. I hope you're getting my point. For complex tasks GPT-4 may be better, but needing the full potential of such huge models is rare for most use cases. And remember how mind-blowing ChatGPT was just over a year ago when it was released. LLaMA is better than that.
Hi, thank you for the great tutorial. I followed along with you, but I am facing an issue: my Llama chat model does answer based on the document, but it also answers generic questions like the speed of light. Is there any way I could get answers based only on the documents?
If I ask any question that is completely unrelated to the data in the PDF, it gives some random answer from the PDF. How do I handle that and make it give a response like "sorry, I couldn't answer the question"?
I tried the same with Instructor embeddings and the Llama 2 model that you prescribed. I am continuously getting this error:
FileNotFoundError: [WinError 3] The system cannot find the path specified: 'C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.7\\bin'
Any idea why? Wasn't it supposed to run on CPU only?
Thank you for this, but can u make a similar bot which responds not only with text but with rich media (like images, GIFs, links) etc.? Just like how u create embeddings on the text, can u do embeddings on the images in the PDF? Would love to see ur video on this.
I am getting the error "name 'custom_prompt_template' is not defined" in the Chainlit interface. I have no idea why.