Man, you didn't say a single word of BS. That's great respect for the audience's time, much appreciated!
2 months ago? Super impressive! I believe Ollama has its own embeddings now. Thank you so much for the video. Could you do a video about a datastore for persistent memory, or something with AutoGen using multiple Ollama models as agents, if possible?
This is gold! thanks a lot, very clear and easy to understand
This is great and easy to understand. Thanks very much for sharing. Also, how do I point it at a directory? I would love to point it at an Obsidian vault and get a summary of it.
I am not good at programming, but this video helped me start my local RAG easily.
Thanks for your video, I really enjoy your relaxed, straightforward and easy-to-follow style. Since Ollama now supports embeddings and you mentioned the advantages of storing vectors, would you like to make a new video, showing - for example - Ollama with langchain and lancedb in action? Thank you for your great content and especially for Ollama, of course!
Awesome, thank you!
SentenceTransformerEmbeddings allows encoding with sentence-transformers models.
It would be very nice to have a video with javascript. There are plenty of them using OpenAi, but with ollama it has to be a bit different. We could learn a lot from it and thanks if you do it!
Actually, that was one of the very few with something other than TypeScript/JavaScript.
@@technovangelist this is true, but this video was not made using javascript. I think it would be a very useful thing to do, because it shows me the practical use. 🤟 I just suggested it. 😁 And you said at the end of the video that there would be one. 😇
Do you have any video about using tools/custom tools with ollama and Langchain?
That vector store is created at runtime. What if we want to create a persistent Chroma database? Would the retriever be the same, and how do you think we can use it when prompting the LLM?
Is there any way to keep the model loaded on the GPU so that I don't have to wait a long time for the output? (I am using a notebook.)
Thanks so much, Matt! Please post your code when you get a chance.
Nice ending ;-P But really, thanks for sharing.
Any idea why GPT4All would download the .bin file and then say invalid model type?
I don’t know much about gpt4all
Great video! Please consider revisiting this topic with Ollama embedding and fulfill your promise to see this done in JavaScript. 😅
So, I find the problem with GPT is that it refuses I/O functionality. I'm looking to be able to say "open the PDF, read some details, and fill out 10 cells in Excel".
With GPT or with open-source LLMs? Usually there are two parts. There is the runner that runs the model. Then there are integration pieces that you write to get it to integrate with whatever you want to do. There are more and more tools out there that work with the runners to provide functionality similar to what you are wanting. So it doesn't refuse; the models themselves are just not designed to do that.
Which app do you use for coding Python?
VS Code, mostly
Why do I get a ModuleNotFoundError with both "import ollama" and "from langchain.llms import Ollama" after I have installed them both in the env? Why aren't they there? What am I missing?
Hmm, I don't know. Python is a crazy system, and it's very easy to have multiple versions of Python on your machine, with everything set up in one while you're actually using another. If you are more familiar with other languages, you can use those too.
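For anyone hitting the same ModuleNotFoundError: a quick stdlib-only way to check which interpreter you're actually running and whether it can see the packages in question. (The package names here are just the ones from the question above; compare the printed path with the Python that your pip belongs to, which `pip --version` will show.)

```python
import sys
import importlib.util

# The interpreter actually running this script. If this path differs
# from the Python your `pip install` targeted, that explains the error.
print(sys.executable)

# Check whether a package is importable from THIS interpreter,
# without actually importing it.
for name in ("ollama", "langchain"):
    found = importlib.util.find_spec(name) is not None
    print(f"{name}: {'found' if found else 'NOT found in this interpreter'}")
```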
the awkward silence at the end. 😄
The tutorial is great; just be careful with LangChain imports. Some of them are deprecated, and apparently that's quite common.
Is it something about Chroma? Because on my PC there is no Chrom, but I have chroma instead.
@@noname-deadend777 Maybe that too, but I'm specifically talking about the LangChain way to call Ollama. Funny thing, I have to use it now and I don't remember how it goes :'(
Can you do this with a PDF?
Hi Matt, I'm going crazy with this error: "ImportError: cannot import name 'Ollama' from 'langchain.llms'". I swear I installed langchain's latest version. When I look at the package contents, there's nothing called "Ollama" in the llms folder, so I understand there must be some problem there. I already tried pip install 'langchain[all]'.
Hey, can you post this in the Discord? It may be easier to support there. Go to ollama.ai and then click the Discord link at the top.
@@technovangelist Thanks! It turns out it was a problem with my Python installation.
How can I stream the response? I am using Streamlit but am unable to stream the response. I would appreciate your help.
```
chat_model = ChatOllama(
    base_url=ollamaPath,
    model=modelName,
    temperature=temperature,
    verbose=True,
    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
)
qa = RetrievalQA.from_chain_type(
    chat_model,
    chain_type="stuff",
    retriever=vectorstore.as_retriever(),
    chain_type_kwargs={"prompt": promptRepository.basicPrompt()},
)
qa({"query": prompt})['result']
```
Ollama defaults to streaming. With LangChain I think there is a stream function, but it's best to refer to their docs.
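To illustrate what "defaults to streaming" looks like on the wire: Ollama's /api/generate endpoint returns newline-delimited JSON chunks, each carrying a fragment of text in "response" and a "done" flag on the final chunk. A stdlib-only sketch of consuming that format (the sample lines below are made up for illustration, not a live response):

```python
import json

# Illustrative sample of Ollama's newline-delimited JSON stream.
sample_stream = [
    '{"model":"llama2","response":"Hello","done":false}',
    '{"model":"llama2","response":" world","done":false}',
    '{"model":"llama2","response":"","done":true}',
]

text = ""
for line in sample_stream:
    chunk = json.loads(line)
    text += chunk["response"]  # accumulate each streamed fragment
    if chunk["done"]:
        break

print(text)  # Hello world
```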
here's what works for me:
```
from langchain.llms import Ollama
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

llm = Ollama(
    model="llama2",
    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
    temperature=0.2,
)
```
How did you find the URL for Ollama?
ollama.ai
Can you please tell me what IDE you're using?
It's just VS Code. Does it look different?
@@technovangelist No, but first of all, I know quite little about Python, and second, I am experiencing an error when using Spyder and Anaconda. It says: ConnectionError: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/generate/ (Caused by NewConnectionError(' and others say it is some sort of problem with a container and localhost.
OK, I would assume that's because the server is not accessible in the way you are calling it. There are a number of reasons that could be. I'll look at creating a video in the next week that covers these issues.
That would be awesome!@@technovangelist
Wait a second, you need to start a server? I just did what was in the tutorial. Is there some important setup work to do? @@technovangelist
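For anyone else hitting connection errors like the one above: before digging into LangChain, it's worth checking whether anything is listening on Ollama's port at all. A minimal stdlib sketch (11434 is Ollama's default port; yours may differ if you changed the configuration):

```python
import socket

def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections at host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# If this prints False, the Ollama server (`ollama serve` or the
# desktop app) isn't running, or it's bound to a different address.
print(is_port_open("localhost", 11434))
```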
Hi Matt, thanks for this and the many other tutorials you've been putting together. I've found them really valuable. When trying to complete the above tutorial, I've hit an issue with the GPT4All library. It's throwing this error:
File "C:\Users\peppe\AppData\Local\Programs\Python\Python312\Lib\site-packages\gpt4all\_pyllmodel.py", line 339, in generate_embeddings
raise RuntimeError(f'Failed to generate embeddings: {msg}')
RuntimeError: Failed to generate embeddings: locale::facet::_S_create_c_locale name not valid
Not sure if you or others have come across this one or not. I've done a lot of Internet searching, and nothing related to that error seems to fix this for me. I've also noticed that there is an OllamaEmbeddings now, but that just hangs for me. No errors at all.
Really appreciate any thoughts.
I can confirm that this has been fixed and is working with the latest v2.3.2 release of gpt4all now. Just in case others see this same issue.