Love your content man, thanks! Can you make a video on function calling with some of the new models, locally with Ollama? It's one of the easiest tools to use programmatically, and you are one of the best explainers.
Using Praison Chat with Ollama isn't working. Praison Chat always displays a login page for me, for some reason.
How did you configure LM Studio to take advantage of the 128K context window and Metal on Mac?
Damn man this is off the hook. I wish there was a way to have it integrate with my windows file system and control programs
Which GPU do I need to run this model locally?
I have an Nvidia RTX P4000.
You can use that. If on Linux, just make sure you install the Nvidia drivers properly. I suggest using EndeavourOS or Pop!_OS for easy integration and updating of the Nvidia drivers.
Thank you. I'm not an expert and you made it simple, thank you very much!
How would you build a RAG system with Ollama for your company? And big thanks for your short but gorgeous content ❤
Can you advise me on a small model? I want to train it with my own data, for example "you are Einstein". What do you advise me to do? Thanks a lot.
What is the benchmark website?
Can you run this on an M3 with the 70B?
Do a video on RAG with big datasets, because this seems to be an issue. GraphRAG, let's say, works on small data, but as soon as you try to embed larger datasets it will not work. Furthermore, SQL databases have the same problem because of the 128k token context length...
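One common workaround for the context-length wall mentioned here is to chunk the dataset before embedding it, so every piece fits the embedding model's window. A rough sketch is below; whitespace-separated words stand in for real tokens, and the function name and parameters are illustrative, not from any particular library:

```python
def chunk_text(text: str, max_tokens: int = 512, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks that each fit an embedding model.

    Whitespace words stand in for tokens here; for real use, swap in a
    proper tokenizer so the count matches the model's limit.
    """
    words = text.split()
    chunks = []
    step = max_tokens - overlap  # advance by less than a full chunk to keep overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break  # last chunk reached the end of the document
    return chunks

doc = "word " * 1200  # pretend this is a big document
chunks = chunk_text(doc, max_tokens=512, overlap=50)
print(len(chunks))
```

Each chunk is then embedded separately, and retrieval pulls back only the few chunks relevant to a query, so the dataset size is no longer bounded by the model's context window.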
I want to understand 2 things, I hope you answer.
How does Ollama bring the models onto their platform so fast? Are they quantized? Compressed?
Do they keep the models on their servers so we go and download them from there?
Secondly, does Hugging Face not allow downloading the model locally?
Ollama and Hugging Face are working hard day and night to get these available on their platforms, from what I could see. You can download from both Ollama and Hugging Face.
What Ollama serves is generally a quantized version by default.
@@MervinPraison what does quantised mean?
@@user-im8bv8po2w quantized = reduced in precision.
Think of an image X with resolution R1. Reduce the number of pixels and you get resolution R2.
There is a difference between R2 and R1, but both still let you see the same image.
Same with a quantized model: it stores each weight in fewer bytes (e.g. 2 instead of 4). There is a loss, but not enough to stop the model working at roughly 90% of the original quality.
Sorry, best example I can think of.
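The resolution analogy can be sketched in a few lines of Python: store each "weight" in a single byte instead of a 4-byte float, then reconstruct it and measure the loss. This is only a toy illustration; real schemes such as the GGUF Q4/Q8 variants Ollama ships are more elaborate, but the trade-off is the same.

```python
import random

random.seed(0)
weights = [random.gauss(0, 1) for _ in range(1000)]  # pretend model weights

# Symmetric int8 quantization: map the range [-max, +max] onto [-127, 127],
# so each weight fits in one byte instead of four.
scale = max(abs(w) for w in weights) / 127.0
quantized = [round(w / scale) for w in weights]   # integers in [-127, 127]
restored = [q * scale for q in quantized]          # dequantize back to floats

# Rounding to the nearest step means the error per weight is at most scale/2:
# a 4x smaller model, with only a tiny blur on each value.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(f"worst-case error per weight: {max_error:.4f}")
```

The "blur" stays bounded by half a quantization step, which is why the model still behaves close to the original despite the smaller download.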
```json
{
  "data": {
    "memory": {
      "ram_capacity": "15.77 GB",
      "ram_unused": "3.38 GB"
    },
    "gpu": {
      "gpu_names": [
        "Quadro M2000"
      ],
      "vram_recommended_capacity": "4.00 GB",
      "vram_unused": "3.32 GB"
    },
    "os": {
      "platform": "win32",
      "version": "10.0.19045"
    },
    "app": {
      "version": "0.2.29",
      "downloadsDir": ""
    },
    "model": {}
  }
}
```
I'm getting this error. Any solution?
Some important things are missing from these non-GPT models: there is no spelling correction (it only shows you that you misspelled a word), you can't upload pics, you can't have the model speak instead of just typing, and you can't talk to it and have your audio turned into text. All of these things can be done with GPT-4.
My system is struggling so badly to run even the 8B version.
Nice video, but I am having issues installing PraisonAI. When I run pip install "praisonai [chat]", I get the error "the system cannot find the file specified".
Are you using Windows?
You might need to have Python and pip pre-installed.
I am on a Mac M2 Pro. It doesn't work; I get the Chainlit login window every time I enter a prompt. I set the model to llama 3.1.
Please use admin as both the username and the password.
@@MervinPraison thank you
What is PraisonAI doing? I don't actually want to be tied to anything proprietary at this stage, as I am still learning all the base tooling.
Except it doesn't work:
pulling manifest
Error: pull model manifest: file does not exist