Full text tutorial: www.mlexpert.io/prompt-engineering/private-gpt4all
Get the Google Colab notebook: github.com/curiousily/Get-Things-Done-with-Prompt-Engineering-and-LangChain
Prompt Engineering Guide: www.mlexpert.io/prompt-engineering
Thank you for watching!
GPT4All is good and I tested it, but it's really slow. According to Gartner, the coming year in tech will focus on Edge AI, which means these models should be optimized to run on embedded PCs... even on phones. But "time will tell," as the saying goes :). Great job on the video. It's super.
Nice Wu-Tang shirt and great content. Thanks!
This is an excellent and comprehensive demonstration. Thank you for being realistic. You have described the performance limitations of this experiment well, but the privacy still makes it attractive.
It immediately asked for permission to read all sources of data (documents, desktop, etc.). There is a workaround, but that was chilling. (macOS GUI)
Can you please make a video on querying a Pandas DataFrame with LangChain and GPT4All?
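In the meantime, a rough sketch of how that could look with LangChain's Pandas DataFrame agent (not the author's approach). The CSV file name and question are hypothetical, and the agent only works as well as the model's ability to emit valid Python, which small local models often struggle with:

import pandas as pd
from langchain.agents import create_pandas_dataframe_agent
from langchain.llms import GPT4All

# local GPT4All model, as used in the tutorial
llm = GPT4All(model="./models/ggml-gpt4all-j-v1.3-groovy.bin")

df = pd.read_csv("sales.csv")  # hypothetical data file
agent = create_pandas_dataframe_agent(llm, df, verbose=True)
print(agent.run("How many rows have revenue above 1000?"))

Note that in newer LangChain releases this agent moved to langchain_experimental, so the import path may differ.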
Woah. This is so cool! Thanks for the video.
Thank you so much! This is really awesome content!
Thanks man! Go ahead.
I am trying to run this on a server and I am getting this issue:
Using embedded DuckDB with persistence: data will be stored in: db
Illegal instruction (core dumped)
Can you help me out with this?
Your solution is superb and I really enjoyed it, but I want to ask one thing: the query time with this code is too long. How can we reduce it?
you can use other models instead of the huge gpt4all models, e.g.:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline
from langchain.llms import HuggingFacePipeline

# path to a locally downloaded LaMini-Flan-T5-248M model
model_path = "/LaMini-Flan-T5-248M"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSeq2SeqLM.from_pretrained(model_path)

pipe = pipeline(
    "text2text-generation",
    model=model,
    tokenizer=tokenizer,
    max_length=512, do_sample=True, temperature=0.2,
)
llm = HuggingFacePipeline(pipeline=pipe)
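To connect that back to the document-querying setup from the video, here is a minimal sketch of dropping that smaller llm into a retrieval chain. The toy document text, embedding model, and question are placeholders, not the tutorial's exact code:

from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA

# embed a toy document and query it with the llm defined above
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
db = Chroma.from_texts(["The company paid a dividend of 1.20 per share in 2022."], embeddings)

qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=db.as_retriever(search_kwargs={"k": 1}),
)
print(qa.run("What dividend was paid in 2022?"))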
I am trying this, but it does not give any output, only a connection timeout. Why? Could you help me with this?
I got the following answer:
"I do not have access to information regarding specific companies' financials or dividends paid in previous years due to data privacy concerns and restrictions imposed by laws such as GDPR (General Data Protection Regulation). However, you can find dividend amounts for a company's historical period on their website if available."
LOL, what am I missing?
I am getting this error,
llama_model_load: loading model from './models/ggml-gpt4all-j-v1.3-groovy.bin' - please wait ...
llama_init_from_file: failed to load model. Please help me to get this resolved.
Explained thoroughly; however, it quickly became outdated and the code cannot run properly now.
Can I use the new LLaMA 2 model to chat with PDFs? Or is this one already good and better?
Is it normal for a request to take around 10 minutes? I think that's a very long time. Are there other methods for this task? Can I run it on a GPU?
How many files do you think this would be able to successfully query?
I'm a total beginner. Once all packages are saved locally, can I use this offline, or do I need to be connected to the Internet?
Does it compromise my data?
Did you find any way to speed up the response?
Your video was really inspiring for me, many many thanks. I use a Google Colab environment with 35 GB RAM. When I ask a question for the second time, it crashes due to RAM limits. Do you know why? I would also like your opinion on the following. I have a CSV file where each row holds a mathematical definition (generally, some text). What would be the best way to return the text that is most relevant to the user's question? Should I use GPT4All with LangChain, or is there a simpler way to do it? One problem I am facing is that I want it to work for the Greek language; so far I can't find an LLM that works with Greek, only GPT from OpenAI, but I want an open-source solution. I have been struggling with this for many years now. :(
you can use other models instead of the huge gpt4all models, e.g.:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline
from langchain.llms import HuggingFacePipeline

# path to a locally downloaded LaMini-Flan-T5-248M model
model_path = "/LaMini-Flan-T5-248M"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSeq2SeqLM.from_pretrained(model_path)

pipe = pipeline(
    "text2text-generation",
    model=model,
    tokenizer=tokenizer,
    max_length=512, do_sample=True, temperature=0.2,
)
llm = HuggingFacePipeline(pipeline=pipe)
Thanks @hillhitch for your response, I will try it. Is there any chance you know an open-source LLM which supports the Greek language? Until now I have only found some fine-tuned GPT-2 models,
"nikokons/gpt2-greek" and "lighteternal/gpt2-finetuned-greek", and tried to fine-tune them further for my task, but with no luck ☹ since my data is small.
@georgekokkinakis7288 let me check for you
Have you checked the ones on huggingface?
Yes. The ones I mentioned above are from huggingface.
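On the Greek CSV question above: one option that avoids needing a Greek LLM entirely is to embed the rows with a multilingual sentence-embedding model and return the closest match. A rough sketch, assuming sentence-transformers is installed; the file name, column name, and sample question are hypothetical:

import pandas as pd
from sentence_transformers import SentenceTransformer, util

# multilingual embedding model that covers Greek
model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

df = pd.read_csv("definitions.csv")  # hypothetical file
texts = df["definition"].tolist()    # hypothetical column name
corpus_emb = model.encode(texts, convert_to_tensor=True)

query = "Τι είναι η παράγωγος;"  # user's question in Greek ("What is the derivative?")
query_emb = model.encode(query, convert_to_tensor=True)

# cosine similarity between the question and every row; print the best match
best = util.cos_sim(query_emb, corpus_emb).argmax().item()
print(texts[best])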
Does it work in any language?
GPT4All only works on CPU so far ;)
How can I reduce the response time?
you can use other models instead of the huge gpt4all models, e.g.:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline
from langchain.llms import HuggingFacePipeline

# path to a locally downloaded LaMini-Flan-T5-248M model
model_path = "/LaMini-Flan-T5-248M"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSeq2SeqLM.from_pretrained(model_path)

pipe = pipeline(
    "text2text-generation",
    model=model,
    tokenizer=tokenizer,
    max_length=512, do_sample=True, temperature=0.2,
)
llm = HuggingFacePipeline(pipeline=pipe)
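Besides swapping in a smaller model, two levers worth trying in the original setup are giving GPT4All more CPU threads and retrieving fewer chunks. A sketch under those assumptions; the exact GPT4All constructor arguments differ between LangChain versions, and db stands for the Chroma store built in the tutorial:

from langchain.llms import GPT4All

# more CPU threads usually helps on multi-core machines
llm = GPT4All(
    model="./models/ggml-gpt4all-j-v1.3-groovy.bin",
    n_threads=8,
)
# in the retrieval chain, asking for fewer chunks also shortens the prompt,
# e.g. retriever = db.as_retriever(search_kwargs={"k": 2}) on the Chroma store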
Does it run on the CPU?
It's still not 100% local.
It's toooo slow
Excellent video, but these free models are useless.
Can you please make these videos?
1. Fine-tuning the 4-bit quantized models, e.g., anon8231489123/vicuna-13b-GPTQ-4bit-128g, step by step
2. The top 4 open-source embedding models, how to fine-tune them, and how to use them in LangChain
3. Scaling with LangChain: how to have multiple sessions with the LLM, meaning a server hosting the LLM and serving multiple people concurrently. What would the system requirements be to run such a setup? I believe we will need Kubernetes for the scaling.
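On point 3, a minimal sketch of the single-server part, assuming FastAPI and the tutorial's GPT4All model. The endpoint and request shape are made up, and real concurrency would come from running several replicas (e.g., on Kubernetes) behind a load balancer rather than from one process:

from fastapi import FastAPI
from pydantic import BaseModel
from langchain.llms import GPT4All

# one shared model instance, loaded once at startup
llm = GPT4All(model="./models/ggml-gpt4all-j-v1.3-groovy.bin")

app = FastAPI()

class Query(BaseModel):
    question: str

@app.post("/ask")
def ask(query: Query):
    # CPU inference is the bottleneck; each replica handles requests one at a time
    return {"answer": llm(query.question)}

Started with uvicorn, this gives one worker; scaling out is then a matter of running more replicas of the same service.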