Mosleh Mahamud
Joined 19 September 2017
Subscribe for regular AI content
Fine Tuning Qwen 2 with Custom Data
Have questions or ideas, or want to meet similar people?
Join the Discord: discord.gg/R3dPsd2E
Don't fall behind the AI revolution. I can help integrate machine learning/AI into your company.
mosleh587084.typeform.com/to/HSBXCGvX
Notebook links:
Why Fine Tune?
Fine-tuning Qwen 2, a large language model (LLM), is essential for optimal performance and customization. It improves accuracy and efficiency for specific tasks like customer support and content creation. Tailoring the model to industry-specific needs enhances its understanding of specialized terminology and context. Fine-tuning also reduces biases and ensures ethical compliance, providing fair and appropriate responses. Regular updates keep the model relevant with new data and trends. Additionally, it improves interpretability and control, aiding in debugging and continuous improvement. Ultimately, fine-tuning Qwen 2 offers superior user experiences, strategic business advantages, and cost efficiency.
What is Qwen 2?
Qwen 2 is a series of large language models developed by Alibaba Cloud, designed to excel in various AI tasks. The Qwen 2 models range in size from 0.5 billion to 72 billion parameters, making them versatile for applications such as language understanding, generation, multilingual tasks, coding, and mathematics.
The Qwen 2 series boasts significant improvements in performance and efficiency. Leveraging advanced techniques like Group Query Attention, these models deliver faster processing with reduced memory usage. They support extended context lengths up to 128K tokens, enhancing their capability to manage long-form content.
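The memory saving from Group Query Attention can be sketched with simple arithmetic: GQA shares each key/value head across a group of query heads, which shrinks the KV cache that dominates memory at long context lengths. The head counts and layer count below are hypothetical, not Qwen 2's actual configuration.

```python
# Toy illustration (hypothetical config, not Qwen 2's real one): grouped-query
# attention (GQA) keeps fewer key/value heads than query heads, so the KV cache
# that grows with context length is proportionally smaller.

def kv_cache_bytes(num_kv_heads, head_dim, context_len, num_layers, bytes_per_param=2):
    """Size of the K and V caches for one sequence, in bytes (fp16 by default)."""
    return 2 * num_layers * context_len * num_kv_heads * head_dim * bytes_per_param

# Hypothetical model: 32 query heads of dim 128, 32 layers, 128K-token context.
mha = kv_cache_bytes(num_kv_heads=32, head_dim=128, context_len=128_000, num_layers=32)
gqa = kv_cache_bytes(num_kv_heads=8, head_dim=128, context_len=128_000, num_layers=32)

# With 8 KV heads instead of 32, the cache is 4x smaller.
print(f"MHA cache: {mha / 1e9:.1f} GB, GQA cache: {gqa / 1e9:.1f} GB")
```

This is why GQA matters most at the 128K context lengths mentioned above: the KV cache scales linearly with both context length and the number of KV heads.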
Trained on data in 29 languages, including English, Chinese, German, Italian, Arabic, Persian, and Hebrew, Qwen 2 models excel in multilingual tasks. They have demonstrated superior performance on various benchmarks, surpassing other leading open-source models in language understanding and generation tasks.
Qwen 2 models are also designed with responsible AI principles in mind, incorporating human feedback to align better with human values and safety standards. They perform well in safety benchmarks, effectively handling unsafe multilingual queries to prevent misuse related to illegal activities.
These models are available on platforms like Hugging Face and Alibaba Cloud’s ModelScope, facilitating easy deployment for both commercial and research purposes.
Views: 9
Video
NV-Embed-v1: Best Embeddings Model To Use 2024
101 views · 2 hours ago
Have questions or ideas, or want to meet similar people? Join the Discord: discord.gg/R3dPsd2E Don't fall behind the AI revolution. I can help integrate machine learning/AI into your company. mosleh587084.typeform.com/to/HSBXCGvX Timestamps: Intro 0:00 MTEB Leaderboard 0:27 Extracting embeddings 1:27 Different embedding methods 3:15 NV-Embed-v1 by NVIDIA: NV-Embed-v1 is a generalist embedding model that r...
Deploying Qwen 2 Model With AWS
58 views · 4 hours ago
This video shows different deployment strategies for Qwen 2, using the easiest methods available: one cost-effective and one expensive. Qwen 2, regardless of size, can be deployed on AWS, GCP, or Azure. Have questions or ideas, or want to meet similar people? Join the Discord: discord.gg/R3dPsd2E Don't fall behind the AI revolution. I can help integrate machine learning/AI into your company. mosleh587...
Building RAG With Qwen 2
863 views · 7 hours ago
Have questions or ideas? Join the Discord: discord.gg/R3dPsd2E Don't fall behind the AI revolution. I can help integrate machine learning/AI into your company. mosleh587084.typeform.com/to/HSBXCGvX Notebook: github.com/mosh98/RAG_With_Models/blob/main/Simple RAG/Qwen2_Lanchain_RAG_DEMO.ipynb Hugging Face model card: huggingface.co/Qwen/Qwen2-72B Ollama repo: ollama.com/library/qwen2 The Qwen 2 m...
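The retrieval half of a RAG pipeline can be sketched without any LLM at all. The notebook above uses LangChain with a real embedding model; in this minimal stand-in, TF-IDF vectors replace learned embeddings so the example runs with scikit-learn alone, and the documents are made up for illustration.

```python
# Toy sketch of the retrieval step in a RAG pipeline. A real setup would use a
# learned embedding model; TF-IDF stands in here so this runs offline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "Qwen 2 supports context lengths up to 128K tokens.",
    "The Eiffel Tower is located in Paris.",
    "Qwen 2 models range from 0.5B to 72B parameters.",
]

vectorizer = TfidfVectorizer()
doc_vecs = vectorizer.fit_transform(docs)

def retrieve(query, k=2):
    """Return the k documents most similar to the query."""
    q_vec = vectorizer.transform([query])
    scores = cosine_similarity(q_vec, doc_vecs)[0]
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [docs[i] for i in ranked[:k]]

# The retrieved passages would then be stuffed into the LLM prompt as context.
print(retrieve("How long a context does Qwen 2 support?")[0])
```

The generation step simply concatenates the retrieved passages with the user's question and sends them to the model (Qwen 2 via Ollama, in the video's case).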
Nvidia Nim: Deploy Open Source LLMs with 1 click
102 views · 9 hours ago
Have questions or ideas? Join the Discord: discord.gg/R3dPsd2E NVIDIA NIM (NVIDIA Inference Microservices) offers numerous benefits for businesses deploying AI models at scale. First, it leverages optimized inference engines tailored to specific models and hardware, improving latency and throughput while reducing operational costs and improving user experiences. NIM is part of the NVIDIA AI En...
Classifying Sentences Using Nomic Embed Text
72 views · 14 hours ago
Don't fall behind the AI revolution. I can help integrate machine learning/AI into your company. AI Freelancing: mosleh587084.typeform.com/to/HSBXCGvX Have questions or ideas? Join the Discord: discord.gg/R3dPsd2E This video shows how to get embeddings using nomic-embed-text locally with Sentence Transformers, and how to classify them using statistical models from sklearn. Model card: huggingface.c...
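The classification step can be sketched independently of the embedding model: once every sentence is a vector, any scikit-learn classifier can be trained on top. Random vectors stand in for real nomic-embed-text embeddings below so the example runs offline; the two "classes" are just shifted Gaussians.

```python
# Toy sketch: classify sentence embeddings with a statistical model.
# Random vectors stand in for real nomic-embed-text embeddings.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Pretend embeddings: two well-separated classes of 8-dim vectors.
X_pos = rng.normal(loc=+1.0, size=(50, 8))   # e.g. "positive" sentences
X_neg = rng.normal(loc=-1.0, size=(50, 8))   # e.g. "negative" sentences
X = np.vstack([X_pos, X_neg])
y = np.array([1] * 50 + [0] * 50)

clf = LogisticRegression().fit(X, y)
print(clf.score(X, y))  # well-separated toy data, so accuracy is near 1.0
```

With real embeddings the workflow is identical: encode the sentences, stack the vectors into `X`, and fit whatever sklearn estimator fits the task.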
Fine tuning Embeddings Model
379 views · 1 day ago
Fine-tuning with the new Sentence Transformers v3.0. Notebook: github.com/mosh98/RAG_With_Models/blob/main/Fine-Tune/Fine_tuing_embeddings_model_DEMO.ipynb I can help integrate machine learning/AI into your company. mosleh587084.typeform.com/to/HSBXCGvX Have questions or ideas? Join the Discord: discord.gg/R3dPsd2E In this video you will learn: 1. Fine-tuning an embeddings model 2. What types of data...
Fine-Tuning PaliGemma With Custom Data
189 views · 1 day ago
I can help integrate machine learning/AI into your company. mosleh587084.typeform.com/to/HSBXCGvX Notebook found here: github.com/mosh98/RAG_With_Models/blob/main/Fine-Tune/Fine_tune_PaliGemma_Demo.ipynb This video is about fine-tuning PaliGemma with a VQA dataset from Hugging Face. Unlock the power of AI with PaliGemma, Google's state-of-the-art vision-language model! This video dives deep into th...
Fine Tuning Mistral v3.0 With Custom Data
2.2K views · 14 days ago
Don't fall behind the AI revolution! I can help integrate machine learning/AI into your company. mosleh587084.typeform.com/to/HSBXCGvX Mistral fine-tuning: github.com/mistralai/mistral-finetune Have questions or ideas? Join the Discord: discord.gg/R3dPsd2E This video is about fine-tuning the Mistral v3 model with custom data. Mistral v3 is a new model with many benefits. The Mistral v3.0...
Building RAG with Mistral v0.3
996 views · 14 days ago
Don't fall behind the AI revolution! I can help integrate machine learning/AI into your company. mosleh587084.typeform.com/to/HSBXCGvX Code: github.com/mosh98/RAG_With_Models/blob/main/Simple RAG/Mistralv3_Lanchain_Ollama_RAG.ipynb This video is about getting embeddings from the Mistral v3 model using Ollama. Mistral v3 is a new model with many benefits. The Mistral v3.0 model brings sign...
Get Embeddings From Mistral v0.3 Locally
377 views · 14 days ago
Notebook: github.com/mosh98/RAG_With_Models/blob/main/Simple RAG/Mistralv3_Lanchain_Ollama_RAG.ipynb This video is about getting embeddings from the Mistral v3 model using Ollama. Mistral v3 is a new model with many benefits. The Mistral v3.0 model brings significant advancements in AI technology with its new architectural features, including Sliding Window Attention and Grouped Query Att...
Building AI Assistant for My SEO Work: Crew AI
328 views · 14 days ago
Let me know if I could improve my SEO agent; still working on improving it. Don't fall behind the LLM revolution. I can help integrate machine learning/AI into your company. AI Freelancing: mosleh587084.typeform.com/to/HSBXCGvX What is Crew AI? CrewAI is a powerful framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI enables agents to w...
Text Classification Using Llama 3
828 views · 21 days ago
Notebook: github.com/mosh98/Embedding_Classification/blob/main/Llama3_Embeddings_classify DEMO.ipynb Don't fall behind the AI revolution. I can help integrate machine learning/AI into your company. AI Freelancing: mosleh587084.typeform.com/to/HSBXCGvX This video shows how to get embeddings using Llama 3 locally with Ollama, and how to classify them using statistical models from sklearn. Why use Llama...
Get Embeddings From Falcon 2
137 views · 21 days ago
Don't fall behind the LLM revolution. I can help integrate machine learning/AI into your company. AI Freelancing: mosleh587084.typeform.com/to/HSBXCGvX Code: github.com/mosh98/RAG_With_Models/blob/main/GPT4o_Lanchain_RAG.ipynb Falcon 2 is an 11B-parameter model that is supposedly outperforming Llama 3. The Falcon-11B model was developed by the Technology Innovation Institute (TII). This state-of-the-art langu...
Advanced RAG: Ensemble Retrieval
1.6K views · 21 days ago
Don't fall behind the LLM revolution. I can help integrate machine learning/AI into your company. AI Freelancing: mosleh587084.typeform.com/to/HSBXCGvX Code: github.com/mosh98/RAG_With_Models/blob/main/Simple RAG/GPT4o_Lanchain_RAG.ipynb When building RAG (Retrieval-Augmented Generation) applications, choosing the right retrieval parameters and strategies is crucial. Options range from chunk si...
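Ensemble retrieval combines the rankings of several retrievers, typically a keyword retriever such as BM25 and a dense-embedding retriever. One common fusion rule is reciprocal rank fusion (RRF), which is what LangChain's EnsembleRetriever uses under the hood. The sketch below assumes two precomputed rankings with made-up document IDs.

```python
# Toy sketch of ensemble retrieval via reciprocal rank fusion (RRF).
# Each retriever contributes a ranked list; a document's fused score is the
# sum of 1/(k + rank) over every list it appears in.

def rrf(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one fused ranking."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical output of a BM25 retriever and a dense-embedding retriever.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
dense_ranking = ["doc_c", "doc_a", "doc_d"]

# doc_a ranks high in both lists, so it comes out on top of the fused ranking.
print(rrf([bm25_ranking, dense_ranking]))
```

The constant `k` (conventionally 60) damps the influence of any single retriever's top ranks, so a document that is merely decent in both lists can beat one that is first in only one.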
Building RAG with Llama 3 using LlamaIndex
679 views · 1 month ago
Llama 3 vs Claude 3 Benchmark Comparison
601 views · 1 month ago
Simply Explained: Retrieval-Augmented Generation
286 views · 6 months ago
What about confidential data?
Disliked, Jokic is the GOAT
Hahaha, he got handled by ANT though
Thanks. Do you know how to use PaliGemma without Hugging Face? For example, I download the PaliGemma models to my PC and need to run inference on my GPU with no internet connection.
Hi, thanks for a great video. It worked well, but how do I save those embeddings? I am using static data and it won't change.
Why would I use this if I can just copy the link, insert it into ChatGPT, and ask the same question?
You're probably right. This is just a toy example. But what if you want to use various complex data sources and care about data security? Would you still use ChatGPT?
Please make more RAG applications with advanced techniques and larger-context models.
Hi, advanced RAG videos are in the works. Coming soon.
Very nice tutorial; I was actually looking into NIM and found your video. I have a few questions I want to ask you. Can I get your Discord?
Hi, glad you liked the videos. A lot of people have been asking about my Discord, so I made a new Discord server. You'll find me there: discord.gg/R3dPsd2E
Could you please post your code used in this tutorial?
Hi, just posted it
@@moslehmahamud9574 Thank YOU!
And what if my data is not in that format? I have a few law judgements and it's not possible to format the data in that way.
Thanks for the video. But what models, with what configuration, could be trained with a free-tier GPU? Maybe Phi-3 mini?
Good idea, I'll take a look. But Colab is the cheapest alternative on the market right now.
@@moslehmahamud9574 hmm, okay, thx!
Great video! Is it only for English?
Thanks! You can train on other languages too; make sure to pick a multilingual model.
Hi, thank you for this video. I am using a server, and we are not allowed to install anything on the base system; we have to create Docker containers. I installed Ollama in Docker with the command docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama and I am able to run Llama 3 with the command docker exec -it ollama ollama run llama3. Now, could you please tell me how I can follow your approach to use Ollama for embeddings? I want to use Llama 3 from Ollama as the embedding model, like you did in the video.
Create a video on how to fine-tune multi-modal LLM models using custom image datasets.
Interesting idea, I'll look into it.
Hi Mosleh, I tried to get an appointment with you but your link doesn't work.
Hi Jean, just opened a slot for you, the link should work now.
thank you
Good video. LBJ is far from GOAT though
Well, different folks, different strokes ;) Glad you liked the video
Can you implement evaluation with Qdrant, RAGAS, or some other favorite framework: LangChain, Langfuse (open-source alternative to LangSmith)?
Hi, thanks for the suggestions, I'll write them down. I did make some around RAGAS though; you can find them on the channel.
@@moslehmahamud9574 Looking forward to it. Also check out Aporia for RAG hallucination; make a video if you can, as I don't have a company email to sign up.
You could have used a pre-trained embedding model for feature extraction. I believe the results would be better than with a text-generation model such as Llama 3.
Great idea. I wanted to see if it would perform well with Llama 3.
Could you suggest some pre-trained embedding models for feature extraction/classification?
@@moslehmahamud9574 Any update on this?
Could you please explain how to do it? I would like to test out this idea? Thanks!
@@moslehmahamud9574 Do you have a video for this?
I couldn't see any day I can book; don't you have a Discord or something?
Hi, I've added some slots, you should find them now.
Thank you. Also, why use Mistral v3 vs nomic-embed?
Good question. nomic-embed is a strong embeddings model. However, it could be useful to try the Mistral v3 model just to experiment. It could be better or worse for different use cases.
If you have a pretrained model and you're using it, why would you use foreign embeddings from another model in your RAG system? Are you going to pay for something you already have? Or is your model not good enough to provide you the embeddings it uses for prediction? Perhaps you should also use the model's tokenizer as well? It really sounds silly when people have separate models for separate tasks when one model can handle the job. Crazy thinking!
@@moslehmahamud9574 To use it for embedding alone is not correct, but if this is the model you are using as a pretrained base then it makes sense to use the same embeddings as the model itself, as they have been trained. In fact, when you fine-tune your model you update these embeddings too, so you're training your embeddings model as well. So if you have trained your model to handle code or other custom data, then obviously your embedding space is also trained for this, but the original base model may be very far from your fine-tuned model; hence the embedding space is personalized to the model, especially for fine-tuned models of the same type. For example, my Mistral and yours, both 7B instruct, both trained on personal lines, will have different embeddings.
@@moslehmahamud9574 nomic is a great choice among embedding-only models. The question is how to replace your tokenizer with a sentence transformer created like this: can that be the final tokenizer?
Can you do a tutorial on how to fine-tune Falcon 2? There isn't much content on it
fantastic idea! I'll look into it
Simple but awesome explanation.
I work very hard to keep it simple, thanks!
By specifying the model in the ollama.embeddings() call and in the OllamaEmbeddings class, what goes on behind the scenes and how is that model utilized in that scenario? Are there advantages to different models specified for the embedding process?
Very insightful question. I'm assuming the embeddings are extracted from the linear layer at the end of the Llama architecture (this is an assumption, of course). Regarding the advantage part, it depends on your use case, but using these embeddings can be an additional experiment. It could also be useful to check other embeddings models too.
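Whichever model produces the vectors, comparing them downstream usually comes down to cosine similarity. A minimal sketch with numpy; the short vectors below are made up for illustration, not real model outputs.

```python
# Minimal sketch: however the embeddings are produced (Llama via Ollama, a
# dedicated embedding model, etc.), comparison is usually cosine similarity.
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors, in [-1, 1]."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

v_query = [0.2, 0.9, 0.1]
v_close = [0.25, 0.8, 0.05]   # nearly parallel: similarity close to 1
v_far   = [-0.9, 0.1, 0.4]    # mostly opposite direction: low similarity

print(cosine(v_query, v_close), cosine(v_query, v_far))
```

Because cosine similarity only measures direction, two models can produce numerically different embeddings for the same sentence yet still rank neighbors similarly, which is why swapping embedding models is a reasonable experiment.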
Hey buddy, I've been following you for a while now on this channel. Can you help me improve my RAG? I have everything ready and working, I just want to optimize it.
Hi, I made some videos on optimizing RAG architecture. If you need help with something specific, feel free to book a meeting. The link is in the description; some slots will be opening very soon.
Hi Mosleh, thanks for this. I keep getting an error when trying this with Llama 3, but it worked perfectly with Llama 2. What could be the reason? I have both Llama 2 and Llama 3 installed. Actually, I tried first with Llama 3 and then installed and tried with Llama 2. With Llama 3 I keep getting an error that it cannot establish a connection, even though I can see that Llama 3 is running on port 11434.
Hmm, quite an unusual problem. Maybe try running Llama 3 on a different port?
Can you make a video of this in VS Code?
what tools do you use to generate the mind map?
MindMeister
Thank you!
Gr8 video, super useful :)
You ran Llama 3 8B locally on a Mac?!
Yes sir
Thank you for your explanation. I will upload your video to ChatGPT so it can do the understanding part for me.
Thanks! Any part of the video that was not so easy to understand? Maybe I could improve it.
@@moslehmahamud9574 It was just a joke about overusing IA for everything. I don't even know what RAG is. :D
Yeah okay, I fell for that 😂
Great video 🫡 Could I expect an advanced tutorial on building a chatbot using the GPT-4o API and implementing RAG?
Thanks! I'm working on an advanced RAG implementation as of writing. Although, I did make an advanced RAG technique video on my channel; hope you find it useful.
@@moslehmahamud9574 Do advanced RAGs search the internet if they can't find it in the data I've provided?
Hey! What are your computer specs? Wondering how that may affect speed, either positively or negatively.
Hey! Using an M1 MacBook Pro (2020). Works decently for basic inference; training is a bit of a struggle, as expected. Let me know if you have any tips.
Getting error: ReadTimeout
How do I solve it?
Hard to say my friend. Hope you figure it out :)
😊
Cool
Hi man! Nice explanation. I was also trying to do that on my end, but I am getting some validation errors: pydantic.v1.error_wrappers.ValidationError: 2 validation errors for LLMChain: llm, instance of Runnable expected (type=type_error.arbitrary_type; expected_arbitrary_type=Runnable). I don't know why it's happening. Can you please tell me how to resolve this?
Quite unsure, to be honest. Did you end up solving it?
How is this different from VisualBERT?
Is it true that the embedding values of the three methods are different for the same sentence?
possible
Thanks for sharing :) In the initial package installation I also had to run `pip install llama-index-embeddings-ollama` in order to run `from llama_index.embeddings.ollama import OllamaEmbedding`.
Thanks! That's correct, I'll add it to the notebook. Forgot that I had installed it before.
In Brazilian Portuguese: czcams.com/video/0VtGC_N3Rvk/video.htmlsi=s5QDIeDhtLmYlZxb
What about doing it without GPT-4?
Hello, what if I wanted to use the RAG agent for a CRM website? How would that work?
Could be worth a shot!
I, too, have embarked upon the arduous journey of RAG, encountering a similar quandary wherein the retrieval mechanism, much to my chagrin, procures contextual information that bears little to no relevance to the query at hand. My hypothesis is that the embedding model, the very foundation upon which this edifice is built, is the root of this discrepancy. As my endeavors lie within the realm of the Indonesian language, I humbly beseech thee for any sagacious counsel or erudite suggestions that may illuminate the path towards a resolution.
Phi-3 was specifically trained on a smaller quantity of higher-quality data than similarly performing models, which means there was a focus on English-language data in training.
As the other commenter mentioned, you could train the embeddings model on your domain, but also maybe test your retrieval using various evaluation metrics. Maybe RAGAS or something similar?
Amazing habibi
Habibi! 💯
Hi! thanks for the great example. Have you tried to ask more questions e.g. "Which team has the most 3-point shots?"?
Hi, good question! I took the NBA post-season stats from ESPN and did multiple runs, but it unfortunately did not give a good response. Recall could be its weak spot.
Nice
I was a little scared when your voice changed at 2:18, lol. But good video overall, thank you.
Hahaha, sorry to scare you. Glad you enjoyed it.
Great ❤ Awesome lesson. Please look at fine-tuning an existing model with many documents for Ollama models. I looked everywhere and couldn't find a tutorial that doesn't use an API or LangChain.
Thanks! I'll make sure to read up on it