Merge LLMs to Make Best Performing AI Model
- Added on June 1, 2024
- This video is about mergekit: how to choose and blend models. It's non-technical, but links to technical papers are included. You need to know how to navigate the terminal, but no programming is required.
🤖 Join my Discord community: / discord
📰 My tutorials on Medium: / mayaakim
🐦 My twitter profile: / maya_akim
To rent a GPU from Massed Compute (mergekit preinstalled) follow the link ⤵️
bit.ly/maya-akim
Code for 50% discount: MayaAkim
All links:
mergekit:
github.com/arcee-ai/mergekit
Open LLM Leaderboard
huggingface.co/spaces/Hugging...
my huggingface profile (with model configs you can copy):
huggingface.co/mayacinka
git installation:
gitforwindows.org/
lfs installation:
docs.github.com/en/repositori...
supported architecture for mergekit:
github.com/arcee-ai/mergekit/...
best blog about mergekit:
/ merge-large-language-m...
other really good blog about mergekit:
/ merge-large-language-m...
Charles Goddard’s blog: (author of mergekit)
goddard.blog/about/
Mona lisa with Mohawk
www.designboom.com/technology...
What is YAML:
www.techtarget.com/searchitop...
What is Data Contamination:
bdtechtalks.com/2023/07/17/ll...
Goodhart's law:
www.cna.org/reports/2022/09/g...
LazyMergekit:
colab.research.google.com/dri...
Auto evaluation: (requires runpod profile)
colab.research.google.com/dri...
configuration with 14 models merged:
huggingface.co/EmbeddedLLM/Mi...
MoE instructions:
github.com/arcee-ai/mergekit/...
higher density - better results
github.com/arcee-ai/mergekit/...
Model family tree: (visualization)
colab.research.google.com/dri...
huggingface.co/spaces/mlabonn...
cost of training mistral:
www.ft.com/content/387eeeab-1...
Leaderboard is disgusting:
/ open_llm_leaderboard_i...
Merging models with different architectures:
arxiv.org/pdf/2401.10491.pdf
merging models different arch:
github.com/18907305772/FuseLLM
Blending is all you need:
arxiv.org/pdf/2401.02994.pdf
Model soups
arxiv.org/pdf/2203.05482.pdf
Ties-merging research paper:
arxiv.org/pdf/2306.01708.pdf
Dare merge research paper:
arxiv.org/pdf/2311.03099.pdf
Task arithmetic:
arxiv.org/pdf/2212.04089.pdf
Benchmarks
Arc benchmarks
deepgram.com/learn/arc-llm-be...
arxiv.org/pdf/1803.05457.pdf
HellaSwag
arxiv.org/pdf/1905.07830.pdf
MMLU
arxiv.org/pdf/2009.03300.pdf
TruthfulQA
arxiv.org/abs/2109.07958
WinoGrande
arxiv.org/pdf/1907.10641.pdf
GSM8K
arxiv.org/pdf/2110.14168.pdf
overfitting problem Ann Lotz:
arstechnica.com/tech-policy/2...
Benchmarks are a problem screenshots:
analyticsindiamag.com/the-pro...
/ llm_benchmarks_are_bro...
/ llm_benchmarks_are_bul...
Attributions:
commons.wikimedia.org/wiki/Fi...
Timecodes:
0:00 - 1:47 - blending intro
1:48 - 3:36 - promise of blending
3:37 - 4:22 - blending steps and requirements
4:23 - 5:05 - all you need is hardware
5:06 - 5:30 - mergekit installation
5:31 - 9:23 - merge methods
10:48 - 13:31 - configurations and yaml
13:32 - 14:38 - how to run merge
14:39 - 14:42 - upload merged model
14:43 - 16:27 - best merge method
16:28 - 20:16 - benchmark problems, overfitting and contamination
#mergekit #llm #localmodels - Science & Technology
i hope you find the video useful and don't forget to show (and brag about) your blended models!
Thank you. Found this from your post on X.
Your videos are always very good and cutting edge.
Thanks for the very informative video.
Cheers from "Down Under"!
The videos are so great! I will humbly learn as much as I can!
Awesome video!! Been following your series on building AI agents and they're very good! Thanks for sharing!
It's mind-blowing to me how good your videos are, yet you are still so unknown. Keep it up!
Excellent walk through, thanks !
Great video Maya! Keep em coming! 😎
Fantastic video, so well prepared, explained in a foolproof way, and a really cutting-edge topic. Best AI YouTuber out there - thank you 🙏
This is destined to evolve into a meticulously curated, go-to channel of human reliability for years to come. Thank you very much for the exceptional quality you provide!
Wow, what an incredible explanation of merge methods. Thank you.
Maya, you are good at this stuff. you are averaging my internal mind vectors to make Ai easy. Keep doing so!
Great video Maya. Keep it up ❕❕❕
Blending LLMs is a fascinating idea. The idea left me wondering: Why hasn't anybody developed a system/app that takes the API's from the top LLMs, created agents for each, and then have these agents all work together to brainstorm, debate, review, and solve problems? I often get 4 different answers from 4 LLMs, so why not have them all setup as agents "in one room" working together to come up with the "best" solution. I can't find anybody that's tried this... why not? Wouldn't having the "top minds" (LLMs) working together produce better results?
they could become worse, or give you 4 different answers, or never stop talking past each other
well articulated and educational video, thank you Maya!🙏🏼
Awesome, thanks for the great video. Very well explained, great diagrams! :)
interesting, easy to follow, well researched and critically scrutinized the results. like your content!
lot of information about so many topics presented nicely.
wow, most sophisticated YouTuber ever. New favorite channel.
Thanks Maya!
Very insightful 👍
Great video and exactly what I need at the moment. Having a lot of specialized models for science, translation, coding, finance etc but no good way of combining them.
best of luck! and share with us your results if you want :)
Very good explanation. I'm looking for a similarly easy-to-understand video on how to fine-tune a model locally.
Thank you for the insights
Love the props with storytelling! Great instructional video!
great video, loved the explanation of all the technical stuff. Would love to know your process on how you read and understand these topics in-depth?
thanks, great video!
Great Video👏🏻
Excellent video! The development outlook seems open to so many possibilities. I'm curious if anyone will find advantages in networks built via diffusions(similar to image generation) or if there will be more real time dynamics implemented as the model responds to a query.
🎯 Key takeaways for quick navigation. This summary is no substitute for watching the complete video, which gives a more in-depth understanding:
Main Ideas:
- 🌍 Model blending is an innovative approach to surpass the performance of high-cost models with limited resources.
- 🤖 Non-experts can effectively blend models, demonstrating the technique's accessibility.
- 💡 The blend allows for specialized functionality, combining models tuned for diverse tasks into a powerhouse model.
- 🛠 The merging process involves selecting compatible models, defining parameters, and executing the blend with basic command line knowledge.
- 🔄 Various blending methods like task vector arithmetic and SLERP offer unique advantages for custom model creation.
- 📚 Proper selection and preparation of models are crucial, with a focus on architecture compatibility and avoiding common pitfalls.
- 🏆 Blended models can achieve top rankings on leaderboards, though their position may fluctuate.
- 🤔 The effectiveness of benchmarks in evaluating model intelligence is questioned, highlighting the issue of data contamination.
Takeaways:
00:00 *🤖 Introduction to Model Blending*
- Introduction to the concepts of model blending, showcasing the power of combining models to overcome resource limitations and improve performance.
- Highlights two models, Mixol and Ramonda, emphasizing the potential of model blending even with limited resources.
01:24 *📘 Basics of Model Blending*
- Detailed explanation of model blending, its significance, and the methodology behind efficient blending.
- Discusses the blending process, the importance of model selection, and the steps involved in creating a blended model.
02:05 *💡 The Promise of Blending*
- Explores the potential of blending models to create top-performing LLMs without the need for extensive resources.
- Focus on the accessibility of fine-tuning and blending for personalized model development.
03:33 *🛠️ How to Blend Models*
- Provides a practical guide on blending models using MergeKit, including setup and execution steps.
- Emphasizes the ease of blending models with basic knowledge and the right tools, offering an approachable method for enthusiasts and professionals alike.
05:33 *🧪 Detailed Blending Methods*
- Deep dive into various blending techniques such as task vector arithmetic, SLERP, TIES, and DARE, explaining their unique applications and benefits.
- Discusses the technical aspects of model blending, offering insights into choosing the right method for specific goals.
08:17 *🖥️ Preparing for Blending*
- Guidelines on selecting compatible models for blending, emphasizing the importance of architecture and layer compatibility.
- Instructions for downloading models from Hugging Face and preparing for the blending process.
10:33 *📝 Configuring YAML for Blending*
- Step-by-step instructions on setting up YAML files for blending, highlighting the importance of specifying base models, merge methods, and parameters.
- Offers practical tips for configuring blending parameters to optimize the blending process.
13:42 *🚀 Executing the Blend and Evaluation*
- Detailed walkthrough of the blending execution using MergeKit and subsequent evaluation through a text generation interface.
- Encourages testing and fine-tuning of the blended model before submission to benchmarks or public use.
15:45 *📊 Performance Testing and Data Contamination*
- Discusses the significance of performance testing on open LLM leaderboards and addresses the issue of data contamination in model training.
- Highlights the importance of careful model selection and blending strategy to avoid overfitting and ensure genuine improvements in model performance.
I hope this helps everybody!
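To make the "Detailed Blending Methods" item above a bit more concrete, here is a minimal sketch of two of the ideas on plain Python lists: task arithmetic (add the difference between each fine-tune and the base model back onto the base) and the DARE-style "drop and rescale" step. This is an illustration under simplifying assumptions, not mergekit's implementation: real merges work on large tensors per layer, and the function names and toy numbers here are hypothetical. TIES sign resolution is not shown.

```python
import random

def task_vector(finetuned, base):
    """Task vector: fine-tuned weights minus base weights, element by element."""
    return [f - b for f, b in zip(finetuned, base)]

def dare(delta, drop_rate, rng):
    """DARE-style step: randomly zero out entries of a task vector and
    rescale the survivors by 1 / (1 - drop_rate)."""
    keep = 1.0 - drop_rate
    return [d / keep if rng.random() < keep else 0.0 for d in delta]

def task_arithmetic(base, finetuned_models, weights):
    """Add the weighted task vectors of several fine-tunes onto the base."""
    merged = list(base)
    for model, w in zip(finetuned_models, weights):
        for i, d in enumerate(task_vector(model, base)):
            merged[i] += w * d
    return merged

# Toy "models" with three weights each (made-up numbers).
base = [0.0, 1.0, 2.0]
math_model = [0.5, 1.0, 2.0]   # changed only the first weight
code_model = [0.0, 1.0, 3.0]   # changed only the last weight

merged = task_arithmetic(base, [math_model, code_model], [1.0, 1.0])
print(merged)

# DARE applied to one task vector: each entry is either dropped or doubled.
sparse_delta = dare(task_vector(code_model, base), 0.5, random.Random(0))
print(sparse_delta)
```

In this toy case the two fine-tunes changed different weights, so the merged model keeps both changes. When task vectors overlap and disagree, that is the situation methods like TIES and DARE try to handle.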
Nice Video! Do we have the ability to fine tune the model on own codebase?
You are amazing… thank you
Amazing video! I didn’t know it could be done.
I am definitely going to make my own uncensored blended model for coding.
I am tired of openai telling me that I should not modify/hack code without owner permission even if I am the owner, and I am trying to test how solid the code is…
Very good lesson and explanation! So far the best on this subject. The main problem I had was running the models afterwards: I could not find a definitive method that works. Although one of the models scored high, it could not run in the HF Inference plugin on the model card.
Great content.
Can we blend multimodal models like LLaVA, Mistral, and Gemini Vision? Can you make a video on it, please? ❤❤
oh that's interesting, I got to say I didn't try but I'm curious myself! I'll see how it goes and either I'll make a video or I'll let you know somehow
@@maya-akim sure.
Thanks!
We thank you for all the amazing content. You are a great content creator and I don't want to sound nitpicky, but since you are already attracting and leaning towards the DIY crowd, you may as well use open source tools too (VSCodium, etc.). Just a small critique, because I love the content.
hey thanks for the support and feedback 🙏🏻 I'm not sure I totally follow. Do you suggest that I switch to Codium? Honestly, before your comment I assumed that VS Code is open source, but after googling a bit I realized that the product itself actually isn't. But it looks like Codium is open source, so do you think that it's a better fit for the channel?
Since we don't have access to the training data, it is simply impossible/unfeasible to choose models based on whether they have or don't have contaminated data.
Great video. How does this differ from the Mixture of Experts (MOE)?
that's an excellent question! first of all, I noticed that the community doesn't consider MoE to be merged models, even though you can use mergekit to create MoE yourself (instructions in the description box). My understanding is that blended models become "fixed" when it comes to their capabilities.
MoE capabilities change dynamically thanks to a gating mechanism that decides how much of each expert's advice to follow for a given input. You specify prompts (or simple strings with mergekit) that activate a specific expert. For example, here's a configuration that I used for MoE: huggingface.co/mayacinka/West-Ramen-7Bx4. As you can see, positive and negative prompts will "guide" the model.
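A toy sketch of the difference described in this reply, assuming a simple softmax gate: a blended model combines its parents with weights fixed at merge time, while an MoE gate recomputes the weights for every input. This is only an illustration. Each "expert" is reduced to a single number, the function names are hypothetical, and production MoE models typically route each token to a small top-k subset of experts rather than mixing all of them.

```python
import math

def softmax(scores):
    """Turn raw gate scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def merged_output(expert_outputs, fixed_weights):
    """Static blend: the mixing weights never change after the merge."""
    return sum(w * out for w, out in zip(fixed_weights, expert_outputs))

def moe_output(gate_scores, expert_outputs):
    """MoE-style blend: the gate scores depend on the input, so the
    mixing weights are recomputed for every input."""
    weights = softmax(gate_scores)
    return sum(w * out for w, out in zip(weights, expert_outputs))

expert_outputs = [1.0, 3.0]          # pretend answers from two experts

# A merged model always uses the same weights.
print(merged_output(expert_outputs, [0.5, 0.5]))

# An MoE can lean on expert 0 for one input and expert 1 for another.
print(moe_output([4.0, 0.0], expert_outputs))   # gate prefers expert 0
print(moe_output([0.0, 4.0], expert_outputs))   # gate prefers expert 1
```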
@@maya-akim interesting. thanks for sharing your thoughts, I'll look into it.
The combination of spending time messing w ai along with your videos are inspiring me to build my own workstation.
Not sure if that's smart considering I don't know how to code.
So far I have ordered:
super micro x12dai mobo
2 platinum 8352s
2 rtx 3090s
2 sata 12 tb
2 optane nvmes for os and quick retrieval stuff
128 gb lrdimm ddr4
E-ATX case, cables & ps
Do you do any consulting work via zoom? I may need some direction soon.
Very underrated channel!! This is enlightening. How a person can be so smart and beautiful too at the same time 😭😭
What happens if you merge two models of the same family but they each have different context lengths? Does the model with the larger token window take precedence?
it will depend on the "base model". But in the cases that don't require defining a base model (like passthrough), or in this hacky case here: huggingface.co/mayacinka/chatty-djinn-14B, where I merged models with 32K and 8K context windows, the 32K models overpowered the 8K open chat model.
@@maya-akim thank you
Nice video
Thanks! What software are you running for loading and inferencing your merged LLM using localhost in browser?
that's oobabooga's text generation UI. It allows you to run any model, whether it's saved locally, or on huggingface's hub
thanks@@maya-akim
Maya, a great video, thank you. Quick question: where are you based? The reason I ask is that I'm looking for an AI speaker in the UK, and you came to mind, so I was just wondering. Again, excellent video, amazing depth.
hey John, thanks a lot for the support! I live in Austin, TX, so I'm afraid I won't be of any help :/
@@maya-akim Damn, that's a long way away! Never mind, keep up the great work and thanks for getting back 🙂
Stacked and Cascading ensembling have been around for awhile
Is combining tools like SWE-Agent, Crew AI, and OS-Copilot into a cohesive agentic workflow possible?
After that, could I use my customized model from Hugging Face or locally in my apps?
Also, now that I have decided to use this model when creating my gen-AI apps, how would I load it?
llm = ??? # provide me the syntax for this
Genius
Maya.. you are the girl!!!
can you test agent frameworks like CrewAi with Claude 3 opus?
3:40 and now add robots 🤖 cheers🥂
How can I fine-tune the LLAMA 3 8B model for free on my local hardware, specifically a ThinkStation P620 Tower Workstation with an AMD Ryzen Threadripper PRO 5945WX processor, 128 GB DDR4 RAM, and two NVIDIA RTX A4000 16GB GPUs in SLI? I am new to this and have prepared a dataset for training. Is this feasible?
LLM: Tokenization vs MAMBA, please make video about this
Bach wtc 1 prelude 21 😍
LLMs catching up to something Stable Diffusion users have been doing for awhile.
Open source is the way.
Why does Slerp only support two models?
Can’t you just slerp between pairs, then slerp the slerps, etc until you have 1?
yep, you absolutely can slerp the slerps of the previously slerped slerps. That's what a lot of people do.
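For anyone curious what "slerping the slerps" looks like: the SLERP formula is defined between exactly two vectors, which is why more than two models have to be handled by chaining. Below is a minimal sketch on plain Python lists, with hypothetical function names and toy vectors. It assumes non-zero vectors and is not mergekit's implementation.

```python
import math
from functools import reduce

def slerp(a, b, t):
    """Spherical linear interpolation between vectors a and b at fraction t."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    cos_omega = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    omega = math.acos(cos_omega)          # angle between the two vectors
    sin_omega = math.sin(omega)
    if abs(sin_omega) < 1e-6:
        # Vectors are (anti)parallel: fall back to plain linear interpolation.
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    wa = math.sin((1 - t) * omega) / sin_omega
    wb = math.sin(t * omega) / sin_omega
    return [wa * x + wb * y for x, y in zip(a, b)]

def slerp_many(models, t=0.5):
    """Chain SLERP across a list: slerp the first two, then slerp that
    result with the third, and so on."""
    return reduce(lambda acc, m: slerp(acc, m, t), models)

halfway = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
print(halfway)

chained = slerp_many([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
print(chained)
```

One caveat with chaining: the result depends on the order, and with t=0.5 at every step the last model in the list contributes more than the first ones.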
Is it possible to merge 7B with 8B models?
My only question while watching was: why should I make a model? I figure there is going to be an infinite number of models created by people, and soon AI models created by AI models. So my question is, what is the point of making a custom model aside from fine-tuning on data? I use autogen. Would creating a model like you're doing empower a local model to, let's say, chat on my data and be good at function calling? Maybe this would be an experimental way to make my own model specifically for autogen? I know someone out there is already working on that, and you even showed models specifically used for function calling in one of your other vids.
oh that's a great question! here's how I would use it: 1. Find a model that scores highly on the MMLU benchmark (which means that it has diverse knowledge). Blend it with a model that you like because of its "vibe". For me that would be openchat because I like how conversational it is. The blended model would perform better than the two "parent" models. 2. I'm actually working on this one. I'm trying to fine-tune one model to specifically be good at crafting youtube titles, and another one to write good youtube scripts. Then I'll try to blend those two.
🎉
So it's like mixing colors, back in kindergarten, you'd always blend everything together hoping to create this amazing hue, but it always just ended up this muddy, ugly brown
👍
Good soup
At first - I was excited to see a new video with useful info - but when it got to that crime scene mapping thing you do - well... sorta creepy, no? What is that method called? Conspiracy mapping? Good visuals but wow... I lost track of what was going on with it... maybe it was more of a "Why are you putting holes in you walls? Some poor guy is gonna be like "...where's the spackle and putty knife? Some tenant/wife/daughter/kid poked a bunch of holes in my wall"... I never understood how so many holes got poked into my daughters walls or even our living room walls (ahem... the wife) but maybe this is just something that is fun to do? Now, do a video on how to patch all those little holes and get a paint roller with medium nap to repaint and cover everything up - but don't just paint a small area... no.. gonna probably have to paint the whole wall so there's no more streaks and visible coverups.. or at least learn how to feather out the edges so they blend better with the existing paint on the walls.. ugh... can't plug those holes and paint with an ai agent (yet)... so at least some skills are still worthy of known and learning... go get a guy or gal with some handy work skills - mechanical skills or something useful that AI can't do well any never will (likely for a long time) and you at least know your guy/gal will be useful given that AI will be putting lots of other people out of work (and is already doing so). I need to hire some people to help me get this working for our company - but I can't afford to keep paying drywall contractors every time we get a new idea... lol
Wow, you have too much time on your hands
Wanna get “addicted”.
why not just use autogen?
Autogen is full of crap
Stop saying it's so simple... yes, for you it is.
Monolithic LLMs < multi-agents
Can I use an agent on my mobile device?
your timing is scary
what do you mean?
It's not about collecting links with information and adding below the video... The important information is: which models are compatible and how to write the configuration file, which you barely mention! I can find all these links myself.
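On the configuration file specifically, here is a minimal sketch of what a two-model SLERP config for mergekit can look like, generated from a small Python helper. The field names (`slices`, `sources`, `layer_range`, `merge_method`, `base_model`, `parameters`, `dtype`) follow the examples in the mergekit README. The model names are hypothetical placeholders, the layer count of 32 is an assumption that must match the models you actually pick, and both models need to share the same architecture.

```python
def build_slerp_config(model_a, model_b, num_layers, t=0.5):
    """Return the text of a mergekit-style YAML config for a SLERP merge.

    model_a doubles as the base model. num_layers must match the real
    layer count of both models (an assumption the caller has to check).
    """
    lines = [
        "slices:",
        "  - sources:",
        f"      - model: {model_a}",
        f"        layer_range: [0, {num_layers}]",
        f"      - model: {model_b}",
        f"        layer_range: [0, {num_layers}]",
        "merge_method: slerp",
        f"base_model: {model_a}",
        "parameters:",
        f"  t: {t}",          # 0 keeps model_a, 1 keeps model_b
        "dtype: bfloat16",
    ]
    return "\n".join(lines) + "\n"

# Placeholder repository names, not real recommendations.
config_text = build_slerp_config("your-org/model-a-7b", "your-org/model-b-7b", 32)
print(config_text)
```

Saved as `config.yml`, a config like this is passed to the `mergekit-yaml` command together with an output directory, for example `mergekit-yaml config.yml ./merged-model`.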
so a claude haha
The problem with making vids that require prior knowledge and experience is that those who would find the information most useful cannot make use of it, because they lack that prior knowledge and experience. Yet at the same time, the information provided in the vid is at a level meant for a novice who had no prior interest. So who is the audience being catered to?
Too many videos by AI nerds on YouTube... for AI nerds. Hardly anyone makes videos for the average John and Jane, resulting in a large group of people detached from AI.
what types of videos would appeal to average John and Jane?
@maya-akim blending would be the best word for this right, merging I think is the word people are using for it don't you think
Thanks!
Thanks a lot 🤗