Analytics Camp
Can This Replace Fine-tuning LLMs? 🤯 Insane GPT-4o Results
#ai #gpt4o #llm
When GPT-4o ("omni") was released, there was so much excitement about the impressive voice and visual recognition and generation capabilities that we overlooked an important aspect of it, which I discuss today based on this paper by several Stanford University researchers, including Andrew Ng:
Jiang et al. (2024). Many-Shot In-Context Learning in Multimodal Foundation Models
Related videos:
What Language Model To Choose For Your Project? LLM Evaluation
czcams.com/video/PXX2OO7s8wY/video.html
💯 FREE Local Set-up For AI Agents With CrewAI And Ollama | Easy Tutorial
czcams.com/video/XkS4ifkLwwQ/video.html
www.youtube.com/@analyticsCamp/videos?sub_confirmation=1
Chapters and Key Concepts:
00:00 Introducing GPT-4o
00:32 Live Translation
00:41 Real-time Recognition of Objects and Animals
00:50 Real-time Recognition of Historical Sites
01:07 Two GPT-4os Role-switching
01:27 Issues with Current GPT-4o Model and Fine-tuned LLMs
01:56 Many-shot In-Context-Learning (ICL) in Multimodal Foundation Models
02:30 Explaining ICL
02:48 Zero-shot, Few-shot, and Many-shot ICL with Image
03:27 Issues with Few-shot ICL and Context Window
04:21 Many-shot ICL Benchmarking with Image Datasets
04:26 Results of GPT-4o and Gemini 1.5 Pro on ICL Research
05:01 Can We Replace Fine-tuning LLMs with ICL?
05:24 ICL Prompt Formats
05:34 Advantages of Using ICL
05:58 Taxonomy of In-Context-Learning Studies and Methods
06:18 Scoring Function for ICL
07:00 Conclusions and Questions
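To make the many-shot idea concrete, here is a minimal sketch of how such a prompt can be assembled with the openai Python client; the image paths, labels, and wording are hypothetical placeholders, not the exact prompt format used in the paper.

```python
import base64
from openai import OpenAI  # pip install openai

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def image_part(path: str) -> dict:
    """Encode a local image as a base64 data-URL content part."""
    with open(path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    return {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{b64}"}}

# Hypothetical demonstration set: many labelled examples instead of just a few.
demos = [("cat_01.jpg", "cat"), ("dog_01.jpg", "dog"), ("fox_01.jpg", "fox")]  # extend to hundreds for many-shot

content = [{"type": "text", "text": "Classify each image. Answer with a single label."}]
for path, label in demos:
    content.append(image_part(path))
    content.append({"type": "text", "text": f"Label: {label}"})

# The query image whose label the model should predict in-context.
content.append(image_part("query.jpg"))
content.append({"type": "text", "text": "Label:"})

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": content}],
)
print(response.choices[0].message.content)
```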
Views: 54

Video

💯 FREE Local LLM - AI Agents With CrewAI And Ollama Easy Tutorial 👆
435 views · 1 day ago
#aiagents #crewai #tutorials Follow along with this super easy code tutorial to set up your local agentic workflow, which is 100% free. I will use VS Code and show you how to install CrewAI and Ollama to work in your virtual environment. Check out this video in which I explain the Refinement method for agentic workflows: This Agentic AI Workflow Will Take Over Algorithm Papers Explained czcams.com...
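For readers following along, here is a minimal single-agent sketch of the kind of local setup the tutorial walks through; it assumes the crewai and langchain-community packages are installed and that a model has been pulled with `ollama pull mistral`, and the exact constructor arguments can differ between CrewAI versions (the video's full setup, including its model-file tweaks, goes further).

```python
from crewai import Agent, Task, Crew            # pip install crewai
from langchain_community.llms import Ollama     # pip install langchain-community

# Local model served by Ollama (run `ollama pull mistral` first).
local_llm = Ollama(model="mistral")

researcher = Agent(
    role="Research analyst",
    goal="Summarise the key ideas of a given topic in plain language",
    backstory="You are a careful analyst who writes short, factual summaries.",
    llm=local_llm,          # everything runs locally, no API key needed
    verbose=True,
)

summarise = Task(
    description="Summarise what in-context learning is in about five sentences.",
    expected_output="A short plain-language summary.",
    agent=researcher,
)

crew = Crew(agents=[researcher], tasks=[summarise])
result = crew.kickoff()
print(result)
```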
🔴 This Agentic AI Workflow Will Take Over 🤯 Algorithm + Papers Explained
8K views · 1 month ago
#ai #llm #aiagents #agentic What Language Model To Choose For Your Project? 🤔 LLM Evaluation: czcams.com/video/PXX2OO7s8wY/video.html (evaluation of Hugging Face models). Please subscribe to support this channel :) Explanation of the papers and algorithms of LLM agents in agentic AI systems (see timestamps below), using the following concepts and papers: Iterative feedback and refinement for ...
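The iterative feedback-and-refinement pattern discussed in the video can be sketched with the ollama Python package as follows; the model name, prompts, and number of rounds are placeholders rather than the video's exact algorithm.

```python
import ollama  # pip install ollama; assumes the Ollama server is running locally

MODEL = "mistral"  # placeholder; any locally pulled model works

def ask(prompt: str) -> str:
    """Single chat turn against the local model."""
    reply = ollama.chat(model=MODEL, messages=[{"role": "user", "content": prompt}])
    return reply["message"]["content"]

task = "Write a one-paragraph explanation of in-context learning for beginners."
draft = ask(task)

# Refinement loop: the model critiques its own draft, then revises it.
for _ in range(2):
    critique = ask(f"Critique this answer and list concrete improvements:\n\n{draft}")
    draft = ask(
        f"Task: {task}\n\nPrevious answer:\n{draft}\n\n"
        f"Critique:\n{critique}\n\nRewrite the answer, applying the critique."
    )

print(draft)
```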
How Sora AI works: OpenAI Text-To-Video Model and LLM
608 views · 2 months ago
#ai #llm #sora The Model Behind Sora AI, Simplified! OpenAI Text-To-Video LLM. Have you wondered what the main reason is that makes Sora so good at generating videos? What ingredient is added to the base diffusion model that has taken video generation to a whole new level, setting the scene for unlimited creativity? I will discuss this text-to-image and video model in this video, along with the L...
5 Steps to Improve LLM Results - Locally Run LLAMA 2 vs Orca 2
115 views · 3 months ago
#ai #llm #llama #orca I will test the 5-step method for improving the results of large language models on a famous reasoning test between Llama 2 and Orca 2. I have spent quite some time getting to the bottom of it so you don't have to! Watch till the end to see the winner and how these steps help improve the models' responses. This test is one example of the GSM8K test used to evaluate LLMs in ...
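To try this kind of evaluation yourself, here is a minimal sketch (not the video's exact 5-step procedure) that pulls one GSM8K problem with the datasets library and sends it to a locally pulled Ollama model; the model name is just an example.

```python
from datasets import load_dataset   # pip install datasets
import ollama                       # pip install ollama; Ollama server running locally

sample = load_dataset("gsm8k", "main", split="test")[0]
prompt = "Solve this step by step and give the final number.\n\n" + sample["question"]

reply = ollama.chat(model="llama2", messages=[{"role": "user", "content": prompt}])
print(reply["message"]["content"])
print("\nReference answer:\n", sample["answer"])
```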
What Language Model To Choose For Your Project? 🤔 LLM Evaluation
441 views · 3 months ago
#llm #huggingface #gpt4 #ai With more than 490,000 language models uploaded to the Hugging Face model repositories, how do you find the best language model for your personal or business projects? I have spent two weeks searching for the best models so you don't have to. In this video, you will get to know all the details about the Hugging Face LLM Leaderboard and how it evaluates all the models objecti...
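Once you have picked a model from the leaderboard, trying it locally takes only a few lines with the transformers library; the checkpoint name below is just an example of a Hub model, not a recommendation from the video.

```python
from transformers import pipeline  # pip install transformers

# Example checkpoint from the Hugging Face Hub; swap in whichever leaderboard model you picked.
# Note: a 7B model is a large download and needs a capable GPU or lots of RAM.
generator = pipeline("text-generation", model="mistralai/Mistral-7B-Instruct-v0.2")

prompt = "Explain in two sentences what an LLM leaderboard measures."
print(generator(prompt, max_new_tokens=80)[0]["generated_text"])
```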
Is Mamba Destroying Transformers For Good? 😱 Language Models in AI
6K views · 4 months ago
#mamba #transformers #llm #mambaai Select 1080p. The Transformer language model started to transform the AI industry, but it has one main problem that could make it go extinct even before it blasts off in 2024! Watch this easy-to-follow video, full of fun graphics, about the model architectures and performance differences of the Transformer and Mamba language models. I will compare the function...
Mamba Language Model Simplified In JUST 5 MINUTES!
4.9K views · 4 months ago
#mamba #ai #llm Here’s a super simplified explanation of the Mamba language model with the Selective State Space Model (Selective SSM architecture). In the previous videos, I used the example of sequences of words to show how transformers use the Attention Mechanism to process natural language and predict the next word in a sequence of words, e.g., a sentence. In this video, I show you how Mamb...
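As a rough illustration of the recurrence described in the video (not the paper's actual implementation, which uses a hardware-aware parallel scan), here is a toy NumPy sketch of a discretized state-space update in which the step size, and therefore how much of each input is kept, depends on the input itself.

```python
import numpy as np

rng = np.random.default_rng(0)

d_state, d_model, seq_len = 16, 8, 20
A = -np.abs(rng.standard_normal(d_state))          # diagonal state matrix (kept negative for stability)
B = rng.standard_normal((d_state, d_model))
C = rng.standard_normal((d_model, d_state))
W_delta = rng.standard_normal(d_model)             # input-dependent step size: the "selective" part

x = rng.standard_normal((seq_len, d_model))        # toy input sequence (e.g. token embeddings)
h = np.zeros(d_state)
outputs = []

for t in range(seq_len):
    delta = np.log1p(np.exp(W_delta @ x[t]))       # softplus: step size chosen from the current input
    A_bar = np.exp(delta * A)                      # discretize A for this step
    B_bar = delta * B                              # simple Euler-style discretization of B
    h = A_bar * h + B_bar @ x[t]                   # keep some of the past, add some of the new input
    outputs.append(C @ h)                          # read out from the hidden state

y = np.stack(outputs)                              # (seq_len, d_model)
print(y.shape)
```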
The Concept of Backpropagation Simplified in JUST 2 MINUTES! -- Neural Networks
647 views · 4 months ago
A beginner-friendly, easy-to-follow explanation of backpropagation in neural networks, and how it helps to reduce the error in predicting the next word in a sequence of text. Related videos: Transformer Language Models Simplified in JUST 3 MINUTES! czcams.com/video/6n-mOFlhbGI/video.html Mamba Language Model Simplified In JUST 5 MINUTES! czcams.com/video/e7TFEgq5xiY/video.html Stick around for more...
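To make the idea concrete, here is a tiny NumPy sketch (not taken from the video) of a one-hidden-layer network trained by backpropagation on a toy regression task: the forward pass makes a prediction, the error is measured, and gradients are propagated backwards to update the weights.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: learn y = sum(x) from random inputs.
X = rng.standard_normal((200, 3))
y = X.sum(axis=1, keepdims=True)

W1 = rng.standard_normal((3, 8)) * 0.5
b1 = np.zeros((1, 8))
W2 = rng.standard_normal((8, 1)) * 0.5
b2 = np.zeros((1, 1))
lr = 0.05

for step in range(500):
    # Forward pass
    hidden = np.tanh(X @ W1 + b1)
    pred = hidden @ W2 + b2
    loss = np.mean((pred - y) ** 2)

    # Backward pass: chain rule from the loss back to each weight
    grad_pred = 2 * (pred - y) / len(X)
    grad_W2 = hidden.T @ grad_pred
    grad_b2 = grad_pred.sum(axis=0, keepdims=True)
    grad_hidden = grad_pred @ W2.T * (1 - hidden ** 2)   # tanh'(z) = 1 - tanh(z)^2
    grad_W1 = X.T @ grad_hidden
    grad_b1 = grad_hidden.sum(axis=0, keepdims=True)

    # Gradient-descent update
    W1 -= lr * grad_W1; b1 -= lr * grad_b1
    W2 -= lr * grad_W2; b2 -= lr * grad_b2

print(f"final loss: {loss:.4f}")
```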
Transformer Language Models Simplified in JUST 3 MINUTES!
375 views · 5 months ago
#ai #llm #transformers #attention Watch how Transformer language models work in just 3 minutes! This is Transformers simplified for beginners, based on the 2017 paper ‘Attention Is All You Need’ by Google researchers. For more information, watch my previous video about how language models work at this link: czcams.com/video/n_5spvz-2KI/video.html In the previous video, I showed the ...
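The core operation behind ‘Attention Is All You Need’ fits in a few lines; this NumPy sketch (not from the video) shows single-head scaled dot-product attention over a toy sequence of token vectors.

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # how much each token attends to every other token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over the sequence
    return weights @ V

rng = np.random.default_rng(2)
seq_len, d_model = 5, 4                                # e.g. 5 token embeddings of size 4
x = rng.standard_normal((seq_len, d_model))

Wq, Wk, Wv = (rng.standard_normal((d_model, d_model)) for _ in range(3))
out = attention(x @ Wq, x @ Wk, x @ Wv)
print(out.shape)  # (5, 4): one context-mixed vector per token
```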
5 AI Tools and Features to Expect in 2024 That You NEED to Know!
91 views · 5 months ago
This is how EXACTLY Language Models work in AI - NO background needed!
297 views · 5 months ago
Is Claude better than ChatGPT? SHOCKING Claude going viral
60 views · 9 months ago
ChatGPT Prompt Engineering 101: 5 Best Prompt Hacks Everyone SHOULD know!
74 views · 10 months ago

Comments

  • @optiondrone5468
    @optiondrone5468 · 12 hours ago

    All very exciting things, but how long do you think it will be before everyone can have access to all these new AI-based applications?

    • @analyticsCamp
      @analyticsCamp · 9 hours ago

      Thanks for watching :) You can use ICL with any LLM, especially the ones you can download directly from Hugging Face or via Ollama. Some other interfaces allow users to attach files to process, so you can write your prompts and instructions in those files plus any images you need to attach. I'm not sure about audio and video ICL at this moment, though.

  • @Researcher100
    @Researcher100 · 3 days ago

    The explanation was clear, thanks. Does this paper show how to use this method in practice? I think most LLM users don't know the ins and outs of fine-tuning, so ICL can be very helpful for ordinary users.

    • @analyticsCamp
      @analyticsCamp · 3 days ago

      Thanks for the comment :) Yes, the paper comes with all those explanations. I also believe this can open a way for more ordinary AI users AND many researchers in other fields.

  • @jarad4621
    @jarad4621 · 7 days ago

    Sorry, another question: am I able to use LM Studio with CrewAI as well? I wanted to test some other models, and its GPU acceleration lets models run better than Ollama for me. Is it still going to have problems due to the issues you fix with the model file, or is that issue not a problem for other local servers? Or is Ollama the best way because you can actually edit those things to make it work well? Thanks

    • @analyticsCamp
      @analyticsCamp · 7 days ago

      I do not use LM Studio, so I cannot comment on that. But Ollama via the terminal is pretty sturdy; CrewAI should work with all Ollama models, though I have not tested them all. If you run into issues you can report them here so others can know and/or help you :)

  • @first-thoughtgiver-of-will2456

    Can Mamba have its input RoPE scaled? It seems it doesn't require positional encoding, but this might make it extremely efficient for second-order optimization techniques.

    • @analyticsCamp
      @analyticsCamp · 7 days ago

      In Mamba, the sequence length can be scaled up to a million (e.g., million-length sequences). It also computes the gradient (I did not find any info on second-order optimization in their method): they train for 10k to 20k gradient steps.

  • @jarad4621
    @jarad4621 · 8 days ago

    Also, I'd never seen the Mistral idea, so this model would make a really good agent then, better than the others? Really helpful to know, glad I found this. Also, can you test agent swarm and let us know what the best agent framework is currently? Apparently crew is not great for production?

    • @analyticsCamp
      @analyticsCamp · 8 days ago

      Thanks for watching :) I have tested multiple models from Ollama and Mistral seems to have better performance overall, across most tasks. Agent Swarm can be useful for VERY specialised tasks in which general LLMs get it totally wrong. Other than that, it will add to the time/cost of the build. But I'm not sure if I understood your question right!

  • @jarad4621
    @jarad4621 · 8 days ago

    Awesome, I've been looking for some of this info for ages. Best video on agents after watching dozens of vids; nobody explains the problems with other models, or fixing the model file, or how to make sure the local models work. Many of these YT experts are just using local and other models and wondering why it's not working well. Can I use Phi-3 mini locally as well, and does it need the same model setup? Also, will Llama 70B on the OpenRouter API actually work as a good agent, or does something need to be tweaked first? Nobody can answer these things, please help? Thanks!

    • @analyticsCamp
      @analyticsCamp · 8 days ago

      Sure, you can essentially use any model listed in Ollama as long as you make a model file; you can manage the temperature, etc. I have used Llama 70B before, but surprisingly it did not show better responses than its 7B and 13B versions on most tasks! I recommend Llama 3 (I may be able to make a video on it if I get some free time, LOL).

    • @jarad4621
      @jarad4621 · 8 days ago

      @analyticsCamp Awesome, thanks, I'll test the smaller ones first then.

  • @optiondrone5468
    @optiondrone5468 · 9 days ago

    Thanks for sharing your thoughts and practical AI agent workflow. I also believe that this agentic workflow will fuel a lot of LLM-based development in 2024.

    • @analyticsCamp
      @analyticsCamp · 8 days ago

      Thanks for watching :) If you have a specific LLM-based development/project in mind please let me know. With such easy agentic access, I am also surprised how many AI users are still hooked on zero-shot with paid interfaces!

    • @optiondrone5468
      @optiondrone5468 · 7 days ago

      @analyticsCamp Ha ha, it also never made sense to me why people don't look into open-source LLMs 🤔 They're free, don't limit your token size, let you experiment with different models, and most importantly your data (prompts) stays yours and doesn't automatically become OpenAI's property. Keep up the good work, looking forward to your next video.

  • @milanpospisil8024
    @milanpospisil8024 · 9 days ago

    Yeah, Pat and Mat, that's from Czech studios :-)

  • @BooleanDisorder
    @BooleanDisorder · 1 month ago

    I really don't understand why you have so few views and subscribers.

  • @SuperHddf
    @SuperHddf · 1 month ago

    unwatchable with headphones.

    • @analyticsCamp
      @analyticsCamp · 1 month ago

      Apologies for the sound quality. Please watch this video using a speaker. Thanks :)

  • @doublesami
    @doublesami · 1 month ago

    Very informative, looking forward to the in-depth video on Vision Mamba or VMamba.

    • @analyticsCamp
      @analyticsCamp · 1 month ago

      Thanks for watching and for your suggestion. Stay tuned :)

  • @pitrgreat
    @pitrgreat · 1 month ago

    Interesting video, good work, but it's very quiet; I had to increase the volume to more than double to get output like other random videos. Something's wrong with the sound or the way you recorded it ;)

    • @analyticsCamp
      @analyticsCamp · 1 month ago

      Thanks for your feedback. Sorry about the sound quality; I'll try to fix it in the next videos. Stay tuned :)

  • @blondisbarrios7454
    @blondisbarrios7454 · 1 month ago

    Nice video, nice channel, nice voice, nice pronunciation. Clear to understand ;)

    • @analyticsCamp
      @analyticsCamp · 1 month ago

      Glad you think so and thanks for watching :)

  • @ragibhasan2.0
    @ragibhasan2.0 · 1 month ago

    Nice video

  • @HannesFoulds
    @HannesFoulds · 1 month ago

    Oh come on. What if I tell you, you have been living your life wrong, and there is a much better way...

  • @thecooler69
    @thecooler69 · 1 month ago

    Gorilla seems like the holy grail. It's strange that it seems to have reached a standstill in development.

    • @analyticsCamp
      @analyticsCamp · 1 month ago

      Thanks for watching. Their newest development is GoEX; their paper (referenced below) talks about how humans can supervise autonomous LLMs. Here's the paper ref: Patil et al., 2024. GoEX: Perspectives and Designs Towards a Runtime for Autonomous LLM Applications.

    • @thecooler69
      @thecooler69 · 1 month ago

      @analyticsCamp Groovy, thanks for that!

    • @analyticsCamp
      @analyticsCamp · 1 month ago

      👍

  • @erikjohnson9112
    @erikjohnson9112 · 1 month ago

    I like your style, content and voice. Just earned a sub.

  • @NLPprompter
    @NLPprompter · 1 month ago

    You know this is made by a human, and this human is so busy with work... so this video has mono audio on the left speaker only.

    • @NLPprompter
      @NLPprompter · 1 month ago

      don't bother busy people doing their busy thing. you are asking too much.

    • @analyticsCamp
      @analyticsCamp · 1 month ago

      Thanks for your feedback! I used to record with Zoom but I switched to this new mic; will need to adjust it as you suggested :)

    • @NLPprompter
      @NLPprompter · 1 month ago

      @analyticsCamp Can't wait for more explanatory vids like this.

    • @analyticsCamp
      @analyticsCamp · 1 month ago

      Thanks for your feedback!

  • @christopheboucher127
    @christopheboucher127 · 1 month ago

    Great content!! Thanks! And yes, agentic workflows have been my exploration for months... It's still very difficult to master the agentic flow; sometimes it goes out of limits, and I haven't figured out yet how to fix that (with both AutoGen and CrewAI)... maybe a topic for a future video?

    • @analyticsCamp
      @analyticsCamp · 1 month ago

      Great suggestion. Yes it can be challenging at times :)

  • @seupedro9924
    @seupedro9924 · 1 month ago

    The video is very informative, but it would be great to see the agents' power in action.

    • @analyticsCamp
      @analyticsCamp · 1 month ago

      Thanks for watching, great suggestion :)

  • @fintech1378
    @fintech1378 · 1 month ago

    More agents are all you need, bro. Don't treat an AI agent as a human; think of it as a cell in your body (maybe this is a bad analogy), so we need hundreds of them at least to perform a certain function.

    • @analyticsCamp
      @analyticsCamp · 1 month ago

      Thanks for your comment! Maybe you are right!

  • @eduardoramos8317
    @eduardoramos8317 · 1 month ago

    Good video! Could you make a comparison with the Mamba architecture?

  • @TheFocusedCoder
    @TheFocusedCoder · 1 month ago

    nice video !

  • @Researcher100
    @Researcher100 · 1 month ago

    Very nice video! I liked the explanation of the Reflexion algorithm. Please do more on AI agents.

  • @kvlnnguyieb9522
    @kvlnnguyieb9522 · 2 months ago

    A great video. In the next video, maybe you can explain the details of the selective mechanism in code.

    • @analyticsCamp
      @analyticsCamp · 2 months ago

      Great suggestion! Thanks for watching :)

  • @optiondrone5468
    @optiondrone5468 · 2 months ago

    This is an excellent explanation; however, Sora needs more improvement before it can be taken seriously as a disruptor of the film industry or other types of video-generation workflows.

    • @analyticsCamp
      @analyticsCamp · 2 months ago

      Agreed, but let's see the open-source model behind it first, if they ever release it :)

  • @facundohannoch3675
    @facundohannoch3675 · 3 months ago

    Thank you!! Could not find much information about how to compare LLMs, and your video was really helpful!

    • @analyticsCamp
      @analyticsCamp · 3 months ago

      Glad it was helpful! Let me know if you'd like me to cover any topic :)

  • @gangs0846
    @gangs0846 · 3 months ago

    Thank you

  • @optiondrone5468
    @optiondrone5468 · 3 months ago

    I'm new to ML, and when it comes to model selection, I always had questions about which metrics are considered important during model selection. I like what Hugging Face did in their leaderboard, and I also liked your explanation. Thanks for sharing it with us.

  • @Researcher100
    @Researcher100 · 3 months ago

    I see what you did with the GPT answers 😏 And the humans vs models thing with Khabib vs McGregor was super dope 😂😂😂

  • @eugene-bright
    @eugene-bright · 3 months ago

    The best tip is to use RAG (e.g. Perplexity AI)

    • @eugene-bright
      @eugene-bright · 3 months ago

      AutoGen Studio + LM Studio as a quick custom solution. LangChain is best when you are familiar with Python.

  • @user-du8hf3he7r
    @user-du8hf3he7r · 3 months ago

    Low audio volume.

  • @ricardofurbino
    @ricardofurbino · 3 months ago

    I'm doing work that uses sequence data, but it's not specific to language. In a Transformer-like network, instead of an embedding layer for the source and target, I have linear layers; also, I send both source and target to the forward pass. In an LSTM-like network, I don't even need this step; I just use the standard torch LSTM cell, and in that case only the source is necessary for the forward pass. Does anyone have a code example of how I can do this with Mamba? I'm having difficulty working out how.

    • @analyticsCamp
      @analyticsCamp · 3 months ago

      Hey, I just found a PyTorch implementation of Mamba at this link. I haven't gone through it personally, but if it is helpful please do let me know: medium.com/ai-insights-cobet/building-mamba-from-scratch-a-comprehensive-code-walkthrough-5db040c28049
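      A minimal sketch of the setup described above, assuming the authors' mamba-ssm package (the dimensions are placeholders and the package needs a CUDA GPU): replace the embedding layer with a linear projection into a Mamba block, and feed only the source through the forward pass.

      ```python
      import torch
      from mamba_ssm import Mamba   # pip install mamba-ssm (requires CUDA)

      feature_dim, d_model, seq_len, batch = 6, 64, 100, 4

      # Linear layer replaces the embedding layer, as in the Transformer setup described above.
      encode = torch.nn.Linear(feature_dim, d_model).cuda()
      mamba = Mamba(d_model=d_model, d_state=16, d_conv=4, expand=2).cuda()
      decode = torch.nn.Linear(d_model, feature_dim).cuda()

      src = torch.randn(batch, seq_len, feature_dim, device="cuda")  # raw (non-language) sequence data
      hidden = mamba(encode(src))     # as in the LSTM case, only the source goes through the forward pass
      pred = decode(hidden)           # e.g. predict the next step of the sequence
      print(pred.shape)               # (4, 100, 6)
      ```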

  • @raymond_luxury_yacht
    @raymond_luxury_yacht · 3 months ago

    Why didn't they call it Godzilla?

    • @analyticsCamp
      @analyticsCamp · 3 months ago

      Funny :) It seems to be as powerful as one!

  • @consig1iere294
    @consig1iere294 · 3 months ago

    I can't keep up. Is Mamba like the Mistral model, or is it an LLM technology?

    • @analyticsCamp
      @analyticsCamp · 3 months ago

      Mamba is an LLM but has a unique architecture: a blend of traditional SSM-based models with a multi-layer perceptron, which helps it add 'selectivity' to the flow of information in the system (unlike Transformer-based models, which often take the whole context, i.e., all the information, to predict the next word). If you are still confused, I recommend you watch my video on this channel called "This is how exactly language models work", which gives you a perspective on different types of LLMs :)

  • @richardnunziata3221
    @richardnunziata3221 · 3 months ago

    A 7B to 10B Mamba would be interesting to judge, but right now it seems it's really good with long content in the small-model space.

    • @analyticsCamp
      @analyticsCamp · 3 months ago

      You are right! Generally speaking, a larger number of parameters tuned in a model gives better results. But Mamba claims that we don't necessarily need larger models, just a more efficiently designed model, to perform comparably to other models, even if it has been trained on less data and with fewer parameters. I suggest their article, section 4.3.1, where they talk about "Scaling: Model Size", which can give you a good perspective. Thanks for watching :)

  • @datascienceworld
    @datascienceworld · 4 months ago

    Great video.

  • @mintakan003
    @mintakan003 · 4 months ago

    I'd like to see this tested out for larger models, such as something comparable to Llama 2. One question I have is whether there are diminishing returns for long-distance relationships, compared to a context window of sufficient size. Is it enough for people to give up (tried and true?) Transformers, with explicit modeling of the context, for something that is more selective?

    • @analyticsCamp
      @analyticsCamp · 4 months ago

      A thoughtful observation! Yes, it seems the authors of Mamba have already tested it against Transformer-based architectures, such as PaLM and LLaMA, and a bunch of other models. Here's what they quote in their article, page 2: "With scaling laws up to 1B parameters, we show that Mamba exceeds the performance of a large range of baselines, including very strong modern Transformer training recipes based on LLaMa (Touvron et al. 2023). Our Mamba language model has 5× generation throughput compared to Transformers of similar size, and Mamba-3B’s quality matches that of Transformers twice its size (e.g. 4 points higher avg. on common sense reasoning compared to Pythia-3B and even exceeding Pythia-7B)." With regards to scaling the sequence length, I have explained a bit in the video. Here's a bit more explanation from their article, page 1: "The efficacy of self-attention is attributed to its ability to route information densely within a context window, allowing it to model complex data. However, this property brings fundamental drawbacks: an inability to model anything outside of a finite window, and quadratic scaling with respect to the window length." There's also an interesting summary table of model evaluation (Zero-shot Evaluation, page 13) comparing different Mamba model sizes with GPT-2, the H3 Hybrid model, Pythia, and RWKV, where in each instance Mamba exceeds these models' performance (check out the accuracy values for each dataset, especially for the Mamba 2.8-billion-parameter model; it is truly unique). And thanks for watching :)

  • @Kutsushita_yukino
    @Kutsushita_yukino · 4 months ago

    Let's goooo, Mamba basically has memory similar to humans. But brains do tend to forget when the information is unnecessary, so there's that.

    • @analyticsCamp
      @analyticsCamp · 4 months ago

      That's right. Essentially, the main idea behind SSM architectures (e.g., having a hidden state) is to be able to manage the flow of information in the system.

  • @70152136
    @70152136 · 4 months ago

    Just when I thought I had caught up with GPTs and Transformers, BOOM, MAMBA!!!

  • @yuvrajsingh-gm6zk
    @yuvrajsingh-gm6zk · 4 months ago

    keep up the good work, btw you got a new sub!

  • @nidalidais9999
    @nidalidais9999 · 4 months ago

    I liked your style and your funny personality

    • @analyticsCamp
      @analyticsCamp · 4 months ago

      Thanks for watching, I love your comment too :)

  • @MrJohnson00111
    @MrJohnson00111 · 4 months ago

    You clearly explain the difference between the Transformer and Mamba, thank you, but could you also give the reference for the paper you mention in the video so I can dive in?

    • @analyticsCamp
      @analyticsCamp · 4 months ago

      Hi, glad the video was helpful. The reference for the paper is also mentioned multiple times in the video, but here's the full reference for your convenience: Gu & Dao (2023). Mamba: Linear-Time Sequence Modeling with Selective State Spaces.

  • @viswa3059
    @viswa3059 · 4 months ago

    I came here for giant snake vs giant robot fight

  • @Researcher100
    @Researcher100 · 4 months ago

    Thanks for the effort you put into this detailed comparison, I learned a few more things. Btw, the editing and graphics in this video were really good 👍

  • @optiondrone5468
    @optiondrone5468 · 4 months ago

    I'm enjoying these mamba videos you're sharing with us, thanks

  • @zagoguic
    @zagoguic · 4 months ago

    Great video! Keep making them!

  • @ln2deep
    @ln2deep · 4 months ago

    It's a bit unclear to me how the Mamba architecture works recurrently when looking at the architecture diagram at 5:30. What is the input here, the whole sequence or individual tokens? Surely it'd have to be the whole sequence for Mamba to build a representation recurrently. But then it seems strange to have a skip connection on the whole sequence. I think I've missed something.

    • @analyticsCamp
      @analyticsCamp · 4 months ago

      Hi, thanks for your comment. I mentioned that delta discretizes the input as the word sequence into tokens, ..., and that, at every step of the hidden-state update, it takes into account the previous hidden state and the 'current input word'. I'll try to make an update on this, maybe reviewing the entire article if I can. Please do let me know if you are interested in any particular topic for a video.

  • @optiondrone5468
    @optiondrone5468 · 4 months ago

    Thanks for this video, keep up the good work.