Fine Tuning ChatGPT is a Waste of Your Time

  • Published 12 Sep 2024
  • Fine-tuning isn't a good fit for many problems and teams. Today we discuss why fine-tuning has limitations and why alternative approaches might be better for you, despite how major companies talk about AI. We also glimpse an exciting field of study that is yet to be fully explored!
    OpenAI Fine Tuning - platform.opena...
    AWS re:Invent presentation - • AWS re:Invent 2023 - C...
    Generative Agents paper - arxiv.org/pdf/...
    Give us a follow on Stable Discussion: blog.stabledis...

Comments • 61

  • @BradleyKieser
    @BradleyKieser 9 months ago +17

    Very good explanation and excellent thinking. However, the problem is that context windows are not normally big enough to take all the data. This is why fine-tuning is an important part of the mix. The correct usage is a balance between long-term data going into fine-tuning and short-term data going into RAG. There will soon be a type of job specifically around this sort of data architecture.

  • @CitizenWarwick
    @CitizenWarwick 4 months ago +3

    We had a well-crafted GPT-4 prompt with many tests covering our desired outputs. We took GPT-3.5 and fine-tuned it, and now it's performing the same. Worked well for our use case!

    • @YanMaosmart
      @YanMaosmart 3 months ago

      Can you share how many examples you used to fine-tune? I used around 200 examples but the fine-tuned model still doesn't work quite well.

    • @CitizenWarwick
      @CitizenWarwick 3 months ago

      @@YanMaosmart Around 600, though I guess success depends on the expected output; we output JSON and our prompt is conversational.
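
For reference, OpenAI's fine-tuning API for gpt-3.5-turbo takes chat-formatted JSONL, one training example per line. Below is a minimal sketch of building one such example for a JSON-output use case like the one in this thread; the system/user/assistant content is hypothetical:

    import json

    # One chat-formatted training example for OpenAI fine-tuning.
    # The message content below is made up for illustration.
    example = {
        "messages": [
            {"role": "system",
             "content": "Reply only with JSON matching {\"summary\": str, \"tags\": [str]}."},
            {"role": "user",
             "content": "Customer reports that checkout times out under heavy load."},
            {"role": "assistant",
             "content": "{\"summary\": \"Checkout times out under load\", \"tags\": [\"performance\"]}"},
        ]
    }

    # The training file is JSONL: one example object per line.
    with open("train.jsonl", "a") as f:
        f.write(json.dumps(example) + "\n")

Reports like the ~600-example figure above suggest a few hundred well-chosen examples can be enough when the target format is narrow.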

  • @tomski2671
    @tomski2671 9 months ago +4

    Relaying what works and what doesn't is highly valuable. Too few people share their experience. Thank you.
    Training/fine-tuning is a very delicate process; it has to be done really well to get really good results. Moreover, it's not a well-understood process: new discoveries are constantly being made, even at the highest levels of research.

    • @breadcho7446
      @breadcho7446 6 months ago

      The problem is that fine-tuning GPT, for example, is a black box.

  • @techracoon7180
    @techracoon7180 12 days ago +1

    Cool, but fine-tuning is a necessary tool if you want to lock domain-specific information that doesn't change frequently into the model while freeing up the context window for more dynamic content. An example: I want to make an AI model that generates quests in a game. For this I need to fine-tune the model on the basics of the game universe and such, freeing up the context window for the information coming from the game world, such as the population of each territory, which faction controls which places, the user's location and progress, etc.

    • @StableDiscussion
      @StableDiscussion  10 days ago +1

      Thanks for the comment; however, I'm unconvinced that it's a good idea for locking in a domain unless you have a very specific way you want it to answer. Say, in your example, you want it to structure quests in a specific way that has enum values or other formatting that needs to be adhered to. That could be a good use of fine-tuning, but you might see a drop in overall quest creativity.
      I'd find using a RAG-like approach to only pull in context about the world at quest-generation time to be a better and more scalable approach (a minimal sketch follows this thread). You are in control of the factors and can adjust and change how you add context as you tune the game you're creating.
      This pushed me to summarize and put out another post on this topic which leverages your example in some of my thinking: czcams.com/video/ZI0ujkLhlCY/video.html

    • @techracoon7180
      @techracoon7180 10 days ago +1

      @@StableDiscussion
      Thank you for your reply. As I understand from your explanation, you are saying that with fine-tuning I would be unable to lock the extra domain-specific data into the model; I would only be able to teach it subtle formatting and similar things. I agree with that after some investigation, and given that I don't have labeled data for it, a RAG approach would fit this use case much better indeed. Thank you for the clarification; good videos overall.
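
A minimal sketch of the retrieval-at-generation-time approach discussed in this thread. Everything here is hypothetical: the world facts, the `get_world_facts` helper, and the ranking would come from whatever store (vector database, game database) holds the live world state:

    # Assemble a quest-generation prompt from retrieved world state
    # instead of baking the game universe into the model weights.

    def get_world_facts(query: str, k: int = 3) -> list[str]:
        # Stand-in corpus; a real system would rank stored chunks
        # by similarity to `query`.
        facts = [
            "The Iron Pact controls the northern mines.",
            "Riverton's population fell to 1,200 after the flood.",
            "The player completed the 'Silver Road' quest chain.",
        ]
        return facts[:k]

    def build_quest_prompt(player_location: str) -> str:
        context = "\n".join(f"- {fact}" for fact in get_world_facts(player_location))
        return (
            "You generate quests for our game. Use only this world state:\n"
            f"{context}\n"
            f"Create one quest for a player currently in {player_location}."
        )

    print(build_quest_prompt("Riverton"))

Because the facts are fetched per request, population counts or faction control can change without retraining anything.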

  • @YuraCCC
    @YuraCCC 9 months ago +3

    Good explanation. However, it looks like these two techniques are not mutually exclusive; e.g., it could still be valuable to fine-tune a model to improve its processing of RAG generations without any specific data, while the RAG mechanism supplies all the data for each specific generation.

    • @StableDiscussion
      @StableDiscussion  9 months ago +1

      Thanks for the comment!
      That’s true and a good point. The most basic example is formatting responses, but there could be other opportunities that don’t necessarily look to provide data and instead augment or support the generation. That’s a really interesting space and a topic I’d love to learn more about.

  • @injeolmi6
    @injeolmi6 8 months ago +2

    Thank you for making this video. I remember talking to my friends about a similar concept a few months ago; now I finally know I was not alone! RAG seems like the thing most AI services should have by default.

    • @StableDiscussion
      @StableDiscussion  8 months ago +1

      Glad it was helpful! We’re hoping to continue and expand on this thinking in future videos

  • @adrianmoisa2281
    @adrianmoisa2281 9 months ago +3

    Excellent description of the challenges in fine-tuning AI models! You got yourself a new subscriber 🎉

  • @arthurguiot8897
    @arthurguiot8897 9 months ago +1

    Wow, what quality! You won another sub :) Soon you'll be big, I can see that. Keep working hard!

  • @JoshKaufmanstuff
    @JoshKaufmanstuff 9 months ago +2

    Great video!
    What is the whiteboard app that you are using?

  • @aldotanca9430
    @aldotanca9430 9 months ago +1

    Currently I am planning and testing a project which will rely heavily on RAG, and I think I will also have to consider fine-tuning because of the way I need the model to format, reference, and present information from multiple documents. Still wrapping my head around how to produce the training data, but at the moment my impression is that, at least in my case study (a specialized and niche knowledge base about music and musical research), even RAG requires quite a bit of work to fragment the documents in ways that guarantee reliable retrieval (a rough chunking sketch follows this thread).

    • @StableDiscussion
      @StableDiscussion  9 months ago +1

      Absolutely! We did a video just a little while ago about how custom chunking in RAG helps you improve retrieval: czcams.com/video/LHuWSGYuG4M/video.html
      Fine-tuning might be what you need, but it’s an optimization rather than a first step. That doesn’t necessarily exclude it from being a valuable piece of the picture, though!

    • @aldotanca9430
      @aldotanca9430 8 months ago +1

      @@StableDiscussion thanks, I was a bit buried in study and missed your reply. I will check it out!
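
A rough sketch of the document fragmentation discussed in this thread: splitting each source document into overlapping chunks so retrieval can land on the right passage. The size and overlap values are arbitrary illustrations, not recommendations:

    # Split a document into overlapping character chunks for embedding
    # and retrieval. Tuning size/overlap per corpus is a large part of
    # the work the comment above describes.

    def chunk(text: str, size: int = 500, overlap: int = 100) -> list[str]:
        step = size - overlap
        return [text[start:start + size]
                for start in range(0, max(len(text) - overlap, 1), step)]

    document = "Musical research notes... " * 200  # stand-in source text
    pieces = chunk(document)
    print(f"{len(pieces)} chunks of up to 500 characters")

Splitting on structural boundaries (sections, paragraphs, references) rather than raw character counts often retrieves more reliably for citation-heavy material like the knowledge base described above.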

  • @kingturtle6742
    @kingturtle6742 6 months ago +2

    Can the content for training be collected from GPT-4? For example, after chatting with GPT-4, can the desired content be filtered and used to fine-tune GPT-3.5? Is this approach feasible and effective? Are there any considerations to keep in mind?

    • @dawoodnaderi
      @dawoodnaderi 4 months ago

      All you need for fine-tuning is samples of "very" desirable outcomes/responses. That's it. It doesn't matter where you get them from.
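
A sketch of the workflow this thread describes: filtering saved GPT-4 exchanges into a fine-tuning file for a smaller model. The input file, its fields (`prompt`, `response`, `rating`), and the quality threshold are all hypothetical:

    import json

    # Keep only highly rated GPT-4 exchanges and rewrite them in the
    # chat-formatted JSONL that fine-tuning expects.
    with open("gpt4_chats.jsonl") as src, open("train.jsonl", "w") as dst:
        for line in src:
            record = json.loads(line)
            if record.get("rating", 0) >= 4:  # hypothetical quality filter
                example = {"messages": [
                    {"role": "user", "content": record["prompt"]},
                    {"role": "assistant", "content": record["response"]},
                ]}
                dst.write(json.dumps(example) + "\n")

The filtering step is the part that matters: as the reply above notes, fine-tuning only pays off when every retained sample shows the behavior you want.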

  • @user-qh4ze7xq4f
    @user-qh4ze7xq4f 9 months ago +2

    I wonder how the performance of RAG will vary as the generative and retrieval processes become more integrated. Seems like it would be difficult to optimise, plus more expensive computationally. Definitely the way forward, though.

  • @gopinathl6166
    @gopinathl6166 8 months ago +2

    I would like to get your advice on creating a conversational chatbot. Would RAG or fine-tuning be suitable? We have a court-law dataset that contains thousands of PDFs: an unstructured collection of paragraphs.

    • @zainulkhan8381
      @zainulkhan8381 5 months ago

      Hello, I am also trying to feed PDF data as input to OpenAI. It's an unstructured set of data, and the AI is not able to process it correctly: when I ask it to list the transactions in a PDF, it generates garbage values rather than the actual values in the PDF. I am tired of tweaking prompts, so I am looking to fine-tune now.

    • @zainulkhan8381
      @zainulkhan8381 5 months ago

      Did you achieve the results you wanted from the operations you were doing on your PDFs?

  • @keithprice3369
    @keithprice3369 9 months ago +12

    I'm far from an expert, but I think at least part of the challenge is that people think fine-tuning is for giving the LLM more DATA, increasing its knowledge base. That's not what fine-tuning is for. It's for customizing the WAY it responds. It's more of a style guide than a knowledge store.

    • @StableDiscussion
      @StableDiscussion  9 months ago +2

      I think this is largely because of how we see OpenAI and other companies train their models off of data. It’s not a clear separation, but I agree that is the prevailing opinion on where fine-tuning fits. If so, I still question how useful fine-tuning will be for unexpected prompts, and whether it gets stuck in ruts or correctly adapts to the situation it’s presented with.

    • @rafaeldelrey9239
      @rafaeldelrey9239 8 months ago +5

      There is a general misunderstanding of fine-tuning vs RAG. Fine-tuning is used to teach patterns of questions and answers, not to add new data to a model.

    • @gemini22581
      @gemini22581 3 months ago

      What do you mean? You train it on additional data, which is then used to tailor contextual responses to questions around the training set. How is this not adding to the LLM's existing knowledge pool?

    • @gemini22581
      @gemini22581 3 months ago

      @@rafaeldelrey9239 Yes, but it answers questions around the questions and answers it has been trained on. Why is this not considered adding to the existing knowledge base of the LLM?

    • @user-eh4ke5wm2f
      @user-eh4ke5wm2f 2 months ago +1

      One downside of RAG that no one is comparing is the slow response time, which in turn increases cost. No right answer here. If you're looking for up-to-date responses and are only serving 1-10 people, go for RAG. If you're looking to serve over 100 concurrent users at the lowest cost, go for fine-tuning. It's a trade-off between cost, accuracy, and time to production.

  • @MaxA-wd3qo
    @MaxA-wd3qo 6 months ago +1

    Why, why such a tiny number of subscribers? This approach to problems is very much needed: someone to say "wait a minute... here are the stones on the road."

  • @ominoussage
    @ominoussage 9 months ago +8

    I'm not an expert in AI topics, but I really do think the only thing we need is an AI that can just understand, with RAG on top of everything else.
    Great and insightful video!

    • @PorkBoy69
      @PorkBoy69 8 months ago +3

      "just understand" is carrying a LOT of weight here

    • @JaapvanderVelde
      @JaapvanderVelde 8 months ago +1

      The problem of 'just understand' is really the problem at the core of AGI. If we solve that, we won't need LLMs (unless they're part of the solution, of course :)).

  • @joshmoracha1348
    @joshmoracha1348 9 months ago +1

    Nice video, dude. What is that app you are using to visualize your message?

    • @StableDiscussion
      @StableDiscussion  9 months ago

      Thanks! Glad you liked it!
      Excalidraw is what I use for all the diagrams that help me explain things

  • @korbendallasmultipass1524
    @korbendallasmultipass1524 5 months ago

    I would say you are actually looking for embeddings. You can set up a database of embeddings based on your specific data, which will be checked for similarities. The matches would then be used to create the context for the completions API. Fine-tuning is more about modifying the way it answers. This was my understanding.
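
A minimal sketch of that embeddings flow: embed the stored snippets and the query, rank by cosine similarity, and use the best match as context for the completion call. The `embed` function here is a toy stand-in for a real embeddings API, and the snippets are hypothetical:

    import math

    def embed(text: str) -> list[float]:
        # Toy embedding (character frequencies), for illustration only;
        # a real system would call an embeddings model.
        return [text.lower().count(c) / max(len(text), 1) for c in "abcdefghij"]

    def cosine(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb) if na and nb else 0.0

    store = ["Refunds take 5 business days.", "Shipping is free over $50."]
    query = "How long do refunds take?"

    # The best-matching snippet becomes context for the completions API call.
    best = max(store, key=lambda s: cosine(embed(s), embed(query)))
    print(best)

A vector database does the same ranking at scale; the completion prompt then combines the retrieved snippet with the user's question.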

  • @user-du8hf3he7r
    @user-du8hf3he7r 9 months ago +1

    Training augments behaviour, RAG augments information - they are two different things.

  • @zalzalahbuttsaab
    @zalzalahbuttsaab 9 months ago

    5:22 When you started talking about the context window problem, I did think about indexing. I suppose an AI is a sort of index, but it is more dynamic than a traditional database. Setting up a session database would effectively solve the context issue.

    • @StableDiscussion
      @StableDiscussion  9 months ago +2

      For an AI, the search space has deep dimensionality in order to capture language semantics, and the context size is largely the issue rather than indexed search. Vector databases are the best at tracking this space and calculating similarity in a number of ways based on queries. But there are definitely ways to leverage traditional databases to provide context as well. Any form of retrieval opens a large space of possibility.

  • @DJPapzin
    @DJPapzin 8 months ago +2

    🎯 Key Takeaways for quick navigation:
    00:00 🎯 *Fine-tuning Overview*
    - Fine-tuning is a technique to personalize AI models.
    - It's data-intensive and currently a popular trend in the AI community.
    - Major AI companies, including OpenAI, are emphasizing fine-tuning.
    01:01 🤔 *Why Fine-Tune?*
    - Fine-tuning addresses limitations in AI's memory space and context windows.
    - Challenges arise when context exceeds the model's memory, leading to information loss.
    - AI enthusiasts and companies advocate fine-tuning for more personalized responses.
    02:35 ⚠️ *Challenges of Fine-Tuning*
    - Defining relevant training data is complex, considering unknowns in the model's knowledge.
    - Overtraining is a significant challenge, leading to rigid responses and missing diverse solutions.
    - Difficulty in determining what the model lacks in knowledge and how to supplement it.
    05:19 🔄 *RAG (Retrieval Augmented Generation)*
    - RAG breaks down related data into manageable chunks, overcoming context window issues.
    - It enables searching for specific chunks relevant to the question, improving answer quality.
    - RAG allows continuous updates to data chunks, providing flexibility compared to fine-tuning.
    06:51 🛡️ *Security Considerations*
    - Fine-tuning and AI interactions may expose proprietary information and data vulnerabilities.
    - RAG offers stronger control over which documents are sent to specific users, enhancing security.
    - The ability to control data distribution to users provides additional security benefits.
    08:21 🌐 *Future Possibilities of RAG*
    - RAG opens up exciting possibilities, such as developing autonomous agents with perception and planning capabilities.
    - The potential for optimizing RAG for various situations makes it a promising area.
    - RAG's flexibility and adaptability make it a more compelling option compared to fine-tuning.
    09:18 🎙️ *Conclusion and Call to Action*
    - RAG offers more potential than fine-tuning, especially in terms of data curation and understanding.
    - A glimpse into the fascinating space of RAG and its diverse applications.
    - Encouragement to follow Stable Discussion for more insights and discussions on AI.
    Made with HARPA AI

  • @droidtafadzwa5545
    @droidtafadzwa5545 a month ago +1

    You got yourself a new subscriber

  • @Arashiii87
    @Arashiii87 9 months ago +2

    I am very new to this AI field; thank you very much for explaining in simple terms!

  • @tijldeclerck7772
    @tijldeclerck7772 9 months ago +2

    Loved this explanation. Subscribed.

  • @cyclejournal9459
    @cyclejournal9459 8 months ago

    Would that be different with the recently introduced custom GPTs, which allow you to personalize your model based on your specific instructions and provide it with your own contextual documents for reference?

    • @StableDiscussion
      @StableDiscussion  8 months ago +1

      It’s similar; however, there are a number of limitations to using custom GPTs compared to using the API with a customized data source.
      We talk about this briefly here: czcams.com/video/SCeqWFjBGjE/video.htmlsi=MfIF0RPBH5tGdOr7
      We also have a post on our blog about gpts more specifically: blog.stablediscussion.com/p/are-gpts-a-marketing-gimmick?

  • @GDPLAYz155
    @GDPLAYz155 9 months ago

    So, if I understand correctly, you suggest adding a value to the JSON object, which is a chunk of data, and sending it with another set of data for fine-tuning the GPT model, like questions and answers. Am I right, or does it require a different process?

    • @breadcho7446
      @breadcho7446 6 months ago

      This is usually done with vector stores, where the data is encoded into embeddings.

  • @-Evil-Genius-
    @-Evil-Genius- 9 months ago +4

    🎯 Key Takeaways for quick navigation:
    00:00 🤖 *Understanding Fine Tuning in AI*
    - Fine tuning is a technique to customize AI models, gaining popularity in the AI community.
    - Major AI companies like OpenAI and AWS focus on making fine tuning more accessible.
    - The appeal of fine tuning arises from addressing the limitations of AI models, particularly in handling context and relevant information.
    02:35 🧠 *Challenges of Fine Tuning and Overtraining*
    - Defining training data for fine tuning is challenging due to the difficulty in understanding what the model lacks.
    - Overtraining poses a significant challenge, making the model rigid and less adaptable to changes.
    - The need for a representative set of data that mirrors real-world scenarios to avoid overtraining pitfalls.
    05:19 🔄 *RAG (Retrieval Augmented Generation) as an Alternative*
    - Retrieval Augmented Generation offers a more flexible approach by breaking information into smaller, manageable pieces.
    - Using smaller chunks allows for better management of context window problems in AI.
    - Updating and modifying information chunks becomes easier compared to the fixed nature of fine-tuned models.
    06:51 🔐 *Security Concerns in Fine Tuning and RAG*
    - Security issues arise in fine tuning as users can extract data about the training process and model's responses.
    - Retrieval Augmented Generation provides better control over which documents go to specific users, enhancing security.
    - The ability to control and restrict the knowledge base of AI systems based on user requirements.
    08:21 🌐 *Future Possibilities with Retrieval Augmented Generation*
    - Retrieval Augmented Generation opens up diverse possibilities, such as developing autonomous agents with brain-like patterns.
    - The potential for AI systems to perceive, plan, reflect, and act based on stored details about their environment.
    - An exploration of the vast capabilities within the space of Retrieval Augmented Generation compared to the limitations of fine tuning.
    Made with HARPA AI

  • @scottvickrey2743
    @scottvickrey2743 a month ago +1

    Thanks for your in-depth explanation. It made an impact.

  • @BernardMcCarty
    @BernardMcCarty 8 months ago

    Thank you. Your clear explanation of RAG was very useful 👍

  • @christinawhisler
    @christinawhisler 6 months ago

    Is it a waste of time for novelists too?

  • @quick24
    @quick24 8 months ago +1

    Am I the only one here surprised to find out that Jack Black is an AI expert?

  • @protovici1476
    @protovici1476 8 months ago

    This video and its opinion are fairly incorrect in regard to fine-tuning. In particular, fine-tuning can be applied across deep learning approaches (i.e. generative AI, discriminative AI, BERT, NLP) with any data set, and under self-supervised, supervised, or reinforcement learning, just to name a few classes of algorithms for solving a problem. CZcams fine-tuning its own algorithm made me stumble upon this video. I highly recommend re-evaluating this video to save folks from misunderstanding.

  • @tecnopadre
    @tecnopadre 7 months ago

    Sorry, but why then is it a waste of time? It wasn't clearly or conclusively explained, as far as I listened.

  • @MrAhsan99
    @MrAhsan99 5 months ago

    Thanks for the insight.

  • @rfilms9310
    @rfilms9310 5 months ago

    "you to curate data to feed AI"