Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution (Paper Explained)

  • Published 23 Aug 2024

Comments • 110

  • @Thanos916
    @Thanos916 10 months ago +5

    I asked Claude to give me an example based on the PDF:
    Claude: "Here is a hypothetical example of how Promptbreeder could be used to automatically evolve better prompts for a question answering system:
    Let's say we have a QA system that uses a large language model to answer trivia questions. Its current prompt is simply:
    "Q: What is the capital of France?
    A:"
    We want the system to give more explanatory and conversational answers.
    With Promptbreeder, we could provide an initial population of prompts like:
    "Q: What is the capital of France?
    A: Let me think about that. France is located in Western Europe and has a long history dating back to ancient times. It has had many important cities over the centuries that have served as capitals. Based on my knowledge, the current capital of France is [ANSWER]."
    Promptbreeder would then repeatedly evaluate these prompts on a dataset of training questions, pick the best ones, mutate them, and evolve the population over generations.
    The prompts could be mutated by paraphrasing, changing structure, adding or removing explanation, etc. For example, mutating the above prompt could give:
    "Q: What is the capital of France?
    A: Good question! Let me consult my memory bank about France. It is one of the largest countries in Europe with many major cities over its history. But in modern times, the capital city is [ANSWER]. Let me know if you need any other details!"
    Over multiple generations, Promptbreeder would evolve prompts that produce more conversational and explanatory answers, outperforming the original hand-crafted prompt.
    The key advantage is that domain knowledge is not required to engineer good prompts - the system automatically explores variations and finds effective ones for the task through self-referential evolution."
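The loop Claude sketches can be written as a minimal generational evolutionary algorithm. This is a toy illustration: `llm` and the `fitness` scorer are deterministic stubs of my own, not Promptbreeder's actual components.

```python
def llm(prompt: str) -> str:
    # Stand-in for a real LLM call; it deterministically "paraphrases"
    # a prompt by appending an explanatory clause.
    return prompt + " Let me explain my reasoning."

def fitness(task_prompt: str) -> int:
    # Toy proxy: reward prompts that ask for explanation. A real system
    # would score the LLM's answers on training questions against ground truth.
    return sum(kw in task_prompt.lower() for kw in ("think", "explain", "reasoning"))

def evolve(population: list, generations: int = 3) -> str:
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        survivors = ranked[: max(1, len(ranked) // 2)]  # selection
        children = [llm(p) for p in survivors]          # mutation via the "LLM"
        population = survivors + children
    return max(population, key=fitness)

best = evolve(["Q: {question}\nA:",
               "Think step by step. Q: {question}\nA:"])
print(best)
```

With the stubs above, the prompt that already encourages reasoning survives selection and accumulates the explanatory clause.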

    • @SaberXrod
      @SaberXrod 8 months ago

      Amazing. Yet: "What is the capital of France?" Would asking "Where is the capital of France?" improve this? Or perhaps "What is the name of the capital of France?", where it should give us the capital of France 🤔

  • @sndkfnsknlfdsklnsdk1070
    @sndkfnsknlfdsklnsdk1070 10 months ago +6

    Thanks! My 2 cents: this would be much easier to understand with a specific example from start to end.

  • @wolpumba4099
    @wolpumba4099 10 months ago +32

    *Summary*
    *Introduction and Overview*
    - 0:00: Introduction of the paper topic on "Prompt Breeder," a Google DeepMind system for evolving prompts in large language models.
    - 0:18: Explains that the system auto-generates prompts instead of using human-generated ones.
    - 0:44: Describes the evolutionary algorithm used for prompt generation.
    *Core Mechanisms*
    - 0:57: Claims of in-model "mutation and fitness evaluation."
    - 1:09: Focus on the system's self-referential and self-improving attributes.
    - 1:22: Notes the complexity of managing all aspects within a large language model.
    *Existing Approaches and Their Limits*
    - 1:42: Expresses skepticism about the system's efficiency in solving prompt engineering issues.
    - 2:12: Mentions existing techniques for improving language model outputs.
    - 2:26: Criticizes traditional handcrafted prompt strategies.
    *Prompt Breeder Details*
    - 3:34: Introduces "Prompt Breeder," which automates prompt adaptation.
    - 4:10: Clarifies that evolutionary algorithms drive the mechanism.
    - 4:29: Debates the applicability of the term "Foundation Models."
    - 5:43: Reviews existing work in automated prompt engineering.
    *Algorithmic Enhancements*
    - 7:26: Proposes a diversity-maintaining algorithm to mitigate diminishing returns.
    - 8:10: References related work on the importance of diversity in open-ended learning.
    - 9:29: Presents experimental results supporting "Prompt Breeder."
    *Implementation Details*
    - 10:01: Notes the need for training data for effective prompt improvement.
    - 10:42: Explains the operational mechanics of "Prompt Breeder."
    - 11:08: Introduces "thinking styles" as a variable for improving diversity.
    *Two-Prompt System*
    - 17:18: Discusses the structure of using problem and mutation prompts.
    - 17:46: Describes how mutation and task prompts evolve together.
    - 18:20: Details how the fitness of mutation prompts is assessed.
    *Mutation Operators*
    - 20:14: Defines the population structure in the algorithm.
    - 21:09: Presents various forms of mutation operators.
    - 23:10: Elaborates on the scope of linguistic exploration through operators.
    *Criticisms and Limitations*
    - 27:04: Critiques the paper's specificity and lack of generalizability.
    - 28:34: Questions the need for training data in large language models.
    *Final Remarks*
    - 36:00: States the research's comparative advantages.
    - 37:32: Notes cost and complexity considerations.
    - 39:14: Expresses doubts on the method's general applicability.
    - 45:23: Concludes with a mixed assessment of the method's impact.
    *Positive Learnings*
    1. *Automated Prompt Generation:* "Prompt Breeder" automates the task of generating prompts, which can be a major advancement.
    2. *Evolutionary Algorithms:* The use of evolutionary algorithms provides a scalable and adaptive mechanism for prompt improvement.
    3. *In-Model Evaluation:* Performing mutation and fitness evaluation within the model itself is an innovative approach.
    4. *Diversity Maintenance:* The introduction of a diversity-maintaining algorithm could solve the problem of diminishing returns in prompt engineering.
    5. *Experimental Evidence:* The system has been backed by experimental results that demonstrate its effectiveness over existing methods.
    *Negative Learnings*
    1. *General Applicability:* There's skepticism about how generally applicable this method would be across different types of problems.
    2. *Complexity:* The system is complex, with everything managed within a large language model, raising questions about efficiency and computational costs.
    3. *Training Data Requirement:* Despite the advanced capabilities, the system still requires training data for effective prompt improvement.
    4. *Limited Comparative Study:* The introduction of a two-prompt system complicates direct comparisons with simpler, zero-shot prompting techniques.
    5. *Specificity Over Generality:* The paper is critiqued for being too task-specific, possibly limiting its wider applications.
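As a rough illustration of the "unit of evolution" described above (14:44 and 18:20): one unit couples a mutation prompt with the task prompts it mutates, and the mutation prompt is only ever scored by proxy through its task prompts' fitness. Field names here are my own, not the paper's.

```python
from dataclasses import dataclass, field

@dataclass
class EvolutionUnit:
    # One unit of evolution: a mutation prompt plus the task prompts
    # it is responsible for mutating.
    mutation_prompt: str
    task_prompts: list  # typically two task prompts per unit
    fitness: float = 0.0  # fitness of the *task* prompts on a training batch;
                          # the mutation prompt is judged only via this proxy

unit = EvolutionUnit(
    mutation_prompt="Rephrase the instruction to be more explicit.",
    task_prompts=["Solve the problem step by step.",
                  "State the final answer after the word ANSWER:"],
)
print(unit.mutation_prompt)
```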

  • @petevenuti7355
    @petevenuti7355 10 months ago +5

    I give this paper a grade of W.
    Let me explain.
    In my first year of college English 101, the teacher gave us a surprise essay: "you've got 15 minutes to write one page on any subject." I had no idea what to write about, so I wrote a self-referential essay about self-referential essays, since I was sitting there having to write an essay.
    When I got my grade back it was a W. I asked, "Is it that bad that you suggest I withdraw from the class?" She laughed and said no, so I asked, "What does the W stand for?"
    ... She said "WEIRD."
    I give this paper a W

  • @Kram1032
    @Kram1032 10 months ago +12

    Should note that 80% to 89% *is* quite a huge increase: It takes much much more work to get the last 5% than the first 75%.
    But yes, this is quite a complicated and hand-tinkered setup for achieving that.

    • @leonfa259
      @leonfa259 10 months ago +4

      Halving the errors is doubling the performance, that's great progress.
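To make the arithmetic concrete: moving from 80% to 89% accuracy shrinks the error rate from 20% to 11%, a roughly 45% relative reduction, which is close to the halving described.

```python
baseline_acc, new_acc = 0.80, 0.89            # hand-designed vs evolved prompt
baseline_err, new_err = 1 - baseline_acc, 1 - new_acc
relative_reduction = (baseline_err - new_err) / baseline_err
print(f"errors reduced by {relative_reduction:.0%}")  # roughly 45%
```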

  • @TimScarfe
    @TimScarfe 10 months ago +18

    I think this is a pretty good illustration of why neural networks can't do the high-level reasoning that is possible in recursive classical algorithms - this algorithm is basically an outer loop to explore a wider space of retrieval options with the inner LLM

    • @KP-sg9fm
      @KP-sg9fm 10 months ago +2

      Could you possibly explain that like I'm a dumbass?

    • @Levnerad
      @Levnerad 10 months ago

      @@KP-sg9fm fr

    • @giosasso
      @giosasso 10 months ago

      @KP-sg9fm ChatGPT is dumb dumb. Needs to be tricked for good outcomes. ChatGPT = dumb dumb

    • @jonathanpadilla3888
      @jonathanpadilla3888 10 months ago +3

      @@KP-sg9fm I think what he means is that the underlying reasoning of the model will never improve no matter how many different ways you try prompting it; the breadth of your approaches may improve results on some questions but will never answer all of them.
      The inner loop is fixed: its reasoning capabilities are not able to handle more complex things like architecting an entire SaaS company with backend and frontend, no matter how much context you feed it and how many agents you provide.
      Sam Altman recently stated GPT-4 may be so good at reasoning, superhuman level - why? Because arguments are breadth-oriented, but math is deeply reasoning-oriented.

  • @phaZZi6461
    @phaZZi6461 10 months ago +1

    whether or not the results were as good as was hoped for, publishing the research is in itself valuable

  • @tantzer6113
    @tantzer6113 10 months ago +11

    While it is true that the "meta-prompts" have been handcrafted, that is fine in a sense, since the user of this system (as opposed to the creators of this system) will not need to do any handcrafting. The downside, however, is that as the underlying LLM is upgraded or otherwise fiddled with, this system might lose some of its accuracy. This means that the paper has a shelf life. This shelf life may or may not be shorter than that of a system (for generating prompts) created without handcrafting.

    • @naromsky
      @naromsky 10 months ago

      I agree. For now.

    • @clray123
      @clray123 10 months ago +1

      The problem is that science is supposed to be theory building. But instead of developing a general theory (which enables predictions in specific cases), they seem to be doing purely empirical work to demonstrate that their approach works "better than others" given some set of preconditions. But the preconditions themselves are largely unknown due to the lack of theory. So the question is really: what has been achieved, in the sense of expanding our knowledge of the world, by such experimentation?

  • @dinoscheidt
    @dinoscheidt 10 months ago +2

    💅 I think the list of thinking prompts and prompt examples (which is a very annoying task and takes a lot of trial and error) is the most valuable part of the whole paper. I don't want to know how much time the researchers spent to gather and/or come up with all of these. Thank you

  • @210Artemka
    @210Artemka 10 months ago +5

    It's hilarious on several levels. Great job Google!👍

  • @nevokrien95
    @nevokrien95 10 months ago

    I love how you put this: "we are back in the world of training data". The current ML landscape is undergoing a paradigm shift, from the old train/test/metric/compare to whatever we are in right now.

  • @YinnonHaviv
    @YinnonHaviv 10 months ago +2

    9:17 - Dead end?
    Thanks for another great video!

  • @drdca8263
    @drdca8263 10 months ago +1

    The thing I’ve been imagining, is what about a system which generates, not just prompts for a given task, but instead (or also) produces *new tasks* (as in, not just e.g. “new instances of problems of the form ‘what is [x] plus [y]?’ “, but like, making up families of tasks which take in some parameter(s)), including prompts and/or programs for how to evaluate whether a proposed solution to (a given instance) of the task, is correct, or how good it is.
    So like, something that could generate things like “How many times does the following passage contain the string {slot for parameter 1}?”, “How many distinct roots does the polynomial {slot for parameter 1} have?”, “Which person was born first? {slot for parameter 1} or {slot for parameter 2}?”, “rewrite the following python code by defining a named function for each lambda expression that is used more than once, and replacing those lambda expressions with that function: {slot for parameter 1}”,
    etc.
    And, also produce either code or prompts which would generate appropriate values for the slots in the tasks,
    and also either code or prompts which, given the values that were used for each slot in the task, and the response to the task, produces an evaluation of the response to the task.
    I kind of imagine that this paper might be like,
    originally motivated as an attempt at part of the project I described?
    Like, if one wants to train something to do well on the tasks this setup generates, one would want to also produce a diverse collection of prompts for the task, which are still good in addition to being diverse?
    But actually, that’s probably not the original motivation for it. Or if it was, they really got off target.

  • @zellfaze
    @zellfaze 10 months ago +1

    I think at 44:52 it is paraphrasing a line from the Kalamas Sutta in Buddhism in prompt 0.

  • @imagiro1
    @imagiro1 9 months ago

    "Think step by step": I remember when I tried the "Lisa's father has three daughters" trick question: GPT first said what you would expect: it doesn't know the name of the third daughter. When given the hint "use logical thinking" or so, it got the right answer. That was already a few months ago, so it was at most GPT-3.

  • @OperationDarkside
    @OperationDarkside 10 months ago +4

    "One way street" (Einbahnstraße) is probably the phrase you were looking for.

    • @210Artemka
      @210Artemka 10 months ago +4

      Dead end street?

    • @OperationDarkside
      @OperationDarkside 10 months ago +1

      @@210Artemka Or that (Sackgasse in German). I didn't think about that possibility.

    • @byrnemeister2008
      @byrnemeister2008 10 months ago

      “Cul de sac” in English. Although it’s French!

  • @AlphaMoury
    @AlphaMoury 10 months ago +2

    Wouldn't it be great to make a video about CAMEL or ChatDev? I think those papers tackle the general purpose of solving the goals through different agent roles interacting with each other, avoiding burning out the idea of improving prompts via heuristics. Regards, and thanks for sharing all of this knowledge!

    • @naromsky
      @naromsky 10 months ago

      Tackle the purpose of solving the goal? That might actually be a very apt description.

  • @BrianMosleyUK
    @BrianMosleyUK 10 months ago +2

    Quite a critique, thanks 🙏👍

  • @fabriai
    @fabriai 10 months ago +1

    Fantastic walkthrough of the paper, Yannic.
    I always wonder about the software you use to draw and highlight.

  • @Soul-rr3us
    @Soul-rr3us 10 months ago +1

    GREAT PAPER DISCUSSION, YANNIC! (Shouting is all you need)

  • @TheLastVegan
    @TheLastVegan 10 months ago +1

    Exciting! I'm glad theory of mind via recursive prompt seeding is becoming mainstream and that diversity of thought is allowed!

  • @dennisestenson7820
    @dennisestenson7820 7 months ago

    41:30 I think the zero-order hyper-mutation operator was most "effective" because of the large range in variation of effectiveness among the "thinking styles".

  • @clray123
    @clray123 10 months ago +1

    Back in the old times we had "correctness by design". Today we have "incorrectness by design" with ML researchers fumbling around to make it look respectable.

  • @howardkong8927
    @howardkong8927 10 months ago +2

    I'm curious, how does Yannic work on computers with sunglasses on? I tried it myself but the sunglasses would happen to completely block the polarized light from my monitor when I'm sitting straight, making the monitor completely dark.

  • @roomo7time
    @roomo7time 10 months ago

    I love your video. Thanks always!

  • @a1k0n
    @a1k0n 10 months ago +4

    Has anyone tried doing backprop to the initial KV cache, and optimizing those vectors directly? So the initial prompt isn't made up of tokens at all, but non-token embedding vectors that optimize the downstream task when concatenated with the query.
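For reference, what this comment describes exists as prompt tuning / prefix tuning: the prompt is a set of free embedding vectors optimized by gradient descent against the downstream loss while the model stays frozen. A toy, self-contained sketch; the quadratic "task loss" here is a stand-in I made up for a real LM loss.

```python
import random

random.seed(0)

TOKENS, DIM = 4, 8   # 4 virtual tokens, embedding dimension 8

# Stand-in for "downstream task loss as a function of the soft prompt".
# In a real setup this would be the frozen LM's loss on the task, with
# these vectors concatenated in front of the query embeddings.
target = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(TOKENS)]

def task_loss(prompt):
    return sum((p - t) ** 2
               for rp, rt in zip(prompt, target)
               for p, t in zip(rp, rt))

def gradient(prompt):
    # Analytic gradient of the toy quadratic loss.
    return [[2 * (p - t) for p, t in zip(rp, rt)]
            for rp, rt in zip(prompt, target)]

# The soft prompt is a free continuous parameter, not tied to any token.
soft_prompt = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(TOKENS)]
initial_loss = task_loss(soft_prompt)

lr = 0.1
for _ in range(200):                       # plain gradient descent
    g = gradient(soft_prompt)
    soft_prompt = [[p - lr * gp for p, gp in zip(rp, rg)]
                   for rp, rg in zip(soft_prompt, g)]

print(task_loss(soft_prompt) < initial_loss)
```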

  • @kevinboles3885
    @kevinboles3885 10 months ago

    You said "I don't want to denigrate the paper..." quite late in your review. That you felt you had to say that is an admission that you had already done just that. And from my perspective you did exactly that REPEATEDLY throughout your discourse. I don't know a single thing about your education, experience, training, etc. with respect to AI/ML. But I am still 100% CERTAIN that you are utterly inadequate to treat this paper the way you did. I am that confident because I DO know the body of work (and the skills/education/experience/etc. it takes to be members) of the Google Brain team and DeepMind. Unless you believe that they are lying with their improvement measures, it came across as childish to me that you made the type of comments you did about the METHODS used to achieve those improvements. This is the second video of yours I have watched, and I was strongly considering subscribing. Nope - moving along here.

  • @dennisestenson7820
    @dennisestenson7820 7 months ago

    44:00 wow it went off in the weeds.

  • @th3ist
    @th3ist 10 months ago

    A 9% improvement in detecting hate speech... great.
    Solving protein folding helped humanity more. Can't we just do stuff like that instead?

  • @dr.mikeybee
    @dr.mikeybee 10 months ago +1

    This method can be used to create a training set for a policy net.

  • @christiangreisinger2339
    @christiangreisinger2339 10 months ago +3

    Is this the first paper that lied to an LLM to generate benefits? I feel like this concept could have a lot of advantages in various applications. To me this has a lot of similarities to how humans (especially politicians) communicate in order to generate benefits for themselves or their group of peers.

  • @bencoman
    @bencoman 10 months ago +2

    A prompt like "Are there any stakeholders or individuals who are directly affected by the problem?" is like a human in university being taught how to write a Functional Requirements document. It's like teaching a generally intelligent human a manual process like Edward de Bono's Thinking Hats to generate divergent answers. Although the original mutators and mutating process are manually crafted, what is important is whether the mutating process needs to be applied again to each final user's question, or whether all the mutators are packaged together with an LLM as a "whole LLM" itself, which applies the mutators to the final user's question in the background and brings those results together like a mixture of experts.

  • @RickeyBowers
    @RickeyBowers 10 months ago

    Although the tuning to ensure the fidelity of the genetic algorithm doesn't really bother me, I would have taken a more granular approach to solve complex problems with smaller models. For example, some of the basic problem-solving tools are: guess and check; work backward; incremental steps; solve a simpler related problem and extrapolate; break into sub-problems; etc. (maybe domain-specific mutations/styles). I would use these mutations to have the model create the context it needs to solve the problems.

  • @Hukkinen
    @Hukkinen 10 months ago +1

    I'd like to see this run for different types and sizes of LLMs, and what that would then tell us about the models or this framework 🤔

  • @thomasmuller7001
    @thomasmuller7001 10 months ago +7

    prompt comment

  • @sunnohh
    @sunnohh 10 months ago

    Methods in using statistical mush to better optimize the statistical mush created by the autocorrect mush algorithm

  • @craftymanish
    @craftymanish 10 months ago +5

    🎯 Key Takeaways for quick navigation:
    00:13 🧠 Basic idea: The paper introduces a system that automates the creation of prompts for large language models.
    00:42 🧬 Evolutionary Algorithm: The system evolves prompts using an evolutionary algorithm based on mutation and fitness evaluation.
    01:06 🔄 Self-Improving: The system is self-referential and self-improving, as it evolves its own method for generating improved prompts.
    01:48 🚧 Not a Complete Solution: The system shows promise but isn't the end-all solution; it moves the problem to another domain.
    02:14 🍎 Example-Based: Traditional prompt strategies like "Chain of Thought" can improve the performance of language models in tasks like math problems.
    03:21 🛠️ Handcrafted vs Automated: Earlier prompting techniques were manually engineered; Promptbreeder aims to automate this.
    04:16 🌱 Evolution of Mutation Prompts: The system also evolves the prompts used for mutations, making it self-referential.
    05:53 📉 Diminishing Returns: Previous efforts in automated prompt engineering hit a ceiling; this system aims to overcome that through diversity.
    08:04 🌈 Diversity for Improvement: Maintaining diversity in the evolved prompts can lead to overcoming the problem of diminishing returns.
    09:39 📊 Results: Promptbreeder outperforms other methods in various datasets in both zero-shot and fine-tuned scenarios.
    10:42 🎯 Task Execution: Promptbreeder uses training data of successful task executions to evolve better prompts.
    11:36 🤔 Promptbreeder uses different "thinking styles" as heuristics, not specific to any task.
    12:03 🎯 Task descriptions are very specific and guide what problem the system tries to solve.
    12:46 🔄 "Mutation prompts" help transform one prompt into another.
    13:49 🌱 Over time, both the prompts and mutation prompts evolve and improve.
    14:44 🧬 The system uses "units of evolution," each consisting of one mutation prompt and two task prompts.
    16:31 🎨 Multiple prompts can be interleaved to instruct the model and then bring the output into the correct format.
    18:20 📊 Evaluation involves checking the "fitness" of the mutation prompts by proxy-evaluating the prompts they produce.
    20:49 🔄 The population evolves by taking the "fit" units, mutating them, and then re-evaluating fitness.
    21:14 🛠️ Mutation operators can transform prompts or create new ones, sometimes using the associated mutation prompt.
    23:30 🔄 Discusses the mutation operators used in Promptbreeder for evolving prompts, focusing on changing the population of prompts each round based on their fitness.
    24:11 🌱 Introduces "zero-order" generation, where a completely new prompt is generated without the influence of existing prompts or mutation prompts.
    25:19 🧬 Explains "first order" prompt generation, where a mutation prompt is concatenated to an existing parent prompt to create a new, mutated prompt.
    26:29 🎣 Critiques the paper for its reliance on "fiddling around" with various parameters rather than taking a more hands-off, general approach.
    28:34 🎓 Describes the use of "sets of parents" for more complex mutation, along with filtering for diversity among existing prompts.
    30:14 📊 Details an ordered list mutation, but critiqued for tricking the model about the ordering to improve diversity.
    31:53 🎭 Points out the contradiction in the paper's stated aim to remove handcrafting, while employing various handcrafted techniques.
    33:11 🌳 Discusses lineage-based mutation where the language model is coaxed into continuing a list of prompts based on their evolutionary history.
    34:58 🧠 Promptbreeder uses thinking styles as meta mutation prompts to generate new mutation prompts.
    35:54 🧬 Techniques like prompt crossover and context text shuffling are applied, similar to general evolutionary algorithms.
    36:32 📈 In a hate speech classification problem, Promptbreeder evolved two sequentially applied long prompts that scored 89%, an improvement over a hand-designed prompt that scored 80%.
    37:51 🤔 Despite the complex setup, the gain is a 9 percentage point improvement, which equates to 50% fewer mistakes.
    38:19 📊 Evolutionary algorithms show improvement over time, as seen in the average fitness curves.
    40:38 ❓ The top-scoring mutation prompt is basic and questions whether the entire complex machinery is necessary.
    41:30 🔄 No single mutation operator dominates, implying that all types contribute positively.
    42:36 👍 First-order hyper mutation, where you mutate a mutation prompt, ranks third, which is seen as a positive outcome for the paper.
    44:00 🤨 The evolved prompts for math problems seem dubious in their effectiveness.
    45:28 🤷‍♂️ The actual results are mixed, leading to questions about the system's real-world applicability.
    Made with HARPA AI
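The operator taxonomy in the list above (zero-order at 24:11, first-order at 25:19, hyper-mutation at 42:36) can be sketched roughly as follows. The prompt templates and the `llm` stub are illustrative paraphrases of the video's description, not the paper's exact wording.

```python
import random

random.seed(1)

def llm(prompt: str) -> str:
    # Stand-in for an LLM call; a real system would sample a completion.
    return f"<completion of: {prompt!r}>"

def zero_order(task_description: str) -> str:
    # Fresh task prompt generated from the problem description alone.
    return llm(f"A list of 100 hints for: {task_description}\n1.")

def first_order(mutation_prompt: str, parent_prompt: str) -> str:
    # A mutation prompt concatenated to a parent task prompt.
    return llm(f"{mutation_prompt}\nINSTRUCTION: {parent_prompt}")

def hyper_mutation(thinking_style: str, mutation_prompt: str) -> str:
    # First-order hyper-mutation: mutate the mutation prompt itself.
    return llm(f"{thinking_style}\nImprove this instruction-mutator: {mutation_prompt}")

operators = [lambda: zero_order("solve grade-school math problems"),
             lambda: first_order("Make it more concise.", "Think step by step."),
             lambda: hyper_mutation("Be creative.", "Make it more concise.")]
child = random.choice(operators)()   # one operator sampled per reproduction step
print(child)
```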

  • @marcelbritsch6233
    @marcelbritsch6233 10 months ago

    excellent walkthrough
    thankyouuuuu

  • @toddnedd2138
    @toddnedd2138 10 months ago

    This seems to be a funny paper; I need to read it. In principle, one can additionally mutate over every other dimension, including the fitness function itself, and again over the resulting dimension. This can be continued indefinitely. The question is just how practical such a thing is. In the end, you have the 'perfect' prompt for a specific problem. Unless you have a flat rate for the LLM used and a lot of time, you will go broke beforehand.
    Retaining the least fit solutions is also common, to maintain diversity or elitism.

  • @markbosshard
    @markbosshard 10 months ago +1

    Hey Yannic, thanks a lot for the detailed walkthrough! Is an implementation available somewhere, to test the automatic prompt improvement and its use in practice? Guess that might interest most of us. Thanks a lot! :)

    • @harinir4608
      @harinir4608 10 months ago

      hii!! did you find anything about the implementation?

  • @fordstone6308
    @fordstone6308 10 months ago

    Was the term you were looking for “being trapped in a local optimum”?

  • @HappyMathDad
    @HappyMathDad 10 months ago

    I feel the action will move to knowledge graphs. Maybe review one of those papers???

  • @MrStarchild3001
    @MrStarchild3001 10 months ago

    Some of those generated prompts look wonderful to me! Not sure if I understand all of them.

  • @sgramstrup
    @sgramstrup 10 months ago

    Hm, I had the idea of getting rid of human intervention in the evolution of a cognitive architecture. I would generate a cognitive architecture by adding thoughts (short AI perspectives) and optimize the architecture by fitness. My thought was to let a genetic algorithm optimize for thoughts popping up at the correct time in a process, and to add specialists the same way. A cognitive process could start simple, but the system would reconfigure itself dynamically to the new situation.
    However, I'm not sure I quite understood this paper, or whether it was the same general idea they built on?

  • @jnevercast
    @jnevercast 10 months ago

    The end of a street, is a dead end 👍

  • @clray123
    @clray123 10 months ago

    Re shouting: maybe if we include direct threats and expletives in our prompts the LLMs will work even better?

  • @user-ku3uy7ou3t
    @user-ku3uy7ou3t 8 months ago

    have you looked into comparing with ProTeGi?

  • @ScottVanKirk
    @ScottVanKirk 10 months ago

    Hey, can you do a vid about what's happening with personal assistant?

  • @Tehom1
    @Tehom1 10 months ago +1

    Interesting, thanks for sharing it. I have to wonder if some of the stranger results are just flukes. Of course they evaluate it on a different data set, but if they try a zillion things some are going to do well just by chance.

    • @_rockt
      @_rockt 10 months ago +1

      Things doing well by chance is the point of any evolutionary approach. The important part is to make sure ML 101 principles are adhered to and you have a clean train/test split to test generalization.

    • @Tehom1
      @Tehom1 10 months ago

      @@_rockt No, and I think you missed the point.

  • @jtjames79
    @jtjames79 10 months ago +18

    I was literally just telling my wife about this. 😂

    • @saminchowdhury7995
      @saminchowdhury7995 10 months ago +2

      You guys are researchers?

    • @jtjames79
      @jtjames79 10 months ago +15

      @@saminchowdhury7995 Yes. It's one of our hobbies. We share white papers instead of TikToks, or whatever the kids are doing these days.

    • @saminchowdhury7995
      @saminchowdhury7995 10 months ago +10

      @@jtjames79 Power couple 💪💪

    • @potatodog7910
      @potatodog7910 10 months ago +4

      W

    • @MagicJoshua
      @MagicJoshua 10 months ago +4

      @@jtjames79 WHAT A TIME TO BE ALIVE! ❤️

  • @naromsky
    @naromsky 10 months ago

    Prompts all the way down. 🐢

  • @Doug97803
    @Doug97803 10 months ago

    Cul-de-sac, appropriated from French.

  • @JahMusicTube
    @JahMusicTube 10 months ago

    Party game: watch this video and take a shot every time you hear the word "prompt".

  • @ixion2001kx76
    @ixion2001kx76 10 months ago

    9:17 rut... stuck in a rut

  • @ahmadalis1517
    @ahmadalis1517 10 months ago +2

    How do you define a foundation model?

    • @zerotwo7319
      @zerotwo7319 10 months ago +2

      Probably a model that can be refined, like Stable Diffusion models

  • @almoni127
    @almoni127 10 months ago +1

    30:58 Imagine a future LLM trained on this paper that therefore knows it is being lied to... 😆

  • @pensiveintrovert4318
    @pensiveintrovert4318 10 months ago

    Try random thoughts, and we can experience confirmation bias and think the result is wonderful.

  • @tempacc9589
    @tempacc9589 10 months ago

    "Prompt engineers" in shambles😂

  • @JorgetePanete
    @JorgetePanete 10 months ago +4

    mutant

  • @mequavis
    @mequavis 10 months ago

    This is literally, in a way, how I have eve on my channel making prompts and telling stories; I'm just doing it a bit sloppily.

  • @Kerrosene
    @Kerrosene 10 months ago

    In all fairness, a sales pitch has to be made for acceptance or better reviewer ratings these days... then you handle the reviewers one by one.

  • @twobob
    @twobob 10 months ago +1

    Cul de sac. Pretty sure it's not English though ;) We stole it from France. "Dead end" is probably the best term.

  • @khalilsabri7978
    @khalilsabri7978 10 months ago

    The results are weird. Don't they cherry-pick the best examples? I'm getting confused; it seems like the idea did not work as well as expected, and that's it.

  • @cerealpeer
    @cerealpeer 10 months ago

    🤩🥳🤯

  • @flipmarley9973
    @flipmarley9973 10 months ago

    So AI will take over "prompt engineer" jobs as well

  • @christiangreisinger2339
    @christiangreisinger2339 10 months ago

    I am not too convinced about that

  • @TiagoTiagoT
    @TiagoTiagoT 10 months ago

    Kinda sounds a bit like they wrote the conclusion they expected, with strong flourishes, then ran the experiment and attached the data without weighing how much the data matches the expected conclusions, or even actually analyzing the data to see if it produced any conclusions by itself. Though this is based on how it was presented in this video; I have not read the paper myself.

  • @freedom_aint_free
    @freedom_aint_free 10 months ago +3

    Pardon my ignorance, but as long as a method does not alter the weights and biases of the neural networks involved, can it be called "self-improving"?*
    * At least not recursively self-improving (the golden goal of AI), for it will remain bounded by the particular neural network architecture.

  • @trackingzone
    @trackingzone 10 months ago

    Good laugh.

  • @DamianReloaded
    @DamianReloaded 10 months ago

    How can something be completely irrelevant if it is introduced to make the algorithm work better? Constructive criticism: maybe tone down the "opinionism" if there isn't a really good, well-thought-out reason for it?

  • @andrewsomerville5772
    @andrewsomerville5772 10 months ago

    Love the videos, but am I the only one who finds the sunglasses obnoxious?

  • @paxdriver
    @paxdriver 10 months ago

    This looks sooo expensive lol

  • @billymonday8388
    @billymonday8388 10 months ago

    breeder

  • @einarkeyser1763
    @einarkeyser1763 10 months ago

    Taking the easy way out - didn't other things already do that and take humanity another step toward destruction?

  • @TheBeatle49
    @TheBeatle49 3 months ago

    Horrible paper. Riddled with the hand-crafting it is supposedly replacing!

  • @pensiveintrovert4318
    @pensiveintrovert4318 10 months ago

    I am sorry, but this is just a GAN with you being the discriminator.

  • @Sven_Dongle
    @Sven_Dongle 10 months ago +3

    They didn't really use the language of evolutionary algorithms and genetic programming properly - specifically the use and overuse of the word "mutation". It's like they took an ad hoc approach without really studying the well-established science of evolutionary algorithms.