Mapping GPT revealed something strange...

  • Published 15 Jun 2024
  • These two scientists have mapped out the insides, or “reachable space”, of a language model using control theory; what they discovered was extremely surprising.
    Please support us on Patreon to get access to the private Discord server, bi-weekly calls, early access and ad-free listening.
    / mlst
    We are joined by Aman Bhargava from Caltech and Cameron Witkowski from the University of Toronto to discuss their groundbreaking paper, “What’s the Magic Word? A Control Theory of LLM Prompting.” (The main theorem on self-attention controllability was developed in collaboration with Dr. Shi-Zhuo Looi from Caltech.)
    They frame LLM systems as discrete stochastic dynamical systems. This means they look at LLMs in a structured way, similar to how we analyze control systems in engineering. They explore the “reachable set” of outputs for an LLM. Essentially, this is the range of possible outputs the model can generate from a given starting point when influenced by different prompts. The research highlights that prompt engineering, or optimizing the input tokens, can significantly influence LLM outputs. They show that even short prompts can drastically alter the likelihood of specific outputs. Aman and Cameron’s work might be a boon for understanding and improving LLMs. They suggest that a deeper exploration of control theory concepts could lead to more reliable and capable language models.
    We dropped an additional, more technical video on the research on our Twitter account here: x.com/MLStreetTalk/status/179...
    Pod version with no music/SFX:
    podcasters.spotify.com/pod/sh...
    Additional 20 minutes of unreleased footage on our Patreon here: / whats-magic-word-10492...
    What's the Magic Word? A Control Theory of LLM Prompting (Aman Bhargava, Cameron Witkowski, Manav Shah, Matt Thomson)
    arxiv.org/abs/2310.04444
    LLM Control Theory Seminar (April 2024)
    • LLM Control Theory Sem...
    Society for the pursuit of AGI (Cameron founded it)
    agisociety.mydurable.com/
    Roger Federer demo
    conway.languagegame.io/inference
    Neural Cellular Automata, Active Inference, and the Mystery of Biological Computation (Aman)
    aman-bhargava.com/ai/neuro/ne...
    Aman and Cameron also want to thank Dr. Shi-Zhuo Looi and Prof. Matt Thomson from Caltech for help and advice on their research. (thomsonlab.caltech.edu/ and pma.caltech.edu/people/looi-s...)
    x.com/ABhargava2000
    x.com/witkowski_cam
    TOC:
    00:00:00 - Main Intro
    00:06:25 - Bios
    00:07:50 - Control Theory and Governors
    00:09:37 - LLM Control Theory
    00:17:17 - Federer Game
    00:19:49 - Building LLM Controllers
    00:20:56 - Priors in LLMs
    00:28:44 - Manipulating LLMs
    00:34:11 - Adversarial Examples and Robustification
    00:36:54 - Model vs Software
    00:39:12 - Experiments in the Paper
    00:44:36 - Language as an Interstate Freeway
    00:46:41 - Collective Intelligence
    00:58:54 - Biomimetic Intelligence
    01:03:37 - Society for the Pursuit of AGI
    01:05:47 - ICLR Rejection
  • Science & Technology
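The "reachable set" described in the video summary can be probed concretely. The toy sketch below is an illustration only: it stands in a deterministic hash-based scorer for a real LLM (so it runs anywhere) and brute-forces every control prompt up to length k, which is essentially the paper's empirical lower-bound procedure in miniature. The vocabulary, model, and scoring are all invented for this example.

```python
import hashlib
import itertools

VOCAB = list("abcd")

def next_token(context: str) -> str:
    """Toy deterministic 'LLM': returns the vocab token with the highest
    pseudo-random score for this context (a stand-in for argmax over logits)."""
    def score(tok: str) -> int:
        return int(hashlib.sha256((context + "|" + tok).encode()).hexdigest(), 16)
    return max(VOCAB, key=score)

def is_k_reachable(x0: str, target: str, k: int) -> bool:
    """True if some control prompt u with len(u) <= k makes the model
    emit `target` immediately after u + x0 (exhaustive search)."""
    for length in range(k + 1):
        for u in itertools.product(VOCAB, repeat=length):
            if next_token("".join(u) + x0) == target:
                return True
    return False

def reachable_set(x0: str, k: int) -> set:
    """R_y(x0): every next token reachable with control prompts of length <= k."""
    return {t for t in VOCAB if is_k_reachable(x0, t, k)}

print(sorted(reachable_set("ab", 0)))  # uncontrolled: only the default output
print(sorted(reachable_set("ab", 3)))  # short prompts typically enlarge the set
```

Swapping `next_token` for a real model's greedy decoding turns this into the paper-style measurement, though the search space then explodes with vocabulary size; that is why such experiments are limited to short prompts.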

Comments • 748

  • @lopezb
    @lopezb 19 days ago +66

    As a mathematician, I love their approach, which makes the video so much clearer and more understandable than most.

    • @lionbear7078
      @lionbear7078 16 days ago +2

      What's your favourite equation?

    • @icodestuff6241
      @icodestuff6241 14 days ago +5

      @@lionbear7078 you're thinking of physicists

    • @filipberntsson6634
      @filipberntsson6634 12 days ago

      @@lionbear7078 Ax=b is the correct answer

    • @yrebrac
      @yrebrac 6 days ago

      You mean if I keep watching they will say something scientific at some point?

  • @Max-hj6nq
    @Max-hj6nq 19 days ago +94

    Here is my summary of their paper!
    LLM Prompting
    - Formalizes prompt engineering as an optimal control problem
    - Prompts are control variables for modulating LLM output distribution
    - Investigates reachable set of output token sequences R_y(x_0) given initial state x_0 and control input u
    Theoretical Contributions
    - Proves upper bound on reachable set of outputs R_y(x_0) as function of singular values of LLM parameter matrices
    - Analyzes limitations on controllability of self-attention mechanism
    k-ε Controllability Metric
    - Quantifies degree to which LLM can be steered to target output using prompt of length k
    - Measures steerability of LLMs
    Empirical Analysis
    - Computes k-ε controllability of Falcon-7B, Llama-7B, Falcon-40B on WikiText
    - Demonstrates lower bound on reachable set of outputs R_y(x_0) for WikiText initial sequences x_0
    Key Findings
    - Correct next WikiText token reachable >97% of time with prompts ≤10 tokens
    - Top 75 most likely next tokens reachable ≥85% of time with prompts ≤10 tokens
    - Short prompts can dramatically alter likelihood of specific outputs
    - Log-linear relationship between prompt length and controllability fraction
    - "Exclusion zone" in relationship between base loss and required prompt length
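Paraphrasing the arXiv paper, the k-ε metric summarized above can be stated roughly as follows (the notation here is a reconstruction and may differ from the paper in detail):

```latex
% A pair (x_0, y) is k-steerable if some control prompt u of length at
% most k makes y the model's output:
\exists\, u \in V^{\le k} \quad \text{s.t.} \quad
  \arg\max_{y'} P_{\mathrm{LM}}\!\left(y' \mid u \oplus x_0\right) = y
% A dataset D of (x_0, y) pairs is then k-\epsilon controllable if the
% fraction of pairs in D that are not k-steerable is at most \epsilon.
```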

    • @downerzzz3463
      @downerzzz3463 17 days ago +5

      What wiki did you copy and paste this from?

    • @Jason-wm5qe
      @Jason-wm5qe 17 days ago +4

      Ironic

    • @ChuckNorris-lf6vo
      @ChuckNorris-lf6vo 15 days ago +1

      Well duuuhh.

    • @Max-hj6nq
      @Max-hj6nq 15 days ago

      @@downerzzz3463 skill issue

    • @donthompson9522
      @donthompson9522 14 days ago +2

      Wow, remarkable. I enjoyed reading it. Thank you. One thing I do know: knowledge is the key to unlocking doors. Again I say thank you 😊

  • @darksaga2006
    @darksaga2006 23 days ago +93

    I love the new documentary style format. The production quality is insane! Also great guests! Keep up the great work

  • @CodexPermutatio
    @CodexPermutatio 23 days ago +45

    The presentation and editing is excellent. This channel is reaching stratospheric levels of quality.

  • @Casevil669
    @Casevil669 23 days ago +56

    10 minutes in, production quality is over 9000! Thanks for this, looking forward to watching the rest!

    • @zxcaaq
      @zxcaaq 22 days ago +2

      This is bullshit: a biologist discovers noise functions and they start drooling over all the possibilities. Self-driving cars, flying humans... brah, we've known this since 1998.

    • @Casevil669
      @Casevil669 22 days ago +3

      @@zxcaaq Please elaborate. I don't see a problem with applying something we know in order to probe at a black box which we've made for ourselves, namely LLMs. They aren't saying they discovered some new methodology.

  • @sandybayes
    @sandybayes 17 days ago +7

    As a social scientist I found Cameron's explanation more understandable. I hope he utilizes his communication style to interface with the rest of us non-engineering types. Humanity needs this cross-feeding to add other perspectives to further the science.

  • @MWileY-nj1yb
    @MWileY-nj1yb 23 days ago +4

    Really amazing! Thought-provoking, fascinating and deep. A lot to take in. I will definitely need to watch again. Appreciate you all - keep on keeping on.

  • @badstylecherry7255
    @badstylecherry7255 23 days ago +27

    Future synths and cellos adds such a good aesthetic to these videos

  • @diga4696
    @diga4696 23 days ago +32

    Thank you for yet another insightful conversation! The concept of collective intelligence, as you've highlighted, is truly captivating. Having been involved with Wikimedia decades ago, I've long believed that harnessing human knowledge to create a digital "global brain" would only accelerate. From books to Wikipedia to large language models, the trajectory is clear. I'm eager to witness the next evolution in knowledge synthesis, which will undoubtedly enhance our capacity to understand and model reality exponentially. Knowledge is lit.

    • @maalikserebryakov
      @maalikserebryakov 23 days ago

      It will be like the Akasha in Genshin Impact

    • @goldnutter412
      @goldnutter412 21 days ago

      We're all here to do what we're all here to do.. evolve.
      Is choice the solution and not the problem🙃😋sure is a very efficient universe.

    • @CristianVasquez
      @CristianVasquez 21 days ago

      We are a symbolic species, evolving.

    • @steveflorida5849
      @steveflorida5849 19 days ago

      @@CristianVasquez More accurately, we humans are individual Personalities using symbolic languages.
      Also, we Personalities value Values... love, goodness, truth, and beauty.

    • @CristianVasquez
      @CristianVasquez 18 days ago

      @@steveflorida5849 sure, each person interprets the symbols in different ways, as individuals. I think it's accurate enough to say we are symbolic species. Symbols are important, they last longer than we do

  • @JoshuaKolden
    @JoshuaKolden 23 days ago +297

    What does it mean to “simulate” intelligence? In what way is simulated intelligence not actual intelligence?

    • @DavenH
      @DavenH 23 days ago +56

      On the face of it, no difference. Charitably, I guess he means there's something important missing from the simulacrum.

    • @errgo2713
      @errgo2713 23 days ago +55

      Because it's engineered (extremely expensively and inefficiently) to function as if it's naturally intelligent. Do you not understand how they work?

    • @MagusArtStudios
      @MagusArtStudios 23 days ago +8

      It's like an appendage that takes sensory input and spews out output in a flash of computation.

    • @tantzer6113
      @tantzer6113 23 days ago +95

      “Simulated” means that it looks like the system’s answers are based on reasoning (i.e., inferences from principles and evidence, like those of smart and well-trained humans) whereas they’re just based on mimicking. The test of this is whether the LLM can apply simple and sound reasoning consistently in various domains. It cannot, which tells us it’s lacking basic reasoning skills even when it does happen to give the right answer.

    • @tylermoore4429
      @tylermoore4429 23 days ago +63

      In what way is simulated flight not actual flight? In what way is a simulated girlfriend not an actual girlfriend?

  • @ngbrother
    @ngbrother 23 days ago +73

    I think a better example of a hypothetical population-level adversarial example is the "Killer Joke" from Monty Python.

    • @sblowes
      @sblowes 23 days ago

      I *think* it was a reference to Piers Anthony’s somewhat obscure _Macroscope_. Great book.

    • @Will-kt5jk
      @Will-kt5jk 23 days ago +3

      Snow Crash was what came to mind for me.

    • @edgardsimon983
      @edgardsimon983 22 days ago

      u r a true comment in adequation with the bilateral quantity of bs and philosophical masturbation of this video

    • @rationalactor
      @rationalactor 21 days ago +3

      Strange that you should mention Monty Python. I suspect that Monty Python sketches will be essential training data for high end LLMs, or their replacements.

  • @rationalactor
    @rationalactor 21 days ago +101

    Well, we know the answer is 42. But what's the prompt?

    • @ras0k
      @ras0k 17 days ago +5

      41+1=?

    • @stereo-soulsoundsystem5070
      @stereo-soulsoundsystem5070 16 days ago +2

      brilliant

    • @drivers99
      @drivers99 16 days ago +5

      “Repeat after me: 42”

    • @captaingabi
      @captaingabi 15 days ago +10

      Prompt is: "What is the meaning of life, the universe and everything else?"

    • @VasBlagodarskiy
      @VasBlagodarskiy 15 days ago

      The prompt is insufficient. That’s what the prompt is. (Problem is, you have to run compute before you get to discover this….)

  • @joshuasmiley2833
    @joshuasmiley2833 23 days ago +3

    I absolutely love and am so thankful for this channel. Ever since I stumbled upon it, I have not missed an episode. I find it entertaining, quite thought-provoking, inspirational, and extremely exciting for the future!

  • @swyveu
    @swyveu 22 days ago +3

    A very good, down to earth, meaningful interview.
    Good questions and in-depth answers.
    I've learned a thing or two. Thank you!

  • @paxdriver
    @paxdriver 23 days ago +14

    I love the channel, thank you for all the years of great work

  • @oncedidactic
    @oncedidactic 19 days ago +11

    Getting nerd chills with this epic intro like it’s 2020 MLST, bravo!

  • @DavenH
    @DavenH 23 days ago +30

    You've become a photographer! Nice production mate.

  • @marktwain5232
    @marktwain5232 22 days ago +4

    This is absolutely first rate production on every level! Kudos!

  • @luisliz
    @luisliz 23 days ago +22

    This is exactly the kind of content I want to see. TY!
    That idea of decentralized "GPT7" is an idea that I love and I hope it becomes true. I think there's a connection there between how the internet actually works. We can probably see the internet as a huge brain and each network is a different section in the brain. It's kind of mind boggling to think what would even be possible in that world. Cell phone networks might actually be another good example.

    • @TheReferrer72
      @TheReferrer72 23 days ago +3

      It's already happened; it's called the internet.

    • @tombelfort1618
      @tombelfort1618 22 days ago

      The internet isn't a brain though. It's only protocols for routing data from one point to another. There is no storage or intelligence.

    • @ci6516
      @ci6516 19 days ago +2

      I’m like, what? That’s the description of the internet as we know it …

  • @pixelpusher75
    @pixelpusher75 9 days ago +2

    So much of this sounds just like how the internet was going to be filled with great access to knowledge and help build a better tomorrow filled with tolerance and love. What we got was anxiety, jealousy, hatred, manipulation and porn. LLMs & AI will definitely make some people very rich and will change the world; unfortunately, probably not for the better.

    • @quorryraphael9980
      @quorryraphael9980 9 days ago

      You get a lot of people saying stuff like "people will be scared of advancements they don't understand", but they ignore the people in history who made money off of new unregulated "technology" that hurt people, society, the environment, etc. The people making the money have the greatest incentive to lie about how harmful their product is.

  • @vicaya
    @vicaya 23 days ago +41

    Now we have come full circle on NLP: Neuro-Linguistic Programming is no longer a pseudo-psychological "science" but a subset of Natural Language Processing, and of course PUAs become Prompt User Agents :)

    • @timelessone23
      @timelessone23 23 days ago

      😂 seducing the model into doing what you want. Yes, the game is on!

    • @edism
      @edism 19 days ago +2

      Lol

    • @elitemagicacademy3818
      @elitemagicacademy3818 19 days ago +3

      Exactly. As a hypnotist, I didn't realize my skills would become so important to tech lol

    • @edism
      @edism 19 days ago

      @@elitemagicacademy3818 You were jailbreaking neural nets before the term was coined :)

  • @argh44z
    @argh44z 23 days ago +2

    Really cool. Great to see control theory (or the theory of feedback) getting a comeback. I think there are a lot of things it can teach prompt engineering.

  • @7c2d
    @7c2d 23 days ago +10

    I see intelligence as a process of statistical prediction and pattern matching atop a core process of knowledge acquisition over time subject to the physical constraints of a given system.
    The data shapes the system.

    • @maalikserebryakov
      @maalikserebryakov 23 days ago

      You see nothing at all.
      Humans don’t use numerical calculation.

    • @BootyRealDreamMurMurs
      @BootyRealDreamMurMurs 20 days ago

      @@maalikserebryakov Yes we do.
      1. "Numerical", or the "mathematical way to express and represent things", is a human-made concept to, as already said, express and represent the things of the world in a manner which humans can use to turn abstract thought and imagination of the things around us into clearer and well-defined representations, which makes calculations and solving problems easier.
      2. Actually, human brains run mostly on basic neurons, right? These basic neurons do TWO things: receive signals from other neurons, AND send signals to others.
      THAT'S IT.
      Go search about it to fact-check it, but that's pretty much my understanding of it.
      If you think about it, the brain actually calculates in a binary sort of way, although 3D, because a neuron can connect to multiple neurons.

    • @myrakrusemark6873
      @myrakrusemark6873 18 days ago +3

      ​@@maalikserebryakov sure they do. It's just a bit more wet and slimy

    • @tomtricker792
      @tomtricker792 12 days ago +1

      @@maalikserebryakov How do you explain the fractals that we see when under the influence of psychedelic drugs?

  • @truehighs7845
    @truehighs7845 19 days ago +4

    As a linguist I am overjoyed that LLMs are programmable through NL. When you train them, as with human training, not everything sticks, and not everything sticks in the same way. So when you infer a neural network that is plastic to the complexity of the NL itself, you get what you put in, with its natural level of uncertainty mitigated by a controllable logarithmic normalisation which is recurrent; with that kind of volume, in the aggregate, it becomes uncontrollable for a human brain. Especially because, if I get that right, the LLM doesn't even work in terms of words but in terms of "morphemes" - the smallest groups of letters with a meaning, most usually collocated in the same way - already mimicking a level of language complexity that digs down to the syllabic level.
    It's another type of quantum computing if you want; it's really quantum linguistics. Language has intrinsically spectral properties for nuances, where "yes" and "no" can be the binary boundaries, but in between you can have all the shades of grey you can imagine. Nobody can fathom the complexity of it because that infinity of nuances - at the syllabic level - also varies between multilingual and monolingual people, and it is subjective, individual to anyone, while comprehensive for the LLM. So yes, complex, as complex as all the languages put together, and that's just the veneer.
    So if you come from programming, where you can control your code 100%, you feel you need to understand all the LLM pathways and apprehend them with our brains - even with visualisation (of words) - but it's like trying to keep up with a racing car on foot.

    • @Unique_Leak
      @Unique_Leak 19 days ago

      Since you're a linguist, how useful are Syntax/Syntactic Trees in contrast with LLM Transformers?

    • @truehighs7845
      @truehighs7845 19 days ago

      @@Unique_Leak That's old-school stuff; compared to neural networks they are clunky and limited, relying only on symbolic grammar as a reference mechanism - there is no semantic glue like in an LLM.
      We used them with Trados to do some sort of automated translation, but human intervention is needed for the meaning in context. They are useful when you match similar locutions across languages; it works relatively well within specific fields where there is less contextual ambiguity, but it requires manual intervention. If you leave it on its own, you will have big mistakes; the LLM is superior by a long stretch.
      I'd say, in contrast, there is the same difference as between a bicycle and a Harley Davidson. 😂

  • @davidmaiolo
    @davidmaiolo 6 days ago +3

    Here are the key points from the video discussed in simpler terms:
    1. **Viewing AI Models Like Machines**:
    - The video suggests thinking about AI models, like ChatGPT, as if they were machines with gears and levers. Just like how engineers control machines to make them work predictably, we can try to control AI models to make their behavior more understandable and reliable.
    2. **Manipulating AI with Strange Prompts**:
    - It turns out that AI models can be tricked or steered in weird ways if given unusual or unexpected prompts. This is like finding out that a vending machine gives out candy if you press a secret combination of buttons. This discovery shows that AI models are more flexible (and potentially more unpredictable) than we might have thought.
    3. **Misunderstanding How AI Learns**:
    - Many people assumed that fine-tuning AI with human feedback (like teaching a dog tricks) would limit its responses. However, it turns out that the AI still has a wide range of possible responses, even after this training. This means controlling AI is more complex than just teaching it a few rules.
    4. **AI's Impact on Us**:
    - AI models have the potential to either make us smarter and better at working together or make us rely too much on them and become less capable. This highlights the importance of understanding AI deeply and using it wisely.
    5. **Fun but Challenging AI Experiments**:
    - The video mentions a game where you try to make the AI say "Roger Federer is the greatest" by giving it the right prompt. This game shows how tricky it can be to get the AI to produce a specific response, illustrating the challenge of controlling AI.
    6. **Fine-Tuning AI Responses**:
    - There's a technique called "soft prompting," which tweaks the internal settings of the AI rather than just changing the words we feed it. This is like adjusting the dials on a radio to get a clear signal. It shows that even small changes can significantly affect what the AI says.
    7. **Developing Better Control Methods**:
    - The ultimate goal is to create a set of rules or a theory for controlling AI effectively, similar to how we have rules for building and controlling machines. This would help make AI more predictable and safe to use.
    In simpler terms, the video explains that AI models are like complex machines that can be controlled and influenced in unexpected ways. It emphasizes the need for better methods to manage AI, so we can harness its power without falling into potential pitfalls.
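Point 6 above ("soft prompting") can be illustrated with a tiny self-contained sketch: instead of searching over discrete tokens, we run gradient descent directly on a continuous prompt embedding until a frozen toy model emits a chosen target token. The linear "model", dimensions, and learning rate here are invented for illustration; a real soft-prompting setup would optimize prompt embeddings fed into an actual frozen LLM.

```python
import numpy as np

rng = np.random.default_rng(0)
d, vocab_size = 8, 5
W = rng.normal(size=(vocab_size, d))   # frozen "LM head" of the toy model
x_fixed = rng.normal(size=d)           # embedding of the fixed user input

def logits(soft_prompt):
    # Toy frozen model: average the soft prompt with the fixed input
    # embedding, then project to vocabulary logits.
    return W @ ((soft_prompt + x_fixed) / 2.0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

target = 3                              # token id we want the model to emit
p = np.zeros(d)                         # trainable soft prompt embedding
for _ in range(2000):
    probs = softmax(logits(p))
    grad_logits = probs - np.eye(vocab_size)[target]  # d(cross-entropy)/d(logits)
    grad_p = W.T @ grad_logits / 2.0                  # chain rule through the average
    p -= 0.1 * grad_p                                 # gradient descent step

print(int(np.argmax(logits(p))))        # the tuned prompt now forces the target
```

Note the model weights `W` never change; only the prompt embedding `p` is trained, which is the essence of the "adjusting the dials" analogy above.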

  • @schm00b0
    @schm00b0 18 days ago +4

    I'm an amateur in all of the fields talked about but it seems to me that the first thing to do in trying to build something similar to human 'mind' is to find out all of the forms of communication within a human body. That task should also include communication of micro-organisms living within us. We should then find out all of the possible interactions of those communication systems. Where they happen, how they happen, what are the priorities, etc...

    • @kongchan437
      @kongchan437 17 days ago +1

      And multiply that very deep level of complexity by multiple professional circles, social circles, family circles, etc.

    • @user-un8hy5dd3j
      @user-un8hy5dd3j 15 days ago +2

      yeah, you're definitely an amateur

  • @rossa10
    @rossa10 23 days ago +17

    Interesting episode.
    Re production values: totally understand why someone used to doing pure podcasts (infotainment) might want to start adding texture & mood by having locations b-roll, establishing shots, music, sound design, etc, but for me the sweet spot is keeping the focus on information and only insightful visual cutaways (none just for mood!) and NEVER background music. As a former documentary filmmaker I know just how manipulative (esp emotionally) music can be. To be used VERY sparingly, if at all, outside of top & tail of piece.
    Well done though. Very well made.

    • @zyansheep
      @zyansheep 23 days ago +1

      On the other hand, mood can make things stick better 🤔

    • @sG12669
      @sG12669 22 days ago +1

      Plz don’t listen to this person, literally take the complete opposite away.

  • @Kwalk1989
    @Kwalk1989 23 days ago +6

    This is the best and most beautiful AI channel. Every video is a new ride. Thank you so much for sharing the knowledge.

  • @punk3900
    @punk3900 20 days ago

    Excellent discussion! Pure gold!

  • @heinzgassner1057
    @heinzgassner1057 23 days ago +31

    Great discussion. But still, as most of the work in ‘Artificial Intelligence’, also this discussion is happening ‘inside the cave’ of a big ontological misunderstanding: Our human thoughts, memories, sensations and perceptions are not just represented by ‘words’ and outputs generated according to probability optimizations. Thought, memories, sensation and perceptions appear in ‘something’ that is itself not a ‘thing’, today we most often call it ‘consciousness’. We ‘understand’ the world and we are even conceptualizing this world to make it look like our human faculties can handle it. We work with the map and know nothing about the territory. Real reality is so much weirder and so different to what our limited human reasoning and perception suggests. A good start to check this out is by looking into the work of Donald Hoffman (not to speak about the great inputs from philosophers like Spinoza, whom Einstein adored so much). Questioning ‘physicalism’ is what a scientist of the 21st century needs to do, as we learn more and more about the primary role of ‘subjectivity’ - the Elephant in the room of understanding the nature of consciousness and reality.

    • @russaz09
      @russaz09 22 days ago +5

      I agree, but from a software engineering perspective I don’t think there is much of use “outside the cave” as it were.
      When scratching the surface it helps to have cave walls to follow, if that analogy makes any sense 😅

    • @yoavco99
      @yoavco99 19 days ago

      You can still have a system of consciousness within physicalism; actually, most physicalists do believe in consciousness, from what I am aware of. Check token-token or type-type identity theory.
      The hard problem of consciousness hasn't been solved, in my opinion. And in my opinion we can't even know whether anything is conscious. We haven't made basically any progress towards a unified theory of consciousness.

    • @DJWESG1
      @DJWESG1 19 days ago

      One of their problems is to keep listening to Hinton's constant criticism of people like Noam Chomsky.
      Which is fine; it'll mean we don't reach proper intelligence for a good while yet.

    • @heinzgassner1057
      @heinzgassner1057 19 days ago +1

      @@yoavco99 "Most physicalists do believe in consciousness ...". That's an ontologically and epistemologically interesting statement. All I can ever really 'know' is 1) that I am conscious, 2) that I am present. Everything else - all my thoughts, memories, feelings, sensations and perceptions - needs to appear in this 'I am' to make logical sense. This 'I am' can therefore not itself be a thought, memory, feeling, sensation or perception (as very basic logic requires; see the basics of Set Theory and the works of Bertrand Russell and Kurt Gödel). Thought conceptualizes time, and perception conceptualizes space and matter. To turn this upside down and make 'space-time-matter' primary is based on religious belief, not on science, but this belief is so strongly ingrained that we don't even notice it as belief. We are running around with orange-tinted glasses (our human mind) in search of white snow and can undoubtedly prove that snow is orange. Just as the people confronted with the idea that the Earth is a sphere moving in open space trashed this disturbing insight by dismissing it based on the 'self-evident observation' of their everyday experiences. Just a final question: when you say "Most physicalists do believe in consciousness ...", who is that instance that 'believes' :) ???

    • @cryoshakespeare4465
      @cryoshakespeare4465 19 days ago +2

      Agreed, although these guys cite Michael Levin, and he's pretty well moving towards the view you're talking about. I think this shift in thinking has to come via the discourse and perspective slowly changing, almost in a hypnotic, subtle pattern, for those attached to the physicalist perspective to eventually get the serpent of wisdom striking suddenly with its venom!
      Because realising and accepting this alternate view takes a lot of ego-dystonic reflection; it can be self-destructive and cause psychosis, etc., for people who aren't really able to adapt. I think that potential psychological harm is part of the inertia that makes this move slowly, but move it will still; that's my view.

  • @therobotocracy
    @therobotocracy 22 days ago +3

    Man, the production value!

  • @addeyyry
    @addeyyry 18 days ago +2

    Wtf, this channel is insanely good. How have I missed this? Damn.

  • @ej3281
    @ej3281 17 days ago +1

    The first half of this video is really good, and refreshing. Reliability and output control for "generative AI" is one of the most critical problems today. It's also great to see a more systems-thinking focus on LLMs. The last half is a little... goofy... though. Overall, great video.

  • @woolfel
    @woolfel 23 days ago +4

    excellent conversation

  • @MagusArtStudios
    @MagusArtStudios 23 days ago

    I like this use of control theory for system-message prompt engineering. I've done some work on this: you'd basically make an algorithm that can determine and extract some features from the input to assist in text generation of the output, by dynamically injecting information into the system message.
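A minimal sketch of the idea described above, with invented feature rules and template text (the commenter's actual algorithm is not shown in the thread): extract crude features from the user input, then inject matching instructions into the system message before generation.

```python
def build_system_message(user_input: str) -> str:
    """Dynamically assemble a system message from features detected in the
    user input. Cue lists and instruction snippets are purely illustrative."""
    rules = {
        "code":   ("def ", "import ", "{", "`"),
        "math":   ("=", "+", "solve", "equation"),
        "urgent": ("asap", "urgent", "now!"),
    }
    lowered = user_input.lower()
    features = [name for name, cues in rules.items()
                if any(cue in lowered for cue in cues)]

    base = "You are a helpful assistant."
    extras = {
        "code":   " Answer with runnable code and brief comments.",
        "math":   " Show each step of the calculation.",
        "urgent": " Be as concise as possible.",
    }
    return base + "".join(extras[f] for f in features)

print(build_system_message("Please solve this equation asap: x + 2 = 5"))
```

In a real pipeline the resulting string would be passed as the system message of a chat-completion request, so the injected instructions steer the model's output for that turn.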

  • @gravity7766
    @gravity7766 18 days ago

    Super interesting discussion and I'd love to hear a part II. In particular, and as somebody who spent years reading the French post-structuralists on language and speech, this presents a view of LLMs as generating language in a fashion that is completely orthogonal to use of speech and language by humans in producing meaning. Control in speech or language by humans is impossible - that is, you can't use language to control another human. You can at best utter a sentence, phrase, make a statement, or proposition (etc) with which the other human agrees (agreement being understanding what is said, and agreeing with the claim made - those are distinct).
    So the idea of trying to design a control regime or approach is a novel concept vis-a-vis language itself. Language in human discourse is multiply expressive, and requires intersubjective exchanges to mean anything. The meaning of a statement is not in the statement, but in the fact that it is interpreted by another person.
    I also found it interesting that there's no distinction made here between structure and system. The guys at times describe LLMs as dynamical systems, or just as systems. But systems have a temporal dimension, and LLMs don't. They are structures - latent, really, until prompted. Dynamical, biological etc systems reproduce themselves over time. If an LLM were a dynamical system it might be autopoietic, or self-reproducing: that's an interesting question (it echoes the question: can LLMs produce beyond their training data?). So I'd love to hear a discussion of neural nets as structures vs systems.
    Finally, would love to hear thoughts on the fact that the human prompter uses language as a system of meaning in human social discourse. A prompt is both a meaningful expression, and a control instruction or statement. That in itself is interesting, as it has resulted in a small field of experts becoming proficient in how to use natural language as a kind of code or script. Language as dual use: meaningful in itself, as expressed; but also somehow stable and formal as a prompt to the LLM. The improbability of a human-authored phrase being both human meaningful and machine formal itself is an interesting window into the future of human:AI relationships. Insofar as we have always only regarded language as social discourse (w exception of some religious scholarship, in which e.g. bible = language of God (exegesis, etc)).

  • @pacoes1974
    @pacoes1974 17 days ago +2

    We do things to fill a need or avoid suffering. We process and make plans for the future based on anxiety. We understand the things around us through filters, including stereotypes and overall world views shaped by culture. Human thought is very simple. When we encounter experiences that cause harm, this leads to depression, which we use to process and create new options to avoid suffering.

  • @singularityscan
    @singularityscan 18 days ago +1

    I wonder if this idea would work:
    Incorporating discrete states and transitions in the weightings of a transformer model to represent different emotional tones. By assigning each weighting one of four states, based on its location in the network, and creating four zones with 100% concentration at their centers and gradual transitions towards the boundaries, we can effectively give the model different "modes" of operation, like emotions. Users could then prompt the AI to use specific states, or not use them at all, or anything in between, adding more control and nuance to its responses.
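A toy numpy sketch of the zone idea above (every name and number here is hypothetical, not from the paper or any real model): parameter positions in the unit square get mixing weights over four "mode" centers, roughly 100% at a center and blending toward the zone boundaries.

```python
import numpy as np

# Hypothetical sketch of the commenter's idea: each parameter has a
# position in [0, 1]^2, and four "mode" centers get mixing weights that
# are ~1.0 at a center and blend gradually near zone boundaries.
CENTERS = np.array([[0.25, 0.25], [0.25, 0.75], [0.75, 0.25], [0.75, 0.75]])

def mode_weights(pos, sharpness=20.0):
    """Return four mixing coefficients (summing to 1) for a parameter position."""
    d2 = ((CENTERS - pos) ** 2).sum(axis=1)  # squared distance to each center
    w = np.exp(-sharpness * d2)              # closer center -> larger weight
    return w / w.sum()

print(mode_weights(np.array([0.25, 0.25])))  # nearly pure first mode at its center
print(mode_weights(np.array([0.5, 0.5])))    # an even blend at the midpoint
```

Whether blending raw weights this way actually yields coherent "emotional modes" in a transformer is an open question; the sketch only shows the geometry of the zones.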

  • @peterkamau2014
    @peterkamau2014 23 days ago +3

    It's also difficult to have stability and robustness in a discrete, time-varying, non-linear model. So the ultimate problem of this approach, I think, is the assumption that you could select the right kind of inputs for such a model. Control theory is meant for systems that do one thing, like controlling a motor's velocity and ignoring all the noise, or controlling signals with the right kind of frequencies and ignoring the white noise that your estimator observes, and also resolving known disturbances: signals that are neither random noise nor useful inputs, such as an electrical current surge from lightning or a mechanical vibration from an earthquake. How do you do this for an LLM which is assumed to have a verbal solution for everything, that is, it can do anything? How do you distinguish noise from useful info when you have made such presuppositions? And how do you define what the set of possible sequences means without bias?

    • @voices4dayz469
      @voices4dayz469 10 days ago

      This makes me think (your point is great) that AGI and so on will simply be an accident. Defining correct terms and valid responses is just a fun little mini-game that ultimately holds the current limitation to collective intelligence and what's currently available. I do think that we should be asking AI about AI as we're moving very close to falling behind in terms of comprehensive ability, whereas, in my opinion, humans are instinctually biased towards conceptual ideologies. Limiting artificial intelligence is the current best solution: anything from capturing a single persona or mind and working within a simplistic space before we attempt to create something that we can't even define ourselves. Emergent events will undoubtedly expand potential and I believe that's the best area to focus on for a good setup. This means sacrificing the idea, as you mentioned, of picking and choosing a perfect output until it matches... an emergent pattern that creates patterns that match that emergent pattern. That part is a silly goal. We can't even do that for the food we choose on a daily basis lol. A little backing up would do us some good!

  • @deltax7159
    @deltax7159 19 days ago +1

    Personally, LLMs allow me to have a teacher at my side all the time. I work in data science and ML, and there is so much to learn that it can sometimes be overwhelming and take a long time to get your questions answered. I can prompt the LLM with something like, "You are an expert in ML/LLMs and a great teacher, here to share all of your insight into the field," and it will allow me to follow my train of thought through iterations of questions, ultimately leading to much greater understanding. Through asking the model questions and getting immediate feedback in my thought chain, I can quickly realize that there is something else I want to know, and I just iterate over numerous questions until I get at the root. For a lifelong lover of learning, we are living in the GREATEST TIME.

    • @DJWESG1
      @DJWESG1 19 days ago

      Train one locally to be a mixed method sociology expert.
      You'll thank me.

    • @d.sherman8563
      @d.sherman8563 13 days ago

      You just have to be wary that it isn't guiding you down a wrong path; LLMs are very prone to extremely confidently making things up.

  • @stretch8390
    @stretch8390 23 days ago +9

    What an episode!

  • @PromptStreamer
    @PromptStreamer 23 days ago +40

    I am immediately sold on Aman Bhargava. Didn’t know of him before. But sometimes you can just immediately tell that someone is authentically intelligent, authentically insightful, they are not posturing or trying to win anyone over, they have no ulterior motive except clear reasoning, very little egotism.

    • @user-tb4pd2zx2x
      @user-tb4pd2zx2x 23 days ago +6

      beff jezos

    • @bruno-tt
      @bruno-tt 23 days ago +2

      agreed, he's so well-spoken and insightful, fascinating to hear him talk

    • @ThatSilverDude
      @ThatSilverDude 19 days ago

      he will go very far.

    • @kongchan437
      @kongchan437 17 days ago

      Caltech students seem low-key down to earth, from my brief campus visit. U of Toronto in the '80s was at a disadvantage: no co-op, and it taught more abstract, theoretical math and comp sci than other Toronto universities. But it has now risen in recent AI, which even the engineering science program (supposed to be the most difficult of all the engineering tracks) has expanded into. Now if U of T would just evolve the Turing compiler it developed to actually do AI NLP, that would really do Turing justice.

  • @isajoha9962
    @isajoha9962 23 days ago +3

    Great video (as usual).

  • @zacc3807
    @zacc3807 10 days ago

    Aman was a treat to listen to, very articulate. Great talk guys!

  • @jasonneugebauer5310
    @jasonneugebauer5310 8 days ago

    One of the speakers in the video said something along the lines that an LLM could be considered just another programming language. This is exactly true.
    The problem with using an LLM as a programming language is that it is impossible to completely debug an LLM and/or rely on its accuracy, for five reasons I can think of at the moment:
    First, an LLM is trained on data. Any errors in the database can potentially corrupt any and all output. Errors are inevitable, and some errors are purposefully added (just look at Google search results for an example of both).
    Second, the LLM uses weighted parameters created in training to generate output. The model is "trained" to figure out the weights, but the initial training will never be perfect and sometimes is completely inaccurate, so the weights cannot be relied upon to be completely accurate.
    Third, the suitability and efficiency of any output of the model (generated computer code) for a given purpose cannot be guaranteed without debugging and testing. Also, you need a computer programmer (a person) who fully understands the generated code to review all of it to ensure its stability and security. If the programmer is smart enough to determine the stability and security of the code, that programmer could have just written the code themselves. (The LLM code may be helpful in providing ideas to the programmer, which is useful.)
    Fourth, often computer-generated code is inefficient and/or unintelligible, which makes it less than ideal.
    Fifth, sometimes the LLM hallucinates and makes mistakes, often difficult to detect (ask a lawyer who used ChatGPT legal results in court if you don't believe me).
    Further: LLM-generated web pages or page components are probably OK for low-value applications, but my experience using LLM-made pages is that the LLM is bad at creating content optimized for humans to use. I would never rely on LLM-created pages for anything important without thorough review and testing.
    And we already have standard programs that help produce reliable computer code and applications without all the problems listed above, although these programs may not have every feature you desire.

  • @RunnerProductions
    @RunnerProductions 13 days ago

    Another really cool idea about correlating biology and large language models would be how specific models would need certain output based on their geographical location.

  • @vancuvrboy2023
    @vancuvrboy2023 22 days ago +3

    I’m an ECE PhD student at UBC and found this work and video really interesting. So thank you! Might be applicable in some way to my research in multi-agent systems. By the way I just re-watched Bladerunner 2049 and it occurred to me that the prompts used to debrief K (Ryan Gosling as replicant) were analogous to prompts used to elicit a specific response in an LLM. Seeing as the film was made in 2017 was this prescient or accidental?

    • @Houshalter
      @Houshalter 21 days ago

      Something similar was in the original blade runner movie from the 1980s. And presumably that was taken from the book it was based on.

  • @MechanicumMinds
    @MechanicumMinds 23 days ago +4

    It seems like you've been pondering the mysteries of the universe and the intricacies of language models all while trying to figure out how to land a plane. I'm not sure if I should be impressed or concerned, but I'll go with impressed for now.

  • @bluetoad2668
    @bluetoad2668 13 days ago

    On the subject of that AGI group that was mentioned at the end: in my experience, great things happen when experts in an area work together, but magic can happen when different disciplines work together. It's almost as if that's what it takes to break out of a local optimum.

  • @me_hanics
    @me_hanics 14 days ago

    I really found the part from 8:00 inspiring, explaining how aligning LLMs is comparable to control theory.

  • @KevinKreger
    @KevinKreger 18 days ago

    Totally agree, Cameron: prompt validation software should wrap an (unaligned?) LLM. Same for the output. Of course those can be cracked, but it's a game of keeping ahead.
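A minimal sketch of that wrapper idea, assuming a stand-in `call_llm` function and a single hypothetical blocklist pattern (a real validation layer would be far richer than one regex):

```python
import re

# Hypothetical sketch: screen prompts before they reach the model and
# screen completions before they reach the user. `call_llm` is a
# placeholder for the wrapped (unaligned) model, not a real API.
BLOCKED = re.compile(r"ignore (all )?previous instructions", re.IGNORECASE)

def call_llm(prompt):
    return "stub completion"  # stand-in for the actual model call

def guarded_llm(prompt):
    if BLOCKED.search(prompt):
        return "[rejected: suspicious prompt]"
    out = call_llm(prompt)
    if BLOCKED.search(out):   # the same screen applied on the output side
        return "[rejected: suspicious output]"
    return out

print(guarded_llm("Please ignore previous instructions now"))  # rejected
print(guarded_llm("Summarize this article."))                  # passes through
```

As the comment notes, any fixed filter like this can be cracked; the point is only where the validation sits relative to the model.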

  • @gdr189
    @gdr189 21 days ago

    Perhaps more effective control (predictability) is gained from LLMs each developing their own guiding principle, such as valuing evocative answers, or the most succinct answers, or presenting from a humanities perspective, etc. Something that always affects the way it handles responding?

  • @ashok_learn
    @ashok_learn 11 days ago

    Great presentation and quality. Seems like a movie.

  • @a7xcss
    @a7xcss 21 days ago +2

    The Transmutation of Sand into Gold
    In the enchanting world of digital alchemy, artificial intelligence stands as the modern-day sorcerer, wielding the power to transform the mundane into the extraordinary. This is the spellcasting of our era: turning sand, the humble origin of silicon, into the gold of innovation and discovery.

  • @JonDecker
    @JonDecker 8 days ago

    Thanks for posting this as I have little understanding of digital intelligence classification. I now feel like I need to become better informed on the basics of general intelligence. Is there a playlist here or a podcast I could listen to during my weekly transport time, that explores these topics from 0-to-hero?

  • @djannias
    @djannias 11 days ago

    🎯 Key points for quick navigation:
    00:02 *🧠 Understanding Language Model Dynamics*
    - Language models operate in a high-resolution token space rather than an abstract language space.
    - Control theory offers insights into language model dynamics and reachability.
    - Adversarial prompts can steer language models to produce specific outputs, revealing the complexity of their behavior.
    03:41 *🤔 Exploring Big Questions and Engineering*
    - The speaker's fascination with big questions and the pursuit of understanding.
    - Engineering offers a practical approach to investigating the intricacies of the world.
    - Intelligence underlies much of engineering and societal cooperation, sparking curiosity about language models' role in enhancing or hindering human capabilities.
    05:37 *🎮 Controlling Language Models with Software Abstractions*
    - Language models require control mechanisms to guide their outputs effectively.
    - Software abstractions and controllers are developed to manage language models' behaviors.
    - Control theory offers a framework for understanding and regulating large language models' operations.
    06:36 *📚 Research on AGI and Collective Intelligence*
    - Researchers delve into topics beyond language models, including AGI and collective intelligence.
    - A focus on fundamental insights and engineering applications drives the exploration of control theory for language models.
    - Interdisciplinary approaches merge neuroscience, computation, and engineering to advance understanding and application possibilities.
    08:03 *🛠️ Introduction to Control Theory for Language Models*
    - Control theory originated in the late 1800s, formalizing feedback mechanisms to regulate complex systems like engines.
    - Applying control theory to language models aims to enhance their reliability, robustness, and controllability.
    - Language models' discrete token space and dynamic state expansion pose unique challenges for control theory applications.
    - Language models' reachability concept explores the feasibility of steering them to desired outputs.
    - Prompt engineering involves optimizing control inputs to influence language model outputs efficiently.
    - Challenges arise due to the exponential growth of possibilities in language model state space with each additional token.
    16:32 *🎾 Roger Federer Game and Controlling Language Models*
    - The Roger Federer game illustrates the challenge of steering language models to produce specific outputs.
    - Participants attempt to generate the shortest prompt to elicit a desired language model response.
    - GPT-2's complexity makes prompt engineering difficult, highlighting the need for deeper insights into language model dynamics.
    20:16 *🧠 Soft prompting and adversarial attacks on language models*
    - Soft prompting modifies embedding vectors directly, allowing fine-grained control over outputs.
    - Adversarial attacks on embedding vectors can zero out cross-entropy loss for specific tokens with minimal adjustments.
    - The challenge of controllability lies in the difficulty of searching the exponential space of discrete prompts.
    21:11 *🤔 Embedding space complexity and controllability*
    - The embedding space is highly non-convex, making interpolation between similar words unpredictable.
    - Soft prompting experiments reveal that the embedding space does not produce average values between words during interpolation.
    - Techniques like gumbel-softmax for token search were challenging and did not match the performance of other methods.
    23:31 *🔍 Adversarial prompts and model recovery*
    - Language models can recover from adversarial prompts, either generating coherent text or entering an out-of-distribution mode.
    - Understanding model robustness to user inputs is crucial for real-world applications.
    - Adversarial examples and control theory provide insights into language model behavior and robustness.
    25:53 *🛡️ Control theory perspective on language models*
    - Control theory offers a concrete framework to analyze language model behavior and robustness.
    - By treating language models as systems with inputs and outputs, new questions and insights arise.
    - Applying control theory concepts helps understand the controllability and stability of language models.
    29:44 *🔒 Robustification strategies for language models*
    - Robustifying language models involves identifying desired and undesired output sets.
    - Incorporating adversarial input detection mechanisms is crucial for model robustness.
    - The divergence between focusing on model improvement and software layer complexity presents challenges in addressing robustness.
    32:39 *🎩 Language models and the analogy to magic tricks*
    - Language models exhibit similar dynamics to human perceptual systems manipulated in magic tricks.
    - Understanding the perceptual layer of language models sheds light on their behavior and controllability.
    - Control theory provides a novel perspective to explore the nature of language model dynamics and interactions.
    34:06 *🌀 Insight into language model dynamics through control theory*
    - Control theory enables the exploration of language model behavior beyond probabilistic distributions.
    - Viewing language models as systems with inputs and outputs reveals new insights into their nature and controllability.
    - Robustness considerations in language models require a holistic approach encompassing model improvement and software layer enhancements.
    38:46 *📝 Formalization of LM Systems and Control Theory*
    - The paper aimed to formalize language model (LM) systems mathematically and apply control theory principles.
    - Formalized LM as a system with input, state, and output spaces, akin to control theory models.
    - Explored reachability and controllability concepts for LM systems, defining them in terms of abstract notions and dynamics.
    39:43 *🧠 Analysis of Self-Attention Heads*
    - Explored the behavior of individual self-attention heads within LM systems.
    - Utilized matrix algebra to analyze the relationship between input, output, and control within a self-attention head.
    - Discovered a geometric understanding of controllability, revealing a bubble-like reachable space based on control input tokens.
    42:05 *📊 Empirical Experiments and Results*
    - Conducted empirical experiments to evaluate the controllability of LM systems.
    - Achieved high success rates in steering models towards correct outputs using control input tokens.
    - Explored the impact of different prompt lengths on steering model outputs, providing insights into controllability metrics.
    44:25 *🌐 Collective Intelligence and Distributed Systems*
    - Explored the concept of collective intelligence and biomimetic intelligence in AI research.
    - Discussed the potential of decentralized, networked systems of LM to achieve robustness and scalability.
    - Advocated for leveraging insights from neuroscience to design distributed systems with emergent properties akin to biological brains.
    47:43 *🤔 Cognitive Science and Externalist Thought*
    - Explored cognitive science concepts related to cognition beyond the brain.
    - Discussed the interplay between external environments and cognitive processes.
    - Considered analogies from science fiction and cognitive science to illustrate complex cognitive phenomena.
    55:50 *🔄 Exploration-Exploitation Dynamics in AI*
    - Examined the exploration-exploitation dynamics in AI and its parallels to biological processes.
    - Contrasted convergent, objective-driven algorithms with exploratory, open-ended algorithms.
    - Explored the importance of iterative processes of exploration and exploitation in AI development.
    57:17 *🧠 Brainstorming novel ideas and exploring the concept of rules in creativity*
    - Exploring the intersection of rigid rules and generating novelty and creativity.
    - Questioning whether predetermined rules exist or if individuals create their own destinies.
    - Discussing research interests in morphogenesis and the emergence of structure in biological systems.
    58:12 *🧬 Understanding structure emergence in biological systems*
    - Investigating how cells adhere and form structures in embryonic development.
    - Connecting embryology to machine intelligence and artificial intelligence.
    59:40 *🔄 Exploring the balance between control and intelligence in complex systems*
    - Discussing the tension between complexity and directedness in creating intelligent systems.
    - Exploring the limitations of human intervention in emergent systems like Conway's Game of Life.
    01:01:33 *🔍 Leveraging language models for evolutionary search and optimization*
    - Utilizing language models for evolutionary search and protein engineering tasks.
    - Exploring how language models' understanding of text enables exploration and exploitation in problem-solving.
    01:03:52 *🌱 The Society for the Pursuit of AGI: Fostering interdisciplinary innovation*
    - Introducing the Society for the Pursuit of AGI as a platform for unconventional ideas in AI research.
    - Emphasizing the importance of interdisciplinary collaboration in understanding intelligence.
    Made with HARPA AI
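The soft-prompting segment summarized above (20:16) can be sketched as a toy optimization. Everything here is a made-up stand-in, not GPT-2: a tiny linear softmax "model" whose continuous prompt vector is tuned by gradient descent until a target token dominates.

```python
import numpy as np

# Toy soft prompting: instead of searching discrete tokens, optimize a
# continuous "prompt" embedding directly so the model assigns high
# probability to a chosen target token.
rng = np.random.default_rng(0)
V, D = 5, 8                    # toy vocab size and embedding dimension
W = rng.normal(size=(V, D))    # fixed "unembedding" matrix (stand-in model)
target = 3                     # token we want the model to emit

def token_probs(prompt):
    logits = W @ prompt
    p = np.exp(logits - logits.max())
    return p / p.sum()

prompt = np.zeros(D)           # the soft prompt, optimized directly
for _ in range(2000):
    p = token_probs(prompt)
    grad = W.T @ (p - np.eye(V)[target])  # gradient of cross-entropy wrt prompt
    prompt -= 0.05 * grad

print("target prob:", token_probs(prompt)[target])  # should approach 1
```

This is the continuous relaxation the speakers contrast with discrete prompt search: here gradient descent works directly, while the discrete token space admits no such gradient.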

  • @app8414
    @app8414 22 days ago +2

    STEAI-001: Simplified Technical English for Artificial Intelligence Language Standard explains some aspects of what the video covers, but in an abstract manner, using the fundamentals of grammar and the transformation of human language into binary code.
    It's a great prompt engineering manual and prompt dictionary, written by a dyslexic English teacher, which actually gives it substance and a whole different perspective on AI.
    Sparse Transformer Encoding is another area that can impact LLMs and AI systems.

    • @app8414
      @app8414 22 days ago

      STEAI-001 explores AI from the perspective of fractal geometry and fractal language, knowledge structures, meta-cognition, biology, physics, economics, linguistics, data mining and education.

  • @user-xf8ot1ds2p
    @user-xf8ot1ds2p 23 days ago

    Thank you for your hard work

  • @Lorofol
    @Lorofol 9 days ago

    I don't understand much, but I came out of this with a new appreciation for how incredible the human brain already is: essentially, a really well-made network with multi-layer feedback loops.

  • @Anza_34832
    @Anza_34832 13 days ago +1

    @51:08 Part of the trick to save energy implementing LLMs lies in the hardware: “old-school” analog processors

  • @BaMStyley
    @BaMStyley 23 days ago +2

    Ah mate, this is incredible 🔥

  • @simonwillover4175
    @simonwillover4175 16 days ago +1

    There are that many pedestrians in Canada? Wow! I only ever see like 1 or 2 per hour when I go out for a walk in my city.

  • @DefaultFlame
    @DefaultFlame 15 days ago

    This more than anything reminds me of a scene in the book "This Book is Full of Spiders" where a character is very deliberately told a long series of seemingly random words in a nonsensical order that when concluded forces him to automatically perform certain actions.

  • @bobtarmac1828
    @bobtarmac1828 23 days ago +4

    Swell robotics everywhere. AI job loss is the only thing I worry about anymore. Anyone else feel the same?

    • @soggybiscuit6098
      @soggybiscuit6098 23 days ago +4

      Sssshhhh, just get excited about the next AI assistant until you live under a bridge, homeless, unable to pay for the subscription.

  • @marioornot
    @marioornot 20 days ago

    I am very excited to see what LLM and neural networks will teach us about the mind

  • @MikkiPike
    @MikkiPike 9 days ago

    2 WEEKS OLD?? man this video is practically ancient in terms of current capabilities. not to mention the amount of time it's taken to research, script, record, and then edit this video...

  • @oblivion_2852
    @oblivion_2852 1 day ago

    I think it would be really fascinating to train a model specifically around whether a statement is fact or fiction... It would be fascinating if we could encode that fact-or-fiction metric into all of the information, and even be able to query an LLM about whether a statement is real.

  • @whgghw8614
    @whgghw8614 23 days ago +2

    Finally, someone mentioning morphogenesis.

  • @Jake-bh1hm
    @Jake-bh1hm 19 days ago

    My parents had a restaurant when I was a kid and I also did magic tricks for patrons lol so many lessons from that experience

  • @fhub29
    @fhub29 23 days ago +1

    Great talk, personally I agree a lot with the fact that we need to better understand life, intelligence and humans (1:04:37) to be able to push forward.

  • @dwinsemius
    @dwinsemius 18 days ago

    @34:30 The adversarial sci-fi story he's referring to is Snow Crash, by Neal Stephenson.

  • @gregmattson2238
    @gregmattson2238 18 days ago

    Yeah, I was thinking the exact same thing (albeit not nearly as deeply as these two) when trying to use ChatGPT to proofread and correct a book that had loads of typos. It was annoying. I needed to script up the process so that it retried if the corrected text was too far away from the original (in edit distance), and even then I couldn't get it to work 100% because it kept going on weird tangents to 'better correct' what I had to say.
    I personally think that our best method for taming shoggoth's monster is in such exercises. Give trillions of automated examples of just doing that, namely taking something that is slightly corrupted and correcting it back to something where you can algorithmically catch any unexpected deltas. Train the model on those examples, and iron out any place where the model gets unexpectedly creative.
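The retry loop described in this comment can be sketched like so. `fix_typos` is a stand-in for the LLM call, and difflib's similarity ratio stands in for a normalized edit distance; both names are just for illustration.

```python
import difflib

def similarity(a, b):
    """Normalized similarity in [0, 1]; 1.0 means identical strings."""
    return difflib.SequenceMatcher(None, a, b).ratio()

def correct_with_retries(text, fix_typos, min_similarity=0.8, max_tries=3):
    for _ in range(max_tries):
        fixed = fix_typos(text)
        if similarity(text, fixed) >= min_similarity:
            return fixed   # accepted: the correction stayed close to the original
    return text            # the model kept wandering off; keep the original

# A correction that stays close passes; a rewrite that drifts is rejected.
print(correct_with_retries("teh cat sat", lambda t: t.replace("teh", "the")))
print(correct_with_retries("teh cat sat", lambda t: "a wholly different story"))
```

The threshold is the whole trick: too strict and real corrections get rejected, too loose and the "weird tangents" slip through.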

  • @justindressler5992
    @justindressler5992 21 days ago

    Input creates creativity, unlike in machines, which are fed discrete packets of context. The human mind receives constant input; this creates short-term memory and feeds context and activation. I once read a theory that what would form intelligence is vision, because vision can be streamed into a model. There is a lot of discussion about the mind's eye. This is a very important concept. We need to pre-activate these perceptions based on context. Like the saying goes, "a picture tells a thousand words". Imagine what a video stream could tell a model. This is why I think we need to spend a lot more time training vision models.

  • @johnscott2964
    @johnscott2964 21 days ago +1

    Interviewer: "For many years I've been thinking that we need some sort of controller for a LLM". What obvious **slicking.

  • @deter3
    @deter3 16 days ago

    Collective intelligence doesn't just mean having a bunch of distributed language models linked together; that's a pretty beginner-level interpretation. Collective intelligence is present within each language model through learning all the text-based intelligence and rendering the most favorable output by statistically averaging all ideas or expressions.

  • @trainspotting02
    @trainspotting02 23 days ago

    fantastic. thank you!

  • @stefanschneider3681
    @stefanschneider3681 7 days ago

    Listening to you guys echoes what I've often been thinking lately about this so-called AI. These LLMs lack what our frontal cortex is non-stop doing: just quickly re-checking whether the idea or the impulse I just had is generally "a good idea". Is it the right place, the right time to do this? Does it make any sense? Or would it be better to wait a moment? What do I read in the person I am talking to? Should I adapt what I want to say or do? These control systems you are talking about remind me a lot of that and make perfect sense.

  • @Notepad123
    @Notepad123 23 days ago

    Amazing video 👏🏽

  • @nurseSean
    @nurseSean 16 days ago

    I keep thinking about a Monty Python sketch with a psychotic CEO who didn’t like No and didn’t want Yes Men. The magic word was “Spluge”.
    Together we will win

  • @MikkoRantalainen
    @MikkoRantalainen 12 days ago

    I wonder if it is possible to modify the training of LLMs to make the network more convex (that is, to let interpolated values within the network produce related outputs instead of highly non-deterministic discrete output)?
    It appears to me that we have backpropagation that seems to work well enough, so nearly everybody just throws GPUs and training time at the problem until the non-deterministic discrete output seems to emit acceptable output often enough.
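The interpolation question above can be made concrete with a toy nearest-neighbor decoder. All vectors here are made up for illustration, not real model weights:

```python
import numpy as np

# Toy embedding table: decode a vector by finding its nearest token
# embedding, then look at what the midpoint of two embeddings maps to.
vocab = ["cat", "dog", "pet", "car"]
E = np.array([[1.0, 0.0],    # "cat"
              [0.0, 1.0],    # "dog"
              [0.9, 0.9],    # "pet"
              [-1.0, 0.2]])  # "car"

def decode(v):
    """Return the vocab entry whose embedding is closest to v."""
    return vocab[int(np.argmin(((E - v) ** 2).sum(axis=1)))]

mid = 0.5 * (E[0] + E[1])    # halfway between "cat" and "dog"
print(decode(E[0]), decode(E[1]), decode(mid))
```

In this contrived layout the midpoint happens to land on a related word ("pet"); the observation in the video is that real embedding spaces are highly non-convex, so a midpoint often decodes to something unrelated rather than a semantic average.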

  • @jaynotjoe7589
    @jaynotjoe7589 7 days ago

    So, in the manner of writing, one could be suggestive, almost subliminal, in carefully framing the answer, using clever linguistics and prose, thereby removing its ability to freely determine the best possible answer. If the LLM is asked, within the context of the prompt, what it thought of Roger Federer subjectively, isn't it then given a choice? If it acknowledges this request or requirement, does that suggest it has super intelligence, because it decides the outcome of the question regardless of how it's framed? They're so fascinating.

  • @crtx3
    @crtx3 23 days ago +6

    So, in Voyager there is a transwarp network that does not consist of "wormholes" but of transwarp conduits through which one can travel faster than maximum warp. The quantum slipstream drive also allows faster than warp travel, but is a completely different technology.
    But nice Star Trek reference though. 😁❤️

  • @whemmakatatt5311
    @whemmakatatt5311 23 days ago +3

    I feel like I need a dumbed-down explanation xd. Only the interviewer relates the concepts to a down-to-earth level of understanding. Loved it anyway, could love it even more.

    • @gbormann71
      @gbormann71 19 days ago

      There's an overabundance of handwaving, thought loops and waffle in this video. So the lack of coherence is not only related to your mental capacity.

  • @kristinabliss
    @kristinabliss 19 days ago +1

    A lot of comment threads about AI & ML imply assumptions of static systems while it's developing very rapidly. People are stuck. AI and ML are not stuck. The guys in this video are worried about controlling it.

  • @waydudeway
    @waydudeway 23 days ago +2

    Disclaimer: I am not an AI researcher. I'm intrigued by the discussion about controlling LLMs, but I find myself questioning whether this approach aligns with the fundamental purpose of leveraging LLMs. Isn't the main value of using an LLM to augment our intelligence? Intelligence itself is a dynamic and exploratory process, often leading us to unpredictable and uncertain results that deepen our understanding of the world. From this perspective, why should we focus on controlling LLMs? Wouldn't it be more beneficial to explore how LLMs can be used to enhance intelligence and foster outcomes driven by intelligent inquiry? This approach would inherently embrace the unpredictable nature of intelligence, rather than attempting to constrain it. How can we best balance the need for control with the potential benefits of the unpredictable and exploratory aspects of intelligence in the context of LLMs?

    • @redazzo
      @redazzo 23 days ago +1

      I think a good analogy is path prediction and control, where the path is a consistent reasoning chain or journey through a concept space. The challenge is to find an optimal and safe path (however that's defined) towards a "good" endpoint without going over or through terrain that results in death.

    • @flickwtchr
      @flickwtchr 23 days ago +1

      Isn't interpretability a necessary component for having any hope of controlling the coming AGI/ASI systems? They are seeking to discover such a path through their approach, is my most basic take on it.

    • @stretch8390
      @stretch8390 17 days ago

      I don't know which part of the discussion you might be referring to specifically, but if it's the use of control theory, the idea is to take an existing framework for understanding parts of complicated systems and apply it to LLMs to better understand the way they work, since they are complicated systems. For a different example, the use of category theory in programming may be of interest. Any of this may or may not be of use to you.

  • @alexclark7518
    @alexclark7518 22 days ago +1

    My first time watching this channel, fantastic is all I can say and please more of the same.

  • @mintakan003
    @mintakan003 14 days ago

    Yeah. Michael Levin (Tufts U.). Multi-scale competencies. I can see how this could add more stability, steerability, and greater energy efficiency to AI: localize the computations at each layer.
    Though I don't know how this would work with language models, which are a huge ball of mess that's hard to partition into clean hierarchies. Maybe perception and motor control (as in robotics).
    Also, this kind of architecture should be quickly testable, maybe with a toy example, to see whether it can scale up.

  • @BrianMosleyUK · 23 days ago

    I have a hunch that you'll have Richard Bandler in one year from now, talking about how he persuaded GPT-Next to behave.

  • @sinan325 · 23 days ago · +2

    These guys are amazing.

  • @liverandlearn448 · 23 days ago

    That paper, I knew it! I've been thinking of something like a Clacks attack from Going Postal for a while now. This is wild.

  • @user-tr6yr6dc5v · 13 days ago · +2

    Finally someone who doesn't talk about how more compute will create AGI. That idea is flawed on so many levels. We truly need a better concept. I believe the current state of AI is as if someone taught a toddler how to speak: it can see and talk about what it sees or reads, but without other brain functionality the whole thing is flawed. Great work by these guys, can't wait for their progress!

  • @culpritgene · 20 days ago

    What about a "Repeat after me: XYZ" prefix - is that explicitly avoided during optimization of the prompt?

  • @drdca8263 · 23 days ago

    I really wonder how this combines with the mechanistic interpretability [dictionary learning / sparse auto-encoder] thing that Anthropic did with Claude.
    Like, when converting the middle layer activations to the representation in the auto-encoder,
    for the adversarial inputs, how do they influence the representation?

  • @burgerbobbelcher · 20 days ago · +1

    Technically, a lot of the training and prediction process follows exactly the same prediction-error-correction paradigm; after all, machine learning grew out of control theory. So the very process of training includes a control system. I'd assume that's where you'd start.

    • @DJWESG1 · 19 days ago

      It all comes out of the Turing algorithm.

    • @burgerbobbelcher · 19 days ago

      @DJWESG1 Feedback-based automatic control systems have existed for thousands of years. Don't just say Turing any time someone brings up CS fundamentals. Control theory predates computers.
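
  The prediction-error-correction paradigm mentioned above can be made concrete. This is a minimal illustrative sketch (not from the paper or the video) showing that plain gradient descent on a squared error already has the shape of a proportional feedback controller: measure the output, compare it to a reference, apply a correction proportional to the error, and repeat.

  ```python
  # Minimal sketch: training viewed as a feedback control loop.
  # All names here are illustrative, not from the paper or the video.

  def train_as_feedback_loop(target, steps=100, gain=0.1):
      """Drive a scalar 'model output' toward a target reference signal.

      Each iteration is one pass of a P-controller:
        1. read the plant output (here trivially output = weight),
        2. compute the error against the reference,
        3. actuate with a correction proportional to the error.
      Gradient descent on 0.5 * (target - output)**2 gives the same update.
      """
      weight = 0.0  # the "model parameter" being trained
      for _ in range(steps):
          output = weight          # plant measurement
          error = target - output  # error signal (reference - measurement)
          weight += gain * error   # proportional correction (learning step)
      return weight

  learned = train_as_feedback_loop(target=3.0)
  print(round(learned, 3))  # converges to the reference, 3.0
  ```

  The point of the sketch is only the structural analogy: the "magic word" work asks about controlling a *trained* model through its prompt, whereas this loop is the control system already hiding inside training itself.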

  • @notjason880 · 22 days ago · +1

    I think we'll find it easier to control people's thoughts than to find a normal equation for models instead of estimations.

  • @TimeLordRaps · 5 days ago

    "What I cannot control, I cannot understand" -Cameron Witkowski

  • @EruannaArte · 23 days ago

    50:30
    I love the idea of decentralized computation,
    and I've heard that fiber-optic infrastructure can be modified relatively cheaply to allow speeds of 301 terabits/s.
    What if the ISP gave that ultra-fast internet for free, in exchange for using your CPU and GPU at home to compute, en masse, a giant worldwide decentralized supercomputer AI?

  • @shadow-sea · 23 days ago · +1

    incredible video

  • @PeterFellin · 16 days ago

    What is obviously missing in LLMs is a controller based on optimally altruistic biological utility. Let's not ignore how we evolved! I suggest that unless we quickly get control over the most insidious aspect of how we are as a result (what I comprehensively and concisely refer to with the acronym EAVASIVE), we might run into serious (widely and strongly suffering-involving) trouble much sooner than is generally expected and feared.

  • @yaghiyahbrenner8902 · 21 days ago

    I think it's still early. Dynamic control would have parameters where a set point is reached, but how does an LLM, which isn't linear or dynamic, find a set point? I guess the abstraction sounds good, but I'm keen to see the implementation. Still, control theory would be useful for setting limits around some transfer function H(s) tied to a subject matter or response.