Dan Hendrycks on Catastrophic AI Risks

  • Published Jun 3, 2024
  • Dan Hendrycks joins the podcast again to discuss X.ai, how AI risk thinking has evolved, malicious use of AI, AI race dynamics between companies and between militaries, making AI organizations safer, and how representation engineering could help us understand AI traits like deception. You can learn more about Dan's work at www.safe.ai
    Timestamps:
    00:00 X.ai - Elon Musk's new AI venture
    02:41 How AI risk thinking has evolved
    12:58 AI bioengineering
    19:16 AI agents
    24:55 Preventing autocracy
    34:11 AI race - corporations and militaries
    48:04 Bulletproofing AI organizations
    1:07:51 Open-source models
    1:15:35 Dan's textbook on AI safety
    1:22:58 Rogue AI
    1:28:09 LLMs and value specification
    1:33:14 AI goal drift
    1:41:10 Power-seeking AI
    1:52:07 AI deception
    1:57:53 Representation engineering
  • Science & Technology

Comments • 9

  • @kimholder • 6 months ago

    I got a lot out of this and am reading the associated paper. I have some questions.
    Why isn't criminal liability also included?

  • @PauseAI • 7 months ago • +2

    Is there a source for Elon Musk's p(doom)?

  • @mrpicky1868 • 6 months ago

    He is much better than Eliezer at getting a serious risk taken seriously. I hope he does more interviews.

    • @geaca3222 • 4 months ago

      I hope so too; he also recently published a very informative safety book online.

    • @mrpicky1868 • 4 months ago • +1

      Books have no power, sadly. So more interviews and broader public understanding are what will make the difference. @@geaca3222

    • @geaca3222 • 4 months ago

      @@mrpicky1868 Agree, but in addition I think the online book is very helpful as a source of information. It gives a concise overview of the CAIS research findings that is readily accessible to international AI safety agents and the general public. The website also offers courses on the subject.

  • @michaelsbeverly • 7 months ago

    _"Knock, knock!"_
    "Who's there?"
    _"Hello Amazon, I'm agent of the court with service..."_
    "This is about that destroying humanity thing?"
    _"That's right."_
    "Yeah, um, about that..."

  • @Dan-dy8zp • 3 months ago

    He doesn't provide any justification for why we should be more concerned about these problems than about the alignment of true super-intelligence, nor for why he thinks we are in a 'medium take-off' situation, or why we would be replaced by a 'species' instead of a singleton.
    *(These programs don't mate. They are not related to each other. They don't age and die and replace themselves. One would probably triumph in the end, I think, however long that takes)*.
    I'm left with the impression he just likes to tackle easier problems.
    Though if the former problem, super-alignment, is totally intractable, you could argue that it makes sense to focus on what is doable and just hope we get lucky about the alignment. He doesn't really make that argument, though.