Adam Gleave - Vulnerabilities in GPT-4 APIs & Superhuman Go AIs

The Inside View

532 views

Published on 20 June 2024
This is a special crosspost episode where Adam Gleave is interviewed by Nathan Labenz from the Cognitive Revolution. At the end I also have a discussion with Nathan Labenz about his takes on AI.
Adam Gleave is the founder of Far AI. He and Nathan discuss finding vulnerabilities in GPT-4's fine-tuning and Assistants APIs, Far AI's work exposing exploitable flaws in "superhuman" Go AIs through innovative adversarial strategies, accidental jailbreaking by naive developers during fine-tuning, and more.
OUTLINE
00:00 Intro
02:57 Far.AI's Mission
05:33 Unveiling the Vulnerabilities in GPT-4's Fine-Tuning and Assistants APIs
11:48 Divergence Between The Growth Of System Capability And The Improvement Of Control
13:15 Finding Substantial Vulnerabilities
14:55 Exploiting GPT-4 APIs: Accidentally jailbreaking a model
18:51 On Fine Tuned Attacks and Targeted Misinformation
24:32 Malicious Code Generation
27:12 Discovering Private Emails
29:46 Harmful Assistants
33:56 Hijacking the Assistant Based on the Knowledge Base
36:41 The Ethical Dilemma of AI Vulnerability Disclosure
46:34 Exploring AI's Ethical Boundaries and Industry Standards
47:47 The Dangers of AI in Unregulated Applications
49:30 AI Safety Across Different Domains
51:09 Strategies for Enhancing AI Safety and Responsibility
52:58 Taxonomy of Affordances and Minimal Best Practices for Application Developers
57:21 Open Source in AI Safety and Ethics
01:02:20 Vulnerabilities of Superhuman Go playing AIs
01:23:28 Variation on AlphaZero Style Self-Play
01:31:37 The Future of AI: Scaling Laws and Adversarial Robustness
01:37:21 Start of Michael Trazzi interviewing Nathan Labenz
01:37:33 Nathan’s background
01:39:44 Where does Nathan fall in the Eliezer to Kurzweil spectrum
01:47:52 AI in biology could spiral out of control
01:56:20 Bioweapons
02:01:10 Adoption Accelerationist, Hyperscaling Pauser
02:06:26 Current Harms vs. Future Harms, risk tolerance
02:11:58 Jailbreaks, Nathan’s experiments with Claude
The Cognitive Revolution: www.cognitiverevolution.ai/
Exploiting Novel GPT-4 APIs: far.ai/publication/pelrine202...
Adversarial Policies Beat Superhuman Go AIs: far.ai/publication/wang2022ad...
Science & Technology

