Dmitry Krotov | Modern Hopfield Networks for Novel Transformer Architectures

  • Added 9 May 2023
  • New Technologies in Mathematics Seminar
    Speaker: Dmitry Krotov, IBM Research - Cambridge
    Title: Modern Hopfield Networks for Novel Transformer Architectures
    Abstract: Modern Hopfield Networks or Dense Associative Memories are recurrent neural networks with fixed point attractor states that are described by an energy function. In contrast to conventional Hopfield Networks, which were popular in the 1980s, their modern versions have a very large memory storage capacity, which makes them appealing tools for many problems in machine learning and cognitive and neurosciences. In this talk, I will introduce an intuition and a mathematical formulation of this class of models and will give examples of problems in AI that can be tackled using these new ideas. Particularly, I will introduce an architecture called Energy Transformer, which replaces the conventional attention mechanism with a recurrent Dense Associative Memory model. I will explain the theoretical principles behind this architectural choice and show promising empirical results on challenging computer vision and graph network tasks.
  • Science & Technology
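
For reference while reading the abstract and the comments below, this is one common, generic way the energy of a Dense Associative Memory is written down (illustrative textbook notation, not necessarily the exact formulae shown on the talk's slides):

    % State vector \sigma, N stored memory vectors \xi^{\mu},
    % and a rapidly growing separation function F.
    E(\sigma) = -\sum_{\mu=1}^{N} F\!\left(\xi^{\mu} \cdot \sigma\right)

    % The exponential choice of F gives the log-sum-exp energy whose
    % one-step minimization reproduces softmax attention:
    E(\sigma) = -\frac{1}{\beta}\,\log \sum_{\mu=1}^{N}
                \exp\!\left(\beta\, \xi^{\mu} \cdot \sigma\right)
                + \frac{1}{2}\,\lVert\sigma\rVert^{2}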

Comments • 5

  • @Anikung17 · 7 months ago

    Excellent talk, very interesting developments with the energy transformer

  • @user-os1gd3cc2l · 8 months ago

    thanks for sharing

  • @michaelcharlesthearchangel · 1 month ago

    Only geniuses realize the interconnection between Hopfield Networks and Neural Network Transformer models and, later, Neural Network Cognitive Transmission models.

  • @maxkho00 · 7 months ago

    Ngl, this was pretty confusing.
    For one, the two energy formulae at 12:32 are only equivalent if i=j, i.e. if the contribution of each feature neuron is evaluated independently. The second formula can be intuitively understood as representing the extent to which the state vector's shape in the latent space matches the shape of each of the memories, but the first formula is harder to conceptualise, and it's never explained how the first formula can be practically reduced to the second (i.e. why ignoring the interdependencies between the feature neurons in the energy formula doesn't make a practical difference).
    Secondly, without an update rule or at least a labelled HLA diagram, it was really hard to visualise the mechanics of the network; I had to pause the video and google the update rule to understand how dense Hopfield networks are even supposed to work. Dmitry did make the very vague statement that "the evolution of the state vector" is described, in some way, by the attention function, but he didn't explain in what way (is it the update rule? Is it a change vector? Is it something else? What does "V" correspond to? etc), which was pretty frustrating. For anyone watching, the attention function is the update rule where V is a linear transform of K; the value of the attention vector is substituted for Q, and the formula can be applied recursively (see the sketch after this thread).
    In general, I think more high-level explanations, especially within a consistent framework, would've been very helpful.

    • @joeysmoey3004 · 4 months ago

      For your first point, this is not true because the square of the sum is not the sum of the squares. There are cross terms which give you the non-independence.
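
For anyone who, like the commenter above, wants to see the update rule written out: below is a minimal numerical sketch (toy data; the function name, beta value, and retrieval example are illustrative assumptions, not code from the talk) of one step of the modern Hopfield update and of how it coincides with softmax attention when the stored memories play the role of keys and, through a linear map taken here to be the identity, of values.

    import numpy as np

    def hopfield_update(state, memories, beta=8.0):
        """One step of the modern Hopfield update rule.

        `memories` holds the stored patterns as rows (the "keys");
        `state` is the current state vector (the "query").  The new
        state is a softmax-weighted average of the memories, i.e. the
        softmax-attention formula with the values equal to the keys.
        Applying the step repeatedly drives the state toward the
        nearest stored pattern.
        """
        scores = beta * memories @ state            # similarity to every memory
        weights = np.exp(scores - scores.max())     # numerically stable softmax
        weights /= weights.sum()
        return memories.T @ weights                 # convex combination of memories

    # Toy example: three random "memories" in a 5-dimensional space.
    rng = np.random.default_rng(0)
    memories = rng.normal(size=(3, 5))

    # Start from a corrupted copy of memory 0 and iterate the update.
    state = memories[0] + 0.3 * rng.normal(size=5)
    for _ in range(5):
        state = hopfield_update(state, memories)

    # For well-separated random patterns the state should end up
    # nearly parallel to the memory it started closest to.
    cos = state @ memories[0] / (np.linalg.norm(state) * np.linalg.norm(memories[0]))
    print(f"cosine similarity to memory 0: {cos:.3f}")

With beta large, the softmax weights concentrate on a single memory, which is the sense in which the fixed points of these dynamics are (close to) the stored patterns.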