OpenAI CLIP model explained

  • Published 13. 09. 2024
  • CLIP: Contrastive Language-Image Pre-training
    In this video, I describe the CLIP model published by OpenAI. CLIP is based on natural language supervision for pre-training. Natural language supervision is not a new idea; in fact, there are two approaches to it. One approach tries to predict the exact caption for each image, whereas the other is based on a contrastive loss: instead of predicting the exact caption, it tries to increase the similarity of correct (image, text) pairs.
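
    As a minimal sketch (not from the video or the official CLIP code) of the contrastive objective described above: the encoders themselves are left out, `image_features` and `text_features` stand for batches of features from a hypothetical image encoder and text encoder, and the fixed `temperature` value is illustrative.

    ```python
    import torch
    import torch.nn.functional as F

    def clip_contrastive_loss(image_features: torch.Tensor,
                              text_features: torch.Tensor,
                              temperature: float = 0.07) -> torch.Tensor:
        """image_features, text_features: [batch, dim] outputs of the two encoders."""
        # L2-normalize so the dot product becomes a cosine similarity.
        image_features = F.normalize(image_features, dim=-1)
        text_features = F.normalize(text_features, dim=-1)

        # Pairwise similarity matrix: entry (i, j) compares image i with caption j.
        logits = image_features @ text_features.t() / temperature

        # Correct (image, caption) pairs sit on the diagonal; cross-entropy pushes
        # their similarity up and all mismatched pairs down, in both directions.
        targets = torch.arange(logits.size(0), device=logits.device)
        loss_images = F.cross_entropy(logits, targets)      # images -> captions
        loss_texts = F.cross_entropy(logits.t(), targets)   # captions -> images
        return (loss_images + loss_texts) / 2
    ```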

Comments • 8

  • @AI_For_Scientists · 2 days ago

    Great video series on ViT and its derivatives, watched all of it. Thank you very much for sharing.

  • @SebastianRaschka · 3 months ago

    Very nice video! I can also imagine that predicting the caption text exactly isn't only more difficult, but would also be more likely to result in (more) overfitting if it is learned this way.
    At 5:43, the pair-wise similarities: are they basically like cross-attention scores?

    • @PyMLstudio · 3 months ago · +1

      Yes, in a way it's analogous to cross-attention: we take the dot product between the features from the text encoder and the image encoder. This dot-product similarity is used as the final output of the model to determine whether an image and a text caption are related.
      Good question, thanks for the comment.
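
      A tiny illustrative sketch of what the reply describes at inference time (not code from the video): `image_feature` and `text_features` are assumed to be outputs of the image and text encoders, and the caption with the highest dot-product similarity is taken as the match.

      ```python
      import torch
      import torch.nn.functional as F

      def best_caption(image_feature: torch.Tensor, text_features: torch.Tensor) -> int:
          """image_feature: [dim]; text_features: [num_captions, dim]; returns index of best match."""
          # Normalize so the dot product becomes a cosine similarity.
          image_feature = F.normalize(image_feature, dim=-1)
          text_features = F.normalize(text_features, dim=-1)
          # One similarity score per candidate caption; the highest one wins.
          scores = text_features @ image_feature
          return int(scores.argmax())
      ```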

  • @fouziaanjums6475 · 2 months ago · +2

    Please cover the FasterViT model too...

    • @PyMLstudio · 2 months ago

      Absolutely, I'll cover that. I have a few other topics lined up, then I'll get to FasterViT.
      Thanks for the suggestion!

  • @randomstuff39280 · 1 month ago

    Thank you for explaining! Very clear!
    But I'm wondering how you know the WiT dataset is based on 50000 queries and 20000 pairs for each query? I can't find it in the paper.

    • @PyMLstudio · 26 days ago · +1

      Thanks for the comment!
      Please see page 3, Section 2.2, "Creating a sufficiently large dataset".
      But it's 500,000 queries, balancing up to 20,000 (image, text) pairs per query.