OpenAI CLIP Explained | Multi-modal ML

  • Published 29. 08. 2024
  • OpenAI's CLIP explained simply and intuitively with visuals and code. Language models (LMs) cannot rely on language alone. That is the idea behind the "Experience Grounds Language" paper, which proposes a framework to measure LMs' current and future progress. A key idea is that, beyond a certain threshold, LMs need other forms of data, such as visual input.
    The next step beyond well-known language models like BERT, GPT-3, and T5 is "World Scope 3". In World Scope 3, we move from large text-only datasets to large multi-modal datasets, that is, datasets containing information from multiple forms of media, such as both images and text.
    The world, both digital and real, is multi-modal. We perceive the world as an orchestra of language, imagery, video, smell, touch, and more. This chaotic ensemble produces an inner state, our "model" of the outside world.
    AI must move in the same direction. Even specialist models that focus on language or vision must, at some point, have input from the other modalities. How can a model fully understand the concept of the word "person" without seeing a person?
    OpenAI's Contrastive Language-Image Pretraining (CLIP) is a World Scope 3 model. It can comprehend concepts in both text and images and even connect concepts between the two modalities. In this video we will learn about multi-modality, how CLIP works, and how to use CLIP for different use cases like encoding, classification, and object detection; a short usage sketch follows the links below.
    🌲 Pinecone article:
    pinecone.io/le...
    🤖 70% Discount on the NLP With Transformers in Python course:
    bit.ly/3DFvvY5
    🎉 Subscribe for Article and Video Updates!
    / subscribe
    / membership
    👾 Discord:
    / discord
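    A minimal sketch of the encoding + zero-shot classification use case mentioned above, assuming the Hugging Face transformers implementation of CLIP (the checkpoint name and example image URL are illustrative, not necessarily those used in the video):

```python
# Zero-shot image classification with CLIP: score an image against a set of
# natural-language labels and pick the most similar one.
import requests
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# illustrative image URL (a COCO validation image)
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# candidate classes written as natural-language prompts
labels = ["a photo of a cat", "a photo of a dog", "a photo of a person"]

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds image-text similarity scores; softmax turns them
# into class probabilities for zero-shot classification
probs = outputs.logits_per_image.softmax(dim=-1)
print(dict(zip(labels, probs[0].tolist())))
```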

Comments • 33

  • @ricardojung3849
    @ricardojung3849 A year ago +3

    Thanks for reporting on, explaining, and opening up recent ML!
    I found CLIP to be very interesting, since I always frowned at the lost potential of two different embeddings being arbitrary and methodically separate. This is huge!

    • @jamesbriggs
      @jamesbriggs  A year ago +1

      yes, there will be plenty more on CLIP and other similar models very soon - some of the stuff I've built (and will demo) is awesome and uses nothing more than zero-shot CLIP, excited to share!

  • @mszak50
    @mszak50 10 months ago

    This was really excellent - some of the pieces are starting to make sense

  • @konichiwatanabi
    @konichiwatanabi A year ago

    Thank you so much for this great walkthrough! Looking forward to more

  • @DallanQuass
    @DallanQuass A year ago

    Great video! Looking forward to your next video diving more into using CLIP for zero-shot classification!

    • @jamesbriggs
      @jamesbriggs  A year ago

      Me too, it's fascinating. Thanks for watching!

  • @adrianarroyo9839
    @adrianarroyo9839 A year ago +1

    Nice video and explanation! I think at 28:45 you plotted cos_sim instead of dot_sim!

  • @ismailashraq9697
    @ismailashraq9697 A year ago

    This is amazing James. Thanks for the detailed explanation. I am excited for the future CLIP videos 🙂.

    • @jamesbriggs
      @jamesbriggs  A year ago

      Thanks Ashraq! As you know, I'm excited for them too

  •  11 months ago

    Thanks James, very good video about CLIP. Funny thing is that you display cos_sim twice, so the second time it is not dot_sim being displayed. And you struggled to find any difference between the two similarity matrices. LOL 🤣

  • @justinmiller7150
    @justinmiller7150 A year ago +1

    Great video. I think you may be plotting the same graph twice though (cos_sim). In practice the two seem to be almost identical.
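    (A minimal sketch, not the notebook from the video, of why the two plots look nearly identical: once the embeddings are L2-normalised, the dot product is the cosine similarity, so plotting either gives essentially the same matrix.)

```python
# Compare dot-product similarity and cosine similarity on stand-in
# "embeddings"; after L2-normalisation the two matrices coincide.
import numpy as np

rng = np.random.default_rng(0)
emb = rng.normal(size=(5, 512))            # stand-in for CLIP output vectors

dot_sim = emb @ emb.T                      # raw dot-product similarity
emb_norm = emb / np.linalg.norm(emb, axis=1, keepdims=True)
cos_sim = emb_norm @ emb_norm.T            # cosine similarity

print(np.allclose(dot_sim, cos_sim))       # False: raw vectors differ in length
print(np.allclose(emb_norm @ emb_norm.T, cos_sim))  # True: normalised dot product == cosine
```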

  • @Gabriel-ey5ky
    @Gabriel-ey5ky A year ago

    Really great video! I have just one thing to say: you should leave the images on screen longer. I had to pause the video multiple times to be able to understand them.

    • @jamesbriggs
      @jamesbriggs  A year ago

      Thanks Gabriel, I heard the same from another viewer - will do this going forward :)

  • @valentinfontanger4962

    Excellent video

  • @debashisghosh3133
    @debashisghosh3133 A year ago

    Really liked the content...thanks for sharing

  • @behnamplays
    @behnamplays A year ago

    Excellent content! As a suggestion, can you please keep the images/diagrams a bit longer? They move pretty fast in the video, which means I'll have to rewind the video every now and then.

  • @mvrdara
    @mvrdara A year ago +1

    Excellent explanation! We could build a YouTube video search engine powered by CLIP; perhaps you could iterate on the NLP YouTube search video you did?

    • @jamesbriggs
      @jamesbriggs  A year ago +1

      That's a great idea, but it might be difficult for YouTube videos where it is just someone talking, as the image embedding would just be something like "a person talking".
      Possibly it could be interesting to embed both the text + images with CLIP, and maybe even an averaged text+image embedding for parts of videos where both the speech + image are important.
      I will think about this more, it's a great idea so thank you!
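      (A hedged sketch of the idea above, assuming the Hugging Face transformers CLIP implementation; the frame filename and transcript snippet are hypothetical. It embeds one video frame and its transcript text, then averages the normalised vectors into a single segment embedding.)

```python
# Combine CLIP's image and text embeddings for one video segment by
# averaging the normalised vectors - a simple way to represent segments
# where both the speech and the visuals matter.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

frame = Image.open("frame_0421.jpg")  # hypothetical frame from a video segment
snippet = "the speaker explains contrastive pretraining"  # hypothetical transcript text

inputs = processor(text=[snippet], images=frame, return_tensors="pt", padding=True)
with torch.no_grad():
    img_emb = model.get_image_features(pixel_values=inputs["pixel_values"])
    txt_emb = model.get_text_features(
        input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"]
    )

# normalise each modality, then average into one segment embedding
img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)
txt_emb = txt_emb / txt_emb.norm(dim=-1, keepdim=True)
segment_emb = (img_emb + txt_emb) / 2  # shape: (1, 512)
```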

  • @user-tx9bl9sf7q
    @user-tx9bl9sf7q A year ago

    Thanks, it is very informative. Can you please explain and teach us how to do fine-tuning on a custom dataset?

  • @dancinghoka
    @dancinghoka 8 months ago

    Thanks a lot!

  • @AdeleHaghighatHoseiniA

    Thank you for the good explanation. If we have two different embeddings, like texts and 3D images, can we use CLIP to predict images?

  • @shaheerzaman620
    @shaheerzaman620 A year ago

    fantastic stuff!

  • @sharanbabu2001
    @sharanbabu2001 A year ago

    Nice explanation!

  • @abdirahmann
    @abdirahmann A year ago

    Is there a hosted API for CLIP where you can provide your image data and get the vectors back, instead of having to host it yourself - kind of like how you give an input to `ada-002`?

  • @anantzen171
    @anantzen171 A year ago

    10:23 I believe CLIP is an abbreviation of Contrastive Language-Image Pretraining

  • @pyalgoGPT
    @pyalgoGPT A year ago

    Please post Deep Reinforcement Learning tutorials & projects with Python!

    • @jamesbriggs
      @jamesbriggs  A year ago +1

      Eventually I’m sure I will, RL is very cool

  • @debayudhmitra9432
    @debayudhmitra9432 4 months ago

    Can you share the GitHub code, please?

  • @mackenzieclarkson8322
    @mackenzieclarkson8322 4 months ago

    Transitions are too flashy and triggering to my eyes. Good explainer however.