This was a useful summary for finding papers to research developments in multimodal machine learning models!
Thanks! Super glad you found the video resourceful!
Awesome video. Thanks mate🤩
Dude this video was so fucking good. Keep it up.
Great and clear video! Heard about multimodal models for the first time today, and i already feel like i have a better grasp of it, thanks to you :)
Excellent video with all the paper references. Lot to read and learn from papers. Thanks. :)
Thanks!🙏🏽
Really good stuff mate, subbed
Wow, there is lots of works behind it, thank you
Haha thanks for the comment! It’s an emerging area, and a lot of groundbreaking research really has happened in the past few years.
Like your videos. Explain things in a very clear way. Thx for sharing.
Thank you!
CODE IS BETTER ??
from transformers import (
    VisionEncoderDecoderModel,
    SpeechEncoderDecoderModel,
    AutoFeatureExtractor,
    AutoTokenizer,
)
print('Add Vision...')
# ADD HEAD
# Combine pre-trained encoder and pre-trained decoder to form a Seq2Seq model
Vmodel = VisionEncoderDecoderModel.from_encoder_decoder_pretrained(
"google/vit-base-patch16-224-in21k", "LeroyDyer/Mixtral_AI_Tiny"
)
_Encoder_ImageProcessor = Vmodel.encoder
_Decoder_ImageTokenizer = Vmodel.decoder
_VisionEncoderDecoderModel = Vmodel
# Attach the combined model to the base LM
LM_MODEL.VisionEncoderDecoder = _VisionEncoderDecoderModel
# Add Sub Components
LM_MODEL.Encoder_ImageProcessor = _Encoder_ImageProcessor
LM_MODEL.Decoder_ImageTokenizer = _Decoder_ImageTokenizer
LM_MODEL  # display the updated model
This is how you add vision to an LLM (you can embed the head inside the model).
print('Add Audio...')
# Add head
# Combine pre-trained encoder and pre-trained decoder to form a Seq2Seq model
_AudioFeatureExtractor = AutoFeatureExtractor.from_pretrained("openai/whisper-small")
_AudioTokenizer = AutoTokenizer.from_pretrained("openai/whisper-small")
_SpeechEncoderDecoder = SpeechEncoderDecoderModel.from_encoder_decoder_pretrained("openai/whisper-small","openai/whisper-small")
# Set decoder start / pad tokens (Whisper's tokenizer has no CLS token, so use BOS as the decoder start)
_SpeechEncoderDecoder.config.decoder_start_token_id = _AudioTokenizer.bos_token_id
_SpeechEncoderDecoder.config.pad_token_id = _AudioTokenizer.pad_token_id
LM_MODEL.SpeechEncoderDecoder = _SpeechEncoderDecoder
# Add Sub Components
LM_MODEL.Decoder_AudioTokenizer = _AudioTokenizer
LM_MODEL.Encoder_AudioFeatureExtractor = _AudioFeatureExtractor
LM_MODEL  # display the updated model
This is how you can add audio.
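The snippet above boils down to one pattern: attach each pretrained encoder-decoder head to the base language model as a plain attribute. Here is a minimal sketch of that wiring using placeholder classes (the `DummyEncoderDecoder` and `MultiModalLM` names are stand-ins for the real transformers models and for the assumed `LM_MODEL` object, which is loaded elsewhere):

```python
class DummyEncoderDecoder:
    """Placeholder for a pretrained seq2seq head, e.g. a VisionEncoderDecoderModel."""

    def __init__(self, name):
        # Real models expose .encoder and .decoder sub-modules; strings stand in here.
        self.encoder = f"{name}-encoder"
        self.decoder = f"{name}-decoder"


class MultiModalLM:
    """Placeholder for the base LM; heads are added by plain attribute assignment."""
    pass


lm = MultiModalLM()

# Vision head: attach the combined model plus its encoder/decoder sub-components
vision = DummyEncoderDecoder("vit")
lm.VisionEncoderDecoder = vision
lm.Encoder_ImageProcessor = vision.encoder
lm.Decoder_ImageTokenizer = vision.decoder

# Audio head: same pattern with a speech encoder-decoder
audio = DummyEncoderDecoder("whisper")
lm.SpeechEncoderDecoder = audio
lm.Encoder_AudioFeatureExtractor = audio.encoder
lm.Decoder_AudioTokenizer = audio.decoder
```

Note this only wires the objects together; to actually route image or audio features into the LM's embedding space you still need a projection layer and fine-tuning, which the original snippet does not cover.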
Excellent
awesome!!!!!!!!!!
7:55 lol
Honest reactions lol😅
do you think you should tune your audio levels or what? according to youtube, i am your 666th view
Always open for feedback. What kind of tuning are we talking about?
@LonewolfeSlayer Sounds good... something to keep in mind for my next one. :)