In this session of Computer Vision Study Group, Johannes walks us through the paper BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models.
czcams.com/video/k0DAtZCCl1w/video.html According to the paper, in this step the query tokens only attend to each other, whereas the text tokens attend to all query tokens and to the previous text tokens.
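For anyone who wants to see that rule as an actual mask, here is a minimal sketch in plain Python. It is not taken from the BLIP-2 code base: the function name `multimodal_causal_mask`, the ordering (queries first, then text), and the choice to let each text token also see itself are all assumptions made for illustration.

```python
def multimodal_causal_mask(num_query: int, num_text: int) -> list[list[bool]]:
    """Build a boolean attention mask; mask[i][j] is True when i may attend to j.

    Hypothetical layout: positions 0..num_query-1 are query tokens,
    the remaining num_text positions are text tokens.
    """
    total = num_query + num_text
    mask = [[False] * total for _ in range(total)]
    for i in range(total):
        for j in range(total):
            if i < num_query:
                # Query row: attends to every query token, never to text.
                mask[i][j] = j < num_query
            else:
                # Text row: attends to every query token, plus text tokens
                # up to and including itself (assumed standard causal masking).
                mask[i][j] = j < num_query or j <= i
    return mask


if __name__ == "__main__":
    # Print a small mask: "x" means attention is allowed, "." means blocked.
    for row in multimodal_causal_mask(num_query=2, num_text=3):
        print("".join("x" if allowed else "." for allowed in row))
```

With 2 queries and 3 text tokens, the two query rows are allowed only in the first two columns, while each text row sees both queries and a growing prefix of the text. That is why a figure drawing a link from a query to a text token would contradict this masking scheme.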
I like your story! Thank you so much for sharing it!
Great intro with the story! Nice and easy presentation. Thank you!
Thank you :)
Thank you for your efforts! It's a great video. Also, the stories at the beginning are always my favorite, lol.
Thank you, I have to admit the stories are also what I have most fun with when creating the presentations ;)
Best BLIP-2 explanation on YouTube!
Good explanation! It answered some of the questions I had while reading the paper.
In the figure shown at 18:51, the queries should not be connected to the text tokens.
I think you are right, query tokens cannot attend to text tokens.
Very good explanation, thank you.
Awesome video and great explanation! Keep it up!
Great job!
I enjoyed your story!
not the most technical group session but thanks for your effort.
Thanks!
Thanks