Gemini Demo But With GPT-4 Vision API
- Published 27. 07. 2024
- Github: github.com/unconv/gpt4v-gemini
In today's video I showcase a Python program I made using OpenAI's GPT-4 Vision API, Text-to-Speech API and Whisper, which attempts to accomplish what the Google Gemini multimodal demo shows.
More information on the project coming up in future videos.
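(For the curious, here is a minimal sketch of the vision part: sending one captured frame to the GPT-4 Vision API with the official openai package. `describe_frame` is an illustrative name and this is not necessarily how the repo does it.)

```python
# Illustrative sketch, not the repo's actual code; assumes the official
# openai Python package and an OPENAI_API_KEY in the environment.
import base64
from openai import OpenAI

client = OpenAI()

def describe_frame(image_path, question):
    # Encode the captured frame as base64 so it can be inlined in the request
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    response = client.chat.completions.create(
        model="gpt-4-vision-preview",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
        max_tokens=300,
    )
    return response.choices[0].message.content
```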
Support: buymeacoffee.com/unconv
Consultations: www.buymeacoffee.com/unconv/e...
Memberships: www.buymeacoffee.com/unconv/m...
00:00 Demo
03:25 Bloopers
05:26 Unedited Version
Better Demo than Google 🙂
Your demo is better than the one from Google. It looks like they hand-selected screenshots to send and gave more hints in their prompts, but didn't include the whole prompt in the demo.
Wow! Looks impressive!
❤
that last blooper was funny :D
nice demo!! the most interesting part here I'd say is when to capture the screenshot - maybe when you pause talking? 🤔 and maybe you can add multiple captures when there's movement, or diff images every second.
When it detects movement, it starts saving all the frames until movement stops. Then it splits the list of frames into six equal parts. Then it takes the sharpest frame from each part and makes a collage from them and sends that to ChatGPT. And yes, when talking stops, it sends a screenshot. If there was movement during talking, it sends the collage as well.
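(A minimal sketch of what that sharpest-frame collage step could look like, assuming OpenCV and NumPy and same-size frames; `sharpness` and `make_collage` are illustrative names, not the repo's actual code.)

```python
# Illustrative sketch of the collage step described above.
import cv2
import numpy as np

def sharpness(frame):
    # Variance of the Laplacian is a common focus/sharpness measure
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return cv2.Laplacian(gray, cv2.CV_64F).var()

def make_collage(frames, parts=6):
    # Split the frames captured during movement into `parts` equal chunks
    # (assumes at least `parts` frames were captured)
    chunks = np.array_split(np.asarray(frames), parts)
    # Keep the sharpest frame from each chunk
    best = [max(chunk, key=sharpness) for chunk in chunks]
    # Tile the six picks into a 2x3 grid so only one image goes to the API
    rows = [np.hstack(best[i:i + 3]) for i in range(0, len(best), 3)]
    return np.vstack(rows)
```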
@unconv awesome!! great job!! I love the idea of making a collage - I didn't see that coming. keep it up bro!!
Are you sending all the frames to GPT-4V? I have a function which compares subsequent frames in a video and only extracts the ones that meet a difference threshold, so for example out of a 25-second video it might pull out 7 frames that differ enough to be significant enough to send to the API.
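(A hedged sketch of such a frame-difference filter, assuming OpenCV and NumPy; `extract_key_frames` and the threshold value are illustrative, not the commenter's actual function.)

```python
# Illustrative sketch: keep only frames that differ enough
# from the last frame that was kept.
import cv2
import numpy as np

def extract_key_frames(video_path, threshold=0.1):
    cap = cv2.VideoCapture(video_path)
    kept, last = [], None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Mean absolute pixel difference, normalized to 0..1
        if last is None or np.mean(cv2.absdiff(gray, last)) / 255.0 > threshold:
            kept.append(frame)
            last = gray
    cap.release()
    return kept
```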
you can even diff screenshots every second, I think that would be sufficient.
@robrita it depends - where scrolling text is involved, for example, it is not, and there's no point introducing potential loss where there is no cost. A lot can happen in a second.
@avi7278 of course there's a cost, why assume not?? more images take longer to respond, and it's unnecessary resource usage for most use cases.. it's not like running YOLOv8 on your own PC 😆😆😆
During movement it only sends a collage of 6 "strategically selected" frames (a single image), plus one screenshot after talking stops.
Will it work if I try to change the model to Gemini Vision with all the parameters?
Great demo!
Unrelated to this video... I tried your "ok-gpt" code with whisper (the tiny model) on a Pi 4. Recognition works fine, but latency is kind of a deal breaker :( I guess I have a reason to get a Pi 5 now!
Thanks! Good to know. Maybe I'll try to switch it to using the Whisper API (and leak all conversations to OpenAI lol)
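(If that switch happens, the hosted Whisper API call with the openai package could look roughly like this; `transcribe` is an illustrative name, not the ok-gpt code.)

```python
# Illustrative sketch of swapping local Whisper for OpenAI's hosted Whisper API.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def transcribe(audio_path):
    # Upload the recorded audio and get back the transcription text
    with open(audio_path, "rb") as f:
        result = client.audio.transcriptions.create(model="whisper-1", file=f)
    return result.text
```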