GPT-4 Vision Browsing Part 2: Following links with Puppeteer
- Added 27 Jul 2024
- In today's video I continue my GPT-4 Browsing project and make it follow links on pages.
GitHub: github.com/unconv/gpt4v-browsing
Support: buymeacoffee.com/unconv
Consultations: www.buymeacoffee.com/unconv/e...
Memberships: www.buymeacoffee.com/unconv/m...
00:00 Recap
02:41 Finding all clickable elements
31:06 Making GPT-4 Vision read link texts
47:03 Migrating to JavaScript
1:01:58 Clicking links by link text in Puppeteer
1:08:42 Making it conversational
1:17:40 It works! (almost)
1:20:00 It really works! - Science and technology
Thanks for making these videos. They are a lot of fun and very informative.
Good to hear! Thanks for watching :)
That was fun to watch. Can't wait for the next episode. 👍
Great Video. Appreciate that you just kept going, trying this and that, as you fought with visionGPT, it is very informative. Keep Going ...
Nice, what I was looking for! Thanks.
You just keep on trucking :-) another good one!
Haven't finished the video yet - but wanted to mention that I've sometimes had GPT behave a little oddly when I specifically use the word 'crawler' in a prompt. It's a bit like it goes into "I shouldn't really be doing this.. mnnghhhhhh!" mode. But telling it that it's a super-duper-wonderful positive thing to help a visually impaired user navigate the web works more reliably. Not sure if it's just random chance - but seemed to work at the time.
"You are a crawler... or else!"
Following up on the hour-long Google Meet I purchased - when can we do it? @unconv
@st-hf2ik Check your email :)
I think you will be able to use another vision LLM with open-source code and not have these kinds of problems.
This oddly works, and errors have dropped a lot. I guess GPTs vary in how willing they are to help if the person is disabled?
Subscribed. Thank you so much for this. Bravo!
Subscribed! thank you so much for the great content!
Awesome stuff!
Amazing! Always fun to see real-time problem solving and how to go about it! Just saw the web crawler used by Jason AI in his latest video... been following both of you since April... thanks to both!! 🍻🍻
Awesome!
@unconv can you please make part 3 too :)
@unconv we def need the part 3 with the input field functionality
It was amazing
Yeah, I'm having some code written for me with Selenium. If you can give me an idea of how this is better, that would be great. These are kind of long tutorials, so a little synopsis would help.
Useful
Hey man - amazing video. How would you go about deploying this? AWS? Vercel?
Great video! I am now a fan. I have pulled the code and the link clicking failed each time. I suspect the Node version, maybe. What Node version are you running?
I have Node 19.6.0
Do you intend to push this to the gpt4v-browsing repo? It would be appreciated.
Ah found it never mind :-D
Great video. Really interesting.
One of the issues I ran into with a similar project sending website screenshots to gpt-4o had to do with long web pages. They would generate long, skinny images, which were unreadable by the AI. As I understand it, screenshots are resized on OpenAI's end to fit within a square (e.g. 1024x1024), maintaining the original image's aspect ratio. This results in a lot of the text being unreadable (too small). I've tried splitting these long images into part_1, part_2, etc., but it obviously results in some images getting split in nonsensical areas, which also causes problems. Would love to hear your thoughts on this.
In my video "5 Use Cases for GPT-4 Vision API" I scrape an Amazon search results page by splitting the screenshot into parts, but I also "overlap" the parts so that no product gets cut in half. Depending on the specific website you're scraping something like this might work.
Will check it out! 👍
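For readers who want to try the overlapping split mentioned in the reply above, here is a minimal sketch of the arithmetic. It is not the code from either video; the function name `overlappingSlices` and its parameters are hypothetical, and the slice height and overlap are values you would tune per site.

```javascript
// Hypothetical helper: compute vertical slices of a tall screenshot so that
// each slice shares `overlap` pixels with the one before it. An item cut at
// the bottom of one slice then appears whole in the next, as long as the
// item is no taller than `overlap`.
function overlappingSlices(totalHeight, sliceHeight, overlap) {
  if (sliceHeight <= 0 || overlap < 0 || overlap >= sliceHeight) {
    throw new RangeError("need sliceHeight > overlap >= 0");
  }
  const step = sliceHeight - overlap; // how far each new slice moves down
  const slices = [];
  for (let top = 0; top < totalHeight; top += step) {
    // The last slice may be shorter than sliceHeight.
    const height = Math.min(sliceHeight, totalHeight - top);
    slices.push({ top, height });
    if (top + height >= totalHeight) break; // bottom of the page reached
  }
  return slices;
}

// Example: a 2500px-tall page, 1000px slices, 200px overlap.
console.log(overlappingSlices(2500, 1000, 200));
```

Each `{ top, height }` pair could then be used as the vertical part of a crop rectangle, for example with the `clip` option of Puppeteer's `page.screenshot`, assuming the page width is known. The overlap only helps when the items on the page are shorter than the overlap itself.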