GPT-4 Vision Browsing Part 2: Following links with Puppeteer

  • Added on 27 Jul 2024
  • In today's video I continue my GPT-4 Browsing project and make it follow links on pages.
    GitHub: github.com/unconv/gpt4v-browsing
    Support: buymeacoffee.com/unconv
    Consultations: www.buymeacoffee.com/unconv/e...
    Memberships: www.buymeacoffee.com/unconv/m...
    00:00 Recap
    02:41 Finding all clickable elements
    31:06 Making GPT-4 Vision read link texts
    47:03 Migrating to JavaScript
    1:01:58 Clicking links by link text in Puppeteer
    1:08:42 Making it conversational
    1:17:40 It works! (almost)
    1:20:00 It really works!
  • Science & Technology

Comments • 35

  • @marcoaerlic2576 2 months ago

    Thanks for making these videos. They are a lot of fun and very informative.

    • @unconv 2 months ago

      Good to hear! Thanks for watching :)

  • @carstenli 7 months ago +1

    That was fun to watch. Can't wait for the next episode. 👍

  • @MindForeverVoyaging 7 months ago

    Great video. I appreciate that you just kept going, trying this and that, as you fought with visionGPT. It is very informative. Keep going...

  • @Sulayman.786 8 months ago +1

    Nice, what I was looking for! Thanks.

  • @digitalcivilulydighed 8 months ago

    You just keep on trucking :-) another good one!

  • @billybofh2363 8 months ago +6

    Haven't finished the video yet, but I wanted to mention that I've sometimes had GPT behave a little oddly when I specifically use the word 'crawler' in a prompt. It's a bit like it goes into "I shouldn't really be doing this.. mnnghhhhhh!" mode. But telling it that it's a super-duper-wonderful positive thing to help a visually impaired user navigate the web works more reliably. Not sure if it's just random chance, but it seemed to work at the time.

    • @unconv 8 months ago +2

      "You are a crawler... or else!"

    • @st-hf2ik 7 months ago

      @unconv Following up on the one-hour Google Meet I purchased: when can we do it?

    • @unconv 7 months ago

      @st-hf2ik Check your email :)

    • @npizza3973 7 months ago

      I think you could use another, open-source vision LLM and not have this kind of problem.

    • @DJcatamount 6 months ago

      This oddly works, and errors have gone down a lot. I guess GPTs vary in how willing they are to help depending on whether the person is disabled?

  • @RahulGupta-uk1gc 7 months ago

    Subscribed. Thank you so much for this. Bravo!

  • @techfren 6 months ago +1

    Subscribed! Thank you so much for the great content!

  • @m1kecr1s1s 7 months ago

    Awesome stuff!

  • @ex3aliber 7 months ago +2

    Amazing! Always fun to see real-time problem solving and how to go about it! Just saw the web crawler used by Jason AI in his latest video... I've been following both of y'all since April... thanks to both!! 🍻🍻

  • @jayakrishnanp5988 7 months ago

    It was amazing

  • @toapyandfriends 4 months ago +1

    Yeah, I'm having some code written for me with Selenium. Could you give me an idea of how this is better? These are kind of long tutorials, so it would be great if you could give a little synopsis.

  • @neon_Nomad 7 months ago

    Useful

  • @user-ep3pm2tw1e 5 months ago

    Hey man - amazing video. How would you go about deploying this? AWS? Vercel?

  • @gaboguit 7 months ago

    Great video! I am now a fan. I pulled the code, and the link clicking failed each time. I suspect it may be the Node version. What Node version are you running?

    • @unconv 7 months ago

      I have Node 19.6.0

  • @actorjohanmatsfredkarlsson2293

    Do you intend to push this to the gpt4v-browsing repo? It would be appreciated.

  • @RyanCourtnage 2 months ago

    One of the issues I ran into with a similar project sending website screenshots to gpt-4o had to do with long web pages. They would generate long, skinny images that were unreadable by the AI. As I understand it, screenshots are resized on OpenAI's end to fit within a square (e.g. 1024x1024) while maintaining the original image's aspect ratio. This leaves a lot of the text too small to read. I've tried splitting these long images into part_1, part_2, etc., but that obviously results in some images getting split in nonsensical places, which also causes problems. Would love to hear your thoughts on this.

    • @unconv 2 months ago

      In my video "5 Use Cases for GPT-4 Vision API" I scrape an Amazon search results page by splitting the screenshot into parts, but I also "overlap" the parts so that no product gets cut in half. Depending on the specific website you're scraping, something like this might work.

    • @RyanCourtnage 2 months ago

      Will check it out! 👍
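
      For anyone wanting to try the overlapping split mentioned in this thread, here is a minimal sketch of how the slice offsets could be computed. This is not the code from the referenced video; the function name `overlappingSlices` and its parameters are hypothetical, and the right slice height and overlap will depend on the page being scraped.

      ```javascript
      // Hypothetical helper (not from the video): compute vertical slices of a
      // tall screenshot so that consecutive slices share `overlap` pixels.
      // Anything cut at the bottom of one slice then appears whole in the next,
      // as long as it is no taller than `overlap`.
      function overlappingSlices(totalHeight, sliceHeight, overlap) {
        if (totalHeight <= 0 || sliceHeight <= 0) {
          throw new RangeError("totalHeight and sliceHeight must be positive");
        }
        if (overlap < 0 || overlap >= sliceHeight) {
          throw new RangeError("overlap must be >= 0 and smaller than sliceHeight");
        }

        const step = sliceHeight - overlap; // how far each slice starts below the previous one
        const slices = [];
        for (let top = 0; ; top += step) {
          // The last slice may be shorter than sliceHeight.
          const height = Math.min(sliceHeight, totalHeight - top);
          slices.push({ top, height });
          if (top + height >= totalHeight) break; // reached the bottom of the page
        }
        return slices;
      }

      // Example: a 2500px tall page, 1000px slices, 200px overlap.
      console.log(overlappingSlices(2500, 1000, 200));
      // [ { top: 0, height: 1000 }, { top: 800, height: 1000 }, { top: 1600, height: 900 } ]
      ```

      In Puppeteer, each slice could then be captured with the `clip` option of `page.screenshot()` (`{ x: 0, y: top, width, height }`), assuming the viewport width is known. A larger overlap costs more images per page but lowers the chance that an item is cut in every slice it touches.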