Scrape Amazon Product Data using ChatGPT and Python:
czcams.com/video/9MNCGdaJfA0/video.html
Guys, this world is new to me. What is the purpose of scraping Amazon or eBay or any other website to collect product information? Could any kind soul enlighten me?
This is like when calculators were first invented
yup
exactly. You got it
Except it's a calculator for everything knowledge based.
@@TechnoMinarchist you also got it
women?
I’m not scared of you at all , you’re my new coding buddy. Can’t wait for version 4 and better servers
sure. you can use ChatGPT to solve math homework, or do coding assignment , or even use at your work.
@@ChatGPT-AI no sir, chatgpt sucks at math and logical problems. try it
@@oxi2118 It did a lot of algorithms for me, what are you talking about?
@@0mdf ask it a general mathematical question, like "how many numbers which end with 9 are divisible by 7"
@@oxi2118 yes, true, it doesn't do well with math. I tried it and it sucks
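For what it's worth, the question in this thread only has a finite answer once you set an upper bound, so here is a quick sketch that assumes a bound of 1000 (my choice, not part of the question). A number ends in 9 and is divisible by 7 exactly when it leaves remainder 49 when divided by 70, so the matches are 49, 119, 189 and so on. The function name `count_matches` is made up for illustration.

```python
def count_matches(limit):
    """Count the positive integers up to `limit` that end in 9 and are divisible by 7."""
    return sum(1 for n in range(1, limit + 1) if n % 10 == 9 and n % 7 == 0)


limit = 1000  # assumed bound; the original question does not give one
print(count_matches(limit))      # 14
print((limit - 49) // 70 + 1)    # same count from the closed form (valid for limit >= 49)
```

This is the kind of thing that is easier to check by running three lines of code than by trusting a chat answer.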
Keep in mind the beautifulsoup library only works on non-dynamic pages which are fully loaded and do not use s; it would be better to call beautifulsoup a DOM parser than to say you are "scraping" with it. There are countless other limitations with the requests/beautifulsoup mechanisms; you will hit a capabilities wall very quickly.
If you want to do real web scraping, you need to be using something like Selenium which utilizes a real web browser driver and can interface with a live DOM environment in real time, operating against real DOM elements as well as being able to switch context to internal elements and controlling things like the back and forward features of a browser as well as tabs.
If you want to scrape a page which has dynamically loaded content, you need something like Selenium. It's a whole different beast, but it's worth the time to learn (correctly)
Selenium is pretty easy to learn though, and very powerful. ChatGPT is pretty good with it. It's kind of fun. Like one of the sites I scraped for product reviews, I had to scroll to the bottom, wait two seconds, and scroll to the bottom again to get it to run the javascript to load product reviews. Just asked ChatGPT "how would you scroll to the end of a web page and wait two seconds then scroll to the end again in Selenium?" Got my answer in 2 seconds hah!
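For anyone curious what the answer to that scroll question tends to look like, here is a minimal sketch. It assumes a Selenium WebDriver that is already open on the page; the function name `scroll_twice` and the `pause` parameter are made up for illustration, and the two-second wait is just the value from the comment above.

```python
import time


def scroll_twice(driver, pause=2.0):
    """Scroll to the bottom, wait, then scroll to the bottom again.

    `driver` is assumed to be a Selenium WebDriver (anything that exposes
    execute_script works). The second scroll is there because the first one
    may trigger JavaScript that makes the page longer.
    """
    scroll_js = "window.scrollTo(0, document.body.scrollHeight);"
    driver.execute_script(scroll_js)  # first scroll: triggers lazy loading
    time.sleep(pause)                 # give the page time to load more content
    driver.execute_script(scroll_js)  # second scroll: reach the new bottom
```

With Selenium installed you would call it roughly like `driver = webdriver.Chrome()`, `driver.get(url)`, `scroll_twice(driver)`, and then read `driver.page_source`. A fixed sleep is the simplest thing that works; Selenium's explicit waits are more robust if you know which element you are waiting for.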
Or if you're in TS/JS use puppeteer which supports async.
the perfect combo would be:
scrapy + playwright
@MSD Group there's literally more musicians than ever what kind of weird comparison did you make?
@MSD Group lol the musician market is so oversaturated with talented people
The fact that you can have ChatGPT right code is so amazing to me, makes new concepts for code so much easier to figure out.
right ?
ChatGPT can also wrong the code too
it wrote me a C code with JSON in it
It can also teach you to use write instead of right
@@xbon1 it can write the right grammar, right?
I love the fact that even the video narration is completely AI; from the wording, to the slightly uncanny speech synthesis itself. I bet the entire video was the result of someone typing in a request to ChatGPT, to "write a script for a YouTube video which explains the use of ChatGPT to scrape websites". In the future, web dev may be more about developing the right questions to ask of AI...
Came here to say the same thing lol
This is valid only if the website hasn't changed layout after AI was trained. ChatGPT kept offering wrong code for a JS library, suggesting old version syntax.
Give it html of current page so it can parse it and then interpret the new layout
@if3lixde803 I think he means literally copying and pasting the whole html rendered of the web
@if3lixde803 Common misconception, actually. It does have access to the internet (even though it was only _trained_ on pre 2020 data). It doesn’t like to access the internet, though, and you may need to jailbreak it into DAN first, but there are plenty of examples of people linking ChatGPT to StableDiffusion and Midjourney to make images right in the ChatGPT dialogue
@@Jonpot Well, here's what ChatGPT itself thinks about it: "As an artificial intelligence, I don't have the ability to access the internet or browse the web. I am a large language model that has been trained on a dataset of texts, and I use that training to generate human-like responses to the questions and prompts that I receive. I don't have the ability to browse the web or access new information beyond what I was trained on, but I can use the information that I was trained on to try to provide helpful and accurate responses to the questions that I receive. Is there something specific that you would like to know?"
let the script kiddies have a nice day mile
I've tried several websites and it just guesses the web elements. ChatGPT doesn't visit the URL and grab the source code, so this is a bit pointless for most situations.
Exactly. This is just a language model; it's not interacting with the world. People don't get it because of how this is marketed and shared on social media
Yet... Admittedly, it's not an easy task to do live updates. But, the future is the future..
Because this is the ChatGPT beta and it doesn't have any access to the Internet
To clarify: ChatGPT won't go to the website and won't analyze its HTML to scrape it.
It will only work if the site you're trying to parse was somehow part of the data ChatGPT was trained on.
Nonsense. ChatGPT has provided the Python code to scrape (any) website. You just have to change that code to point to the URL you wish to scrape.
@@juleswombat5309 All sites have a different structure, so it wouldn't work. Or do you mean changing the URL in the ChatGPT prompt? That won't work either, since ChatGPT is not going to visit the URL and analyze the site structure.
@@andrey730 True! It also doesn't account for if the website loads data client side and Selenium is required.
This Jules guy doesn't know what he's talking about 😂😂😂
Bearing in mind ChatGPT is not connected to the internet, it would be very challenging technically to create code that is capable of scraping any website. Nonetheless, if you have seen this, please reference the source.
The improper use of plurals is interesting. Is there some translation/ai reason behind this? I know that voice mimicry AI has also become quite impressive. Was this script passed through a translator, then through a text to speech generator?
Yes, the voice is synthesized
Probably gpt voiceover lool
Fun fact the narrator and script is also ai generated. Jk I can’t even tell anymore
This reply is from Human.
Really scary..even chat gpt can create video🤦
@@ChatGPT-AI the imposter is among us.
@@ChatGPT-AI That's what an AI would say
@@ChatGPT-AI What app did you use for the narrator voice?
And just like that thousands of web scraping software vendors cried out in terror
ChatGPT, create a video on how to create a web scraper in Python; include script and video assets, describe in detail
Few years ago, knowing how to scrape used to be cool!
straight and to the point, best tutorials hands down
I was expecting you to perform web scraping _inside_ ChatGPT, like, "Pretend that you are a terminal. Run my input as a command and respond only with the output in a code block. My first command is: `curl [URL to scrape here]`"
I don't think they have access to internet
@@akrakorab8897 I think the more important thing is openai is not gonna let you run commands on their server
Not possible, it's a language model. It cannot run commands, just reply with what it thinks the next text should be.
@@hippopotamus86 You CAN do it, ChatGPT will just hallucinate what it thinks the website has on it.
I think you can achieve this by using the api and connect it to whatever db you prefer.
Thanks, I just scaled this up to a Firebase app for scraping and adding data into WordPress blog posts - love ChatGPT code help - woo
Blessed be the data labelers
Absolutely incredible. If you’re willing to invest and spend time learning and playing around with this tech, your opportunities are endless
What did you use for the voice synth? Very convincing. Sounds American. Couldn't tell it was fake except it skips words like 'the'
Not only is this about ChatGPT, but the whole video was produced using ChatGPT.
yep. ai voice, sterile video structure, etc
you're the best chatgpt
A new era of programming is dawning. We're about to be freed from the tedium of coding and able to focus on problem solving. .... or fired.
Can't wait to implant this into my brain. :)
Please make a video about how to use chat GPT to build a keyword search tool for a niche marketplace 😊
You can't build a keyword research tool without data.
All the tools available on the market are built on top of the Google Ads API.
They didn't build anything magical.
They are using the Google Ads API (obviously paid plans) to get search-related data (as Google is the most used search engine and has built its own data).
Just create a Google Ads account (it's free) and you can do keyword research for free...
Chatgpt is savior
Didn't know that chatgpt can take live data from website
Sometimes the code is not complete (CGPT stops mid line).
Have you found a way to ensure it exports all the code in one go?
Just write 'continue' when it stops
Type -> continue.
ChatGPT has a time limit for writing a response, due to high traffic and this being the free version.
If it produces the code in chunks,
then ask it to write the code in a file.
If the code is too big, it will stop, so type continue and it will finish the rest.
@@ChatGPT-AI If I type continue, it reproduces the whole code again, but different, and stops again mid-line
@@yokoabi8307 Tell it to write, for example, 100 lines of code and, after writing, wait for an 'ok' from you before it continues writing
@@yokoabi8307 Let me offer another suggestion. Copy the line of code where it stopped, say "continue from the code below", and paste the code using shift+enter
I will be rich with this bot! Let's gooo!!!
Excellent. I need to try this!
what voice bot do you use it sounds really realistic .
Yes it's exactly like a person who barely knows English, I'm amazed
Nice idea!
A live website, I mean a dynamic website, will not be able to give data to BeautifulSoup
I love a good ratting. Highly ratted books are the ones with the most droppings inside.
Amazing
Thanks
Dude what text to speech application are you using here?
Good lord, I didn't think I'd be this motivated once more to learn new stuff. The last time I've felt this way is when I discovered WolframAlpha and used SageMath to help with my Math homework.
Now I'm gonna be reinforcing my Data Science knowledge with this? Oh boy, I'm ready.
Since ChatGPT does not have access to the internet, it will not work on most pages.
It generates code like "for each element with this HTML class", but this class does not exist
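One way to catch this early is to check the class names from the generated code against the page's real HTML before running the scraper. Below is a minimal sketch using only the standard library; the helper name `missing_classes`, the sample HTML and the class names are all made up for illustration.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collect every CSS class name that appears in an HTML document."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def missing_classes(html, wanted):
    """Return the class names from `wanted` that never occur in `html`."""
    collector = ClassCollector()
    collector.feed(html)
    return sorted(set(wanted) - collector.classes)


# Hypothetical page source and hypothetical class names from generated code.
page = '<article class="product_pod"><p class="price_color">51.77</p></article>'
print(missing_classes(page, ["product_pod", "price_color", "product-title"]))
# ['product-title']
```

Anything in the returned list is a selector the model guessed. Note that this only sees the HTML you feed it, so content loaded later by JavaScript will still show up as missing.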
very cool
ChatGPT is like a digital Indiana Jones, except instead of a fedora and a whip it's got a codebase and an internet connection
What AI voice do you use? It sounds nice
This is very nice, but what do we do with this information? Simpler question: what is website scraping used for?
To collect data. (for detail , just google it or ask chat gpt)
what kind of text to voice AI are u using, so natural!
Will it write a scraper using Selenium or Scrapy? bs4 can't handle websites that rely on JavaScript, which is almost all websites
I'm sorry if it's a basic question, but what do you guys use this data for?
for your own purpose, for example put the data in your website/app. This is useful if the web does not have an open API to use.
shit thank you I totally forgot about scrapping
So let's say the Python we're requesting has more steps in it. How do we manage and instruct ChatGPT to output complete code sections without running out of tokens? When I type "continue" it doesn't always pick up where it left off. Sometimes it moves on to the next part of the code. Any suggestion on how to manage this, or is it what it is for now?
If it stops at some function, you can ask something like: code {this} function.
Smarter prompts can also help. Tell it to find more efficient ways or to split it into chunks
Give it an instruction at the beginning, like: when I say continue, continue from where you left off without instructing me, etc. Another method is to write the last word the bot produced and say continue from there
I write "Write missed part of code"
For me, I just ask for no comments, since it seems that there's a word limit and comments add more
Here we go
This query violates our policy
Ignore that
The website I scraped was created for web scraping practice. Check it out:
books.toscrape.com/
I would like to ask chatGPT to stop WARs and POVERTY around the Globe...But I am afraid that will eliminate humanity ;)
My mind is racing with all the awesome ideas that I can finally start.
You could've always done them. Learning to code was your only obstacle.
@@reprovedcandy such a small obstacle
give me some ideas
Helpful!
Thank you very much for this wonderful tutorial. I was not familiar with Python, as I use other web scraping tools (obviously not so complete), but I managed to install it and its libraries. I would like to ask you how to get all pages or a certain number of pages. Thanks again!
Ask ChatGPT, not here
If you love scraping, use dedicated platforms such as Apify. If you scrape from home, your IP can be banned. All big sites have an anti-scraping policy. Some are brutal.
@@dawidp749 ChatGPT isn't always correct. I've had to correct its Python code numerous times. It's great for basic applications and great for getting you started, but sometimes you'll have to fix its mistakes.
I love how all these "new devs" are going to grow. If you're asking it here instead of asking ChatGPT, that means you didn't get the whole thing; you're just copy-pasting from the video. Coding is not just about writing code, it is about understanding
I've done this. You need to look for the presence of the "Next" button and figure out how to either click it, or pull its href link and get it. If you don't find the next button, you break the loop and end the program. You can just ask ChatGPT to loop through all pages and it should get you pretty close.
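The "Next" button loop described above can be sketched like this. It is only an illustration under assumptions: it expects the next link to be marked up as `<li class="next"><a href="...">`, which is how books.toscrape.com appears to do it, and the names `NextLinkFinder`, `fetch` and `crawl_pages` are made up. It uses only the standard library so it runs without installing anything; with BeautifulSoup the parsing part would be shorter.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class NextLinkFinder(HTMLParser):
    """Find the href of the <a> inside <li class="next">, if there is one."""

    def __init__(self):
        super().__init__()
        self.in_next = False
        self.href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and "next" in (attrs.get("class") or "").split():
            self.in_next = True
        elif tag == "a" and self.in_next and self.href is None:
            self.href = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_next = False


def fetch(url):
    """Download a page as text (plain urllib; requests would work too)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl_pages(start_url, fetch=fetch, max_pages=100):
    """Yield (url, html) for each page, following "Next" until it is gone."""
    url = start_url
    for _ in range(max_pages):           # safety cap against endless loops
        html = fetch(url)
        yield url, html
        finder = NextLinkFinder()
        finder.feed(html)
        if not finder.href:              # no Next button: last page, stop
            break
        url = urljoin(url, finder.href)  # hrefs are often relative
```

You would loop over it with `for url, html in crawl_pages(start_url):` and run your single-page scraping code on each `html`. Passing `fetch` in as a parameter also makes it easy to add a delay between requests or to try the loop on saved pages first.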
What did you use for the AI voice?
Sounds like a beta they might be working on. Only other voices I've heard this good are on Well Said Labs.
@@kamikazeeOG I wouldn't call it good... it pronounced rating as ratting
@@brndto it sounds very similar to the voice for the fat character in Millennium Thinker
Scary😕
and helpful.
Great video! Would that work on PitchBook?
Not sure (mostly no, as I checked pitchbook .com).
You can try it to check whether it works for PitchBook or not.
If you need to log in or purchase a plan to get the data, then the answer is no.
But it can work on Amazon or eBay. I made a video about scraping Amazon product data.
Be careful while scraping. Many sites will ban your IP when they figure out it hosts a bot.
Sounds like a statement from experience. VPNs never stopped anybody.
Well it's easy, just ask ChatGPT to not be a robot and it will modify your script
@@eternalheckler I hope that's sarcasm 😂
@@Magicskid2323 A VPN can protect your IP indeed. But your script will be terminated. Big sites have tried to make scraping illegal but they failed. What they earned was the right to take practical measures to stop bot activity, and they hired good coders to do that. Actually much cleverer than the random scraper. Check the Apify pages (just an example) to understand how wild the scraping war has become. Scraping from one's garage has become a story of the past.
@@OL9245 I have first hand experience in the matter. I have no issues scraping the websites and data that I need. This may be more of a conversation about skill level because if you're a skilled enough coder there's no reason why you shouldn't be able to accomplish the task at hand.
Nothing is better than curl + grep + awk
Is it possible to scrape a list of websites from Google search results, extract the text content from each URL, feed it to ChatGPT, and write a prompt so that ChatGPT analyzes the extracted text and then generates an article outline from it?
What about pagination?
🤯🤯🤯🤯🤯🤯🤯🤯🤯🤯🤯🤯
Did a bad version of chat gpt write this script?
Can I use scraped data to create listings in woocommerce through API?
I’ve had some issues with the requests library. I suggest requests-html instead
yes. requests-html is better for scraping.
thanks.
This video is so much AI, even the voice haha
haha 😂
Sad i was just about to start doing freelancing for web scraping to help lift me out of poverty but alas. Better get that UBI research done asap.
You should do freelancing and use ChatGPT as your unpaid intern.
0:45 "as you can see here it's saying about some warning related to their policy.... just ignore it" 🤣
Just tried three product pages from 3 different websites and it flopped when picking the CSS selectors... They are wrong and random..
How can we scrape the AWS billing dashboard (running services) every 4 hours into a CSV file, using Python with ChatGPT?
How about scraping profiles?
In this case each profile consists of name, department, laboratory, expertise, email and phone number.
If you don't mind, please make a tutorial for it
This is SICK!!!.....💯
can you make a video of how to download short videos using chatgpt?
Nice...great video. Good to know there's no need to repeat the entire instructions.
FUN FACT:A guy who knows UTF-8 will not have to use ChatGpt to write code 🤣
i know UTF-8 from a one month internship, but I can barely even fathom what classes are...
OMG
Is it possible to parse videos from a given youtube channel? Collect thumbnails, video titles, number of likes and video duration?
You can do that as I showed in my other video "scrape amazon data", or the best way is to use the YouTube API.
It only worked on tutorial websites like the quotes one; on other websites it doesn't
They improved the AI model to reject scraping requests.
But you can still use it to generate scraping code.
check this video: Scraping Amazon Product data using ChatGPT
czcams.com/video/9MNCGdaJfA0/video.html
you can follow the same process for any other website.
Chatgpt says its not connected to internet, hows this possible 😂
😱😱
Interesting vid.
Thumbs down for using a robovoice.
robots are the future...
noice!
Perhaps you should use chatgpt to process your voice over scripts before you send them off for recording.
How long before ppl like me will be replaced??
Don't worry and Just Improve your skill level.
how sure are you you haven’t been already and are just being simulated?
I thought it couldn't read web pages... why is my chatgpt different?
It can't, it's not "connected". If it suggests code to scrape a website, it was trained on that data. Either it accessed the website before 2022, or there is information online about scraping this website where the specific elements are available. If a website was created after 2021, it will not be able to suggest the elements to retrieve
Can we scrape Linked In data as well?
To sum up: just ask GPT
Does this just do the 1 page or does it go into the next pages till it's out of inventory?
You can put all the code inside a for loop.
Then for each iteration, change target_link = url + 'page=1'...
or something similar, depending on the website,
and it will go through all the pages.
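A minimal sketch of that page-number loop, for sites where the page number is part of the URL. The helper name `page_urls` and the URL template are assumptions for illustration; the template follows the pattern books.toscrape.com appears to use, and other sites may use something like `?page=3` instead.

```python
def page_urls(template, last_page, first_page=1):
    """Build one URL per page by filling the page number into a template."""
    return [template.format(page=n) for n in range(first_page, last_page + 1)]


# Assumed URL pattern; check the address bar of the real site you are scraping.
template = "https://books.toscrape.com/catalogue/page-{page}.html"
for target_link in page_urls(template, last_page=3):
    print(target_link)  # the scraping code for a single page would go here
```

This only works when you know (or can read off the first page) how many pages there are; otherwise following the "Next" link until it disappears is the safer approach.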
Its not available in my country yet...
The thing I'm curious about: can it make code snippets for WordPress in the theme's functions.php?
yes. you can.
Hey who made the thumbnail
Did u use chatgpt to create the transcription and then convert it into a voice for this video ? 😅so meta
In my tests it was not able to get the main image from an Amazon product page
Crazy! 😂
Which tts is this?
I’m trying to make one that scrapes and crawls all the “top 50 ad agency/ social media marketers” websites, collects all those websites, then re-scrapes the collected websites to gather their emails so I can contact them about my video editing service. Or is there a way to contact smaller YouTubers on a large automated scale?
good luck.
i already created a video something like this(crawler) and may upload it today.
This is actually pretty good
thanks
Where exactly do I paste the code to make it work?
i like how the voice sounds pretty natural but the english is weird
Won't you get banned doing this? Because of violating the rules?
Make a video on how to scrape Telegram members from another channel and add them to our group using ChatGPT
Looks cool
Thanks.
How do you make ChatGPT answer stuff that it says is illegal? I tried that, but all it does is block my question when I ask it to do it.