Master Web Scraping | Python Tutorial - Make an extra $500 over a weekend. Up your Amazon FBA Game
- Published 10 Jun 2019
- Learn one of the hottest skills in data science today: web scraping. Combine it with the ability to analyze data and you've marked yourself as a hot commodity. Companies charge thousands per month per client using the same techniques you'll learn in this video.
Code on github: github.com/satssehgal/Booksto...
Watch how to do this with Selenium 👉 • Web Scraping with Sele...
Watch how to do this with Scrapy 👉 • Introduction to Scrapy...
👉 Facebook Group: / theaiwarriors
👉 Instagram: @theaiwarriors
👉 Corporate Training and Upskilling: levers.ai
Netfirms (Affiliate) - bit.ly/2KdJ4Dp
Linode Server - bit.ly/2XpqGi9
Bluehost (Affiliate) - bit.ly/2GxxBh1
PythonAnywhere (Affiliate) - bit.ly/2kWORVe
Heroku - www.heroku.co
NordVPN (Affiliate) - bit.ly/2W87je0
Here is a link to my python for beginners, master python course: bit.ly/2HIZS42
Music: ONE by Lahar / musicbylahar
Creative Commons - Attribution 3.0 Unported - CC BY 3.0
Free Download / Stream: bit.ly/ONE-Lahar
Music promoted by Audio Library
Watch Next --> czcams.com/video/p42e8NBnrGI/video.html
I love your tutorials when you explain every piece of code line by line. Thank you!!!!
Wow!! I just got to the end and I'm so floored. I had no idea. You made this so simple to follow.
This was by far one of the best videos I've watched on web scraping and I finally "get" it after watching many tutorials. Well done and thank you for explaining this so well . I finally have an answer to my project that I've been trying to solve for weeks. Great job!
This was a terrific tutorial! You are very clear and easy to follow/understand. I have subscribed and I'll be reviewing your previous tutorials. Thanks!
This is exactly what I was looking for and even everyone out there I am sure. Thanks a bunch dude! 😊🙌🏻
I just started studying data science and your channel has everything I've been thinking of learning! Thanks for your great work and for sharing it! You are awesome
Thanks, glad it's helping
I followed you step by step and this is amazing. Thank you very much for your time, and patience at clarifying, teaching and sharing your knowledge.
You are very welcome
Thanks a lot! My first successful web scraping, it means a lot to me!
Amazing keep up the good work
Great vid, very useful. I found a small tweak for the title extraction, as titles were getting cut off with "...". The full title is stored in the 'title' attribute of the 'a' tag inside each 'h3' tag:
for i in soup.find_all('h3'):
    ttl = i.a.attrs['title']  # within the h3 tag is an 'a' tag that has a 'title' attribute with the full title in it
    titles.append(ttl)
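A runnable sketch of that tweak, for anyone who wants to try it offline. The HTML snippet below is made up for illustration, shaped like the listing markup on books.toscrape.com:

```python
from bs4 import BeautifulSoup

# Inline HTML mimicking the book listing markup (illustrative, not fetched).
html = """
<ol>
  <li><h3><a href="a-light-in-the-attic_1000/index.html"
             title="A Light in the Attic">A Light in the ...</a></h3></li>
  <li><h3><a href="tipping-the-velvet_999/index.html"
             title="Tipping the Velvet">Tipping the Velvet</a></h3></li>
</ol>
"""

soup = BeautifulSoup(html, "html.parser")

titles = []
for h3 in soup.find_all("h3"):
    # The visible link text may be truncated with "...", but the <a> tag's
    # 'title' attribute carries the full book title.
    titles.append(h3.a.attrs["title"])

print(titles)  # ['A Light in the Attic', 'Tipping the Velvet']
```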
Nicely done. Thank you for sharing!
It really works, thank you so much sir, carry on the great work.
Thank you for this tutorial. The way you teach and explain makes it easy for dumb dumbs like me to be able to follow along. You just got a lifetime subscriber. Keep up the great work and would love to learn more from you and your tutorials.
Thank you 🙏
I have to fully agree with you man. You really got me hooked for life.
Excellent tutorial. - Thanks for sharing.
The level of knowledge you have is awesome.
Damn, this was satisfyingly well explained, thanks
Thanks for such a valuable stuff 👍👍👍
Awesome glad it added some value for you
Thanks! I integrated this tutorial with Flask and MySQL
Amazing keep it up
Thanks a lot, works well for me
+Adebayo Taiwo awesome
You did an excellent job explaining the process of web scraping.
I have one question. How is it that tags can receive src as an argument? I understand why you did it but not how it works.
Thanks!
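On the src question above: in BeautifulSoup, a Tag object supports dict-style indexing on its HTML attributes, which is why tag['src'] works. A small offline sketch (the image path below is made up for illustration):

```python
from bs4 import BeautifulSoup

# Illustrative thumbnail markup, similar to the one scraped in the video.
html = '<img class="thumbnail" src="media/cache/2c/da/thumb.jpg">'
tag = BeautifulSoup(html, "html.parser").find("img")

# A bs4 Tag behaves like a dict of its HTML attributes:
print(tag["src"])         # media/cache/2c/da/thumb.jpg
print(tag.attrs["src"])   # same lookup, spelled explicitly
```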
Question. I'm looking to pull data from multiple Excel spreadsheets located on a SharePoint and add the data into a new database. Would there be an easier way of doing this, or would scraping the data be just as simple?
Great Video Thanks
Please zoom in a bit...Great video...thank you
Great teacher
Thank you
How do I scrape an image that returns a status code of 302?
Hi, I think there is 1 error in this tutorial - 31:35
When you click on cell A1 on the Excel sheet, the title reads as A Light in the ...
Instead of A Light in the Attic which is how it is represented on the website.
Any clarity on how to get the exact/full title?
Other than that, video was flawless
Possibly the website you scraped doesn't show the entire string until the page has loaded or the title is clicked. Just an idea: check your raw data with a print or something. (I haven't watched the entire video.)
Hi, thanks for the tutorial. Your screen is too small to read the code comfortably. There are huge blank spaces on each side of the main frame; it would help if you could zoom in a bit more.
Thank you for the comment. Yes you are right. I posted the code on github so you all can follow. In most of my newer tutorials I’ve switched to jupyter notebooks and it’s a lot more clear.
@@SATSifaction I just finished a Data Analytics course at the university of Miami and they did everything through python in anaconda and Jupyter notebooks and Visual Code.
Excellent tutorial and well explained. I had one hangup and I spent a day on trying to figure it out. I used Google Colab and this is the first time it threw this error. I couldn't put the dataframe in to an excel file. I kept getting no such file or directory. Finally, I put it into a Linux terminal and it ran without issues.
One thing I'm not sure of, is does the data repeat? It does for me.
It doesnt for me. You can probably alter the code to suit your needs.
I was getting a value error when running the code. I could use some help. It says "ValueError: arrays must all be same length". Any help would be very appreciated. I've attached the code below.
import requests
from bs4 import BeautifulSoup as bs4
import pandas as pd

pages = []
prices = []
stars = []
titles = []
urlss = []
pages_to_scrape = 5

for i in range(1, pages_to_scrape + 1):
    url = 'http://books.toscrape.com/catalogue/page-{}.html'.format(i)
    pages.append(url)

for item in pages:
    page = requests.get(item)
    soup = bs4(page.text, 'html.parser')
    for i in soup.find_all('h3'):  # Gets titles
        ttl = i.getText()
        titles.append(ttl)
    for i in soup.find_all('p', class_='price_color'):
        price = i.getText()
        newprice = price.replace("Â", "")
        prices.append(newprice)
    for s in soup.find_all('p', class_='star-rating'):
        for k, v in s.attrs.items():
            star = v[1]
            stars.append(star)
    divs = soup.find_all('div', class_='image_container')
    for thumbs in divs:
        tgs = thumbs.find('img', class_='thumbnail')
        urls = 'http://books.toscrape.com/' + str(tgs['src'])
        newurls = urls.replace("../", "")
        urlss.append(newurls)

data = {'Titles': titles, 'Price': prices, 'URLS': urlss, 'Stars': stars}
print()
print(data)
df = pd.DataFrame(data=data)
df.index += 1
df
I get this error:

TypeError                                 Traceback (most recent call last)
<ipython-input> in <module>
     17 page = requests.get(item)
     18 soup = bs4(page.text, 'html.parser')
---> 19 for i in soup.findALL('h3'):
     20     ttl = i.getText()
     21     titles.append(ttl)

TypeError: 'NoneType' object is not callable
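For what it's worth, the traceback above points at the likely cause: BeautifulSoup has no findALL method, so that lookup doesn't resolve to a callable, hence "'NoneType' object is not callable". The supported spelling is find_all (findAll is the legacy alias). A quick offline check, using a made-up snippet:

```python
from bs4 import BeautifulSoup

# Illustrative markup standing in for a fetched page.
html = "<h3><a title='A Light in the Attic'>A Light in the ...</a></h3>"
soup = BeautifulSoup(html, "html.parser")

# Correct spelling: find_all (or the legacy alias findAll), not findALL.
titles = [h3.getText() for h3 in soup.find_all("h3")]
print(titles)  # ['A Light in the ...']
```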
I think the code works if the URL format actually has its pages (i.e. page 1 to page 5) increase incrementally by one at each click, but I don't think this works with, say, Amazon or other sites. How do we go about that?
The code is site specific. With Amazon I would use scrapy and for pagination they have a method for next page in scrapy that would handle that for you. It’s explained well in the scrapy docs
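The "next page" idea generalizes: instead of hard-coding page numbers, follow the next-page link until it disappears. A minimal offline sketch of that pattern, using BeautifulSoup over hard-coded HTML standing in for live responses (in Scrapy the equivalent is following the li.next a link from each parsed page, as shown in its docs):

```python
from bs4 import BeautifulSoup

# Fake "site": two pages, the first linking to the second, mimicking the
# <li class="next"> pagination used on books.toscrape.com.
SITE = {
    "page-1.html": '<h3><a title="Book One">Book One</a></h3>'
                   '<ul class="pager"><li class="next">'
                   '<a href="page-2.html">next</a></li></ul>',
    "page-2.html": '<h3><a title="Book Two">Book Two</a></h3>',  # no next link
}

titles = []
page = "page-1.html"
while page is not None:
    soup = BeautifulSoup(SITE[page], "html.parser")
    titles += [h3.a["title"] for h3 in soup.find_all("h3")]
    nxt = soup.select_one("li.next a")   # stop when there is no "next" link
    page = nxt["href"] if nxt else None

print(titles)  # ['Book One', 'Book Two']
```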
Awesome, boss. Do you think this field of data scraping will be needed more and more? It's 2023 right now and I'm taking a Google Cert on Data Analysis. Do you think we can still be valuable in data scraping? It's not taught in the class by Google. Please advise. ..JAY Thank you
Hi; I really enjoyed this video. My background isn't in programming, but I'm really interested in this web scraping (data science) type of work and have no idea or direction as to how to get started. I know Udemy has some related web scraping courses on HTML, CSS, Python etc. I also know that some universities and colleges offer courses in data science, but the cost is very high, almost $10,000 for a one-year course.
Hi there. Web scraping is a very good area to get into, especially around data collection. There are a lot of great free YouTube resources. I would first try out a few projects before paying that sum of money to an institution. Several times I've seen people invest a lot of money in a hot skill but later have no passion for it. To get you started you can view my web scraping courses...all free...enjoy -> czcams.com/play/PLM30lSIwxWOjrr-6zuMj28fC5RxrPY_Tc.html
Hi, can you explain why you use format(i) a little bit more?
You would use .format when you want to format a string to include a variable. If a variable X = 12, for example, the code 'I am happy to be {}'.format(X) would give 'I am happy to be 12'. Hope that helps.
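To make that reply concrete, here is the same example as runnable code, plus the equivalent f-string spelling:

```python
X = 12

# str.format substitutes the variable into the {} placeholder.
sentence = "I am happy to be {}".format(X)
print(sentence)  # I am happy to be 12

# Multiple placeholders are filled in order:
print("page {} of {}".format(3, 5))  # page 3 of 5

# Modern equivalent, an f-string:
print(f"I am happy to be {X}")  # I am happy to be 12
```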
When I run df.to_excel() I get an "openpyxl not found" error.
Other than that, good so far.
I can copy the code and run it in Sublime and Gitbash terminal with no errors. And the excel file is produced.
I had to do a pip install openpyxl to get past that error. I made sure I ran all of this in a virtual environment.
Please, which file on your GitHub is for this program? I found the other programs but not this one. The tutorial is great!!
I already found the GitHub file
If I am using Spyder to program in Python, would you suggest using BeautifulSoup or Scrapy?
Either is fine, though the two have different applications. BS is more for a quick and dirty web scrape, while Scrapy is more of a framework with broader applicability.
What do you suggest for scraping sites that have a login?
For logins I would recommend Selenium. Check out my video on building a billing bot, which covers the login process -> czcams.com/video/HsA0mJ4kNKE/video.html
I don't understand how someone can be IP blocked or something for web scraping... I mean, isn't it just reading the HTML code, searching for and finding targeted tags and paths, and then putting them in storage and organizing them? It seems like something that's completely client-side. How do they blacklist you or find out you're scraping them?
To web scrape you will be using the requests module. In order to get the data from their server you make a request to their server, which returns the HTML content. Every request you make hits their server. If you make too many requests, they can ban the IP address that makes them, in other words your IP. Also, if a user doesn't respect the robots.txt file, which outlines what you can and cannot scrape, they can ban the IP for making requests that aren't authorized.
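The robots.txt point can be checked in code before scraping. A minimal sketch using the standard library's urllib.robotparser, fed a hand-written robots.txt (the rules and URLs here are made up for illustration; normally you would fetch the site's real robots.txt):

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt: everything under /private/ is off limits.
robots_txt = """\
User-agent: *
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# Check a URL before requesting it:
print(rp.can_fetch("*", "http://example.com/catalogue/page-1.html"))  # True
print(rp.can_fetch("*", "http://example.com/private/data.html"))      # False
```

Respecting these answers, and spacing requests out with a small delay, goes a long way toward not getting your IP banned.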
I'm wondering why you decided to use Jupyter notebooks?
No specific reason, other than the fact that it's a great tool to teach and train Python with...
Sir, there is one website, bol.com, that's hard to scrape. Would you please teach me how to scrape that website?
Hi. I can help you with this. Email me to take this further.
As a full-stack web developer, that means a lot to me. My feelings are hurt as fuck, lol.
How do I scrape Amazon product URLs and ASIN numbers? Please tell me
Watch this video. It’s a similar concept that you can apply to Amazon. However Amazon is a lot more difficult to scrape. 👉🏼 czcams.com/video/NXNhqNyYpHI/video.html
How would you turn the prices into USD?
You could always connect to an external API to convert them. It depends on how the website displays the data: the scraper will pick up whatever currency is shown, and you can add your own API call for currency conversion.
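As a concrete sketch of that post-scrape conversion step: prices on books.toscrape.com come back as strings like '£51.77', so you strip the symbol, parse the number, and multiply by a rate. The rate below is a made-up constant for illustration; in practice it would come from a live currency API:

```python
GBP_TO_USD = 1.27  # assumed fixed rate for illustration; fetch a live one in practice

def price_to_usd(price_text: str) -> float:
    """Parse a scraped price like '£51.77' and convert it to USD."""
    # Strip the pound sign and the stray 'Â' encoding artifact seen in scrapes.
    gbp = float(price_text.replace("£", "").replace("Â", "").strip())
    return round(gbp * GBP_TO_USD, 2)

print(price_to_usd("£51.77"))  # 65.75
```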
Thanks for sharing, sir. The code is uncomfortable to read; the font is too small. Please share with a zoomed-in font size.
raviranjan sharma thanks for the note. I cannot edit the video however I have uploaded the code on GitHub for you to use and follow. It’s in the link description. All my new videos use much bigger font 😊
Thank you Sir
What exactly are the prerequisites for learning web scraping? Please, sir, make a video on the prerequisites 🙏 We need to know exactly what to learn before web scraping 🙏
Great video, sir! Noob question: do web scraping jobs only revolve around extracting these kinds of data to be shown later in a table?
+Jon Jimlin Sumalhay no, there are more applications. Data such as prices that are web scraped can be inputs to machine learning models, like pricing algorithms, as an example.
@@SATSifaction would love to see an example of that someday
🤑🤩
Awesome. Create a web scraper using Django with an input URL from the user
Helpful tutorial, but poor video quality