Comprehensive Python Beautiful Soup Web Scraping Tutorial! (find/find_all, css select, scrape table)
- Added 21. 07. 2024
- Practice your Python Pandas data science skills with problems on StrataScratch!
stratascratch.com/?via=keith
In this video we walk through web scraping in Python using the Beautiful Soup library. We start with a brief introduction to HTML & CSS and discuss what web scraping is. Next we get into the basics of the Beautiful Soup library. This includes how to load a webpage, the basic commands you need to know such as find & find_all, grabbing strings from an HTML element, etc. The final section of this tutorial is a series of exercises where you can practice your skills. In this section we scrape a webpage for links, learn how to scrape a table and load it into a pandas DataFrame, and see how you can scrape & download a web image. Hope you enjoy!
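For a quick taste of the basics mentioned above, here is a minimal sketch of find, find_all, and grabbing strings. The HTML snippet is made up for illustration and is not the page used in the video; with the requests library the HTML would normally come from something like requests.get(url).content instead.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for a downloaded page (not the video's example page).
html = """
<html>
  <head><title>Sample page</title></head>
  <body>
    <h1>Main header</h1>
    <h2>First subheader</h2>
    <p>Some <b>bold</b> text.</p>
    <h2>Second subheader</h2>
  </body>
</html>
"""

# "html.parser" is the parser bundled with Python, so no extra install is needed.
soup = BeautifulSoup(html, "html.parser")

first_h2 = soup.find("h2")              # first matching element, or None if absent
all_h2 = soup.find_all("h2")            # every matching element
headers = soup.find_all(["h1", "h2"])   # a list of names matches any of them

print(first_h2.string)                  # .string works when a tag has one text child
print([h.get_text() for h in all_h2])
print(soup.find("p").get_text())        # get_text() also collects nested text
```

Note that .string returns None for the paragraph here because it has more than one child, which is why get_text() is often the safer choice.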
I’m looking into making future videos on more complex things you can do with web scraping, as well as other helpful libraries such as Selenium & Scrapy. Subscribe to not miss those.
Join the Python Army to get access to perks!
CZcams - / @keithgalli
Patreon - / keithgalli
---------------------
Resources used in this video
Simple webpage: keithgalli.github.io/web-scra...
Example webpage: keithgalli.github.io/web-scra...
Link to source code: github.com/KeithGalli/web-scr...
Beautiful Soup Documentation: www.crummy.com/software/Beaut...
CSS Selector Reference: www.w3schools.com/cssref/css_...
---------------------
Learn more about HTML/CSS
@Traversy Media HTML Crash Course: • HTML Crash Course For ...
@Traversy Media CSS Crash Course: • CSS Crash Course For A...
Codecademy: www.codecademy.com/catalog/la...
---------------------
Video timeline!
0:00 - Intro & Video Overview
1:09 - What is web scraping?
3:51 - Introduction to HTML
Using the beautiful soup library (5:29)
6:31 - Loading in a webpage (requests library)
8:21 - Starting to scrape
9:18 - find & find_all methods
16:00 - Finding specific text/strings in our HTML (regex)
18:38 - Select method (CSS path selections)
25:55 - Grabbing the string/text from an HTML element
28:17 - Getting a property of HTML element (href, src, id, class, etc)
29:41 - Code navigation (parents, children, siblings)
Let’s practice our skills! (33:57)
35:53 - Exercise #1: Grab all social links on webpage in 3 different ways
42:09 - Exercise #2: Scrape an HTML table into a Pandas Dataframe
53:09 - Exercise #3: Grab all fun facts that contain the word “is”
57:59 - Exercise #4: Use beautiful soup to help download an image from a webpage
1:04:20 - Exercise #5: Solve the mystery challenge!!!
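Several of the topics in the timeline (regex string search, the select method, element properties, and navigation) can be sketched in a few lines. This is only a rough illustration on made-up HTML, not the code from the video; the tag contents and URL are assumptions.

```python
import re
from bs4 import BeautifulSoup

# Made-up HTML for illustration only.
html = """
<div id="content">
  <p class="intro">Welcome to the <a href="https://example.com/about">about page</a>.</p>
  <ul class="facts">
    <li>Soup is tasty</li>
    <li>Python 3 is required</li>
    <li>Tables can be scraped</li>
  </ul>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# Finding specific text: string= accepts a compiled regex.
matches = soup.find_all(string=re.compile("scraped"))

# Select method: CSS selectors instead of tag/attribute filters.
items = soup.select("ul.facts li")
link = soup.select_one("p.intro a")

# Getting a property of an element.
href = link["href"]                      # raises KeyError if the attribute is missing
classes = soup.find("ul").get("class")   # .get() returns None if missing; class is a list

# Code navigation: parents and siblings.
second = items[1]
parent_name = second.parent.name
prev_item = second.find_previous_sibling("li")
next_item = second.find_next_sibling("li")

print(matches, href, classes, parent_name)
print(prev_item.get_text(), "|", next_item.get_text())
```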
---------------------
Follow me on social media!
Instagram | / keithgalli
Twitter | / keithgalli
---------------------
If you are curious to learn how I make my tutorials, check out this video: • How to Make a High Qua...
*I use affiliate links on the products that I recommend. I may earn a purchase commission or a referral bonus from the usage of these links.
I made a new tutorial building off of the knowledge learned in this video! Check it out!
czcams.com/video/DcI_AZqfZVc/video.html
Shouts to Keith for giving us all an MIT education without the MIT debt
Haha I took one for the team xD
haha
Haha. That was a good one.
@@KeithGalli how to start never been good in math 50 years old sitting at home? thnx;-)
@@KeithGalli since breaking bad minivans are you know swag 😉
I have watched a couple of other videos on BeautifulSoup but believe me this one from Keith is the best one. Keith will take you from scratch to a decent level. Thank you so much.
Keith, many thanks for giving us so much excellent information about hard topics. You make things seem totally simple to do. Sincerely, your tutorials are the best. Again, thank you so much for sharing all of this with us.
This tutorial was incredibly helpful! Web scraping is something I've always found interesting but just hadn't been bothered to start learning, yet this video made it easy to understand and covered a huge range of ways to deal with potential problems. Seriously can't thank you enough for this video and will certainly be sticking around for any new tutorials you upload.
This is one of the finest training videos I have ever seen. You are an amazing trainer and, most importantly, you explain things in very simple English, with examples and exercises that give viewers hands-on experience. Thanks.
I love that you have exercises for us to do in the videos! Learned so much from this.
One of the best web scraping contents I've seen to date. But the ending was hilarious!
Another fabulous real-world tutorial. Thanks for the Google and Stack Overflow searches and the errors with recovery!
I wanted to attempt your recent Advanced web scraping tutorial where I then stumbled upon this amazing tutorial and I'm so glad I did! Thank you very much
Keith you'll be the first one I cite when I write my nobel prize winning book or whatever it is nobel prize winners write. Golden content. Gracias!
Keith, your videos are excellent. You are totally getting me through grad school just watching your tutorials. Keep it up!
The best thing about your tutorials is that you start from scratch, teach the basics, and explain each fragment of code along with the concept. Love from India.
Hi Keith,
I'm really excited to watch this video. I have watched all of your Python-related videos, especially the Pandas one.
Keep going, and I hope to meet you one day.
THANKS, A LOT
The best python video i have ever seen. No wasted words, dive into the important topic. Lol, great!
Thank you Keith, amazing content, easy to follow, clear explanation, great exercises (with walkthrough), and I love the funny breaks/comments during the video. Followed and liked.
This is by far the best tutorial I have found after searching through the internet for hours. I subscribed just because of this one great video. Please keep doing videos of practical applications of Python. Project tutorials are the best.
Absolutely
As someone earlier said, Big SHOUT OUT to Keith for getting the community such amazing content!
Thanks a lot! Your videos are clear and pretty useful! And it’s a joy watching them! I’m glad that I found your channel ✨
I am from India. We really don't get this quality of material here, so thanks to YouTube and to you for spreading wonderful knowledge. Keep rocking!
Yes! Awesome tutorial dude. Looking forward to your next web scraping video. Cheers!
Value for time invested in watching your videos. Along with the subject knowledge, we understand how to practically approach a problem. Thanks a ton for sharing your knowledge.
I paid a bootcamp for learning. But Keith you are way above all that. I understood the concepts from your video only. I owe you man!! Keep going and please don't stop putting up such videos.
I appreciate the support! Happy that the videos are helpful
Hey this tutorial is great! I've been looking for a decent one like it for some time now and I can't believe it took the algorithm this long to show this on my recommended page
Wow, really impressive. One of the best channel ! Keith you are very clear with your explanations.
Thank you for sharing your knowledge :)
Thank you so much for this wonderful tutorial Keith! Words cannot describe how much I am grateful to you for making this gem of a video that covers everything you need to successfully scrape a webpage! Trust me when I tell you that NOBODY HAS MADE A BETTER VIDEO ON BEAUTIFULSOUP than you!!! If I could have the liberty of suggesting future videos, I would love if you made a video about "Regular Expressions". Keep up the good work and God bless!!!
Very happy to hear you enjoyed!! A regex video is a great idea :)
You saved my life, I hope you're getting all the beautiful things in life you deserve
By far the best tutorial on YouTube for web scraping. You are very good at dumbing it down; even a total beginner can understand.
waiting for NLTK tutorial.
thank you
Glad you enjoyed it!
Your tutorials are the best, honestly. Thank you so much for doing this.
Glad you enjoy them!! You're very welcome :)
Thanks for sharing this! I am mostly just popping in to learn, but this is helping me know how to think about data & see that there are a lot of options.
one of the best beautiful soup videos, and really want to say thanks! Keith
I am very glad that I found your videos. I learnt more from you than all other tutorials combined. Please do a tutorial on xlwings. Thank you
This tutorial was incredible. I've done 2 Python courses that touched the 'Web Scraping' subject, but I wasn't able to fully understand it. This video was one of the two videos that made me fully understand it, and I couldn't be more happy about it. And finding out the secret message was amazing too :D
wanna share the other video you found helpful ? :)
This is a fantastic tutorial. When I last tried to learn Beautiful Soup, we were in the awkward transition phase between Python 2 and 3, and every tutorial was in Python 2 because they hadn't released code for 3 yet. I learned 3 because it was "the future". Of course, I then wanted to use BS, so I had to try to figure out what I wanted to do in Python 2. I gave up in total frustration. This is a crystal clear guide and now I actually understand how it works and how to use it. Thanks Keith!
Happy that this tutorial could clarify the details and remove some of that frustration! :)
This is such a great tutorial ! I loved being able to pause and figure out the problems on my own. I really learned a lot! Thanks Keith, you rock!
did the same 🤓🤓
Only a third of the way through this video and I already feel like I understand this better. Thank you, brand new at this
Well done! First class of Web Scraping! Awesome
Man watching that ending was almost like watching Jack sink, beautiful ending!! keep it up man, great content
Your video really helped a lot with understanding Beautiful Soup, thank you, Keith!
I like how you make things that were really scary seem simple. Bravo man.
Oh man.. your tasks are excellent. It helped me to get a better confidence in working with soup..
Great video, and I have watched A LOT of videos about Beautiful Soup. Keep going with the series.
Very, very good video and great exercises, especially the last one.
Thanks for such videos
Subscribing, coz I loved it.
Glad I found you @keith.
Exploring your channel now.
Appreciate the way you did it so perfectly making it simpler to understand for me.
Thank you Keith! This is the best video that i watch about bs4. 👏👏
This is a very well organized web scraping tutorial. Thanks for sharing.
LOL, loved the secret message. Great work, thanks for the video
This is the best web scraping tutorial. Thank you so much!
Brilliant, amazing channel. Major kudos to you Keith!!
This was a fun video! Thank you Keith Galli.
You have a lifetime sub from me. Been looking for videos like this for a long time. Keep up with the great content!
Amazing.
Thanks Keith!
Looking forward to the Selenium and scrapy series.
You're welcome!
@@theduck3126 Try the John Watson Rooney channel.
He's got everything covered.
You are perfect ! You know how to teach. Thank you so much man. Liked your style, and got the subject i have been struggling. Liked and subbed.
Wow path navigation is so powerful! Thanks for this!
Please do a Seaborn Tutorial ! like you did with Pandas, Matplotlib etc. I watched all of them, really glad i found your channel. Simple, informative & on point.
@Lucas agree, Derek Banas has a great Seaborn tutorial at his channel!
If you know matplotlib you know most of seaborn; it is built on top of matplotlib, so matplotlib calls can be used to customize seaborn plots too.
so surprised to find treasure youtuber here, will go through all your perfect tut in my summer holiday, hope that u will gain more and more subscribers~
This guy deserves the world
Beautiful video Keith. Loved it.
Thanks Keith. A really great video. Keep them coming, really useful videos I am learning a great deal from you. Many thanks.
How do you know you’ve learned something ?
Completing the challenge within 1 minute no hints. Thank you so much for all your efforts :)!
man when i accomplished that secret word challenge on my own i swear to god i just clapped for u and wished i hadn't already liked the video and being a subscriber so i can like and subscribe. thank you very much keith. you're like my geeky genius best friend that im learning from
Haha this comment put a big smile on my face :). I appreciate the support!
I learned NumPy, pandas, and other things from your playlist. I was stuck on web scraping for the past 3 days; I watched a lot of YouTube videos but couldn't understand them the way I understand your content... Thank you so much brother :D. Now I hit (smashed) the bell icon too...
Awesome glad this video could help clarify some of the confusion you had. Thanks for smashing the bell icon! xD
Thank you Keith, love your tutorials ! I was able to solve the last exercise :D
The idea with the secret message was super cool!) You've got that like! Well deserved.
Glad you enjoyed it! I had fun setting that up :)
Bro you're great at these videos. Keep it up. I'm very glad I found your channel and I'm learning a lot from you.
Regarding the task of getting the "is" from fun-facts, you can get them by this simple one liner:
[li.get_text() for li in webpage.select('ul.fun-facts li') if 'is' in li.get_text()]
no regex, no extra loops... just plain string methods with list comprehension!
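To make the one-liner above easy to try, here is a self-contained sketch. The list items are made-up sample data, not the actual fun facts from the video's page. One thing worth knowing: a plain substring check also matches "is" inside other words (for example "This"), so a word-boundary regex is shown next to it as an optional stricter variant.

```python
import re
from bs4 import BeautifulSoup

# Made-up sample data; the real page's fun facts differ.
html = """
<ul class="fun-facts">
  <li>Pizza is great</li>
  <li>This sentence has no verb of being</li>
  <li>Cats sleep a lot</li>
  <li>Hockey is fun</li>
</ul>
"""
webpage = BeautifulSoup(html, "html.parser")

# Substring check, as in the comment above: also matches the "is" inside "This".
substring_hits = [
    li.get_text()
    for li in webpage.select("ul.fun-facts li")
    if "is" in li.get_text()
]

# Stricter variant: \b word boundaries match "is" only as a standalone word.
word_is = re.compile(r"\bis\b")
word_hits = [
    li.get_text()
    for li in webpage.select("ul.fun-facts li")
    if word_is.search(li.get_text())
]

print(substring_hits)
print(word_hits)
```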
Man you save my life. your tutorials are amazing.
Thanks for the tutorial Keith! Keep up the great work
Amazing Lecture. Here we understood Everything. Thanks a lot Broo 🔥👍🙂
wow a fun exercise !! Have a great fun , Next one is the Pandas One
Thanks for great tutorial, Keith!
Wow! This is just too good. Thanks for the video Keith
of course i will smash that button!! Sos un crack amigo, gracias por la buena onda y dedicacion!
Thank you for the video!
You explain things so clearly
Great video and well done. I learned a lot from it. Thanks Keith!
I literally hit like because of that secret message lol. Cheers man really grateful for all your content
This is exactly what I was looking for. Thumbs up Keith for this awesome video :-)
52:30 Not sure if this was posted before but this works for the duplicates. Thanks for all the help Keith!!!
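from io import StringIO
import pandas as pd
# select_one returns the single matching <table> element rather than a list
table = soup.select_one("table.hockey-stats")
# recent pandas versions expect a file-like object rather than a raw HTML string
df = pd.read_html(StringIO(str(table)))[0]
df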
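For comparison, a table can also be turned into a DataFrame by hand, without read_html. The sketch below is an assumption-laden illustration: the table contents are invented, and the dedupe_names helper is hypothetical, written here only to mimic the way pandas renames repeated headers (GP, GP.1).

```python
import pandas as pd
from bs4 import BeautifulSoup

# Invented table with a repeated "GP" header to show the duplicate-column case.
html = """
<table class="hockey-stats">
  <thead><tr><th>Season</th><th>Team</th><th>GP</th><th>GP</th></tr></thead>
  <tbody>
    <tr><td>2014-15</td><td>Team A</td><td>17</td><td>3</td></tr>
    <tr><td>2015-16</td><td>Team B</td><td>9</td><td>1</td></tr>
  </tbody>
</table>
"""


def dedupe_names(names):
    """Hypothetical helper: suffix repeated names as name.1, name.2, ..."""
    seen = {}
    result = []
    for name in names:
        count = seen.get(name, 0)
        result.append(name if count == 0 else f"{name}.{count}")
        seen[name] = count + 1
    return result


soup = BeautifulSoup(html, "html.parser")
table = soup.select_one("table.hockey-stats")

# Header cells become column names; each body row becomes a list of cell strings.
columns = dedupe_names([th.get_text(strip=True) for th in table.select("thead th")])
rows = [
    [td.get_text(strip=True) for td in tr.find_all("td")]
    for tr in table.select("tbody tr")
]

df = pd.DataFrame(rows, columns=columns)
print(df)
```

All values scraped this way are strings, so numeric columns would still need converting (for example with pd.to_numeric) before doing any math on them.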
Very good video Keith! Very clear and useful. Thank you.
The last time I tried to understand BeautifulSoup I gave up. You explain it so that it is easy to understand. Thanks for the hard work and the time you spend on teaching us :)
Love to hear it! You are very welcome :)
Me too! It's almost like Keith is a godsend
@@rahuldavid4831 he sure is :)
It was my first time learning from you and I must say it was pretty awesome:-)
Awesome video as always! Would love to see tutorials for Selenium and Scrapy as well. Also PyTorch and Seaborn would be very interesting to learn more about! Your videos are so easy to follow and learn from :)
That's the data scientist's way to tell "Like and Subscribe ". Thanks for sharing knowledge!!
best most concise and detailed tutorial on bs
CV Update: Web Scraping expert.
Joke aside what an awesome tutorial. Felt so satisfying to get the secret message with what you taught!!
Brilliant work!
Haha love to hear it! I had a lot of fun putting this one together, so I'm happy to hear that you enjoyed it :)
@@KeithGalli your tutorials are really appreciated! thanks man :)
LOL. Nice secret message.
I appreciate your effort :)
Love your videos im watching them nonstop...thank you❤️❤️
*SMASH*ing that like button with all 10 fingers! Wish I had more fingers! Thank you so much! Amazing tutorial. You are so meticulous and your teaching is very methodical. Thank you!
7 stars out of seven. Thanks a lot Keith, I found your tutorial after searching a lot about web scraping, and no other lectures compare with yours. Please make a video with a road map of web scraping. The next tutorial I'm hoping for from you is on machine learning. Thanks again.
Son increibles tus videos!! Gracias Keith
Really learned a lot. Loved the exercises.
When Keith does it, it's perfect 🤩
Aww I appreciate that 😊
One spot for all my Python needs. Thanks Keith! ; )
As usual.... Awesome Tutorial!!!
i wish i had a cool teacher like you
Thanks a lot for the comprehensive tutorial! Really appreciate it
Hi Keith, I second many of the viewers' comments. Your tutorial on web scraping is by far one of the best ones out there. Thank you so much for producing this. I do have a question though. Hope you can help clarify, as I've not had much success googling. What is the difference between the select function and the find_all function? When would you use one over the other?
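A rough sketch that may help with this question: find_all filters by tag name and attributes passed as arguments, while select takes a single CSS selector string. For simple lookups they return the same elements; select tends to be more compact when the query involves nesting. The HTML and class names below are made up for illustration.

```python
from bs4 import BeautifulSoup

# Made-up HTML for illustration only.
html = """
<div class="links">
  <a class="social" href="https://example.com/a">A</a>
  <p><a href="https://example.com/b">B</a></p>
</div>
<a class="social" href="https://example.com/c">C</a>
"""
soup = BeautifulSoup(html, "html.parser")

# Same query two ways: every <a> with class "social".
by_find_all = soup.find_all("a", class_="social")
by_select = soup.select("a.social")

# Nested query: only <a> tags that are direct children of div.links.
nested_select = soup.select("div.links > a")
nested_find_all = soup.find("div", class_="links").find_all("a", recursive=False)

print([a["href"] for a in by_find_all])
print([a["href"] for a in by_select])
print([a["href"] for a in nested_select])
print([a["href"] for a in nested_find_all])
```

As a loose rule of thumb (a matter of taste rather than anything official), find_all reads well for single-tag filters, regexes, or custom filter functions, and select is convenient when you would otherwise chain several find calls to express a path.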
The best videos! Love your videos and way to present the ideas.
Are you god? I have a simple approach to your videos. I like them first, then I watch the video. Thanks a lot man, you and few others youtubers are going to put universities out of business
Bro, loving your videos and the cool stuff you teach us. It's really helpful for me. Hope you and your channel grow well, and that you keep posting these crystal clear tutorials. Hoping to see a new video soon.
Very happy to hear that you've been liking the videos!
Dang! that was a good tutorial. I love you Keith, sincerely.