Comprehensive Python Beautiful Soup Web Scraping Tutorial! (find/find_all, css select, scrape table)

  • Added on 21. 07. 2024
  • Practice your Python Pandas data science skills with problems on StrataScratch!
    stratascratch.com/?via=keith
    In this video we walk through web scraping in Python using the Beautiful Soup library. We start with a brief introduction to HTML & CSS and discuss what web scraping is. Next we get into the basics of the Beautiful Soup library. This includes how to load a webpage, the basic commands you need to know such as find & find_all, grabbing strings from an HTML element, etc. The final section of this tutorial is a series of exercises where you can practice your skills. In this section we scrape a webpage for links, we learn how to scrape a table and load it into a pandas DataFrame, and we see how you can scrape & download a web image. Hope you enjoy!
    I’m looking into making future videos on more complex things you can do with web scraping as well as other libraries that are helpful such as Selenium & Scrapy. Subscribe to not miss those.
    Join the Python Army to get access to perks!
    CZcams - / @keithgalli
    Patreon - / keithgalli
    ---------------------
    Resources used in this video
    Simple webpage: keithgalli.github.io/web-scra...
    Example webpage: keithgalli.github.io/web-scra...
    Link to source code: github.com/KeithGalli/web-scr...
    Beautiful Soup Documentation: www.crummy.com/software/Beaut...
    CSS Selector Reference: www.w3schools.com/cssref/css_...
    ---------------------
    Learn more about HTML/CSS
    @Traversy Media HTML Crash Course: • HTML Crash Course For ...
    @Traversy Media CSS Crash Course: • CSS Crash Course For A...
    Codecademy: www.codecademy.com/catalog/la...
    ---------------------
    Video timeline!
    0:00 - Intro & Video Overview
    1:09 - What is web scraping?
    3:51 - Introduction to HTML
    Using the beautiful soup library (5:29)
    6:31 - Loading in a webpage (requests library)
    8:21 - Starting to scrape
    9:18 - find & find_all methods
    16:00 - Finding specific text/strings in our HTML (regex)
    18:38 - Select method (CSS path selections)
    25:55 - Grabbing the string/text from an HTML element
    28:17 - Getting a property of HTML element (href, src, id, class, etc)
    29:41 - Code navigation (parents, children, siblings)
    Let’s practice our skills! (33:57)
    35:53 - Exercise #1: Grab all social links on webpage in 3 different ways
    42:09 - Exercise #2: Scrape an HTML table into a Pandas Dataframe
    53:09 - Exercise #3: Grab all fun facts that contain the word “is”
    57:59 - Exercise #4: Use beautiful soup to help download an image from a webpage
    1:04:20 - Exercise #5: Solve the mystery challenge!!!
    ---------------------
    Follow me on social media!
    Instagram | / keithgalli
    Twitter | / keithgalli
    ---------------------
    If you are curious to learn how I make my tutorials, check out this video: • How to Make a High Qua...
    *I use affiliate links on the products that I recommend. I may earn a purchase commission or a referral bonus from the usage of these links.
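
The core commands listed in the timeline above (find & find_all, regex string matching, select, grabbing text, reading attributes, and navigation) can be sketched roughly as follows. This is a minimal illustration, not the code from the video: the HTML snippet and every variable name are made up for the example, and the page is parsed from an inline string here rather than loaded with the requests library as the video does.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical stand-in page (an assumption for illustration only).
HTML = """
<html><body>
  <div id="intro">
    <h1>Sample Page</h1>
    <p class="note">Some <b>bold</b> text</p>
    <p>Another paragraph</p>
    <a href="https://example.com/about">About</a>
  </div>
</body></html>
"""

soup = BeautifulSoup(HTML, "html.parser")

# find returns the first match, find_all returns every match.
first_p = soup.find("p")
all_p = soup.find_all("p")

# Filter by attribute (class_ avoids clashing with Python's `class` keyword).
notes = soup.find_all("p", class_="note")

# Match on the element's string with a regular expression.
some = soup.find_all("p", string=re.compile("Another"))

# select takes a CSS selector path.
links = soup.select("div#intro a")

# Grab the text of an element, including text of nested tags.
note_text = notes[0].get_text()

# Read an attribute such as href with dictionary-style access.
href = links[0]["href"]

# Navigate the tree: parent of the link, next <p> sibling of the first <p>.
parent_id = links[0].parent["id"]
next_p = first_p.find_next_sibling("p")

print(len(all_p), note_text, href, parent_id, next_p.get_text())
```

Swapping the inline string for the text of a downloaded page should leave the rest of the calls unchanged.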

Comments • 449

  • @KeithGalli
    @KeithGalli  1 month ago +1

    I made a new tutorial building off of the knowledge learned in this video! Check it out!
    czcams.com/video/DcI_AZqfZVc/video.html

  • @BennyHarassi
    @BennyHarassi 4 years ago +433

    Shouts to Keith for giving us all an MIT education without the MIT debt

    • @KeithGalli
      @KeithGalli  4 years ago +130

      Haha I took one for the team xD

    • @Viralvlogvideos
      @Viralvlogvideos 4 years ago

      haha

    • @hkemal2743
      @hkemal2743 3 years ago

      Haha. That was a good one.

    • @krishnahare3638
      @krishnahare3638 3 years ago

      @@KeithGalli how to start never been good in math 50 years old sitting at home? thnx;-)

    • @tanmaytiwari2450
      @tanmaytiwari2450 3 years ago

      @@KeithGalli since breaking bad minivans are you know swag 😉

  • @symnshah
    @symnshah 3 years ago +5

    I have watched a couple of other videos on BeautifulSoup but believe me this one from Keith is the best one. Keith will take you from scratch to a decent level. Thank you so much.

  • @MarceloSantos-nc9wq
    @MarceloSantos-nc9wq 4 years ago +3

    Keith, many thanks for giving us too many excellent information about hard topics. You do the things seem totally simple to do. Sincerely, your tutorials are the best. Again, thank you so much for sharing all of this with us.

  • @doomimic315
    @doomimic315 3 years ago +16

    This tutorial was incredibly helpful! Web scraping is something I've always found interesting but just hadn't been bothered to start learning, yet this video made it easy to understand and covered a huge range of ways to deal with potential problems. Seriously can't thank you enough for this video and will certainly be sticking around for any new tutorials you upload.

  • @santoshvaidya3752
    @santoshvaidya3752 3 years ago +3

    This is one of the finest videos i have ever seen on training. You are an amazing trainer and most importantly you are explaining things in very simple english, also with examples or exercises that would give an hands on experience for viewers......thanks.

  • @TheFearlessGoat
    @TheFearlessGoat 3 years ago +6

    I love that you have exercises for us to do in the videos! Learned so much from this.

  • @chiranthchangappa6231
    @chiranthchangappa6231 3 years ago +1

    One of the best web scraping contents I've seen to date. But the ending was hilarious!

  • @rogerwprice
    @rogerwprice 4 years ago +1

    Another fabulous real-world tutorial. Thanks for the google and stack overflow searches and the errors with recovery!

  • @rogueknight2414
    @rogueknight2414 1 month ago

    I wanted to attempt your recent Advanced web scraping tutorial where I then stumbled upon this amazing tutorial and I'm so glad I did! Thank you very much

  • @LoganNinefingers
    @LoganNinefingers 3 years ago +3

    Keith you'll be the first one I cite when I write my nobel prize winning book or whatever it is nobel prize winners write. Golden content. Gracias!

  • @soesevenonesix
    @soesevenonesix 1 year ago

    Keith, your videos are excellent. You are totally getting me through grad school just watching your tutorials. Keep it up!

  • @ajaykushwaha-je6mw
    @ajaykushwaha-je6mw 3 years ago

    The Best thing about your tutorial are that you start from scratch and teach basic and explain each fragment of code with concept. Love from India.

  • @nasser_omar
    @nasser_omar 2 years ago +1

    Hi Keith,
    I'm really excited to watch this video. Actually, I used to watch your all Python-related videos, especially the Pandas one.
    Keep going, and I hope to meet you one day.
    THANKS, A LOT

  • @wahaha108
    @wahaha108 2 years ago

    The best python video i have ever seen. No wasted words, dive into the important topic. Lol, great!

  • @OgoidRei
    @OgoidRei 3 years ago +1

    Thank you Keith, amazing content, easy to follow, clear explanation, great exercises (with walkthrough) and love the funny breaks/comments during the video. Followed and liked

  • @andrewp319
    @andrewp319 3 years ago +1

    This is by far the best tutorial I have found after searching through the internet for hours. I subscribed just because of this one great video. Please keep doing videos of practical applications of Python. Project tutorials are the best.

  • @mohitupadhayay1439
    @mohitupadhayay1439 2 years ago

    As someone earlier said, Big SHOUT OUT to Keith for getting the community such amazing content!

  • @lokotock
    @lokotock 3 years ago

    Thanks a lot! Your video are clear and pretty useful! And it’s a joy watching them! I’m glad that I found your channel ✨

  • @ranveersharma1666
    @ranveersharma1666 4 years ago +9

    i am from india . we really dont get this quality stuff here.. so thanks to youtube and you.. for spreading wonderful knowledge.. keep rocking !

  • @armandoacevedoluna3393

    Yes! Awesome tutorial dude. Looking forward to your next web scraping video. Cheers!

  • @lalitsharma-gl4kr
    @lalitsharma-gl4kr 3 years ago

    Value for time invested in watching your videos. Along with the subject knowledge, we understand how to practically approach a problem. Thanks a ton for sharing your knowledge.

  • @apsilal
    @apsilal 3 years ago +37

    I paid a bootcamp for learning. But Keith you are way above all that. I understood the concepts from your video only. I owe you man!! Keep going and please don't stop putting up such videos.

    • @KeithGalli
      @KeithGalli  3 years ago +5

      I appreciate the support! Happy that the videos are helpful

  • @jamesdavies5386
    @jamesdavies5386 1 year ago

    Hey this tutorial is great! I've been looking for a decent one like it for some time now and I can't believe it took the algorithm this long to show this on my recommended page

  • @gavreleric3493
    @gavreleric3493 3 years ago

    Wow, really impressive. One of the best channel ! Keith you are very clear with your explanations.
    Thank you for sharing your knowledge :)

  • @rahuldavid4831
    @rahuldavid4831 4 years ago +8

    Thank you so much for this wonderful tutorial Keith! Words cannot describe how much I am grateful to you for making this gem of a video that covers everything you need to successfully scrape a webpage! Trust me when I tell you that NOBODY HAS MADE A BETTER VIDEO ON BEAUTIFULSOUP than you!!! If I could have the liberty of suggesting future videos, I would love if you made a video about "Regular Expressions". Keep up the good work and God bless!!!

    • @KeithGalli
      @KeithGalli  4 years ago

      Very happy to hear you enjoyed!! A regex video is a great idea :)

  • @modernmistyk4341
    @modernmistyk4341 2 years ago

    You saved my life, I hope you're getting all the beautiful things in life you deserve

  • @manu93ize
    @manu93ize 4 years ago +1

    by far the best tutorial on youtube for web scraping. you are very good at dumbing it down, even a total beginner can understand.
    waiting for NLTK tutorial.
    thank you

  • @lefu7812
    @lefu7812 4 years ago +63

    Your tutorials are the best, honestly. Thank you so much for doing this.

    • @KeithGalli
      @KeithGalli  4 years ago +7

      Glad you enjoy them!! You're very welcome :)

  • @sarahburkhardt2037
    @sarahburkhardt2037 3 years ago

    Thanks for sharing this! I am mostly just popping in to learn, but this is helping me know how to think about data & see that there are a lot of options.

  • @user-ke5gm4sf8c
    @user-ke5gm4sf8c 9 months ago

    one of the best beautiful soup videos, and really want to say thanks! Keith

  • @nallym82
    @nallym82 3 years ago +1

    I am very glad that I found your videos. I learnt more from you than all other tutorials combined. Please do a tutorial on xlwings. Thank you

  • @ikki411
    @ikki411 3 years ago +1

    This tutorial was incredible. I've done 2 Python courses that touched the 'Web Scraping' subject, but I wasn't able to fully understand it. This video was one of the two videos that made me fully understand it, and I couldn't be more happy about it. And finding out the secret message was amazing too :D

    • @h4zmeister
      @h4zmeister 1 year ago

      wanna share the other video you found helpful ? :)

  • @ClaireCodesStuff
    @ClaireCodesStuff 4 years ago +6

    This is a fantastic tutorial. When I last tried to learn beautiful soup, we were in the awkward transition phase between python 2 and 3 and every tutorial was in python 2 because they hadn't released code for 3 yet. I learned 3 because it was "the future". Of course, I then wanted to use BS so I had to try and figure out what I wanted to do in python 2. I gave up in total frustration. This is a crystal clear guide and now I actually understand how it works and how to use it. Thanks Keith!

    • @KeithGalli
      @KeithGalli  4 years ago +1

      Happy that this tutorial could clarify the details and remove some of that frustration! :)

  • @investandcyclecheap4890
    @investandcyclecheap4890 3 years ago +10

    This is such a great tutorial ! I loved being able to pause and figure out the problems on my own. I really learned a lot! Thanks Keith, you rock!

  • @dusty6193
    @dusty6193 3 months ago

    Only a third of the way through this video and I already feel like I understand this better. Thank you, brand new at this

  • @pablomora7880
    @pablomora7880 3 years ago

    Well done! First class of Web Scraping! Awesome

  • @fabianrestrepo82
    @fabianrestrepo82 3 years ago +2

    Man watching that ending was almost like watching Jack sink, beautiful ending!! keep it up man, great content

  • @shin-mg7hn
    @shin-mg7hn 1 year ago

    Your video really helps a lot in understanding Beautiful Soup, thank you, Keith!

  • @kallenmulilonalyanya4181

    I like how you make simple stuffs that were really scary. Bravo man.

  • @Some_random_guy_16
    @Some_random_guy_16 3 years ago

    Oh man.. your tasks are excellent. It helped me to get a better confidence in working with soup..

  • @abdoooooo8583
    @abdoooooo8583 4 years ago

    Great video .. and I watched A LOT videos about beautiful soup. Keep going with the series

  • @muhammadkazimraza3456
    @muhammadkazimraza3456 2 years ago +1

    Very very good video and great exercises specially last one.
    Thanks for such videos

  • @irfanshaikh262
    @irfanshaikh262 3 years ago +1

    Subscribing, coz I loved it.
    Glad I found you @keith.
    Exploring your channel now.
    Appreciate the way you did it so perfectly making it simpler to understand for me.

  • @adrianobavaresco76
    @adrianobavaresco76 1 year ago

    Thank you Keith! This is the best video that i watch about bs4. 👏👏

  • @bhupindersingh4347
    @bhupindersingh4347 2 years ago +1

    This is a very well organized web scraping tutorial. Thanks for sharing.

  • @bernardobritto8352
    @bernardobritto8352 3 years ago

    LOL, loved the secret message. Great work, thanks for the video

  • @khinekhinezaw65
    @khinekhinezaw65 3 years ago +4

    This is the best web scraping tutorial. Thank you so much!

  • @alic
    @alic 1 year ago

    Brilliant, amazing channel. Major kudos to you Keith!!

  • @muthonigathage263
    @muthonigathage263 2 years ago

    This was a fun video! Thank you Keith Galli.

  • @Dee-bk3gk
    @Dee-bk3gk 3 years ago

    You have a lifetime sub from me. Been looking for videos like this for a long time. Keep up with the great content!

  • @esspi9
    @esspi9 4 years ago +11

    Amazing.
    Thanks Keith!
    Looking forward to the Selenium and scrapy series.

    • @KeithGalli
      @KeithGalli  4 years ago +3

      You're welcome!

    • @esspi9
      @esspi9 2 years ago

      @@theduck3126 Try John watson Rooney channel.
      He's got everything covered.

  • @unsignedperson476
    @unsignedperson476 3 years ago

    You are perfect ! You know how to teach. Thank you so much man. Liked your style, and got the subject i have been struggling. Liked and subbed.

  • @futuregootecks
    @futuregootecks 2 years ago

    Wow path navigation is so powerful! Thanks for this!

  • @dhruvrathore2022
    @dhruvrathore2022 4 years ago +32

    Please do a Seaborn Tutorial ! like you did with Pandas, Matplotlib etc. I watched all of them, really glad i found your channel. Simple, informative & on point.

    • @andyn6053
      @andyn6053 3 years ago

      @Lucas agree, Derek Banas has a great Seaborn tutorial at his channel!

    • @fardinahsan2069
      @fardinahsan2069 3 years ago

      If you know matplotlib you know most of seaborn, it's a matplotlib wrapper. Seaborn plots are drawn on matplotlib figures and axes, so the usual matplotlib methods still work on them

  • @sunnywen9483
    @sunnywen9483 4 years ago +1

    so surprised to find treasure youtuber here, will go through all your perfect tut in my summer holiday, hope that u will gain more and more subscribers~

  • @amranazad4540
    @amranazad4540 3 years ago

    This guy deserves the world

  • @Amulya7
    @Amulya7 1 year ago

    Beautiful video Keith. Loved it.

  • @andvad6475
    @andvad6475 3 years ago

    Thanks Keith. A really great video. Keep them coming, really useful videos I am learning a great deal from you. Many thanks.

  • @sagebaram5951
    @sagebaram5951 3 years ago

    How do you know you’ve learned something ?
    Completing the challenge within 1 minute no hints. Thank you so much for all your efforts :)!

  • @panakitikos
    @panakitikos 3 years ago

    man when i accomplished that secret word challenge on my own i swear to god i just clapped for u and wished i hadn't already liked the video and being a subscriber so i can like and subscribe. thank you very much keith. you're like my geeky genius best friend that im learning from

    • @KeithGalli
      @KeithGalli  3 years ago +1

      Haha this comment put a big smile on my face :). I appreciate the support!

  • @hemanthkumaar3681
    @hemanthkumaar3681 4 years ago +1

    i learned numpy, pandas and other things from ur playlist. i was stuck for the past 3 days on web scraping, i watched a lot of yt videos bt i couldn't understand them the way i understand ur content...Thank you so much brother :D . Now i hit(smashed) the bell icon too...

    • @KeithGalli
      @KeithGalli  4 years ago +1

      Awesome glad this video could help clarify some of the confusion you had. Thanks for smashing the bell icon! xD

  • @ivm6878
    @ivm6878 3 years ago

    Thank you Keith, love your tutorials ! I was able to solve the last exercise :D

  • @AndyRhye
    @AndyRhye 3 years ago

    The idea with the secret message was super cool!) You've got that like! Well deserved.

    • @KeithGalli
      @KeithGalli  3 years ago

      Glad you enjoyed it! I had fun setting that up :)

  • @victordias8899
    @victordias8899 3 years ago

    Bro you're great at these videos. Keep it up. I'm very glad I found your channel and I'm learning a lot from you.
    Regarding the task of getting the "is" from fun-facts, you can get them by this simple one liner:
    [li.get_text() for li in webpage.select('ul.fun-facts li') if 'is' in li.get_text()]
    no regex, no extra loops... just plain string methods with list comprehension!
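
A self-contained version of the one-liner above might look like the sketch below. The `ul.fun-facts` selector and the `webpage` name come from the comment; the HTML and the three facts are made up for illustration. One caveat worth seeing in action: a plain substring test also matches "is" inside longer words such as "This", so a word-boundary regex is shown next to it for comparison.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical list; the real page's fun facts are not reproduced here.
HTML = """
<ul class="fun-facts">
  <li>Python is named after Monty Python</li>
  <li>This list has three items</li>
  <li>Tabs or spaces, pick one</li>
</ul>
"""

webpage = BeautifulSoup(HTML, "html.parser")

# Substring test from the comment: matches "is" anywhere, even inside "This".
substring_hits = [
    li.get_text()
    for li in webpage.select("ul.fun-facts li")
    if "is" in li.get_text()
]

# Word-boundary test: matches only the standalone word "is".
word_hits = [
    li.get_text()
    for li in webpage.select("ul.fun-facts li")
    if re.search(r"\bis\b", li.get_text())
]

print(substring_hits)
print(word_hits)
```

Which of the two is right depends on whether the exercise means the word "is" or just the two letters.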

  • @rodrigomonteiro8780
    @rodrigomonteiro8780 3 years ago

    Man you save my life. your tutorials are amazing.

  • @benlucke7763
    @benlucke7763 2 years ago

    Thanks for the tutorial Keith! Keep up the great work

  • @soumyaranjandash3597
    @soumyaranjandash3597 2 years ago +1

    Amazing Lecture. Here we understood Everything. Thanks a lot Broo 🔥👍🙂

  • @pratiksarani4947
    @pratiksarani4947 1 year ago

    wow a fun exercise !! Have a great fun , Next one is the Pandas One

  • @carlmerrigan5403
    @carlmerrigan5403 2 years ago

    Thanks for great tutorial, Keith!

  • @chineduezeofor2481
    @chineduezeofor2481 3 years ago

    Wow! This is just too good. Thanks for the video Keith

  • @carlosroquesuarezgurruchag8681

    of course i will smash that button!! You're a star, my friend, thanks for the good vibes and dedication!

  • @PrielCohen1
    @PrielCohen1 1 year ago

    Thank you for the video!
    You explain things so clearly

  • @tralfazy
    @tralfazy 11 months ago

    Great video and well done. I learned a lot from it. Thanks Keith!

  • @zakriajanjua9170
    @zakriajanjua9170 3 years ago

    I literally hit like because of that secret message lol. Cheers man really grateful for all your content

  • @zainabkhan5859
    @zainabkhan5859 3 years ago

    This is exactly what I was looking for. Thumbs up Keith for this awesome video :-)

  • @sreeragmsudheesh
    @sreeragmsudheesh 2 years ago

    52:30 Not sure if this was posted before but this works for the duplicates. Thanks for all the help Keith!!!
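    from io import StringIO
    import pandas as pd
    # select_one returns the single matching table element rather than a list
    table = soup.select_one("table.hockey-stats")
    # recent pandas versions expect a file-like object instead of a raw HTML string
    df = pd.read_html(StringIO(str(table)))[0]
    df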
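
For comparison, the table can also be turned into a DataFrame by hand, without `read_html` or its extra parser dependencies. The sketch below is an assumption-laden illustration: only the `table.hockey-stats` selector comes from the comment above, while the column names and rows are invented, and the real table on the example page may be structured differently.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical table; headers and values are placeholders.
HTML = """
<table class="hockey-stats">
  <thead><tr><th>Season</th><th>Team</th><th>GP</th></tr></thead>
  <tbody>
    <tr><td>2014-15</td><td>Team A</td><td>17</td></tr>
    <tr><td>2015-16</td><td>Team B</td><td>9</td></tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(HTML, "html.parser")
table = soup.select_one("table.hockey-stats")

# Header cells become the column names.
columns = [th.get_text(strip=True) for th in table.select("thead th")]

# Each body row becomes a list of cell strings.
rows = [
    [td.get_text(strip=True) for td in tr.find_all("td")]
    for tr in table.select("tbody tr")
]

df = pd.DataFrame(rows, columns=columns)
print(df)
```

All values arrive as strings this way, so numeric columns would still need a conversion such as `pd.to_numeric` before doing arithmetic on them.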

  • @WondererSeeker
    @WondererSeeker 4 years ago

    Very good video Keith! Very clear and useful. Thank you.

  • @adrianapetrova196
    @adrianapetrova196 4 years ago +8

    The last time I tried to understand BeautifulSoup I gave up. You explain it in a way that is so easy to understand. Thanks for the hard work and the time you spend on teaching us :)

    • @KeithGalli
      @KeithGalli  4 years ago +2

      Love to hear it! You are very welcome :)

    • @rahuldavid4831
      @rahuldavid4831 4 years ago +1

      Me too! It's almost like Keith is a godsend

    • @andyn6053
      @andyn6053 3 years ago

      @@rahuldavid4831 he sure is :)

  • @vargabh8180
    @vargabh8180 3 years ago

    It was my first time learning from you and I must say it was pretty awesome:-)

  • @andyn6053
    @andyn6053 3 years ago +1

    Awesome video as always! Would love to see tutorials for selenium and scrapy as well. Also PyTorch and Seaborn would be very interesting to learn more about! Your videos are soo easy to follow and learn from :)

  • @Beyond..Horizon
    @Beyond..Horizon 2 years ago

    That's the data scientist's way to tell "Like and Subscribe ". Thanks for sharing knowledge!!

  • @iklintsov
    @iklintsov 3 years ago

    best most concise and detailed tutorial on bs

  • @kinwong6383
    @kinwong6383 3 years ago +3

    CV Update: Web Scraping expert.
    Joke aside what an awesome tutorial. Felt so satisfying to get the secret message with what you taught!!
    Brilliant work!

    • @KeithGalli
      @KeithGalli  3 years ago +3

      Haha love to hear it! I had a lot of fun putting this one together, so I'm happy to hear that you enjoyed it :)

    • @andyn6053
      @andyn6053 3 years ago

      @@KeithGalli your tutorials are really appreciated! thanks man :)

  • @nasser_omar
    @nasser_omar 2 years ago +1

    LOL. Nice secret message.
    I appreciate your effort :)

  • @gyugyugyu.1
    @gyugyugyu.1 4 years ago

    Love your videos im watching them nonstop...thank you❤️❤️

  • @pchebbi
    @pchebbi 3 years ago

    *SMASH*ing that like button with all 10 fingers! Wish I had more fingers! Thank you so much! Amazing tutorial. You are so meticulous and your teaching is very methodical. Thank you!

  • @shakilahmed31
    @shakilahmed31 2 years ago

    7 stars out of seven. Thanks a lot Keith, i found your tutorial after searching a lot about web scraping, and i cannot compare your lectures with any other. Please make one video with a road map of web scraping. The next tutorial i expect from you is on Machine learning. Thanks again.

  • @luchoargentina1
    @luchoargentina1 4 months ago

    Your videos are incredible!! Thanks Keith

  • @vatsdimri3675
    @vatsdimri3675 2 years ago +1

    Really learned a lot. Loved the exercises.

  • @kaustubhgupta12
    @kaustubhgupta12 4 years ago +17

    When Keith does it, it's perfect 🤩

  • @shrutipancholi3544
    @shrutipancholi3544 3 years ago

    One spot for all my Python needs. Thanks Keith! ; )

  • @jatinkumar4410
    @jatinkumar4410 3 years ago

    As usual.... Awesome Tutorial!!!

  • @schoolstudentarea4199
    @schoolstudentarea4199 4 years ago

    i wish i had a cool teacher like you

  • @aagambakliwal3654
    @aagambakliwal3654 4 years ago

    Thanks a lot for the comprehensive tutorial! Really appreciate it

  • @danniliu2544
    @danniliu2544 2 years ago +1

    Hi Keith, I second many of the viewers' comments. Your tutorial on web scraping is by far one of the best ones out there. Thank you so much for producing this. I do have a question though. Hope you can help clarify, I've not had much success googling. Can you clarify the difference between the select function and the find_all function? When would you use one over the other?
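
One way to look at the select vs. find_all question above is to run both on the same markup. The sketch below uses a made-up snippet (the class names and URLs are placeholders, not taken from the video's page). Roughly, `select` takes a CSS selector string, which is compact for nested paths, while `find_all` takes Python arguments, which makes filters such as compiled regular expressions easy to pass in.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only.
HTML = """
<div class="socials">
  <ul>
    <li><a class="social" href="https://example.com/a">A</a></li>
    <li><a class="social" href="https://example.com/b">B</a></li>
  </ul>
  <a href="https://example.com/other">Other</a>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")

# Simple case: both approaches find the same elements.
by_find = [a["href"] for a in soup.find_all("a", class_="social")]
by_select = [a["href"] for a in soup.select("a.social")]

# Nested path: one selector string with select ...
nested_select = [a["href"] for a in soup.select("div.socials ul li a")]

# ... versus chained calls with find / find_all.
ul = soup.find("div", class_="socials").find("ul")
nested_find = [a["href"] for a in ul.find_all("a")]

# find_all accepts a compiled regex as an attribute filter.
regex_hits = [a["href"] for a in soup.find_all("a", href=re.compile(r"/other$"))]

print(by_find == by_select, nested_select == nested_find, regex_hits)
```

In practice the choice is mostly a matter of taste: a selector copied from the browser's inspector drops straight into `select`, while `find_all` tends to read more naturally when the filter is built in Python.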

  • @julianaaguiar6375
    @julianaaguiar6375 3 years ago

    The best videos! Love your videos and way to present the ideas.

  • @fahad203
    @fahad203 4 years ago

    Are you god? I have a simple approach to your videos. I like them first, then I watch the video. Thanks a lot man, you and few others youtubers are going to put universities out of business

  • @yogeshuttekar8542
    @yogeshuttekar8542 4 years ago

    Bro loving your videos and the cool stuff you teach us. It's really helpful for me. Hope you and your channel grow well, and that you keep posting these crystal clear tutorials. Hoping to see a new video soon.

    • @KeithGalli
      @KeithGalli  4 years ago +1

      Very happy to hear that you've been liking the videos!

  • @MrBeezy514
    @MrBeezy514 3 years ago

    Dang! that was a good tutorial. I love you Keith, sincerely.