Python Web Scraping - Should I use Selenium, Beautiful Soup or Scrapy? [2020]

  • Date added: 23. 07. 2024
  • In this video, you’ll learn the pros and cons of three Python web scraping tools you should know: Beautiful Soup, Selenium, and Scrapy. I’ll also give you scenarios in which one tool is more effective than the others. Let’s jump in.
    ⭐ Kite is a free AI-powered coding assistant that will help you code faster and smarter. The Kite plugin integrates with all the top editors and IDEs to give you smart completions and documentation while you’re typing. We made this CZcams channel and Kite to help you be more productive: kite.com/download/?...
    JOIN our online community of people who want to level up their developer skills ►
    / 505658083720291
    Subscribe ► czcams.com/users/KiteHQ?sub_...
    Twitter ► / kitehq
    ***************************************
    Additional Python Resources
    Web Scraping with Cats! ►
    • Python Scrapy Tutorial...
    6 Python Tips and Tricks YOU Should Know ►
    • 6 Python Tips and Tric...
    Best IDEs for Python ►
    • Best IDEs for Python
    ***************************************
    STAY TUNED:
    Kite ► kite.com/
    CZcams ► / @kitehq
  • Science & Technology

Comments • 115

  • @aabergkvist
    @aabergkvist 3 years ago +72

    Beautiful Soup
    + User Friendly
    + Easy to Learn & Master
    - Requires Dependencies
    - Inefficient
    Scrapy
    + Efficient
    + Portability
    - Not User Friendly
    Selenium
    + Versatile
    + Works well with Javascript
    - Not Meant to be a Web Scraper
    - Inefficient

    • @sebastianfors4491
      @sebastianfors4491 2 years ago +3

      I hope you scraped this from the video because that looks like an awful lot of work to type out...

    • @aabergkvist
      @aabergkvist 2 years ago +2

      @sebastianfors4491 actually, I was stuck on a commute iirc and wrote it down to "fortify" the learnings from the video :)

  • @guillaumehoareau1161
    @guillaumehoareau1161 3 years ago +6

    Wow, the quality of the video and the editing is outstanding.

  • @sozno4222
    @sozno4222 3 years ago

    Great job on this video. I love how precise it is.

  • @aydencraig9542
    @aydencraig9542 3 years ago

    Thank you for the video! I'm going to check out your web scraping tutorials now!

  • @adamblomfield7914
    @adamblomfield7914 3 years ago +1

    Love this quick video summary. The content is perfect and I got exactly what I came for. One piece of constructive feedback on the delivery: cover the three tools in the same order throughout.
    0:39 - Selenium, BS, Scrapy
    main content - BS, Selenium, Scrapy (best order in my opinion)
    summary (4:34) - BS, Scrapy, Selenium
    Keep up the great work!

  • @airmanfair
    @airmanfair 3 years ago +1

    I actually downloaded kite as per your suggestion and am using it now with jupyterlab. It's pretty neat!

  • @bluesdog88
    @bluesdog88 4 years ago +5

    Great tutorial, thanks for the insight, you saved me a lot of reading ;)

  • @kotarouriderblack6118
    @kotarouriderblack6118 1 year ago +2

    For my use case, Selenium is perfect because I hate dealing with pesky buttons on dynamic web pages.

  • @stanlukash33
    @stanlukash33 4 years ago +3

    Appreciate this video man. Lots of stuff clarified.

    • @gaurav2979
      @gaurav2979 3 years ago

      What did you choose? What was clarified for you?

  • @ambarishkapil8004
    @ambarishkapil8004 4 years ago +5

    Hi, firstly I want to congratulate you on your new YouTube channel and hope that it will be as successful as your product. You are putting out great content, and the dev community really appreciates the hard work. As a future video idea, I would like to suggest "Design Patterns". This would cater to Python enthusiasts at both ends of the spectrum. Thanks, cheers!

  • @idromano
    @idromano 2 years ago

    This video is so well done!

  • @daviyokogawa4237
    @daviyokogawa4237 3 years ago

    Your video was so easy to understand and helped me a lot to know which way to go.

    • @KiteHQ
      @KiteHQ 3 years ago +1

      Glad we could help, Davi! :)

  • @vincentjanse
    @vincentjanse 1 year ago

    Thank you! That was very helpful!

  • @detaineddeveloper
    @detaineddeveloper 4 years ago +4

    Thank you for making this video! I'm glad I watched this first before starting to build a scraper.

    • @gaurav2979
      @gaurav2979 3 years ago

      What did you choose? Scrapy or Selenium?

    • @gaurav2979
      @gaurav2979 3 years ago

      Beautiful Soup seems like it's for kids.

  • @AlessandroBottoni
    @AlessandroBottoni 3 years ago

    Excellent video! Kudos!

  • @dishydez
    @dishydez 3 years ago

    This is great! Thanks a lot. By the way, could you do a guide on helium? It's a wrapper for selenium but easier to use though I can't get it to work for some reason. Would appreciate a guide video/series.

  • @alihusham1560
    @alihusham1560 4 years ago +3

    what do you use for your video animations and graphics?

  • @araza554
    @araza554 4 years ago +11

    Hey, did you use Adobe After Effects or some other tool at the start of the video, where you were outlining the agenda?

  • @Tracks777
    @Tracks777 4 years ago +24

    nice content

  • @KapilSharma-co8xq
    @KapilSharma-co8xq 4 years ago +2

    Which elements can Scrapy fetch? Beautiful Soup, for example, can extract HTML and XML.
    I have switched to Beautiful Soup.

  • @saurabhbhambry
    @saurabhbhambry 4 years ago +8

    Great video! Love how it's concise and to the point. Quick question, can Scrapy be used for scraping sites that use Javascript for dynamic loading too? Or is Selenium the only choice for such a scenario?

    • @abc.2924
      @abc.2924 3 years ago

      It can, if you combine it with Splash and run Splash using Docker.

  • @luishenriquedavilapossatti5308

    Hi!
    In my application I need to open a web page, fill in a form, click a button, and then get some data that will be loaded into the page. What would you recommend?
    Thanks in advance

  • @bah0n1
    @bah0n1 4 years ago +1

    Is it possible to add a Python Selenium script to the backend of our website? If not, why not, and if so, how? I came across a website offering auto likes/auto followers/auto reactions. Do they use some kind of Selenium script?

  • @anonymosranger4759
    @anonymosranger4759 3 years ago +6

    Amazing Content, New Sub Here!

  • @yashhhhraj
    @yashhhhraj 3 years ago

    Can we use the Python requests library with Scrapy to make POST requests to an API? I'm done with the web scraper but stuck at the API.

  • @AmitTiwari-wf1xj
    @AmitTiwari-wf1xj 4 years ago +5

    Mark my words! If you keep putting out videos with such great content, then you will reach a million subs in a few years. By the way, I subscribed.

    • @gaurav2979
      @gaurav2979 3 years ago

      Do you have any ideas on scraping?

  • @homeheart1276
    @homeheart1276 2 years ago +2

    Well done, Sir. You just made it into my "0. Top Resources" Bookmark folder...the competition to get in there is insane and your roommates are very few and far between. It's not what you did in this video per se, it is HOW you did it. Concise, clear, to the point, and not made artificially long to improve your CZcams revenue. *Make sure* you are advertising to entrepreneurs and I.T. professionals; we have little time (or patience). Thanks again! Well done.

  • @Saywhatohno
    @Saywhatohno 3 years ago

    Great video!!! Can you log in to a website with the other tools like you can with Selenium? Because with Selenium you can pass in your user ID and password, log into Salesforce for example, and then scrape accordingly. Do any of the other dependencies or Python libraries provide that feature? BTW, this is also the reason I like using Selenium.

  • @nicodemus399
    @nicodemus399 4 years ago

    Hi, have you tried Scrapy Splash for JS pages?

  • @dickanf
    @dickanf 4 years ago

    What about support for dynamic JavaScript content? Which one is better?

  • @ibarix
    @ibarix 3 years ago

    OK, I want to scrape football teams' forms from a JS website, and the amount of data is not big. Should I go with Selenium then? Thanks.

  • @aronpaul7544
    @aronpaul7544 4 years ago +1

    When are you going to post new videos on this playlist? It's been a while 😒

  • @TropicalDev
    @TropicalDev 4 years ago

    Aight bet I’m downloading kite

  • @user-xq6st3zl1k
    @user-xq6st3zl1k 4 years ago +2

    Very interesting video)) I like it =)

  • @mikaelmonjour_programming

    asyncio & aiohttp + parser of choice 😘
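A rough sketch of the "async fetching plus parser of choice" idea from the comment above. To keep it runnable offline, the HTTP call is an injected coroutine (`fake_fetch` is a hypothetical stand-in, not part of any library), and the standard library's `html.parser` plays the role of the parser. With aiohttp one would pass a coroutine that wraps a session request instead.

```python
import asyncio
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text found inside <title> elements."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def parse_title(html):
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip()


async def scrape_titles(urls, fetch, max_concurrency=5):
    """Fetch all URLs concurrently and return {url: page title}.

    `fetch` is any coroutine function that takes a URL and returns HTML text.
    """
    semaphore = asyncio.Semaphore(max_concurrency)  # caps requests in flight

    async def worker(url):
        async with semaphore:
            html = await fetch(url)
        return url, parse_title(html)

    return dict(await asyncio.gather(*(worker(u) for u in urls)))


async def fake_fetch(url):
    # Hypothetical stand-in for a real HTTP request, so the sketch runs offline.
    await asyncio.sleep(0)
    return f"<html><head><title>Page for {url}</title></head></html>"


titles = asyncio.run(scrape_titles(["a", "b"], fake_fetch))
```

Keeping the fetcher injectable also makes the parsing logic easy to test without touching the network.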

  • @user-if8zq6lk4i
    @user-if8zq6lk4i 4 years ago +2

    Hello, your videos are very useful.

  • @alixaprodev
    @alixaprodev 4 years ago

    Nice work 👍

  • @nikhildeshpande1247
    @nikhildeshpande1247 2 years ago

    I was extracting text from a particular website and it returns a response [500] error. Does anyone know what that is?

  • @Captinofthemudslayer
    @Captinofthemudslayer 3 years ago

    The content bs4 returns is different from the page I'm viewing, for example @t. Any ideas?

  • @prabaharanp2825
    @prabaharanp2825 3 years ago

    If inspecting elements is not allowed on a page, how can we scrape it?

  • @artabra1019
    @artabra1019 4 years ago

    Which is better, Python scraping or JS scraping?

  • @anttonalamettala5367
    @anttonalamettala5367 2 years ago

    Nice content, thank you.

  • @BookOfMorman
    @BookOfMorman 3 years ago +2

    Great content! Thanks! Quick note, maybe center yourself higher in the frame for the camera. Most people have only a little room from the top of the frame to the top of their head. When you center your head in frame as you did, it kinda just makes you look short. Like you could be 7 feet tall but that centering makes you look like a hobbit!
    Anyway, keep up the great work!

  • @SaurabhGupta-ns8gx
    @SaurabhGupta-ns8gx 2 years ago

    Which can be the best for web scraping?

  • @KiteHQ
    @KiteHQ 4 years ago +5

    Let us know what topic we should cover next!

    • @nghiepcrypto7034
      @nghiepcrypto7034 4 years ago +2

      Pandas and Numpy, working with excel please!
      I know there are a lot of video content talking about these, but I believe that you can do that better. Thanks!

  • @jackbird5839
    @jackbird5839 3 years ago +3

    Awesome tutorial, thank you for your video. It is very clear and easy. Also, as a newbie in Shopify eCommerce, I am using "e-scraper" to scrape Shopify stores, all product data from my supplier sites and other sources. It helps me a lot. Maybe it helps somebody too.
    Thank you for your input!!!

    • @willjohn6807
      @willjohn6807 3 years ago +1

      Thank you Jack, ESCRAPER helped me a lot. Plus now I know the pros and cons of the three Python web scraping frameworks. Thank you Kite.

    • @vskiy26
      @vskiy26 3 years ago +1

      Jack, eScraper is an awesome solution! Thank you.

  • @zone66
    @zone66 2 years ago +2

    So if I want to scrape a large number of web pages while also running JavaScript, I would need to go with Selenium, even though Scrapy would crawl much faster (out of the box). It would be great to have some kind of tutorial on using Scrapy together with Selenium. I think those two should get along somehow. I guess the only problem is that Scrapy is single-threaded and Selenium would block when it's called multiple times in this single-threaded environment, or something like that.

    • @janpost8598
      @janpost8598 1 year ago

      Was wondering that as well. Won't mind a steeper learning curve as long as it is both efficient and handles JavaScript

  • @jasp402
    @jasp402 3 years ago

    How can I enter a page that has recaptcha2?

  • @GlennMascarenhas
    @GlennMascarenhas 4 years ago

    One could use Helium over Selenium. Helium is built on Selenium but much easier in terms of function calls

  • @odhypradhana6556
    @odhypradhana6556 3 years ago +1

    I came looking for copper and found gold
    btw great video as always, the edits are really cool~!

  • @elyasmoshirpanahi7184
    @elyasmoshirpanahi7184 3 years ago +1

    nice video really informative

    • @KiteHQ
      @KiteHQ 3 years ago

      Glad you found it useful!

  • @imaginzationworld
    @imaginzationworld 2 years ago

    Do you charge to create a web scraper?

  • @allcool27gaming
    @allcool27gaming 3 years ago +19

    This dude looks like his birthday is on May 2nd

    • @24mem0
      @24mem0 3 years ago

      yoooo facts

    • @s.predator536
      @s.predator536 1 year ago

      Why did you say that? 🤔
      I am curious to know because my birthday is on May 2nd.

  • @pankajjoshi8292
    @pankajjoshi8292 4 years ago

    Sir what about parseHub ? is parseHub free ?

  • @anilpanwar8710
    @anilpanwar8710 3 years ago

    @Kite, I want to scrape 1 million records from a website that needs some JavaScript interaction, such as click events. I know Scrapy and Selenium, so please tell me which one I should use: Scrapy or Selenium?

  • @brandflouride3867
    @brandflouride3867 3 years ago

    nice animations bro

  • @Buhassan5656
    @Buhassan5656 4 years ago +1

    I want to scrape amazon.com (for monitor arms) and extract the price and shipping weight of each item, so the scraper has to open the page of each item. Which framework do you think suits this situation?

    • @PropertyTak
      @PropertyTak 4 years ago

      Yes, I can do it as paid work. I'm a freelancer.

    • @fabianrodriguez1226
      @fabianrodriguez1226 4 years ago

      Scrapy would be the best approach although you should use proxies and agents

    • @SWIFTzTrigger
      @SWIFTzTrigger 4 years ago

      @fabianrodriguez1226 can you explain what you mean by "use proxies and agents"?
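One plausible reading of "proxies and agents" in the thread above: send each request through a proxy server and set a browser-like `User-Agent` header, so that traffic is less likely to be blocked. The sketch below shows both knobs with the standard library's `urllib.request`. The proxy address and user-agent string are made-up placeholders, and nothing is actually sent over the network.

```python
import urllib.request

# Both values are hypothetical placeholders, chosen only for illustration.
USER_AGENT = "Mozilla/5.0 (compatible; example-scraper/0.1)"
PROXY_URL = "http://proxy.example.com:8080"


def build_request(url, user_agent=USER_AGENT):
    """Create a request that identifies itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def make_proxy_opener(proxy_url=PROXY_URL):
    """Create an opener that routes HTTP and HTTPS traffic through a proxy."""
    proxy = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(proxy)


request = build_request("https://example.com/item/1")
opener = make_proxy_opener()
# opener.open(request) would send the request through the proxy. It is left
# out here so the sketch runs without network access.
```

In Scrapy, the corresponding settings are, as far as I know, the `USER_AGENT` setting and the `proxy` key in a request's `meta` dictionary.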

  • @juanroman7130
    @juanroman7130 2 years ago

    I want to just save image url to hyper link to it

  • @fallenboy1947
    @fallenboy1947 4 years ago

    Wanted to install and test Kite.
    I was so sad when I found out it needs AVX instructions in order to install.

  • @pythonenthusiast9292
    @pythonenthusiast9292 4 years ago

    How do we know if the website is using JavaScript or HTML/XML to load contents?

    • @willy7968
      @willy7968 1 year ago

      Disable JavaScript in the browser and check whether the content still shows up.
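The same check can be roughed out in code: look at the raw HTML the server returns and see whether the text you want is already there outside of any `<script>` block. If it is missing, the page most likely fills it in with JavaScript. This sketch uses only the standard library, and the two sample pages are invented for illustration.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text that sits outside <script> and <style> elements."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def in_static_html(html, expected_text):
    """Return True if expected_text appears in the markup without running JS."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return expected_text in " ".join(parser.chunks)


# Invented sample pages: one server-rendered, one filled in by a script.
static_page = "<html><body><p>Price: 42 USD</p></body></html>"
dynamic_page = (
    "<html><body><div id='app'></div>"
    "<script>render('Price: 42 USD')</script></body></html>"
)
```

When `in_static_html` returns False for a value you can see in the browser, a plain HTTP client plus parser will probably not be enough, and a JavaScript-capable tool such as Selenium or Splash is worth considering.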

  • @ajosifoski
    @ajosifoski 4 years ago +2

    I would add requests and lxml in addition.

    • @ajosifoski
      @ajosifoski 4 years ago

      BTW, after getting to know Kite, I can't live without it. Python without Kite is an empty shell, but together they are two pearls!

  • @paulowiz
    @paulowiz 4 years ago +1

    I would like a video that explains more about Scrapy, because there is little information on the internet.

  • @djosearth3618
    @djosearth3618 1 year ago

    thx

  • @GunjanShrimali
    @GunjanShrimali 4 years ago +1

    Good

  • @willkingsley8454
    @willkingsley8454 4 years ago +1

    Aside from the learning curve, would Scrapy be the best option?

    • @ahmadaminfarooq8495
      @ahmadaminfarooq8495 4 years ago

      Yes, but the only downside is that it doesn't render JS.

    • @fabianrodriguez1226
      @fabianrodriguez1226 4 years ago +1

      @ahmadaminfarooq8495 You could use Scrapy and Splash to render JS

    • @shashwatpuri6496
      @shashwatpuri6496 4 years ago +2

      It definitely is. Scrapy is powerful, and its sole purpose is scraping and handling huge amounts of data.
      Moreover, middlewares and pipelines allow you to clean the data and store it in a database such as Mongo or SQLite!
      And to scrape JavaScript websites, there's good support for scrapy-splash integration via Docker!
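To make the "pipelines clean the data and store it" point concrete, here is a minimal sketch of a Scrapy-style item pipeline backed by SQLite. A pipeline is a plain class with `open_spider`, `process_item`, and `close_spider` methods, so the sketch does not need Scrapy installed to run. The table layout, field names, and sample item are assumptions made up for illustration.

```python
import sqlite3


class SQLitePipeline:
    """Item pipeline sketch: clean each scraped item, then store it in SQLite."""

    def __init__(self, db_path=":memory:"):
        # An in-memory database keeps the sketch self-contained; a real
        # project would point this at a file.
        self.db_path = db_path
        self.connection = None

    def open_spider(self, spider):
        self.connection = sqlite3.connect(self.db_path)
        self.connection.execute(
            "CREATE TABLE IF NOT EXISTS products (name TEXT, price REAL)"
        )

    def process_item(self, item, spider):
        # Cleaning step: trim whitespace and turn "$1,299.50" into 1299.5.
        name = item["name"].strip()
        price = float(item["price"].replace("$", "").replace(",", ""))
        self.connection.execute(
            "INSERT INTO products (name, price) VALUES (?, ?)", (name, price)
        )
        self.connection.commit()
        return item

    def close_spider(self, spider):
        self.connection.close()


# Drive the pipeline by hand with a made-up item, the way Scrapy would
# call it once per scraped item.
pipeline = SQLitePipeline()
pipeline.open_spider(spider=None)
pipeline.process_item({"name": "  Monitor arm ", "price": "$1,299.50"}, spider=None)
rows = pipeline.connection.execute("SELECT name, price FROM products").fetchall()
```

In a real Scrapy project, a class like this would be enabled through the `ITEM_PIPELINES` setting rather than called directly.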

  • @KiteHQ
    @KiteHQ 4 years ago

    If you liked this video, join the Kite Developer Community on Facebook for access to more resources + support from fellow Python developers. Time to level up! facebook.com/groups/505658083720291

  • @notsure6834
    @notsure6834 4 years ago

    What is scrappy?

  • @darktealglasses
    @darktealglasses 2 years ago

    Please tune the video's audio up

  • @draytond
    @draytond 3 years ago +6

    Summary: Use Scrapy if your data set will be large, else use BeautifulSoup

    • @NathanKwadade
      @NathanKwadade 3 years ago +1

      Thanks 🙏 😊

    • @NathanKwadade
      @NathanKwadade 3 years ago

      I used BeautifulSoup and made me a nice broth of data which I converted to CSV file format. Thx
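The "parse the page, then write CSV" workflow mentioned above can be sketched as follows. Note the swap: to stay dependency-free, this uses the standard library's `html.parser` in place of Beautiful Soup. With Beautiful Soup, the `TableParser` class would be replaced by a few `find_all` calls. The sample table is invented.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collects the cell text of every <tr> as a list of strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_table_to_csv(html):
    """Return the rows of the HTML table(s) in `html` as CSV text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return buffer.getvalue()


# Invented sample data.
sample = (
    "<table><tr><th>Team</th><th>Form</th></tr>"
    "<tr><td>Ajax</td><td>W, W, D</td></tr></table>"
)
csv_text = html_table_to_csv(sample)
```

Using the `csv` module rather than joining strings by hand means cells that contain commas are quoted correctly.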

    • @matthiasoberleitner5942
      @matthiasoberleitner5942 3 years ago

      I just generally use Scrapy. As soon as you know how to set it up, it's not really a hassle even for small things. If anything is reactive and too hard to fetch, then I use Selenium inside my Scrapy project for just those parts.

    • @informativecontent4778
      @informativecontent4778 3 years ago

      Lolx selenium is better

  • @expat2010
    @expat2010 3 years ago

    What's not to like about this video?

  • @king-star9860
    @king-star9860 4 years ago

    why dont you use kite

  • @kestonsmith1354
    @kestonsmith1354 2 years ago

    I hate Beautifulsoup because it never works for me

  • @user-qv7rw7dq1d
    @user-qv7rw7dq1d 4 years ago +5

    I hate to be that guy, but Beautiful Soup is not a framework. It's a package/library designed for basic scraping, but it's not a framework. In fact, you could, in theory, use BS along with Scrapy as the engine. In comparison, you wouldn't be able to use both Flask and Django together, for example, because they are on the same level (frameworks).
    Comparing BS to Scrapy is like comparing Jinja to Django. It doesn't really make sense, even though they both sort of accomplish similar tasks.
    It kind of feels like you pieced this video together quickly and are giving slightly misleading info.

    • @gaurav2979
      @gaurav2979 3 years ago

      What is your advice for someone who does Selenium automation with Python and is new to the task and domain of web scraping?

    • @user-qv7rw7dq1d
      @user-qv7rw7dq1d 3 years ago +3

      @gaurav2979 Don't focus on tutorials after the first few weeks. Try to build something as quickly as possible. That's what makes you better.

  •  3 years ago

    Why do you say bs4 needs dependencies and the others do not? To begin with, scrapy needs twisted, you even mention it yourself!

  •  4 years ago

    I like! Keep it up! Would you like to be CZcams friends? :)

  •  4 years ago

    Good! Keep it up! Would you like to be CZcams friends? :)

  • @dchaba154
    @dchaba154 3 years ago

    tldr
    Simple project? BeautifulSoup
    Complex? Scrapy
    Selenium? Better not use it 😁

  • @TheRedTeam
    @TheRedTeam 3 years ago

    Dude sounds nervous

  • @crabbyfish3691
    @crabbyfish3691 3 years ago

    Just use pyautogui lmao