Build a Web Scraper (super simple!)
- Added on 25 Sep 2021
- ⭐ Sign up for my Full Stack Developer Course: www.codewithania.com
In this video I show you how to build a web scraper in a super simple, beginner-friendly way using Node.js.
Web scraping refers to the extraction of data from a website quickly and accurately. Many people move on to selling their web scraping tools for money, either by building them as a Chrome extension or API, or by selling them to data capturing companies. So the option to make money off this tool is there for you too.
Common questions:
- This tutorial assumes you have nodemon installed globally on your computer. If you do not, use command: npm i nodemon -g
Part 2 For Express Routing: • Get Data from Backend ...
Final code for Part 1 and Part 2: github.com/kubowania/nodejs-w...
___
⭐ Use promo code ANIAKUBOW for 3 months free of WebStorm IDE here (I get no commission from this link, but am in a partnership): jb.gg/get_webstorm
⭐ New to code and none of this is making sense? Watch my '12hr+ YouTube Coding Bootcamp' in which you will learn HTML, CSS and JavaScript Fundamentals completely from scratch. It's on my channel and it's 100% free.
⭐ In most videos I use Tabnine as my A.I autocompletion tool. You can download it for free here (I get no commission from this link, but am in a partnership): bit.ly/tabnine-top-tool
⭐ You can get a blockchain domain with my affiliate link here: bit.ly/get-a-crypto-domain
⭐ If you would like to buy me a coffee, well thank you very much that is mega kind! : www.buymeacoffee.com/aniakubow
⭐ Sign up for weekly coding tips from my newsletter partnership: bit.ly/JS-tips
You can also find me on:
Twitter: / ania_kubow
Instagram: / aniakubow
#codingbootcamp #coding
Thanks Ania, it's working perfectly. May I ask why we can't use an arrow function inside the each on line 15 of the code, where you call the cheerio ($) function? I tried it and got undefined for everything, but I cannot wrap my head around the why....
I *think* it’s because cheerio is not configured to use arrow functions - but I can’t be sure - I haven’t looked into it enough :) I will have to when I’m back from holiday. I will pin this so others can see your great question 😄
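@@austinps4026 This works for me:
try {
  $('.fc-item__title').each(function () {
    // a regular function lets cheerio set `this` to the current element
    const title = $(this).text()
    const link = $(this).find('a').attr('href')
    articles.push({
      title,
      link
    })
    console.log(articles)
  })
} catch (err) {
  console.log(err)
}
because an arrow function has no 'this' of its own; it uses the 'this' of the surrounding scope, so $(this) no longer points at the current element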
You can use an arrow function with the each method, but you must pass 2 parameters, an index and the element:
$(".whatever", html).each((i, el) => {
  // read from the element parameter instead of `this`
  const title = $(el).attr("title")
  const image = $(el).find("img").attr("src")
  articles.push({
    title,
    image
  })
})
Arrow functions take this from the scope in which they are defined. Before arrow functions, to do this you had to use the bind method to fix the this context of your function. You could implement bind before it existed with apply, so the function runs with a different this… arrow functions just make what you usually want easier, which is to carry the this context forward. However, many libraries take advantage of the fact that a function you pass into another function can be called in a way that changes its this, and that cannot work with an arrow function…
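To make the difference concrete, here is a minimal sketch in plain Node.js with no cheerio involved. The each helper below is a hypothetical stand-in that only imitates how a library can call your callback with this set to the current element; the sample elements and the page object are made up for illustration.

```javascript
// Hypothetical stand-in for a library's each(): it calls the callback with
// `this` set to the current element and also passes (index, element).
function each(items, callback) {
  items.forEach((item, index) => callback.call(item, index, item))
}

const elements = [{ title: 'First' }, { title: 'Second' }]

// 1. Regular function: `this` is whatever each() chose, i.e. the element.
const viaFunction = []
each(elements, function () {
  viaFunction.push(this.title)
})

// 2. Arrow function reading `this`: call() cannot rebind it, so `this` is
//    still the enclosing method's `this` (the `page` object).
const page = {
  title: 'outer',
  collect() {
    const seen = []
    each(elements, () => seen.push(this.title))
    return seen
  }
}
const viaArrowThis = page.collect()

// 3. Arrow function using the parameters instead of `this`.
const viaArrowParams = []
each(elements, (index, el) => viaArrowParams.push(el.title))

console.log(viaFunction, viaArrowThis, viaArrowParams)
```

Cases 1 and 3 both reach the element. Case 2 silently reads from the wrong object, which is the same kind of failure as getting undefined for every scraped title.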
For anyone wondering… you’ll need to make sure the nodemon package is installed. You can run ‘npm i nodemon -g’ to use it globally on your machine. Or alternatively, you can run ‘npm i nodemon -D’ inside the project directory to use it as a development dependency while the project is running. Great video Ania, keep it up 👍
Thanks for sharing this!!!! You are totally right I missed explaining this part- I will make sure to cover it in my next videos :)
thank you sempai
As a total newbie it saved me thanks.
thx, but I like my demons
thank u!! love when u have an issue that's way over your head and it's solved in the first comment u read
hey just here to say "thank you" !!
you put a lot of effort into these videos i hope your channel grows fast and get to the top 🙌
Crazy, I just happened to need do something similar for a project and here you are uploading a video that helps a ton! Thanks!
Thanks Ania! Tried already two small projects of yours, and I must say, that you're a teacher, that speaks to novices as well. The samples you made are easy to follow, and simple enough to figure it out, how the packages work with each others. Of course that also shows a great understanding, how to produce working program, but having success with working sample encourages to do more. Thank you with appreciations!
This is so lovely for you to say and has made my day :) 🥰 thank you so much Arto!
@@aniakubow ... Sorry - forgot to wish Merry Xmas! ... and now I do. Happy New Year goes with the same package ;-)
I've been waiting for a tutorial like this! Thanks Ania!
Would love a follow up to this with more advanced scraping such as a page that is behind a login wall, or something that requires a query to be filled before getting the final web content. Thanks heaps!
I have subscribed to many programming channels, however you are one of those who really add value..keep it up
Thanks for the tutorial =) As a few others have said, an alternative option might be to use puppeteer. It has a nice syntax and is very flexible. You can simulate natural browsing using simulated click events and run additional commands such as taking screenshots of html content based on css classes.
yes, I design a template in 4 minutes to scrape title and url using a tool based on puppeteer: czcams.com/video/rB5BHg0XyKs/video.html
This comment doesn't belong to this video, >> You saved me in my senior project last 6 months while I watched tutorial about web development ❤️❤️❤️❤️❤️
Thanks so much for coding with me 💚
I am a boomer coding virgin except for a minute amount of C++ for my Arduino. No one has made sense to me before as there’s usually an assumption of a greater knowledge than I have. You are the first person who was coherent to me, every step explained and details of what is happening behind the scenes. Your beautiful diction helps a lot as well, thank you.
Damn. Wish there were videos to follow like this 20+ years ago when I was starting out lol. You make it very easy to understand
Wow! You explained this so well and left nothing vague. A lot of other tutorials leave out so many parts, assuming everyone knows what they're telling.
It's the first time I've come across your channel, and you just gained a subscriber. Thanks a lot!
Thanks Ania, you're a wonderful teacher. As a newbie, your lesson is as simple as it can be for me to understand. I even worked around it to try some other projects too that I can think off 🤠💕
Love how you just get into it without any sitcomish-intros, just straight to the point. Your videos have given me the confidence to finally start applying to dev jobs. Thank you so much, Ania!
This was a very good intro video to web scraping, thanks!
As a quick tip, when running the `npm init` command you can just append `-y` at the end so it becomes `npm init -y`, and then it will proceed to skip the checks and create the package.json file without you having to press enter several times.
Thank you, Ania! It worked perfectly. I had no idea how to complete this task. You saved my day and gave me a lot of knowledge and fun too. I send you love from Venezuela, you are a genius! ♥
Loving the bit more "fullstack-oriented" content! Actually have a project in mind where I could apply this perfectly. Thanks for the inspiration! ;)
Hi, this is my first time here, and i really love the way you teach. thank you for your very informative tutorial.
Great video. I love the longboard on the wall. Thanks for posting!!!
You provide an amazing technique for web scraping. It would be good if you also explained how to manage sites that use anti-scraping techniques, for example, instead of just populating a text directly, they wrap every character in a different tag or even replace some digits with similar-looking text characters and so on.
Hi Ania, love the videos! Would love one on setting-up a web scraper to scrape every minute or so in a Litespeed server. Keep up the great work.
Your tutorials are very concise and to the point. It's greatly appreciated.
Works purrfectly and super simple code to use and follow. Can you do a part 2 where you teach us to crawl() with this scraping code? I mean the entire site, for example
Yes please Khaleesi of codes do this!
This is a wonderful explanation of every line of code. I've learned code through online resources, mentors and college. Some of this stuff I knew I had to do but I did not know why. Thank you!
Pretty helpful and easy to follow step by step explanation. I'm fairly new to programming so thank you 💕
Clear and precise english. This woman explains well all topic she talks about. Thank you for being a content creator. It fits you so well
That is why learning docker was important for me. We can package our program to work on any machine without worrying about breaking changes from node or express. Thank you for the nvm tip!
yes docker is a life saver especially when you work on both mac and windows
Thanks Ania. You are a great teacher ☺️
You make it looks so easy.
I wanted to learn the power of web scraping. This video shows just that! Thank you Ania
Learning from Ania's style of teaching is easy and relieving. Gonna binge this stuff
Amazing tutorial !
Keep up the great work !
The web scraper would now enable us to develop alternatives for all applications using APIs and this gives us great confidence.
Big thanks, Ania!
With your help I managed to write a web scraper to parse through all private repositories in my organization (had to add authorization as well), "read" each one's package.json file, save it to a file, then run another conversion script that forms an array of strings in a certain way and then saves it to a .csv file, to be able to create a pivot table and analyze the tech stack of our product :)
Thank you so much, I needed to learn how to build a web scraper
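For anyone curious what that conversion step might look like, here is a rough sketch, not the commenter's actual script. The function names dependencyRows and toCsv, the column layout, and the sample package.json data are all assumptions made up for illustration.

```javascript
// Hypothetical helper: flatten one repo's package.json into rows of
// [repo, dependency, version, kind] that a pivot table can group by.
function dependencyRows(repoName, pkg) {
  const rows = []
  for (const kind of ['dependencies', 'devDependencies']) {
    const deps = pkg[kind] || {}
    for (const [name, version] of Object.entries(deps)) {
      rows.push([repoName, name, version, kind])
    }
  }
  return rows
}

// Quote every field so commas or quotes inside a value cannot break a row.
function toCsv(rows) {
  const header = ['repo', 'dependency', 'version', 'kind']
  return [header, ...rows]
    .map((row) => row.map((field) => `"${String(field).replace(/"/g, '""')}"`).join(','))
    .join('\n')
}

// Made-up sample data standing in for a parsed package.json.
const samplePkg = {
  dependencies: { express: '^4.17.1', axios: '^0.21.4' },
  devDependencies: { nodemon: '^2.0.12' }
}

const csv = toCsv(dependencyRows('example-repo', samplePkg))
console.log(csv)
```

Writing the resulting string to disk with fs.writeFileSync would give a file that a spreadsheet can open directly.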
You are the best❤️
Love your content Ania! I'm currently doing a bootcamp and your videos are helping me through!! :D
The mere fact that you have a couple longboards behind you convinced me to subscribe and like this video. You're a cool coder 🌝
🤮
this channel is amazing!!!!
study coding and simping at the same time
This was a fantastic tutorial, and as someone that is a lover of JS, it's nice to see this approach.
I just have to say, I know that they are no longer required, but my brain just gets a completely unnecessary comfort out of using semicolons 😅 even though I can definitely agree that it looks cleaner without. I still have the desire to go through and add them everywhere! haha
Thank you so much!! I really appreciate your comment 💚. Haha yes , I’m team no semi colons for my projects on here haha. I used to use them at work, but even then an extension added them in for me 😛
hahahahaha >=[
Just had a web scraper project at work last week so used your idea. Thanx Ania. Had some problems with nodemon... But just installed nodemon with npm i nodemon
Amazing!!! I’m glad this video helped and you managed to get nodemon installed :) thanks for leaving your advice for others too!
Hi Ania, thanks for all the amazing content!👍👍👍 Your channel is truly helpful!
You're simply great! 😉😇
Very cool simple project! One thing I noticed is that you must have nodemon installed globally (unless it's packaged with a version of Node I don't have?). For anyone that doesn't, the "npm start" script won't work. Of course you can just install it locally into this project with "npm i nodemon" in the terminal.
My man, thank you! Had this exact problem and you solved it. :)
Thanks man! This is the command for a Mac: sudo npm install -g nodemon
thanks alot man
well done bro! installed on the project ;)
This is great that you are going through the doc during video :-) It is great because you teach people how to read the doc... I do the same on my Python videos :-) Pozdrawiam! :)
Thanks for update my knowledge on npm. video was short and sweet.
Ania....it is just awesome...someday if you have a holiday in Bali I will treat you and show you around
Really impressive tutorial. Thank you!:)
Well done, wonderful video, really makes it easy to start with scraping. Thanks Kubow!
Great tutorial! Love it! Thank you, Ania!
I like 😂😂😂. Just found your channel and I'm loving it. Excellent content. Keep it up and keep em coming 👍👍👍
Really easy and helpful, I love the way you teaching ♥ ❤
I love your videos!!
Everything you explained is quite clear and simple .So very helpful to learn.Thank you 😘
Thanks so much Sue!!! It’s great to hear this feedback as I’m always trying to improve 😄💚
@@aniakubow looking forward to your new tutorial!
Thank you Ania, awesome stuff as always.
Wonderfully clear exposition. A very effective teacher.
Thanks Ania for the video. You did a perfect job. All the best , stay healthy and have a wonderful week
Cheers Christian
I am assigned as data engineer and really need to scrape some data from a marketplace. this is a blessing!
Omg, idk what words to say, i love you, this is so easy to follow
Seeing couple of software out there that helps people scrape data from the website. It's great to watch this video so I can be able to do the same by just learning programming from best people over on youtube.
Thank you very much!
A simple, to-the-point tutorial.
Outstanding! great project to try, thanks Ania
Thanks, this is really clear and simple to follow!
really nice! I did some of it in python, but it's kind of tricky sometimes, since many websites use mechanisms to block selenium, scrapy and bsoup... I'll definitely try this one. Thanks for sharing!
Always great content❤
This is great Ania, thanks for your video! BTW just as an idea, would be great to have a second and a more advanced part, that shows how to do it in a site like Linkedin that requires Login or a site that has dynamic ids and classes.
Cheerios!
short, concise and useful. THANK YOU
Awesome Video.
Clear Precise and great Audio, easy to follow and listen to.
Great lesson! Special thanks for enlarging the screen with the code!
You have helped changed the lives of many, thank you for sharing your priceless knowledge
I was looking for a completely different thing, but you gave a good idea with this video! 🤗 Thanx! 😁
Hello Ania, thank you for the quality of your videos !
I have almost 4 years of experience as a js dev, but still learning with your content.
Just one question: why do you install express and turn the app into a server, since you only need the http client (axios) in order to scrape the page?
Same question
same ques
Thank you for another amazing tutorial!
Subscribed! cause your contents are really great!
I love your videos!. Thank you again for everything :).
Great tutorial. Very well spoken. Very well communicated. Great Job Ania.
Great and clear explanation, as ever, Thanks a lot!
Excellent video Ania! Your explanation is very simple and easy to understand, thank you
Thank you this is a great help, It was working. I just had an issue with nodemon and installed it in my machine. I am just encountering the PORT not showing up on my terminal. I have an older version of node js though. Haven't tried using nvm yet. Going to try and practice that tip you gave. This was really helpful.
Thank you so much. I always wanted to learn how to scrape the web I didn't know it's this easy.
Scrape it for what?.. if you don't mind me asking.
thanks, Ania it is very good and working perfectly.
Great course. I enjoyed it.
Thanks for the video! I had some issues with the "start" script on Windows but found that "node index.js" not "nodemon index.js" worked for me.
What a simple and superb video. Thanks !!
Great video and you explain stuff so well!
Great tutorial as always ✌️❤️
Thanks! I'm surprised at how simple it was.
You have inspired greatness my friend. love your work.
Brilliant explanation, lovely and clear and I really feel I have learned something. Subscribed. 👍🏻
That is super kind of you to say! Thank you Macl4ren!
Thanks Ania, had a few issues, but after I got the nodemon package installed and realized I was missing a . for calling the class, it worked like a charm :). One thing to note, I think, is that Windows doesn't have an NVM, so I had to also install one of those just so I could make sure I am using the same version as you
Always a pleasure to watch. Well explained in an accessible demo. Useful too. Thanks so much.
YEAHH!!!! ILL BE SCRAPING ALL OF YOUR VIDEOS 😍
Amazing thanks for this tutorial😉
Love it, so simple and unique way of teaching. Love from India >3000
great video and nice accent! It was pleasant listening to this tutorial, I've subscribed : )
Thanks for the intro. After that I was able to move around and play around to build a scraper for blogspot
This was extremely helpful and well explained, thank you!
Wow, that's really cool! Thanks a lot.
its 6 am and I was waiting for this I love your channel
Thanks for joining Cris!!
Omg 😱, waiting for this👩💻😊
Thank you for such awesome tutorial. Dziękuję Ci.
Thanks for the video. You are obviously an expert.
completed the project. thank you very much❤️
More simple impossible!. Thank you so much!
Thank you so much for this tutorial sis💖