Automate Excel Work with Python and Pandas

Sdílet
Vložit
  • čas přidán 25. 07. 2024
  • Excel tasks are repetitive and boring! Automate them and make your life easier using Python and Pandas. Opening CSV and XLSX files into a Pandas Dataframe is super easy, and setting up to do some basic editing and manipulation with that data can save you hours off your day job.
    I will show you how automating tasks such as combining CSV files into one, moving columns around, creating pivot tables and vlookups is easy in Pandas, as well as exporting into all the popular file formats.
    Learning how to use Pandas is also highly recommended for anyone interested in data as it is Pythons go to for Data Science, so starting small with some basic dataframe manipulation will set you off down the right path.
    Support Me:
    DISCORD (NEW): / discord
    Amazon US: amzn.to/2OzqL1M
    Amazon UK: amzn.to/2OYuMwo
    Hosting: Digital Ocean: m.do.co/c/c7c90f161ff6
    Gear Used: jhnwr.com/gear/ (NEW)
    Patreon: / johnwatsonrooney (NEW)
    Scraper API: www.scrapingbee.com/?fpr=jhnwr
    -------------------------------------
    Disclaimer: These are affiliate links and as an Amazon Associate I earn from qualifying purchases
    -------------------------------------
    timestamps
    00:00 Intro
    00:38 Data
    01:45 read_csv
    03:46 Change Columns
    08:23 Edit Column Data
    09:56 Add Multiple Files Together
    12:56 Pivot Tables
    17:00 vlookups (merging)
    20:00 Exporting
    21:10 Outro
  • Věda a technologie

Komentáře • 118

  • @RS-Amsterdam
    @RS-Amsterdam Před 3 lety +22

    Yesterday, before this video, I was testing this subject and I got an error and it took me a while to figure it out.
    The problem was that my Excel conversion to CSV gave a ; as delimiter instead of a comma (,)
    The solution was df = pd.read_csv('data.csv', *sep=';'* )
    Thanks for sharing your wisdom ;-)

  • @ruffhouse9760
    @ruffhouse9760 Před 2 lety +1

    Easy to follow tutorial, appreciate it john!!!!

  • @maryamzarabian4617
    @maryamzarabian4617 Před 2 lety +3

    Thank you very much for the gentle introduction of Panda library . It was very useful for me

  • @user-kp7re7nb6q
    @user-kp7re7nb6q Před rokem +1

    Really useful from start till the end. Highly appreciated.

  • @training7574
    @training7574 Před 3 měsíci +1

    Masterly content and presentation, thanks. The ideal pace for making the viewer understand what is being done and how.

  • @tubelessHuma
    @tubelessHuma Před 3 lety +2

    Very useful common operations for data science projects. 👍💖

  • @TheeMatrixUno
    @TheeMatrixUno Před rokem +1

    Hi John, Thank you for the informative video! Top job

  • @priya-ok9ur
    @priya-ok9ur Před rokem +1

    This video is really helpful. Thank you so much.

  • @senthilsds
    @senthilsds Před 3 lety +1

    I learned a lot about web scrapping and data handling methods from you in very short time. Thank you

  • @OBPagan
    @OBPagan Před 3 lety +1

    Just wanted to say a huge THANKYOU as you have taught me so much!

  • @wladcapiekla
    @wladcapiekla Před 2 lety +2

    Thanks a lot for your videos. These are really helpful and easy to understand.

  • @AlvinRyellPrada
    @AlvinRyellPrada Před 2 lety +1

    I am getting started with touching python for my excel files and glad i found this video.
    I can easily follow this thru
    Looking forward for more :)

  • @ammadkhan4687
    @ammadkhan4687 Před rokem +1

    Thats perfect how you explained.

  • @bartdziubek327
    @bartdziubek327 Před 2 lety +1

    great material, thanks

  • @jisuresh
    @jisuresh Před 3 lety +10

    Hi John Watson! I'm watching your channel regularly and updating my skills. You are my real teacher. Thanks!

    • @pr0skis
      @pr0skis Před 3 lety

      Absolutely one of the most underrated CZcamsrs out there. I guess Google's ML algo probably identifies John as a conservative and that's why his channel hasn't exploded in subscriber count yet.

  • @noone370
    @noone370 Před rokem +1

    Excellent. Thanks!

  • @user-qe5st3wz2f
    @user-qe5st3wz2f Před 9 měsíci +1

    Excellent video, thank you

  • @lifted1785
    @lifted1785 Před rokem

    thanks rooney, a true ace!

  • @yannhk
    @yannhk Před 2 lety +1

    It's a very useful tip to get a list of df column names

  • @patocarlo69
    @patocarlo69 Před rokem +1

    you are amazing man! keep going

  • @Greygon313
    @Greygon313 Před 2 lety +1

    Great walkthrough

  • @michamr-o6960
    @michamr-o6960 Před 14 dny

    Very nice job. Good luck.

  • @ceknowledge9342
    @ceknowledge9342 Před 2 lety +1

    Great.awesome video...great learning time

  • @ajayrawat5046
    @ajayrawat5046 Před rokem +1

    Thanks John for this helpful content.

  • @tnssajivasudevan1601
    @tnssajivasudevan1601 Před 3 lety +1

    Great Video Sir.

  • @ceknowledge9342
    @ceknowledge9342 Před 2 lety +1

    Awesome bro its save my time

  • @Robert-hb5tc
    @Robert-hb5tc Před 6 měsíci +1

    whew! that was a good one

  • @bisratgetachew8373
    @bisratgetachew8373 Před 3 lety +1

    Thanks again, really good video.

  • @NotBeHaris
    @NotBeHaris Před 3 lety +1

    Awesome brother

  • @Chrisfagan8881
    @Chrisfagan8881 Před 2 lety +1

    Good video thanks

  • @PeterFletcherDNADeliverer

    great video, going to check if you do an in-depth video using pandas

  • @mariuscheek
    @mariuscheek Před 11 měsíci +3

    As someone more used to VBA, I'm starting to learn Python as my employer is going to deprecate VBA due to security concerns, so this is really useful.
    However, I have yet to find any tutorials at all, anywhere on youtube, that show how to actually deploy for end users within an Excel environment. My users aren't going to have an IDE, they need to be able to click a button and set code running.

    • @JohnWatsonRooney
      @JohnWatsonRooney  Před 11 měsíci +1

      Microsoft are in the process of launching python for excel - directly use python in excel. It’s currently in developer preview but it sounds like what you are looking for

  • @SimpleExcelVBA
    @SimpleExcelVBA Před 2 lety +2

    You just make it soo easy, nice tutorial. Python seems be to the best alternative for VBA.

    • @JohnWatsonRooney
      @JohnWatsonRooney  Před 2 lety +1

      I never got into VBA but I heard it’s very powerful for this… but Python is so good for most things :)

    • @SimpleExcelVBA
      @SimpleExcelVBA Před 2 lety

      ​@@JohnWatsonRooney Yes it is. Even not only for Excel/Office things, but the lack of support made VBA as zombie - dead but will exist as long as Excel will. The support/ new libraries are making me curious into Python world.

  • @Destide
    @Destide Před 2 lety +3

    Nice to get tutorials from an actual work flow rather than a reinterpretation of the manual.

  • @leandrov07013
    @leandrov07013 Před rokem +1

    Thanks❤

  • @TheeMatrixUno
    @TheeMatrixUno Před rokem

    John, Can you target a specific sheet within an excel workbook to create a new db?

  • @rajesh9sn
    @rajesh9sn Před 2 lety +4

    Hi John, very well explained and covered most topics which are used in excel. Superb. if you can make one for using python create pivot table and paste it in excel. How to dissect the original data to make smaller data which can then be used to create chart in excel. so I don't have to rely on formula or excel pythons to make charts. Python would simple process the larger dataset and format data in a way which would put it in excel which can then be used for charts

  • @roybuchler7502
    @roybuchler7502 Před 5 měsíci

    Hi - what do you do in the case that you want to merge the data but some of the data in the references tab does find a corresponding match in the original merged spreadsheet?

  • @absoluteRandom69
    @absoluteRandom69 Před 3 lety

    Hello, John can you make a video about the VScode Debugger about that how to setup a debugger, how to use it and it's setting and all that stuff. Thanks in Advance

  • @mushinart
    @mushinart Před 3 lety +1

    Cool stuff bro

  • @jonathanfriz4410
    @jonathanfriz4410 Před 3 lety +3

    Hi John, glad to see a pandas video here. Very good one, the Timestamps are really appreciated. John could you make a video only about the vlookup, merge, iloc, drop duplicated? Even when I manage to used them, I couldn't say I really understand it.

  • @TheBIMCoordinator
    @TheBIMCoordinator Před 3 lety

    Awesome video! Can we get the real python link. I couldn't find it in your description.

  • @creyes879
    @creyes879 Před 2 lety +1

    Is there a way to save or create a function of the lines of code for repeated tasks with new data sets that come in lets say weekly? So essentially have like a .exe or .bat file, or even a GUI with a run button that when clicked on it automates the process and gives results fast

    • @JohnWatsonRooney
      @JohnWatsonRooney  Před 2 lety

      Hey what I do for weekly reports is tailor my script to get them from a specific folder, put the new files in there and run the script from the terminal when ready

  • @rohitmethare1986
    @rohitmethare1986 Před 3 lety

    Thanks a lot for your videos. These are really helpful and easy to understand.
    Can we connect ?had something to discuss

  • @HadarsGrasp
    @HadarsGrasp Před 2 lety

    Do you have a book you'd recommend to learn Pandas for this type of work? Most I'm seeing is heavy on the math. I mostly need to be able to find duplicates and in a new column assign the duplicates a new id. So Company A may have 10 records. I want to automate the process of finding them and assigning all of them a CompanyID.

    • @markgreen2170
      @markgreen2170 Před rokem

      maybe export into csv file and use gawk, fast lightweight utility to process text, easy to learn and use... courtesy chatgpt: write a gawk script to find duplicates and in a new column assign the duplicates a new id.
      Here's a simple example of a gawk script to find duplicates and assign each duplicate a new ID:
      BEGIN {
      FS="," # Set the field separator to comma
      OFS="," # Set the output field separator to comma
      count=1 # Initialize the count to 1
      id=1 # Initialize the ID to 1
      }
      {
      if ($0 in seen) {
      print $0, id
      } else {
      seen[$0]=count
      count++
      print $0, ++id
      }
      }
      This script sets the input field separator (FS) and output field separator (OFS) to ,, and initializes two variables: count and id. The BEGIN block runs before any input is processed.
      The main body of the script uses an associative array (seen) to keep track of which lines have been seen before. If the current line ($0) is in the seen array, it prints the line and the current value of id. If the current line is not in the seen array, the script adds it to the seen array with a value of count, increments count, and prints the line with a new value of ++id.

  • @ismahenelarbi5403
    @ismahenelarbi5403 Před 2 lety +1

    Hi, thanks for the tutorial, where can i get the csv files you are using ?

    • @JohnWatsonRooney
      @JohnWatsonRooney  Před 2 lety

      Hi I never put them up online but I generated them from a free service called mockaroo

  • @mangeshw9766
    @mangeshw9766 Před 3 lety

    Thanks bro ,...need help .......in the realtime I am getting data from the broker terminal ......I want one condition like previous data is in percentage I want that last previous percentage is less than current percentage data and vice versa ....and I want every 5 min

  • @edbull4891
    @edbull4891 Před 2 lety +1

    Your tutorials are sublime :) :)
    However, where can I get access to the excel test files, as I want to reproduce your demo.

    • @JohnWatsonRooney
      @JohnWatsonRooney  Před 2 lety +1

      Thanks! Ahh I’ll try to find them, but all my fake data comes from mockaroo

  • @ugwuanyiarinze5626
    @ugwuanyiarinze5626 Před 3 lety +3

    You could have used the jupyter extension of vscode for easy interaction

    • @JohnWatsonRooney
      @JohnWatsonRooney  Před 3 lety

      Yeah absolutely. I don’t know why I’ve just never been a fan of notebooks. I should probably revisit that idea though

  • @ajinkyapatil1642
    @ajinkyapatil1642 Před 3 lety

    Hello sir .
    I watched u r videos of how to read google sheet data using pandas.i got it.but after getting data i want to clear my data from database automatically. What can i do for that

  • @zohebdholakia3782
    @zohebdholakia3782 Před 10 měsíci +1

    subscribed 🤩

  • @shaikhzishan5342
    @shaikhzishan5342 Před 2 lety

    Where can we get this sample data

  • @nishant1998
    @nishant1998 Před rokem

    So when you do something (like when you changed the price of having $ to not having it) the memory saves for it? Like that part is stored so it remembers you did it? The reason im asking is because you deleted the lines of code when you made that change. So it must remember that you originally made that change to get rid of the $ right?

  • @RedMaster-mw6ti
    @RedMaster-mw6ti Před 3 lety

    Hey John. Long time watcher. First time student( spending my Sunday time programming instead of watching ). . So, John, Why doesn’t the code from selenium IDE for chrome work; when it’s generated for an ?
    I want to add the code to a Python script.
    The gets recorded, but doesn’t work when I replay it.

  • @MasterofPlay7
    @MasterofPlay7 Před 2 lety +1

    how do you output data that's not able to merge with the reference data?

  • @maciekpaciarski9343
    @maciekpaciarski9343 Před 3 lety +1

    great job . how about Heroku ? is it in your plans for future contents ?

  • @josephbrown8968
    @josephbrown8968 Před 2 měsíci +1

    Very new to python. How do I get the same user interface as you? When I downloaded it, it just gave me the shell/IDLE

  • @pr0skis
    @pr0skis Před 2 lety +1

    Not sure if you'll see this... but I just noticed you're running Ubuntu in WSL?
    That would be an interesting series to do - especially when production level scrapers almost always need to use a rotating IP and that's usually only possible in Linux.
    I'm still doing the good old VM way of using Linux haha

    • @JohnWatsonRooney
      @JohnWatsonRooney  Před 2 lety

      Sure, I’ve used WSL or dual booted into Linux ever since I’ve been coding properly. I just got used to the commands. I could look at doing a video on the benefits

  • @sheikhshah2593
    @sheikhshah2593 Před 3 lety +1

    Good video

  • @SDILUYNTsiu39fnd
    @SDILUYNTsiu39fnd Před rokem +1

    what theme are you using? it looks really nice

  • @aperxmim
    @aperxmim Před měsícem

    please provide the sample file?

  • @RS-Amsterdam
    @RS-Amsterdam Před 3 lety +1

    21:28 : how do you get the output colorized ?

    • @bhavik15
      @bhavik15 Před 3 lety +1

      Rainbow CSV addon

    • @RS-Amsterdam
      @RS-Amsterdam Před 3 lety +1

      @@bhavik15 Thank you kindly
      Installed and it works ;-)

  • @adnankattekaden7568
    @adnankattekaden7568 Před 3 lety

    Can You Create an Telgram Group To discuss more about WebScraping & Python?

  • @alexlijesen6197
    @alexlijesen6197 Před 3 lety +1

    Hello John
    we read in certain supplier invoices for customers only with a number of suppliers (invoices) we have problems reading in. via sep='\t' have tried but no result.
    we now first go to excel and read it in and then we save it to csv then it is changed to sep=';' then this we read in. ??
    what are we doing wrong when reading this format csv
    greetings alex

    • @JohnWatsonRooney
      @JohnWatsonRooney  Před 3 lety

      Hi Alex. Hard to say without seeing the file and how it reads in, what file type is the first invoice? Csv, xlsx or something else?

    • @alexlijesen6197
      @alexlijesen6197 Před 3 lety +1

      @@JohnWatsonRooney Hello John
      can i send you the file?

    • @JohnWatsonRooney
      @JohnWatsonRooney  Před 3 lety

      Sure, my email is on my main CZcams page

    • @Arvinth14
      @Arvinth14 Před 3 lety

      May be the below given code should work,
      dataframe_name = pd.read_csv('filename.extension', delimiter='\t')

    • @alexlijesen6197
      @alexlijesen6197 Před 3 lety +1

      @@Arvinth14 Thanks for your suggestion, unfortunately that doesn't work either. what I find strange if I first divide it into columns in excel and then write it to csv it works fine. this only happens with the dutch supplier at csv from the web i have no problems

  • @LaoshiBhel
    @LaoshiBhel Před rokem

    How can i contact you.

  • @mrnargil
    @mrnargil Před 3 lety +1

    Your content is amazing, you need to work on your thumbnail to get more impression & click.

    • @JohnWatsonRooney
      @JohnWatsonRooney  Před 3 lety

      Thanks, I’ve tried a few different types of thumbnail, if you have some good examples you can link them to me?

    • @mrnargil
      @mrnargil Před 3 lety

      @@JohnWatsonRooney I just found out CZcams deleted my comment because I linked to a thumbnail photo 😑

  • @irfanshaikh262
    @irfanshaikh262 Před rokem

    21:12 you don't say John😊

  • @mandarraut9565
    @mandarraut9565 Před 3 lety +1

    Hi John,
    Thank you for all your videos.. It really helps.
    I have one request ,Can you help us on how to scrape dynamic website ,i mean the website which changes its query string parameters on pagination..
    If you want i can share the link with you..
    Just drop me your email id
    Thanks.. i am stuck and not able to understand how to scrape that website

    • @JohnWatsonRooney
      @JohnWatsonRooney  Před 3 lety +1

      Hey, glad you enjoy the videos. My email is on my main CZcams page if you want to drop me a message I will have a look when I get a chance

    • @mandarraut9565
      @mandarraut9565 Před 3 lety

      Thank you for taking time and responding to my comment.i have dropped you an email regarding the website, kindly have a look whenever it is convenient and Please keep making amazing knowledgeable content :D

  • @SkySesshomaru
    @SkySesshomaru Před 2 lety +1

    Yo...

  • @gutv
    @gutv Před 3 lety +1

    why my comment is being deleted?

    • @JohnWatsonRooney
      @JohnWatsonRooney  Před 3 lety +1

      Was it a link? I haven’t deleted any comments from this video it must be CZcams

    • @gutv
      @gutv Před 3 lety +1

      @@JohnWatsonRooney Ohh there was a link indeed! I didn't know youtube doesn't allow it. I had asked if you could help me to solve a scrape issue. I'm trying to scrape a supermarket webpage (carrefouruae) to get the name and the price of a product but because the data is rendered throught javascript and my script is running inside a container where it doesn't have a browser, I don't know which library I can use to scrape. Thank you in advance! I Thank you for your awesome videos!

    • @JohnWatsonRooney
      @JohnWatsonRooney  Před 3 lety +1

      You can render the page with requests-html or have a look in the page source code for “next_data” you might be able to get something useful from the script tag there

    • @gutv
      @gutv Před 3 lety

      @@JohnWatsonRooney thank for the reply! I've tried and i got "Access Denied" :(

  • @rayearth9760
    @rayearth9760 Před rokem +1

    The video title says "Automate Excel Work..." but its content is all about csv data. This manipulation is disappointing.

    • @DevvratSingh007
      @DevvratSingh007 Před 2 měsíci

      No wonder it's called “data manipulation” ;)

  • @rohitguleria100
    @rohitguleria100 Před 2 měsíci

    Hi johny😊.
    I got a job recently as of data analyat but work is quite biring copy pasting in excel....
    So i want to know ia there any tool like chatgpt which can do this work for me....