Cnsolidating/Combining multiple excel files together with the help of Python & Pandas
Vložit
- čas přidán 25. 07. 2024
- Cnsolidating multiple excel files together with the help of python & pandas | Combine Excel Files with Python
If you are wroking in Data Industry then certainly your day to day job involves dealing with many csv/excel files consolidation which makes the task really tedeous and make your work boaring, that's where we bring you this video which will help you do this task automatically.
This video covers:
00:00 - Introduction
02:24 - Import libraries
02:52 - Input and Output loaction assignment to python variables
03:56 - Creating list of input files
05:10 - Reading all the excel files one by one, consolidating them altogether and exporting it to excel automatically with the help of FOR LOOP
Excel File - Excel_Files_to_Consolidate.zip
URL - github.com/LEARNEREA/Python/t...
Script - Consolidate Multiple Excel Files Into One.py
URL - github.com/LEARNEREA/Python/t... - Věda a technologie
The topic/problem description you choose are best.
Excellent Sir, this is destination where I find answer of all my question.
Hi, Well Explained... Really really Appreciated. Thanks a lot. Good going.
Excellent work. I have been just looking the solutions. You explained it very nicely. Thanks for vedio
Glad it helped you
Thanks a lot!..,
Helpfull! thanks a lot
Glad it helped!
Thanks!
Welcome!
Sir. Brilliant video. But this video is for the files which has only one sheet in it. But my files have multiple sheets with them. How to consolidate based on those sheets
nice video ! , i am working on a script for some bank excel files. Is there any of your videos talking about appending an existing sheet?
You can import the existing excel in python/pandas then import other excels which you want to append, then do the appending in python itself and then export the final data to excel
Great video, when I used the code it gave me an error of list index out of range, any idea why that might happen ?
Understood,
please put your code here, we will check and suggest you the solution
Hi,
This is very helpful but i have a question what if my business need is to see only one file with all these excel but i dont want to append it rather i want electronics in separate sheet, Fashion in separate sheet of single workbook what logic can be use in that case
Thanks
Hi Arun,
here is the video for you -
czcams.com/video/vyFbr90fMiM/video.html
Hi, is this the same method to combine if the files each have different headers?
that's little subjective depending on the purpose....
if you just want to combine them one below another, for sure you can use this method... the only problem is, the file which doesn't have a particular column which is available in another one, in the final data you will get missing values in those columns against the rows coming from the file which doesn't have that column
hello thanks for explaining! This is throwing me the following error : "OptionError: "No such keys(s): 'io.excel.zip.reader'"
could you please help me
Of you are using the line
If file.endswith('xlsx')
You mus not get this error
However to be on safer side I recommend you remove the zipped file which you have in the folder which you are looking into for the Excel files
Hi how do I make a column in the consolidated file that states the original file name of the data
use the below line just after the line where it's reading the excel file in for loop in the video
df['imported_file'] = file
what if each file has multiple sheets in it and we have to combine each sheet respectively.i.e, sheet1 in file 1 combined with sheet 1 in files and sheet 2 file1 with sheet2 in file2...
Hope you got its solution in the another video
Mate, thank you very much. However, this will only work if the Excel files have only one sheet, right? What about if those Excel files have many sheets and I only want to append the sheets with a specific name?
Hi @Agustin Nosiglia,
Bringing it, we will make a video for this as well soon.
Hi Agustin,
If I understood the problem statement correctly then this is what you are looking for -
czcams.com/video/r60J5YZKMLs/video.html
df = pd.read_excel(input_loc + files, sheet_name = ' your sheet name')
How to access Localhost like this video?
Iam finding dificulty in excuting the code please help me with this. error in the part below
finalDF = pd.DataFrame()
for files in filelist:
if files.endswith(".xlsx"):
df = pd.read_excel(input_loc+files)
finalDf = finalDf.append(Df)
finalDf.to_excel(output_loc+"finalDf.xlsx")
"NameError: name 'finalDf' is not defined" please help
Look at the first statement where you are defining finalDF, it has capital letter "DF", while when you are using it it has small "f", either make it finalDf everywhere or make it finalDF everywhere..
happy learning.
Sir I am doing everything as shown in video but the values is not showing any n my newdf. How to resolve it please help
Put your script here once so that I could have a look and could support you
For a specific sheet
df = pd.read_excel(input_loc + files, sheet_name = ' your sheet name')
this will work if the sheet name is same in all the excel files... if you need to have this one also dynamic, the below one can give some idea and basis on which it can be achieved -
czcams.com/video/r60J5YZKMLs/video.html
Great video , when I try this , have value error : excel format cannot be determined, you must specify an engine manually. Any idea how to solve this ?
Put your code here, then only we will be to analyse and provide the solution
Sir I m beginner for python, how to work with this Jupyter. I need to install or something else..plz let me know
if you are still looking for an answer to this, refer to the below -
czcams.com/video/rwHqzBiWPHQ/video.html
it should help
super,can please do some videos on merging multiple excel files with multiple sheets to one master excel files with sheets
Sure will do so
Hi sir, is their anyway to combine multiple csv files into one excel workbook in multiple sheets?I want it in multiple sheets in one workbook not in single sheet.
first filter the data and then export that on the respective sheets in same excel file...
let me know if need further help
How can i add file name from which the data is fetched in the concat file.
use the below line just after the line where it's reading the excel file in for loop in the video
df['imported_file'] = file
Hi sir great video. But what if I want to add new data into the final output? How would I do that then?
Depending on what sort of or how you want add.. if you want to add another dataframes you can do so by using concatenation, append or merge whatever fits to the requirements - to learn this, you can start with this video and follow the series
czcams.com/video/06iGLesHi8g/video.html
If you just want to add a column to the final dataframe then simply you can -
FinalDf['newColumn']=your_column
@@learnerea Hi, my sincere apologies for replaying this late but I think the problem to be solved is still waiting for us.:) First through I want to thank you so much for your time and support on the matter! I will send you the sample data with the problem description and what I have done and what I want in the end. Thanks!
i want to marge multiple excel, all excel have common 3 sheets. please help to get code for the same, i want all data merge in new blank excel sheet wise
replied on the another comment, please try..
will be happy to help further
Hi, i try to run ur code and it said "DataFrame' object has no attribute 'append'", i read some article and try to change it with concat and it still doesnt work. Is there any idea on this? Thanks in advance.
Can you please put your script here for us to analyse that
@@learnerea Nvm, Im fine, thank you very much for your code. Appreciate it m8
Can you do it in python script?
If you mean in native and not by using any library, in native python we don't have a dataframe what we have is dictionary, lists, tuples... if you want to merge dataframe then first you gotta convert them in any of the above and then possible it can be done
Hi, that is a good tutorial. However, you have used a small number of excel files and have similar column names. The real challenge is what if you have different column names, or rather inconsistent column names and the number of files is say 300? That is what you get in real-world data?
Hi Anteneh,
Thanks for watching the video and posting your issue.
Doesn't seem to be a huge matter of concern. First of all you need to break down the problem in steps and then solve it -
1. What is the end result that you nee
2. What is the current input which you have
lets start from point 2 -
1. You might have different files with different number of collumns - that's fine when you combine them all together, there will be certain columns with missing values and that's a separate issue to figure out.
2. You might have different files with different column names - Same as point number 1
3. You might have different files with different column names but the content are same.. e.g. one file has the column name "Gender" while other one is "Sex"... this is also fine. my recommendation is to do the consolidation first then figure out all such columns and combine them all together...
Not sure if this is going to help...
if you want, you can share some let say 5-10 files over the mail id learnerea.edu@gmail.com.. and I'll try to provide you solution with the help of this sample
I have understood the code snippet for consolidating multiple input files in one new excel file. My question is if input files contain hidden records. those hidden records should not be visible on output file. How to do python code for this requirement. please share who knows
You must be having some criteria to hide the records..
after consolidation of the data, use that criteria to filter them out..
for filtering the dataframe you can check out these specific videos
czcams.com/video/HqG7Gvq1dWI/video.htmlsi=a7nW_FPrHCBuu6wi
czcams.com/video/FDlOyi9dFt8/video.htmlsi=4RD1bvTc8LszHH3O
Sir I have one request, kindly create a video on OOP with real life situation like we create multiple file in pycharm, VS code etc. and call other other Class from one place like that video i am requesting.
Like we have one main file app.py for EDA whave seperate file, data cleaning seperate file, Model building seperate file, how to connect all of them with app.py file.
will try making it soon
SIR, HOW TO RUN 1 OR MORE EXCEL FILE, BY PYTHON CODE LIVR RUN, FROM LINK , ie. NSE, ETC
Can you give some examples to clarify on this doubt
Gives error "can only concatrbate Str (not list) to str
can you please put your complete script here along with the complete error message.. I'll try my best to solve this.
I follow this code but in output my output file is blank
Put your code here for review
Sir I m following your script but showing error os error file name directory name or volume label syntaxes is incorrect. It is showing. How to resolve this?
Put your code here
We will have a look
@@learnerea
import pandas as pd
import os
input_loc = "D:\document\temp\output"
output_loc = "D:\document\temp\output\temp_out"
fileList = os.listdir(input_loc)
This is my code and I have entered the input location where my excel files are present which need to be merged and the output location where i need merged files. it will be more convenient if you share any personal contact details to ask doubts over there. I will be highly thankful to you
Plz give me reply
The code looks okay, just ensure two things -
1. the location exist in your system
2. if you are working on windows, try changing the backward slashes (\) with forward slashes (/) in the location and always end it with a slash, e.g. instead of just
"D:\document\temp\output"
change it to
"D:/document/temp/output/"
Plz provide your mail id,I will share my code pic over there. I m still getting error.