So I have 1500 Excel Workbooks, each with 10+ sheets with exactly same structure. I tried to combine Multiple Excel Workbooks into one file and succeeded using code below:
import os
import pandas as pd
cwd = os.path.abspath('')
files = os.listdir(cwd)
df = pd.DataFrame()
for file in files:
if file.endswith('.xlsx'):
df = df.append(pd.read_excel(file), ignore_index=True)
df.head()
df.to_excel('Combined_Excels.xlsx')
And as a result I get combined Excel Workbook, but only with 1st sheets(pages) of each Excel Workbook. I figured that I need to add a parameter sheet_name=None
, but this way my dataframe becomes a dict
type and when appended, the result is not quite what I wanted.
import os
import pandas as pd
cwd = os.path.abspath('')
files = os.listdir(cwd)
df = pd.DataFrame()
for file in files:
if file.endswith('.xlsx'):
df = df.append(pd.read_excel(file, sheet_name=None), ignore_index=True)
df.head()
df.to_excel('Combined_Excels.xlsx')
Have you guys had similar problems? How can I at least combine all sheets into one? Combining Multiple Excel Sheets would not be a problem imo as the first sample worked perfectly.
Thanks, Nurbek
You might find a better and more efficient ways to solve this problem, but this is how I did it:
import os
import pandas as pd
# First, combine all the pages in each Workbook into one sheet
cwd = os.path.abspath('')
files = os.listdir(cwd)
df_toAppend = pd.DataFrame()
for file in files:
if file.endswith('.xlsx'):
df_toAppend = pd.concat(pd.read_excel(file, sheet_name=None), ignore_index=True)
df_toAppend.to_excel(file)
# And then append all the Workbooks into single Excel Workbook sheet
cwd_2 = os.path.abspath('')
files_2 = os.listdir(cwd_2)
df_toCombine = pd.DataFrame()
for file_2 in files_2:
if file_2.endswith('.xlsx'):
df_toCombine = df_toCombine.append(pd.read_excel(file_2), ignore_index=True)
df_toCombine.to_excel('Combined_Excels.xlsx')
With Large sets of data, it may take considerable amount of time to combine. Hope this helps someone eventually.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.