A tested means of combining Excel (*.xls) files into a single (*.xlsx) file

I had trouble finding clear information on how to open multiple Excel (*.xls) files in Python, combine them into a single dataframe, and save the result as a single Excel (*.xlsx) file.

I found a way that works for me, and I thought I'd share it with all the helpful people on Stack Overflow!

The solution I found is as follows:

Be sure to install the required packages (in Anaconda, for example) so that the imports work. pandas relies on xlrd to read .xls files and on openpyxl to write .xlsx files.

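import os
import glob2         # third-party glob package, used to list the .xls files
import pandas as pd
import xlrd          # engine pandas uses to read .xls; importing it confirms it is installed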

Replace *** with the directory where your files are located:

os.chdir("/***/Extracted Individual Excel Files/")

"combined_df" is the name I gave the dataframe. You can call it anything, as long as you use the same name throughout your code.

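# DataFrame.append was removed in pandas 2.0, so collect the frames and concatenate once
frames = []
for f in sorted(glob2.glob("*.xls")):
    frames.append(pd.read_excel(f))
# pd.concat raises ValueError on an empty list, so fall back to an empty dataframe
combined_df = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()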
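For anyone who wants the loop above as a reusable function, here is a minimal sketch. It is not part of my original solution: the function name combine_excel_files, the reader parameter and the source_file column are all made up for illustration, and it uses the standard-library pathlib instead of glob2 and os.chdir.

```python
from pathlib import Path

import pandas as pd


def combine_excel_files(directory, pattern="*.xls", reader=pd.read_excel):
    """Read every file in `directory` matching `pattern` and stack the rows.

    `reader` is any callable that takes a path and returns a DataFrame;
    it defaults to pandas.read_excel.
    """
    frames = []
    # Sort so the row order does not depend on the file system's listing order.
    for path in sorted(Path(directory).glob(pattern)):
        df = reader(path)
        # Record which workbook each row came from (hypothetical column name).
        frames.append(df.assign(source_file=path.name))
    if not frames:
        # pd.concat raises ValueError on an empty list.
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)
```

You would call it as combined_df = combine_excel_files("/***/Extracted Individual Excel Files/"). Passing the directory in means the working directory never has to change, and the extra column makes it easier to trace a bad row back to its file.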

At this point you should be able to type "combined_df" and press Enter to display the dataframe.
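If the dataframe is too big to eyeball, a few quick checks can help confirm the files were combined as expected. This is just a sketch. The small dataframe below is made-up stand-in data so the snippet runs on its own; in practice you would run the three print lines against your real combined_df.

```python
import pandas as pd

# Stand-in for the combined dataframe (made-up data for illustration only).
combined_df = pd.DataFrame({"name": ["a", "b", "c"], "value": [1.0, None, 3.0]})

print(combined_df.shape)         # (number of rows, number of columns)
print(combined_df.head())        # first five rows
print(combined_df.isna().sum())  # count of missing values in each column
```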

Use the following code to save your newly combined dataframe to a file of your choosing.

Again, *** is where you want your file saved, and @@@@ is the name you give the file:

combined_df.to_excel(r'/***/@@@@.xlsx', index=False)
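The save step can also be wrapped in a small helper that catches a mistyped extension before anything is written. Again, this is only a sketch under my own assumptions: save_combined is a hypothetical name, and the helper assumes openpyxl is installed so that pandas can write the .xlsx file.

```python
from pathlib import Path

import pandas as pd


def save_combined(df, out_path):
    """Write `df` to an .xlsx file and return the path that was written."""
    out_path = Path(out_path)
    # Refuse anything that is not .xlsx so a typo such as ".xls" fails early.
    if out_path.suffix.lower() != ".xlsx":
        raise ValueError(f"expected a .xlsx file name, got {out_path.name!r}")
    # Create the target folder if it does not exist yet.
    out_path.parent.mkdir(parents=True, exist_ok=True)
    # index=False leaves the dataframe's row index out of the spreadsheet.
    df.to_excel(out_path, index=False)
    return out_path
```

You would call it as save_combined(combined_df, r'/***/@@@@.xlsx'), with the same placeholders as above.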

I hope this helps everyone. As a beginner, it was a pain to figure out!!!
