Adding Multiple .xls files to a Single .xls file, using the file name to name tabs

I have multiple directories, each containing any number of .xls files. I'd like to take the files in a given directory and combine them into one .xls file, using the file names as the tab names. For example, if the directory contains NAME.xls, AGE.xls, and LOCATION.xls, I'd like a new file with the data from NAME.xls on a tab called NAME, the data from AGE.xls on a tab called AGE, and so on. Each source .xls file has only one column of data with no headers. This is what I have so far, and it's not working. Any help would be greatly appreciated (I'm fairly new to Python and I've never had to do anything like this before).

wkbk = xlwt.Workbook()

xlsfiles =  glob.glob(os.path.join(path, "*.xls"))
onlyfiles = [f for f in listdir(path) if isfile(join(path, f))]
tabNames = []
for OF in onlyfiles:
    if str(OF)[-4:] == ".xls":
        sheetName = str(OF)[:-4]
        tabNames.append(sheetName)
    else:
        pass

for TN in tabNames:
    outsheet = wkbk.add_sheet(str(TN))
    data = pd.read_excel(path + "\\" + TN + ".xls", sheet_name="data")
    data.to_excel(path + "\\" + "Combined" + ".xls", sheet_name = str(TN))

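Can you try the following? Passing a file path to to_excel on every iteration rewrites the file each time, so only the last sheet survives; sharing a single ExcelWriter across the loop keeps all of them.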

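import glob
import os

import pandas as pd

path = 'YourPath\\ToYour\\Files\\'  # Backslashes escaped; note the \\ at the end

# Create a list with only .xls files, skipping a Combined.xls left by an earlier run
list_xls = [os.path.basename(f) for f in glob.glob(os.path.join(path, "*.xls"))
            if os.path.basename(f) != "Combined.xls"]

# Create a writer for pandas
# (the xlwt engine was removed in pandas 2.0, so this needs an older pandas)
writer = pd.ExcelWriter(path + "Combined.xls", engine='xlwt')

# Loop over all the files
for xls_file in list_xls:
    # Read the sheet named "data"; header=None because the files have no header row
    # Is the sheet containing the data named "data" in all your .xls files?
    df_data = pd.read_excel(io=path + xls_file, sheet_name="data", header=None)
    # Write the data into a sheet named after the file (extension stripped)
    df_data.to_excel(writer, sheet_name=os.path.splitext(xls_file)[0],
                     index=False, header=False)

# close() saves Combined.xls; a separate writer.save() call is not needed
writer.close()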

Let me know if it works for you. I have never tried engine='xlwt' myself, since I work with .xlsx files rather than .xls.
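One detail to watch when naming tabs after files: Excel rejects sheet names longer than 31 characters or containing any of : \ / ? * [ ]. Below is a minimal sketch of how the tab name could be derived more defensively than with xls_file[:-4]. The helper name sheet_name_for is hypothetical (it is not part of pandas), and it does not handle two files that collapse to the same name after truncation.

```python
import os
import re

# Characters Excel does not allow in a sheet name: : \ / ? * [ ]
_INVALID_SHEET_CHARS = re.compile(r'[:\\/?*\[\]]')
MAX_SHEET_NAME_LEN = 31  # Excel's limit on sheet-name length


def sheet_name_for(filename):
    """Return a tab name derived from a file name (hypothetical helper)."""
    # splitext drops the extension whatever its length (.xls or .xlsx)
    stem = os.path.splitext(os.path.basename(filename))[0]
    # Replace forbidden characters, then cut to the maximum length
    cleaned = _INVALID_SHEET_CHARS.sub('_', stem)
    return cleaned[:MAX_SHEET_NAME_LEN]
```

In the loop above it would be used as sheet_name=sheet_name_for(xls_file).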

Here is a small helper function - it supports both .xls and .xlsx files:

import pandas as pd
try:
    from pathlib import Path
except ImportError:              # Python 2
    from pathlib2 import Path


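def merge_excel_files(dir_name, out_filename='result.xlsx', **kwargs):
    """Write each Excel file in dir_name to its own sheet of out_filename."""
    p = Path(dir_name)
    with pd.ExcelWriter(out_filename) as xls:
        for f in p.glob('*.xls*'):
            # header=None: the source files have no header row
            df = pd.read_excel(f, header=None, **kwargs)
            # Name the sheet after the file (stem = name without extension)
            df.to_excel(xls, sheet_name=f.stem, index=False, header=False)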

Usage:

merge_excel_files(r'D:\temp\xls_directory', 'd:/temp/out.xls')
merge_excel_files(r'D:\temp\xlsx_directory', 'd:/temp/out.xlsx')
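If the output file is written into the same directory as the inputs, the '*.xls*' pattern will also match it on the next run, along with the ~$ lock files Excel creates while a workbook is open. A possible way to filter the inputs first is sketched below; find_excel_inputs is a hypothetical helper name, and the decision to skip other extensions that match the pattern (such as .xlsm) is an assumption, not something the function above does.

```python
from pathlib import Path


def find_excel_inputs(dir_name, out_filename):
    """List the .xls/.xlsx files in dir_name that are safe to merge (hypothetical helper)."""
    out_path = Path(out_filename).resolve()
    inputs = []
    # sorted() gives the tabs a predictable order
    for f in sorted(Path(dir_name).glob('*.xls*')):
        if f.name.startswith('~$'):
            continue  # lock file Excel creates while a workbook is open
        if f.resolve() == out_path:
            continue  # output of an earlier run
        if f.suffix.lower() not in ('.xls', '.xlsx'):
            continue  # the pattern also matches .xlsm, .xlsb, and so on
        inputs.append(f)
    return inputs
```

The returned list could then replace p.glob('*.xls*') in the loop of merge_excel_files.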
