Hi I have multiple xlsx files
sales-feb-2014.xlsx
sales-jan-2014.xlsx
sales-mar-2014.xlsx
I have merged all 3 sheets into one data set using file name as INDEX[0]
script :
import pandas as pd
import numpy as np
import glob
import os
all_data = pd.DataFrame()
for f in glob.glob(r'H:\Learning\files\sales*.xlsx'):
df = pd.read_excel(f)
df['filename'] = os.path.basename(f)
df = df.reset_index().set_index('filename')
print(df)
Now Data looks like this :
file name col1 col2 col3
sales-jan-2014.xlsx .... .... ...
sales-feb-2014.xlsx .... .... ...
sales-mar-2014.xlsx .... .... ...
here I want to load new xlsx file where I need to load
sales-jan-2014.xlsx into sheet1
sales-feb-2014.xlsx into sheet2
sales-mar-2014.xlsx into sheet3
I have tried with this script :
writer = pd.ExcelWriter('output.xlsx')
for filename in df.index.get_level_values(0).unique():
temp_df = df.xs(filename, level=0)
temp_df.to_excel(writer,filename)
writer.save()
after executing this script i'm getting error :
loc, new_ax = labels.get_loc_level(key, level=level, AttributeError: 'Index' object has no attribute 'get_loc_level'
can you please suggest where I'm missing
Try using the below code :
import os
import pandas as pd
dirpath = "C:\\Users\\Path\\TO\\Your XLS folder\\data\\"
fileNames = os.listdir(dirpath)
writer = pd.ExcelWriter(dirpath+'combined.xlsx', engine='xlsxwriter')
for fname in fileNames:
df = pd.read_excel(dirpath+fname)
print(df)
df.to_excel(writer, sheet_name=fname)
writer.save()
You can also use your code by make below changes :
for f in glob.glob(r'H:\Learning\files\sales*.xlsx'):
df = pd.read_excel(f)
df['filename'] = os.path.basename(f)
df = df.reset_index()
print(df.columns)
df.set_index(['filename','index'], inplace=True)
and saving it as you have done.
I hope this helps
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.