
Group specific rows from multiple files and save each group of rows in a new Excel file with Python (pandas, openpyxl)

Can someone please help me solve the following issue?

  • I have multiple Excel files; some of them have 3 columns ('Year', 'Car', 'Price') and others have 5 columns ('Year', 'Car', 'Color', 'Places', 'Country');

  • For each file, I want to group the rows by the 'Year' column;

  • Then I want to save these groups of rows in different sheets of a new file.

My actual issue is that when Python reads and groups the rows from these files, my code only saves the rows from the last file it read.

Thanks a lot in advance!

from tkinter import Tk, filedialog, messagebox
import pandas as pd

window = Tk()
window.title("title")
# (etc.)
label.pack()

def action():
    # Let the user pick the input workbooks and the output folder.
    all_files = filedialog.askopenfilename(
        initialdir="/",
        multiple=True,
        title="select",
        filetypes=(
            ("all files", "*.*"),
            ("Excel", "*.xlsx*")))
    dossier = filedialog.askdirectory()
    for f in all_files:
        final = pd.read_excel(f, sheet_name=0)
        final['Year'] = final['Year'].apply(str)
        lst1 = final.groupby('Year')
        lst0 = lst1.get_group('2013')
        # The writer is reopened for every file, so only the last file's rows survive.
        with pd.ExcelWriter(dossier + '\\sells.xlsx') as writer:
            lst0.to_excel(writer, sheet_name='2013', index=False)
    messagebox.showinfo("Files", "Ready")

ExcelWriter defaults to write mode, so each pass through the loop recreates sells.xlsx and discards what the previous files wrote:

mode : {'w', 'a'}, default 'w'
    File mode to use (write or append). Append does not work with fsspec URLs.

Try specifying append mode with if_sheet_exists set to overlay:

if_sheet_exists : {'error', 'new', 'replace', 'overlay'}, default 'error'
    How to behave when trying to write to a sheet that already exists (append mode only).

  • error: raise a ValueError.
  • new: Create a new sheet, with a name determined by the engine.
  • replace: Delete the contents of the sheet before writing to it.
  • overlay: Write contents to the existing sheet without removing the old contents.

with pd.ExcelWriter(dossier+'\\sells.xlsx', mode="a", if_sheet_exists="overlay") as writer:
   # ...
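
An alternative that avoids append mode altogether is to read every file first, combine the rows, and open the writer only once. This may be simpler in practice: as far as I know, append mode expects the target file to already exist, and overlay writes from the same start row by default, so later files could land on top of earlier rows unless startrow is adjusted. The following is only a minimal sketch under those assumptions. The helper names group_rows_by_year and write_groups are hypothetical and not part of pandas, and it assumes every file has a 'Year' column holding whole-number years.

```python
import pandas as pd


def group_rows_by_year(frames):
    """Combine several DataFrames and split the result by 'Year'.

    Hypothetical helper: returns a dict mapping each year (as a string)
    to the rows from all input frames that belong to that year.
    """
    # concat aligns on column names; a column missing from a file is filled with NaN.
    combined = pd.concat(frames, ignore_index=True, sort=False)
    combined["Year"] = combined["Year"].astype(str)
    return {
        year: rows.reset_index(drop=True)
        for year, rows in combined.groupby("Year")
    }


def write_groups(groups, path):
    """Write each group to its own sheet, opening the workbook only once."""
    with pd.ExcelWriter(path) as writer:
        for year, rows in groups.items():
            rows.to_excel(writer, sheet_name=year, index=False)
```

Inside action(), this could be used by building frames = [pd.read_excel(f, sheet_name=0) for f in all_files] and then calling write_groups(group_rows_by_year(frames), dossier + '\\sells.xlsx'). Files with 3 columns and files with 5 columns end up in the same sheets, with empty cells where a column was absent.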


 