Problem combining multiple Excel files in Python pandas

Question

I am quite new to Python programming. I need to combine 1000+ files into one file. Each file has 3 sheets, and I need to take data only from sheet 2 and build a final Excel file. My problem is picking a value from a specific cell on sheet 2 of each Excel file and creating a column from it. Python picks the value from the first file and creates the column from that.

    df = pd.DataFrame()
            
    for file in files:
        if file.endswith('.xlsm'):
            df = pd.read_excel(file, sheet_name=1, header=None) 
            df['REPORT_NO'] = df.iloc[1][4] #Report Number
            df['SUPPLIER'] = df.iloc[2][4] #Supplier
            df['REPORT_DATE'] = df.iloc[0][4] #Report Number
        df2 = df2.dropna(thresh=15)
        df2 = df.append(df, ignore_index=True)
        df = df.reset_index()
        del df['index']
    df2.to_excel('FINAL_FILES.xlsx')

How can I solve this so that Python takes the values from each Excel file and puts them on the right rows?

Answer 1

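I. Check your indexing. Both sheet_name=1 and df.iloc[2][4] are zero-based: sheet_name=1 reads the second sheet, and df.iloc[2][4] refers to the third row and fifth column of that sheet. You mentioned that all of the .xlsm files have 3 sheets, so make sure this is the sheet and these are the cells you intend to read.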
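One way to check which cell a positional index actually picks is to try it on a small frame whose values encode their own position. The frame below is made up for illustration; it only assumes that the sheet is read with header=None, as in the question. Both iloc and an integer sheet_name count from 0.

```python
import pandas as pd

# Made-up 4x6 frame; each cell holds its own zero-based "row,col" position.
raw = pd.DataFrame([[f"{r},{c}" for c in range(6)] for r in range(4)])

# iloc is zero-based: [2, 4] is the third row and fifth column
# (cell E3 in Excel terms when the sheet is read with header=None).
cell = raw.iloc[2, 4]
print(cell)  # 2,4

# The chained form raw.iloc[2][4] reaches the same cell here, because the
# column labels are the default integers 0..5.
same = raw.iloc[2][4]
print(same)  # 2,4
```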

II. Your scoping could be wrong. Why define df outside of the loop? It will change per file, so there is no need for an external one. All info from the loop should be put into your df2 before the next iteration of the loop.
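A minimal sketch of that scoping, assuming the three header cells sit where the question's code reads them (rows 0 to 2, column 4 of the second sheet). The helper names tag_report and combine_reports are hypothetical, and reading .xlsm files assumes an Excel engine such as openpyxl is installed.

```python
import pandas as pd


def tag_report(raw: pd.DataFrame) -> pd.DataFrame:
    """Copy the three header cells of one sheet into columns on every row."""
    out = raw.copy()
    # Read the cells from `raw`, not `out`, so the new columns cannot
    # change what position 4 refers to.
    out["REPORT_NO"] = raw.iloc[1, 4]
    out["SUPPLIER"] = raw.iloc[2, 4]
    out["REPORT_DATE"] = raw.iloc[0, 4]
    return out


def combine_reports(paths) -> pd.DataFrame:
    """Read sheet index 1 of each .xlsm file and stack the tagged rows."""
    frames = []  # one frame per file, created fresh inside the loop
    for path in paths:
        if not str(path).endswith(".xlsm"):
            continue
        raw = pd.read_excel(path, sheet_name=1, header=None)
        frames.append(tag_report(raw))
    if not frames:
        return pd.DataFrame()
    # Combine once, after the loop, instead of growing a frame per iteration.
    return pd.concat(frames, ignore_index=True)
```

Calling combine_reports(files).to_excel('FINAL_FILES.xlsx', index=False) would then write the result. Any row filtering, such as the question's dropna(thresh=15), can be applied to the combined frame before writing.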

III. Have you checked whether append is adding a row or a column?
Even though

    df['REPORT_NO'] = df.iloc[1][4] #Report Number
    df['SUPPLIER'] = df.iloc[2][4] #Supplier
    df['REPORT_DATE'] = df.iloc[0][4] #Report Date

are written as columns, they have the Report Number/Supplier/Report Date repeated for every row in that column.

Also check the output of df2 = df.append(df, ignore_index=True). As written, it appends df to itself rather than to df2, so it may not be combining the files in the way you intend.
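Both effects can be seen on small frames. The sketch below uses invented data and swaps in pd.concat, since DataFrame.append was removed in pandas 2.0: a scalar assigned to a column is repeated on every row, and concatenation stacks the frames as rows.

```python
import pandas as pd

# Two made-up frames standing in for the data read from two files.
first = pd.DataFrame({"value": [10, 20]})
second = pd.DataFrame({"value": [30, 40, 50]})

# A scalar assigned to a column is broadcast to every row of that frame.
first["REPORT_NO"] = "R-001"
second["REPORT_NO"] = "R-002"

# concat along the default axis adds rows; columns are matched by name.
combined = pd.concat([first, second], ignore_index=True)
print(combined.shape)  # (5, 2)
print(combined["REPORT_NO"].tolist())
# ['R-001', 'R-001', 'R-002', 'R-002', 'R-002']
```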

solution1
0 2021-04-03 20:46:33
