
Read & Combine Data From Multiple Excel Files with Pandas

I want to read several Excel files from a directory into pandas and concatenate them into one big DataFrame, but I have not been able to figure it out. All the files have 5 columns:

 C   N    S    R   Q

Except one file, which has 7 columns:

D   I   C    N   QI   P  L

How can I get one big DataFrame with only these columns?

C   N    S    R   Q

Code:

import pandas as pd
import glob

path = "path/to/files"  # placeholder: directory containing the files
all_files = glob.glob(path + "/*.csv")

li = []

for filename in all_files:
    df = pd.read_csv(filename, index_col=None, header=0)
    li.append(df)

frame = pd.concat(li, axis=0, ignore_index=True)

NB: Not all of my files have the same columns.

How can I fix this?

List the columns you want to keep and select them from each dataframe before concatenating.

import pandas as pd
import glob

path = "path/to/files"  # placeholder: directory containing the files
all_files = glob.glob(path + "/*.csv")

li = []
cols_to_keep = ['C', 'N', 'S', 'R', 'Q']

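for filename in all_files:
    df = pd.read_csv(filename, index_col=None, header=0)
    # reindex instead of df[cols_to_keep]: columns a file lacks become NaN rather than raising KeyError
    li.append(df.reindex(columns=cols_to_keep))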

frame = pd.concat(li, axis=0, ignore_index=True)
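
One thing to watch for: the 7-column file from the question has no S, R or Q columns, so plain selection with `df[cols_to_keep]` raises a `KeyError` on it, whereas `df.reindex(columns=cols_to_keep)` fills the missing columns with NaN. Here is a minimal sketch of the difference. The two small in-memory frames are made-up stand-ins for the files, not real data:

```python
import pandas as pd

cols_to_keep = ["C", "N", "S", "R", "Q"]

# Hypothetical stand-ins for two files: one regular, one with the odd 7-column header.
regular = pd.DataFrame({"C": [1], "N": [2], "S": [3], "R": [4], "Q": [5]})
odd = pd.DataFrame(
    {"D": [0], "I": [0], "C": [6], "N": [7], "QI": [0], "P": [0], "L": [0]}
)

# Plain selection fails on the odd frame because S, R and Q are absent.
try:
    odd[cols_to_keep]
    selection_failed = False
except KeyError:
    selection_failed = True

# reindex keeps the requested columns in order and fills absent ones with NaN.
frames = [df.reindex(columns=cols_to_keep) for df in (regular, odd)]
frame = pd.concat(frames, axis=0, ignore_index=True)
print(frame)
```

Whether NaN is the right filler for the odd file, or whether that file should be skipped, depends on what its columns actually mean.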

How would that work if you have 100 Excel files with similar data but the headers are written in different ways?
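
One possible approach to this follow-up, sketched under assumptions: normalise each header (strip whitespace, lower-case it) and map known spelling variants to canonical names through a hand-written dictionary. The `ALIASES` table, the variant spellings and the `normalise_columns` helper are all invented for illustration; the real mapping depends on what the files contain.

```python
import pandas as pd

CANONICAL = ["C", "N", "S", "R", "Q"]

# Hypothetical alias table: normalised header text -> canonical column name.
ALIASES = {
    "c": "C", "n": "N", "s": "S", "r": "R", "q": "Q",
    "name": "N", "qty": "Q",
}

def normalise_columns(df):
    """Return a copy of df with known header variants renamed, limited to CANONICAL."""
    # Unknown headers are left unchanged by rename and then dropped by reindex.
    renamed = df.rename(columns=lambda h: ALIASES.get(str(h).strip().lower(), h))
    return renamed.reindex(columns=CANONICAL)

# Made-up frames standing in for two files whose headers are written differently.
a = pd.DataFrame({" c ": [1], "Name": [2], "S": [3], "r": [4], "QTY": [5]})
b = pd.DataFrame({"C": [6], "N": [7], "S": [8], "R": [9], "Q": [10]})

frame = pd.concat([normalise_columns(a), normalise_columns(b)], ignore_index=True)
print(frame)
```

In the loop above, each frame read from disk would go through `normalise_columns` before being appended. If two headers in the same file map to the same canonical name, `reindex` raises an error because of the duplicate labels, so such files would need handling by hand.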
