
Python - Picking DataFrame columns from all csv files in directory and merging into one

I am trying to read all csv files in a directory and merge a specific column from each file into a new DataFrame. Basically, the files are of the format file_name.csv:

MainColumn A B C

Since the row order is the same in all the files, I am trying to take the first column (MainColumn) from file1, and then only column B from every file. So the resulting DataFrame should be:

MainColumn B B B B...

Where the Bs are the individual B columns from file1, file2, etc. This is my code so far:

import glob
import pandas

data = pandas.read_csv('file_1.csv')
df2 = data[['MainColumn']]

for files in glob.glob("*.csv"):
    data1 = pandas.read_csv(files)
    df = data1[['ColumnB']]
    df2 = df2.append(df)

The resulting df2 is not what I expected: it has all the rows from file1, then the ColumnB rows from each file stacked underneath, rather than placed side by side as new columns.

Try concat: specifying the concatenation axis is the key here, which df.append() does not do:

df2 = pd.concat([df2, df], axis=1)
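For context, here is a minimal sketch of how the whole loop might look with pd.concat, assuming the files are named file_1.csv, file_2.csv, … and each contains a MainColumn and a ColumnB with rows in the same order:

import glob
import pandas as pd

# Take MainColumn once, from the first file.
df2 = pd.read_csv('file_1.csv')[['MainColumn']]

for path in sorted(glob.glob('*.csv')):
    col_b = pd.read_csv(path)[['ColumnB']]
    # axis=1 adds each ColumnB as a new column next to the existing ones,
    # instead of stacking its rows underneath as append does.
    df2 = pd.concat([df2, col_b], axis=1)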
