I am trying to read all CSV files in a directory and merge a specific column from each file into a new DataFrame. Each file (file_name.csv) has the following columns:
MainColumn A B C
Since the row order is the same in all the files, I want to take the first column (MainColumn) from file1 and then only column B from every file. So the resulting DataFrame has to be:
MainColumn B B B B...
where the Bs are the individual B columns from file1, file2, etc.
This is my code so far:
import glob
import pandas

data = pandas.read_csv('file_1.csv')
df2 = data[['MainColumn']]
for files in glob.glob("*.csv"):
    data1 = pandas.read_csv(files)
    df = data1[['ColumnB']]
    # append adds the new values as rows, not as a new column
    df2 = df2.append(df)
The resulting df2 is not what I expected: it contains all the rows from file1, then the ColumnB values appended as extra rows below them, and so on for each file.
Try concat. The key here is specifying the concatenation axis, which df.append() has no parameter for:
df2 = pandas.concat([df2, df], axis=1)
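For reference, here is a minimal sketch of the whole loop built around concat. The function name merge_column, its parameters, and the choice to rename each B column after its file are assumptions made for illustration and are not part of the original question. It also assumes the column is literally named B, as in the file layout described above.

```python
import glob
import os

import pandas as pd


def merge_column(pattern, key_col="MainColumn", value_col="B"):
    """Return key_col from the first matching file plus value_col from every file.

    Hypothetical helper: assumes all files share the same row order.
    """
    paths = sorted(glob.glob(pattern))  # sorted so the column order is predictable
    if not paths:
        raise FileNotFoundError(f"no files match {pattern!r}")

    # Take the shared key column once, from the first file.
    pieces = [pd.read_csv(paths[0], usecols=[key_col])]

    for path in paths:
        col = pd.read_csv(path, usecols=[value_col])[value_col]
        # Assumption: name each column after its file so they stay distinguishable.
        name = os.path.splitext(os.path.basename(path))[0]
        pieces.append(col.rename(name))

    # axis=1 places the pieces side by side instead of stacking rows.
    return pd.concat(pieces, axis=1)
```

Calling merge_column("*.csv") in the directory that holds the files should give MainColumn followed by one column per file. Collecting the pieces in a list and concatenating once avoids rebuilding the DataFrame on every iteration. Note that DataFrame.append was removed in pandas 2.0, so on current versions concat is the way to go in any case.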