Multiple CSV files into one, with the file name as a column name in Pandas

Question

I have a directory containing a hundred CSV files. One of the CSV files looks like this:

Time    ID
09:00   A
..      ..

I want to join all of the CSV files into one DataFrame that includes the name of each file (appending along axis=1). I used this code:

import glob
import os
import pandas as pd

files = glob.glob('data/*.csv')
df = pd.concat([pd.read_csv(fp).assign(File=os.path.basename(fp).split('.')[0]) for fp in files], axis=1)
df.to_csv('new.csv')
df

The result I got looks like this:

Time    ID  File  Time  ID  File    ..
09:00   A   01    09:00 B   02      ..
..      ..  ..    ..    ..  ..      ..

I want to combine the file name with the ID column name to form the new column name. My expected result looks like this:

Time    01_ID   Time    02_ID   ..
09:00   A       09:00   B       ..
..      ..      ..      ..      ..

Answer 1

You can use a dictionary comprehension first, keyed by the file name without its extension:

comp = {os.path.basename(fp).split('.')[0]: pd.read_csv(fp) for fp in files}
df = pd.concat(comp, axis=1)
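
To see what the dictionary form of pd.concat produces, here is a minimal sketch that uses two small in-memory frames in place of the CSV files. The keys '01' and '02' and the sample values are assumptions for illustration only. The dictionary keys become the outer level of a MultiIndex on the columns:

```python
import pandas as pd

# Hypothetical stand-ins for pd.read_csv('01.csv') and pd.read_csv('02.csv')
frames = {
    "01": pd.DataFrame({"Time": ["09:00"], "ID": ["A"]}),
    "02": pd.DataFrame({"Time": ["09:00"], "ID": ["B"]}),
}

# Passing a dict to pd.concat uses its keys as the outer column level
wide = pd.concat(frames, axis=1)

print(wide.columns.tolist())
# [('01', 'Time'), ('01', 'ID'), ('02', 'Time'), ('02', 'ID')]
```

Each column label is now a (file name, original column name) pair, which is what the next step unpacks as a and b.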

Then flatten the MultiIndex in the columns with a list comprehension, adding the file name as a prefix only for the ID column:

df.columns = [f"{a}_{b}" if b == 'ID' else b for a, b in df.columns]
print (df)
    Time 01_ID   Time 02_ID
0  09:00     A  09:00     B

df.to_csv('new.csv')
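
One thing to keep in mind with this layout is that every Time column shares the same label. The sketch below rebuilds a frame with the same column names (the values are assumed sample data) to show how selection behaves with a duplicated label:

```python
import pandas as pd

# Hypothetical frame with the same column layout as the output above
df = pd.DataFrame(
    [["09:00", "A", "09:00", "B"]],
    columns=["Time", "01_ID", "Time", "02_ID"],
)

# A duplicated label selects every matching column, so this is a DataFrame
times = df["Time"]
print(type(times).__name__, times.shape)  # DataFrame (1, 2)

# A unique label selects a single Series
ids = df["01_ID"]
print(type(ids).__name__, ids.shape)  # Series (1,)
```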

EDIT: A better solution is to create unique column names by joining both levels of the MultiIndex:

df.columns = df.columns.map('_'.join)
print (df)
  01_Time 01_ID 02_Time 02_ID
0   09:00     A   09:00     B
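
For reference, the whole approach can be sketched end to end as below. This is only an illustration under stated assumptions: the two CSV files are written to a temporary directory with made-up sample data, and pathlib's Path.stem is used in place of os.path.basename(fp).split('.')[0] to get the file name without its extension.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Create two small CSV files in a temporary directory (hypothetical sample data)
tmp_dir = Path(tempfile.mkdtemp())
pd.DataFrame({"Time": ["09:00"], "ID": ["A"]}).to_csv(tmp_dir / "01.csv", index=False)
pd.DataFrame({"Time": ["09:00"], "ID": ["B"]}).to_csv(tmp_dir / "02.csv", index=False)

# Sort the paths so the column order is deterministic
files = sorted(tmp_dir.glob("*.csv"))

# Path.stem drops only the final extension: '01.csv' -> '01'
comp = {fp.stem: pd.read_csv(fp) for fp in files}
df = pd.concat(comp, axis=1)

# Flatten ('01', 'Time') into '01_Time' so every column name is unique
df.columns = df.columns.map("_".join)

print(df.columns.tolist())
# ['01_Time', '01_ID', '02_Time', '02_ID']
```

Sorting the file list is a design choice rather than part of the original answer: glob does not guarantee an order, so sorting keeps the columns in a predictable sequence across runs.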

Answer 1: score 2, accepted, 2019-09-25 05:28:05
