Concatenating and saving multiple pairs of CSV files in pandas
I am a beginner in Python. I have a hundred pairs of CSV files. The files look like this:
25_13oct_speed_0.csv
26_13oct_speed_0.csv
25_13oct_speed_0.1.csv
26_13oct_speed_0.1.csv
25_13oct_speed_0.2.csv
26_13oct_speed_0.2.csv
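and so on.
I want to concatenate each pair of files, that is, the file starting with 25 and the matching file starting with 26. Each pair has a speed threshold (speed_0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0), which is marked in the file name. All files have the same data structure: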
Mac  Annotation  X  Y
A    first       0  0
A    last        0  0
B    first       0  0
B    last        0  0
So a plain concatenation is enough to join the two data sets. I use this method:
import pandas as pd

# read one pair of files and stack them row-wise
df1 = pd.read_csv('25_13oct_speed_0.csv')
df2 = pd.read_csv('26_13oct_speed_0.csv')
frames = [df1, df2]
result = pd.concat(frames)
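I do this for every pair of files. But this takes time and is not an elegant approach. Is there a good way to combine the paired files automatically and save them at the same time?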
The idea is to create a DataFrame from the list of files and add 2 new columns by splitting each file name on the first _ with Series.str.split:
print (files)
['25_13oct_speed_0.csv', '26_13oct_speed_0.csv',
'25_13oct_speed_0.1.csv', '26_13oct_speed_0.1.csv',
'25_13oct_speed_0.2.csv', '26_13oct_speed_0.2.csv']
df1 = pd.DataFrame({'files': files})
df1[['g','names']] = df1['files'].str.split('_', n=1, expand=True)
print (df1)
files g names
0 25_13oct_speed_0.csv 25 13oct_speed_0.csv
1 26_13oct_speed_0.csv 26 13oct_speed_0.csv
2 25_13oct_speed_0.1.csv 25 13oct_speed_0.1.csv
3 26_13oct_speed_0.1.csv 26 13oct_speed_0.1.csv
4 25_13oct_speed_0.2.csv 25 13oct_speed_0.2.csv
5 26_13oct_speed_0.2.csv 26 13oct_speed_0.2.csv
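The files list above is taken as given. As a rough sketch of one way to build it, the snippet below collects the names with glob and then applies the same split. The function name list_pair_files, the folder argument, and the "*_speed_*.csv" pattern are assumptions made for illustration, not part of the original answer.

```python
import glob
import os

import pandas as pd


def list_pair_files(folder="."):
    """Return a DataFrame with columns files, g, names for the CSVs in folder.

    The pattern "*_speed_*.csv" is an assumption based on the file names
    shown in the question; adjust it if the real names differ.
    """
    paths = sorted(glob.glob(os.path.join(folder, "*_speed_*.csv")))
    # keep only the file name, so the split below sees e.g. 25_13oct_speed_0.csv
    files = [os.path.basename(p) for p in paths]
    df = pd.DataFrame({"files": files})
    # split on the first underscore only: prefix (25 or 26) and the shared name
    df[["g", "names"]] = df["files"].str.split("_", n=1, expand=True)
    return df


# usage (hypothetical folder): df1 = list_pair_files("data")
```

Because only the base names are stored, pd.read_csv(n.files) in a later loop would need the working directory to be that folder, or the folder joined back onto each name.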
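Then group by the names column and loop over the groups. Inside each group, loop over the rows with DataFrame.itertuples, create a new DataFrame with read_csv, add a new column filled with the value of g if needed, and append it to a list. Finally, join the list with concat and save the result to a new file named from the names column: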
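for i, g in df1.groupby('names'):
    out = []
    for n in g.itertuples():
        # read each file of the pair and record its prefix (25 or 26)
        df = pd.read_csv(n.files).assign(source=n.g)
        out.append(df)
    dfbig = pd.concat(out, ignore_index=True)
    print (dfbig)
    # save under the shared name, e.g. 13oct_speed_0.csv
    dfbig.to_csv(g['names'].iat[0])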
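For reference, the same steps can be wrapped in one function that reads from one folder and writes to another. This is only a sketch under a few assumptions: the name combine_pairs, the src_dir and out_dir arguments, and the "*_*.csv" pattern are hypothetical, and index=False is my own choice to keep the row index out of the saved file (the loop above writes it).

```python
from pathlib import Path

import pandas as pd


def combine_pairs(src_dir, out_dir):
    """Concatenate CSVs that share the name after the first '_' and save each result."""
    src, out = Path(src_dir), Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # group file paths by the part of the name after the first underscore
    groups = {}
    for path in sorted(src.glob("*_*.csv")):
        prefix, name = path.name.split("_", 1)
        groups.setdefault(name, []).append((prefix, path))

    written = []
    for name, members in groups.items():
        # tag every row with the prefix of the file it came from
        frames = [pd.read_csv(path).assign(source=prefix) for prefix, path in members]
        combined = pd.concat(frames, ignore_index=True)
        target = out / name
        combined.to_csv(target, index=False)
        written.append(target)
    return written


# usage (hypothetical folders): combine_pairs("data", "combined")
```

Writing to a separate folder keeps the combined files, whose names still contain an underscore, from being picked up as inputs on a second run.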