How to put 2 different dataframes in pandas
I want to merge two files whose values are laid out in different orders.
Given file1 and file2 below:
File 1:
col1 col2 col3
A001 B001 C001
A002 B002 C002
A003 B003 C003
A004 B004 C004
A005 B005 C005
A006 B006 C006
File 2:
col1 col2
A001 8
A002 2
A003 4
A004 1
A005 8
A006 3
B001 7
B002 4
B003 10
B004 11
B005 8
B006 3
C001 2
C002 9
C003 8
C004 1
C005 7
C006 6
I want to get the following:
col1 col2 col3 col4 col5 col6
A001 8 B001 7 C001 2
A002 2 B002 4 C002 9
A003 4 B003 10 C003 8
A004 1 B004 11 C004 1
A005 8 B005 8 C005 7
A006 3 B006 3 C006 6
I would really appreciate your help :)
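To reproduce the problem, the two frames can be rebuilt from the sample data. This is a minimal sketch that assumes the files are whitespace-separated text (the question does not say how they are stored); the names FILE1_TEXT and FILE2_TEXT are made up for the illustration.

```python
import io

import pandas as pd

# Sample data copied from the question; the whitespace-separated layout is an assumption.
FILE1_TEXT = """col1 col2 col3
A001 B001 C001
A002 B002 C002
A003 B003 C003
A004 B004 C004
A005 B005 C005
A006 B006 C006
"""

FILE2_TEXT = """col1 col2
A001 8
A002 2
A003 4
A004 1
A005 8
A006 3
B001 7
B002 4
B003 10
B004 11
B005 8
B006 3
C001 2
C002 9
C003 8
C004 1
C005 7
C006 6
"""

# sep=r"\s+" splits on any run of whitespace.
file1 = pd.read_csv(io.StringIO(FILE1_TEXT), sep=r"\s+")
file2 = pd.read_csv(io.StringIO(FILE2_TEXT), sep=r"\s+")

print(file1.shape)  # (6, 3)
print(file2.shape)  # (18, 2)
```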
I would use replace for this:
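import pandas as pd
# Replace every key in file1 with its value from file2, then interleave the columns.
mapping = dict(zip(file2.col1, file2.col2))
df = pd.concat([file1, file1.replace(mapping).add_suffix('_1')], axis=1).sort_index(axis=1)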
col1 col1_1 col2 col2_1 col3 col3_1
0 A001 8 B001 7 C001 2
1 A002 2 B002 4 C002 9
2 A003 4 B003 10 C003 8
3 A004 1 B004 11 C004 1
4 A005 8 B005 8 C005 7
5 A006 3 B006 3 C006 6
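One thing worth knowing about this approach: replace leaves a key that has no entry in the mapping untouched, whereas Series.map turns it into NaN. A minimal sketch on a toy Series (the key Z999 is made up and does not appear in the question's data):

```python
import pandas as pd

# Toy lookup table; "Z999" below is a hypothetical key with no entry in it.
lookup = {"A001": 8, "A002": 2}
keys = pd.Series(["A001", "A002", "Z999"], dtype=object)

replaced = keys.replace(lookup)  # the unmatched key is kept as-is
mapped = keys.map(lookup)        # the unmatched key becomes NaN

print(replaced.tolist())
print(mapped.isna().tolist())
```

Which behavior is preferable depends on whether a missing key should be silently kept or flagged as a gap.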
Here is a more readable solution that uses a for loop with enumerate and Series.map:
# For each original column, add a companion column with the values looked up in df2.
for idx, col in enumerate(df1.columns):
    df1[f'{col}_{idx+1}'] = df1[col].map(df2.set_index('col1')['col2'])
df1 = df1.sort_index(axis='columns')
col1 col1_1 col2 col2_2 col3 col3_3
0 A001 8 B001 7 C001 2
1 A002 2 B002 4 C002 9
2 A003 4 B003 10 C003 8
3 A004 1 B004 11 C004 1
4 A005 8 B005 8 C005 7
5 A006 3 B006 3 C006 6
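The loop above rebuilds the lookup on every iteration and adds columns to df1 in place. A possible variant, sketched here on a two-row excerpt of the question's data, builds the lookup Series once and collects the result in a new frame with the col1 to col6 names the question asks for. The names lookup, pieces and result are made up for the illustration.

```python
import pandas as pd

# Two-row excerpt of the question's data, for illustration only.
df1 = pd.DataFrame({
    "col1": ["A001", "A002"],
    "col2": ["B001", "B002"],
    "col3": ["C001", "C002"],
})
df2 = pd.DataFrame({
    "col1": ["A001", "A002", "B001", "B002", "C001", "C002"],
    "col2": [8, 2, 7, 4, 2, 9],
})

# Build the key -> value lookup once instead of once per column.
lookup = df2.set_index("col1")["col2"]

pieces = {}
for idx, col in enumerate(df1.columns):
    pieces[f"col{2 * idx + 1}"] = df1[col]              # original keys
    pieces[f"col{2 * idx + 2}"] = df1[col].map(lookup)  # looked-up values

# A dict of Series keeps insertion order, so the columns come out interleaved.
result = pd.DataFrame(pieces)
print(result)
```

Because the pieces go into a separate dict, df1 itself is left unchanged.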
I make no claim that this is better, other than that I like it.
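def f():
    # Build a key -> value dict from file2's two columns.
    m = dict(zip(*map(file2.get, file2)))
    for i, c in enumerate(file1):
        # Yield the original column, then its looked-up values, named col1..col6.
        yield file1[c].rename(f'col{i * 2 + 1}')
        yield file1[c].replace(m).rename(f'col{i * 2 + 2}')

pd.concat(f(), axis=1)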
col1 col2 col3 col4 col5 col6
0 A001 8 B001 7 C001 2
1 A002 2 B002 4 C002 9
2 A003 4 B003 10 C003 8
3 A004 1 B004 11 C004 1
4 A005 8 B005 8 C005 7
5 A006 3 B006 3 C006 6
It really bugs me that there isn't a super simple way to do this.
Here is another recipe:
# Key each Series by (source column, 0 or 1) so concat builds two-level column labels.
m = dict(zip(*map(file2.get, file2)))
pd.concat({
    (c, i): a
    for c in file1
    for i, a in enumerate([file1[c], file1[c].replace(m)])
}, axis=1)
col1 col2 col3
0 1 0 1 0 1
0 A001 8 B001 7 C001 2
1 A002 2 B002 4 C002 9
2 A003 4 B003 10 C003 8
3 A004 1 B004 11 C004 1
4 A005 8 B005 8 C005 7
5 A006 3 B006 3 C006 6
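The result above has two-level column labels. If the flat col1 to col6 header from the question is wanted, one option is to overwrite the column labels afterwards. A minimal sketch on a two-row excerpt of the data; it uses Series.map for the lookup instead of replace, which is a substitution made here for the illustration, and the name wide is made up.

```python
import pandas as pd

# Two-row excerpt of the question's data, for illustration only.
file1 = pd.DataFrame({
    "col1": ["A001", "A002"],
    "col2": ["B001", "B002"],
    "col3": ["C001", "C002"],
})
file2 = pd.DataFrame({
    "col1": ["A001", "A002", "B001", "B002", "C001", "C002"],
    "col2": [8, 2, 7, 4, 2, 9],
})

m = dict(zip(file2["col1"], file2["col2"]))

# Same dict-of-Series concat as the recipe, giving (name, 0/1) column pairs.
wide = pd.concat({
    (c, i): a
    for c in file1
    for i, a in enumerate([file1[c], file1[c].map(m)])
}, axis=1)

# Overwrite the two-level labels with flat names col1..col6.
wide.columns = [f"col{i}" for i in range(1, wide.shape[1] + 1)]
print(wide)
```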