Python: filling dataframe values based on column indices stored in another dataframe
I have 2 dataframes like below. Dataframe df1:
id val1 val2 val3 val4 val5
abc 0.0 1.0 4.0 3.0 4.0
dsssd 0.0 1.0 1.0 1.0 1.0
dsd 0.0 4.0 7.0
Another dataframe df2:
id val1 val2 val3 val4 val5
abc 88 76 55 43 21
dsssd 92.4 21.3 22 45 49
dsd 22.3 87.2 78.2
df1 contains column indices as values. I want to create df3, which holds the value from df2 found at the corresponding column index in the same row.
Expected results df3:
id val1 val2 val3 val4 val5
abc 88 76 21 43 21
dsssd 92.4 21.3 21.3 21.3 21.3
dsd 22.3 nan nan
I have explored df.lookup and iloc, but couldn't work out how it can be done. I am still looking for a solution. Meanwhile I am posting it here in case anyone knows how it's done.
import pandas as pd
import numpy as np
df1 = pd.DataFrame({'id': ['abs', 'dssd', 'dsd'],
                    'val1': [0.0, 0.0, 0.0],
                    'val2': [1.0, 1.0, 4.0],
                    'val3': [4.0, 1.0, 7.0],
                    'val4': [3.0, 1.0, np.nan],
                    'val5': [4.0, 1.0, np.nan]})
df2 = pd.DataFrame({'id': ['abs', 'dssd', 'dsd'],
                    'val1': [88.0, 92.4, 22.3],
                    'val2': [76.0, 21.3, 87.2],
                    'val3': [55.0, 22.0, 78.2],
                    'val4': [43.0, 45.0, np.nan],
                    'val5': [21.0, 49.0, np.nan]})
Thanks!
You can use DataFrame.set_index with DataFrame.stack to reshape both dataframes, add a counter column with GroupBy.cumcount, left join with DataFrame.merge, and finally pivot with DataFrame.pivot, restoring the original order of id with DataFrame.reindex:
df11 = df1.set_index('id').stack().rename_axis(index=['id','v']).reset_index(name='idx')
# print (df11)
df22 = df2.set_index('id').stack().rename_axis(index=['id','v']).reset_index(name='val')
df22['idx'] = df22.groupby('id').cumcount()
# print (df22)
df = (df11.merge(df22, on=['id','idx'], how='left')
          .pivot(index='id', columns='v_x', values='val')
          .reindex(df1['id'])
          .rename_axis(None, axis=1)
          .reset_index()
     )
print(df)
id val1 val2 val3 val4 val5
0 abs 88.0 76.0 21.0 43.0 21.0
1 dssd 92.4 21.3 21.3 21.3 21.3
2 dsd 22.3 NaN NaN NaN NaN
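For comparison, the same positional lookup can also be sketched directly with NumPy. This is not part of the answer above, only a minimal sketch under some assumptions: the values in df1 are 0-based column positions, ids are unique in df2, and any position that is NaN or out of range should yield NaN. The names cols, safe, valid and df3 are chosen here for illustration.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'id': ['abs', 'dssd', 'dsd'],
                    'val1': [0.0, 0.0, 0.0],
                    'val2': [1.0, 1.0, 4.0],
                    'val3': [4.0, 1.0, 7.0],
                    'val4': [3.0, 1.0, np.nan],
                    'val5': [4.0, 1.0, np.nan]})
df2 = pd.DataFrame({'id': ['abs', 'dssd', 'dsd'],
                    'val1': [88.0, 92.4, 22.3],
                    'val2': [76.0, 21.3, 87.2],
                    'val3': [55.0, 22.0, 78.2],
                    'val4': [43.0, 45.0, np.nan],
                    'val5': [21.0, 49.0, np.nan]})

cols = [c for c in df1.columns if c != 'id']

# Positions from df1; values from df2, aligned to df1's row order by id
# (assumes ids are unique in df2).
idx = df1[cols].to_numpy(dtype=float)
vals = df2.set_index('id').reindex(df1['id'])[cols].to_numpy(dtype=float)

# Replace NaN positions with -1 so the array can be cast to int,
# then mark which positions are usable.
safe = np.where(np.isnan(idx), -1, idx).astype(int)
valid = (safe >= 0) & (safe < vals.shape[1])

# Look up row by row; invalid positions temporarily point at column 0
# and are overwritten with NaN afterwards.
picked = np.take_along_axis(vals, np.where(valid, safe, 0), axis=1)
result = np.where(valid, picked, np.nan)

df3 = pd.DataFrame(result, columns=cols)
df3.insert(0, 'id', df1['id'].to_numpy())
print(df3)
```

On the sample data this should reproduce the table above. The out-of-range position 7 in row dsd becomes NaN because of the valid mask, not because of anything stored in df2.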
Use merge:
https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html
pd.merge(df1, df2, how='outer', left_on='id', right_on='id',
left_index=False, right_index=False, sort=True,
suffixes=('_x', '_y'), copy=True, indicator=False,
validate=None)
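A merge on id alone only places df1's positions and df2's values side by side in each row; a lookup step is still needed to produce df3. The following is a rough sketch of how that step might look, not something stated in this answer. The suffixes '_idx' and '_val', the helper lookup, and the name val_cols are hypothetical, and the positions in df1 are assumed to be 0-based.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'id': ['abs', 'dssd', 'dsd'],
                    'val1': [0.0, 0.0, 0.0],
                    'val2': [1.0, 1.0, 4.0],
                    'val3': [4.0, 1.0, 7.0],
                    'val4': [3.0, 1.0, np.nan],
                    'val5': [4.0, 1.0, np.nan]})
df2 = pd.DataFrame({'id': ['abs', 'dssd', 'dsd'],
                    'val1': [88.0, 92.4, 22.3],
                    'val2': [76.0, 21.3, 87.2],
                    'val3': [55.0, 22.0, 78.2],
                    'val4': [43.0, 45.0, np.nan],
                    'val5': [21.0, 49.0, np.nan]})

val_cols = [c for c in df2.columns if c != 'id']

# Left merge keeps df1's row order; each row now carries both the
# positions (val*_idx) and the candidate values (val*_val).
merged = pd.merge(df1, df2, on='id', how='left', suffixes=('_idx', '_val'))

def lookup(row):
    """Hypothetical helper: pick, for each column, the value at the stored position."""
    values = row[[c + '_val' for c in val_cols]].to_numpy(dtype=float)
    out = {}
    for c in val_cols:
        pos = row[c + '_idx']
        # NaN or out-of-range positions produce NaN.
        ok = pd.notna(pos) and 0 <= int(pos) < len(values)
        out[c] = values[int(pos)] if ok else np.nan
    return pd.Series(out)

df3 = pd.concat([merged[['id']], merged.apply(lookup, axis=1)], axis=1)
print(df3)
```

A row-wise apply like this is easy to read but is likely to be slow on large frames, so it is best treated as an illustration of what the merge leaves undone.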