Merge columns from two data frames with different shapes
I have two DataFrames such as these:
dfA
Out[191]:
   a  b  c  d
0  N  M  1  3
1  S  F  2  4
1  S  F  2  4
And another one like this:
dfM
Out[192]:
   X  Y   d1   d2   d3
0  N  M  0.1  0.2  0.3
1  S  F  1.0  2.0  3.0
Now I want to merge these two to get a df like this:
   a  b  c  d    e
0  N  M  1  3  0.1
1  S  F  2  4  1.0
1  S  F  2  4  2.0
The merged df takes its e values from the d columns of dfM, filled according to how many times each row of dfA is repeated: the first occurrence of a row gets the d1 value of the matching dfM row, the second occurrence gets d2, and so on. How can I do this in Python?
One possible solution is, for each dfM row, to use its X and Y values to filter the matching dfA rows and set the 'e' column of those rows to the remaining values of that dfM row (its d columns), taking as many values as there are matching rows. Check the example below:
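for i, row in dfM.iterrows():
    # everything after X and Y: the d1, d2, d3 values of this row
    d_values = row.iloc[2:].tolist()
    # index labels of the dfA rows whose (a, b) match this row's (X, Y)
    indexes = list(dfA[(dfA.a == row.X) & (dfA.b == row.Y)].index)
    # give the k-th matching row the k-th d value
    dfA.loc[indexes, "e"] = d_values[:len(indexes)]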
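To try the loop approach end to end, here is a minimal self-contained sketch. It rebuilds the sample frames from the question and wraps the loop in a helper named fill_by_loop, which is a hypothetical name introduced only for this illustration. It assumes dfA has a unique index (0, 1, 2), since label-based assignment with .loc is ambiguous when index labels repeat, and it caps the assignment when a key appears more often in dfA than there are d columns.

```python
import pandas as pd

# Sample frames rebuilt from the question.
# Assumption: dfA carries a unique default index 0..2.
dfA = pd.DataFrame({"a": ["N", "S", "S"], "b": ["M", "F", "F"],
                    "c": [1, 2, 2], "d": [3, 4, 4]})
dfM = pd.DataFrame({"X": ["N", "S"], "Y": ["M", "F"],
                    "d1": [0.1, 1.0], "d2": [0.2, 2.0], "d3": [0.3, 3.0]})


def fill_by_loop(left, wide):
    """Hypothetical helper: return a copy of `left` with an 'e' column
    filled from the value columns of `wide`, one row of `wide` at a time."""
    out = left.copy()                      # leave the caller's frame untouched
    out["e"] = float("nan")                # start with an empty float column
    value_cols = [c for c in wide.columns if c not in ("X", "Y")]
    for _, row in wide.iterrows():
        values = row[value_cols].tolist()  # d1, d2, d3 for this key
        # index labels of the rows in `out` matching this (X, Y) pair
        idx = out.index[(out["a"] == row["X"]) & (out["b"] == row["Y"])]
        n = min(len(idx), len(values))     # never assign more values than exist
        out.loc[idx[:n], "e"] = values[:n]
    return out


result_loop = fill_by_loop(dfA, dfM)
print(result_loop)
```

Rows of dfA that repeat more often than there are d columns simply keep NaN in this sketch. Looping with iterrows is easy to follow but slow on large frames.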
You can use cumcount to build a helper counter column, then merge on it with a left join. The second DataFrame is first reshaped from wide to long with melt:
dfA['groups'] = dfA.groupby(['a','b']).cumcount()
dfM1 = dfM.melt(['X','Y'], value_name='e')
dfM1['groups'] = dfM1.groupby(['X','Y']).cumcount()
print(dfM1)
   X  Y variable    e  groups
0  N  M       d1  0.1       0
1  S  F       d1  1.0       0
2  N  M       d2  0.2       1
3  S  F       d2  2.0       1
4  N  M       d3  0.3       2
5  S  F       d3  3.0       2
d = {'X': 'a', 'Y': 'b'}
df = (dfA.merge(dfM1.rename(columns=d), on=['a', 'b', 'groups'], how='left')
         .drop(['variable', 'groups'], axis=1))
print(df)
   a  b  c  d    e
0  N  M  1  3  0.1
1  S  F  2  4  1.0
2  S  F  2  4  2.0
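The same cumcount, melt and merge steps can be packed into one function that does not add a helper column to dfA. This is only a sketch: fill_by_cumcount and the temporary column name _n are hypothetical names chosen for the example, and it assumes that neither frame already has columns called e, variable or _n.

```python
import pandas as pd

# Sample frames rebuilt from the question (unique default index assumed).
dfA = pd.DataFrame({"a": ["N", "S", "S"], "b": ["M", "F", "F"],
                    "c": [1, 2, 2], "d": [3, 4, 4]})
dfM = pd.DataFrame({"X": ["N", "S"], "Y": ["M", "F"],
                    "d1": [0.1, 1.0], "d2": [0.2, 2.0], "d3": [0.3, 3.0]})


def fill_by_cumcount(left, wide, left_keys=("a", "b"), wide_keys=("X", "Y")):
    """Hypothetical helper: left-join `left` with the melted `wide` frame,
    pairing the k-th repeat of a key with the k-th value column."""
    left_keys, wide_keys = list(left_keys), list(wide_keys)
    # 0 for the first occurrence of each key in `left`, 1 for the second, ...
    counter = left.groupby(left_keys).cumcount()
    # wide -> long: one row per (key, value column), in column order d1, d2, d3
    long = wide.melt(id_vars=wide_keys, value_name="e")
    long["_n"] = long.groupby(wide_keys).cumcount()
    long = (long.rename(columns=dict(zip(wide_keys, left_keys)))
                .drop(columns="variable"))
    # join on the key columns plus the occurrence counter, then drop the counter
    merged = left.assign(_n=counter).merge(long, on=left_keys + ["_n"], how="left")
    return merged.drop(columns="_n")


result = fill_by_cumcount(dfA, dfM)
print(result)
```

Because the counter is attached with assign, dfA itself keeps its original four columns. As with any merge, the result gets a fresh 0..n-1 index.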