How to merge two dataframes when one dataframe has more than one row matching the condition?
Assuming we have two dataframes:
import pandas as pd

df1 = pd.DataFrame(
    [["foo", 10], ["bar", 20]],
    columns=["1", "2"],
    index=["x", "y"],
)
df2 = pd.DataFrame(
    [["foo", 10, 20], ["bar", 20, 30], ["foo", 10, 30]],
    columns=["1", "2", "3"],
    index=["x", "y", "z"],
)
This gives us:
[image: output of df1 and df2]
If I merge these dataframes on two columns:
df3 = pd.merge(df1, df2, how='left', on=['1','2'])
This gives us:
[image: output of merged df3]
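Since the screenshot is not reproduced here, the following is a minimal sketch that rebuilds the two frames from the question and prints the merged result. It uses only the data given above; the comments describe what the left merge should produce, with the `foo`/`10` key appearing twice because `df2` has two matching rows.

```python
import pandas as pd

df1 = pd.DataFrame(
    [["foo", 10], ["bar", 20]],
    columns=["1", "2"],
    index=["x", "y"],
)
df2 = pd.DataFrame(
    [["foo", 10, 20], ["bar", 20, 30], ["foo", 10, 30]],
    columns=["1", "2", "3"],
    index=["x", "y", "z"],
)

# Left merge on both key columns. Every matching row in df2 is kept,
# so the ("foo", 10) key from df1 is repeated once per match.
df3 = pd.merge(df1, df2, how="left", on=["1", "2"])
print(df3)

# Rows of df3, in order: (foo, 10, 20), (foo, 10, 30), (bar, 20, 30)
```

Note that the merge also discards the original `x`/`y`/`z` index labels and returns a fresh integer index.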
If I wanted the mean of the values in df2 that match the condition to be output into df3, how would I go about doing that? (So instead of having two rows of foo & 10, I would have only one row of foo & 10, with the value in the third column being the average of the two rows that match the condition.) I provided a picture below for clarity:
If you want the mean, aggregate df2 by the key columns first, then merge:
df2 = df2.groupby(['1','2'], as_index=False).mean()
df3 = pd.merge(df1, df2, how='left', on=['1','2'])
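For completeness, here is a self-contained sketch of the same approach using the data from the question. Two details are my own choices rather than part of the answer: the aggregated frame is stored under a new name, `df2_mean`, so the original `df2` is not overwritten, and column `"3"` is selected explicitly before calling `mean()` so that any extra non-numeric columns would not interfere.

```python
import pandas as pd

df1 = pd.DataFrame(
    [["foo", 10], ["bar", 20]],
    columns=["1", "2"],
    index=["x", "y"],
)
df2 = pd.DataFrame(
    [["foo", 10, 20], ["bar", 20, 30], ["foo", 10, 30]],
    columns=["1", "2", "3"],
    index=["x", "y", "z"],
)

# Collapse df2 to one row per ("1", "2") key, averaging column "3".
# as_index=False keeps the keys as ordinary columns so they can be merged on.
df2_mean = df2.groupby(["1", "2"], as_index=False)["3"].mean()

# Each key is now unique in df2_mean, so the left merge
# yields exactly one row per row of df1.
df3 = pd.merge(df1, df2_mean, how="left", on=["1", "2"])
print(df3)

# Rows of df3, in order: (foo, 10, 25.0), (bar, 20, 30.0)
```

Column `"3"` comes back as floats because it holds a mean. Keys present in `df1` but absent from `df2` would get `NaN` in that column, since this is a left merge.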