Pandas: Conditionally combining MultiIndex DataFrames
I have two DataFrames (df1 and df2), both produced with groupby from the same DataFrame. I want to conditionally create another DataFrame (df_result) based on the following logic:
INPUT1: df1
Type_lv1  Type_lv2  diff
                    count  mean
t1        t_a       1      0.02
          t_b       12     0.01
t2        t_a       5      0.12
          t_b       22     0.11
INPUT2: df2
Type_lv1  diff
          count  mean
t1        13     0.011
t2        27     0.11
OUTPUT: df_result
Type_lv1  Type_lv2  diff_result
                    mean
t1        t_a       0.011
          t_b       0.01
t2        t_a       0.11
          t_b       0.11
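For anyone who wants to reproduce this, the two input frames can be rebuilt directly from the values displayed above. This is only a sketch: the original data behind the groupby is not shown, so the frames are constructed by hand, and the helper names columns, idx1 and idx2 are introduced here for illustration.

```python
import pandas as pd

# Two-level column labels: ('diff', 'count') and ('diff', 'mean').
columns = pd.MultiIndex.from_product([["diff"], ["count", "mean"]])

# df1: grouped by both Type_lv1 and Type_lv2 (assumed, hand-built).
idx1 = pd.MultiIndex.from_product(
    [["t1", "t2"], ["t_a", "t_b"]], names=["Type_lv1", "Type_lv2"]
)
df1 = pd.DataFrame(
    [[1, 0.02], [12, 0.01], [5, 0.12], [22, 0.11]],
    index=idx1,
    columns=columns,
)

# df2: grouped by Type_lv1 only (assumed, hand-built).
idx2 = pd.Index(["t1", "t2"], name="Type_lv1")
df2 = pd.DataFrame([[13, 0.011], [27, 0.11]], index=idx2, columns=columns)

print(df1)
print(df2)
```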
In the sample data above, the first and third rows of df_result come from df2 because the corresponding counts in df1 are 1 and 5, which are below the threshold of 6. I have looked for related examples in past questions but could not get what I want. Could someone give me some directions?
Thanks!!
The idea is to add the second level from df1 to df2 so that both share the same MultiIndex; the values can then be replaced by condition in the last step. Only the first level of the MultiIndex needs to match in both DataFrames:
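import pandas as pd

# Rows of df1 whose count is below the threshold of 6
m = df1[('diff', 'count')] < 6
# Every (Type_lv1, Type_lv2) pair, so df2 can be expanded to two levels
mux = pd.MultiIndex.from_product([df2.index.unique(),
                                  df1.index.get_level_values(1).unique()])
# Repeat each df2 row for every second-level label, matching on level 0
df3 = df2.reindex(mux, level=0)
# Overwrite the flagged means with the matching Type_lv1 means from df2
df1.loc[m, ('diff', 'mean')] = df3[('diff', 'mean')]
print(df1)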
                   diff
                  count   mean
Type_lv1 Type_lv2
t1       t_a          1  0.011
         t_b         12  0.010
t2       t_a          5  0.110
         t_b         22  0.110
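If df1 should stay unmodified and a separate df_result is wanted, as in the question, one possible variant looks up each row's Type_lv1 mean in df2 and chooses between the two values with Series.where. This is a sketch under assumptions rather than the method used above: the sample frames are rebuilt by hand from the tables in the question, and THRESHOLD, lv1, fallback and enough are names introduced here for illustration.

```python
import pandas as pd

# Hand-built copies of the question's sample frames (assumed values).
columns = pd.MultiIndex.from_product([["diff"], ["count", "mean"]])
idx1 = pd.MultiIndex.from_product(
    [["t1", "t2"], ["t_a", "t_b"]], names=["Type_lv1", "Type_lv2"]
)
df1 = pd.DataFrame(
    [[1, 0.02], [12, 0.01], [5, 0.12], [22, 0.11]], index=idx1, columns=columns
)
df2 = pd.DataFrame(
    [[13, 0.011], [27, 0.11]],
    index=pd.Index(["t1", "t2"], name="Type_lv1"),
    columns=columns,
)

THRESHOLD = 6  # minimum count for a df1 mean to be kept

# For each row of df1, fetch the mean of its Type_lv1 group from df2.
lv1 = df1.index.get_level_values("Type_lv1")
fallback = df2[("diff", "mean")].reindex(lv1).to_numpy()

# Keep df1's own mean where the count is large enough, else use df2's.
enough = df1[("diff", "count")] >= THRESHOLD
result_mean = df1[("diff", "mean")].where(enough, fallback)

# Wrap the result in a frame with the column labels from the question.
df_result = result_mean.to_frame()
df_result.columns = pd.MultiIndex.from_tuples([("diff_result", "mean")])
print(df_result)
```

Because where returns a new Series, df1 keeps its original means, which can be convenient if the grouped frames are reused later.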