简体   繁体   English

在 groupby 中使用 isin

[英]use isin inside groupby

I have two dataframes df1 is:我有两个数据帧df1是:

    uid   a   b
0   111   i1   2
1   111   i2   3
2   111   i4   5
3   111   i6   7
4   111   i5   8
5   222   i10   9
6   222   i11  12
7   222   i13  11
8   222   i2   1
9   333   i14  16
10  333   i3   2
11  333   i16  19
12  333   i3   4
13  444   i21  20
14  444   i9   5

and my df2:和我的 df2:

    uid   a   
0   111   i1   
1   111   i2   
2   222   i4   
3   222   i11  
4   222   i13  
5   222   i2   

I want to check if the df2.a is in df1.a group by uid , if it exists then I want to create column "new" and put 1 in this column我想通过uid检查df2.a是否在df1.a组中,如果它存在,那么我想创建列"new"并将 1 放在此列中

My expected output is:我的预期输出是:

    uid   a   b.  new 
0   111   i1   2.  1
1   111   i2   3.  1
2   111   i4   5   0
3   111   i6   7.  0
4   111   i5   8.  0
5   222   i10   9. 0
6   222   i11  12. 1
7   222   i13  11. 1
8   222   i2   1.  0
9   333   i14  16. 0
10  333   i3   2.  0
11  333   i16  19. 0
12  333   i3   4.  0
13  444   i21  20. 0
14  444   i9   5.  0

Does anyone knows how to do it please?请问有人知道怎么做吗?

Let us do assign让我们做assign

out = df1.merge(df2.assign(new=1),how='left').fillna({'new':0})
Out[255]: 
    uid    a   b  new
0   111   i1   2  1.0
1   111   i2   3  1.0
2   111   i4   5  0.0
3   111   i6   7  0.0
4   111   i5   8  0.0
5   222  i10   9  0.0
6   222  i11  12  1.0
7   222  i13  11  1.0
8   222   i2   1  1.0
9   333  i14  16  0.0
10  333   i3   2  0.0
11  333  i16  19  0.0
12  333   i3   4  0.0
13  444  i21  20  0.0
14  444   i9   5  0.0

Try a left merge with indicator then replace the indicator with 1 or 0 based on values in both tables with something like np.where :尝试与指标进行左merge ,然后根据两个表中的值merge指标替换为10 ,例如np.where

new_df = pd.merge(df1, df2, on=['uid', 'a'], how='left', indicator='new')
new_df['new'] = np.where(new_df['new'].eq('both'), 1, 0)

Option 2: get boolean values with eq + astype :选项 2:使用eq + astype获取布尔值:

new_df = pd.merge(df1, df2, on=['uid', 'a'], how='left', indicator='new')
new_df['new'] = new_df['new'].eq('both').astype(int)

new_df : new_df

    uid    a   b  new
0   111   i1   2    1
1   111   i2   3    1
2   111   i4   5    0
3   111   i6   7    0
4   111   i5   8    0
5   222  i10   9    0
6   222  i11  12    1
7   222  i13  11    1
8   222   i2   1    1
9   333  i14  16    0
10  333   i3   2    0
11  333  i16  19    0
12  333   i3   4    0
13  444  i21  20    0
14  444   i9   5    0

Use set_index() , isin() and display results as integers:使用set_index() isin()并显示结果为整数:

df1['new'] = (df1.set_index(['uid', 'a']).index
                 .isin(df2.set_index(['uid', 'a']).index)
                 .astype(int)
             )

Output:输出:

    uid    a     b  new
0   111   i1     2    1
1   111   i2     3    1
2   111   i4     5    0
3   111   i6     7    0
4   111   i5     8    0
5   222  i10     9    0
6   222  i11    12    1
7   222  i13    11    1
8   222   i2     1    1
9   333  i14    16    0
10  333   i3     2    0
11  333  i16    19    0
12  333   i3     4    0
13  444  i21    20    0
14  444   i9     5    0

I would left merge with an indicator.我会留下一个指标合并。

>>> df1.merge(df2, on=['uid', 'a'], how='left', indicator=True)

    uid    a   b     _merge
0   111   i1   2       both
1   111   i2   3       both
2   111   i4   5  left_only
3   111   i6   7  left_only
4   111   i5   8  left_only
5   222  i10   9  left_only
6   222  i11  12       both
7   222  i13  11       both
8   222   i2   1       both
9   333  i14  16  left_only
10  333   i3   2  left_only
11  333  i16  19  left_only
12  333   i3   4  left_only
13  444  i21  20  left_only
14  444   i9   5  left_only

Then convert the indicator column to your variable, renaming it df.rename(columns={'_merge', 'new'}, inplace=True) and then replacing elements in the column with df['_merge'].replace({'both': 1, 'left_only': 0}, inplace=True) .然后将指标列转换为您的变量,将其重命名为df.rename(columns={'_merge', 'new'}, inplace=True)然后用df['_merge'].replace({'both': 1, 'left_only': 0}, inplace=True)

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM