Get common values within multiple groupby in pandas Dataframe

I am new to pandas DataFrames and would like to find the values of 'col2' that are common to more than one group when the data is grouped by 'col1'.

 col1    col2
  a       abc
          pqr
          xyz

  b       abc      
          def
          bcd

  c       bcd
          efg

The output should be as follows:

     abc      [a,b]
     bcd      [b,c]

Can anyone help me with the solution?

Thanks.

Use:

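import numpy as np

# Blank 'col1' cells belong to the group above: convert them to NaN, then forward fill
df['col1'] = df['col1'].replace('', np.nan).ffill()

# For each 'col2' value, collect the 'col1' groups it appears in
s = df.groupby('col2')['col1'].apply(list)
# Keep only the values that appear in more than one group
s = s[s.str.len() > 1].reset_index()
print(s)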
  col2    col1
0  abc  [a, b]
1  bcd  [b, c]

Explanation:

  1. First replace the empty values with NaN and forward fill the NaNs, so every row carries its col1 label.
  2. For each value of col2, aggregate the matching col1 values into a list.
  3. Filter the lists by length using boolean indexing, keeping those with more than one element.
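
The three steps can be traced end to end on the data from the question. This is only a sketch: the DataFrame below is reconstructed from the printed table, and it assumes the blank col1 cells are stored as empty strings (if they are already NaN, the replace call is simply a no-op). The names mask and result are introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; blank 'col1' cells are
# assumed to be empty strings (an assumption, not stated in the post).
df = pd.DataFrame({
    'col1': ['a', '', '', 'b', '', '', 'c', ''],
    'col2': ['abc', 'pqr', 'xyz', 'abc', 'def', 'bcd', 'bcd', 'efg'],
})

# Step 1: empty strings -> NaN, then forward fill the group labels
df['col1'] = df['col1'].replace('', np.nan).ffill()

# Step 2: one list of 'col1' labels per distinct 'col2' value
s = df.groupby('col2')['col1'].apply(list)

# Step 3: boolean mask that is True where the list has more than one entry
mask = s.str.len() > 1
result = s[mask].reset_index()
print(result)
```

Note that the length test counts list entries, so a col2 value repeated inside a single col1 group would also pass the filter; the sample data has no such repeats.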
