Finding intersection of values in a column associated with unique values in another column Pandas

Question

If I have a DataFrame like this (very minimal example)

  col1  col2
0    a     1
1    a     2
2    b     1
3    b     2
4    b     4
5    c     1
6    c     2
7    c     3

and I want the intersection of all col2 values when they are related to their unique col1 values (so in this case, the intersection would be [1,2] ), how can I do so with Pandas? Another way to word this would be the values in col2 that exist for every unique value in col1 .

My ( bad ) solution was to get the unique col1 elements with unique , and then build dictionaries from each unique element in col1 and then take the set intersection of those dictionary values. I feel like there is a mechanism I should be using to relate the columns together however that could make this much easier.

Answer 1

One way is to use pivot_table :

In [11]: cross = df.pivot_table(index="col1", columns="col2", aggfunc='size') == 1

In [12]: cross
Out[12]:
col2     1     2      3      4
col1
a     True  True  False  False
b     True  True  False   True
c     True  True   True  False

In [13]: cross.all()
Out[13]:
col2
1     True
2     True
3    False
4    False
dtype: bool

In [14]: cross.columns[cross.all()]
Out[14]: Int64Index([1, 2], dtype='int64', name='col2')

Answer 2

Another solution:

print df.pivot_table(index="col1", columns="col2", aggfunc=len)
col2    1    2    3    4
col1                    
a     1.0  1.0  NaN  NaN
b     1.0  1.0  NaN  1.0
c     1.0  1.0  1.0  NaN

ser = (df.pivot_table(index="col1", columns="col2", aggfunc=len) == 1).all()
print  ser.index[ser]
Int64Index([1, 2], dtype='int64', name=u'col2')

Finding intersection of values in a column associated with unique values in another column Pandas

Question

2 answers

solution1
2 ACCPTED 2016-05-13 07:04:31

solution2
1 2016-05-13 07:08:25

Finding intersection of values in a column associated with unique values in another column Pandas

Question

2 answers

solution1 2 ACCPTED 2016-05-13 07:04:31

solution2 1 2016-05-13 07:08:25

solution1
2 ACCPTED 2016-05-13 07:04:31

solution2
1 2016-05-13 07:08:25