I was wondering if there is a better way to do this. I have some categories, and I want to find the unique combination of categories for each val, then count how many times each combination occurs. The astype(str) call irks me.
import pandas as pd

df = pd.DataFrame(
    {
        'cat': ['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b'],
        'val': [1, 1, 2, 2, 3, 4, 5, 5]
    }
)
df.groupby('val')['cat'].apply(lambda x: set(x)).astype(str).value_counts()
Out:
{'a', 'b'} 2
{'c', 'a'} 1
{'b'} 1
{'c'} 1
Name: cat, dtype: int64
The following does not give the desired result, because the two [a, b] groups are counted separately:
df.groupby('val')['cat'].unique().value_counts()
Out:
[b] 1
[c, a] 1
[a, b] 1
[c] 1
[a, b] 1
You can use GroupBy.agg to collect each group into a tuple or a frozenset. Both are hashable, so you can then call value_counts on the result:
df.groupby('val').agg(tuple).value_counts()
# _.agg(frozenset).value_counts() works fine too.
cat
(a, b) 2
(a, c) 1
(b) 1
(c) 1
dtype: int64
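One caveat worth spelling out: a tuple keeps row order and duplicates, so ('a', 'b') and ('b', 'a') would be counted as different combinations, while a frozenset treats them as the same. Below is a minimal sketch of the frozenset variant on the question's data. It assumes the goal is order-insensitive combinations, selects the 'cat' column so the result is a Series, and keeps the question's apply call with frozenset swapped in for set. The names combos and counts are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'cat': ['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b'],
        'val': [1, 1, 2, 2, 3, 4, 5, 5],
    }
)

# One frozenset of categories per val. frozenset is hashable,
# ignores order, and drops duplicates within a group.
combos = df.groupby('val')['cat'].apply(frozenset)

# Count how many val groups share each combination.
counts = combos.value_counts()

# Plain dict keyed by frozenset, handy for lookups.
counts_by_combo = counts.to_dict()
print(counts_by_combo[frozenset({'a', 'b'})])  # 2
```

No astype(str) is needed here, since value_counts can hash the frozensets directly.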