
How can I count list values in a DataFrame?

I have a DataFrame that looks like this:

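import pandas as pd

df = pd.DataFrame({'id': ['T01', 'T01', 'T01', 'T02', 'T02', 'T03', 'T03'],
                   'event_list': [(['a', 'b']),
                            (['a', 'c']),
                            (['a', 'b', 'c']),
                            (['a', 'b']),  # assumed: the two T02 rows are missing here,
                            (['a']),       # reconstructed from the expected counts below
                            (['a', 'b', 'c']),
                            (['b', 'c'])]})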

I want to group by the id column and count the elements inside the lists, so the desired output looks like this:

df = pd.DataFrame({'id': ['T01','T01','T01','T02','T02', 'T03', 'T03','T03'],
                   'event': ['a','b','c','a','b','a','b','c'],
                   'count': [3,2,2,2,1,1,2,2],})

Using pandas' newer features (both available since pandas 0.25), we can combine explode with pd.NamedAgg to recreate your expected output in the desired order:
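A minimal sketch of that combination, not necessarily the answerer's exact code. The two T02 lists in the sample data are an assumption chosen to match the expected counts, and the helper column `index` (created by `reset_index`) is only there to give the aggregation a non-key column to count:

```python
import pandas as pd

# Sample data; the two T02 lists are assumed, picked to match the expected counts.
df = pd.DataFrame({
    "id": ["T01", "T01", "T01", "T02", "T02", "T03", "T03"],
    "event_list": [["a", "b"], ["a", "c"], ["a", "b", "c"],
                   ["a", "b"], ["a"],
                   ["a", "b", "c"], ["b", "c"]],
})

# explode gives one row per (id, event); reset_index keeps the original row
# label as a helper column named "index" that can be counted.
exploded = df.explode("event_list").reset_index()

# Named aggregation: the output column "count" counts the helper column
# within each (id, event_list) group. Groups come back sorted by key.
counts = exploded.groupby(["id", "event_list"]).agg(
    count=pd.NamedAgg(column="index", aggfunc="count")
)
print(counts)
```

Calling `counts.reset_index().rename(columns={"event_list": "event"})` afterwards should give flat id, event and count columns.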



id  event_list       
T01 a               3
    b               2
    c               2
T02 a               2
    b               1
T03 a               1
    b               2
    c               2

Just try:

out = (df.explode('event_list').value_counts()
       .reset_index(name='count')  # turn the counted Series back into a DataFrame
       .rename({'event_list': 'event'}, axis=1)
       .sort_values(['id', 'event'], ignore_index=True))

    id event  count
0  T01     a      3
1  T01     b      2
2  T01     c      2
3  T02     a      2
4  T02     b      1
5  T03     a      1
6  T03     b      2
7  T03     c      2

Or with groupby and size:

out = (df.explode('event_list')
       .groupby(['id', 'event_list'])
       .size()
       .reset_index(name='count')
       .rename(columns={'event_list': 'event'}))

Another way:

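from collections import Counter

# Turn each list into a Counter, then add the Counters up per id.
temp = df.assign(event_list=df['event_list'].apply(Counter)).groupby('id').agg('sum')
# Expand each Counter into columns, then stack back to one count per (id, event).
out = temp['event_list'].apply(pd.Series).stack()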


T01  a    3.0
     b    2.0
     c    2.0
T02  a    2.0
     b    1.0
T03  a    1.0
     b    2.0
     c    2.0
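
The counts above are floats, presumably because expanding the Counters into columns leaves a NaN for the missing T02/c combination before stacking. A sketch that sums the Counters in plain Python instead keeps the counts as integers and yields the id/event/count layout from the question. The two T02 lists in the sample data and the names `totals` and `out` are assumptions for illustration:

```python
from collections import Counter

import pandas as pd

# Sample data; the two T02 lists are assumed, picked to match the expected counts.
df = pd.DataFrame({
    "id": ["T01", "T01", "T01", "T02", "T02", "T03", "T03"],
    "event_list": [["a", "b"], ["a", "c"], ["a", "b", "c"],
                   ["a", "b"], ["a"],
                   ["a", "b", "c"], ["b", "c"]],
})

# Accumulate one Counter per id; dicts keep insertion order, so ids stay
# in order of first appearance.
totals = {}
for key, events in zip(df["id"], df["event_list"]):
    totals.setdefault(key, Counter()).update(events)

# Flatten to one row per (id, event), with events sorted within each id.
out = pd.DataFrame(
    [(key, event, n)
     for key, counter in totals.items()
     for event, n in sorted(counter.items())],
    columns=["id", "event", "count"],
)
print(out)
```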

