
Is there a way in pandas to count values in one DataFrame (like COUNTIFS in Excel) and add the counts as a new column in another DataFrame of a different length?

I'm translating an Excel formula into pandas. I have two DataFrames, df1 and df2. I need to count how often each value occurs in the id column of df1, then fill a new column in df2 with that count wherever the counted id in df1 equals group_id in df2. How do I do the lookup and fill the new column in df2 with the counts from df1?

df1 :

      id      member        seq
0   48299      Koif          1
1   48299      Iki           1
2   48299      Juju          2
3   48299      PNik          3 
4   48865      Lok           1 
5   48865      Mkoj          2
6   48865      Kino          1
7   64865      Boni          1
8   64865      Afriya        2
9   50774      Amah          2
10  23697      Pilato        1
11  23697      Clems         1

df2 :

   group_id      group_name    count
0   48299      e_sys          
1   50774      Y3N
2   64865      nana
3   48865      juzti

df1 may contain members, for example Clems and Pilato, whose counts are not needed because their group is not in df2.

I can do the counting (see the code below). My problem is matching the counted id values in df1 against group_id in df2 and filling in the counts.

Counting:

 df1.groupby('id')['id'].count()

My current attempts are:

df2['count'] = df1[(df2['group_id'].isin(df1['id']))].count()

or

df2['count'] = df1[(df2['group_id'].isin(df1['id']))].transform('count')

Neither gives the desired result.

Desired result for df2 :

   group_id      group_name    count
0   48299      e_sys              4
1   50774      Y3N                1
2   64865      nana               2
3   48865      juzti              3

Use Series.map, passing the per-id counts as the lookup Series:

df2['count'] = df2['group_id'].map(df1.groupby('id')['id'].count())

Alternative with Series.value_counts :

df2['count'] = df2['group_id'].map(df1['id'].value_counts())

print (df2)
   group_id group_name  count
0     48299      e_sys      4
1     50774        Y3N      1
2     64865       nana      2
3     48865      juzti      3
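A self-contained sketch of the map approach, rebuilding the sample data from the question. The extra group_id 99999 (named empty_group here) and the fillna(0) step are assumptions added for illustration and are not part of the original data: map returns NaN for any group_id that has no rows in df1, so you may want to turn that into 0.

```python
import pandas as pd

# Sample data from the question (the seq column is omitted; it is not needed for counting)
df1 = pd.DataFrame({
    "id": [48299, 48299, 48299, 48299, 48865, 48865, 48865,
           64865, 64865, 50774, 23697, 23697],
    "member": ["Koif", "Iki", "Juju", "PNik", "Lok", "Mkoj", "Kino",
               "Boni", "Afriya", "Amah", "Pilato", "Clems"],
})
df2 = pd.DataFrame({
    # 99999 / empty_group is a hypothetical group with no members in df1
    "group_id": [48299, 50774, 64865, 48865, 99999],
    "group_name": ["e_sys", "Y3N", "nana", "juzti", "empty_group"],
})

# Series indexed by id, holding the number of rows per id
counts = df1["id"].value_counts()

# Look up each group_id in counts; unmatched ids become NaN, then 0
df2["count"] = df2["group_id"].map(counts).fillna(0).astype(int)
print(df2)
```

Ids that appear only in df1 (23697 here) are ignored, because map only looks up the values present in df2['group_id'].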

Merge the two DataFrames using a left join:

counts = df1.groupby('id').size().reset_index()
df2.merge(counts, how='left', left_on='group_id', right_on='id')

Output:

 #      group_id group_name     id  0
 #   0     48299      e_sys  48299  4
 #   1     50774        Y3N  50774  1
 #   2     64865       nana  64865  2
 #   3     48865      juzti  48865  3

The left join makes sure you keep only the counts for ids that appear in df2. Note that I used groupby().size() as a somewhat clearer and more concise way of counting.
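If the column named 0 and the duplicated id column in that output are unwanted, one possible variant is to name the size column during reset_index and drop id after the merge. This is a sketch under the assumption that df2 does not already have a count column; if it does, merge would suffix the two overlapping columns as count_x and count_y.

```python
import pandas as pd

# Sample ids from the question; only the id column matters for counting
df1 = pd.DataFrame({
    "id": [48299] * 4 + [48865] * 3 + [64865] * 2 + [50774] + [23697] * 2
})
df2 = pd.DataFrame({
    "group_id": [48299, 50774, 64865, 48865],
    "group_name": ["e_sys", "Y3N", "nana", "juzti"],
})

# One row per id, with the row count stored in a column named 'count'
counts = df1.groupby("id").size().reset_index(name="count")

# The left join keeps every row of df2; the helper id column is dropped afterwards
merged = (
    df2.merge(counts, how="left", left_on="group_id", right_on="id")
       .drop(columns="id")
)
print(merged)
```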
