
Pandas merge on aggregated columns

Let's say I create a DataFrame:

import pandas as pd
df = pd.DataFrame({"a": [1,2,3,13,15], "b": [4,5,6,6,6], "c": ["wish", "you","were", "here", "here"]})

Like so:

    a   b   c
0   1   4   wish
1   2   5   you
2   3   6   were
3   13  6   here
4   15  6   here

... and then group and aggregate by a couple of columns ...

gb = df.groupby(['b','c']).agg({"a": lambda x: x.nunique()})

Yielding the following result:

            a
b   c   
4   wish    1
5   you     1
6   here    2
    were    1

Is it possible to merge df with the newly aggregated table gb so that df gets a new column containing the corresponding values from gb? Like this:

    a   b   c      nc
0   1   4   wish    1
1   2   5   you     1
2   3   6   were    1
3   13  6   here    2
4   15  6   here    2

I tried doing the simplest thing:

df.merge(gb, on=['b','c'])

But this gives the error:

KeyError: 'b'

This makes sense, because the grouped table has a MultiIndex and b is not a column. So my question is two-fold:

  1. Can I transform the multi-index of the gb DataFrame back into columns (so that it has the b and c column)?
  2. Can I merge df with gb on the column names?

Whenever you want to add an aggregated column from a groupby operation back to the original df, you should use transform. It produces a Series whose index is aligned with the original df:

In [4]:

df['nc'] = df.groupby(['b','c'])['a'].transform(pd.Series.nunique)
df
Out[4]:
    a  b     c  nc
0   1  4  wish   1
1   2  5   you   1
2   3  6  were   1
3  13  6  here   2
4  15  6  here   2

There is no need to reset the index or perform an additional merge.
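To make the difference concrete, here is a minimal sketch contrasting agg and transform on the question's data. The variable names per_group and per_row are only illustrative, and the string alias "nunique" is used in place of pd.Series.nunique on the assumption that the two behave the same here:

```python
import pandas as pd

df = pd.DataFrame({
    "a": [1, 2, 3, 13, 15],
    "b": [4, 5, 6, 6, 6],
    "c": ["wish", "you", "were", "here", "here"],
})

grouped = df.groupby(["b", "c"])["a"]

# agg collapses each group to a single row, indexed by the group keys (b, c).
per_group = grouped.agg("nunique")

# transform broadcasts the group result back to every original row,
# so the result keeps df's own index.
per_row = grouped.transform("nunique")

# Because the indexes match, plain assignment lines the values up correctly.
df["nc"] = per_row
print(df)
```

Since per_row has one entry per row of df, the assignment needs no join step; per_group, by contrast, has one entry per (b, c) pair and would have to be merged back.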

There's a simple way of doing this using reset_index(), which turns the index levels b and c back into regular columns:

df.merge(gb.reset_index(), on=['b','c'])

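gives you the following, where a_x is the original a column and a_y holds the aggregated values (pandas adds the suffixes because both frames have a column named a):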

   a_x  b    c    a_y
0    1  4  wish    1
1    2  5   you    1
2    3  6  were    1
3   13  6  here    2
4   15  6  here    2
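If the a_x / a_y suffixes are unwanted, one option is to rename the aggregated column before merging. The following is a sketch rather than part of the original answer: the name counts is hypothetical, nc is taken from the question's desired output, and how="left" is an added choice to keep every row of df in its original order:

```python
import pandas as pd

df = pd.DataFrame({
    "a": [1, 2, 3, 13, 15],
    "b": [4, 5, 6, 6, 6],
    "c": ["wish", "you", "were", "here", "here"],
})

gb = df.groupby(["b", "c"]).agg({"a": "nunique"})

# reset_index moves the index levels b and c into ordinary columns;
# renaming a -> nc avoids the name clash with df's own a column.
counts = gb.reset_index().rename(columns={"a": "nc"})

# A left merge keeps every row of df and attaches the matching count.
merged = df.merge(counts, on=["b", "c"], how="left")
print(merged)
```

This addresses both parts of the question: reset_index() restores b and c as columns, and the merge then works on those column names.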
