Count Distinct Using Pandas Transform

Question

Let's say I have the following dataframe:

import numpy as np
import pandas as pd

df2 = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
                          'foo', 'bar', 'foo', 'foo'],
                   'B' : ['one', 'one', 'two', 'three',
                         'two', 'two', 'one', 'three'],
                   'C' : np.random.randn(8), 'D' : np.random.randn(8)})
df2.head()

Which looks like the following:

     A      B         C         D
0  foo    one  0.613774  0.783539
1  bar    one -0.937659 -0.913213
2  foo    two -1.568537  1.569597
3  bar  three -0.353449  1.108789
4  foo    two -1.769544  0.530466

I know that if I wanted to create another column which is the count of records for each value in column A, I could do the following:

df2['counts'] = df2.groupby('A')['B'].transform(np.size)

However, suppose I only want to count the unique values of B within each group of A. I know how to do this if I reduce the dataframe down to one entry per group (one for "foo" and one for "bar"), but how do I do it using transform?

Answer 1

Use GroupBy.transform with nunique:

df2['counts'] = df2.groupby('A')['B'].transform('nunique')
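
For comparison, here is a minimal self-contained sketch that puts the transform call next to the "reduce, then map back" approach the question alludes to. Columns C and D are left out because they do not affect the count, and the names per_group and counts_via_map are only illustrative, not part of the original post.

```python
import pandas as pd

# Same A and B columns as in the question; C and D are omitted here.
df2 = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
    'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
})

# transform('nunique') computes the distinct count of B per group of A
# and broadcasts it back so the result has one value per original row.
df2['counts'] = df2.groupby('A')['B'].transform('nunique')

# Equivalent two-step form: reduce to one value per group, then map it
# back onto the rows through column A (hypothetical helper names).
per_group = df2.groupby('A')['B'].nunique()
df2['counts_via_map'] = df2['A'].map(per_group)

print(df2)
```

With this particular data both "foo" and "bar" contain the three distinct B values one, two and three, so every row gets a count of 3. The two columns should always agree; transform simply saves the explicit map step.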
