Let's say I have the following dataframe:
import numpy as np
import pandas as pd

df2 = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
                          'foo', 'bar', 'foo', 'foo'],
                    'B': ['one', 'one', 'two', 'three',
                          'two', 'two', 'one', 'three'],
                    'C': np.random.randn(8), 'D': np.random.randn(8)})
df2.head()
Which looks like the following:
     A      B         C         D
0  foo    one  0.613774  0.783539
1  bar    one -0.937659 -0.913213
2  foo    two -1.568537  1.569597
3  bar  three -0.353449  1.108789
4  foo    two -1.769544  0.530466
I know that if I wanted to create another column which is the count of records for each value in column A, I could do the following:
df2['counts'] = df2.groupby('A')['B'].transform(np.size)
However, let's say I only want to count the unique elements of B grouped by A. I know how to do this if I were going to reduce the dataframe down to 2 columns (one for "foo" and one for "bar"), but how do I do this using transform?
Use GroupBy.transform with 'nunique':
df2['counts'] = df2.groupby('A')['B'].transform('nunique')
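To see what transform('nunique') does compared with the reduced result, here is a minimal sketch. The small frame `demo` and the names `per_group` and `mapped` are made up for illustration and are not from the question. It uses a smaller dataset because in the question's data both "foo" and "bar" happen to have three distinct values of B, which hides the difference between groups.

```python
import pandas as pd

# Hypothetical toy data (not the frame from the question)
demo = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'bar', 'foo'],
    'B': ['one', 'one', 'two', 'one', 'two'],
})

# Reduced form: one value per group, indexed by the values of A
per_group = demo.groupby('A')['B'].nunique()

# Broadcast form: the per-group value is repeated on every row of
# that group, so the result lines up with the original index
demo['counts'] = demo.groupby('A')['B'].transform('nunique')

# The same column can also be built by mapping the reduced result back
mapped = demo['A'].map(per_group)

print(per_group)
print(demo)
```

Here "foo" has two distinct values of B and "bar" has one, so every "foo" row gets 2 and every "bar" row gets 1. Mapping the reduced Series back onto A should give the same column, which is a handy cross-check.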