Counting unique values with .groupby in a pandas dataframe

Question

I have a dataframe, and when I run my code it returns all NaNs instead of the counted values. I'm sure it's something simple, but I can't figure it out. I'm trying to get the number of unique species in each location. I'd like the new column to hold a count of species: [2,2,1,1,2,2,1,1]

import pandas as pd

df = pd.DataFrame({
         'ID': [1, 2, 3, 4, 5, 6, 7, 8],
         'location': ['A', 'A', 'C', 'C', 'E', 'E', 'E', 'E'],
         'Species': ['Cat', 'Cat', 'Dog', 'Cat', 'Cat', 'Cat', 'Dog', 'Bird'],
         'Count': [2,2,2,2,4,4,4,4]
})

def abundance(data):
    data["Abundance"] = data[data.Species.notnull()].groupby('location')['Species'].unique()

abundance(df)
print(df)
Output:
   ID location Species  Count Abundance
0   1        A     Cat      2       NaN
1   2        A     Cat      2       NaN
2   3        C     Dog      2       NaN
3   4        C     Cat      2       NaN
4   5        E     Cat      4       NaN
5   6        E     Cat      4       NaN
6   7        E     Dog      4       NaN
7   8        E    Bird      4       NaN

Answer 1

I believe you want the count of each (location, Species) pair. Also, to assign groupby output back to the original dataframe, we usually use transform:

df['Abundance'] = df.groupby(['location','Species']).Species.transform('size')

Output:

   ID location Species  Count  Abundance
0   1        A     Cat      2          2
1   2        A     Cat      2          2
2   3        C     Dog      2          1
3   4        C     Cat      2          1
4   5        E     Cat      4          2
5   6        E     Cat      4          2
6   7        E     Dog      4          1
7   8        E    Bird      4          1
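
For context on the NaNs in the question: the aggregated groupby result is indexed by location, while df is indexed by row number, so the assignment aligns on labels that never match. The sketch below reuses the question's data to show the mismatch, then compares transform('size') with transform('nunique'). The second variant is only an assumption about what "unique species per location" might mean; it does not produce the [2,2,1,1,2,2,1,1] the question asks for.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6, 7, 8],
    'location': ['A', 'A', 'C', 'C', 'E', 'E', 'E', 'E'],
    'Species': ['Cat', 'Cat', 'Dog', 'Cat', 'Cat', 'Cat', 'Dog', 'Bird'],
    'Count': [2, 2, 2, 2, 4, 4, 4, 4],
})

# The aggregated result has one entry per location, labelled 'A', 'C', 'E'.
per_location = df.groupby('location')['Species'].unique()
print(per_location.index.tolist())  # ['A', 'C', 'E']
print(df.index.tolist())            # [0, 1, 2, 3, 4, 5, 6, 7]

# transform returns one value per original row, so it aligns with df.
pair_size = df.groupby(['location', 'Species'])['Species'].transform('size')
print(pair_size.tolist())           # [2, 2, 1, 1, 2, 2, 1, 1]

# Assumed alternative reading: distinct species within each location.
distinct_species = df.groupby('location')['Species'].transform('nunique')
print(distinct_species.tolist())    # [1, 1, 2, 2, 3, 3, 3, 3]
```

Because neither 'A', 'C' nor 'E' appears in the row index 0..7, assigning per_location to a column fills every row with NaN, which matches the output shown in the question.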

Answer 2

df.groupby(['location','Species']).Species.value_counts().to_frame('Abundance')



                            Abundance
location Species Species           
A        Cat     Cat              2
C        Cat     Cat              1
         Dog     Dog              1
E        Bird    Bird             1
         Cat     Cat              2
         Dog     Dog              1
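
The output above carries Species twice in the index, because the column is both a grouping key and the value being counted. A possible variant, not part of the original answer, is to call size() on the grouped frame, which gives the same counts with a two-level index. The sketch reuses the question's data:

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6, 7, 8],
    'location': ['A', 'A', 'C', 'C', 'E', 'E', 'E', 'E'],
    'Species': ['Cat', 'Cat', 'Dog', 'Cat', 'Cat', 'Cat', 'Dog', 'Bird'],
    'Count': [2, 2, 2, 2, 4, 4, 4, 4],
})

# size() counts rows per (location, Species) group and returns a Series
# indexed by the two grouping keys; to_frame names the resulting column.
counts = df.groupby(['location', 'Species']).size().to_frame('Abundance')
print(counts)
```

Like the value_counts version, this yields one row per pair rather than one value per original row, so it summarizes the data but does not by itself add a column to df.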

Answer 3

I believe you should try grouping the data frame by the columns you want to have in the output:

>>> df[df.Species.notnull()].groupby(['location','Species']).count()
                  ID  Count
location Species           
A        Cat       2      2
C        Cat       1      1
         Dog       1      1
E        Bird      1      1
         Cat       2      2
         Dog       1      1
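
Note that count() reports the non-null count for every remaining column, which is why ID and Count both appear. If the aim is a single per-row Abundance column, as in the question, one way to get there from a grouped table is to merge it back onto df. This is a sketch under that assumption; the names pair_counts and result are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6, 7, 8],
    'location': ['A', 'A', 'C', 'C', 'E', 'E', 'E', 'E'],
    'Species': ['Cat', 'Cat', 'Dog', 'Cat', 'Cat', 'Cat', 'Dog', 'Bird'],
    'Count': [2, 2, 2, 2, 4, 4, 4, 4],
})

# One row per (location, Species) pair, with the group size as 'Abundance'.
pair_counts = (
    df[df.Species.notnull()]
    .groupby(['location', 'Species'])
    .size()
    .reset_index(name='Abundance')
)

# A left merge keeps every original row, in its original order.
result = df.merge(pair_counts, on=['location', 'Species'], how='left')
print(result)
```

For this data the Abundance column comes out as [2, 2, 1, 1, 2, 2, 1, 1], the same values that transform('size') produces in Answer 1.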

3 answers

Answer 1: score 3, accepted, 2020-12-17 04:25:48

Answer 2: score 0, 2020-12-17 04:29:54

Answer 3: score 0, 2020-12-17 04:31:28