
How to group by and aggregate both the unique count and the count of a specific value on the same column in Python pandas?

My question is related to my previous question, but it is different, so I am asking a new one.

In the question above, see the answer by @jezrael.

import pandas as pd

df = pd.DataFrame({'col1':[1,1,1],
                   'col2':[4,4,6],
                   'col3':[7,7,9],
                   'col4':[3,3,5]})

print (df)
   col1  col2  col3  col4
0     1     4     7     3
1     1     4     7     3
2     1     6     9     5

df1 = df.groupby(['col1','col2']).agg({'col3':'size','col4':'nunique'})
df1['result_col'] = df1['col3'].div(df1['col4'])
print (df1)
           col4  col3  result_col
col1 col2                        
1    4        1     2         2.0
     6        1     1         1.0

Now I also want to count the rows where col4 has a specific value. Say I want the count of col4 == 3 in the same query.

df.groupby(['col1','col2']).agg({'col3':'size','col4':'nunique'}) ... + count(col4=='3')

How can I do this in the same query? I have tried the following, but it does not work.

df.groupby(['col1','col2']).agg({'col3':'size','col4':'nunique','col4':'x: lambda x[x == 7].count()'})

I think you need to aggregate with a list of functions in the dict for column col4.

If you need to count the values equal to 3, the simplest way is to sum the True values of x == 3:

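df1 = (df.groupby(['col1','col2'])
         .agg({'col3':'size','col4': ['nunique', lambda x: (x == 3).sum()]}))
# the lambda column is labelled '<lambda>'; give it a readable name
df1 = df1.rename(columns={'<lambda>':'count_3'})
# flatten the two-level column index into names such as col4_count_3
df1.columns = ['{}_{}'.format(x[0], x[1]) for x in df1.columns]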
print (df1)
           col4_nunique  col4_count_3  col3_size
col1 col2                                       
1    4                1             2          2
     6                1             0          1
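As a possible alternative to renaming and flattening the columns afterwards, the same three aggregates can be sketched with named aggregation, which is available in pandas 0.25 and later. The output names col3_size, col4_nunique and col4_count_3 are chosen here to match the flattened names above; they are not required by pandas.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 1, 1],
                   'col2': [4, 4, 6],
                   'col3': [7, 7, 9],
                   'col4': [3, 3, 5]})

# Each keyword is an output column name; each value is (input column, function).
out = df.groupby(['col1', 'col2']).agg(
    col3_size=('col3', 'size'),
    col4_nunique=('col4', 'nunique'),
    # Summing the boolean mask counts the rows where col4 equals 3.
    col4_count_3=('col4', lambda x: (x == 3).sum()),
)
print(out)
```

With this form the result already has flat column names in the order the keywords were given, so no rename or list comprehension over the columns is needed.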

Do some preprocessing by adding col4 == 3 as a column ahead of time, then aggregate:

df.assign(result_col=df.col4.eq(3).astype(int)).groupby(
    ['col1', 'col2']
).agg(dict(col3='size', col4='nunique', result_col='sum'))

           col3  result_col  col4
col1 col2                        
1    4        2           2     1
     6        1           0     1
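The preprocessing idea can be extended to several values at once. The following is a minimal sketch, assuming the values of interest are held in a hypothetical list named targets (here 3 and 5, both of which appear in col4 of the example frame); the count_3 and count_5 column names are likewise made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 1, 1],
                   'col2': [4, 4, 6],
                   'col3': [7, 7, 9],
                   'col4': [3, 3, 5]})

targets = [3, 5]  # hypothetical: the col4 values to count

# One boolean indicator column per target value, e.g. count_3 = (col4 == 3).
flags = {'count_{}'.format(v): df['col4'].eq(v) for v in targets}

out = df.assign(**flags).groupby(['col1', 'col2']).agg(
    col3=('col3', 'size'),
    col4=('col4', 'nunique'),
    # Summing each indicator column gives the per-group count for that value.
    **{name: (name, 'sum') for name in flags},
)
print(out)
```

Because the indicators are built before grouping, every aggregate is a built-in function name rather than a lambda.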

Old answers:

g = df.groupby(['col1', 'col2'])
g.agg({'col3':'size','col4': 'nunique'}).assign(
    result_col=g.col4.apply(lambda x: x.eq(3).sum()))

           col3  col4  result_col
col1 col2                        
1    4        2     1           2
     6        1     1           0

Slightly rearranged:

g = df.groupby(['col1', 'col2'])
final_df = g.agg({'col3':'size','col4': 'nunique'})
final_df.insert(1, 'result_col', g.col4.apply(lambda x: x.eq(3).sum()))
final_df

           col3  result_col  col4
col1 col2                        
1    4        2           2     1
     6        1           0     1


