Adding columns based on count and unique count in Python

I have a DataFrame as shown below.

type item
new apple
new apple
new io
new io
old apple
old io
old io 
old se
old pj
etc el

I need to create a new DataFrame based on the count and the unique count:

type    type_count  unique_item_count
new            4    2
old            5    4
etc            1    1

Column 'type_count' is the frequency of each label in column 'type'. Column 'unique_item_count' is the number of distinct labels in column 'item' for each unique label in column 'type'.

Also, if I add a new column:

type    item    val
new apple       20
new apple       6
new io          5
new io          6
old apple       5
old io          6
old io          4
old se          5
old pj          3
etc el          2

and want a new DataFrame with:

type    type_count  unique_item_count   total_count
new             4                   2   37
old             5                   4   23
etc             1                   1   2

Column 'total_count' is the sum of the values in column 'val' for each type.

Use DataFrameGroupBy.agg with a list of tuples, where the first value of each tuple specifies the new column name and the second the aggregate function, here size and nunique:

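# (new column name, aggregate function) pairs
L = [('type_count', 'size'), ('unique_item_count', 'nunique')]
# sort=False keeps the groups in their order of first appearance
df = df.groupby('type', sort=False)['item'].agg(L).reset_index()
print(df)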
  type  type_count  unique_item_count
0  new           4                  2
1  old           5                  4
2  etc           1                  1
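
The answer above covers the first two columns only. For the second part of the question ('total_count'), one possible approach is pandas named aggregation, sketched below. This is not from the original answer; it assumes pandas 0.25 or later, and the variable name `result` is introduced here only for illustration. The sample data is copied from the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "type": ["new", "new", "new", "new", "old", "old", "old", "old", "old", "etc"],
    "item": ["apple", "apple", "io", "io", "apple", "io", "io", "se", "pj", "el"],
    "val": [20, 6, 5, 6, 5, 6, 4, 5, 3, 2],
})

# Named aggregation: each keyword is an output column name and each
# value is an (input column, aggregate function) pair.
# `result` is a hypothetical name, not part of the original answer.
result = (
    df.groupby("type", sort=False)  # sort=False keeps first-appearance order
      .agg(
          type_count=("item", "size"),             # rows per type
          unique_item_count=("item", "nunique"),   # distinct items per type
          total_count=("val", "sum"),              # sum of val per type
      )
      .reset_index()
)
print(result)
```

With the question's data, the three computed columns should match the expected table: 4, 2, 37 for new; 5, 4, 23 for old; 1, 1, 2 for etc.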

