Adding columns based on count and unique count in Python
I have a DataFrame as shown below.
type  item
new   apple
new   apple
new   io
new   io
old   apple
old   io
old   io
old   se
old   pj
etc   el
I need to create a new DataFrame based on the count and the unique count:
type  type_count  unique_item_count
new   4           2
old   5           4
etc   1           1
The column 'type_count' is the frequency of each label in the column 'type'. The column 'unique_item_count' is the number of unique labels in the column 'item' for each unique label in the column 'type'.
Also, suppose I add a new column:
type  item   val
new   apple  20
new   apple  6
new   io     5
new   io     6
old   apple  5
old   io     6
old   io     4
old   se     5
old   pj     3
etc   el     2
and want a new DataFrame like this:
type  type_count  unique_item_count  total_count
new   4           2                  37
old   5           4                  23
etc   1           1                  2
The column 'total_count' is the sum of the values in the column 'val' for each type.
Use DataFrameGroupBy.agg with a list of tuples. In each tuple the first value specifies the new column name and the second the aggregate function, here size and nunique:
L = [('type_count','size'), ('unique_item_count','nunique')]
df = df.groupby('type', sort=False)['item'].agg(L).reset_index()
print(df)
   type  type_count  unique_item_count
0   new           4                  2
1   old           5                  4
2   etc           1                  1
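The snippet above covers only the first two columns. For the follow-up with the 'val' column, one possible approach is named aggregation, which I believe requires pandas 0.25 or later. The following is a minimal sketch that rebuilds the sample data from the question; the variable name `result` is my own choice, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question (with the extra 'val' column).
df = pd.DataFrame({
    "type": ["new"] * 4 + ["old"] * 5 + ["etc"],
    "item": ["apple", "apple", "io", "io",
             "apple", "io", "io", "se", "pj", "el"],
    "val": [20, 6, 5, 6, 5, 6, 4, 5, 3, 2],
})

# Named aggregation: new_column=(source_column, aggregate_function).
# sort=False keeps the groups in order of first appearance.
result = (
    df.groupby("type", sort=False)
      .agg(
          type_count=("item", "size"),            # rows per type
          unique_item_count=("item", "nunique"),  # distinct items per type
          total_count=("val", "sum"),             # sum of val per type
      )
      .reset_index()
)
print(result)
```

With the sample data this should give total_count values of 37, 23 and 2 for new, old and etc, matching the expected table in the question.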