Pandas group by and filter
I have the following .csv:
Name Location Product Type number
Greg 1 Fruit grape 1
Greg 1 Fruit apple 2
Greg 1 Bakery bread 5
Greg 1 Bakery roll 8
Greg 2 Fruit grape 7
Greg 2 Fruit apple 1
Greg 3 Fruit grape 2
Greg 4 Bakery roll 3
Greg 4 Bakery bread 4
Sam 5 Fruit apple 7
Sam 5 Fruit grape 9
Sam 5 Fruit apple 10
Sam 6 Bakery roll 11
Sam 6 Bakery bread 12
Sam 7 Fruit orange 13
Sam 7 Bakery roll 14
Tim 8 Fruit bread 16
Zack 9 Bakery roll 17
Zack 10 Fruit apple 19
Zack 10 Fruit grape 20
I would like to load this into pandas and group by Name and Location, keeping only the locations that have more than one product. I would still like to keep the 'number' for the products.
So something like this, as an example, since Greg at location 1 has two products:
name location product type
Greg 1 Fruit, bakery grape,apple,bread,roll
I am struggling with the groupby and, ultimately, with getting the result back into a data frame that I could write out with .to_csv.
IIUC, use transform with nunique:
df1=df[df.groupby(['Name','Location']).Product.transform('nunique')>1]
Name Location Product Type number
0 Greg 1 Fruit grape 1
1 Greg 1 Fruit apple 2
2 Greg 1 Bakery bread 5
3 Greg 1 Bakery roll 8
14 Sam 7 Fruit orange 13
15 Sam 7 Bakery roll 14
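To try the filter end to end, here is a minimal sketch that rebuilds the question's data and applies the same transform('nunique') mask. Parsing the text with a whitespace separator and the intermediate name distinct_products are assumptions made for illustration, they are not part of the original answer.

```python
import io

import pandas as pd

# The question's data, parsed from whitespace-separated text (an assumption
# about the file layout; adjust `sep` for a real comma-separated file).
raw = """Name Location Product Type number
Greg 1 Fruit grape 1
Greg 1 Fruit apple 2
Greg 1 Bakery bread 5
Greg 1 Bakery roll 8
Greg 2 Fruit grape 7
Greg 2 Fruit apple 1
Greg 3 Fruit grape 2
Greg 4 Bakery roll 3
Greg 4 Bakery bread 4
Sam 5 Fruit apple 7
Sam 5 Fruit grape 9
Sam 5 Fruit apple 10
Sam 6 Bakery roll 11
Sam 6 Bakery bread 12
Sam 7 Fruit orange 13
Sam 7 Bakery roll 14
Tim 8 Fruit bread 16
Zack 9 Bakery roll 17
Zack 10 Fruit apple 19
Zack 10 Fruit grape 20
"""
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# transform('nunique') returns one value per original row: the number of
# distinct products in that row's (Name, Location) group.
distinct_products = df.groupby(["Name", "Location"])["Product"].transform("nunique")

# Keep only the rows whose group has more than one distinct product.
df1 = df[distinct_products > 1]
print(df1)
```

Because transform returns a result aligned with the original rows, the boolean mask can index df directly, and the 'number' column is carried along untouched.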
If you do df.groupby([col_names]) and then aggregate, col_names become the index of the result.
To convert the index levels back to columns, use the DataFrame.reset_index() method (or pass as_index=False to groupby).
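As a rough sketch of how that fits the question, the already filtered rows can be collapsed to one row per (Name, Location) and then flattened with reset_index() before writing CSV. The helpers join_unique and join_all, the choice to join the numbers as a comma-separated string, and writing to an in-memory buffer instead of a file are all assumptions for illustration.

```python
import io

import pandas as pd

# The filtered rows (groups with more than one distinct product).
df1 = pd.DataFrame({
    "Name": ["Greg", "Greg", "Greg", "Greg", "Sam", "Sam"],
    "Location": [1, 1, 1, 1, 7, 7],
    "Product": ["Fruit", "Fruit", "Bakery", "Bakery", "Fruit", "Bakery"],
    "Type": ["grape", "apple", "bread", "roll", "orange", "roll"],
    "number": [1, 2, 5, 8, 13, 14],
})


def join_unique(values):
    """Join distinct values with commas, keeping first-seen order."""
    return ",".join(dict.fromkeys(values))


def join_all(values):
    """Join every value (as text) with commas, keeping row order."""
    return ",".join(str(v) for v in values)


# After aggregation the group keys form the index of the result.
grouped = df1.groupby(["Name", "Location"]).agg(
    Product=("Product", join_unique),
    Type=("Type", join_unique),
    number=("number", join_all),  # assumption: keep every number, in row order
)

# reset_index() turns the Name/Location index levels back into columns.
flat = grouped.reset_index()

# Written to an in-memory buffer here; pass a file path to to_csv instead
# to produce a real file.
buffer = io.StringIO()
flat.to_csv(buffer, index=False)
print(buffer.getvalue())
```

Joining the numbers as text keeps them in the same row order as the grouped rows. If a total per location is what is wanted, "sum" could be used for that column instead.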
Hope that helps.