I have the following .csv
Name Location Product Type number
Greg 1 Fruit grape 1
Greg 1 Fruit apple 2
Greg 1 Bakery bread 5
Greg 1 Bakery roll 8
Greg 2 Fruit grape 7
Greg 2 Fruit apple 1
Greg 3 Fruit grape 2
Greg 4 Bakery roll 3
Greg 4 Bakery bread 4
Sam 5 Fruit apple 7
Sam 5 Fruit grape 9
Sam 5 Fruit apple 10
Sam 6 Bakery roll 11
Sam 6 Bakery bread 12
Sam 7 Fruit orange 13
Sam 7 Bakery roll 14
Tim 8 Fruit bread 16
Zack 9 Bakery roll 17
Zack 10 Fruit apple 19
Zack 10 Fruit grape 20
I would like to load this into pandas and group by name and location, keeping only the locations that have more than one product. I would still like to keep the 'number' for each product.
For example, since Greg at location 1 has two products, something like this:
name location product type
Greg 1 Fruit, bakery grape,apple,bread,roll
I am struggling with the groupby and with getting the result back into a DataFrame that I can write out with .to_csv.
IIUC, use transform with nunique:
df1=df[df.groupby(['Name','Location']).Product.transform('nunique')>1]
Name Location Product Type number
0 Greg 1 Fruit grape 1
1 Greg 1 Fruit apple 2
2 Greg 1 Bakery bread 5
3 Greg 1 Bakery roll 8
14 Sam 7 Fruit orange 13
15 Sam 7 Bakery roll 14
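To get from the filtered rows to the one-row-per-location layout shown in the question, one option is to aggregate the string columns after filtering. This is only a sketch: it uses a shortened copy of the question's data, and the helper names join_unique and join_all are made up for illustration, not part of pandas.

```python
import pandas as pd

# A shortened copy of the question's data (assumption: the real CSV
# would be loaded with pd.read_csv instead).
rows = [
    ("Greg", 1, "Fruit", "grape", 1),
    ("Greg", 1, "Fruit", "apple", 2),
    ("Greg", 1, "Bakery", "bread", 5),
    ("Greg", 1, "Bakery", "roll", 8),
    ("Greg", 2, "Fruit", "grape", 7),
    ("Greg", 2, "Fruit", "apple", 1),
    ("Sam", 7, "Fruit", "orange", 13),
    ("Sam", 7, "Bakery", "roll", 14),
    ("Tim", 8, "Fruit", "bread", 16),
]
df = pd.DataFrame(rows, columns=["Name", "Location", "Product", "Type", "number"])

# Keep only rows whose (Name, Location) group has more than one distinct Product.
mask = df.groupby(["Name", "Location"])["Product"].transform("nunique") > 1
df1 = df[mask]


def join_unique(values):
    """Join distinct values as text, preserving first-seen order."""
    return ",".join(dict.fromkeys(values.astype(str)))


def join_all(values):
    """Join every value as text, so Type and number stay aligned."""
    return ",".join(values.astype(str))


# Collapse each group to one row; reset_index turns the group keys
# back into ordinary columns.
summary = (
    df1.groupby(["Name", "Location"], sort=False)
    .agg(
        Product=("Product", join_unique),
        Type=("Type", join_all),
        number=("number", join_all),
    )
    .reset_index()
)
print(summary)
```

The result is an ordinary DataFrame, so summary.to_csv("out.csv", index=False) would write it out (the file name here is just a placeholder).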
If you aggregate after df.groupby(col_names), the columns in col_names become the index of the result.
To convert the index back into columns, call the DataFrame.reset_index() method (or pass as_index=False to groupby).
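A small illustration of the index behaviour, using a made-up three-row frame and sum as the aggregation (both are assumptions chosen only to keep the example short):

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Greg", "Greg", "Sam"],
    "Location": [1, 1, 7],
    "number": [1, 2, 13],
})

# After aggregating, the group keys form a MultiIndex rather than columns.
grouped = df.groupby(["Name", "Location"])["number"].sum()
print(grouped.index.names)

# reset_index moves the keys back into regular columns.
flat = grouped.reset_index()

# Passing as_index=False to groupby gives the same flat layout directly.
flat_alt = df.groupby(["Name", "Location"], as_index=False)["number"].sum()

# A flat frame can be written with to_csv; with no path it returns a string.
csv_text = flat.to_csv(index=False)
print(csv_text)
```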
Hope that helps.