I'm grouping a dataframe by two columns, zone_id and eventName. I need to compute the percentage for eventName within each zone_id.
In other words I need to compute (clicked/printed)*100 by zone_id.
import pandas as pd
#read the csv file
df = pd.read_csv('data.csv', sep=';')
result=df.groupby(['zone_id','eventName']).event.count()
print(result)
#I use the count() method to get the number of clicked and printed events per zone_id. From there I hope to find a way to compute a percentage by zone_id.
output :
zone_id  eventName
28       printed        88
9283     clicked       197
         printed      7732
9284     clicked         2
         printed       452
9287     clicked       129
         printed      3802
9614     clicked         4
         printed       342
17437    clicked        55
         printed      4026
#Using mean(), the mean is computed correctly per zone_id
result=df.groupby(['zone_id','eventName']).event.count().groupby('zone_id').mean()
print(result)
output :
zone_id
28 88.0
9283 3964.5
9284 227.0
9287 1965.5
9614 173.0
17437 2040.5
#Expected result : I need to compute the percentage of eventName (clicked/printed)*100 by zone_id
Expected output:
zone_id
28 0% -> (0/88)*100
9283 2.54% -> (197/7732)*100
9284 0.44% -> (2/452)*100
9287 3.39% -> (129/3802)*100
9614 1.16% -> (4/342)*100
17437 1.36% -> (55/4026)*100
Without sample data it's hard to be sure, but try something like this (clicked divided by printed, times 100):
events = df.groupby(['zone_id','eventName']).size()
events.loc[pd.IndexSlice[:, 'clicked']] / events.loc[pd.IndexSlice[:, 'printed']] * 100
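Or use unstack to get clicked and printed as columns (fill_value=0 makes a zone with no clicked rows, such as 28, count as 0 instead of NaN):
events = df.groupby(['zone_id','eventName']).size().unstack(level=1, fill_value=0)
events['clicked'] / events['printed'] * 100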
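To make the unstack approach concrete, here is a minimal self-contained sketch. The small dataframe is made up for illustration, since the real data.csv isn't shown, and the names counts and pct are mine. It assumes every zone has at least one printed row.

```python
import pandas as pd

# Hypothetical sample data standing in for data.csv (the real file is not shown).
df = pd.DataFrame({
    "zone_id":   [28, 28, 9284, 9284, 9284, 9284],
    "eventName": ["printed", "printed", "clicked", "printed", "printed", "printed"],
    "event":     [1, 1, 1, 1, 1, 1],
})

# Count rows per (zone_id, eventName) and pivot eventName into columns.
# fill_value=0 gives zones with no "clicked" rows a 0 rather than NaN.
counts = df.groupby(["zone_id", "eventName"]).size().unstack(fill_value=0)

# Percentage of clicked relative to printed, per zone_id, rounded to 2 decimals.
pct = (counts["clicked"] / counts["printed"] * 100).round(2)
print(pct)
```

If a zone could have clicked rows but no printed rows, the division would give inf, so you would want to handle that case separately.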