I have a dataframe that includes the category of a project, currency, number of investors, goal, etc., and I want to create a new column which will be "average success rate of their category":
state category main_category currency backers country \
0 0 Poetry Publishing GBP 0 GB
1 0 Narrative Film Film & Video USD 15 US
2 0 Narrative Film Film & Video USD 3 US
3 0 Music Music USD 1 US
4 1 Restaurants Food USD 224 US
usd_goal_real duration year hour
0 1533.95 59 2015 morning
1 30000.00 60 2017 morning
2 45000.00 45 2013 morning
3 5000.00 30 2012 morning
4 50000.00 35 2016 afternoon
I have the average success rates in series format:
Dance 65.435209
Theater 63.796134
Comics 59.141527
Music 52.660558
Art 44.889045
Games 43.890467
Film & Video 41.790649
Design 41.594386
Publishing 34.701650
Photography 34.110847
Fashion 28.283186
Technology 23.785582
And now I want to add in a new column, where each column will have a success rate matching their category, ie wherever the row is technology, the new column will include 23.78 for that row.
df[category_success_rate] = i
want the output column to be the % success which matches with the category in "main category" column.
I think you need GroupBy.transform
with a Boolean mask, df['state'].eq(1)
or (df['state'] == 1)
:
df['category_success_rate'] = (df['state'].eq(1)
.groupby(df['main_category']).transform('mean') * 100)
Alternative:
df['category_success_rate'] = ((df['state'] == 1)
.groupby(df['main_category']).transform('mean') * 100)
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.