If you have a dataframe as follows:
genre                 mean_average_budget
horror thriller       x
romance comedy        y
action thriller       z
documentary           a
comedy documentary    b
How can I create a new dataframe in which each row is one of the individual strings that appear in the genre column? For example:
genre          mean_average_budget
horror         h
thriller       i
action         k
documentary    l
comedy         m
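Try this: split each genre string on whitespace, then stack the pieces so that every genre gets its own row while keeping the budget value it came from.
import pandas as pd
# Split on whitespace, spread the pieces into columns, then stack them into rows
new_df = (df.set_index('mean_average_budget')['genre'].str.split()
            .apply(pd.Series).stack()
            .reset_index(1, drop=True)   # drop the position-within-row level
            .reset_index(name='genre'))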
   mean_average_budget  genre
0  x                    horror
1  x                    thriller
2  y                    romance
3  y                    comedy
4  z                    action
5  z                    thriller
6  a                    documentary
7  b                    comedy
8  b                    documentary
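On pandas 0.25 or later, the same reshaping can probably be written more directly with DataFrame.explode, which is a different technique from the stack approach above. A minimal sketch, assuming numeric budget values that are invented here purely for illustration:

```python
import pandas as pd

# Hypothetical sample data: the budget numbers are made up for illustration.
df = pd.DataFrame({
    'genre': ['horror thriller', 'romance comedy', 'action thriller',
              'documentary', 'comedy documentary'],
    'mean_average_budget': [10.0, 20.0, 30.0, 40.0, 50.0],
})

# Turn each genre string into a list, then give every list element its own row.
exploded = (df.assign(genre=df['genre'].str.split())
              .explode('genre')
              .reset_index(drop=True))
print(exploded)
```

Each original row is repeated once per genre it contains, so the five input rows become nine.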
To find the mean per genre, use this (it requires numeric data in mean_average_budget):
new_df.groupby('genre')['mean_average_budget'].mean()
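As a worked example of the groupby step, here is a small sketch on a long-format frame. The numeric budgets are hypothetical stand-ins for x, y, z, a and b:

```python
import pandas as pd

# Hypothetical long-format frame with made-up numeric budgets.
new_df = pd.DataFrame({
    'mean_average_budget': [10.0, 10.0, 20.0, 20.0, 30.0, 30.0, 40.0, 50.0, 50.0],
    'genre': ['horror', 'thriller', 'romance', 'comedy', 'action',
              'thriller', 'documentary', 'comedy', 'documentary'],
})

# Average the budget over every row that mentions a given genre.
genre_means = (new_df.groupby('genre')['mean_average_budget']
                     .mean()
                     .reset_index())
print(genre_means)
```

With these numbers, thriller appears in rows with budgets 10 and 30, so its mean is 20.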
If the values are strings and you want to concatenate them per genre instead:
new_df.groupby('genre')['mean_average_budget'].apply('+'.join)
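To illustrate the string case, this sketch rebuilds the long-format frame shown earlier, using the placeholder letters from the question as literal strings, and joins them per genre:

```python
import pandas as pd

# Long-format frame using the question's placeholder letters as string values.
new_df = pd.DataFrame({
    'mean_average_budget': ['x', 'x', 'y', 'y', 'z', 'z', 'a', 'b', 'b'],
    'genre': ['horror', 'thriller', 'romance', 'comedy', 'action',
              'thriller', 'documentary', 'comedy', 'documentary'],
})

# Concatenate every budget string that belongs to the same genre.
joined = new_df.groupby('genre')['mean_average_budget'].apply('+'.join)
print(joined)
```

Here thriller collects the values from the first and third original rows, giving 'x+z'.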