Create Dataframe with rows made for each string appearing in a column of another Dataframe

Question

If you have a dataframe as follows:

genre                 mean_average_budget
horror thriller       x
romance comedy        y 
action thriller       z
documentary           a
comedy documentary    b

How can I build a new dataframe in which each row corresponds to an individual string appearing in the genre column? E.g.:

genre                 mean_average_budget
horror                h
thriller              i 
action                k
documentary           l
comedy                m

Answer 1

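Split each genre string on whitespace and stack the pieces, keeping the budget as the index so it is repeated for every genre:

import pandas as pd
# Split on whitespace, then stack so that every genre lands on its own row.
new_df = (df.set_index('mean_average_budget')['genre'].str.split()
            .apply(pd.Series).stack().reset_index(1, drop=True)
            .reset_index(name='genre'))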

    mean_average_budget genre
0   x                   horror
1   x                   thriller
2   y                   romance
3   y                   comedy
4   z                   action
5   z                   thriller
6   a                   documentary
7   b                   comedy
8   b                   documentary
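
As an alternative sketch (not part of the original answer), pandas 0.25 and later provide `DataFrame.explode`, which should give the same one-row-per-genre result without the `apply(pd.Series).stack()` step. The sample frame below is an assumption that simply mirrors the placeholder values from the question.

```python
import pandas as pd

# Sample frame mirroring the question; the budget values are placeholder strings.
df = pd.DataFrame({
    "genre": ["horror thriller", "romance comedy", "action thriller",
              "documentary", "comedy documentary"],
    "mean_average_budget": ["x", "y", "z", "a", "b"],
})

# Turn each genre string into a list, then give every list element its own row.
new_df = (
    df.assign(genre=df["genre"].str.split())
      .explode("genre")
      .reset_index(drop=True)
)
print(new_df)
```

Here the column order is genre first, then mean_average_budget; the rows themselves match the output shown above.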

To find the mean budget per genre, when the budget column is numeric:

new_df.groupby('genre')['mean_average_budget'].mean()

If the budgets are strings and you want to concatenate them per genre instead:

new_df.groupby('genre')['mean_average_budget'].apply('+'.join)
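
To make the numeric case concrete, here is a minimal end-to-end sketch. The budget numbers are hypothetical (the question only gives placeholders), and it uses `explode` (pandas 0.25 or later) for the split step rather than the chain shown in the answer.

```python
import pandas as pd

df = pd.DataFrame({
    "genre": ["horror thriller", "romance comedy", "action thriller",
              "documentary", "comedy documentary"],
    # Hypothetical numeric budgets standing in for x, y, z, a, b.
    "mean_average_budget": [10.0, 20.0, 30.0, 40.0, 50.0],
})

# One row per individual genre, with the budget repeated for each.
exploded = df.assign(genre=df["genre"].str.split()).explode("genre")

# Average the budgets of all rows that mention each genre.
genre_means = exploded.groupby("genre")["mean_average_budget"].mean()
print(genre_means)
```

With these made-up numbers, thriller averages 10 and 30 to give 20.0, and comedy averages 20 and 50 to give 35.0.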

solution1
1 ACCEPTED 2017-10-24 18:29:16
