"从现有的熊猫中创建一个新的df？"

Question

I am doing a project on movie data.
The sample dataset looks like this:

The genres column has 21 unique values.

I want to create a new table/dataframe that contains each user's average rating for each genre, like this:


I got the list of genres using the code below:

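# Split one space-separated genre string into its individual genres.
def split(sent):
    return sent.split()

# `genres` is assumed to hold the genre strings, e.g. genres = df['genres'].
new_genres = set()

# Iterate over the values directly so a non-default index cannot break lookups.
for sent in genres:
    for g in split(sent):
        new_genres.add(g)

new_genres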

Answer 1

Setup:

In [905]: df = pd.DataFrame({'userID':[1,2,3,3,2], 'id':[110, 147, 858, 1246, 1968], 'rating':[1.0, 4.5, 5.0, 5.0, 4.0], 'genres':['Drama Mystery Romance', 'Drama', 'Comedy Drama Romance', 'Drama', 'Drama Comedy Romance']}
     ...: )

In [906]: df
Out[906]: 
   userID    id  rating                 genres
0       1   110     1.0  Drama Mystery Romance
1       2   147     4.5                  Drama
2       3   858     5.0   Comedy Drama Romance
3       3  1246     5.0                  Drama
4       2  1968     4.0   Drama Comedy Romance
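
The rest of this answer is not preserved here. As a rough sketch only (not necessarily what the answerer wrote), one way to get from this setup to a user-by-genre table of average ratings is to build indicator columns with str.get_dummies and mask them with the rating. The names dummies, rated and result are made up for illustration.

```python
import pandas as pd

# Same sample data as the setup above.
df = pd.DataFrame({
    'userID': [1, 2, 3, 3, 2],
    'id': [110, 147, 858, 1246, 1968],
    'rating': [1.0, 4.5, 5.0, 5.0, 4.0],
    'genres': ['Drama Mystery Romance', 'Drama', 'Comedy Drama Romance',
               'Drama', 'Drama Comedy Romance'],
})

# One 0/1 indicator column per genre (genres are separated by spaces).
dummies = df['genres'].str.get_dummies(sep=' ')

# Put the rating where the movie has the genre, NaN everywhere else,
# so that genres a movie lacks do not drag the average down.
rated = dummies.astype(float).mul(df['rating'], axis=0).where(dummies.astype(bool))

# Average per user; mean() skips NaN, so only rated genres contribute.
result = rated.groupby(df['userID']).mean()
print(result)
```

A NaN in the result means that user has not rated any movie of that genre.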

Answer 2

We can start by using the assign method, followed by explode, to put each genre in its own row, like so:

>>> df = df.assign(genre=df['genres'].str.split(' ')).explode('genre')
>>> df
    userId  id      rating  genres                  genre
0   1       110     1.0     Drama Mystery Romance   Drama
0   1       110     1.0     Drama Mystery Romance   Mystery
0   1       110     1.0     Drama Mystery Romance   Romance
1   1       147     4.5     Drama                   Drama
2   1       858     5.0     Comedy Drama Romance    Comedy
2   1       858     5.0     Comedy Drama Romance    Drama
2   1       858     5.0     Comedy Drama Romance    Romance
3   1       1246    5.0     Drama                   Drama
4   1       1968    4.0     Drama Comedy Romance    Drama
4   1       1968    4.0     Drama Comedy Romance    Comedy
4   1       1968    4.0     Drama Comedy Romance    Romance
5   270896  48780   5.0     Forein                  Forein
6   270896  49530   4.0     Action Thriller Scifi   Action
6   270896  49530   4.0     Action Thriller Scifi   Thriller
6   270896  49530   4.0     Action Thriller Scifi   Scifi
7   270896  54001   4.0     Drama                   Drama
8   270896  54503   4.0     Action Forein           Action
8   270896  54503   4.0     Action Forein           Forein
9   270896  58559   5.0     Drama                   Drama
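
The remaining steps of this answer are not preserved here. A minimal sketch of how the exploded frame could be turned into the requested table, assuming the column names shown above (userId, rating, genre), is to group by user and genre, average the ratings, and unstack the genre level into columns. The sample rows below are a subset of the output above; the name table is hypothetical.

```python
import pandas as pd

# A few rows taken from the output above, before exploding.
df = pd.DataFrame({
    'userId': [1, 1, 1, 270896, 270896, 270896],
    'id': [110, 147, 858, 48780, 49530, 54503],
    'rating': [1.0, 4.5, 5.0, 5.0, 4.0, 4.0],
    'genres': ['Drama Mystery Romance', 'Drama', 'Comedy Drama Romance',
               'Forein', 'Action Thriller Scifi', 'Action Forein'],
})

# One row per (movie, genre), as in the answer.
df = df.assign(genre=df['genres'].str.split(' ')).explode('genre')

# Mean rating per user and genre, with genres spread out as columns.
table = df.groupby(['userId', 'genre'])['rating'].mean().unstack()
print(table)
```

Genres a user never rated come out as NaN; whether to leave them or fill them (for example with fillna(0)) depends on how the table will be used.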

"从现有的熊猫中创建一个新的df？"

2 answers

Answer 1
1 2022-02-06 05:40:21

Answer 2
1 2022-02-06 05:53:27

"从现有的熊猫中创建一个新的df？"
