Create a new df in pandas from existing one?

I am working on a project with movie data.
The sample dataset looks like this:

[image: sample dataset]

The genres column has 21 unique values.

[image: unique genre values]

I want to create a new table/dataframe that contains, for every user, the average rating for each genre, like this:

[image: desired output table]

I got the list of genres using the code below:

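def split(sent):
    # Split a space-separated genre string into a list of genre names.
    return sent.split()

# `genres` is assumed to be the genres column, e.g. genres = df['genres'].
new_genres = set()

# Iterate over the values directly so this works whatever the index is.
for sent in genres:
    for g in split(sent):
        new_genres.add(g)

new_genres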
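The loop above can also be written with pandas string methods. The following is only a sketch: the three-row frame is a made-up stand-in for the real data, and it assumes genres are always separated by whitespace.

```python
import pandas as pd

# Hypothetical sample mirroring the question's space-separated genres column.
df = pd.DataFrame({
    'genres': ['Drama Mystery Romance', 'Drama', 'Comedy Drama Romance'],
})

# Split each string into a list, put each genre on its own row,
# then keep the distinct values in sorted order.
unique_genres = sorted(df['genres'].str.split().explode().unique())
print(unique_genres)
```

On the real dataset, len(unique_genres) should then match the 21 unique values mentioned above.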

Setup (assuming pandas is imported as pd):

In [905]: df = pd.DataFrame({'userID':[1,2,3,3,2], 'id':[110, 147, 858, 1246, 1968], 'rating':[1.0, 4.5, 5.0, 5.0, 4.0], 'genres':['Drama Mystery Romance', 'Drama', 'Comedy Drama Romance', 'Drama', 'Drama Comedy Romance']}
     ...: )

In [906]: df
Out[906]: 
   userID    id  rating                 genres
0       1   110     1.0  Drama Mystery Romance
1       2   147     4.5                  Drama
2       3   858     5.0   Comedy Drama Romance
3       3  1246     5.0                  Drama
4       2  1968     4.0   Drama Comedy Romance

We can start by using the assign method to add a genre column holding the split list of genres, and then explode to put each genre on its own row, like so:

>>> df = df.assign(genre=df['genres'].str.split(' ')).explode('genre')
>>> df
    userId  id      rating  genres                  genre
0   1       110     1.0     Drama Mystery Romance   Drama
0   1       110     1.0     Drama Mystery Romance   Mystery
0   1       110     1.0     Drama Mystery Romance   Romance
1   1       147     4.5     Drama                   Drama
2   1       858     5.0     Comedy Drama Romance    Comedy
2   1       858     5.0     Comedy Drama Romance    Drama
2   1       858     5.0     Comedy Drama Romance    Romance
3   1       1246    5.0     Drama                   Drama
4   1       1968    4.0     Drama Comedy Romance    Drama
4   1       1968    4.0     Drama Comedy Romance    Comedy
4   1       1968    4.0     Drama Comedy Romance    Romance
5   270896  48780   5.0     Forein                  Forein
6   270896  49530   4.0     Action Thriller Scifi   Action
6   270896  49530   4.0     Action Thriller Scifi   Thriller
6   270896  49530   4.0     Action Thriller Scifi   Scifi
7   270896  54001   4.0     Drama                   Drama
8   270896  54503   4.0     Action Forein           Action
8   270896  54503   4.0     Action Forein           Forein
9   270896  58559   5.0     Drama                   Drama
