
How to sort a pivot table with pandas

I just started with pandas this week.

I have a table with movie titles, user IDs and the rating each user gave to a movie. Only movies the user has watched are rated.

I have a pivot table like this, shown here with a single row:

In[1]: ratings_matrix = combine_movies_ratings.pivot_table(index='userID', columns='title', values='rating').fillna(0)

Out[1]:
 title      MovieA MovieB MovieC .... MovieN
 userID
 1           5      0        3   ....      0
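
For reference, a table of this shape can be reproduced from a small long-format frame. This is only a sketch: the rows below are invented toy data, and just the column names userID, title and rating come from the question.

```python
import pandas as pd

# Toy long-format data (invented for illustration): one row per rating given.
combine_movies_ratings = pd.DataFrame({
    "userID": [1, 1, 2, 2],
    "title": ["MovieA", "MovieC", "MovieB", "MovieC"],
    "rating": [5, 3, 4, 2],
})

# One row per user, one column per title; movies a user did not rate become 0.
ratings_matrix = combine_movies_ratings.pivot_table(
    index="userID", columns="title", values="rating"
).fillna(0)

print(ratings_matrix)
```

Note that pivot_table aggregates with the mean by default, so the ratings come back as floats, and the title columns end up in alphabetical order.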

I access the values with ratings_matrix.loc[1].values, which returns an array with all the ratings: [5, 0, ...., 0]

I access the movie names with ratings_matrix.loc[1].index (a single row is a Series, so the titles are in its index rather than in a columns attribute).

The outcome I want is a list of the first five movies the user actually rated, since not every film in the dataset was rated:

['MovieA', 'MovieC', 'MovieB' ... ]

My attempt was to sort the pivot table so that movies with a rating come first. By default the columns are in alphabetical order, and in some cases the first movie already shows 0 because the user has not rated it.
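
A minimal sketch of that idea for a single user follows. It assumes that 0 only ever means "not rated" (which relies on the fillna(0) step and on no real rating being 0), and the toy values are invented for illustration.

```python
import pandas as pd

# Toy stand-in for the pivot table with a single user (values invented).
ratings_matrix = pd.DataFrame(
    [[0, 5, 0, 3, 4]],
    index=pd.Index([1], name="userID"),
    columns=pd.Index(
        ["MovieA", "MovieB", "MovieC", "MovieD", "MovieE"], name="title"
    ),
)

user_row = ratings_matrix.loc[1]   # a Series indexed by movie title
rated = user_row[user_row > 0]     # drop the 0 placeholders (unrated movies)

# Highest ratings first, keep at most five, and read the titles off the index.
top_titles = rated.sort_values(ascending=False).head(5).index.tolist()

print(top_titles)  # ['MovieB', 'MovieE', 'MovieD']
```

Sorting the row as a Series avoids reordering the columns of the whole pivot table, which would only make sense for one user at a time anyway.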

Suggestions are welcome.

Maybe you can try:

(combine_movies_ratings.sort_values('rating').groupby('userID').head(5)).title
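
As a rough illustration of this approach on the long-format table, here is a sketch with invented toy data. sort_values sorts in ascending order by default, so the sketch passes ascending=False on the assumption that the highest-rated movies should come first. It also uses head(2) instead of head(5) only to keep the toy data small.

```python
import pandas as pd

# Toy long-format data (invented): it contains only movies that were rated.
combine_movies_ratings = pd.DataFrame({
    "userID": [1, 1, 1, 2, 2],
    "title": ["MovieA", "MovieB", "MovieC", "MovieA", "MovieC"],
    "rating": [3, 5, 4, 2, 1],
})

top_per_user = (
    combine_movies_ratings
    .sort_values("rating", ascending=False)  # best ratings first
    .groupby("userID")
    .head(2)                                 # first 2 rows of each user's group
)

# Collect the selected titles per user, keeping the sorted order.
titles_by_user = top_per_user.groupby("userID")["title"].apply(list).to_dict()

print(titles_by_user)  # {1: ['MovieB', 'MovieC'], 2: ['MovieA', 'MovieC']}
```

Because the long table has no rows for unrated movies, this avoids the 0 placeholders of the pivot table entirely.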
