How to sort a pivot table with pandas
I just started with pandas this week.
I have a table with movie names, user IDs, and the rating each user gave to a movie. Only movies the user has watched are rated.
I have a pivot table like this with one line:
In[1]: ratings_matrix = combine_movies_ratings.pivot_table(index='userID', columns='title', values='rating').fillna(0)
Out[1]:
title   MovieA  MovieB  MovieC  ....  MovieN
userID
1            5       0       3  ....       0
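For anyone who wants to reproduce this, here is a minimal sketch that builds a pivot table of the same shape. The contents of combine_movies_ratings are made up for illustration, since the real data is not shown in the question.

```python
import pandas as pd

# Hypothetical long-format data: one row per (user, movie) rating.
# The values are invented for illustration only.
combine_movies_ratings = pd.DataFrame({
    "userID": [1, 1, 2, 2],
    "title": ["MovieA", "MovieC", "MovieB", "MovieC"],
    "rating": [5, 3, 4, 2],
})

# One row per user, one column per title; unrated movies become 0.
ratings_matrix = combine_movies_ratings.pivot_table(
    index="userID", columns="title", values="rating"
).fillna(0)

print(ratings_matrix)
```

Note that pivot_table sorts the column labels, which is why the titles come out in alphabetical order.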
I'm accessing the values with ratings_matrix.loc[1].values, which returns an array with all the ratings: [5, 0, ...., 0]
I access the movie names with ratings_matrix.loc[1].index (a single row selected with .loc is a Series, so it has an .index but no .columns attribute; ratings_matrix.columns gives the same labels).
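As a small sketch of the row access described above, assuming a one-row stand-in for ratings_matrix with made-up values:

```python
import pandas as pd

# Hypothetical one-row pivot table standing in for ratings_matrix.
ratings_matrix = pd.DataFrame(
    [[5.0, 0.0, 3.0]],
    index=pd.Index([1], name="userID"),
    columns=pd.Index(["MovieA", "MovieB", "MovieC"], name="title"),
)

row = ratings_matrix.loc[1]      # a Series indexed by movie title
ratings = row.values             # NumPy array of the ratings
titles = row.index.tolist()      # movie names, in column order

print(ratings)
print(titles)
```

The two results line up position by position, so ratings[i] belongs to titles[i].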
The outcome I want is a list of the first five movies rated by the user, since not every film in the dataset was rated:
['MovieA', 'MovieC', 'MovieB' ... ]
My attempt was to sort the pivot table so that movies with rating values come first. By default the columns are sorted alphabetically, and in some cases the first movie already shows a rating of 0 because the user has not rated it.
Suggestions are welcome.
Maybe you can try:
(combine_movies_ratings.sort_values('rating', ascending=False).groupby('userID').head(5)).title
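Here is a minimal sketch of that idea on made-up data, plus a variant that works on a row of the pivot table. It assumes the goal is highest ratings first, so it passes ascending=False (sort_values sorts in ascending order by default). The sample ratings and the variable names user1_titles, top5_long and top5_pivot are hypothetical.

```python
import pandas as pd

# Hypothetical long-format ratings; only rated movies appear here.
combine_movies_ratings = pd.DataFrame({
    "userID": [1, 1, 1, 2, 2],
    "title": ["MovieA", "MovieB", "MovieC", "MovieA", "MovieC"],
    "rating": [5, 1, 3, 2, 4],
})

# Approach 1: sort the long table, then keep up to five rows per user.
top5_long = (
    combine_movies_ratings
    .sort_values("rating", ascending=False)
    .groupby("userID")
    .head(5)
)
user1_titles = top5_long.loc[top5_long["userID"] == 1, "title"].tolist()

# Approach 2: sort one row of the pivot table and drop the 0 placeholders.
ratings_matrix = combine_movies_ratings.pivot_table(
    index="userID", columns="title", values="rating"
).fillna(0)
row = ratings_matrix.loc[1]
top5_pivot = row[row > 0].sort_values(ascending=False).head(5).index.tolist()

print(user1_titles)  # ['MovieA', 'MovieC', 'MovieB']
print(top5_pivot)    # ['MovieA', 'MovieC', 'MovieB']
```

Working from the long table avoids the 0 placeholders entirely, because unrated movies never appear there. In the pivot table, fillna(0) makes "not rated" look the same as a rating of 0, so the second approach filters on row > 0 before sorting.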