How to calculate precision and recall of 2 lists in Python

I am writing a movie recommendation system. I have a list of 20 films that I recommended to the user and a list of 150 movies that the user actually watched. How can I calculate the precision and recall for these 2 lists in Python with sklearn?

For example, if 10 of the movies I recommended are ones the user actually watched, the recall is 10/150 and the precision is 10/20.
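The arithmetic in this example can be checked with a few lines of plain Python. This is only a sketch that starts from the counts given above; the variable names are illustrative and do not come from the original post.

```python
# Counts taken from the example above; the variable names are illustrative.
n_recommended = 20   # movies recommended to the user
n_seen = 150         # movies the user actually watched
n_hits = 10          # recommended movies that the user watched

precision = n_hits / n_recommended   # share of recommendations that were watched
recall = n_hits / n_seen             # share of watched movies that were recommended

print(f"precision = {precision:.3f}")  # precision = 0.500
print(f"recall = {recall:.3f}")        # recall = 0.067
```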

From what I read, the simplest way would be to use the intersection between two sets.

I imagine you use some kind of identifier for the movies, so your lists should not contain duplicates (you probably don't recommend the same movie twice, for instance), meaning you can use sets and their built-in intersection.

recommendations = {"movie1", "movie2", "movie3"}
saw = {"movie1", "movie2", "movie4", "movie5", "movie6"}

# Recommended movies that the user saw:
recommendations.intersection(saw)
# {'movie1', 'movie2'}  (set display order may vary)

# To get the number of recommended movies that the user saw:
movie_intersect = len(recommendations.intersection(saw))
movie_intersect
# 2

# Precision is the share of recommended movies that the user saw:
movie_intersect / len(recommendations)
# 0.6666666666666666

# Recall is the share of seen movies that were recommended:
movie_intersect / len(saw)
# 0.4
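
If the inputs arrive as lists rather than sets, the same idea can be wrapped in a small helper. This is a minimal sketch, not part of the original answer: the function name `precision_recall` is hypothetical, duplicates are dropped by converting to sets, and returning 0.0 for an empty input is an assumed convention to avoid division by zero.

```python
def precision_recall(recommended, seen):
    """Return (precision, recall) for two collections of movie identifiers."""
    # Converting to sets removes duplicate identifiers.
    recommended_set = set(recommended)
    seen_set = set(seen)

    # Recommended movies that the user actually saw.
    hits = len(recommended_set & seen_set)

    # Assumed convention: report 0.0 when a collection is empty.
    precision = hits / len(recommended_set) if recommended_set else 0.0
    recall = hits / len(seen_set) if seen_set else 0.0
    return precision, recall


# "movie3" is listed twice on purpose; it is counted once.
precision, recall = precision_recall(
    ["movie1", "movie2", "movie3", "movie3"],
    ["movie1", "movie2", "movie4", "movie5", "movie6"],
)
print(precision, recall)
```

With the example sets used above, this gives a precision of 2/3 and a recall of 2/5.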
