
How can I make a list of the top comments in a subreddit with PRAW?

I need to grab the top comments in a subreddit, from all time.

I have tried grabbing all the submissions and iterating through them, but unfortunately the number of posts you can get is limited to 1000.

I have tried using Subreddit.get_comments, but it returns only 25 comments.

So I am looking for a way around that.

Can you help me out?

It is possible to call get_comments with the limit parameter set to None to get all available comments. (By default, it uses the limit configured for the account, which is usually 25.) The parameters accepted by get_comments include those of get_content, including limit.
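As a minimal sketch of that call, assuming a PRAW 3 Subreddit object like the one used in this thread (the helper name fetch_recent_comments is hypothetical and not part of PRAW):

```python
def fetch_recent_comments(subreddit, limit=None):
    """Return the newest comments of a PRAW 3 Subreddit as a list.

    limit=None asks PRAW for every comment the listing exposes,
    instead of the account default (usually 25).
    """
    # get_comments returns an iterable; list() consumes it fully.
    return list(subreddit.get_comments(limit=limit))
```

With the objects from the answer's code, this would be called as fetch_recent_comments(r.get_subreddit('opensource')).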

However, this probably won't do what you want: get_comments (or more specifically /r/subreddit/comments) only offers a list of new comments or new gilded comments, not top comments. And since get_comments is also capped at 1000 comments, you'll have trouble building a full list of top comments.

So what you really want is the original algorithm: getting the list of top submissions and then the top comments of those. It's not a perfect system (a low-scoring post might actually have a highly voted comment), but it's the best option available.
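The caveat about low-scoring posts can be illustrated offline. The sketch below uses made-up stand-in records (Comment and Submission here are plain namedtuples, not PRAW classes, and all scores are invented) to show how a narrow submission limit can miss a highly voted comment:

```python
from collections import namedtuple

# Stand-in records for illustration only; these are not PRAW classes.
Comment = namedtuple("Comment", ["body", "score"])
Submission = namedtuple("Submission", ["title", "score", "comments"])


def top_comments_via_top_submissions(submissions, submission_limit, comment_limit):
    """Rank comments taken only from the highest-scoring submissions."""
    best_posts = sorted(submissions, key=lambda s: s.score, reverse=True)[:submission_limit]
    pooled = [c for post in best_posts for c in post.comments]
    return sorted(pooled, key=lambda c: c.score, reverse=True)[:comment_limit]


posts = [
    Submission("popular post", 900, [Comment("nice", 40), Comment("agreed", 15)]),
    Submission("decent post", 300, [Comment("useful link", 60)]),
    Submission("obscure post", 5, [Comment("hidden gem", 500)]),
]

# Looking at only the two best submissions misses the 500-point comment.
narrow = top_comments_via_top_submissions(posts, submission_limit=2, comment_limit=2)
# Widening the submission limit recovers it, at the cost of more requests.
wide = top_comments_via_top_submissions(posts, submission_limit=3, comment_limit=2)

print([c.body for c in narrow])  # ['useful link', 'nice']
print([c.body for c in wide])    # ['hidden gem', 'useful link']
```

Raising the submission limit narrows this gap but cannot close it, since the submission listing itself is capped.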

Here's some code:

import praw

r = praw.Reddit(user_agent='top_comment_test')
subreddit = r.get_subreddit('opensource')
# Raise the limit for a potentially more accurate set of top comments (it will take longer).
top = subreddit.get_top(params={'t': 'all'}, limit=25)
all_comments = []
for submission in top:
    submission_comments = praw.helpers.flatten_tree(submission.comments)
    # Skip non-comment objects such as MoreComments.
    real_comments = [comment for comment in submission_comments if isinstance(comment, praw.objects.Comment)]
    all_comments += real_comments

# Highest-scoring comments first.
all_comments.sort(key=lambda comment: comment.score, reverse=True)

top_comments = all_comments[:25]  # top 25 comments

print(top_comments)
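The code above targets the PRAW 3 API. As a hedged sketch of the same approach for PRAW 4 or later (an assumption; the original answer does not cover newer versions), the function below takes an already-authenticated praw.Reddit instance. The function name top_comments and its parameters are hypothetical; subreddit(), top(time_filter=...), comments.replace_more() and comments.list() are the PRAW 4+ calls it relies on.

```python
import heapq


def top_comments(reddit, subreddit_name, submission_limit=25, comment_limit=25):
    """Collect the highest-scoring comments from a subreddit's top submissions."""
    all_comments = []
    top_posts = reddit.subreddit(subreddit_name).top(time_filter="all", limit=submission_limit)
    for submission in top_posts:
        # limit=0 removes every MoreComments placeholder without fetching it.
        submission.comments.replace_more(limit=0)
        # list() flattens the comment tree into a flat list of comments.
        all_comments.extend(submission.comments.list())
    # nlargest avoids sorting the whole pool when only a few comments are needed.
    return heapq.nlargest(comment_limit, all_comments, key=lambda c: c.score)
```

The reddit argument would typically be created with praw.Reddit(client_id=..., client_secret=..., user_agent=...), since PRAW 4 and later require OAuth credentials. Passing the instance in keeps the function easy to exercise with stand-in objects.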
