Plot top X categories in R/ggplot2

This is very similar to the question here:

How to use ggplot to group and show top X categories?

Except in my case I don't have a discrete value to go on. I've got data about users posting messages to a user forum, with columns similar to:

Year, Month, Day, User, Message

I've got one entry for every message a person posted, and I want to plot the top 5 users per year by total messages posted. In the previous question there was a distinct list of values that could be keyed off of.

In my case, I'm curious whether I can do this easily in ggplot2, or whether I need to do something like:

  1. Load the data into a dataframe
  2. Construct a new dataframe containing the same data, collapsed and summarized by year
  3. Plot from the new dataframe using the same approach as the previous question

If this is the best way to do it, what's the "correct" way to do #2? That new dataframe should probably be of the form:

Year, User, Total number of Messages

Any help is appreciated.
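As a rough sketch of step 2, base R's aggregate() can collapse one-row-per-message data into per-year, per-user totals. The sample data and the column names year, poster, message, and total are assumptions made for illustration; the real dataframe may be laid out differently.

```r
# Hypothetical sample data: one row per posted message
posts <- data.frame(
  year    = c(2010, 2010, 2010, 2011, 2011),
  poster  = c("alice", "alice", "bob", "alice", "bob"),
  message = c("m1", "m2", "m3", "m4", "m5"),
  stringsAsFactors = FALSE
)

# Count the rows (messages) in each year/poster combination
totals <- aggregate(message ~ year + poster, data = posts, FUN = length)

# Rename the count column so it is not mistaken for message text
names(totals)[names(totals) == "message"] <- "total"

print(totals)
```

Unlike table(), this keeps only the year/user combinations that actually occur, so users with no posts in a given year do not show up as zero rows.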

Based on Joran's comment, I found this plyr approach:

library(plyr)
ddply(posts, .(year, poster), summarise, freq = length(year))

This gives me the number of posts per user per year. From there I can trim it down, as suggested in other posts, to get the top X posters per year.
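One possible way to do that trimming and plotting is sketched below. It assumes the summarized dataframe has the columns year, poster, and freq produced by the ddply call; the sample values and the helper top_n_per_year are hypothetical and only there to make the example self-contained.

```r
library(ggplot2)

# Hypothetical stand-in for the ddply() result: columns year, poster, freq
counts <- data.frame(
  year   = rep(c(2010, 2011), each = 4),
  poster = rep(c("alice", "bob", "carol", "dave"), times = 2),
  freq   = c(12, 7, 3, 9, 5, 14, 8, 2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the n rows with the highest freq within each year
top_n_per_year <- function(df, n = 5) {
  pieces <- lapply(split(df, df$year), function(d) {
    head(d[order(-d$freq), ], n)   # sort descending, keep first n rows
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

# n = 2 here only because the sample data is tiny; use n = 5 for the top 5
top_posters <- top_n_per_year(counts, n = 2)

# One panel per year, one bar per top poster
p <- ggplot(top_posters, aes(x = poster, y = freq)) +
  geom_col() +
  facet_wrap(~ year, scales = "free_x") +
  labs(x = "User", y = "Messages posted")
print(p)
```

Ties at the cutoff are resolved by row order here, so a user tied for the last place may be dropped; whether that matters depends on the data.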
