
Plotting by week with ggplot in R

I have the following data:

set.seed(123)
timeseq <- as.Date(Sys.time() + cumsum(runif(1000)*86400))
data <- rnorm(1000)
df <- data.frame(timeseq,data)

I would like to know how to aggregate data by week. What I am ultimately trying to do is plot a time series with ggplot, so it would be even better if I could skip the aggregation step and have ggplot handle it. I have been stuck on this all day.

Another way is to aggregate by week manually using dplyr.

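library(dplyr)
library(ggplot2)

# Label each row with the first day of its week (cut() starts weeks on Monday)
df$weeks <- cut(df[, "timeseq"], breaks = "week")
# Sum the values within each week
agg <- df %>% group_by(weeks) %>% summarise(agg = sum(data))
# weeks is a factor, so convert it back to Date for the x axis
ggplot(agg, aes(as.Date(weeks), agg)) + geom_point() + scale_x_date() +
    ylab("Aggregated by Week") + xlab("Week") + geom_line()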
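
To see what cut() does in the snippet above, here is a minimal base R sketch on four hand-picked dates (the vectors dates and values are made up for illustration). With breaks = "week", each date should be labelled with the Monday that starts its week, and the result is a factor rather than a Date vector, which is why the plot call wraps weeks in as.Date().

```r
# Toy input: 2024-01-01 is a Monday, 2024-01-07 is the Sunday ending that week
dates  <- as.Date(c("2024-01-01", "2024-01-03", "2024-01-07", "2024-01-08"))
values <- c(1, 2, 3, 4)

# cut() returns a factor whose levels are the week start dates
weeks <- cut(dates, breaks = "week")
week_levels <- levels(weeks)

# Sum the values within each week
weekly_sum <- tapply(values, weeks, sum)

print(week_levels)
print(weekly_sum)
```

The first three dates fall into the week starting 2024-01-01 and the last one into the week starting 2024-01-08.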


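You can also let ggplot do the summarising with stat_summary() and put the axis ticks at weekly intervals with scale_x_date(). Note that the breaks setting only controls where the tick marks fall; stat_summary() sums over each distinct x value, so with timeseq mapped to x as-is the totals are per day, not per week.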

library(ggplot2)

# fun replaces the deprecated fun.y (ggplot2 >= 3.3.0)
ggplot(df, aes(x = timeseq, y = data)) +
    stat_summary(fun = sum, geom = "line") +
    scale_x_date(date_labels = "%Y-%m-%d",
                 date_breaks = "1 week")
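
It may be worth checking which groups stat_summary() actually sums over. The sketch below is only an illustration on toy data (the names toy, day and value are made up here). It uses layer_data() to inspect what the layer computed, and suggests that the summary is taken per distinct x value, independently of where the axis breaks are placed.

```r
library(ggplot2)

# Toy data (hypothetical): two observations on one day, one on another
toy <- data.frame(
  day   = as.Date(c("2024-01-01", "2024-01-01", "2024-01-03")),
  value = c(1, 2, 4)
)

p <- ggplot(toy, aes(x = day, y = value)) +
  stat_summary(fun = sum, geom = "point")

# layer_data() returns the data the layer computed, i.e. after the summary
summed <- layer_data(p)
print(summed[, c("x", "y")])
```

If you want weekly totals from this approach, the x aesthetic itself has to be a weekly value, for example a date rounded down to the start of its week.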

To get the week we can use the lubridate library and its floor_date function, like so:

library(lubridate)
df$week <- floor_date(df$timeseq, "week")
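
One detail to be aware of: as far as I know, floor_date() starts weeks on Sunday by default (controlled by the lubridate.week.start option), whereas base R's cut(..., breaks = "week") starts them on Monday. A small sketch with one arbitrarily chosen date shows the difference; week_start is an argument of floor_date() in recent lubridate versions, and it is passed explicitly here so the result does not depend on your options.

```r
library(lubridate)

# 2024-01-03 is a Wednesday (date chosen only for illustration)
d <- as.Date("2024-01-03")

# week_start = 7: weeks begin on Sunday (lubridate's usual default)
sunday_start <- floor_date(d, "week", week_start = 7)
# week_start = 1: weeks begin on Monday, matching cut(..., breaks = "week")
monday_start <- floor_date(d, "week", week_start = 1)

print(sunday_start)
print(monday_start)
```

Which convention you pick matters mainly when you compare weekly totals produced by different methods.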

We can then plot the data with ggplot using a stat summary (there might be a better way?), like this:

library(ggplot2)

# Helper that summarises y at each x value; fun replaces the deprecated fun.y
stat_sum_single <- function(fun, geom="point", ...) {
  stat_summary(fun=fun, colour="red", geom=geom, size = 3, ...)
}

ggplot(df, aes(x=floor_date(timeseq, "week"), y=data)) + 
  stat_sum_single(sum, geom="line") + 
  xlab("week")

which plots the weekly sums as a red line.

I want to expand upon @chappers' idea of using the lubridate package, but in a fully piped way.

library(dplyr)
library(ggplot2)
library(lubridate)
set.seed(123)
data.frame(
  timeseq = as.Date(Sys.time() + cumsum(runif(1000) * 86400)),
  data = rnorm(1000)
) %>%
  mutate(timeseq = floor_date(timeseq, unit = "week")) %>%
  group_by(timeseq) %>%
  summarise(data = sum(data)) %>%
  ggplot() +
  geom_line(aes(x = timeseq, y = data))

If you already have the data stored as an object, substitute df for the data.frame(...) call.


 