
Subset boxplot by date

I want to make a boxplot from a time series, with the data grouped into 10-day categories:

set.seed(100)
date <- seq.Date(as.Date("2013-01-01"), as.Date("2014-12-31"), "days")
x <- as.integer(abs(rnorm(365))*1000)
df <- data.frame(date, x)

library(ggplot2)    
ggplot(df) +
      geom_boxplot(aes(y=x,
                       x=reorder(format(df$date,'10 days'),df$date),
                       fill=format(df$date,'%Y'), 
                       group=cut(df$date, "10 days"))) +
      xlab('10 Days') + guides(fill=guide_legend(title="Year")) +
      theme_bw()

But I got a result like this:

[image]

I don't know why I get NA here, and the x axis does not show date labels such as 1-10 Jan, 11-20 Jan, etc.

Is there something wrong with my script?

You can use scale_x_date.

library(ggplot2)

ggplot(df) +
  geom_boxplot(aes(y=x,
                   x=date,
                   fill=format(date,'%Y'))) +
  xlab('Monthly data') + guides(fill=guide_legend(title="Year")) +
  theme_bw() +
  scale_x_date(date_breaks = '1 month') +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

[image]

I chose 1 month as the break interval for better readability, but you can use "10 days" if you prefer.
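For illustration, here is a minimal sketch of how 10-day bins could be combined with a date axis. It assumes a single year of data so that no bin crosses a year boundary. The column name bin_start and the label format "%d %b" are choices made here, not part of the original answer.

```r
library(ggplot2)

set.seed(100)
# One year of daily data; bin_start and p are illustrative names
date <- seq.Date(as.Date("2013-01-01"), as.Date("2013-12-31"), by = "day")
df <- data.frame(date = date, x = as.integer(abs(rnorm(length(date))) * 1000))

# cut() returns a factor labelled by the first day of each 10-day bin;
# convert it back to Date so that scale_x_date() can be applied
df$bin_start <- as.Date(as.character(cut(df$date, "10 days")))

p <- ggplot(df, aes(x = bin_start, y = x, group = bin_start)) +
  geom_boxplot() +
  scale_x_date(date_breaks = "1 month", date_labels = "%d %b") +
  labs(x = "Start of 10-day bin", y = "x") +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
p
```

Grouping by bin_start gives one box per 10-day bin, while the axis breaks stay monthly so the labels remain legible.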

Here is an attempt to help you track down the problem. It is easier to diagnose if you generate the variables you need before calling ggplot:

set.seed(100)
date <- seq.Date(as.Date("2013-01-01"), as.Date("2014-12-31"), "days")
x <- as.integer(abs(rnorm(365))*1000)
df <- data.frame(date, x)
library(tidyverse)

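# Build the plotting variables up front so they can be inspected
df1 <- df %>%
  mutate(
    x1 = reorder(format(date, '10 days'), date),  # no % codes in the format string
    fill = format(date, '%Y'),                    # year as a character string
    group = cut(date, "10 days")                  # factor of 10-day bins
  )

df1 %>%
  ggplot(aes(y = x, x = date, fill = fill, group = group)) +
  geom_boxplot()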

Then, if you inspect df1, you will find that for group == 2013-12-27, fill takes two values: 2013 and 2014. That bin runs from 2013-12-27 to 2014-01-05, so it straddles the year boundary, which is why you get an NA group in addition to 2013 and 2014. The solution depends on how you want to assign a value to this group, or on choosing a different way to group. A crude quick fix is:

df1$fill = ifelse(as.character(df1$group) == "2013-12-27", "2013", df1$fill)
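To confirm which bins mix years, a small check along the following lines may help. It is only a sketch in base R; years_per_bin and mixed are names introduced here for illustration.

```r
# Rebuild only the columns needed for the check
date <- seq.Date(as.Date("2013-01-01"), as.Date("2014-12-31"), by = "day")
df1 <- data.frame(date = date)
df1$fill <- format(df1$date, "%Y")
df1$group <- cut(df1$date, "10 days")

# Count the distinct years that fall inside each 10-day bin
years_per_bin <- tapply(df1$fill, df1$group, function(v) length(unique(v)))
mixed <- names(years_per_bin)[years_per_bin > 1]
print(mixed)  # "2013-12-27"

# After the quick fix, every bin should carry a single fill value
df1$fill <- ifelse(as.character(df1$group) == "2013-12-27", "2013", df1$fill)
years_after <- tapply(df1$fill, df1$group, function(v) length(unique(v)))
print(all(years_after == 1))  # TRUE
```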

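You can also check that the x variable generated by x = reorder(format(df$date, '10 days'), df$date) (x1 in my code) contains only one value, 10 days, because the format string has no % conversion codes and is returned literally for every row.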

[image]
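One possible way to get labels such as "1-10 Jan" is to bin by day of month rather than with cut(). The sketch below is an assumption about what the asker wants, not the original answer's method. The names dekad, period and year are introduced here, and the third bin of each month ("21-end") covers 8 to 11 days depending on the month.

```r
library(ggplot2)

set.seed(100)
date <- seq.Date(as.Date("2013-01-01"), as.Date("2014-12-31"), by = "day")
df <- data.frame(date = date, x = as.integer(abs(rnorm(length(date))) * 1000))

# A format string without % codes is returned as-is for every row
print(unique(format(df$date, "10 days")))  # "10 days"

# Map day of month to a period index: 1 (days 1-10), 2 (11-20), 3 (21-end)
day <- as.integer(format(df$date, "%d"))
dekad <- pmin((day - 1L) %/% 10L + 1L, 3L)
month_num <- as.integer(format(df$date, "%m"))

df$year <- format(df$date, "%Y")
# month.abb is used instead of "%b" to keep labels locale-independent
labels <- paste(c("1-10", "11-20", "21-end")[dekad], month.abb[month_num])
# Order the factor levels chronologically within the year
df$period <- factor(labels, levels = unique(labels[order(month_num, dekad)]))

p <- ggplot(df, aes(x = period, y = x, fill = year)) +
  geom_boxplot() +
  labs(x = "10-day period", fill = "Year") +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
p
```

Because each period belongs to exactly one calendar year, no box mixes two fill values, so the NA legend entry should not appear with this grouping.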
