How to plot month-day dates on ggplot instead of day of year in R

I need to create a plot that shows the range between the earliest and the latest date for two groups. The dates come from different years, but I am only interested in the month-day part (e.g. Feb-04), regardless of year. I am able to do that when expressing month-day as Julian days (day of year), but I'd like to do it in month-day format (e.g. Feb-04).

This is the code and output I obtained when working with Julian dates (day of year):

library(dplyr)
library(ggplot2)

data.1 <-read.csv(text = "
trt,full_date
A,10/06/2020
A,09/19/2017
A,10/28/2014
A,09/02/2016
A,09/19/2017
A,09/26/2017
B,08/24/2020
B,09/24/2020
B,10/16/2018
B,09/16/2018
B,09/15/2016
B,09/09/2018
")

#day of year option
data.2 <- data.1 %>%
  mutate(full_date = as.Date(full_date, format = "%m/%d/%Y"),
         full_date.doy = as.numeric(strftime(full_date, format = "%j"))) %>%
  group_by(trt) %>%
  summarise(earliest.doy = min(full_date.doy),
            latest.doy = max(full_date.doy))
                                  
ggplot(data.2) +
  geom_segment( aes(x=trt, xend=trt, y=earliest.doy, yend=latest.doy), color="grey") +
  geom_point( aes(x=trt, y=earliest.doy), color=rgb(0.2,0.7,0.1,0.5), size=3 ) +
  geom_point( aes(x=trt, y=latest.doy), color=rgb(0.7,0.2,0.1,0.5), size=3 ) +
  coord_flip() +
  ylab("Day of the year")

Output:

[image]

What I would like to have is this (dates on the x axis are approximate):

[image]

The first problem I ran into was the calculation of the earliest and latest dates. For trt="A", the earliest and latest dates are wrong.

[image]

The issue is that date_mm.dd seems to be in character format, and I can't find a way to convert it to a date. As a result, the plot is wrong:

[image]
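A possible explanation for the wrong earliest and latest values, sketched under an assumption: the code that built date_mm.dd is not shown, so this supposes it held text such as "Oct-28". Character values are compared alphabetically, so every "Oct-..." label sorts before every "Sep-..." label. The name `labels` is made up for this illustration.

```r
# trt A dates from the question
dates <- as.Date(c("10/06/2020", "09/19/2017", "10/28/2014",
                   "09/02/2016", "09/19/2017", "09/26/2017"),
                 format = "%m/%d/%Y")

# Assumed label format "Mon-dd"; month.abb avoids locale-dependent month names
labels <- paste(month.abb[as.integer(format(dates, "%m"))],
                format(dates, "%d"), sep = "-")

# min() and max() on character vectors compare text, not calendar order
min(labels)  # "Oct-06", although Sep-02 is the earliest month-day
max(labels)  # "Sep-26", although Oct-28 is the latest month-day
```

This is why the comparison needs a numeric or Date column, with the month-day text used only for labelling.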

Any hint would be really appreciated.

One way to address this could be to take your doy variables and make them into dates in an arbitrary year like 2022. Here, day one will be one day after 2021-12-31, i.e. Jan 1 2022.

(2022 is not a leap year, so dates that fall after Feb 28 in a leap year will be shown one day later. That is, Feb 29, when it occurs, is the 60th day of the year, but in most years, like 2022, March 1 is the 60th day, so it would show up there. Depending on the context, you could potentially adjust for that.)
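As a quick check of the leap-year shift just described, and one possible way to adjust for it, here is a minimal sketch. The adjustment (keeping month and day and swapping in a fixed leap year, 2020) is an assumption for illustration, not part of the approach above. The names `d_leap`, `doy`, `shifted` and `same_md` are hypothetical.

```r
# A date from a leap year (2016) taken from the question's data
d_leap <- as.Date("2016-09-02")

# Day of year counts Feb 29, so Sep 2 is day 246 in 2016
doy <- as.numeric(format(d_leap, "%j"))

# Mapping day 246 onto non-leap 2022 lands one day late
shifted <- as.Date("2021-12-31") + doy   # 2022-09-03

# Possible adjustment (assumption): keep month and day, replace only the year.
# Using a leap year as the reference keeps Feb 29 a valid date.
same_md <- as.Date(format(d_leap, "2020-%m-%d"))   # 2020-09-02
```

With this variant, min() and max() would be taken on the re-dated column instead of on the day-of-year number.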

# Add each day-of-year count to 2021-12-31 to get a date in 2022
data.2 %>%
  mutate(across(contains("doy"), ~as.Date("2021-12-31") + .x))

This is a shortcut to ask dplyr to apply the same function to any column whose name contains the string "doy". We could equivalently use:

data.2 %>%
  mutate(earliest.doy = as.Date("2021-12-31") + earliest.doy) %>%
  mutate(latest.doy   = as.Date("2021-12-31") + latest.doy)

Result

# A tibble: 2 × 3
  trt   earliest.doy latest.doy
  <chr> <date>       <date>    
1 A     2022-09-03   2022-10-28
2 B     2022-08-25   2022-10-16

Then you could feed that into your existing code:

... %>%
  ggplot()+
  geom_segment( aes(x=trt, xend=trt, y=earliest.doy, yend=latest.doy), color="grey") +
  geom_point( aes(x=trt, y=earliest.doy), color=rgb(0.2,0.7,0.1,0.5), size=3 ) +
  geom_point( aes(x=trt, y=latest.doy), color=rgb(0.7,0.2,0.1,0.5), size=3 ) +
  coord_flip() +
  ylab("Day of the year")

[image]
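If the axis should read as month-day (e.g. Sep-03) rather than ggplot2's default date labels, a date scale with a custom label format might help. The sketch below is an illustration and not part of the answer's code: it rebuilds the result table by hand in a data frame named `plot_data` (a hypothetical name), and the two-week break spacing and axis title are arbitrary choices. Month abbreviations from "%b" follow the session locale.

```r
library(ggplot2)

# Values copied from the result table above
plot_data <- data.frame(
  trt          = c("A", "B"),
  earliest.doy = as.Date(c("2022-09-03", "2022-08-25")),
  latest.doy   = as.Date(c("2022-10-28", "2022-10-16"))
)

p <- ggplot(plot_data) +
  geom_segment(aes(x = trt, xend = trt, y = earliest.doy, yend = latest.doy),
               color = "grey") +
  geom_point(aes(x = trt, y = earliest.doy), color = rgb(0.2, 0.7, 0.1, 0.5), size = 3) +
  geom_point(aes(x = trt, y = latest.doy), color = rgb(0.7, 0.2, 0.1, 0.5), size = 3) +
  coord_flip() +
  # Label the date axis as month-day and drop the year from view
  scale_y_date(date_labels = "%b-%d", date_breaks = "2 weeks") +
  ylab("Date (month-day)")

p
```

Because the columns are real Date values, the ordering stays correct and only the labels are formatted as text.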
