[英]Presence in consecutive day by year in daily time-series data in R
I have daily time-series data for 60 years about the presence and absence of rainfall for 400 stations.我有 60 年来关于 400 个站点有无降雨的每日时间序列数据。 The data is in the following format where, in the second column, 1 indicates presence and 0 indicate absence:数据采用以下格式,其中,在第二列中,1 表示存在,0 表示不存在:
Date Rainfall
---------------------
1981-01-01 0
1981-01-02 0
1981-01-03 0
1981-01-04 1
1981-01-05 0
1981-01-06 1
1981-01-07 1
1981-01-08 1
1981-01-09 0
1981-01-10 0
1981-01-11 1
1981-01-12 1
1981-01-13 1
1981-01-14 1
1981-01-15 1
1981-01-16 0
.......... .
Now I have to calculate the number of consecutive wet days for each year when at least 3 consecutive days received rainfall and the longest consecutive days of rainfall in a year.现在我必须计算每年至少连续 3 天降雨的连续潮湿天数和一年中最长的连续降雨天数。 If 3 or more than 3 consecutive days (any number) received rainfall I will consider it as a single event.如果连续 3 天或超过 3 天(任意数量)下雨,我会将其视为单一事件。
My output will be like this我的输出将是这样的
Year No of consecutive wet-days longest consecutive wet-days
1981 2 5
.
.
How can we do this in R?我们如何在 R 中做到这一点? If I can solve for a station, I can iterate for all stations in R.如果我可以解决一个站,我可以迭代 R 中的所有站。
Thank you in advance for your help :)预先感谢您的帮助 :)
Another possible solution (I thank @DarrenTsai for his comments, which have improved this solution):另一个可能的解决方案(我感谢@DarrenTsai 的评论,他改进了这个解决方案):
library(tidyverse)
library(lubridate)
df %>%
group_by(Year = year(ymd(Date))) %>%
mutate(x = list(rle(Rainfall))) %>%
summarise(ncons = sum(x[[1]]$lengths >= 3 & x[[1]]$values == 1),
longest = ifelse(sum(x[[1]]$values == 1) == 0, 0,
max(x[[1]]$lengths[x[[1]]$values == 1])))
#> # A tibble: 2 × 3
#> Year ncons longest
#> <dbl> <int> <int>
#> 1 1981 2 5
#> 2 1982 2 4
You can create event with rle
.您可以使用rle
创建事件。
library(dplyr)
df <- df %>%
mutate(event = with(rle(Rainfall), rep(seq_along(lengths), lengths)))
Date Rainfall event
1 1981-01-01 0 1
2 1981-01-02 0 1
3 1981-01-03 0 1
4 1981-01-04 1 2
5 1981-01-05 0 3
6 1981-01-06 1 4
7 1981-01-07 1 4
8 1981-01-08 1 4
9 1981-01-09 0 5
10 1981-01-10 0 5
11 1981-01-11 1 6
12 1981-01-12 1 6
13 1981-01-13 1 6
14 1981-01-14 1 6
15 1981-01-15 1 6
16 1981-01-16 0 7
With that you can tally up the number of consecutive days for each rainfall event.这样,您就可以计算每次降雨事件的连续天数。
df %>% filter(Rainfall == 1) %>% group_by(event) %>% tally()
# A tibble: 3 x 2
event n
<int> <int>
1 2 1
2 4 3
3 6 5
Further wrangling with Year and counting the tallied up rain event will give you your expected summary.与 Year 进一步争论并计算统计的降雨事件将为您提供预期的总结。
Another possible solution, Though I think already better solutions have been suggested另一种可能的解决方案,虽然我认为已经提出了更好的解决方案
library(lubridate)
library(dplyr)
library(tibble)
rainfall_data <- tibble::tribble(
~ date, ~rainfall,
"1981-01-01", 0,
"1981-01-02", 0,
"1981-01-03", 0,
"1981-01-04", 1,
"1981-01-05", 0,
"1981-01-06", 1,
"1981-01-07", 1,
"1981-01-08", 1,
"1981-01-09", 0,
"1981-01-10", 0,
"1981-01-11", 1,
"1981-01-12", 1,
"1981-01-13", 1,
"1981-01-14", 1,
"1981-01-15", 1,
"1981-01-16", 0
)
rainfall_data %>%
mutate(
csum = ave(rainfall, cumsum(rainfall == 0), FUN = cumsum),
event = ave(csum, cumsum(rainfall == 0), FUN = max)
) %>%
filter(event >= 3) %>%
distinct(event, .keep_all = TRUE) %>%
group_by(year = year(ymd(date))) %>%
summarise(
No_of_consecutive_wet_days = n(),
longest_consecutive_wet_days = max(event)
)
#> # A tibble: 1 × 3
#> year No_of_consecutive_wet_days longest_consecutive_wet_days
#> <dbl> <int> <dbl>
#> 1 1981 2 5
Created on 2022-07-06 by the reprex package (v2.0.1)由 reprex 包 (v2.0.1) 创建于 2022-07-06
A simple solution using only {base} R functions, particularly diff
and tapply
.仅使用 {base} R 函数的简单解决方案,尤其是diff
和tapply
。 The summary statistics pertain to events with a start date in that year.汇总统计信息与开始日期在该年的事件有关。
date <- seq(as.Date("2000/1/1"), as.Date("2010/1/1"), "days")
rainfall <- sample(c(0,1),length(date), replace = T)
df <- data.frame(date,rainfall)
events <- df$date[which(c(1,diff(df$rainfall)) > 0)]
event.lengths <- tapply(df$rainfall, cumsum(c(1, diff(df$rainfall) > 0 )), sum)
df.events <- data.frame(events,event.lengths)
Total_rain_days <- tapply(df.events$event.lengths, format(df.events$events, format = "%Y"),sum)
Max_consecutive_rain_days <- tapply(df.events$event.lengths, format(df.events$events, format = "%Y"),max)
Year <- names(Total_rain_days)
output <- data.frame(Year, Total_rain_days, Max_consecutive_rain_days)
output
> output
Year Total_rain_days Max_consecutive_rain_days
2000 2000 192 8
2001 2001 175 9
2002 2002 196 13
2003 2003 193 9
2004 2004 164 11
2005 2005 183 7
2006 2006 196 10
2007 2007 176 7
2008 2008 179 7
2009 2009 178 9
2010 2010 1 1
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.