Running percentile value for each calendar day from multi-year data in R

I need to calculate the 30-day running (window) 90th percentile of maximum temperature for each calendar day from multi-year data. For example, to calculate the 90th percentile value for January 1, I have to take a 30-day window centered on January 1, i.e., data from December 16 to January 15 for all 42 years. So I would have 1260 (30*42) data points for each day. I need the value for all 366 days. I have 42 years of daily data from 1980 to 2022, which look like this:

date        tmax  tmin
1981-01-01  19.2   5.4
1981-01-02  18.2   5
1981-01-03  16.1   3.8
1981-01-04  17.2   4.4
1981-01-05  15.7   2.4
1981-01-06  15.6   5.4
1981-01-07  11.2   4.1
1981-01-08  14.8  -1
1981-01-09  15     0.8
1981-01-10  16.2  -0.4

.........................
.........................
.........................
2022-12-25  17.4   4.4
2022-12-26  16.5   4.1
2022-12-27  17     5.4
2022-12-28  15.2   3.6
2022-12-29   8.1   7.7
2022-12-30  13.5   6
2022-12-31  14.8   4.5

How can I do this in R? Initially, I thought it would be as simple as this:

temp_data <- read.csv("temperature.csv")

#as the date and tmax data are being read as characters by R
temp_data$tmax <- as.numeric(temp_data$tmax)
temp_data$date <- as.Date(temp_data$date, "%Y-%m-%d")
#Create a day of year variable for the day of the year
temp_data$doy <- as.numeric(format(temp_data$date,"%j"))

#load libraries
library(dplyr)
library(zoo)

temp_data_90th <- temp_data %>% 
  group_by(doy) %>% 
  summarize(rolling_90th = rollapply(tmax, width = 30, FUN = quantile, prob = 0.9, align = "center", na.rm=T))

But I don't think it gave the correct result, since temp_data_90th has 4,470 rows, with 13 values for each day of the year.

Could you please suggest where I am going wrong? Thank you in advance for your support.

To illustrate this we need reproducible data, so use the DF shown in the Note at the end.

Calculate yday, the zero-based day of the year for each row of DF. Then, for each possible yday (0:365), take the value from all rows whose yday lies between 15 days before and 14 days after that yday, modulo 366, and apply quantile to those values, giving q90.

No packages are used.

yday <- as.POSIXlt(DF$date)$yday
q90 <- sapply(0:365, function(x) 
  quantile(DF$value[yday %in% (seq(x-15, x+14) %% 366)], prob = 0.9, na.rm = TRUE))
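To make the wrap-around concrete, here is a minimal sketch that inspects the window for January 1 (yday 0) on the Note's DF. The names `x`, `window_days`, and `in_window` are introduced here only for illustration. Note that yday 365 exists only in leap years, so a window that contains it picks up one row fewer in every non-leap year.

```r
# Reproducible data from the Note: one row per day, 2000-01-01 to 2022-12-31.
d <- seq(as.Date("2000-01-01"), as.Date("2022-12-31"), "day")
DF <- data.frame(date = d, value = seq_along(d))

# Zero-based day of year: 0 is Jan 1; 365 occurs only on Dec 31 of leap years.
yday <- as.POSIXlt(DF$date)$yday

# Window for Jan 1 (x = 0): 15 days back and 14 days forward, wrapped modulo 366.
x <- 0
window_days <- seq(x - 15, x + 14) %% 366
print(window_days)

# Rows from every year whose day of year falls inside the window.
in_window <- yday %in% window_days

# Calendar days (month-day) covered by the window across all years.
print(sort(unique(format(DF$date[in_window], "%m-%d"))))

# Number of data points that feed the quantile for this day.
print(sum(in_window))
```

The same inspection works for any other `x` in 0:365, which is a quick way to confirm that the window covers the days you expect before computing the percentiles.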

With rollapply it is slightly shorter. Using yday from above, we have:

library(zoo)
q90 <- rollapply(seq(-15, 365 + 14) %% 366, 30, function(x)
  quantile(DF$value[yday %in% x], prob = 0.9, na.rm = TRUE))
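As a sanity check, the following sketch runs both versions on the Note's DF and compares them. It assumes the zoo package is installed; the names `q90_base` and `q90_roll` are used here only to keep the two results apart. The rolling version slides a width-30 window over 395 wrapped day numbers, which yields 395 - 30 + 1 = 366 windows, one per yday.

```r
library(zoo)

# Reproducible data from the Note.
d <- seq(as.Date("2000-01-01"), as.Date("2022-12-31"), "day")
DF <- data.frame(date = d, value = seq_along(d))
yday <- as.POSIXlt(DF$date)$yday

# Base R version: one window of day numbers per yday.
q90_base <- sapply(0:365, function(x)
  quantile(DF$value[yday %in% (seq(x - 15, x + 14) %% 366)],
           probs = 0.9, na.rm = TRUE))

# rollapply version: slide a width-30 window over the wrapped day numbers.
q90_roll <- rollapply(seq(-15, 365 + 14) %% 366, 30, function(x)
  quantile(DF$value[yday %in% x], probs = 0.9, na.rm = TRUE))

# Both should have one value per yday and agree numerically.
print(length(q90_base))
print(length(q90_roll))
print(all.equal(as.numeric(q90_base), as.numeric(q90_roll)))
```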

Note

d <- seq(as.Date("2000-01-01"), as.Date("2022-12-31"), "day")
DF <- data.frame(date = d, value = seq_along(d))
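To apply the same idea to the data in the question, a sketch along the following lines may help. It assumes a data frame named temp_data with date and tmax columns, as in the question. The values below are synthetic stand-ins generated with rnorm, not real temperatures, and the names `result` and `tmax_q90` are hypothetical.

```r
# Synthetic stand-in for temperature.csv (hypothetical values, for illustration only).
set.seed(1)
dates <- seq(as.Date("1981-01-01"), as.Date("2022-12-31"), "day")
temp_data <- data.frame(date = dates,
                        tmax = rnorm(length(dates), mean = 15, sd = 5))

# Zero-based day of year for every row.
yday <- as.POSIXlt(temp_data$date)$yday

# 90th percentile of tmax over the 30-day window around each day of year,
# pooling all years.
q90 <- sapply(0:365, function(x)
  quantile(temp_data$tmax[yday %in% (seq(x - 15, x + 14) %% 366)],
           probs = 0.9, na.rm = TRUE))

# One row per day of year, numbered 1 to 366 like format(date, "%j").
result <- data.frame(doy = 1:366, tmax_q90 = unname(q90))
print(head(result))
```

Unlike grouping by doy and rolling within each group, this pools the neighbouring days from every year before taking the quantile, so the result has exactly one value per day of year.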
