How to count the days after rain events in R

I have a data frame 'test' like the one below:

day                     Rain    SWC_11    SWC_12    SWC_13    SWC_14    SWC_21
01/01/2019  00:00:00     0.2      51        60        63        60        64
02/01/2019  00:00:00     0.2      51.5      60.3      63.4      60.8      64.4
03/01/2019  00:00:00     0.0      51.3      60.3      63.3      60.6      64.1
04/01/2019  00:00:00     0.4      51.5      60.3      63.4      60.8      64.4
15/01/2019  00:00:00     0.0      NA        NA        NA        NA        NA
16/01/2019  00:00:00     0.0      NA        NA        NA        NA        NA
17/01/2019  00:00:00     0.0      51.5      60.3      63.4      60.8      64.4

Now I want to count the days after each rain event; when the next rain event occurs, the count restarts. My ideal output would look like this:

day                     Rain    SWC_11    SWC_12    SWC_13    SWC_14    SWC_21    events
01/01/2019  00:00:00     0.2      51        60        63        60        64        1
02/01/2019  00:00:00     0.2      51.5      60.3      63.4      60.8      64.4      1
03/01/2019  00:00:00     0.0      51.3      60.3      63.3      60.6      64.1      2
04/01/2019  00:00:00     0.4      51.5      60.3      63.4      60.8      64.4      1
15/01/2019  00:00:00     0.0      NA        NA        NA        NA        NA        12
16/01/2019  00:00:00     0.0      NA        NA        NA        NA        NA        13
17/01/2019  00:00:00     0.0      51.5      60.3      63.4      60.8      64.4      14

My code is:

test$day<- as.numeric(as.Date(test$day))
for(i in 1:(nrow(test)-1))
if (test$Rain[[i]] != 0){
  test$event[i] <- 1
  test$event[i+nrow(test)] <-test$day[i+nrow(test)]- test$day[i] +1
 }else{ 
test$event <-0
}

But the results look weird, and I get the error message below:

Error in `$<-.data.frame`(`*tmp*`, "event", value = c(0, 1, 0, 0, 0, 0,  : 
replacement has 12 rows, data has 10

I hope someone can help.

Instead of the rle-based approach I suggested earlier, I think cumulative-sum logic can be used here.

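library(dplyr)

dat %>%
  mutate(
    # parse the dd/mm/yyyy strings into Date objects
    day = as.Date(day, format="%d/%m/%Y"),
    # date of the previous row (the day before the first date for row 1)
    daylag = dplyr::lag(day, default = first(day) - 1)
  ) %>%
  # every rainy day starts a new group
  group_by(grp = cumsum(Rain > 0)) %>%
  # days elapsed since the row just before the group's first row
  mutate(event = day - daylag[1]) %>%
  ungroup()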
# # A tibble: 7 x 11
#   day         Rain SWC_11 SWC_12 SWC_13 SWC_14 SWC_21 events daylag       grp event  
#   <date>     <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <int> <date>     <int> <drtn> 
# 1 2019-01-01   0.2   51     60     63     60     64        1 2018-12-31     1  1 days
# 2 2019-01-02   0.2   51.5   60.3   63.4   60.8   64.4      1 2019-01-01     2  1 days
# 3 2019-01-03   0     51.3   60.3   63.3   60.6   64.1      2 2019-01-02     2  2 days
# 4 2019-01-04   0.4   51.5   60.3   63.4   60.8   64.4      1 2019-01-03     3  1 days
# 5 2019-01-15   0     NA     NA     NA     NA     NA       12 2019-01-04     3 12 days
# 6 2019-01-16   0     NA     NA     NA     NA     NA       13 2019-01-15     3 13 days
# 7 2019-01-17   0     51.5   60.3   63.4   60.8   64.4     14 2019-01-16     3 14 days
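
To see why the grouping works, here is a minimal sketch of the cumsum step on its own. The Rain vector is copied from the sample data; the variable name grp simply mirrors the column created in the pipeline above. Rain > 0 is TRUE on rainy days, so the running sum goes up by one on each rainy day and stays flat on dry days, which gives every rain event and the dry days after it the same group number.

```r
# Rain values from the sample data
Rain <- c(0.2, 0.2, 0.0, 0.4, 0.0, 0.0, 0.0)

# TRUE counts as 1 and FALSE as 0, so the running total
# increases only on rainy days
grp <- cumsum(Rain > 0)
grp
# [1] 1 2 2 3 3 3 3
```

These values match the grp column in the output above.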

Data:

dat <- read.table(header = TRUE, text = "
  day           Rain      SWC_11    SWC_12    SWC_13    SWC_14   SWC_21   events
01/01/2019     0.2        51          60      63         60        64       1
02/01/2019     0.2        51.5      60.3      63.4     60.8      64.4       1
03/01/2019     0.0        51.3      60.3      63.3     60.6      64.1       2
04/01/2019     0.4        51.5      60.3      63.4     60.8      64.4       1
15/01/2019     0.0        NA        NA        NA       NA        NA         12
16/01/2019     0.0        NA        NA        NA       NA        NA         13
17/01/2019     0.0        51.5      60.3      63.4     60.8      64.4       14")

I have a solution that uses only base R, but it is not as short as the one above.

# Imagine you had to do this by hand: how would you go about it step by step?
# Then use base R to carry out each of those steps.

# create the "events" column and a parsed date column "day2"
test$events <- NA_real_
test$day2 <- as.Date(test$day, format = "%d/%m/%Y")

# first row: 1 if it rains on the first day, otherwise 0
test$events[1] <- if (test$Rain[1] != 0) 1 else 0

# then, starting from the 2nd row, go down one row at a time
# and fill in "events" according to your criteria
for (i in 2:nrow(test)) {
  if (test$Rain[i] != 0) {
    # rain today (whether or not it rained on the previous row): restart at 1
    test$events[i] <- 1
  } else {
    # no rain today: previous count plus the number of days elapsed.
    # day2 is already a Date, so the two dates can be subtracted directly
    # (re-parsing it with format = "%d/%m/%Y" would return NA).
    test$events[i] <- test$events[i - 1] +
      as.numeric(test$day2[i] - test$day2[i - 1])
  }
}

Now you will have the desired results. My solution is not very clever, but this is how I approach a problem when the code seems difficult.
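
For comparison, the same counting can probably be done in base R without a loop by reusing the cumsum(Rain > 0) grouping idea from the other answer. This is only a sketch under a few assumptions: the small test data frame is rebuilt here from the sample data (dates without the time part), and the helper names day2, grp and start are made up for illustration. One difference from the loop: dry days before the first rain event are counted from the first row starting at 1 rather than 0.

```r
# Sample data (only the columns needed for the count)
test <- data.frame(
  day  = c("01/01/2019", "02/01/2019", "03/01/2019", "04/01/2019",
           "15/01/2019", "16/01/2019", "17/01/2019"),
  Rain = c(0.2, 0.2, 0.0, 0.4, 0.0, 0.0, 0.0)
)

day2 <- as.Date(test$day, format = "%d/%m/%Y")

# each rainy day opens a new group
grp <- cumsum(test$Rain > 0)

# date of the first row in each group, repeated along the group
start <- ave(as.numeric(day2), grp, FUN = function(x) x[1])

# days since the start of the group, counting the start day as 1
test$events <- as.numeric(day2) - start + 1
test$events
# [1]  1  1  2  1 12 13 14
```

ave() applies the function within each group and returns a vector of the same length as its input, which is what lets the subtraction happen in one vectorised step.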
