Count consecutive DRY values, restarting the count at each year-month

I have a data frame:

dat <- read.table(text = "
YEAR  MONTH DAY PCP  SPELL 
1950   12   28   0    DRY    
1950   12   29  11.7  WET
1950   12   30   0    DRY
1950   12   31   0    DRY
1951   01   01   0    DRY
1951   01   02   0    DRY
1951   01   03  20.3  WET
", header = TRUE)

I create groups by year and month:

library(tidyverse)

groups <- dat %>% group_by(YEAR , MONTH) %>% summarise(NUM = n())

groups$ID <- seq_len(nrow(groups))

dat %>% left_join(groups, by = c("YEAR", "MONTH"))

and apply this script:

dfx <- data.frame(dat, svalue = NA)

dfx$svalue[1] <- ifelse(dfx$SPELL[1] == "DRY", 1, 0)

for (i in 2:nrow(dfx))
  dfx$svalue[i] <- ifelse(dfx$SPELL[i] == "DRY", dfx$svalue[i - 1] + 1, 0)

Then I obtain:

YEAR  MONTH DAY PCP  SPELL svalue
1950   12   28   0    DRY    1
1950   12   29  11.7  WET    0
1950   12   30   0    DRY    1
1950   12   31   0    DRY    2
1951   01   01   0    DRY    3 
1951   01   02   0    DRY    4
1951   01   03  20.3  WET    0

How can I restart the count at the start of each year-month?
I need to obtain this:

YEAR  MONTH DAY PCP  SPELL svalue
1950   12   28   0    DRY    1
1950   12   29  11.7  WET    0
1950   12   30   0    DRY    1
1950   12   31   0    DRY    2
1951   01   01   0    DRY    1 
1951   01   02   0    DRY    2
1951   01   03  20.3  WET    0

Alternatively, how can I apply dw.spell from the RMRAINGEN package with the same year-month separation?

Thanks.

Based on the expected output, 'svalue' can be recomputed by grouping on YEAR, MONTH, and a run identifier built from a logical vector on 'svalue' (cumsum(svalue == 1) increments each time a dry run starts), then numbering the rows within each group:

library(data.table)
setDT(dfx)[svalue != 0, svalue := seq_len(.N), .(cumsum(svalue == 1), YEAR, MONTH)]
dfx
#   YEAR MONTH DAY  PCP SPELL svalue
#1: 1950    12  28  0.0   DRY      1
#2: 1950    12  29 11.7   WET      0
#3: 1950    12  30  0.0   DRY      1
#4: 1950    12  31  0.0   DRY      2
#5: 1951     1   1  0.0   DRY      1
#6: 1951     1   2  0.0   DRY      2
#7: 1951     1   3 20.3   WET      0
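To see why this grouping works, it may help to look at the key itself. The following is only an illustrative sketch: the columns run and new_svalue are names introduced here, not part of the original answer, and the data frame is rebuilt inline (DAY and PCP omitted) so the snippet runs on its own.

```r
library(data.table)

# Same rows as the 'data' section; DAY and PCP are omitted for brevity
dfx <- data.frame(
  YEAR   = c(1950L, 1950L, 1950L, 1950L, 1951L, 1951L, 1951L),
  MONTH  = c(12L, 12L, 12L, 12L, 1L, 1L, 1L),
  SPELL  = c("DRY", "WET", "DRY", "DRY", "DRY", "DRY", "WET"),
  svalue = c(1L, 0L, 1L, 2L, 3L, 4L, 0L)
)

# Keep only the DRY rows (svalue != 0), as the i-expression in the answer does
dry <- as.data.table(dfx)[svalue != 0]

# 'run' (hypothetical name): increases by one wherever the old count is 1,
# i.e. wherever a dry run begins
dry[, run := cumsum(svalue == 1)]

# 'new_svalue' (hypothetical name): row number within each run and year-month,
# so a run that crosses a month boundary starts again at 1
dry[, new_svalue := seq_len(.N), by = .(run, YEAR, MONTH)]
dry
```

The long dry run spanning 30 December to 2 January has a single run value, so it is the YEAR and MONTH columns in the grouping that split it into two counts.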

Or group by the run-length ID of 'SPELL':

setDT(dfx)[, svalue := seq_len(.N) * (svalue != 0), .(rleid(SPELL), YEAR, MONTH)]
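Since the question uses the tidyverse, the same idea can probably be expressed with dplyr alone. This is a sketch under assumptions rather than part of the original answer: the helper column run is a name introduced here, and the count is computed directly from SPELL instead of from a precomputed 'svalue'.

```r
library(dplyr)

# Same rows as the question's data frame
dat <- data.frame(
  YEAR  = c(1950L, 1950L, 1950L, 1950L, 1951L, 1951L, 1951L),
  MONTH = c(12L, 12L, 12L, 12L, 1L, 1L, 1L),
  DAY   = c(28L, 29L, 30L, 31L, 1L, 2L, 3L),
  PCP   = c(0, 11.7, 0, 0, 0, 0, 20.3),
  SPELL = c("DRY", "WET", "DRY", "DRY", "DRY", "DRY", "WET")
)

res <- dat %>%
  group_by(YEAR, MONTH) %>%
  # 'run' (hypothetical name): increases whenever SPELL changes within a year-month
  mutate(run = cumsum(SPELL != lag(SPELL, default = first(SPELL)))) %>%
  group_by(YEAR, MONTH, run) %>%
  # number the rows of each run; WET rows are set to 0
  mutate(svalue = if_else(SPELL == "DRY", row_number(), 0L)) %>%
  ungroup() %>%
  select(-run)

res
```

Because run is computed inside each year-month group, a dry spell that continues into the next month starts a new count there.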

data

dfx <- structure(list(YEAR = c(1950L, 1950L, 1950L, 1950L, 1951L, 1951L, 
 1951L), MONTH = c(12L, 12L, 12L, 12L, 1L, 1L, 1L), DAY = c(28L, 
 29L, 30L, 31L, 1L, 2L, 3L), PCP = c(0, 11.7, 0, 0, 0, 0, 20.3
 ), SPELL = c("DRY", "WET", "DRY", "DRY", "DRY", "DRY", "WET"), 
  svalue = c(1L, 0L, 1L, 2L, 3L, 4L, 0L)), class = "data.frame",
 row.names = c(NA, -7L))
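For completeness, a base R variant that needs no extra packages might look like the following. It is a sketch, not the answer's method: rle() stands in for data.table's rleid(), the variable names r and run are introduced here, and the data frame is rebuilt inline (DAY and PCP omitted) so the snippet is self-contained.

```r
# Same rows as the data above; DAY and PCP are omitted for brevity
dfx <- data.frame(
  YEAR  = c(1950L, 1950L, 1950L, 1950L, 1951L, 1951L, 1951L),
  MONTH = c(12L, 12L, 12L, 12L, 1L, 1L, 1L),
  SPELL = c("DRY", "WET", "DRY", "DRY", "DRY", "DRY", "WET")
)

# Run-length id of SPELL: one integer per block of identical consecutive values
r   <- rle(dfx$SPELL)
run <- rep(seq_along(r$lengths), r$lengths)

# Number the rows within each (YEAR, MONTH, run) group, then zero out WET rows
dfx$svalue <- ave(seq_len(nrow(dfx)), dfx$YEAR, dfx$MONTH, run,
                  FUN = seq_along) * (dfx$SPELL == "DRY")
dfx
```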
