R grouped for loop seq_along length - 1?

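I am trying to create a metric for the number of "tries" it takes for an item to be accepted. I think a for loop is the way to go, but I don't have much experience with loops in R, and the logic is somewhat complicated. Any help, advice, or feedback would be greatly appreciated!

In the toy example, "accept" is "C", and the switch that iterates "try" forward is a submission (A) being reset (B) or a submission (A) being accepted (C).

Within a group, if the order of events is A > B or A > C, iterate "try" forward by 1. Otherwise, the "try" count should stay the same. Obviously, the "real" example is much more complicated than this toy example.

For now, I am just trying to get the try count right without worrying about grouping.

I am not sure how to limit seq_along so that it essentially stops at [group_by %>% length(group) - 1]. Is there a better option?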

df = data.frame(group = c(1,1,1,1,1,2,2,2,2), 
                 event = c("A","B","A","A","C","A","B","A","C"))

df$try <- 0
for (i in seq_along(df$event)){
    if (df$event[[i]] == "A" &  
          df$event[[i+1]] %in% c("B", "C"))
      {
        df$try[[i]] <- df$try + 1
    } else {
        df$try[[i]] <- df$try
    }
}

# this essentially shows the correct answer (win = try + 1, loss = try), 
# but has "df$event[[i + 1]] : subscript out of bounds", 
# and I need to save the outcome so I can access later

df$try <- 0
for (i in seq_along(df$event)){
    if (df$event[[i]] == "A" &  
          df$event[[i+1]] %in% c("B", "C"))
      {
        print("Win")
    } else {
        print("Loss")
    }
}

My expected (eventual) answer for the toy example is: try = c(1,1,1,2,2,1,1,2,2); groups 1 and 2 each need 2 "tries" to be accepted.

You can use lead from dplyr to get the next value. Try this:

library(dplyr)

df %>%
  group_by(group) %>%
  mutate(result = cumsum(event == 'A' & lead(event) %in% c('B', 'C'))) %>%
  ungroup

#  group event   try result
#  <dbl> <chr> <dbl>  <int>
#1     1 A         1      1
#2     1 B         1      1
#3     1 A         1      1
#4     1 A         2      2
#5     1 C         2      2
#6     2 A         1      1
#7     2 B         1      1
#8     2 A         2      2
#9     2 C         2      2

I kept the try column in the output for comparison.

You can fix the "subscript out of bounds" problem by adding one more if.
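
To see why this approach never runs past the end of a group, it may help to look at what happens on the last row. dplyr's lead() pads the shifted vector with NA, and NA %in% c("B", "C") evaluates to FALSE. The sketch below reproduces that behaviour in base R for group 1 only; next_value is a hypothetical helper written for this illustration, not a dplyr function.

```r
# Hypothetical base R stand-in for dplyr::lead(): shift left by one, pad with NA
next_value <- function(x) c(x[-1], NA)

event <- c("A", "B", "A", "A", "C")   # events of group 1 from the toy example
nxt <- next_value(event)              # "B" "A" "A" "C" NA

# NA %in% c("B", "C") is FALSE, so the last row never counts as a switch
is_switch <- event == "A" & nxt %in% c("B", "C")

# Running count of completed tries
try_count <- cumsum(is_switch)
print(try_count)                      # 1 1 1 2 2
```

Because the padding is NA rather than an out-of-range index, no explicit length check is needed.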

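if (i + 1 > nrow(df)) {
  print('do nothing')
} else if (df$event[[i]] == 'A' && df$event[[i + 1]] %in% c('B', 'C')) {
  # followed by your original code
}

I'm assuming that for the last row the value will simply be 0, so one more if should do the trick.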

library(tidyverse)
df <- data.frame(group = c(1,1,1,1,1,2,2,2,2), 
                event = c("A","B","A","A","C","A","B","A","C"))

# win[i] is 1 when row i is an "A" immediately followed by "B" or "C", else 0.
# A preallocated vector avoids growing a data frame with rbind() and
# relying on its auto-generated column name.
win <- numeric(nrow(df))
for(i in seq_len(nrow(df))){
  if(i+1 > nrow(df)){
    print('This is the last row')
    win[i] <- 0
  } else if(df$event[[i]] == 'A' &&
     df$event[[i+1]] %in% c('B', 'C'))
  {
    win[i] <- 1
  } else {
    win[i] <- 0
  }
}

df2 <- cbind(df, win) %>%
  mutate(
    cumulative_sum = cumsum(win)
  )
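
The question also asked how to stop the loop one row early and how to respect groups. One possible way, sketched here in base R, is to loop over seq_len(n - 1) so that i + 1 is always a valid index, skip pairs that straddle two groups, and then take the cumulative sum within each group using ave(). The names win and same_group are made up for this sketch, and the sketch assumes the data frame has at least one row and is already ordered by group.

```r
df <- data.frame(group = c(1, 1, 1, 1, 1, 2, 2, 2, 2),
                 event = c("A", "B", "A", "A", "C", "A", "B", "A", "C"),
                 stringsAsFactors = FALSE)

n <- nrow(df)
win <- integer(n)                      # 1 where a try ends, 0 elsewhere

# Stop at n - 1 so that i + 1 never runs past the last row
for (i in seq_len(n - 1)) {
  same_group <- df$group[i] == df$group[i + 1]
  if (same_group &&
      df$event[i] == "A" &&
      df$event[i + 1] %in% c("B", "C")) {
    win[i] <- 1L
  }
}

# Cumulative count of tries, restarted for every group
df$try <- ave(win, df$group, FUN = cumsum)
print(df$try)                          # 1 1 1 2 2 1 1 2 2
```

Unlike seq_along(df$event), seq_len(n - 1) yields an empty sequence when there is only one row, so the loop body is simply skipped in that case.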

This seems to work for now:

Added a break for when i + 1 would run past the length.

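df$try <- 0
for (i in seq_along(df$event)){
    # note: this condition stops one row too early;
    # the updated version further down tests i instead of i + 1
    if (i+1 == length(df$event)){
      break
    } else if (df$event[[i]] == "A" &&
               df$event[[i+1]] %in% c("B", "C")) {
        print("Win")
    } else {
        print("Loss")
    }
}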

# updated toy df to show N tries differs:

df = data.frame(group = c(1,1,1,1,1,1,1,2,2,2,2), 
                 event = c("A","B","A","A","B","A","C","A","B","A","C"))

df$try <- 0
for (i in seq_along(df$event)){
    if (i == length(df$event)){ # use i otherwise it doesn't catch the last switch
      break
    } else if (df$event[[i]] == "A" &&
               df$event[[i+1]] %in% c("B", "C")) {
        df$try[[i]] <- 1 # mark the row where a try ends
    }
    # all other rows keep their initial value of 0
}

df %>% 
  group_by(group) %>% 
  mutate(N_tries = max(cumsum(try)))
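
If only the total number of tries per group is needed, the loop can probably be replaced by a vectorised comparison followed by a per-group sum. The following is a rough base R sketch on the updated toy data; next_event, next_group, same_group, is_switch, and n_tries are names invented for this example, and it assumes the rows are already ordered by group.

```r
df <- data.frame(group = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2),
                 event = c("A", "B", "A", "A", "B", "A", "C", "A", "B", "A", "C"),
                 stringsAsFactors = FALSE)

# Next event and next group for each row; NA marks the final row
next_event <- c(df$event[-1], NA)
next_group <- c(df$group[-1], NA)

# Only compare a row with its successor when both belong to the same group
same_group <- !is.na(next_group) & next_group == df$group

# A try ends when "A" is followed by "B" or "C" inside the same group
is_switch <- same_group & df$event == "A" & next_event %in% c("B", "C")

# One total per group: 3 tries for group 1, 2 tries for group 2
n_tries <- tapply(is_switch, df$group, sum)
print(n_tries)
```

The same_group check is what keeps an "A" at the end of one group from being paired with the first event of the next group, which the ungrouped loop would not guard against.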
