R grouped for loop: seq_along length - 1?
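I am trying to create an indicator for the number of "tries" it takes for an item to be accepted. I think a for loop is the way to go, but I don't have much experience with loops in R, and the logic is a bit complicated. Any help/advice/feedback would be greatly appreciated!
In the toy example, "accept" is "C", and the switch that moves "try" forward is either a submission (A) being reset (B) or a submission (A) being accepted (C).
Within a group, if the sequence of events is A > B or A > C, then "try" should go up by 1. Otherwise, the "try" count should stay the same. Obviously, the "real" example is quite a bit more complicated than this toy example.
For now, I'm just trying to get the try count right without worrying about the grouping.
I'm not sure how to restrict seq_along so that it essentially stops at [group_by %>% length(group) - 1]. Is there a better option?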
df = data.frame(group = c(1,1,1,1,1,2,2,2,2),
                event = c("A","B","A","A","C","A","B","A","C"))
df$try <- 0
for (i in seq_along(df$event)){
  if (df$event[[i]] == "A" &
      df$event[[i+1]] %in% c("B", "C"))
  {
    df$try[[i]] <- df$try + 1
  } else {
    df$try[[i]] <- df$try
  }
}
# this essentially shows the correct answer (win = try + 1, loss = try),
# but has "df$event[[i + 1]] : subscript out of bounds",
# and I need to save the outcome so I can access later
df$try <- 0
for (i in seq_along(df$event)){
  if (df$event[[i]] == "A" &
      df$event[[i+1]] %in% c("B", "C"))
  {
    print("Win")
  } else {
    print("Loss")
  }
}
My expected (eventual) answer for the toy example is: try = c(1,1,1,2,2,1,1,2,2); groups 1 and 2 each need 2 "tries" to be accepted.
You can use lead from dplyr to get the next value. Try this -
library(dplyr)
df %>%
  group_by(group) %>%
  mutate(result = cumsum(event == 'A' & lead(event) %in% c('B', 'C'))) %>%
  ungroup()
#   group event   try result
#   <dbl> <chr> <dbl>  <int>
# 1     1 A         1      1
# 2     1 B         1      1
# 3     1 A         1      1
# 4     1 A         2      2
# 5     1 C         2      2
# 6     2 A         1      1
# 7     2 B         1      1
# 8     2 A         2      2
# 9     2 C         2      2
The try column (filled with the expected values) is kept in the output for comparison.
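If dplyr is not available, roughly the same idea can be written in base R. This is only a sketch: count_tries is a hypothetical helper name (not part of the answer above), and c(event[-1], NA) is used as a stand-in for lead(), shifting the events up by one position and padding the end with NA.

```r
# Toy data from the question
df <- data.frame(group = c(1,1,1,1,1,2,2,2,2),
                 event = c("A","B","A","A","C","A","B","A","C"))

# Hypothetical helper: running count of "A followed by B or C"
# for the events of a single group
count_tries <- function(event) {
  next_event <- c(event[-1], NA)  # base-R stand-in for dplyr::lead()
  cumsum(event == "A" & next_event %in% c("B", "C"))
}

# Apply the helper to each group, then restore the original row order
per_group <- lapply(split(df$event, df$group), count_tries)
df$result <- unsplit(per_group, df$group)
df$result
```

Because the shift happens inside each group, the last event of one group is compared with NA rather than with the first event of the next group.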
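You can get around the "subscript out of bounds" problem by adding one more if.
if (i + 1 > nrow(df)) {
  print('do nothing')
} else if (
  # followed by your original code
)
I assume that for the last row the value will just be 0, so another if should do the trick.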
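library(tidyverse)
df <- data.frame(group = c(1,1,1,1,1,2,2,2,2),
                 event = c("A","B","A","A","C","A","B","A","C"))
# 1 where an "A" is immediately followed by "B" or "C", otherwise 0.
# A preallocated vector avoids growing a data frame with rbind(), whose
# auto-generated column name depends on the first value that is bound.
switch_flag <- numeric(nrow(df))
for (i in seq_len(nrow(df))) {
  if (i + 1 > nrow(df)) {
    # no next event to look at, so the flag stays 0
    print('This is the last row')
  } else if (df$event[[i]] == 'A' &&
             df$event[[i + 1]] %in% c('B', 'C')) {
    switch_flag[[i]] <- 1
  }
}
# running total of the flags (not reset per group)
df2 <- cbind(df, switch_flag) %>%
  mutate(
    cumulative_sum = cumsum(switch_flag)
  )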
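The loop can also keep the look-ahead inside each group, so the last row of one group is never compared with the first row of the next. A minimal sketch, assuming the toy data from the question; the names rows and count are illustrative and not taken from the answer above:

```r
df <- data.frame(group = c(1,1,1,1,1,2,2,2,2),
                 event = c("A","B","A","A","C","A","B","A","C"))
df$try <- 0

for (g in unique(df$group)) {
  rows <- which(df$group == g)  # row indices that belong to this group
  count <- 0                    # the try counter restarts for every group
  for (k in seq_along(rows)) {
    i <- rows[k]
    # Look ahead only when a next row exists inside the same group;
    # && short-circuits, so rows[k + 1] is never read on the last row
    if (k < length(rows) &&
        df$event[i] == "A" &&
        df$event[rows[k + 1]] %in% c("B", "C")) {
      count <- count + 1
    }
    df$try[i] <- count
  }
}
df$try
```

Checking k < length(rows) first is what stands in for "stop at length - 1": the final row of each group still gets a value, but no out-of-bounds index is ever evaluated.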
This seems to work for now:
add a "break" if i + 1 goes beyond the length
df$try <- 0
for (i in seq_along(df$event)){
  if (i+1 == length(df$event)){
    break
  } else if (df$event[[i]] == "A" &
             df$event[[i+1]] %in% c("B", "C"))
  {
    print("Win")
  } else {
    print("Loss")
  }
}
# updated toy df to show N tries differs:
library(dplyr)
df = data.frame(group = c(1,1,1,1,1,1,1,2,2,2,2),
                event = c("A","B","A","A","B","A","C","A","B","A","C"))
df$try <- 0
for (i in seq_along(df$event)){
  if (i == length(df$event)){ # use i otherwise it doesn't catch the last switch
    break
  } else if (df$event[[i]] == "A" &&
             df$event[[i+1]] %in% c("B", "C"))
  {
    df$try[[i]] <- 1 # flag the row where a try ends; all other rows stay 0
  }
}
# total number of tries per group
df %>%
  group_by(group) %>%
  mutate(N_tries = max(cumsum(try)))
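For comparison, the per-group totals can probably be obtained without the loop by combining lead() with summarise(). This is a sketch on the updated toy data, not the method used above; tries_per_group is an illustrative name.

```r
library(dplyr)

df <- data.frame(group = c(1,1,1,1,1,1,1,2,2,2,2),
                 event = c("A","B","A","A","B","A","C","A","B","A","C"))

# One row per group: count every "A" that is directly followed by
# "B" or "C" within that group
tries_per_group <- df %>%
  group_by(group) %>%
  summarise(N_tries = sum(event == "A" & lead(event) %in% c("B", "C")))

tries_per_group
```

Since lead() is evaluated within each group here, a trailing "A" in one group is not paired with the first event of the next group, which the row-by-row loop over the whole data frame does not guarantee.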