R: for loop with an if else statement and a reference to the result of the previous iteration

Question

I have a data frame with a field x that contains both group names (letters in the example below) and the members of each group (numbers, listed under their group name). I want to create a field that shows, for each member, the name of its group. In the data frame below, the desired output is shown in the column "outcome".

library(dplyr)

df <- data.frame("x"=c("A","1","2","B","C","1","2","C","D","1"),
                 "outcome"=c("A","A","A","B","C","C","C","C","D","D")
) %>%
  mutate(
    Letter = ifelse(grepl("[A-Za-z]", x), "Letter",
                      "No Letter")
  )

My idea is to do this with a for loop. If x is a letter, it should return that letter; if not, it should return the result of the previous iteration (the most recently found letter in x). The for loop below doesn't give the right output:

df$outcome_calc[1] <- "A" 
for (i in 2:10) {  
  df$outcome_calc[i] <- ifelse(df$Letter[i] == "No Letter",df$outcome_calc[i-1],df$x[i])    

}

Any ideas how to get the right output?

Answer 1

Here are two very similar tidyverse ways, both using the convenience function zoo::na.locf.

First:

library(tidyverse)

df %>%
  mutate(na = is.na(as.numeric(as.character(x))),
         outcome2 = ifelse(na, as.character(x), NA_character_),
         outcome2 = zoo::na.locf(outcome2)) %>%
  select(-na)

Another one:

df %>%
  mutate(chr = !grepl("[[:digit:]]", x),
         outcome2 = ifelse(chr, as.character(x), NA_character_),
         outcome2 = zoo::na.locf(outcome2)) %>%
  select(-chr)
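
For readers who want to see what the last-observation-carried-forward step is doing in the two pipelines above, here is a minimal base R sketch. The helper name locf_sketch is hypothetical and not part of zoo. One difference worth noting: zoo::na.locf drops leading NAs by default (na.rm = TRUE), whereas this sketch keeps them as NA.

```r
# Hypothetical helper: carry the last non-NA value forward (base R only).
locf_sketch <- function(v) {
  # Index of the most recent non-NA element at each position (0 if none yet).
  idx <- cummax(ifelse(is.na(v), 0L, seq_along(v)))
  # Prepend NA so that index 0 maps to NA instead of being dropped.
  c(NA, v)[idx + 1L]
}

x <- c("A", "1", "2", "B", "C", "1", "2", "C", "D", "1")

# Keep the letters, blank out the numbers, then fill the gaps.
letters_only <- ifelse(grepl("[[:digit:]]", x), NA_character_, x)
filled <- locf_sketch(letters_only)
print(filled)
```

Because the result always has the same length as the input, it can be assigned to a data frame column even when the first row is not a group name.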

Answer 2

Here's a way to do this using a for loop:

# keeps track of previous letter
prev = ''

# output
op = c()

for (i in df$x){

    # check the pattern
    check = grepl(pattern = '[a-zA-Z]', x = i, ignore.case = TRUE)

    if(isTRUE(check)){
        op = c(op, i)
        prev = i
    } else {
        op = c(op, prev)
    }

}

print(op)
[1] "A" "A" "A" "B" "C" "C" "C" "C" "D" "D"
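
Growing op with c() copies the vector on every iteration, which becomes slow for long inputs. A sketch of the same logic with a preallocated result follows; it works on a plain character vector x (an assumption, standing in for df$x) and uses NA as the starting value so that rows before the first letter are visibly missing.

```r
x <- c("A", "1", "2", "B", "C", "1", "2", "C", "D", "1")

# Preallocate the output instead of growing it inside the loop.
op <- character(length(x))

# Most recently seen letter; NA until the first letter is found.
prev <- NA_character_

for (i in seq_along(x)) {
  # Update the remembered letter whenever the current element is one.
  if (grepl("[A-Za-z]", x[i])) {
    prev <- x[i]
  }
  op[i] <- prev
}

print(op)
```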

Answer 3

Alternatively, you can avoid the for loop by using the sapply function.

First, find the positions of the letters:

pos_letter <- grep("[A-Za-z]", df$x)

Then use sapply to (1) find, for each row, the position of the nearest letter at or above it, and (2) replace each position with the corresponding letter:

df$out <- sapply(seq_len(nrow(df)), function(i) max(pos_letter[pos_letter <= i]))
df$out2 <- sapply(df$out, function(i) as.character(df[i, "x"]))

   x outcome out out2
1  A       A   1    A
2  1       A   1    A
3  2       A   1    A
4  B       B   4    B
5  C       C   5    C
6  1       C   5    C
7  2       C   5    C
8  C       C   8    C
9  D       D   9    D
10 1       D   9    D

You can combine both sapply calls into a single line:

sapply(1:nrow(df), function(n) as.character(df[max(pos_letter[pos_letter <= n]),"x"]))

[1] "A" "A" "A" "B" "C" "C" "C" "C" "D" "D"
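
The same lookup can also be written without sapply, as a fully vectorized expression. This is a sketch on a plain character vector x (standing in for df$x), and it assumes the first element is a letter; if the vector starts with numbers, those rows get a zero index and are silently dropped from the result.

```r
x <- c("A", "1", "2", "B", "C", "1", "2", "C", "D", "1")

# TRUE where the element is a group name.
is_letter <- grepl("[A-Za-z]", x)

# cumsum() counts how many letters have been seen so far, which is the
# position of the current group within the vector of letters.
out <- x[is_letter][cumsum(is_letter)]

print(out)
```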

Answer 4

Using tidyr::fill, which requires NAs where the numbers were:

df = data.frame(x = c("A","1","2","B","C","1","2","C","D","1"),
                stringsAsFactors = FALSE)

df$x[grepl("[0-9]+", df$x)] = NA

tidyr::fill(df, x)
   x
1  A
2  A
3  A
4  B
5  C
6  C
7  C
8  C
9  D
10 D

Answer 5

`dplyr`

Here is a streamlined version of Rui's second approach that does not require a temporary helper column. It uses stringr::str_detect(), if_else(), and zoo::na.locf().

library(dplyr)
df %>% 
  mutate(outcome2 = if_else(stringr::str_detect(x, "\\D"), x, factor(NA)) %>% zoo::na.locf())

   x outcome    Letter outcome2
1  A       A    Letter        A
2  1       A No Letter        A
3  2       A No Letter        A
4  B       B    Letter        B
5  C       C    Letter        C
6  1       C No Letter        C
7  2       C No Letter        C
8  C       C    Letter        C
9  D       D    Letter        D
10 1       D No Letter        D

`data.table`

For the sake of completeness, here is also a data.table approach, which I have used frequently. It uses assignment by reference to update df.

library(data.table)
setDT(df)[x %like% "\\D", outcome2 := x][, outcome2 := zoo::na.locf(outcome2)][]

    x outcome    Letter outcome2
 1: A       A    Letter        A
 2: 1       A No Letter        A
 3: 2       A No Letter        A
 4: B       B    Letter        B
 5: C       C    Letter        C
 6: 1       C No Letter        C
 7: 2       C No Letter        C
 8: C       C    Letter        C
 9: D       D    Letter        D
10: 1       D No Letter        D

5 answers

Answer 1    2    2019-12-29 08:00:02
Answer 2    1    accepted    2019-12-29 07:33:13
Answer 3    1    2019-12-29 08:00:17
Answer 4    1    2019-12-29 08:10:46
Answer 5    0    2019-12-29 14:04:42
