For loop across multiple columns

Question

I have some data from a questionnaire that measures frequency of entering a shop ("_freq") and enjoyment of the experience ("_enj"). Overall, there are 17 shops (shop1, shop2, ...) and 120 rows of data. Below is an example of 5 rows of data for just shops 1 and 2.

shop1_freq	shop1_enj	shop2_freq	shop2_enj
0	9	5	4
3	2	0	9
0	9	5	4
0	2	0	9
4	9	5	4

I have written a for loop that labels incorrect responses to the questionnaire as "999" so that I can identify them. Basically, for each shop in isolation, a response is incorrect if frequency is 0 and enjoyment is not 9, or the reverse, if frequency is not 0 but enjoyment is 9. At the moment I am repeating the loops below 17 times (once for each shop; below is just shop 1).

for (rows in 1:120){  
  if(data$shop1_freq[rows] == "0" & data$shop1_enj[rows] != 9) { 
    data$shop1_enj[rows] = "999" # label incorrect 999
  }
}

for (rows in 1:120){  
  if(data$shop1_freq[rows] != "0" & data$shop1_enj[rows] == 9) { 
    data$shop1_enj[rows] = "999" # label incorrect 999
  }
}

However, I wondered whether there is a more efficient way to do this for all 17 shops with less code?

Answer 1

It can be done with across in mutate for multiple 'shop\d+_enj' columns and their corresponding '_freq' columns:

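library(dplyr)
library(stringr)  # provides str_replace()
data1 <- data %>%
    mutate(across(matches('^shop\\d+_enj$'), ~ {
             # values of the '_freq' column that pairs with the current '_enj' column
             tmp <- get(str_replace(cur_column(), '_enj', '_freq'))
             case_when(tmp == 0 & . != 9 ~ 999,
                       tmp != 0 & . == 9 ~ 999,
                       # as.numeric() keeps the branch types consistent for integer columns
                       TRUE ~ as.numeric(.))
      }))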

Details

We loop across the columns whose names match 'shop' followed by one or more digits and then '_enj'. For each of them, we get the corresponding 'freq' column by replacing the suffix '_enj' in the current column name ( cur_column() ) with '_freq'. That column is used to build the compound condition with logical operators in case_when: where a condition is TRUE, those row elements are set ( ~ ) to 999, and the final TRUE ~ branch returns the remaining values unchanged. Here, . is the column's values.
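To make the steps above concrete, here is a minimal sketch that applies the same idea to the five example rows from the question. The data frame is typed in by hand as numeric columns, which is an assumption about how the real data is stored, and the local name freq is only illustrative.

```r
library(dplyr)
library(stringr)

# The five example rows from the question (assumed to be numeric columns)
data <- data.frame(
  shop1_freq = c(0, 3, 0, 0, 4),
  shop1_enj  = c(9, 2, 9, 2, 9),
  shop2_freq = c(5, 0, 5, 0, 5),
  shop2_enj  = c(4, 9, 4, 9, 4)
)

data1 <- data %>%
  mutate(across(matches("^shop\\d+_enj$"), ~ {
    # Look up the '_freq' column that belongs to the current '_enj' column
    freq <- get(str_replace(cur_column(), "_enj$", "_freq"))
    case_when(
      freq == 0 & .x != 9 ~ 999,  # never visits, yet rated the experience
      freq != 0 & .x == 9 ~ 999,  # visits, yet enjoyment is 9
      TRUE ~ .x                   # consistent answers stay as they are
    )
  }))

data1$shop1_enj  # rows 4 and 5 are flagged: 9 2 9 999 999
data1$shop2_enj  # every row is consistent:  4 9 4 9 4
```

Only the '_enj' columns are rewritten; the '_freq' columns are read but left untouched.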

In base R, this can be done in multiple ways. One option is to split the data into a list based on the pattern of column names:

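# split.default() splits the columns (not the rows) into one data frame per shop
lst1 <- split.default(data, sub("_.*", "", names(data)))
out <- do.call(cbind, unname(lapply(lst1, function(x) {
       # x[[1]] is the '_freq' column and x[[2]] the '_enj' column of one shop
       x[[2]] <- ifelse(x[[1]] == 0 & x[[2]] != 9, 999,
                   ifelse(x[[1]] != 0 & x[[2]] == 9, 999, x[[2]]))
       x
     })))
out <- out[names(data)]  # restore the original column order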
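Another base R option, closer to the loop in the question, is to loop over the shops rather than over the rows and let the comparison work on whole columns at once. This is a sketch under the same assumptions as before: numeric columns named shopN_freq and shopN_enj, with the example rows typed in by hand. The names shops and bad are illustrative.

```r
# The five example rows from the question (assumed to be numeric columns)
data <- data.frame(
  shop1_freq = c(0, 3, 0, 0, 4),
  shop1_enj  = c(9, 2, 9, 2, 9),
  shop2_freq = c(5, 0, 5, 0, 5),
  shop2_enj  = c(4, 9, 4, 9, 4)
)

# Shop prefixes ("shop1", "shop2", ...) derived from the '_enj' column names
shops <- sub("_enj$", "", grep("^shop\\d+_enj$", names(data), value = TRUE))

for (shop in shops) {
  freq <- data[[paste0(shop, "_freq")]]
  enj  <- data[[paste0(shop, "_enj")]]
  # A response is inconsistent when exactly one of the two conditions holds
  bad <- xor(freq == 0, enj == 9)
  # which() drops NA comparisons, so rows with missing values are left alone
  data[[paste0(shop, "_enj")]][which(bad)] <- 999
}

data$shop1_enj  # 9 2 9 999 999
```

Because xor() covers both "frequency is 0 but enjoyment is not 9" and the reverse case, one pass per shop replaces the two row-wise loops.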

Answer 1: 0 votes, accepted, 2021-03-18 15:52:52
