Looping a data frame row-removal process in R

Question

I would like to turn my code below into a loop so that it is more functional and generalizes to other data (the current data is just a toy example):

FIRST, I select a study from data using sample() and then use filter() to drop the rows of that study whose outcome == outcome_to_remove. This gives the datat output.

SECOND, I select a study from datat using sample() and then use filter() to drop the rows of that study whose outcome == outcome_to_remove2. This gives the final output.

Can this process be written as a loop?

EDIT: The only condition I would like to add is that length(unique(data$study)) must be the same before and after the looping. That is, it should not be possible for a study to lose its outcome == "A" rows in the FIRST step and its outcome == "B" rows in the SECOND step, so that the whole study gets deleted.

library(dplyr)  # filter(), %>%
library(tidyr)  # expand_grid()

(data <- expand_grid(study = 1:5, group = 1:2, outcome = c("A", "B")))

n = 1
#====-------------------- FIRST:  
studies_to_remove = sample(unique(data$study), size = n)
outcome_to_remove = c("A")
             
datat <- data %>%
  filter(
    !(    study %in% studies_to_remove &
        outcome %in% outcome_to_remove
    ))

#====------------------- SECOND:
studies_to_remove2 = sample(unique(datat$study), size = n)
outcome_to_remove2 = c("B")

datat %>%
  filter(
    !(    study %in% studies_to_remove2 &
        outcome %in% outcome_to_remove2
    ))

Answer 1

With the help of a for loop:

library(dplyr)  # filter(), %>%

data <- tidyr::expand_grid(study = 1:5, group = 1:2, outcome = c("A", "B"))

n <- 1
set.seed(9873)
outcomes <- unique(data$outcome)
unique_study <- unique(data$study)

for (outcome_to_remove in outcomes) {
  # Pick n studies among those not picked in an earlier iteration
  studies_to_remove <- sample(unique_study, size = n)
  # Exclude them from later draws so that no study loses every outcome
  unique_study <- setdiff(unique_study, studies_to_remove)
  cat('\nDropping study ', studies_to_remove, 'and outcome ', outcome_to_remove)
  data <- data %>%
    filter(
      !( study %in% studies_to_remove &
         outcome %in% outcome_to_remove
      ))
}

#Dropping study  3 and outcome  A
#Dropping study  1 and outcome  B

data
#   study group outcome
#   <int> <int> <chr>  
# 1     1     1 A      
# 2     1     2 A      
# 3     2     1 A      
# 4     2     1 B      
# 5     2     2 A      
# 6     2     2 B      
# 7     3     1 B      
# 8     3     2 B      
# 9     4     1 A      
#10     4     1 B      
#11     4     2 A      
#12     4     2 B      
#13     5     1 A      
#14     5     1 B      
#15     5     2 A      
#16     5     2 B
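
To make this reusable on other data, the loop could be wrapped in a function. The following is only a sketch: remove_outcomes() and its arguments are hypothetical names, not part of any package, and it assumes every study has at least two distinct outcomes (otherwise the final check stops with an error). It draws by index with sample.int() because sample() treats a length-one numeric vector x as 1:x.

```r
library(dplyr)
library(tidyr)

# Hypothetical helper: for each outcome, drop that outcome from `n` randomly
# chosen studies, never picking the same study twice.
remove_outcomes <- function(data, n = 1, outcomes = unique(data$outcome)) {
  remaining <- unique(data$study)
  # Every outcome needs n studies that have not been used yet.
  stopifnot(n * length(outcomes) <= length(remaining))
  n_before <- n_distinct(data$study)

  for (out in outcomes) {
    # Draw positions rather than values to avoid sample()'s 1:x shortcut.
    picked <- remaining[sample.int(length(remaining), size = n)]
    remaining <- setdiff(remaining, picked)
    data <- data %>%
      filter(!(study %in% picked & outcome %in% out))
  }

  # The condition from the question's EDIT: no study disappears entirely.
  stopifnot(n_distinct(data$study) == n_before)
  data
}

set.seed(1)
toy <- expand_grid(study = 1:5, group = 1:2, outcome = c("A", "B"))
result <- remove_outcomes(toy, n = 2)
```

Because a study is removed from the pool as soon as it is picked, each study can lose at most one outcome, which is what keeps the number of distinct studies unchanged.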

Answer 1: 1 accepted, 2021-10-23 01:33:47
