Warning message from complex if-else statements in a for loop in R

Question

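I created a for loop with many if-else statements for my dataset and 2 empty vectors. However, I get a warning message:

In closenessSupport[i] <- rowMeans(seniorEdPlans[c("closenessFriends", ... : number of items to replace is not a multiple of replacement length

I would just like to know how to fix this vector length problem, because I think it is interfering with my intent to find the mean of 2 columns. Any help is appreciated.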

Answer 1

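Wow, that is a lot for me to take in, but here is an answer with a few nudges. In a case like this you definitely don't want to run a for loop over every row of the data frame; R is optimized for working on columns. I'm not entirely sure I understand all of your conditions, but dplyr::case_when will very likely serve you well.

I ran dput on your data and kept only the first 20 rows. Then I wrote a mutate with case_when that starts moving toward closenessSupport. Is this what you were trying to do?

Revised after your additional input to show only the columns of interest.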

# https://stackoverflow.com/questions/61582653
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
seniored <- structure(list(
  id = 1:20,
  age = c(17L, 16L, 17L, 16L, 17L, 18L, 17L, 17L, 18L, 16L,
          17L, 17L, 17L, 17L, 17L, 17L, 16L, 17L, 16L, 18L),
  higherEd = structure(c(1L, 5L, 1L, 1L, 3L, 1L, 2L, 2L, 5L, 5L,
                         3L, 4L, 3L, 2L, 5L, 3L, 4L, 5L, 1L, 1L),
                       .Label = c("2-year", "4-year", "None", "Other", "Vocational"),
                       class = "factor"),
  riskGroup = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 3L, 1L,
                          3L, 3L, 2L, 1L, 3L, 2L, 2L, 3L, 1L, 3L),
                        .Label = c("High", "Low", "Medium"),
                        class = "factor"),
  GPA = c(3.169, 2.703, 3.225, 2.488, 2.618, 2.928, 3.176, 3.256, 3.48, 3.244,
          3.265, 3.4, 3.109, 3.513, 3.102, 2.656, 2.853, 3.046, 2.304, 3.473),
  closenessFriends = c(7L, 7L, 7L, 8L, NA, NA, NA, 6L, 7L, NA,
                       5L, 6L, 3L, 1L, 1L, NA, 8L, 2L, NA, 8L),
  closenessMentors = c(6L, NA, 5L, NA, 5L, 4L, 8L, 6L, 4L, 5L,
                       4L, 4L, 4L, 5L, 5L, 5L, 3L, 4L, NA, 5L),
  numSupportSources = c(2L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 1L,
                        2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 0L, 2L)
), row.names = c(NA, 20L), class = "data.frame")
seniored %>%
  mutate(
    closenessSupport = case_when(
      numSupportSources == 1 & !is.na(closenessFriends) ~ as.numeric(closenessFriends),
      numSupportSources == 1 & !is.na(closenessMentors) ~ as.numeric(closenessMentors),
      numSupportSources == 2 ~ (closenessFriends + closenessMentors)/2,
      numSupportSources == 0 ~ NA_real_),
    supportType = case_when(
      numSupportSources == 1 & !is.na(closenessFriends) ~ "FriendOnly",
      numSupportSources == 1 & !is.na(closenessMentors) ~ "MentorOnly",
      numSupportSources == 2 ~ "Both",
      numSupportSources == 0 ~ "Neither"
    )
  ) %>%
  select(numSupportSources, closenessFriends, closenessMentors, closenessSupport, supportType)
#>    numSupportSources closenessFriends closenessMentors closenessSupport
#> 1                  2                7                6              6.5
#> 2                  1                7               NA              7.0
#> 3                  2                7                5              6.0
#> 4                  1                8               NA              8.0
#> 5                  1               NA                5              5.0
#> 6                  1               NA                4              4.0
#> 7                  1               NA                8              8.0
#> 8                  2                6                6              6.0
#> 9                  2                7                4              5.5
#> 10                 1               NA                5              5.0
#> 11                 2                5                4              4.5
#> 12                 2                6                4              5.0
#> 13                 2                3                4              3.5
#> 14                 2                1                5              3.0
#> 15                 2                1                5              3.0
#> 16                 1               NA                5              5.0
#> 17                 2                8                3              5.5
#> 18                 2                2                4              3.0
#> 19                 0               NA               NA               NA
#> 20                 2                8                5              6.5
#>    supportType
#> 1         Both
#> 2   FriendOnly
#> 3         Both
#> 4   FriendOnly
#> 5   MentorOnly
#> 6   MentorOnly
#> 7   MentorOnly
#> 8         Both
#> 9         Both
#> 10  MentorOnly
#> 11        Both
#> 12        Both
#> 13        Both
#> 14        Both
#> 15        Both
#> 16  MentorOnly
#> 17        Both
#> 18        Both
#> 19     Neither
#> 20        Both

Created on 2020-05-04 by the reprex package (v0.3.0)

Answer 2

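Please accept one of the correct answers.

You asked yesterday what was wrong with your loop, and I took a look today. The problem is that rowMeans is being run inside the loop. It already works row by row across the whole data frame, so calling it inside a for loop that iterates over the rows is bound to cause problems.

I also built a sample dataset with values representative of your data. It may not matter for your current data, but the for loop will be much slower. With 20,000 rows, the for loop takes 1.4 seconds; the dplyr solution takes 11 milliseconds.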
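To make the cause of the warning concrete, here is a minimal sketch on a made-up three-row data frame. The column names come from the question; the values and the names `toy`, `loop_result`, and `vectorised` are assumptions for illustration only. Inside the loop, `rowMeans()` returns one value per row, but `loop_result[i]` has room for a single element, so R warns and keeps only the first value.

```r
# Toy data frame: column names follow the question, values are invented.
toy <- data.frame(closenessFriends = c(7, NA, 5),
                  closenessMentors = c(6, 4, NA))
cols <- c("closenessFriends", "closenessMentors")

# Problematic pattern: the right-hand side has length nrow(toy), the
# left-hand side has length 1. Each iteration emits the warning
# "number of items to replace is not a multiple of replacement length"
# and stores only the first row's mean.
loop_result <- numeric(nrow(toy))
for (i in seq_len(nrow(toy))) {
  loop_result[i] <- rowMeans(toy[cols], na.rm = TRUE)
}

# Vectorised pattern: one call, one mean per row, no loop and no warning.
vectorised <- rowMeans(toy[cols], na.rm = TRUE)

print(loop_result)  # every element is the first row's mean
print(vectorised)   # one mean per row
```

If a loop is really wanted, the call would have to be restricted to the current row, for example `rowMeans(toy[i, cols], na.rm = TRUE)`, but the single vectorised call does the same work without the loop.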

# build a reproducible dataset; assume valid scores are 1 to 8
# we'll make the 9s equal to NA

set.seed(2020)
a <- sample(1:9, 20000, replace = TRUE)
a[a == 9] <- NA
set.seed(2021)
b <- sample(1:9, 20000, replace = TRUE)
b[b == 9] <- NA

seniorEdPlans2 <- data.frame(closenessFriends = a,
                              closenessMentors = b)

# use apply to calculate numSupportSources
seniorEdPlans2$numSupportSources <- apply(seniorEdPlans2, 
                                          1, 
                                          function(x) sum(!is.na(x))
                                          )

# head(seniorEdPlans2, 50) # close enough

# this was the source of your warning message: rowMeans() is already
# row-based, so it should not go inside a for loop
seniorEdPlans2$closenessSupport <- rowMeans(seniorEdPlans2[c('closenessFriends', 'closenessMentors')], 
                                           na.rm = TRUE)

# your for loop
for (i in 1:nrow(seniorEdPlans2)) {
  if (seniorEdPlans2$numSupportSources[i] == 2) {
    seniorEdPlans2$supportType[i] <- "Both"
  } else if (seniorEdPlans2$numSupportSources[i] == 0) {
    seniorEdPlans2$supportType[i] <- "Neither"
  } else if (!is.na(seniorEdPlans2$closenessFriends[i])) {
    seniorEdPlans2$supportType[i] <- "FriendOnly"
  } else {
    seniorEdPlans2$supportType[i] <- "MentorOnly"
  }
}

# head(seniorEdPlans2, 50)

Created on 2020-05-05 by the reprex package (v0.3.0)
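The timing comparison mentioned in this answer could be reproduced along the following lines. This is only a sketch under stated assumptions: it uses base R `ifelse()` in place of the dplyr solution so that it runs without extra packages, the data are a smaller invented stand-in for the sample dataset, and `toy`, `label_loop`, and `label_vec` are hypothetical names. Timings depend on the machine, so no figures are given here.

```r
# Invented stand-in data: scores 1 to 8 with some NAs.
set.seed(1)
n <- 2000
toy <- data.frame(
  closenessFriends = sample(c(1:8, NA), n, replace = TRUE),
  closenessMentors = sample(c(1:8, NA), n, replace = TRUE)
)
# Number of non-missing support sources per row (0, 1, or 2).
toy$numSupportSources <- (!is.na(toy$closenessFriends)) +
  (!is.na(toy$closenessMentors))

# Loop version: one row at a time, with the result vector preallocated.
label_loop <- function(d) {
  out <- character(nrow(d))
  for (i in seq_len(nrow(d))) {
    if (d$numSupportSources[i] == 2) {
      out[i] <- "Both"
    } else if (d$numSupportSources[i] == 0) {
      out[i] <- "Neither"
    } else if (!is.na(d$closenessFriends[i])) {
      out[i] <- "FriendOnly"
    } else {
      out[i] <- "MentorOnly"
    }
  }
  out
}

# Vectorised version: each condition is evaluated on the whole column at once.
label_vec <- function(d) {
  ifelse(d$numSupportSources == 2, "Both",
    ifelse(d$numSupportSources == 0, "Neither",
      ifelse(!is.na(d$closenessFriends), "FriendOnly", "MentorOnly")))
}

loop_time <- system.time(loop_labels <- label_loop(toy))
vec_time  <- system.time(vec_labels  <- label_vec(toy))

# Both approaches should produce the same labels.
print(identical(loop_labels, vec_labels))
print(c(loop = loop_time[["elapsed"]], vectorised = vec_time[["elapsed"]]))
```

Checking that the two results are identical before comparing timings guards against measuring a fast version that computes something different.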


Solution 1
1 2020-05-04 13:56:37

Solution 2
1 2020-05-05 14:12:28
