Complex If Else Statement in For Loop in R Warning Message
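I created a for loop with many if else statements for my dataset and 2 empty vectors. However, I get a warning message:
In closenessSupport[i] <- rowMeans(seniorEdPlans[c("closenessFriends", ... : number of items to replace is not a multiple of replacement length
I'm just wondering how to fix this vector length problem, because I think it is interfering with my intent to find the mean of the 2 columns. Any help is appreciated.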
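Whew, that's a lot for me to take in. But here are a few nudges toward an answer. In this case you definitely do not want a for loop over all the rows of your data frame. R is optimized for working on whole columns. I'm not entirely sure I understand all of your conditions, but most likely dplyr::case_when will serve you well.
I ran dput on your data and took only the first 20 rows. Then I wrote a mutate with case_when that starts heading toward closenessSupport. Is this what you are trying to do?
Revised after your additional input to select only the columns of interest.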
# https://stackoverflow.com/questions/61582653
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
seniored <- structure(list(id = 1:20,
age = c(17L, 16L, 17L, 16L, 17L, 18L,
17L, 17L, 18L, 16L, 17L, 17L, 17L, 17L, 17L, 17L, 16L, 17L, 16L,
18L),
higherEd = structure(c(1L, 5L, 1L, 1L, 3L, 1L, 2L, 2L,
5L, 5L, 3L, 4L, 3L, 2L, 5L, 3L, 4L, 5L, 1L, 1L), .Label = c("2-year",
"4-year", "None", "Other", "Vocational"), class = "factor"),
riskGroup = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 3L,
1L, 3L, 3L, 2L, 1L, 3L, 2L, 2L, 3L, 1L, 3L), .Label = c("High",
"Low", "Medium"), class = "factor"),
GPA = c(3.169, 2.703,
3.225, 2.488, 2.618, 2.928, 3.176, 3.256, 3.48, 3.244, 3.265,
3.4, 3.109, 3.513, 3.102, 2.656, 2.853, 3.046, 2.304, 3.473
),
closenessFriends = c(7L, 7L, 7L, 8L, NA, NA, NA, 6L, 7L,
NA, 5L, 6L, 3L, 1L, 1L, NA, 8L, 2L, NA, 8L),
closenessMentors = c(6L,
NA, 5L, NA, 5L, 4L, 8L, 6L, 4L, 5L, 4L, 4L, 4L, 5L, 5L, 5L,
3L, 4L, NA, 5L),
numSupportSources = c(2L, 1L, 2L, 1L, 1L,
1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 0L, 2L
)), row.names = c(NA, 20L), class = "data.frame")
seniored %>%
  mutate(
    closenessSupport = case_when(
      numSupportSources == 1 & !is.na(closenessFriends) ~ as.numeric(closenessFriends),
      numSupportSources == 1 & !is.na(closenessMentors) ~ as.numeric(closenessMentors),
      numSupportSources == 2 ~ (closenessFriends + closenessMentors)/2,
      numSupportSources == 0 ~ NA_real_),
    supportType = case_when(
      numSupportSources == 1 & !is.na(closenessFriends) ~ "FriendOnly",
      numSupportSources == 1 & !is.na(closenessMentors) ~ "MentorOnly",
      numSupportSources == 2 ~ "Both",
      numSupportSources == 0 ~ "Neither"
    )
  ) %>%
  select(numSupportSources, closenessFriends, closenessMentors, closenessSupport, supportType)
#> numSupportSources closenessFriends closenessMentors closenessSupport
#> 1 2 7 6 6.5
#> 2 1 7 NA 7.0
#> 3 2 7 5 6.0
#> 4 1 8 NA 8.0
#> 5 1 NA 5 5.0
#> 6 1 NA 4 4.0
#> 7 1 NA 8 8.0
#> 8 2 6 6 6.0
#> 9 2 7 4 5.5
#> 10 1 NA 5 5.0
#> 11 2 5 4 4.5
#> 12 2 6 4 5.0
#> 13 2 3 4 3.5
#> 14 2 1 5 3.0
#> 15 2 1 5 3.0
#> 16 1 NA 5 5.0
#> 17 2 8 3 5.5
#> 18 2 2 4 3.0
#> 19 0 NA NA NA
#> 20 2 8 5 6.5
#> supportType
#> 1 Both
#> 2 FriendOnly
#> 3 Both
#> 4 FriendOnly
#> 5 MentorOnly
#> 6 MentorOnly
#> 7 MentorOnly
#> 8 Both
#> 9 Both
#> 10 MentorOnly
#> 11 Both
#> 12 Both
#> 13 Both
#> 14 Both
#> 15 Both
#> 16 MentorOnly
#> 17 Both
#> 18 Both
#> 19 Neither
#> 20 Both
Created on 2020-05-04 by the reprex package (v0.3.0)
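Please accept one of the correct answers.
You asked yesterday what was wrong with your loop. I took a look today. The problem is that rowMeans already works row-wise: it returns a mean for every row of the data frame, so running it inside a for loop that iterates over rows is bound to cause problems.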
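To make the cause of the warning concrete, here is a minimal sketch on a made-up three-row data frame (`toy` and its values are illustrative assumptions, not your data). `rowMeans()` returns one value per row, so assigning its whole result into the single slot `[i]` produces the warning, and R stores only the first value.

```r
# Hypothetical toy data, not the asker's dataset
toy <- data.frame(closenessFriends = c(7, NA, 3),
                  closenessMentors = c(6, 5, NA))

closenessSupport <- numeric(nrow(toy))

# rowMeans() returns a vector with one mean per row (length 3 here)
all_means <- rowMeans(toy[c("closenessFriends", "closenessMentors")],
                      na.rm = TRUE)
length(all_means)

# Assigning 3 values into a single slot warns:
# "number of items to replace is not a multiple of replacement length"
# and only the first of the 3 values is stored.
closenessSupport[2] <- all_means

# The vectorized fix: assign the whole result once, outside any loop
closenessSupport <- all_means
```

The same thing happens on every iteration of the original loop, which is why the stored means do not match the rows they were meant for.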
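I also made a sample dataset with values representative of your data. It probably doesn't matter for your current data, but the for loop will be much slower. With 20,000 rows, the for loop takes 1.4 seconds; the dplyr solution takes 11 milliseconds.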
# build a reproducible dataset assume valid scores 1 - 8
# we'll make 9's equal to NA
set.seed(2020)
a <- sample(1:9, 20000, replace = TRUE)
a[a == 9] <- NA
set.seed(2021)
b <- sample(1:9, 20000, replace = TRUE)
b[b == 9] <- NA
seniorEdPlans2 <- data.frame(closenessFriends = a,
                             closenessMentors = b)
# use apply to calculate numSupportSources
seniorEdPlans2$numSupportSources <- apply(seniorEdPlans2,
                                          1,
                                          function(x) sum(!is.na(x))
)
# head(seniorEdPlans2, 50) # close enough
# this was the source of your warning message: rowMeans() is already
# row based, so it can't go inside a for loop
seniorEdPlans2$closenessSupport <- rowMeans(seniorEdPlans2[c('closenessFriends', 'closenessMentors')],
                                            na.rm = TRUE)
# your for loop
for (i in 1:nrow(seniorEdPlans2)) {
  if (seniorEdPlans2$numSupportSources[i] == 2) {
    seniorEdPlans2$supportType[i] <- "Both"
  } else if (seniorEdPlans2$numSupportSources[i] == 0) {
    seniorEdPlans2$supportType[i] <- "Neither"
  } else if (!is.na(seniorEdPlans2$closenessFriends[i])) {
    seniorEdPlans2$supportType[i] <- "FriendOnly"
  } else {
    seniorEdPlans2$supportType[i] <- "MentorOnly"
  }
}
# head(seniorEdPlans2, 50)
Created on 2020-05-05 by the reprex package (v0.3.0)
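For comparison, here is a sketch of how the loop above could be replaced by one vectorized pass and timed against it. The data frame `df`, the seed, and the names `loop_df`, `vec_df`, `loop_time` and `vec_time` are assumptions made up for this example, and the timings will differ by machine. One thing to be aware of: `rowMeans(..., na.rm = TRUE)` returns `NaN`, not `NA`, for a row where both scores are missing.

```r
library(dplyr)

set.seed(1)  # arbitrary seed, only so this sketch is reproducible
n <- 20000
df <- data.frame(
  closenessFriends = sample(c(1:8, NA), n, replace = TRUE),
  closenessMentors = sample(c(1:8, NA), n, replace = TRUE)
)
# count the non-missing scores in each row
df$numSupportSources <- rowSums(!is.na(df[c("closenessFriends", "closenessMentors")]))

# Row-by-row version, same branching as the for loop above
loop_df <- df
loop_df$supportType <- NA_character_
loop_time <- system.time(
  for (i in seq_len(nrow(loop_df))) {
    if (loop_df$numSupportSources[i] == 2) {
      loop_df$supportType[i] <- "Both"
    } else if (loop_df$numSupportSources[i] == 0) {
      loop_df$supportType[i] <- "Neither"
    } else if (!is.na(loop_df$closenessFriends[i])) {
      loop_df$supportType[i] <- "FriendOnly"
    } else {
      loop_df$supportType[i] <- "MentorOnly"
    }
  }
)

# Vectorized version: every row is handled in a single call
vec_time <- system.time(
  vec_df <- df %>%
    mutate(
      closenessSupport = rowMeans(cbind(closenessFriends, closenessMentors),
                                  na.rm = TRUE),
      supportType = case_when(
        numSupportSources == 2 ~ "Both",
        numSupportSources == 0 ~ "Neither",
        !is.na(closenessFriends) ~ "FriendOnly",
        TRUE ~ "MentorOnly"
      )
    )
)

# Both approaches should label every row identically
stopifnot(identical(loop_df$supportType, vec_df$supportType))

loop_time["elapsed"]
vec_time["elapsed"]
```

If you want `NA` rather than `NaN` for the "Neither" rows, you could overwrite those entries afterwards or use the explicit `case_when` branches from the first answer.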