Issue with R if else loop: Conditions only partly executed
I have the following data frame:
Row Repro Number2
1 1 EWC
2 NA LWY
3 7 EWS
4 NA LWC
5 NA EWC
6 NA LWC
7 3 EWY
8 NA LW2Y
9 NA Unknown
10 NA LWC
11 1 EWC
12 NA LWY
13 NA EWY
14 NA LWY
15 NA Unknown
16 NA LWC
On this data frame I run the following loop:
for (i in 1:nrow(df3)) {
  if(df3$Number2[i+1]=="Unknown" & is.na(df3$Repro[i])) {
    df3$Number2[i]="Unknown"
  } else{
    df3$Number2[i]==df3$Number2[i]
  }
}
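The loop does run, but I get an error at the end, and the resulting data frame does not look the way I want.
My problem is that, although the code does what it is meant to do (replace the value in the Number2 column with "Unknown" when the value in the next row is "Unknown" and the associated Repro value is NA), it only does so for the "Unknown" values that were in the data frame originally. I would like it to take the newly added "Unknown" values into account as well and apply the loop condition to them.
Here is the error message: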
Error in if (df3$Number2[i + 1] == "Unknown" & is.na(df3$Repro[i])) { :
missing value where TRUE/FALSE needed
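Here is the data frame after running the loop. I added another column, "Number2.Correct", showing what I want the Number2 column to look like. The problem is in rows 12 and 13: both should be "Unknown" rather than "LWY" and "EWY".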
Repro Number2 Number2.Correct
1 1 EWC EWC
2 NA LWY LWY
3 7 EWS EWS
4 NA LWC LWC
5 NA EWC EWC
6 NA LWC LWC
7 3 EWY EWY
8 NA Unknown Unknown
9 NA Unknown Unknown
10 NA LWC LWC
11 1 EWC EWC
12 NA LWY Unknown
13 NA EWY Unknown
14 NA Unknown Unknown
15 NA Unknown Unknown
16 NA LWC LWC
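In summary, I have two questions: how do I get rid of the error, and how do I make the loop take the newly added "Unknown" values into account?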
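# Walk from the last row to the first so newly written "Unknown" values are seen.
for (i in rev(1:nrow(df3))) {
  # At the last row, i + 1 < nrow(df3) is FALSE, so the NA comparison collapses to FALSE.
  if (df3$Number2[i + 1] == "Unknown" & is.na(df3$Repro[i]) & i + 1 < nrow(df3)) {
    df3$Number2[i] <- "Unknown"
  } else {
    df3$Number2[i] == df3$Number2[i]  # comparison only; leaves the value unchanged
  }
}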
df3
#> Row Repro Number2
#> 1 1 1 EWC
#> 2 2 NA LWY
#> 3 3 7 EWS
#> 4 4 NA LWC
#> 5 5 NA EWC
#> 6 6 NA LWC
#> 7 7 3 EWY
#> 8 8 NA Unknown
#> 9 9 NA Unknown
#> 10 10 NA LWC
#> 11 11 1 EWC
#> 12 12 NA Unknown
#> 13 13 NA Unknown
#> 14 14 NA Unknown
#> 15 15 NA Unknown
#> 16 16 NA LWC
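Created on 2023-01-09 with reprex v2.0.2
You have two problems. First, i + 1 runs past the last row of the data; I added another condition (i + 1 < nrow(df3)) to handle that. Second, rows that are newly set to "Unknown" are not picked up when the loop runs forward. You can use rev() to reverse the order of iteration.
At the nrow-th row of the data, i + 1 is out of range. We can use a grouping approach with the tidyverse instead: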
library(dplyr)
library(tidyr)
library(data.table)
df3 %>%
  mutate(grp = replace(replace(Number2, Number2 != "Unknown", NA),
                       Number2 == "Unknown", seq_len(sum(Number2 == "Unknown")))) %>%
  fill(grp, .direction = "updown") %>%
  group_by(grp, grp2 = rleid(is.na(Repro))) %>%
  mutate(Number2 = case_when(is.na(Repro) &
                               row_number() < match("Unknown", Number2) ~ "Unknown",
                             TRUE ~ Number2)) %>%
  ungroup %>%
  select(-grp, -grp2)
Output:
# A tibble: 16 × 3
Row Repro Number2
<int> <int> <chr>
1 1 1 EWC
2 2 NA LWY
3 3 7 EWS
4 4 NA LWC
5 5 NA EWC
6 6 NA LWC
7 7 3 EWY
8 8 NA Unknown
9 9 NA Unknown
10 10 NA LWC
11 11 1 EWC
12 12 NA Unknown
13 13 NA Unknown
14 14 NA Unknown
15 15 NA Unknown
16 16 NA LWC
df3 <- structure(list(Row = 1:16, Repro = c(1L, NA, 7L, NA, NA, NA,
3L, NA, NA, NA, 1L, NA, NA, NA, NA, NA), Number2 = c("EWC", "LWY",
"EWS", "LWC", "EWC", "LWC", "EWY", "LW2Y", "Unknown", "LWC",
"EWC", "LWY", "EWY", "LWY", "Unknown", "LWC")),
class = "data.frame", row.names = c(NA,
-16L))
The code fails because index nrow(df3)+1 is out of range, so the for loop needs to run over 1:(nrow(df3)-1).
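To see where the missing value comes from, here is a small sketch on a made-up vector x (not the asker's data; all names in it are only for illustration). Indexing one position past the end of a vector does not fail in R, it quietly returns NA, and that NA then reaches if():

```r
# Hypothetical toy vector standing in for df3$Number2.
x <- c("EWC", "Unknown", "LWC")

# Indexing one past the end returns NA rather than raising an error.
past_end <- x[length(x) + 1]

# Comparing NA with a string gives NA, and NA & TRUE is still NA ...
cond <- (past_end == "Unknown") & TRUE

# ... so if() has nothing to branch on and stops with an error.
outcome <- tryCatch(
  if (cond) "replaced" else "kept",
  error = function(e) "error"
)

# Stopping one element early keeps i + 1 inside the vector.
safe_index <- seq_len(length(x) - 1)
```

In the original loop this happens only at the last row, and only because Repro is NA there: NA & FALSE would have evaluated to FALSE and the loop would have finished without complaint.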
To update Number2 iteratively, a simple (though inelegant) way is to use a while loop. The stopping condition is that the old and new Number2 are identical.
library(dplyr)  # provides %>%, mutate() and select()
while (TRUE) {
  # Work on a copy so we can tell whether this pass changed anything.
  df3$Number2_new <- df3$Number2
  for (i in 1:(nrow(df3) - 1)) {
    if (df3$Number2_new[i + 1] == "Unknown" & is.na(df3$Repro[i])) {
      df3$Number2_new[i] <- "Unknown"
    }
  }
  # Stop once a full pass leaves Number2 unchanged.
  converged <- all(df3$Number2 == df3$Number2_new)
  df3 <- df3 %>%
    mutate(Number2 = Number2_new) %>%
    select(-Number2_new)
  if (converged) break
}
df3
Row Repro Number2
1 1 1 EWC
2 2 NA LWY
3 3 7 EWS
4 4 NA LWC
5 5 NA EWC
6 6 NA LWC
7 7 3 EWY
8 8 NA Unknown
9 9 NA Unknown
10 10 NA LWC
11 11 1 EWC
12 12 NA Unknown
13 13 NA Unknown
14 14 NA Unknown
15 15 NA Unknown
16 16 NA LWC
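As a possible alternative to repeating the forward pass until nothing changes, a single pass from the bottom up should reach the same result, because each row only depends on the row below it. The following is a minimal base R sketch under that assumption; the function name propagate_unknown is made up for this example, and it assumes Number2 is a character column as in the data above.

```r
# Data as posted in the thread.
df3 <- data.frame(
  Row = 1:16,
  Repro = c(1L, NA, 7L, NA, NA, NA, 3L, NA, NA, NA, 1L, NA, NA, NA, NA, NA),
  Number2 = c("EWC", "LWY", "EWS", "LWC", "EWC", "LWC", "EWY", "LW2Y",
              "Unknown", "LWC", "EWC", "LWY", "EWY", "LWY", "Unknown", "LWC"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one backward pass over the rows.
propagate_unknown <- function(df) {
  n <- nrow(df)
  if (n < 2) return(df)  # nothing to compare against
  # Start at the second-to-last row so i + 1 never leaves the data.
  for (i in (n - 1):1) {
    # isTRUE() turns a possible NA comparison into FALSE instead of an error.
    if (is.na(df$Repro[i]) && isTRUE(df$Number2[i + 1] == "Unknown")) {
      df$Number2[i] <- "Unknown"
    }
  }
  df
}

result <- propagate_unknown(df3)
```

With the posted data, result$Number2 is "Unknown" in rows 8, 9 and 12 to 15, which matches the output shown above.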