Nested if else statement in R - 'else' statement executed even when 'if' statement is TRUE
I have a data frame with 2 columns: `df$a` and `df$b`. I need to calculate the values for column `df$c` based on the values of `df$b` using 2 separate sets of conditions. Which set of conditions should be applied depends on the value of `df$a`.
I tried to solve this by writing a nested `if` `else` statement.
# A subset of my data
a <- c(4211L, 2660L, 2839L, 3967L, 3167L, 2755L, 1680L, 2400L, 1173L, 1301L, 2370L, 2366L, 411L, 615L, 1382L, 826L, 717L, 401L, 177L, 82L, 579L, 246L)
b <- c(0.213, 0.102, 0.092, 0.121, 0.093, 0.0918, 0.0241, 0.060, 0.008, 0.003, 0.0385, 0.0368, -0.0529, -0.0697, 0.0192, -0.0346, -0.053, NA, -0.098, -0.139, -0.137, -0.0697)
df <- data.frame(a,b)
I want to use the first set of conditions when `df$a < 1000`, and the second set of conditions when `df$a >= 1000`. This is my code:
df$c <- if (df$a < 1000) {
ifelse(df$b <= -0.2, '1',
ifelse(df$b > -0.2 & df$b <= -0.1, '2',
ifelse(df$b > -0.1 & df$b <= 0.0, '3',
ifelse(df$b > 0.0 & df$b <= 0.1, '4',
'5'))))
} else {
ifelse(df$b <= 0.0, '1',
ifelse(df$b > 0.0 & df$b <= 0.1, '2',
ifelse(df$b > 0.1 & df$b <= 0.2, '3',
ifelse(df$b > 0.2 & df$b <= 0.3, '4',
'5'))))
}
However, the code calculates all `df$c` values based on the conditions in the `else` statement, even when `(df$a < 1000)` is `TRUE`. Does anyone know what is causing this mistake? I get the following warning message:
Warning message:
In if (df$a < 1000) { :
the condition has length > 1 and only the first element will be used
You could use `ifelse` for the outer test as well, because `if` is not vectorized: it expects a single `TRUE` or `FALSE` and, as the warning says, only looks at the first element of `df$a < 1000`. I would also use a function like `cut` to simplify the code:
a <- c(4211L, 2660L, 2839L, 3967L, 3167L, 2755L, 1680L, 2400L, 1173L, 1301L, 2370L, 2366L, 411L, 615L, 1382L, 826L, 717L, 401L, 177L, 82L, 579L, 246L)
b <- c(0.213, 0.102, 0.092, 0.121, 0.093, 0.0918, 0.0241, 0.060, 0.008, 0.003, 0.0385, 0.0368, -0.0529, -0.0697, 0.0192, -0.0346, -0.053, NA, -0.098, -0.139, -0.137, -0.0697)
df <- data.frame(a,b)
df$c <- ifelse(df$a < 1000,
cut(df$b, breaks = c(-Inf, -0.2, -0.1, 0.0, 0.1, +Inf),
labels = as.character(1:5)),
cut(df$b, c(-Inf, 0, 0.1, 0.2, 0.3, +Inf),
as.character(1:5)))
df
# a b c
# 1 4211 0.2130 4
# 2 2660 0.1020 3
# 3 2839 0.0920 2
# 4 3967 0.1210 3
# 5 3167 0.0930 2
# 6 2755 0.0918 2
# 7 1680 0.0241 2
# ....
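As a minimal sketch of why the original code took the `else` branch for every row, the snippet below compares what `if` sees (only the first element of the condition) with what `ifelse` does (one decision per element). The four-value vector `a_small` and the labels "first set" / "second set" are made up for illustration. Note that from R 4.2.0 onwards a condition of length greater than one in `if` is an error rather than a warning, so the sketch indexes the first element explicitly instead of passing the whole vector.

```r
# Hypothetical four-row sample, chosen so the first value is >= 1000
a_small <- c(4211L, 411L, 1382L, 82L)

# The condition is a logical vector, one entry per row
cond <- a_small < 1000

# `if` can only act on a single TRUE/FALSE; in older R versions it silently
# used cond[1], which is what is reproduced here explicitly
first_only <- if (cond[1]) "first set" else "second set"

# `ifelse` evaluates the test element by element
per_row <- ifelse(cond, "first set", "second set")

print(first_only)
print(per_row)
```

Because the first value of `df$a` in the question is 4211, the single decision made by `if` was "not below 1000", and that decision was then applied to the whole column.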
We can use `findInterval`:
df$c <- with(df, ifelse(a < 1000, findInterval(b, seq(-0.2, 0.1, 0.1)),
findInterval(b, seq(0, 0.3, 0.1))) + 1)
df$c
# [1] 4 3 2 3 2 2 2 2 2 2 2 2 3 3 2 3 3 NA 3 2 2 3
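One detail that may matter if values fall exactly on a boundary: by default `findInterval` treats intervals as closed on the left, whereas the conditions in the question (and `cut` with its defaults) are closed on the right. The sketch below, using a made-up vector `b_edge` of boundary values and literal break points written out by hand, suggests that `left.open = TRUE` brings `findInterval` in line with the `<=` conditions. None of the sample data in the question sits exactly on a break, so the results shown above are unaffected.

```r
# Hypothetical values lying exactly on the breaks of the first condition set
b_edge <- c(-0.2, -0.1, 0, 0.1)
breaks_low <- c(-0.2, -0.1, 0, 0.1)

# Default: intervals are [lower, upper), so a value equal to a break
# is counted in the bin above it
left_closed <- findInterval(b_edge, breaks_low) + 1L

# left.open = TRUE: intervals are (lower, upper], matching `b <= break`
right_closed <- findInterval(b_edge, breaks_low, left.open = TRUE) + 1L

# cut() is right-closed by default, so it should agree with right_closed
cut_codes <- as.integer(cut(b_edge, breaks = c(-Inf, breaks_low, Inf)))

print(left_closed)
print(right_closed)
print(cut_codes)
```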