I have a data frame that looks like the following:
data.1 <- data.frame(
X1 = 6:10,
X2 = 1:5,
X3 = c(TRUE,FALSE,TRUE,FALSE,TRUE)
)
X1 X2 X3
1 6 1 TRUE
2 7 2 FALSE
3 8 3 TRUE
4 9 4 FALSE
5 10 5 TRUE
I want to create a new column X4 with the following logic:
if X3==NULL then X4=NULL
elseif X3==TRUE then X4=X1+X2
else X4=X1-X2
Thanks in advance
lapply is for when your data is a list, which isn't what you're doing.
Firstly, you won't find a NULL entry in a data.frame. NA, sure, but not NULL, so you should be testing with is.na(). Next, you don't need to test if(x==TRUE); R knows how to use if(x). Okay, so down to business; you were most of the way there with your ifelse. You can assign the output of an ifelse to a new column and it will take care of the vectorisation for you:
# NA in X3 gives NA; TRUE gives the sum; FALSE gives the difference
data.1$X4 <- ifelse(is.na(data.1$X3),
                    NA,
                    ifelse(data.1$X3,
                           data.1$X1 + data.1$X2,
                           data.1$X1 - data.1$X2))
data.1
## X1 X2 X3 X4
## 1 6 1 TRUE 7
## 2 7 2 FALSE 5
## 3 8 3 TRUE 11
## 4 9 4 FALSE 5
## 5 10 5 TRUE 15
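The sample data contains no missing values, so the NA branch above is never exercised. As a quick check, here is a sketch using a hypothetical copy called data.na, with one X3 value set to NA (both the name and the NA are assumptions made for illustration, not part of the question). It also suggests that ifelse() already returns NA wherever its test is NA, so the outer is.na() check may be redundant for this particular logic.

```r
# Hypothetical variant of the question's data with a missing value in X3
data.na <- data.frame(
  X1 = 6:10,
  X2 = 1:5,
  X3 = c(TRUE, FALSE, NA, FALSE, TRUE)
)

# Explicit NA handling, as in the answer
explicit <- ifelse(is.na(data.na$X3),
                   NA,
                   ifelse(data.na$X3,
                          data.na$X1 + data.na$X2,
                          data.na$X1 - data.na$X2))

# A single ifelse(): an NA in the test yields an NA in the result
implicit <- ifelse(data.na$X3,
                   data.na$X1 + data.na$X2,
                   data.na$X1 - data.na$X2)

data.na$X4 <- explicit
print(data.na)

# Compare the two approaches element by element
print(all.equal(explicit, implicit))
```

Keeping the explicit is.na() test is still reasonable when the NA case should map to something other than NA.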
That's ugly though. dplyr uses non-standard evaluation, which means column names are looked up inside the data frame, so you don't need to repeat the data frame name every time you want to reference a column, making this much cleaner. dplyr::mutate adds a new column or modifies an existing one, and returns the result rather than changing the data frame in place.
library(dplyr)
# mutate() returns a new data frame, so assign the result back
data.1 <- mutate(data.1, X4 = ifelse(is.na(X3),
                                     NA,
                                     ifelse(X3,
                                            X1 + X2,
                                            X1 - X2)))
data.1
## X1 X2 X3 X4
## 1 6 1 TRUE 7
## 2 7 2 FALSE 5
## 3 8 3 TRUE 11
## 4 9 4 FALSE 5
## 5 10 5 TRUE 15
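If adding a dependency on dplyr is not an option, base R's transform() also evaluates its arguments inside the data frame, so columns can be referenced by name alone. A minimal sketch, assuming the same data.1 as in the question (data.2 is a name introduced here purely for illustration):

```r
# Rebuild the question's data so the example is self-contained
data.1 <- data.frame(
  X1 = 6:10,
  X2 = 1:5,
  X3 = c(TRUE, FALSE, TRUE, FALSE, TRUE)
)

# transform() looks up X1, X2 and X3 inside data.1 and returns a new data frame
data.2 <- transform(data.1, X4 = ifelse(is.na(X3),
                                        NA,
                                        ifelse(X3, X1 + X2, X1 - X2)))
print(data.2)
```

Like mutate(), transform() returns a modified copy, so the result has to be assigned to keep it.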