I have a data as follows:
col1 <- c(0.1,0.2,0.0,0.5,0.6)
col2 <- c(2,2,4,5,6)
col3 <- c(1,4,3,4,5)
col4 <- c(2,3,4,4,6)
col5 <- c(5,3,3,2,1)
data.frame(col1,col2,col3,col4,col5)
col1 col2 col3 col4 col5
1 0.1 2 1 2 5
2 0.2 2 4 3 3
3 0.0 4 3 4 3
4 0.5 5 4 4 2
5 0.6 6 5 6 1
I would like to add a new column with "yes" value where in each row at least one column from col2 to column 5 is equal to 4 and "no" when the data does not meet the criteria.
So the output would look like as:
col1 col2 col3 col4 col5 col6
1 0.1 2 1 2 5 no
2 0.2 2 4 3 3 yes
3 0.0 4 3 4 3 yes
4 0.5 5 4 4 2 yes
5 0.6 6 5 6 1 no
here is my command:
new.df <- df %>% mutate(df, col6 = funs(ifelse(abs(vars(c(2:5) == 4),"yes", "no")
But I can not get the required output. do you have any idea how can I use dplyr, mutate and if else function to get the result?
We can sum when cell ==4 using rowSums
library(dplyr)
df %>% mutate(col6 = ifelse(rowSums(.[,c(2:5)]==4)>0, 'yes', 'no'))
col1 col2 col3 col4 col5 col6
1 0.1 2 1 2 5 no
2 0.2 2 4 3 3 yes
3 0.0 4 3 4 3 yes
4 0.5 5 4 4 2 yes
5 0.6 6 5 6 1 no
Using @thelatemail suggestion, we can rewrite ifelse
as below:
df %>% mutate(col6 = ifelse(Reduce(`|`, lapply(.[,c(2:5)], `==`, 4)), 'yes', 'no'))
You can try with any
ifelse(apply(df[,-1]==4,1,any),'yes','no')
[1] "no" "yes" "yes" "yes" "no"
Here are some approaches:
1) rowSums Compare all but col1
to 4, sum the comparisons and for each one that is strictly positive choose yes
else no
.
library(dplyr)
dd %>%
mutate(col6 = if_else(rowSums(select(., -col1) == 4) > 0, "yes", "no"))
giving:
col1 col2 col3 col4 col5 col6
1 0.1 2 1 2 5 no
2 0.2 2 4 3 3 yes
3 0.0 4 3 4 3 yes
4 0.5 5 4 4 2 yes
5 0.6 6 5 6 1 no
1a) Reduce A variation of this using Reduce
would be:
dd %>%
mutate(col6 =
if_else(Reduce("|", (select(., -col1) == 4) %>% as.data.frame), "yes", "no"))
2) gathr Convert to long form, summarize and join back to the original data frame.
library(dplyr)
library(tidyr)
dd %>%
mutate(seq = 1:n()) %>%
left_join(gather(., key, value, -seq, -col1) %>%
group_by(seq) %>%
summarize(col6 = if_else(any(value == 4), "yes", "no")) %>%
ungroup) %>%
select(-seq)
If col1
has unique values, as it does in the sample data shown in the question, then it could be used in place of seq
shortening the code to:
dd %>%
left_join(gather(., key, value, -col1) %>%
group_by(col1) %>%
summarize(col6 = if_else(any(value == 4), "yes", "no")) %>%
ungroup)
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.