I have a dataset with 20+ symptom columns, each coded as Yes/No/Unknown. I would like to create a new column that indicates whether the subject (ID) has no symptoms, which I define as having no symptom coded 'Yes'.
I've got a sample dataset below and I can create the column as desired, but it feels like there must be a cleaner way that uses only dplyr::mutate() rather than the filtering and joining that I'm doing.
library(dplyr)
test <- tibble(
  ID = c(1:10),
  col1 = sample(c("Yes", "No", "Unknown"), 10, replace = TRUE),
  col2 = sample(c("Yes", "No", "Unknown"), 10, replace = TRUE),
  col3 = sample(c("Yes", "No", "Unknown"), 10, replace = TRUE)
)
left_join(test, test %>%
  filter_at(vars(col1:col3), any_vars(. == "Yes")) %>%
  mutate(any_symptoms = "Yes") %>%
  select(ID, any_symptoms),
by = "ID"
) %>%
  mutate(any_symptoms = recode(any_symptoms, .missing = "No"))
#> # A tibble: 10 x 5
#>       ID col1    col2    col3    any_symptoms
#>    <int> <chr>   <chr>   <chr>   <chr>
#>  1     1 Unknown Unknown Unknown No
#>  2     2 Unknown No      No      No
#>  3     3 Yes     Yes     Unknown Yes
#>  4     4 No      Unknown Unknown No
#>  5     5 No      No      Unknown No
#>  6     6 Unknown Yes     Unknown Yes
#>  7     7 Yes     Unknown Unknown Yes
#>  8     8 No      No      No      No
#>  9     9 No      Unknown Unknown No
#> 10    10 No      No      No      No
Created on 2020-05-29 by the reprex package (v0.3.0)
You can use rowSums to check whether a row contains more than 0 "Yes" values.
test$any_symptoms <- c('No', 'Yes')[(rowSums(test[-1] == 'Yes') > 0) + 1]
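The one-liner above packs several steps together. As a rough sketch of what happens at each step, here it is unpacked on a small fixed data frame. The name `demo`, its values, and the intermediate variable names are made up for illustration and are not part of the original answer.

```r
# Small fixed example so the result is reproducible (values are made up).
demo <- data.frame(
  ID   = 1:3,
  col1 = c("Yes", "No", "Unknown"),
  col2 = c("No",  "No", "Yes"),
  stringsAsFactors = FALSE
)

is_yes  <- demo[-1] == "Yes"   # logical matrix, one column per symptom (ID dropped)
n_yes   <- rowSums(is_yes)     # number of "Yes" values in each row
has_yes <- n_yes > 0           # TRUE when the row has at least one "Yes"

# FALSE + 1 is 1 and TRUE + 1 is 2, so this indexes into c("No", "Yes").
demo$any_symptoms <- c("No", "Yes")[has_yes + 1]
demo$any_symptoms
#> [1] "Yes" "No"  "Yes"
```

Dropping the first column with `[-1]` assumes that ID is the only non-symptom column and that it comes first.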
You can also use this in dplyr pipes:
library(dplyr)
test %>% mutate(any_symptoms = c('No', 'Yes')[(rowSums(.[-1] == 'Yes') > 0) + 1])
Or use pmap_lgl from purrr:
library(purrr)
test %>%
  mutate(any_symptoms = c('No', 'Yes')[pmap_lgl(select(., starts_with('col')),
                                                ~any(c(...) == 'Yes')) + 1])
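Since the question asks for something that stays inside mutate(), here is another possible sketch using if_any(). It assumes a dplyr version that provides if_any() (1.0.4 or later, to my knowledge, which is newer than the version used in the reprex above). The `demo` tibble and its values are made up so the result is reproducible.

```r
library(dplyr)

# Fixed values (made up) so the result is reproducible.
demo <- tibble(
  ID   = 1:4,
  col1 = c("Yes", "No", "Unknown", "No"),
  col2 = c("No", "No", "Yes", "Unknown"),
  col3 = c("Unknown", "No", "No", "No")
)

res <- demo %>%
  # if_any() is TRUE for a row when the condition holds in at least one column.
  mutate(any_symptoms = if_else(if_any(col1:col3, ~ .x == "Yes"), "Yes", "No"))

res$any_symptoms
#> [1] "Yes" "No"  "Yes" "No"
```

The range `col1:col3` can be swapped for another tidyselect helper such as `starts_with("col")` when there are many symptom columns.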
This should work:
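library(tidyr)  # provides pivot_longer()
test %>%
  left_join(
    test %>%
      pivot_longer(-ID) %>%                # one row per ID/symptom pair
      group_by(ID) %>%
      mutate(is_yes = value == "Yes") %>%
      summarise(any_symptoms = ifelse(sum(is_yes) > 0, "Yes", "No")),
    by = "ID"                              # explicit key avoids the join message
  )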
This works, but might be a bit annoying if you have 20 columns, because every column has to be named in paste():
test %>%
  mutate(any_symptoms = case_when(
    grepl("Yes", paste(col1, col2, col3), fixed = TRUE) ~ "Yes",
    TRUE ~ "No"
  ))
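One possible way to keep the same paste-and-grepl idea without typing out every column is to hand the symptom columns to paste() through do.call(). This is only a sketch in base R: `demo`, `symptom_cols`, and `pasted` are illustrative names, and it assumes every column other than ID is a symptom column.

```r
# Fixed values (made up) so the result is reproducible.
demo <- data.frame(
  ID   = 1:4,
  col1 = c("Yes", "No", "Unknown", "No"),
  col2 = c("No", "No", "Yes", "Unknown"),
  col3 = c("Unknown", "No", "No", "No"),
  stringsAsFactors = FALSE
)

# Assumption: everything except ID is a symptom column.
symptom_cols <- setdiff(names(demo), "ID")

# Paste each row's symptom values into one string, e.g. "Yes No Unknown".
pasted <- do.call(paste, demo[symptom_cols])

# Substring match: safe here because the only values are Yes/No/Unknown,
# but it would also match any other value that contains "Yes".
demo$any_symptoms <- ifelse(grepl("Yes", pasted, fixed = TRUE), "Yes", "No")
demo$any_symptoms
#> [1] "Yes" "No"  "Yes" "No"
```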