I have a data frame "df" like the one below:
Samples  Status  last_contact_days_to  death_days_to
Sample1  Alive   [Not Available]       [Not Applicable]
Sample2  Dead    [Not Available]       724
Sample3  Dead    [Not Available]       1624
Sample4  Alive   1569                  [Not Applicable]
Sample5  Dead    [Not Available]       2532
Sample6  Dead    [Not Available]       1271
I want to combine the columns last_contact_days_to and death_days_to so that the result shows only the numeric values, not any other characters. If both columns contain non-numeric text, the whole row should be removed.
The result should look like the following:
Samples  Status  new_column
Sample2  Dead    724
Sample3  Dead    1624
Sample4  Alive   1569
Sample5  Dead    2532
Sample6  Dead    1271
We can change the [Not Available] and [Not Applicable] entries to NA and then use coalesce, which takes the first non-missing value from the two columns for each row:
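library(tidyverse)

df1 %>%
  # replace the placeholder strings in columns 3 and 4 with NA
  # (a formula lambda is used here because funs() is deprecated in dplyr)
  mutate_at(3:4,
            ~ replace(., . %in% c("[Not Available]", "[Not Applicable]"), NA)) %>%
  # keep the identifier columns and take the first non-NA value per row
  transmute(Samples, Status,
            new_column = coalesce(last_contact_days_to, death_days_to)) %>%
  # drop rows where both columns were placeholders
  filter(!is.na(new_column))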
#   Samples Status new_column
#1  Sample2   Dead        724
#2  Sample3   Dead       1624
#3  Sample4  Alive       1569
#4  Sample5   Dead       2532
#5  Sample6   Dead       1271
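For comparison, here is a minimal base R sketch of the same idea, assuming the two columns hold only numbers or the two placeholder strings. The helper to_num and the intermediate variable names are illustrative and not part of the answer above; the data frame is rebuilt inline so the snippet runs on its own.

```r
# Example data, same values as in the question
df1 <- data.frame(
  Samples = paste0("Sample", 1:6),
  Status = c("Alive", "Dead", "Dead", "Alive", "Dead", "Dead"),
  last_contact_days_to = c("[Not Available]", "[Not Available]",
                           "[Not Available]", "1569",
                           "[Not Available]", "[Not Available]"),
  death_days_to = c("[Not Applicable]", "724", "1624",
                    "[Not Applicable]", "2532", "1271"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: anything that is not a number becomes NA.
# as.character() makes it safe for factor columns as well.
to_num <- function(x) suppressWarnings(as.numeric(as.character(x)))

last_contact <- to_num(df1$last_contact_days_to)
death <- to_num(df1$death_days_to)

# Same role as coalesce(): take the first non-NA value per row
new_column <- ifelse(is.na(last_contact), death, last_contact)

result <- data.frame(df1[c("Samples", "Status")], new_column = new_column)

# Drop rows where both columns were placeholders
result <- result[!is.na(result$new_column), ]
rownames(result) <- NULL
print(result)
```

Unlike the replace() version, new_column is numeric here rather than character, which may or may not be what you want downstream.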
Note: As @Roland suggested, if columns 3 and 4 contain only numeric values apart from '[Not Available]' and '[Not Applicable]', the replace() step in mutate_at can be swapped for as.numeric. It converts every non-numeric element to NA (with a warning), and the rest of the pipeline works unchanged:
df1 %>%
  mutate_at(3:4, as.numeric)
# if the columns are `factor` class then wrap with `as.character`
# mutate_at(3:4, ~ as.numeric(as.character(.)))
NOTE: In the OP's dataset these columns are of class factor, so uncomment the as.character variant above and use it instead of applying as.numeric directly.
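The reason for the as.character step can be seen in a small base R sketch. The vector f is made up for illustration: as.numeric applied directly to a factor returns the internal level codes, not the numbers shown in the labels.

```r
# Illustrative factor mixing a placeholder with numeric-looking labels
f <- factor(c("[Not Available]", "724", "1624"))

# Direct conversion returns the integer level codes (which depend on how
# the levels sort), not 724 and 1624
codes <- as.numeric(f)
print(codes)

# Going through character first recovers the values; the placeholder
# becomes NA (the coercion warning is suppressed here)
values <- suppressWarnings(as.numeric(as.character(f)))
print(values)
```

With character columns, as in the example data in this answer, as.numeric alone is enough.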
df1 <- structure(list(
    Samples = c("Sample1", "Sample2", "Sample3", "Sample4",
                "Sample5", "Sample6"),
    Status = c("Alive", "Dead", "Dead", "Alive", "Dead", "Dead"),
    last_contact_days_to = c("[Not Available]", "[Not Available]",
                             "[Not Available]", "1569",
                             "[Not Available]", "[Not Available]"),
    death_days_to = c("[Not Applicable]", "724", "1624",
                      "[Not Applicable]", "2532", "1271")),
  .Names = c("Samples", "Status", "last_contact_days_to", "death_days_to"),
  class = "data.frame", row.names = c(NA, -6L))