How to combine two columns in a dataframe in R?

I have a dataframe "df" like the one below:

Samples  Status  last_contact_days_to  death_days_to
Sample1  Alive   [Not Available]       [Not Applicable]
Sample2  Dead    [Not Available]       724
Sample3  Dead    [Not Available]       1624
Sample4  Alive   1569                  [Not Applicable]
Sample5  Dead    [Not Available]       2532
Sample6  Dead    [Not Available]       1271

I want to combine the columns last_contact_days_to and death_days_to so that the result shows only the numeric values and none of the other text. If both columns contain text for a row, the whole row should be removed.

The result should look like the following:

Samples Status  new_column
Sample2 Dead    724
Sample3 Dead    1624
Sample4 Alive   1569
Sample5 Dead    2532
Sample6 Dead    1271

We can change the [Not Available] and [Not Applicable] entries to NA and then use coalesce, which keeps the first non-missing value across the two columns for each row:

library(tidyverse)
# funs() is deprecated in recent dplyr, so a formula lambda is used instead
df1 %>%
   mutate_at(3:4,
      ~ replace(., . %in% c("[Not Available]", "[Not Applicable]"), NA)) %>%
   transmute(Samples, Status,
             new_column = coalesce(last_contact_days_to, death_days_to)) %>%
   filter(!is.na(new_column))
#  Samples Status new_column
#1 Sample2   Dead        724
#2 Sample3   Dead       1624
#3 Sample4  Alive       1569
#4 Sample5   Dead       2532
#5 Sample6   Dead       1271
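To see what coalesce is doing in the pipeline above, here is a minimal sketch on two toy vectors. The vector names and values are made up for illustration and are not part of the original answer.

```r
library(dplyr)

# Two toy vectors (hypothetical); NA marks a missing value
last_contact <- c(NA, NA, "1569")
death        <- c(NA, "724", NA)

# coalesce() returns, position by position, the first value that is not NA
combined <- coalesce(last_contact, death)
combined
# [1] NA     "724"  "1569"
```

The first position stays NA because both inputs are missing there, which is why the final filter(!is.na(new_column)) step drops that row.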

Note: As @Roland suggested, if columns 3 and 4 contain only numeric values besides '[Not Available]' and '[Not Applicable]', the replace call inside mutate_at can be swapped for as.numeric. It converts every non-numeric element to NA and emits a warning about the coercion, which is harmless here.

df1 %>%
    mutate_at(3:4, as.numeric)
    # if the columns are `factor` class then wrap with `as.character`
    # mutate_at(3:4, ~ as.numeric(as.character(.)))

NOTE: In the OP's dataset, these columns are of class factor. So, uncomment the line above and use it instead of applying as.numeric directly.
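The reason for the as.character wrapper can be shown with a small base R sketch. The factor f below is a hypothetical stand-in for one of the columns, with values copied from the question.

```r
# Toy factor standing in for one of the columns (hypothetical)
f <- factor(c("[Not Applicable]", "724", "1624"))

# Applied directly, as.numeric() returns the internal level codes,
# not the numbers printed in the column
codes <- as.numeric(f)

# Going through as.character() first recovers the real values;
# the non-numeric label becomes NA (the coercion warning is silenced here)
days <- suppressWarnings(as.numeric(as.character(f)))
days
# [1]   NA  724 1624
```

The exact values in codes depend on how the levels happen to sort, so they are not shown; the point is only that they are small integers rather than the day counts.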

data

df1 <- structure(list(Samples = c("Sample1", "Sample2", "Sample3", "Sample4", 
"Sample5", "Sample6"), Status = c("Alive", "Dead", "Dead", "Alive", 
"Dead", "Dead"), last_contact_days_to = c("[Not Available]", 
"[Not Available]", "[Not Available]", "1569", "[Not Available]", 
"[Not Available]"), death_days_to = c("[Not Applicable]", "724", 
"1624", "[Not Applicable]", "2532", "1271")), .Names = c("Samples", 
"Status", "last_contact_days_to", "death_days_to"), 
 class = "data.frame", row.names = c(NA, 
-6L))
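For readers who prefer not to load any packages, the same result can probably be reached in base R. This is only a sketch and not part of the original answer: the helper to_num and the intermediate names contact, death and result are made up here, and the data frame is rebuilt with data.frame so the block runs on its own.

```r
# Same data as above, rebuilt so the example is self-contained
df1 <- data.frame(
  Samples = paste0("Sample", 1:6),
  Status = c("Alive", "Dead", "Dead", "Alive", "Dead", "Dead"),
  last_contact_days_to = c("[Not Available]", "[Not Available]",
                           "[Not Available]", "1569",
                           "[Not Available]", "[Not Available]"),
  death_days_to = c("[Not Applicable]", "724", "1624",
                    "[Not Applicable]", "2532", "1271"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: non-numeric text becomes NA; also safe for factors
to_num <- function(x) suppressWarnings(as.numeric(as.character(x)))

contact <- to_num(df1$last_contact_days_to)
death   <- to_num(df1$death_days_to)

# Take last_contact_days_to where it is present, otherwise death_days_to
new_column <- ifelse(is.na(contact), death, contact)

# Keep the identifier columns and drop rows where both sources were missing
result <- data.frame(df1[c("Samples", "Status")], new_column)
result <- result[!is.na(result$new_column), ]
result
```

Unlike the coalesce version on character columns, new_column ends up numeric here, which may be more convenient for later calculations.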
