Merging columns of a dataframe in R

I have the following data frame:

c1 <- c(1,2,"<NA>","<NA>")
c2 <- c("<NA>","<NA>",3,4)
df <- data.frame(c1,c2)

> df

    c1   c2
1    1 <NA>
2    2 <NA>
3 <NA>    3
4 <NA>    4

The following is the desired output that I'm trying to obtain after merging columns 1 and 2:

> df

  c1
1  1
2  2
3  3
4  4

I tried:

df <- mutate(df, x = paste(c1, c2))

which gives:

> df
    c1   c2      x
1    1 <NA> 1 <NA>
2    2 <NA> 2 <NA>
3 <NA>    3 <NA> 3
4 <NA>    4 <NA> 4

Could someone suggest how to obtain the desired output?

One way is this:

c1 <- c(1, 2, NA, NA)
c2 <- c(NA, NA, 3, 4)
df <- data.frame(c1, c2)

df2 <- data.frame(
  c1 = ifelse(is.na(df$c1), df$c2, df$c1)
)

#df2
#  c1
#1  1
#2  2
#3  3
#4  4
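Note that the ifelse approach above relies on real NA values, while the data frame in the question stores the literal string "<NA>". A minimal sketch of bridging the two, assuming the placeholder is always exactly "<NA>" and that stringsAsFactors = FALSE is acceptable (both are assumptions, not part of the original answer):

```r
# The question's data: missing values stored as the literal string "<NA>"
c1 <- c(1, 2, "<NA>", "<NA>")
c2 <- c("<NA>", "<NA>", 3, 4)
df <- data.frame(c1, c2, stringsAsFactors = FALSE)

# Replace the placeholder with a real NA, then convert each column to numeric
df[] <- lapply(df, function(col) as.numeric(replace(col, col == "<NA>", NA)))

# With real NAs in place, the ifelse approach applies unchanged
df2 <- data.frame(c1 = ifelse(is.na(df$c1), df$c2, df$c1))
```

After the conversion, is.na() can see the missing values, so df2 holds a single numeric column c1.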

You are close, but you are pasting together two strings, and one of them is the literal string "<NA>" standing in for a missing value. paste includes every string it is given, so if you do not want a string to appear in the result, you need to turn it into a zero-length string first. You can do this with the recode function in dplyr.

You can modify your code to be:

library(dplyr)
df <- mutate(df, x = paste0(recode(c1, "<NA>" = ""), recode(c2, "<NA>" = "")))
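For comparison, the same idea can be sketched in base R without dplyr. This swaps recode for gsub, which is a different technique from the one in the answer above, and it assumes the columns are character vectors rather than factors:

```r
c1 <- c(1, 2, "<NA>", "<NA>")
c2 <- c("<NA>", "<NA>", 3, 4)
df <- data.frame(c1, c2, stringsAsFactors = FALSE)

# Paste the two columns, then strip the literal "<NA>" placeholder
pasted <- paste0(df$c1, df$c2)
df$x <- gsub("<NA>", "", pasted, fixed = TRUE)

# The pasted result is character; convert it if numbers are needed
df$x <- as.numeric(df$x)
```

Either way, the pasted column is a character vector until it is converted explicitly.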

Another way, using dplyr from the tidyverse:

library(dplyr)

df2 <- df %>%
  mutate(c3 = if_else(is.na(c1), c2, c1)) %>%
  select(-c1, -c2) %>% # Given you only wanted one column
  rename(c1 = c3) # Given you wanted the column to be called c1

Output:

  c1
1  1
2  2
3  3
4  4

You could use rowSums, which works here because the columns are numeric and each row has exactly one non-missing value:

data.frame(c1 = rowSums(df, na.rm = TRUE))
#   c1
# 1  1
# 2  2
# 3  3
# 4  4
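The rowSums approach depends on the shape of the data. A small sketch with hypothetical values (not from the question) shows where it diverges from picking the first non-missing value:

```r
# Hypothetical data: row 3 has two values, row 4 has none
df <- data.frame(c1 = c(1, NA, 5, NA), c2 = c(NA, 3, 6, NA))

# rowSums adds every non-missing value in the row;
# unname() drops the row names that rowSums attaches
sums <- unname(rowSums(df, na.rm = TRUE))

# Taking the first non-missing value keeps c1 when both are present
first <- ifelse(is.na(df$c1), df$c2, df$c1)
```

Here sums is 1, 3, 11, 0 while first is 1, 3, 5, NA, so rowSums is only a safe shortcut when at most one column is filled per row and an all-missing row should not become 0.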

Since it seems the dataframe actually contains NA values rather than '<NA>' strings, I would suggest using coalesce:

c1 <- c(1, 2, NA, NA)
c2 <- c(NA, NA, 3, 4)
df <- data.frame(c1, c2)

library(tidyverse)
df %>% 
  mutate(c3=coalesce(c1, c2))

Output:

  c1 c2 c3
1  1 NA  1
2  2 NA  2
3 NA  3  3
4 NA  4  4
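coalesce takes the first non-missing value across its arguments. If dplyr is not available, the same idea can be sketched in base R with Reduce. The helper name coalesce_cols and the three-column data are hypothetical, for illustration only:

```r
# Hypothetical helper: first non-NA value per row across all columns
coalesce_cols <- function(data) {
  Reduce(function(a, b) ifelse(is.na(a), b, a), data)
}

# Hypothetical three-column example
df <- data.frame(
  c1 = c(1, NA, NA),
  c2 = c(NA, 2, NA),
  c3 = c(9, 9, 3)
)

merged <- data.frame(c1 = coalesce_cols(df))
```

Reduce walks the columns from left to right, so earlier columns take priority, which mirrors the argument order in coalesce(c1, c2).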
