Problem with na.rm = TRUE when combining multiple character columns using unite from dplyr

Question

When trying to combine multiple character columns using unite from dplyr, the na.rm = TRUE option does not remove NA.

Step by step:

The original dataset has 5 columns, word1:word5 (image of the original data).
I'm looking to combine word1:word5 into a single column using this code:

    data_unite_5 <-  data_original_5 %>%
        unite("pentawords", word1:word5, sep=" ", na.rm=TRUE, remove=FALSE)

Here's an image of the output: data_unite_5

I've tried using mutate_if(is.factor, as.character), but that did not work.

Any suggestions would be appreciated.

Answer 1

You have misinterpreted how the na.rm argument works for unite. Following the examples on the tidyverse page, z is the unite of x and y.

With na.rm = FALSE

    #>   z     x     y    
    #>   <chr> <chr> <chr>
    #> 1 a_b   a     b    
    #> 2 a_NA  a     NA   
    #> 3 NA_b  NA    b    
    #> 4 NA_NA NA    NA

With na.rm = TRUE

    #>   z     x     y    
    #>   <chr> <chr> <chr>
    #> 1 "a_b" a     b    
    #> 2 "a"   a     NA   
    #> 3 "b"   NA    b    
    #> 4 ""    NA    NA
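The two outputs above can be reproduced with a short script. This is a minimal sketch, assuming tidyr 1.0.0 or later is installed (unite() is exported by tidyr, not dplyr); the names df, with_na and without_na are illustrative only.

```r
library(tidyr)

# Every combination of a present and a missing value in x and y
df <- expand_grid(x = c("a", NA), y = c("b", NA))

# Default na.rm = FALSE: NA is pasted into the result as the text "NA"
with_na <- unite(df, "z", x:y, remove = FALSE)

# na.rm = TRUE: NA values are skipped when building each string,
# but the number of rows stays the same
without_na <- unite(df, "z", x:y, na.rm = TRUE, remove = FALSE)

print(with_na)
print(without_na)
```

Both results still have four rows; only the contents of z differ.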

Hence na.rm determines how NA values appear in the assembled strings (pentawords); it does not drop rows from the data.

If you want to remove rows where every column is NA, like the fourth row in the example above, I would recommend filter.

    data_unite_5 <- data_original_5 %>%
      unite("pentawords", word1:word5, sep = " ", na.rm = TRUE, remove = FALSE) %>%
      filter(pentawords != "")

This will exclude from your output every row whose united string is empty.
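To see the unite-then-filter approach end to end, here is a small sketch on made-up data. The toy tibble and its three word columns are hypothetical stand-ins for data_original_5, which is not shown in the question.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for data_original_5 (three columns instead of five)
toy <- tibble(
  word1 = c("the", "a", NA),
  word2 = c("quick", NA, NA),
  word3 = c("fox", "dog", NA)
)

toy_united <- toy %>%
  # na.rm = TRUE skips NA values inside each string
  unite("words", word1:word3, sep = " ", na.rm = TRUE, remove = FALSE) %>%
  # rows where every column was NA become "", so drop those
  filter(words != "")

print(toy_united)
```

The third row, which is NA in every column, is the only one removed; the second row keeps its partial string "a dog".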
