Prevent dplyr from joining on NA

Question

I'd like to do a full join of two data frames. To my surprise, dplyr's default behavior is to match NA keys with each other when they exist in both data frames. Is there a way to prevent dplyr from doing this?

Here's an example with an inner join:

x <- data.frame(a = c(5, NA, 9), b = 1:3)
y <- data.frame(a = c(5, NA, 9), c = 4:6)
z <- dplyr::inner_join(x, y, by = 'a')

I would like z to contain only 2 records, not 3. Ideally, I want to do this without having to manually filter out the records with NAs beforehand and then append them to the result, since that seems clumsy.

Answer 1

You can use na_matches = "never". This is mentioned in the NEWS for v0.7.0, but I don't see it in the documentation. The default is na_matches = "na".

This returns two rows instead of three:

dplyr::inner_join(x, y, by = 'a', na_matches = "never")

  a b c
1 5 1 4
2 9 3 6
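
Since the question is about a full join, here is a minimal sketch comparing the default with na_matches = "never" on the same x and y. The names z_default, z_never, and full_never are illustrative only, and the full-join behavior described assumes a reasonably recent dplyr release:

```r
library(dplyr)

x <- data.frame(a = c(5, NA, 9), b = 1:3)
y <- data.frame(a = c(5, NA, 9), c = 4:6)

# Default (na_matches = "na"): the NA key in x matches the NA key in y,
# so the inner join keeps all three rows.
z_default <- inner_join(x, y, by = "a")

# na_matches = "never": NA keys never match, so only a = 5 and a = 9 remain.
z_never <- inner_join(x, y, by = "a", na_matches = "never")

# Full join with na_matches = "never": the NA rows are not paired with each
# other, but a full join still keeps unmatched rows from both sides.
full_never <- full_join(x, y, by = "a", na_matches = "never")

print(nrow(z_default))
print(nrow(z_never))
print(full_never)
```

With a full join, the NA-keyed rows should not disappear. Each one is expected to appear as its own unmatched row, with NA filling the columns from the other table. If those rows are unwanted in the result, they still need to be filtered out, for example with dplyr::filter(!is.na(a)).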

19 · Accepted · 2017-09-11 17:52:54