Ignoring NAs in a data frame when finding unique rows

Question

I have a data frame with 20 columns and about 200 rows, and I would like to find the unique rows. The problem is that nearly every row has a few NAs mixed in. These are genuinely missing data, and I would like each NA to be treated as a "wildcard" rather than as a value that only matches other NAs.

The following two rows should be recognized as a match (i.e. non-unique):

T, S, NA, Z
NA, S, G, Z

I've tried the incomparables argument of the unique function, but it doesn't seem to be implemented. Thanks a lot.

Answer 1

Put this in a double for loop:

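# unlist() turns each one-row data frame into a plain vector, so na.omit() drops only the NA comparisons
all(na.omit(unlist(x[1, ]) == unlist(x[2, ])))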

Replace 1 and 2 with i and j to cycle through all the comparisons you need to check.
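
For illustration, the double loop could be sketched roughly as follows. This is only one possible reading of the suggestion, not a definitive implementation: the helper names rows_match and unique_wildcard are made up for this example, the sample data is the data frame posted elsewhere in this thread, and the code assumes every column is character so that as.matrix() gives a character matrix.

```r
# Example data: the data frame posted in this thread (all columns are character).
df <- data.frame(
  V1 = c("T", NA, "S", "S", "S"),
  V2 = c("S", "S", "G", NA, "G"),
  V3 = c(NA, "G", "Z", "Z", NA),
  V4 = c("Z", "Z", "T", "G", "Z"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE when two vectors agree wherever both are non-NA,
# so an NA on either side behaves like a wildcard.
rows_match <- function(a, b) all(na.omit(a == b))

# Hypothetical helper: keep a row only if it matches no previously kept row.
unique_wildcard <- function(x) {
  m <- as.matrix(x)                 # assumption: every column is character
  keep <- rep(TRUE, nrow(m))
  for (i in seq_len(nrow(m))) {
    if (!keep[i]) next              # row i was already marked as a duplicate
    for (j in seq_len(nrow(m) - i) + i) {   # rows after i; empty when i is last
      if (keep[j] && rows_match(m[i, ], m[j, ])) keep[j] <- FALSE
    }
  }
  x[keep, , drop = FALSE]
}

result <- unique_wildcard(df)
```

On this sample data the sketch keeps rows 1, 3, 4 and 5, because row 2 agrees with row 1 at every position where both have a value. Wildcard matching is not transitive (A can match B and B can match C without A matching C), so the rows that survive may depend on row order.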

Answer 2

You could try

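# Build one key per row by pasting together its non-NA values
val <- apply(df, 1, function(x) paste(na.omit(x), collapse = ''))
# Keep the first row seen for each distinct key
df[!duplicated(val), ]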
#    V1 V2   V3 V4
#1    T  S <NA>  Z
#2 <NA>  S    G  Z
#3    S  G    Z  T

data

 df <- structure(list(V1 = c("T", NA, "S", "S", "S"), V2 = c("S", "S", 
 "G", NA, "G"), V3 = c(NA, "G", "Z", "Z", NA), V4 = c("Z", "Z", 
 "T", "G", "Z")), .Names = c("V1", "V2", "V3", "V4"), row.names = c(NA, 
 -5L), class = "data.frame")
