
Extracting unique values from data frame using R

I have a data frame with multiple columns, and I want to isolate two of them and count the unique values across both. Here's an example of what I mean:

Let's say I have a data frame df:

df <- data.frame(v1 = c(1, 2, 3, 2, "a"), v2 = c("a", 2, "b", "b", 4))
df

  v1 v2
1  1  a
2  2  2
3  3  b
4  2  b
5  a  4

Now what I'm trying to do is extract just the unique values across the two columns. If I use unique() on each column separately, the output looks like this:

> unique(df[,1])
[1] 1 2 3 a
> unique(df[,2])
[1] a 2 b 4

But this is no good, as it only finds the unique values per column, whereas I need the unique values across the two columns combined. For instance, 'a' appears in both columns, but I only want it counted once. As an example of the output I need, imagine the columns v1 and v2 stacked on top of each other like so:

  V1_V2
1      1
2      2
3      3
4      2
5      a
6      a
7      2
8      b
9      b
10     4

The unique values of V1_V2 would be:

   V1_V2
1      1
2      2
3      3
5      a
8      b
10     4

Then I could just count the rows using nrow(). Any ideas how I'd achieve this?

This is well suited for union:

data.frame(V1_V2=union(df$v1, df$v2))

#  V1_V2
#1     1
#2     2
#3     3
#4     a
#5     b
#6     4
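Since the question ultimately asks for a count, the result of union() can be measured directly with length() instead of building a data frame and calling nrow(). A minimal sketch, assuming the df from the question; n_unique is a name introduced here purely for illustration.

```r
# Same data frame as in the question.
df <- data.frame(v1 = c(1, 2, 3, 2, "a"), v2 = c("a", 2, "b", "b", 4))

# union() returns each distinct value once, so its length is the count.
# n_unique is a hypothetical name, not part of the original answer.
n_unique <- length(union(df$v1, df$v2))
n_unique
```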

Try this:

unique(c(df[,1], df[,2]))
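One caveat, offered as an assumption about your setup rather than something the answer states: if the columns are stored as factors (the data.frame() default for strings before R 4.0), c() in R versions before 4.1 combines the underlying integer codes rather than the labels, which can give the wrong set of values. Converting to character first avoids depending on the R version. In this sketch, stringsAsFactors = TRUE is set explicitly only to reproduce the factor case, and vals is an illustrative name.

```r
# Force factor columns to mimic the pre-4.0 default (assumption for the demo).
df <- data.frame(v1 = c(1, 2, 3, 2, "a"), v2 = c("a", 2, "b", "b", 4),
                 stringsAsFactors = TRUE)

# as.character() turns each factor into its labels before combining,
# so unique() compares the values themselves, not the level codes.
vals <- unique(c(as.character(df[, 1]), as.character(df[, 2])))
length(vals)
```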

With this approach, you can obtain the unique values no matter how many columns you have:

df2 <- as.vector(as.matrix(df))
unique(df2)

And then just use length() on the result.
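The question mentions a data frame with more columns than the two of interest, so the same idea can be applied to a subset of columns. The following is only a sketch: the v3 column and the names cols and values are hypothetical, added to show the selection step.

```r
# Hypothetical data frame with a third column that should be ignored.
df <- data.frame(
  v1 = c(1, 2, 3, 2, "a"),
  v2 = c("a", 2, "b", "b", 4),
  v3 = c("x", "y", "z", "x", "y")
)

cols <- c("v1", "v2")  # the two columns of interest

# as.matrix() on character (or factor) columns yields a character matrix;
# as.vector() then flattens it column by column.
values <- unique(as.vector(as.matrix(df[, cols])))
length(values)
```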

A generic approach:

uq_elem <- c()
for (i in seq_len(ncol(df))) {
  # Prepend this column's unique values, then de-duplicate the running result
  uq_elem <- unique(c(unique(df[, i]), uq_elem))
}

All the distinct elements will be in uq_elem.
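For comparison, the same accumulation can be written without an explicit loop by folding union() over the columns with Reduce(). This is an alternative technique, not the method given in the answer above, and the order of the resulting elements may differ from the loop's because the loop prepends each new column.

```r
df <- data.frame(v1 = c(1, 2, 3, 2, "a"), v2 = c("a", 2, "b", "b", 4))

# A data frame is a list of columns; Reduce() applies union() pairwise
# across that list, keeping each distinct value once.
uq_elem <- Reduce(union, as.list(df))
length(uq_elem)
```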
