
R dataframe filter and count unique entries

Suppose I have a dataframe such as:

A    B    C    D  
1    1    1    1  
1    1    1    1  
2    2    1    2  
2    2    2    2  
2    2    1    2  
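
For anyone who wants to reproduce this, the table above could be built roughly as follows. The names `dat` and `df1` are assumptions taken from the answers further down, which refer to the same data under both names; the question itself does not name the data frame.

```r
# Example data from the question. `dat` is an assumed name that matches
# the data.table and dplyr answers.
dat <- data.frame(
  A = c(1, 1, 2, 2, 2),
  B = c(1, 1, 2, 2, 2),
  C = c(1, 1, 1, 2, 1),
  D = c(1, 1, 2, 2, 2)
)

# The base R answer calls the same data `df1`, so keep an alias.
df1 <- dat

print(dat)
```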

I want to create a dataframe that contains only the unique rows, together with a count of how many times each one occurs. Something like this:

A    B    C    D    count
1    1    1    1     2  
2    2    1    2     2   
2    2    2    2     1  

How would I do this?

You can try using the "data.table" package, like this:

> library(data.table)
> as.data.table(dat)[, .N, by = names(dat)]
   A B C D N
1: 1 1 1 1 2
2: 2 2 1 2 2
3: 2 2 2 2 1

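Or similarly with "dplyr". The group_by_() form originally posted here has been deprecated since dplyr 0.7.0; with dplyr 1.0.0 or later, count() combined with across(everything()) groups by every column and counts the rows in each group:

> library(dplyr)
> dat %>% count(across(everything()))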
  A B C D n
1 1 1 1 1 2
2 2 2 1 2 2
3 2 2 2 2 1

A base R option (the data frame is called df1 here rather than dat) is:

aggregate(cbind(Count=1:nrow(df1))~., df1, FUN=length)
#    A B C D Count
#  1 1 1 1 1     2
#  2 2 2 1 2     2
#  3 2 2 2 2     1

Or a modification suggested by @David Arenburg, which binds a constant Count column onto the data before aggregating:

aggregate(Count ~ ., cbind(Count = 1, df1), FUN=length)
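
As a quick sanity check, the two aggregate() calls above can be run side by side on the question's data. This is only a sketch: `df1` is rebuilt here from the table in the question, and `res1` and `res2` are hypothetical names introduced for the comparison.

```r
# Rebuild the question's data under the name used by the base R answer.
df1 <- data.frame(
  A = c(1, 1, 2, 2, 2),
  B = c(1, 1, 2, 2, 2),
  C = c(1, 1, 1, 2, 1),
  D = c(1, 1, 2, 2, 2)
)

# Variant 1: a dummy row-index column on the left-hand side; length()
# then counts how many rows fall into each unique A/B/C/D combination.
res1 <- aggregate(cbind(Count = 1:nrow(df1)) ~ ., df1, FUN = length)

# Variant 2: bind a constant Count column onto the data first, then
# take its length within each group.
res2 <- aggregate(Count ~ ., cbind(Count = 1, df1), FUN = length)

print(res1)

# Both variants should agree, and the counts should add up to nrow(df1).
stopifnot(all(res1$Count == res2$Count))
stopifnot(sum(res1$Count) == nrow(df1))
```

In both variants the values in the Count column are irrelevant; only the number of values per group matters, which is why a row index and a constant 1 give the same result.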
