简体   繁体   中英

R count and list unique rows for each column satisfying a condition

I have been going crazy with something basic...

I am trying to count and list in a comma separated column each unique ID coming up in a data frame, eg:

df<-data.frame(id = as.character(c("a", "a", "a", "b", "c", "d", "d", "e", "f")), x1=c(3,1,1,1,4,2,3,3,3),
x2=c(6,1,1,1,3,2,3,3,1),
x3=c(1,1,1,1,1,2,3,3,2))

> > df  
  id x1 x2 x3
1  a  3  6  1
2  a  1  1  1
3  a  1  1  1
4  b  1  1  1
5  c  4  3  1
6  d  1  2  2
7  d  3  3  3
8  e  1  3  3
9  f  3  1  2

I am trying to get a count of unique id that satisfy a condition, >1:

res = data.frame(x1_counts =5, x1_names="a,c,d,e,f", x2_counts = 4, x2_names="a,c,d,f", x3_counts = 3, x3_names="d,e,f")

> res
  x1_counts  x1_names x2_counts x2_names x3_counts x3_names
1         5 a,c,d,e,f         4  a,c,d,f         3    d,e,f

I have tried with data.table but it seems very convoluted, ie DT = as.data.table(df) res <- DT[, list(x1= length(unique(id[which(x1>1)])), x2= length(unique(id[which(x2>1)]))), by=id)

But I can't get it right, I am going not getting what I need to do with data.table since it is not really a grouping I am looking for. Can you direct me in the right path please? Thanks so much!

You can reshape your data to long format and then do the summary:

library(data.table)
(melt(setDT(df), id.vars = "id")[value > 1]
   [, .(counts = uniqueN(id), names = list(unique(id))), variable])
   # You can replace the list to toString if you want a string as name instead of list

#   variable counts     names
#1:       x1      5 a,c,d,e,f
#2:       x2      4   a,c,d,e
#3:       x3      3     d,e,f

To get what you need, reshape it back to wide format:

dcast(1~variable, 
      data = (melt(setDT(df), id.vars = "id")[value > 1]
                 [, .(counts = uniqueN(id), names = list(unique(id))), variable]),  
      value.var = c('counts', 'names'))

#    . counts_x1 counts_x2 counts_x3  names_x1 names_x2 names_x3
# 1: .         5         4         3 a,c,d,e,f  a,c,d,e    d,e,f

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM