简体   繁体   中英

Compute the number of distinct values in col2 for each distinct value in col1 in R

I have a dataframe like this:

df <- data.frame(
          SchoolID=c("A","A","B","B","C","D"),
          Country=c("XX","XX","XX","YY","ZZ","ZZ"))

which gives me this data:

    SchoolID    Country
1   A           XX
2   A           XX
3   B           XX
4   B           YY
5   C           ZZ
6   D           ZZ

I would like to know for each SchoolID whether Country is uniquely assigned, by looking, for each distinct value of SchoolID, the number of distinct values of Country. So I would like to obtain this kind of table:

SchoolID   NumberOfCountry
A          1
B          2
C          1
D          1
aggregate(Country ~ SchoolID, df, function(x) length(unique(x)))

Or

tapply(df$Country, df$SchoolID, function(x) length(unique(x)))

Or

library(data.table) 
setDT(df)[, .(NumberOfCountry = length(unique(Country))), by = SchoolID]

Or with v >1.9.5

setDT(df)[, .(NumberOfCountry = uniqueN(Country)), by = SchoolID]

Or

library(dplyr)
df %>% 
  group_by(SchoolID) %>% 
  summarise(NumberOfCountry = n_distinct(Country))

One approach, which does not rely on third-party libraries:

> as.data.frame(rowSums(table(df[!duplicated(df), ]), na.rm=T))
  rowSums(table(df[!duplicated(df), ]), na.rm = T)
A                                                1
B                                                2
C                                                1
D                                                1

try this..

select School,count(Country)
from(
select distinct School,Country
from tbl_stacko) temp
group by School

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM