
How do I count the number of observations (1s) per column?

Suppose I have the following data frame (please read 0 as NA):

x = c("a","b","c","d","e")
y = c(1,1,0,0,1)
z = c(0,0,0,0,1)
data.frame(x,y,z)

How do I count the number of 1s for each column (i.e., a, b, c, d, e)?

If the data contain only 0s and 1s, you can simply sum across each row with rowSums(). The selector [2:3] picks columns 2 through 3 of the data frame; adapt that vector to your needs.

The second part adds the names from the vector x.

x = c("a","b","c","d","e")
y = c(1,1,0,0,1)
z = c(0,0,0,0,1)
df <- data.frame(x,y,z)

s <- rowSums(df[2:3])
names(s) <- x
s
# a b c d e 
# 1 1 0 0 2 
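
The question says that 0 stands in for NA. If the real data hold NA rather than 0, a plain rowSums() returns NA for every row with a missing value. The following is a minimal sketch under that assumption; df_na and s_na are hypothetical names used only for illustration.

```r
# Same data as above, but with real NA values where the question wrote 0
x <- c("a", "b", "c", "d", "e")
y <- c(1, 1, NA, NA, 1)
z <- c(NA, NA, NA, NA, 1)
df_na <- data.frame(x, y, z)

# Compare each cell with 1, then count the TRUEs in each row;
# na.rm = TRUE drops the NA comparisons so they contribute nothing
s_na <- rowSums(df_na[2:3] == 1, na.rm = TRUE)
names(s_na) <- df_na$x
s_na
# a b c d e 
# 1 1 0 0 2 
```

Comparing with == 1 counts only cells that equal 1, so this form also works if the columns ever hold values other than 0, 1 and NA.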

Your code produces a data.frame in which a, b, c, d and e are rows rather than columns:

data <- data.frame(x,y,z)
data
  x y z
1 a 1 0
2 b 1 0
3 c 0 0
4 d 0 0
5 e 1 1

You might consider transposing the data with t():

t(data)
  [,1] [,2] [,3] [,4] [,5]
x "a"  "b"  "c"  "d"  "e" 
y "1"  "1"  "0"  "0"  "1" 
z "0"  "0"  "0"  "0"  "1" 

However, now we have two problems. First, the data is now a matrix. Second, the data is now character: a matrix can hold only one type, so the numeric values are coerced to character to sit alongside the letters.
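
The coercion can be checked directly. This is a small sketch that rebuilds the same data; m and y_num are hypothetical names used only for illustration.

```r
data <- data.frame(x = c("a", "b", "c", "d", "e"),
                   y = c(1, 1, 0, 0, 1),
                   z = c(0, 0, 0, 0, 1))

m <- t(data)
is.matrix(m)   # TRUE
typeof(m)      # "character"

# The values have to be converted back before any arithmetic works
y_num <- as.numeric(m["y", ])
sum(y_num)     # 3
```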

Instead, we might subset only the numeric columns and transpose those:

new.data <- as.data.frame(t(data[,-1]))
new.data
  V1 V2 V3 V4 V5
y  1  1  0  0  1
z  0  0  0  0  1

Now we can add back the column names.

colnames(new.data) <- data[,1]
new.data
  a b c d e
y 1 1 0 0 1
z 0 0 0 0 1

And now it's easy with colSums:

result <- colSums(new.data)
result
a b c d e 
1 1 0 0 2 

If we need NA in place of 0, we can assign it with logical subsetting:

result[result == 0] <- NA
result
a  b  c  d  e 
1  1 NA NA  2 
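
The steps above can be collected into one small function. This is only a sketch: count_ones is a hypothetical helper name, and it assumes the first column holds the labels and every remaining column is numeric.

```r
# Hypothetical helper: transpose the numeric columns, label them, sum them
count_ones <- function(d) {
  num <- t(d[, -1])                    # numeric matrix, one row per original column
  colnames(num) <- as.character(d[, 1])  # labels taken from the first column
  colSums(num)
}

data <- data.frame(x = c("a", "b", "c", "d", "e"),
                   y = c(1, 1, 0, 0, 1),
                   z = c(0, 0, 0, 0, 1))

result <- count_ones(data)
result
# a b c d e 
# 1 1 0 0 2 
```

Because only the numeric columns are transposed, the matrix stays numeric and colSums() can be applied to it directly, without converting back to a data.frame first.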


 