
How to count elements in each column of a table in R

I have a data set that looks like this (the real one has more than 50 columns):

data <- read.csv("sample.csv")

subject gender  age type    satisfation     agree 
1   f   22  a   yes yes
2   f   23  b   no  yes 
3   f   21  b       no
4   m   24  c   yes yes 
5   f   22  b   no  yes
6   m       a   yes yes 
7       25  c   yes no
8   m   21  b   no  yes 
9   f   23  c   yes yes

I would like to count the elements in each column (excluding NA) and export the result in the layout below:

subject gender  age type    satisfation     agree 
9   8   8   9   8   9

I wrote a script to do the counting:

counting <- function(x) {
  for(i in 1:length(data)) {
     data <- length(which(!is.na(x$i)))
      print(data)
  }
  return(data)
}   
counting(data)

It didn't work: it gave 0 for every column.

dput(head(data, 9))

structure(list(subject = 1:9, gender = structure(c(2L, 2L, 2L, 
3L, 2L, 3L, 1L, 3L, 2L), .Label = c("", "f", "m"), class = "factor"), 
    age = c(22L, 23L, 21L, 24L, 22L, NA, 25L, 21L, 23L), type = structure(c(1L, 
    2L, 2L, 3L, 2L, 1L, 3L, 2L, 3L), .Label = c("a", "b", "c"
    ), class = "factor"), satisfation = structure(c(3L, 2L, 1L, 
    3L, 2L, 3L, 3L, 2L, 3L), .Label = c("", "no", "yes"), class = "factor"), 
    agree = structure(c(2L, 3L, 1L, 3L, 2L, 3L, 1L, 3L, 2L), .Label = c("no", 
    "yes", "yes "), class = "factor"), time = c(23L, 54L, 67L, 
    324L, 87L, 12L, 756L, 34L, 98L), day = c(1L, 3L, 2L, 5L, 
    7L, 4L, 3L, 1L, 4L)), .Names = c("subject", "gender", "age", 
"type", "satisfation", "agree", "time", "day"), row.names = c(NA, 
9L), class = "data.frame")

Are there any recommendations for fixing the script?

Thank you all in advance!

Assuming the missing values are already coded as NA, simply use colSums:

colSums(!is.na(df))
#     subject      gender         age        type satisfation       agree        time         day
#           9           9           8           9           9           9           9           9
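The question also asks for the counts to be exported as a single row. One possible way, sketched here on a small made-up data frame (the data and the temporary output file are assumptions, not the asker's `sample.csv`), is to transpose the named vector that `colSums` returns into a one-row data frame and write that out:

```r
# Hypothetical stand-in for the asker's data
df <- data.frame(gender = c("f", NA, "m"), age = c(22L, NA, 21L))

# Named vector of non-NA counts, one entry per column
counts <- colSums(!is.na(df))

# t() turns the named vector into a 1-row matrix, keeping the column names
one_row <- as.data.frame(t(counts))

# Written to a temporary file here; substitute a real path as needed
out_file <- tempfile(fileext = ".csv")
write.csv(one_row, out_file, row.names = FALSE)

print(one_row)
```

Passing `row.names = FALSE` keeps the file to a header line plus one line of counts, which matches the layout shown in the question.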

Adding @DavidArenburg's suggestion, which also excludes empty strings and so avoids the NA problem:

colSums(!is.na(df) & df != "", na.rm = TRUE)
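To see what excluding empty strings changes, here is a minimal sketch on a small made-up data frame (the name `toy` and its values are hypothetical). It assumes blanks are stored as empty strings and requires a cell to be both non-NA and non-empty, combining the two tests with `&`:

```r
# Hypothetical toy data: one real NA and two empty strings
toy <- data.frame(
  gender = c("f", "", "m"),
  age    = c(22L, NA, 21L),
  agree  = c("yes", "no", ""),
  stringsAsFactors = FALSE
)

# Counts only non-NA cells, so empty strings are still counted
na_only <- colSums(!is.na(toy))

# Also excludes empty strings; the NA produced by NA != "" is masked
# because FALSE & NA evaluates to FALSE
non_blank <- colSums(!is.na(toy) & toy != "")

print(na_only)
print(non_blank)
```

Here `na_only` counts the two blank cells as present, while `non_blank` drops them.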

When I load your table into R, the missing values come in as blank strings rather than NA. So when you read your .csv file, specify how NAs are coded. It looks like they are coded as "" or maybe " ".
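As a rough illustration of that advice, the sketch below reads a small inline CSV (hypothetical contents standing in for `sample.csv`). It assumes the blanks are either empty or whitespace-only fields, so it combines `strip.white = TRUE` with `na.strings` to turn both into NA:

```r
# Hypothetical CSV contents standing in for sample.csv
csv_text <- "subject,gender,age
1,f,22
2,,23
3,m,
4, ,21"

# strip.white trims whitespace-only fields to "", which na.strings then maps to NA;
# "NA" is kept in na.strings so literal NA entries are still recognised
df <- read.csv(text = csv_text, na.strings = c("", "NA"), strip.white = TRUE)

counts <- colSums(!is.na(df))
print(counts)
```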

Once the blanks are read in as NA, you can run this code. Assume your table is called df.

counts <- apply(df, 2, function(x) length(na.omit(x)))

Or, as @JasonAizkalns suggests:

data <- read.csv("sample.csv", na.strings = "") 
sapply(data, function(x) sum(!is.na(x)))
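As for why the loop in the question gave 0: `x$i` looks for a column literally named `i`, which does not exist, so it returns `NULL`, and the loop also overwrites `data` on every pass. A minimal corrected sketch of the same loop, assuming blanks have already been read as NA (the function name `count_non_na` and the toy data are hypothetical):

```r
count_non_na <- function(x) {
  # One slot per column, named after the columns
  counts <- integer(ncol(x))
  names(counts) <- names(x)
  for (i in seq_along(x)) {
    # x[[i]] selects the i-th column; x$i would look for a column named "i"
    counts[i] <- sum(!is.na(x[[i]]))
  }
  counts
}

# Hypothetical toy data with one NA in each column
toy <- data.frame(gender = c("f", NA, "m"), age = c(22L, NA, 21L))
result <- count_non_na(toy)
print(result)
```

The `sapply` one-liner does the same job more compactly; the loop is shown only to make the fix to the original script explicit.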

