
Create an output matrix or data frame using apply()

I've got a frequency cross-tab and would like to use rep() with an apply() function to make a long column of data for each sample (A01, A02, etc.) that I can use for mean and standard deviation stats. The numbers in columns A01, A02, etc. are frequency counts of CAG, e.g. 6485 counts of 13 CAG.

I've managed to write the function to give the correct results, but the output doesn't appear to be indexable, e.g. sumstats$A01 gives NULL. Ideally I'd also like the rows and columns swapped in the output table, so that the columns are mean, sd, etc.

data <- data.frame(CAG = c(13, 14, 15), A01 = c(6485, 35, 132), A02 = c(0, 42, 56))
sumstats <- sapply(data[, 2:ncol(data)], function(x) {
  # Expand the counts into one CAG value per observation
  data_e <- rep(data$CAG, x)

  list(
    mean = mean(data_e),
    median = median(data_e),
    sd = sd(data_e)
  )
})

#Output:
#sumstats$A01
#NULL

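The $ operator extracts elements by name from lists, which includes data frames. If you check class(sumstats) you will see it is a matrix: sapply() simplified the results into a matrix whose rows are mean, median and sd and whose columns are A01 and A02, so there is no element named A01 for $ to find and it returns NULL.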
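To see why $ comes back empty, it can help to inspect the object directly. The following is a minimal sketch that rebuilds sumstats from the question's example data and shows matrix-style indexing as an alternative to converting the object; it is an illustration only, not part of the original answer.

```r
# Same example data as in the question
data <- data.frame(CAG = c(13, 14, 15), A01 = c(6485, 35, 132), A02 = c(0, 42, 56))

sumstats <- sapply(data[, 2:ncol(data)], function(x) {
  data_e <- rep(data$CAG, x)
  list(mean = mean(data_e), median = median(data_e), sd = sd(data_e))
})

is.matrix(sumstats)   # sapply() simplified the per-sample lists into a matrix
typeof(sumstats)      # the cells are list elements, not plain numbers
dimnames(sumstats)    # row and column labels live here, not in names()

# Matrix-style indexing works without converting anything
sumstats[, "A01"]           # list holding mean, median and sd for sample A01
sumstats[["mean", "A01"]]   # the mean for A01 as a plain number
```

Indexing with [row, column] keeps the object as it is, whereas as.data.frame() changes its class so that $ can be used.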

Simply run sumstats <- as.data.frame(sumstats) and then you can use

sumstats$A01
#$mean
#[1] 13.04495
#
#$median
#[1] 13
#
#$sd
#[1] 0.2874512

Is this what you wanted?

EDIT:

# samples and sumheight are created in the full code further down
sumstats2 <- as.data.frame(t(sumstats))
res <- data.frame(samples, sumheight, sumstats2)
res
#    samples sumheight     mean median        sd
#A01     A01      6652 13.04495     13 0.2874512
#A02     A02        98 14.57143     15  0.497416

data <- data.frame(CAG = c(13, 14, 15), A01 = c(6485, 35, 132), A02 = c(0, 42, 56))

samples <- c('A01', 'A02')
sumheight <- colSums(data[ , 2:ncol(data)], na.rm=TRUE)

sumstats <- sapply(data[, 2:ncol(data)], function(x) {
  data_e <- rep(data$CAG, x)

  list(
    mean = mean(data_e),
    median = median(data_e),
    sd   = sd(data_e)
  )
})


sumstats2 <- as.data.frame(t(sumstats))
res <- data.frame(samples, sumheight, sumstats2$mean)
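
One thing to watch with the code above: because the function returns a list, the columns of sumstats2 are list columns, so sumstats2$mean is a list rather than a numeric vector and may not behave as expected inside data.frame(). A possible alternative, sketched below, is to return a named numeric vector and use vapply(), which gives an ordinary numeric matrix. The names counts and stats are introduced here for illustration and do not appear in the original post.

```r
data <- data.frame(CAG = c(13, 14, 15), A01 = c(6485, 35, 132), A02 = c(0, 42, 56))
counts <- data[, -1, drop = FALSE]   # every column except CAG (hypothetical helper name)

# vapply() checks that each sample returns exactly three numbers
stats <- vapply(counts, function(x) {
  data_e <- rep(data$CAG, x)
  c(mean = mean(data_e), median = median(data_e), sd = sd(data_e))
}, FUN.VALUE = c(mean = 0, median = 0, sd = 0))

res <- data.frame(
  samples   = names(counts),
  sumheight = colSums(counts, na.rm = TRUE),
  t(stats)   # transposed so each sample is a row and each statistic a column
)

res$mean   # plain numeric vector, usable directly in further calculations
str(res)
```

With numeric columns, expressions such as res$mean * 2 or subsetting a single statistic with res[, c("samples", "mean")] work without unlisting anything first.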
