
How to perform row operations in R to produce a single statistic

I want to compute a mean from a data frame in R. The file represents the output of coverage (column 4) over ranges (columns 2,3) of a chromosome (column 1).

The data looks like this:

V1  V2  V3   V4
 1  65  69  103
 1  69  70  107
 1  70  74  108
 1  74  75  110
 1  75  77  111
 1  77  78  113
 1  78  79  115
 1  79  80  118
 1  80  81  119

I want to compute the mean coverage over the whole file. On paper, this looks like: [103*(69-65) + 107*(70-69) + 108*(74-70) + ... + V4*(V3-V2)] / lengthOfChromosome

The lengthOfChromosome is known.

I've been searching for a solution, and the closest thing I've found is the row-wise operators in the apply() family. These don't seem particularly well suited for the task since most of their outputs appear to be either matrices or lists or vectors. My goal is to get a single statistic: the mean. I also might be interested in the standard deviation, but that is less important now.

Any tips in the right direction would be appreciated!

You don't even need apply() here. Most operators in R are vectorized, so they work on whole columns at once. So if your data is in a data.frame called dd:

dd<-structure(list(V1 = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), V2 = c(65L, 
69L, 70L, 74L, 75L, 77L, 78L, 79L, 80L), V3 = c(69L, 70L, 74L, 
75L, 77L, 78L, 79L, 80L, 81L), V4 = c(103L, 107L, 108L, 110L, 
111L, 113L, 115L, 118L, 119L)), .Names = c("V1", "V2", "V3", 
"V4"), class = "data.frame", row.names = c(NA, -9L))

Then you can get the numerator of your equation with a simple

with(dd, sum(V4*(V3-V2)))

(Here we use with() so we don't have to write dd$ a bunch of times.) And assuming the length of the chromosome is just the maximum end minus the minimum start, then

with(dd, sum(V4*(V3-V2))/(max(V3)-min(V2)))
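
Since the question also mentions the standard deviation, here is a minimal sketch of how the same vectorized idea could extend to it. The helper name coverage_stats and its chrom_length argument are made up for illustration. It assumes the intervals do not overlap, that each row covers V3 - V2 bases, and that any base not listed in the file has coverage 0. It computes the population (not sample) standard deviation, and converts to numeric first because sum() on integer columns can overflow to NA on real-sized coverage files.

```r
# Hypothetical helper (not from the original answer): per-base mean and
# population SD of coverage across a chromosome of known length.
# Assumes non-overlapping intervals; bases outside the intervals count as 0.
coverage_stats <- function(df, chrom_length) {
  width <- as.numeric(df$V3 - df$V2)  # bases covered by each row
  cov   <- as.numeric(df$V4)          # numeric avoids integer overflow in sum()
  m <- sum(cov * width) / chrom_length         # weighted mean
  v <- sum(cov^2 * width) / chrom_length - m^2 # E[X^2] - (E[X])^2
  c(mean = m, sd = sqrt(v))
}

dd <- data.frame(
  V1 = rep(1L, 9),
  V2 = c(65L, 69L, 70L, 74L, 75L, 77L, 78L, 79L, 80L),
  V3 = c(69L, 70L, 74L, 75L, 77L, 78L, 79L, 80L, 81L),
  V4 = c(103L, 107L, 108L, 110L, 111L, 113L, 115L, 118L, 119L)
)

# Here the length is taken as max end minus min start (16), as above.
stats <- coverage_stats(dd, chrom_length = max(dd$V3) - min(dd$V2))
print(stats)  # mean is 109.25 for this sample
```

If the real lengthOfChromosome is larger than the covered span, pass it as chrom_length instead; the uncovered bases then pull both statistics toward 0 coverage.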

If dat is your data.frame and V1 is always 1:

with(dat, sum(V4*(V3-V2))) / (lengthOfChromosome)
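
If V1 takes more than one value, one possible approach is to compute the numerator per chromosome with tapply() and divide by a vector of lengths. The sketch below is only an illustration: the rows in dat and the values in chrom_lengths are toy numbers invented for the example, and chrom_lengths is a hypothetical named vector you would have to supply yourself.

```r
# Toy data, invented for illustration: two chromosomes, two intervals each.
dat <- data.frame(
  V1 = c(1L, 1L, 2L, 2L),
  V2 = c(65L, 69L, 10L, 12L),
  V3 = c(69L, 70L, 12L, 15L),
  V4 = c(103L, 107L, 50L, 60L)
)

# Hypothetical known chromosome lengths, named by chromosome label.
chrom_lengths <- c("1" = 5, "2" = 5)

# Sum coverage * interval width within each chromosome.
covered <- with(dat, tapply(as.numeric(V4) * (V3 - V2), V1, sum))

# Divide each chromosome's total by its own length, matching by name.
mean_cov <- covered / chrom_lengths[names(covered)]
print(mean_cov)
```

Indexing chrom_lengths by names(covered) keeps the lengths aligned with the chromosomes even if the two are stored in different orders.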
