I have a data frame with 1000 rows and 10000 columns, and I want to write a function that estimates v (which corresponds to 2pq) for each column. Sample data can be generated with:
data=data.frame(replicate(1000,sample(0:2,100,replace=TRUE))) #1000 SNPs (columns) and 100 individuals (rows)
I can calculate v for the first column:
a=sum(data$X1==2) #total number of 2s
b=sum(data$X1==1) #total number of 1s
n=nrow(data) #number of rows; the real data can contain NA
p=(a+(b*0.5))/n
q=1-p
v=2*p*q
v
I want to estimate v for all the columns. Thanks in advance
Put the code in a function.
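calculate <- function(x) {
  a <- sum(x == 2)        # number of 2s in the column
  b <- sum(x == 1)        # number of 1s in the column
  n <- length(x)          # number of individuals
  p <- (a + 0.5 * b) / n
  q <- 1 - p
  2 * p * q               # v = 2pq
}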
If you have a data frame, use sapply to calculate v for all columns.
result <- sapply(data, calculate)
You can also use apply with MARGIN = 2, which works for both data frames and matrices.
result <- apply(data, 2, calculate)
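The question mentions that the real data can contain NA, which the function above does not handle (sum would return NA). A minimal sketch of one way to deal with that, assuming missing genotypes should simply be dropped from each column's count: calculate_na and the toy data frame geno are hypothetical names used only for illustration. Because the codes are 0/1/2, the column mean divided by 2 equals p, so the whole calculation can also be vectorized with colMeans.

```r
# Toy genotype data coded 0/1/2 with one missing call (illustrative only)
set.seed(1)
geno <- data.frame(replicate(5, sample(0:2, 20, replace = TRUE)))
geno[3, 2] <- NA

# NA-aware per-column version (hypothetical helper, not from the original answer)
calculate_na <- function(x) {
  x <- x[!is.na(x)]                           # drop missing genotypes
  n <- length(x)                              # non-missing individuals
  p <- (sum(x == 2) + 0.5 * sum(x == 1)) / n
  2 * p * (1 - p)                             # v = 2pq
}
v_loop <- sapply(geno, calculate_na)

# Vectorized alternative: mean of the 0/1/2 codes divided by 2 equals p
p <- colMeans(geno, na.rm = TRUE) / 2
v_vec <- 2 * p * (1 - p)

# Both approaches should agree column by column
all.equal(v_loop, v_vec)
```

The vectorized form avoids calling a function once per column, which may matter with 10000 columns, but it relies on the genotypes being coded strictly as 0, 1 and 2.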
We can also use dapply from the collapse package:
library(collapse)
dapply(data, calculate)
where calculate is the same function as defined in the answer above.