Missing values in lmFit [limma R package]

Question

[This question is specific to bioinformatics. There are posts elsewhere but I couldn't find a satisfactory answer.]

If I have gene/protein expression data with missing values (NA), how does lmFit from the limma package handle them? Note that the missing values are not in the design matrix, only in the data matrix.

Here is a simulated, working example that illustrates my question:

library(limma)
my_genes <- matrix(rnorm(9000, -10, 10), ncol=4)
my_genes <- as.data.frame(my_genes)
rownames(my_genes) <- paste("Gene", 1:nrow(my_genes))
## Randomly introducing NAs
## Assign the result back (the original call discarded it); [] keeps the row names
my_genes[] <- lapply(my_genes, function(x) {x[sample(c(TRUE, NA), prob = c(0.95, 0.05), size = length(x), replace = TRUE)]})
tx <- 1:2 #suppose treatment is columns 1 & 2
ctrls <- 3:4 #suppose controls is columns 3 & 4
my_design <- model.matrix( ~factor(c(1,1,0,0)))
my_design
fit <- lmFit(my_genes, my_design)
fit <- eBayes(fit)
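## The fit has no logFC element; column 2 holds the treatment-vs-control coefficient
plot(fit$coefficients[, 2], -log10(fit$p.value[, 2]))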

If you find any websites / posts that can help, feel free to share with a post or comment.

Answer 1

This post on CrossValidated answers my own question in detail. In short, lmFit deals with missing values much as lm does: for each gene, the observations with missing values are dropped (na.exclude, or "case-wise deletion") and the model is fit to the remaining samples.
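
As a rough check of that behaviour, the sketch below compares lmFit with a plain lm on a single gene that has one missing value. The gene names, sample names, group layout and seed are made up for illustration, and it assumes limma is installed.

```r
library(limma)

set.seed(42)
# 5 hypothetical genes x 6 samples (3 treated, 3 control)
y <- matrix(rnorm(30), nrow = 5,
            dimnames = list(paste("Gene", 1:5), paste0("S", 1:6)))
group <- factor(c(1, 1, 1, 0, 0, 0))
design <- model.matrix(~ group)

# One missing value in the data matrix (not in the design)
y["Gene 2", "S1"] <- NA

fit <- lmFit(y, design)

# Reference fit: ordinary lm() on Gene 2 alone; the NA observation is dropped
ref <- lm(y["Gene 2", ] ~ group)

fit$coefficients["Gene 2", ]  # lmFit estimates for Gene 2
coef(ref)                     # lm estimates from the 5 remaining samples
fit$df.residual               # Gene 2 has one fewer residual df than the rest
```

The two sets of coefficients should agree, and only the gene with the NA loses a residual degree of freedom; the other genes still use all six samples.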

Alternatively, though it's not an ideal solution, it might be appropriate to just impute the missing gene-expression values, for example with the impute.knn function in the impute Bioconductor package.
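
A minimal sketch of the imputation route, assuming the impute package is installed from Bioconductor. The matrix, its dimensions, the seed and the choice of k = 10 are illustrative only, not a recommendation.

```r
# Sketch only: assumes the Bioconductor package 'impute' is installed,
# e.g. via BiocManager::install("impute")
set.seed(1)

# Hypothetical expression matrix: 100 genes (rows) x 4 samples (columns)
expr <- matrix(rnorm(400, mean = 8, sd = 2), ncol = 4,
               dimnames = list(paste("Gene", 1:100), paste0("S", 1:4)))

# Knock out 5% of the values at random
expr[sample(length(expr), 20)] <- NA

# impute.knn expects genes in rows and samples in columns;
# the imputed matrix is returned in the $data element
imputed <- impute::impute.knn(expr, k = 10)$data

anyNA(imputed)  # check whether any missing values remain
```

The imputed matrix can then be passed to lmFit in place of the original data. Keep in mind that imputed values are treated as if they had been observed, so the downstream statistics do not reflect the extra uncertainty.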