Calculated contrasts in DESeq2: difference between manual coefficients and DESeq2 automatic contrast
I have the following model in DESeq2, where I am blocking for replicate.
dds <- DESeqDataSetFromMatrix(countData = CPEB4_featureCounts_3utr_matrix,
                              colData = CPEB4_sample_list,
                              design = ~ replicate + sample_name)
dds <- DESeq(dds)
This is the metadata:
          sample_name replicate
0195_2022       INPUT         4
0196_2022         IgG         4
0197_2022       CPEB4         4
0198_2022       INPUT         5
0199_2022         IgG         5
0200_2022       CPEB4         5
2125_2021       INPUT         1
2126_2021         IgG         1
2127_2021       CPEB4         1
2235_2021       INPUT         2
2237_2021       CPEB4         2
2238_2021       INPUT         3
2239_2021         IgG         3
2240_2021       CPEB4         3
I want to extract the contrast "CPEB4 - IgG".
I can do it by using the results function like this:
CPEB4vsIgG <- results(dds, contrast=c("sample_name","CPEB4","IgG"))
I get the following DEGs:
summary(CPEB4vsIgG)
out of 17300 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up) : 598, 3.5%
LFC < 0 (down) : 30, 0.17%
outliers [1] : 0, 0%
low counts [2] : 7637, 44%
(mean count < 41)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results
However, I could also calculate the contrast vector manually (I usually do this when I have more complex contrasts), like this:
# Model matrix for the fitted design
mod_mat <- model.matrix(design(dds), colData(dds))
# Average model-matrix row for each group
CPEB4 <- colMeans(mod_mat[dds$sample_name == "CPEB4", ])
IgG <- colMeans(mod_mat[dds$sample_name == "IgG", ])
# Numeric contrast: difference between the two group averages
CPEB4vsIgG_2 <- results(dds, contrast = (CPEB4 - IgG))
However, with this code I get a slightly different list of DEGs:
summary(CPEB4vsIgG_2)
out of 17300 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up) : 672, 3.9%
LFC < 0 (down) : 81, 0.47%
outliers [1] : 0, 0%
low counts [2] : 7637, 44%
(mean count < 41)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results
If I check the coefficients for the two groups I am subtracting, everything looks fine:
CPEB4
(Intercept) replicate2 replicate3 replicate4 replicate5 sample_nameIgG
1.0 0.2 0.2 0.2 0.2 0.0
sample_nameINPUT
0.0
IgG
(Intercept) replicate2 replicate3 replicate4 replicate5 sample_nameIgG
1.00 0.00 0.25 0.25 0.25 1.00
sample_nameINPUT
0.00
Why is there this difference?
If I create a model that does not take replicate into account, the two approaches give the same results.
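As a rough check on where the two approaches diverge, the sketch below rebuilds the sample table from the metadata above using base R only (no DESeq2 fit) and computes the manual contrast vector for both designs. The names `col_data` and `manual_contrast` are hypothetical, and the factor level order (CPEB4 as the reference level) is an assumption inferred from the coefficient names printed above.

```r
# Sample table rebuilt from the metadata in the question;
# note that replicate 2 has no IgG sample.
col_data <- data.frame(
  sample_name = factor(
    c("INPUT", "IgG", "CPEB4", "INPUT", "IgG", "CPEB4", "INPUT", "IgG", "CPEB4",
      "INPUT", "CPEB4", "INPUT", "IgG", "CPEB4"),
    levels = c("CPEB4", "IgG", "INPUT")  # assumed: CPEB4 is the reference level
  ),
  replicate = factor(c(4, 4, 4, 5, 5, 5, 1, 1, 1, 2, 2, 3, 3, 3))
)

# Hypothetical helper: difference of group-mean model-matrix rows
# (the same recipe as in the question) for a given design formula.
manual_contrast <- function(design) {
  mod_mat <- model.matrix(design, col_data)
  cpeb4 <- colMeans(mod_mat[col_data$sample_name == "CPEB4", , drop = FALSE])
  igg   <- colMeans(mod_mat[col_data$sample_name == "IgG", , drop = FALSE])
  cpeb4 - igg
}

with_block    <- manual_contrast(~ replicate + sample_name)
without_block <- manual_contrast(~ sample_name)

print(with_block)     # weights on every coefficient, including the replicate terms
print(without_block)  # weights for the design without blocking
```

With the blocking term, the manual vector carries non-zero weights on the replicate coefficients (0.2 on replicate2 and -0.05 on replicate3 to replicate5), because the CPEB4 group is averaged over five replicates and the IgG group over four. A contrast that only compares the two sample_name levels would put weight on sample_nameIgG alone. Without the blocking term the vector reduces to -1 on sample_nameIgG, which may be why the two approaches agree in that case.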
The answer to this question can be found here: https://support.bioconductor.org/p/9148941/