Computing contrasts in DESeq2: difference between manual coefficients and DESeq2's automatic contrasts

Question

I have the following model in DESeq2, in which I am blocking for replicate.

dds <- DESeqDataSetFromMatrix(countData = CPEB4_featureCounts_3utr_matrix,
                              colData = CPEB4_sample_list,
                              design = ~   replicate  + sample_name)
dds <- DESeq(dds)

This is the metadata:

          sample_name replicate
0195_2022       INPUT         4
0196_2022         IgG         4
0197_2022       CPEB4         4
0198_2022       INPUT         5
0199_2022         IgG         5
0200_2022       CPEB4         5
2125_2021       INPUT         1
2126_2021         IgG         1
2127_2021       CPEB4         1
2235_2021       INPUT         2
2237_2021       CPEB4         2
2238_2021       INPUT         3
2239_2021         IgG         3
2240_2021       CPEB4         3

I want to extract the contrast "CPEB4 - IgG".

I can do it with the results function, like this:

CPEB4vsIgG <- results(dds, contrast=c("sample_name","CPEB4","IgG"))

I get the following DEGs:

summary(CPEB4vsIgG)

out of 17300 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up)       : 598, 3.5%
LFC < 0 (down)     : 30, 0.17%
outliers [1]       : 0, 0%
low counts [2]     : 7637, 44%
(mean count < 41)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

However, I could also build the contrast vector manually from the model matrix (I usually do this when I have more complex contrasts), like this:

mod_mat <- model.matrix(design(dds), colData(dds))
CPEB4 <- colMeans(mod_mat[dds$sample_name == "CPEB4", ])
IgG <- colMeans(mod_mat[dds$sample_name == "IgG", ])
CPEB4vsIgG_2 <- results(dds,  contrast = (CPEB4 - IgG))

With this code, however, I get a slightly different list of DEGs:

summary(CPEB4vsIgG_2)

out of 17300 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up)       : 672, 3.9%
LFC < 0 (down)     : 81, 0.47%
outliers [1]       : 0, 0%
low counts [2]     : 7637, 44%
(mean count < 41)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

If I check the coefficient vectors for the two groups I am subtracting, everything looks fine:

CPEB4

     (Intercept)       replicate2       replicate3       replicate4       replicate5   sample_nameIgG 
             1.0              0.2              0.2              0.2              0.2              0.0 
sample_nameINPUT 
             0.0

IgG


     (Intercept)       replicate2       replicate3       replicate4       replicate5   sample_nameIgG 
            1.00             0.00             0.25             0.25             0.25             1.00 
sample_nameINPUT 
            0.00

Why is there this difference?

If I fit a model that does not include replicate, the two approaches give the same results.

Answer 1

You can find the answer to this issue here: https://support.bioconductor.org/p/9148941/
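
A likely explanation can be read off the metadata posted in the question: replicate 2 has a CPEB4 sample but no IgG sample, so the two colMeans vectors weight the replicate columns differently (0.2 vs 0 for replicate2, 0.2 vs 0.25 for the others). The sketch below uses base R only and rebuilds the sample sheet by hand. The names meta, manual, builtin and balanced are illustrative, and builtin is my assumption of the numeric vector that corresponds to contrast=c("sample_name","CPEB4","IgG") when CPEB4 is the reference level.

```r
# Rebuild the sample sheet from the question (base R only, no DESeq2 needed).
meta <- data.frame(
  sample_name = c("INPUT", "IgG", "CPEB4", "INPUT", "IgG", "CPEB4",
                  "INPUT", "IgG", "CPEB4", "INPUT", "CPEB4",
                  "INPUT", "IgG", "CPEB4"),
  replicate   = c(4, 4, 4, 5, 5, 5, 1, 1, 1, 2, 2, 3, 3, 3)
)
# CPEB4 is set as the reference level, matching the coefficient names in the question.
meta$sample_name <- factor(meta$sample_name, levels = c("CPEB4", "IgG", "INPUT"))
meta$replicate   <- factor(meta$replicate)

mod_mat <- model.matrix(~ replicate + sample_name, data = meta)

# Manual contrast, computed as in the question.
CPEB4  <- colMeans(mod_mat[meta$sample_name == "CPEB4", ])
IgG    <- colMeans(mod_mat[meta$sample_name == "IgG", ])
manual <- CPEB4 - IgG
# The replicate entries are not zero:
# replicate2 = 0.2, replicate3..5 = -0.05, sample_nameIgG = -1
print(round(manual, 2))

# Assumed numeric equivalent of the named contrast: only -1 on sample_nameIgG.
builtin <- setNames(numeric(ncol(mod_mat)), colnames(mod_mat))
builtin["sample_nameIgG"] <- -1

# Averaging only over replicates that contain both groups cancels the replicate terms.
shared <- intersect(as.character(meta$replicate[meta$sample_name == "CPEB4"]),
                    as.character(meta$replicate[meta$sample_name == "IgG"]))
keep <- as.character(meta$replicate) %in% shared
balanced <- colMeans(mod_mat[meta$sample_name == "CPEB4" & keep, ]) -
            colMeans(mod_mat[meta$sample_name == "IgG" & keep, ])
print(isTRUE(all.equal(balanced, builtin)))
```

So the manual vector tests the sample_name effect mixed with a small combination of replicate effects, which would explain the slightly different DEG list. It would also explain why the two approaches agree once replicate is dropped from the design. Passing balanced (or simply the named contrast) to results() should then test the pure CPEB4 vs IgG effect.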

Answered 2023-01-19 14:39:53