How to structure a dataset to run a binomial GLM of count ratios over time?

Question

I am trying to use a binomial GLM to test for differences in relative count frequency over time (Days). The GLM model/formula would look something like this:

(A1:A2) ∼ Day

Here we are testing for the effect of Day on the frequency of A1:A2. Basically, this is a binomial generalized linear model where A1 and A2 refer to the read counts of alternative alleles at each gene and Day is a multilevel factor. I would also be testing this on many different genes (hundreds), so we would be running many tests.

The basic model formula in R is straightforward (e.g., using a long-format dataset):

glm(AF1:AF2 ~ Day, data = dfLong, family = "binomial")

But I'm not really sure how to structure the data or loop over the Gene variable to accomplish this task.

Here is an example data frame:

> df<-read.csv("test.csv")
> df
  Gene A.count_1 A.count_2 Day
1    1        60        40   1
2    2       100        30   1
3    3       100         3   1
4    1        55       100   3
5    2       423       410   3
6    3       191        89   3
7    1        20        10   5
8    2       200        10   5
9    3       100        20   5

The output I'd like is a test of the effect of Day as a factor (not a numeric variable) on allele count ratios for each gene, producing one p-value per gene (e.g., genes 1, 2, and 3 here, or hundreds in the general case).

Any help to set me in the right direction would be much appreciated.

Thanks!!

Answer 1

I think that

library('lme4')
m <- lmList(AF1:AF2 ~ Day | Gene, data = dfLong, family = "binomial")
summary(m)

should probably do it?
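For comparison, here is a minimal base-R sketch of the same per-gene idea, not a definitive implementation. It assumes the column names printed in the question (`Gene`, `A.count_1`, `A.count_2`, `Day`) and passes the two counts to `glm()` as a two-column `cbind()` response, which is the usual way to give a binomial GLM paired counts. The helper name `day_pvalue`, the likelihood-ratio test against an intercept-only model, and the Benjamini-Hochberg adjustment are illustrative choices rather than anything stated in the thread.

```r
# Example data copied from the question (columns as printed there)
df <- data.frame(
  Gene      = rep(1:3, times = 3),
  A.count_1 = c(60, 100, 100, 55, 423, 191, 20, 200, 100),
  A.count_2 = c(40, 30, 3, 100, 410, 89, 10, 10, 20),
  Day       = rep(c(1, 3, 5), each = 3)
)
df$Day <- factor(df$Day)  # treat Day as a multilevel factor, not numeric

# Hypothetical helper: fit one binomial GLM for a single gene and test Day
# with a likelihood-ratio test against an intercept-only model
day_pvalue <- function(d) {
  full <- glm(cbind(A.count_1, A.count_2) ~ Day, data = d, family = binomial)
  null <- glm(cbind(A.count_1, A.count_2) ~ 1,   data = d, family = binomial)
  anova(null, full, test = "Chisq")[2, "Pr(>Chi)"]
}

# Loop over genes: split the data frame by Gene and apply the helper
pvals <- sapply(split(df, df$Gene), day_pvalue)

# With hundreds of genes, adjust for multiple testing (BH is one common choice)
result <- data.frame(
  Gene  = names(pvals),
  p     = pvals,
  p_adj = p.adjust(pvals, method = "BH")
)
print(result)
```

With a single row per gene per day, as in the example data, the model with Day as a factor is saturated, so the comparison against the intercept-only model carries the whole test. With replicate samples per day it would also be worth checking for overdispersion.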

Answer 1
0 2021-11-24 17:25:07

如何构建数据集以随时间运行计数比的二项式 GLM？

问题描述

1 个解决方案

解决方案1 0 2021-11-24 17:25:07

解决方案1
0 2021-11-24 17:25:07