简体   繁体   English

计算重复测量设计的结果比例

[英]Compute proportion of outcome from repeated measures design

I have a table in the following format:我有以下格式的表格:

CowId    Result          IMI
1        S. aureus       1
1        No growth       0
2        No growth       0
2        No growth       0
3        E. coli         1
3        No growth       0
3        E. coli         0
4        Bacillus sp.    1
4        Contaminated    0

From this table, I would like to compute the proportion of CowIds that are negative for an IMI (0 = negative; 1 = positive) at all sampling time points.从该表中,我想计算在所有采样时间点对于 IMI(0 = 负;1 = 正)为负的 CowId 的比例。

In this example, 25% of cows [CowId = 2] tested negative for an IMI at all sampling time points.在此示例中,25% 的奶牛 [CowId = 2] 在所有采样时间点的 IMI 测试均为阴性。

To compute this proportion, my initial approach was to group each CowId, then compute the difference between the number of negative IMIs and the total number of IMI tests, where a resulting value of 0 would indicate that the cow was negative for an IMI at all time points.为了计算这个比例,我最初的方法是对每个 CowId 进行分组,然后计算负 IMI 数量与 IMI 测试总数之间的差异,其中结果值为 0 表示奶牛对 IMI 完全是阴性的时间点。

As of now, my code computes this for each individual CowId.到目前为止,我的代码为每个单独的 CowId 计算这个。 How can I augment this to compute the proportion described above?我怎样才能增加它来计算上述比例?

fp %>%
  filter(Result != "Contaminated") %>%
  group_by(CowId) %>%
  summarise(negative = (sum(IMI == 0) - length(IMI)))

We can count how many CowId 's have tested negative at all points and calculate their ratio.我们可以计算有多少CowIdall点测试为阴性并计算它们的比率。

library(dplyr)

fp %>%
  filter(Result != "Contaminated") %>%
  group_by(CowId) %>%
  summarise(negative = all(IMI == 0)) %>%
  summarise(total_percent = mean(negative) * 100)

# total_percent
#          <dbl>
#1            25

In base R, we can use aggregate在基础 R 中,我们可以使用aggregate

temp <- aggregate(IMI~CowId, subset(fp, Result != "Contaminated"), 
                  function(x) all(x == 0))

mean(temp$IMI) * 100

data数据

fp <- structure(list(CowId = c(1L, 1L, 2L, 2L, 3L, 3L, 3L, 4L, 4L), 
Result = structure(c(5L, 4L, 4L, 4L, 3L, 4L, 3L, 1L, 2L), .Label = 
c("Bacillus_sp.","Contaminated", "E.coli", "No_growth", "S.aureus"), 
class = "factor"),IMI = c(1L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L)), 
class = "data.frame", row.names = c(NA, -9L))

With data.table带数据data.table

library(data.table)
setDT(fp)[Result != "Contaminated", .(negative = all(IMI == 0)), 
      .(CowId)][, .(total_percent = mean(negative)* 100 )]
#   total_percent
#1:            25

data数据

fp <- structure(list(CowId = c(1L, 1L, 2L, 2L, 3L, 3L, 3L, 4L, 4L), 
Result = structure(c(5L, 4L, 4L, 4L, 3L, 4L, 3L, 1L, 2L), .Label = 
c("Bacillus_sp.","Contaminated", "E.coli", "No_growth", "S.aureus"), 
class = "factor"),IMI = c(1L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L)), 
class = "data.frame", row.names = c(NA, -9L))

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM