Computing the proportion of outcomes in a repeated-measures design

Question

I have a table in the following format:

CowId    Result          IMI
1        S. aureus       1
1        No growth       0
2        No growth       0
2        No growth       0
3        E. coli         1
3        No growth       0
3        E. coli         0
4        Bacillus sp.    1
4        Contaminated    0

From this table, I would like to compute the proportion of CowIds that are negative for an IMI (0 = negative; 1 = positive) at all sampling time points.

In this example, 25% of cows [CowId = 2] tested negative for an IMI at all sampling time points.

To compute this proportion, my initial approach was to group by CowId, then compute the difference between the number of negative IMIs and the total number of IMI tests, where a resulting value of 0 would indicate that the cow was negative for an IMI at all time points.

As of now, my code computes this for each individual CowId. How can I extend it to compute the proportion described above?

fp %>%
  filter(Result != "Contaminated") %>%
  group_by(CowId) %>%
  summarise(negative = (sum(IMI == 0) - length(IMI)))

Answer 1

We can check, for each CowId, whether it tested negative at all time points, and then take the mean of those results to get the proportion.

library(dplyr)

fp %>%
  filter(Result != "Contaminated") %>%
  group_by(CowId) %>%
  summarise(negative = all(IMI == 0)) %>%
  summarise(total_percent = mean(negative) * 100)

# total_percent
#          <dbl>
#1            25
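To see why `mean(negative) * 100` yields the percentage, it can help to inspect the intermediate per-cow result. The following is a minimal sketch, not part of the original answer: it re-creates `fp` from the question's table with plain character columns, and the names `per_cow` and `total_percent` are introduced here for illustration only.

```r
library(dplyr)

# Illustrative re-creation of the question's table (assumption: plain
# character Result column rather than the factor used in the answer's data)
fp <- data.frame(
  CowId  = c(1L, 1L, 2L, 2L, 3L, 3L, 3L, 4L, 4L),
  Result = c("S. aureus", "No growth", "No growth", "No growth",
             "E. coli", "No growth", "E. coli",
             "Bacillus sp.", "Contaminated"),
  IMI    = c(1L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L)
)

# One row per cow: TRUE only if every non-contaminated sample is IMI-negative
per_cow <- fp %>%
  filter(Result != "Contaminated") %>%
  group_by(CowId) %>%
  summarise(negative = all(IMI == 0))

per_cow$negative
# FALSE TRUE FALSE FALSE  (cows 1 to 4)

# In arithmetic, TRUE counts as 1 and FALSE as 0,
# so the mean of a logical vector is the proportion of TRUE values
total_percent <- mean(per_cow$negative) * 100
```

One thing to keep in mind: because the filter runs before grouping, a cow whose samples are all contaminated would drop out entirely and would not count in the denominator.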

In base R, we can use aggregate:

temp <- aggregate(IMI~CowId, subset(fp, Result != "Contaminated"), 
                  function(x) all(x == 0))

mean(temp$IMI) * 100
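As a possible alternative in base R, `tapply` can produce the same per-cow logical values without building an intermediate data frame. This is a sketch under the same assumptions as the question's table; `clean`, `neg_by_cow`, and `pct` are illustrative names, not part of the original answer.

```r
# Illustrative re-creation of the question's table
fp <- data.frame(
  CowId  = c(1L, 1L, 2L, 2L, 3L, 3L, 3L, 4L, 4L),
  Result = c("S. aureus", "No growth", "No growth", "No growth",
             "E. coli", "No growth", "E. coli",
             "Bacillus sp.", "Contaminated"),
  IMI    = c(1L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L)
)

# Drop contaminated samples before summarising
clean <- subset(fp, Result != "Contaminated")

# For each CowId, check whether every remaining sample is IMI-negative
neg_by_cow <- tapply(clean$IMI, clean$CowId, function(x) all(x == 0))

# Percentage of cows that were negative at all time points
pct <- mean(neg_by_cow) * 100
```

The result of `tapply` is named by CowId, so `neg_by_cow["2"]` can be used to look up an individual cow.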

Data

fp <- structure(list(CowId = c(1L, 1L, 2L, 2L, 3L, 3L, 3L, 4L, 4L), 
Result = structure(c(5L, 4L, 4L, 4L, 3L, 4L, 3L, 1L, 2L), .Label = 
c("Bacillus_sp.","Contaminated", "E.coli", "No_growth", "S.aureus"), 
class = "factor"),IMI = c(1L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L)), 
class = "data.frame", row.names = c(NA, -9L))

Answer 2

With data.table:

library(data.table)
setDT(fp)[Result != "Contaminated", .(negative = all(IMI == 0)), 
      .(CowId)][, .(total_percent = mean(negative)* 100 )]
#   total_percent
#1:            25

