I have a table in the following format:
CowId  Result        IMI
1      S. aureus     1
1      No growth     0
2      No growth     0
2      No growth     0
3      E. coli       1
3      No growth     0
3      E. coli       0
4      Bacillus sp.  1
4      Contaminated  0
From this table, I would like to compute the proportion of CowIds that are negative for an IMI (0 = negative; 1 = positive) at all sampling time points.
In this example, 25% of cows [CowId = 2] tested negative for an IMI at all sampling time points.
To compute this proportion, my initial approach was to group each CowId, then compute the difference between the number of negative IMIs and the total number of IMI tests, where a resulting value of 0 would indicate that the cow was negative for an IMI at all time points.
As of now, my code computes this for each individual CowId. How can I augment this to compute the proportion described above?
fp %>%
filter(Result != "Contaminated") %>%
group_by(CowId) %>%
summarise(negative = (sum(IMI == 0) - length(IMI)))
We can flag each CowId that tested negative at all time points, then take the mean of those flags to get the proportion.
library(dplyr)
fp %>%
filter(Result != "Contaminated") %>%
group_by(CowId) %>%
summarise(negative = all(IMI == 0)) %>%
summarise(total_percent = mean(negative) * 100)
# total_percent
# <dbl>
#1 25
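The difference-based attempt from the question can also be extended directly, without switching to all(). The following is only a sketch under stated assumptions: the data frame is rebuilt from the example data in this post (with Result as plain character strings), and the names per_cow, diff and pct_negative are hypothetical, chosen for illustration.

```r
library(dplyr)

# Example data rebuilt from the post (Result kept as character for simplicity)
fp <- data.frame(
  CowId  = c(1L, 1L, 2L, 2L, 3L, 3L, 3L, 4L, 4L),
  Result = c("S.aureus", "No_growth", "No_growth", "No_growth",
             "E.coli", "No_growth", "E.coli", "Bacillus_sp.", "Contaminated"),
  IMI    = c(1L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L)
)

# Step 1: the asker's per-cow difference (0 means every test was negative)
per_cow <- fp %>%
  filter(Result != "Contaminated") %>%
  group_by(CowId) %>%
  summarise(diff = sum(IMI == 0) - length(IMI))

# Step 2: the share of cows whose difference is exactly 0, as a percentage.
# mean() of a logical vector is the fraction of TRUE values.
pct_negative <- mean(per_cow$diff == 0) * 100
pct_negative
```

Here only CowId 2 has a difference of 0, so one cow out of four is counted.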
In base R, we can use aggregate to compute the same per-cow flag and then average it:
temp <- aggregate(IMI~CowId, subset(fp, Result != "Contaminated"),
function(x) all(x == 0))
mean(temp$IMI) * 100
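As an alternative to aggregate, the same per-cow check can be sketched with tapply, which returns a named logical vector rather than a data frame. This is not the method used in the answer above, just a variant; the names clean, all_negative and pct_negative are hypothetical, and the data frame is rebuilt from the example data in this post.

```r
# Example data rebuilt from the post (Result kept as character for simplicity)
fp <- data.frame(
  CowId  = c(1L, 1L, 2L, 2L, 3L, 3L, 3L, 4L, 4L),
  Result = c("S.aureus", "No_growth", "No_growth", "No_growth",
             "E.coli", "No_growth", "E.coli", "Bacillus_sp.", "Contaminated"),
  IMI    = c(1L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L)
)

# Drop contaminated samples first. Note: a cow whose samples were ALL
# contaminated would have no rows left and so would not be counted at all.
clean <- subset(fp, Result != "Contaminated")

# One TRUE/FALSE per CowId: TRUE only if every remaining test is negative
all_negative <- tapply(clean$IMI, clean$CowId, function(x) all(x == 0))

# Fraction of TRUE values, expressed as a percentage
pct_negative <- mean(all_negative) * 100
pct_negative
```

Whether cows with only contaminated samples should stay in the denominator depends on the study design; the sketch simply follows the filtering used in the answers.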
Data:
fp <- structure(list(CowId = c(1L, 1L, 2L, 2L, 3L, 3L, 3L, 4L, 4L),
Result = structure(c(5L, 4L, 4L, 4L, 3L, 4L, 3L, 1L, 2L), .Label =
c("Bacillus_sp.","Contaminated", "E.coli", "No_growth", "S.aureus"),
class = "factor"),IMI = c(1L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L)),
class = "data.frame", row.names = c(NA, -9L))
With data.table, the same two steps can be chained (note that setDT converts fp to a data.table by reference):
library(data.table)
setDT(fp)[Result != "Contaminated", .(negative = all(IMI == 0)),
.(CowId)][, .(total_percent = mean(negative)* 100 )]
# total_percent
#1: 25