Calculating incidence of disease in R using start and end date and disease occurrence date
I have cohort study data with start and end dates for each patient. I would like to calculate the incidence of a disease in each year and each month from 1 January 2014 to the end of August 2021. How can I calculate person-months and person-years from each patient's start and end date, so that I can get the incidence using the equation: number of new cases / total population during the time frame?
This is what my data currently looks like:
| patid | start_date | end_date | disease | disease_date |
|---|---|---|---|---|
| 1 | 01/03/1993 | 31/08/2021 | yes | 15/11/2017 |
| 2 | 24/03/2000 | 31/08/2021 | no | NA |
| 3 | 01/03/2020 | 23/08/2021 | yes | 15/08/2020 |
| 4 | 24/03/2016 | 01/08/2019 | no | NA |
| 5 | 24/03/2001 | 17/08/2020 | no | NA |
| 6 | 01/03/1999 | 04/08/2014 | yes | 01/01/2014 |
| 7 | 01/03/2016 | 31/08/2018 | yes | 18/03/2017 |
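Please try the code below, where I used the formula: number of events / ((end_date - start_date + 1) / 365.25) * 100. Note the parentheses: the one-day correction has to be added to the date difference before dividing by 365.25, otherwise only the `1` is divided.
library(dplyr)
# Parse the dd/mm/yyyy strings, then compute follow-up in years per patient.
df2 <- df %>%
  mutate(start_date = as.Date(start_date, '%d/%m/%Y'),
         end_date = as.Date(end_date, '%d/%m/%Y'),
         disease_date = as.Date(disease_date, '%d/%m/%Y'),
         person_year = as.numeric(end_date - start_date + 1) / 365.25) %>%
  group_by(patid) %>%
  mutate(n = sum(disease == 'yes'),  # events per patient, not row count
         per_year2 = (n / person_year) * 100)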
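A per-patient rate does not by itself give the incidence for each calendar year or month that the question asks for. One possible way to get there is to clip every patient's follow-up to each calendar period and sum the overlapping days. The following is only a minimal sketch in base R, not a definitive implementation. It assumes that a patient stops contributing person-time on the day of first diagnosis, that a year is 365.25 days, and that the study window ends on 31 August 2021; `incidence_by_period`, `exit_date` and `study_end` are names made up for this illustration.

```r
# Example data copied from the question (dates are day/month/year).
df <- data.frame(
  patid        = 1:7,
  start_date   = c("01/03/1993", "24/03/2000", "01/03/2020", "24/03/2016",
                   "24/03/2001", "01/03/1999", "01/03/2016"),
  end_date     = c("31/08/2021", "31/08/2021", "23/08/2021", "01/08/2019",
                   "17/08/2020", "04/08/2014", "31/08/2018"),
  disease_date = c("15/11/2017", NA, "15/08/2020", NA, NA,
                   "01/01/2014", "18/03/2017")
)
date_cols <- c("start_date", "end_date", "disease_date")
df[date_cols] <- lapply(df[date_cols], as.Date, format = "%d/%m/%Y")

# Assumption: time at risk ends at the earliest of end of follow-up,
# first diagnosis, and the end of the study window.
study_end <- as.Date("2021-08-31")
df$exit_date <- pmin(df$end_date, df$disease_date, study_end, na.rm = TRUE)

# Hypothetical helper: period i covers breaks[i] up to the day before breaks[i + 1].
incidence_by_period <- function(df, breaks, days_per_unit) {
  n <- length(breaks) - 1
  out <- data.frame(period_start = breaks[-length(breaks)],
                    cases = integer(n), person_time = numeric(n))
  for (i in seq_len(n)) {
    p_start <- breaks[i]
    p_end   <- breaks[i + 1] - 1                  # last day of the period
    from <- pmax(df$start_date, p_start)          # later of entry and period start
    to   <- pmin(df$exit_date, p_end)             # earlier of exit and period end
    days <- pmax(as.numeric(to - from) + 1, 0)    # no overlap counts as zero days
    out$person_time[i] <- sum(days) / days_per_unit
    # New cases: diagnosis falls inside both the period and the patient's follow-up.
    out$cases[i] <- sum(!is.na(df$disease_date) &
                        df$disease_date >= p_start & df$disease_date <= p_end &
                        df$disease_date >= df$start_date &
                        df$disease_date <= df$exit_date)
  }
  out$incidence <- ifelse(out$person_time > 0, out$cases / out$person_time, NA)
  out
}

# Yearly table: person_time is in person-years.
yearly <- incidence_by_period(
  df, seq(as.Date("2014-01-01"), as.Date("2022-01-01"), by = "year"), 365.25)

# Monthly table: person_time is in person-months (average month length).
monthly <- incidence_by_period(
  df, seq(as.Date("2014-01-01"), as.Date("2021-09-01"), by = "month"), 365.25 / 12)

print(yearly)
```

With the seven sample patients, the 2017 row contains 2 new cases over 1491 person-days (about 4.08 person-years). Multiply `incidence` by 100 or 1000 if you prefer a rate per 100 or per 1000 person-years.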