Time series analysis of diabetic data
I have a dataset that looks like this:
data=
**ID HbA1cRes Year**
1 65 2003
2 125 2008
3 40 2010
4 110 2007
5 125 2006
6 136 2011
7 20 2012
8 58 2009
9 12 2006
10 123 2008
Patients with HbA1cRes > 65 are classified as 'High risk' and all others as 'Low risk'. I am trying to do a time series analysis with the following code, to see the rise and fall of high-risk and low-risk cases over time (Year <- data$REport_YrMonth):
library(tidyverse)
data$risk <- factor( ifelse( data$HbA1cRes > 65 ,"High risk patients", "Low risk patients") )
ggplot(data, aes(x=Year)) +
geom_line(aes(y=risk)) +
labs(title="Analysis of diabetes' patients status over time",
y="Returns %")
However, the output returned is as follows:
Any idea what I am doing wrong here?
Count how many "High risk patients" and "Low risk patients" you have in each Year, and then plot those counts.
library(ggplot2)
library(dplyr)
data %>%
mutate(risk = factor(ifelse(HbA1cRes > 65 ,
"High risk patients", "Low risk patients"))) %>%
count(Year, risk) %>%
ggplot(aes(x=Year, y = n, color = risk)) +
geom_line() +
labs(title="Analysis of diabetes' patients status over time")
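To see why counting helps: in the question's code the factor risk is mapped to y, so there is nothing numeric for geom_line to draw. The sketch below rebuilds the sample data from the question (the data frame is typed in by hand here, which is an assumption about how data is stored) and prints the table that count(Year, risk) hands to ggplot.

```r
library(dplyr)

# Sample data copied from the question
data <- data.frame(
  ID = 1:10,
  HbA1cRes = c(65, 125, 40, 110, 125, 136, 20, 58, 12, 123),
  Year = c(2003, 2008, 2010, 2007, 2006, 2011, 2012, 2009, 2006, 2008)
)

# One row per Year/risk combination, with n = number of patients
counts <- data %>%
  mutate(risk = ifelse(HbA1cRes > 65,
                       "High risk patients", "Low risk patients")) %>%
  count(Year, risk)

print(counts)
```

Each row of counts is one point on the plot, with n on the y axis and one line per risk level.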
The case_when function may be an elegant solution for classifying the data.
Instead of geom_line, geom_col or geom_density may be better options.
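library(tidyverse)

# Sample data from the question
df <- tibble(
  id = 1:10,
  hb = c(65,125,40,110,125,136,20,58,12,123),
  year = c(2003,2008,2010,2007,2006,2011,2012,2009,2006,2008)
)

# Classify each patient, then count patients per year and risk level
df <- df %>%
  mutate(
    risk = case_when(
      hb > 65 ~"high risk",
      TRUE ~"low risk"
    )
  ) %>%
  count(
    year,
    risk
  )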
# Side-by-side bars: number of patients per year and risk level
df %>%
  ggplot(aes(x=year, y = n, group = risk, fill = risk)) +
  geom_col(position = "dodge") +
  labs(
    title="Analysis of diabetes' patients status over time",
    y="Number of patients",
    fill = "Risk Status")
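# df holds one row per year/risk pair, so weight by n to reflect patient counts
df %>%
  ggplot(aes(x=year, fill = risk, weight = n)) +
  geom_density(position = "fill") +
  labs(
    title="Analysis of diabetes' patients status over time",
    y="Share of estimated density",
    fill = "Risk Status")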
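One thing to watch with counted data: a year/risk combination with no patients has no row at all, so a line plot connects straight across the gap rather than dropping to zero. A minimal sketch of one way to handle this, assuming that missing combinations should count as zero and that years run in steps of 1, uses tidyr's complete() and full_seq(). The name counts_full is only illustrative.

```r
library(dplyr)
library(tidyr)

# Sample data from the question
df <- tibble(
  hb = c(65, 125, 40, 110, 125, 136, 20, 58, 12, 123),
  year = c(2003, 2008, 2010, 2007, 2006, 2011, 2012, 2009, 2006, 2008)
)

counts_full <- df %>%
  mutate(risk = if_else(hb > 65, "high risk", "low risk")) %>%
  count(year, risk) %>%
  # Add every year from the first to the last for both risk levels,
  # filling combinations that never occurred with a count of 0
  complete(year = full_seq(year, 1), risk, fill = list(n = 0L))

print(counts_full)
```

Whether zero-filling is appropriate depends on the data: it fits if a missing year really means no patients, but not if that year was never recorded.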