How to calculate rates between observations with R
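I have a data frame ordered by date, with a quantity recorded for each date. How can I calculate, for each row, the ratio of day X to day X - 1?
My dataset: https://raw.githubusercontent.com/imdevskp/covid_19_jhu_data_web_scrap_and_cleaning/master/covid_19_clean_complete.csv
How I process the dataset (R code):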
library(tidyverse)
library(lubridate)
covid19 <- read.table(file = "covid_19_clean_complete.csv",
                      header = TRUE,
                      stringsAsFactors = FALSE,
                      sep = ",",
                      dec = ".",
                      quote = "\"")
covid19$Date <- mdy(covid19$Date)
brasil <- covid19 %>%
  filter(Country.Region == "Brazil") %>%
  group_by(Country.Region, Date) %>%
  summarise(Cases = sum(Confirmed))
The rate should be calculated from the Cases variable.
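We can take the lag of 'Cases' and divide 'Cases' by it. lag() shifts the column down by one row, so each day's total is divided by the previous day's total, and the first row of each group gets NA.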
library(dplyr)
out <- covid19 %>%
  group_by(Country.Region, Date) %>%
  summarise(Cases = sum(Confirmed)) %>%
  mutate(Ratio = Cases/lag(Cases))
out %>%
  filter(Country.Region == "Brazil") %>%
  tail
# A tibble: 6 x 4
# Groups:   Country.Region [1]
#  Country.Region Date       Cases Ratio
#  <chr>          <date>     <int> <dbl>
#1 Brazil         2020-03-08    20  1.54
#2 Brazil         2020-03-09    25  1.25
#3 Brazil         2020-03-10    31  1.24
#4 Brazil         2020-03-11    38  1.23
#5 Brazil         2020-03-12    52  1.37
#6 Brazil         2020-03-13   151  2.90
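Two things can trip up this approach in practice: lag() depends on row order, and a previous-day total of 0 makes the division return Inf or NaN. The following is a minimal sketch of one way to guard against both. It uses a made-up toy data frame (`toy`, with invented values) in place of the real dataset, and the `Prev` column is a helper introduced only for illustration.

```r
library(dplyr)

# Toy data standing in for per-country daily totals (values are invented).
toy <- tibble(
  Country.Region = c("A", "A", "A", "B", "B", "B"),
  Date = as.Date(c("2020-03-01", "2020-03-02", "2020-03-03",
                   "2020-03-01", "2020-03-02", "2020-03-03")),
  Cases = c(0, 10, 15, 4, 8, 8)
)

ratios <- toy %>%
  group_by(Country.Region) %>%
  arrange(Date, .by_group = TRUE) %>%  # lag() relies on rows being in date order
  mutate(
    Prev  = lag(Cases),                # previous day's total within the country
    Ratio = if_else(Prev > 0, Cases / Prev, NA_real_)  # NA when Prev is 0 or missing
  ) %>%
  ungroup()

print(ratios)
```

Grouping by Country.Region alone keeps the lag from crossing from one country into the next, and the explicit arrange() makes the result independent of how the input file happens to be sorted.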