Growth rate of time-series data in R

library(ggplot2)
library(data.table)

set.seed(100)

# Making data table
date <- rep(1:10, each=10)
id <- rep(1:10, 10)
grp <- rep(1:2, each=5)

# Adding random body size per group and averaging
dt <- as.data.table(cbind(date, grp, id))
dt[grp==1, bodysize:=rnorm(50, mean=6)]
dt[grp==2, bodysize:=rnorm(50, mean=7)]
dt <- dt[, mean.Body:=mean(bodysize), list(date, grp)]

# Plot
ggplot(data=dt, aes(x=date, y=mean.Body, group=grp)) + 
  geom_line(position="identity", aes(color=as.factor(grp)), size= 2, linetype= 2) +
  geom_point(size=2) +
  theme_minimal() + 
  labs(x= "Date", y= "Body size (mm)", color="Group" )

Figure: body size over time

My question is how to implement a function that calculates the growth rate over several days for each individual in a data table. This is morphological data, so the growth rate is calculated as log(body size on day i) - log(body size on day i-1). In other words, the log of today's body size minus the log of yesterday's. I have 5 individuals per group, measured for 10 days. The goal of this post is to find the growth rate per day for each individual and to recreate the graph above, but for growth rate per day. The code above generates some mock data.

Any suggestions will be greatly appreciated.

Okay, I'm no good at data.table, but here's an attempt in tidyverse.

First, I'll remake your data.

library(tidyverse)

set.seed(100)

# Making data
date <- rep(1:10, each=10)
id <- rep(1:10, 10)
grp <- rep(1:2, each=5)

df <- cbind(date, grp, id) %>% 
  as_tibble %>%
  rowwise %>%
  mutate(bodysize = rnorm(1, mean = 5 + grp)) %>%
  ungroup

I couldn't come up with a better solution than doing a pivot_wider, computing the lag per individual, and then pivoting back to long format, in order to get the lag to work properly:

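result <- df %>% 
  # One column per individual, so lag() compares each one with its own previous day
  pivot_wider(names_from = c(grp, id), 
              values_from = bodysize) %>%
  # Unnamed function: columns are overwritten in place. A named list would add
  # *_growth columns and leave the raw body sizes in the pivoted result.
  # This is the raw day-over-day difference; use log(.) - lag(log(.)) for log growth.
  mutate_at(vars(-date), ~ . - lag(.)) %>%
  pivot_longer(-date, names_to = c("grp", "id"), 
               names_pattern = "([0-9]+)_([0-9]+)",
               values_to = "growth") %>%
  # Day 1 has no previous day, so its growth is NA
  filter(!is.na(growth))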
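
As a hedged aside, the same per-individual lag can probably be done without the pivot round trip by grouping on the individual before calling lag(). The sketch below is an assumption-laden alternative, not the approach used above: it rebuilds similar mock data (the random draws will differ from the ones above), computes the log growth defined in the question, and stores it in a separate, hypothetical object called growth_by_id that the plots below do not use.

```r
library(dplyr)

set.seed(100)

# Mock data in the same shape as above: 10 days, 10 ids, ids 1-5 in group 1
df_long <- tibble(
  date = rep(1:10, each = 10),
  id   = rep(1:10, times = 10),
  grp  = rep(rep(1:2, each = 5), times = 10)
) %>%
  mutate(bodysize = rnorm(n(), mean = 5 + grp))

# Log growth per individual: log(size on day i) - log(size on day i - 1)
growth_by_id <- df_long %>%
  arrange(id, date) %>%          # lag() relies on row order within each id
  group_by(grp, id) %>%
  mutate(growth = log(bodysize) - lag(log(bodysize))) %>%
  ungroup() %>%
  filter(!is.na(growth))         # drop day 1, which has no previous day
```

Grouping by id is what keeps lag() from reaching across individuals; the arrange() call is there because lag() works on row order, not on the date column itself.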

Now, I'm a little unsure about what your desired plot is. You mentioned 5 individuals, but you have 10 ids. If we plot each of them, the plot gets a little messy, but you could play around with aes to separate each line.

# Plot
ggplot(result, 
       aes(x = date, y = growth, group = id)) + 
  geom_line(position = "identity", 
            aes(color = as.factor(grp)), size = 2, linetype = 2) +
  geom_point(size = 2) +
  theme_minimal() + 
  labs(x = "Date", y = "Growth (mm per day)", color = "Group")

Alternatively, we can of course average over the ids within each group to get a neater plot, if that's what you prefer:

# Alternative plot
ggplot(result %>% group_by(date, grp) %>% summarise(grp_mean = mean(growth)), 
       aes(x = date, y = grp_mean, group = grp)) + 
  geom_line(position = "identity", 
            aes(color = as.factor(grp)), size = 2, linetype = 2) +
  geom_point(size = 2) +
  theme_minimal() + 
  labs(x = "Date", y = "Growth (mm per day)", color = "Group")

Created on 2019-12-06 by the reprex package (v0.2.1)

(Completely edited for a better attempt.)
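
Since the question asked about data.table, here is a minimal sketch of how the same idea might look there, using shift() as the lag within each individual. It is an assumption on my part rather than a tested answer to the original post: the mock data is rebuilt with a single rnorm() call (so the values differ from the question's), and the names growth, mean.Growth and grp_growth are made up for illustration.

```r
library(data.table)

set.seed(100)

# Mock data: 10 days, 10 ids, ids 1-5 in group 1 and ids 6-10 in group 2
dt <- data.table(
  date = rep(1:10, each = 10),
  id   = rep(1:10, times = 10),
  grp  = rep(rep(1:2, each = 5), times = 10)
)
dt[, bodysize := rnorm(.N, mean = 5 + grp)]

# Sort so that shift() looks at the previous day within each individual
setorder(dt, id, date)

# Log growth: log(size on day i) - log(size on day i - 1), NA on day 1
dt[, growth := log(bodysize) - shift(log(bodysize)), by = id]

# Mean growth per group and day, dropping day 1 (no previous day)
grp_growth <- dt[!is.na(growth), .(mean.Growth = mean(growth)), by = .(date, grp)]
```

The grp_growth table could then be passed to the ggplot call from the question with y = mean.Growth to get the per-group growth plot.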
