Calculate percent changes in "long" dataframe
I have a dataframe that contains GDP values by country with an accompanying date column. The following code reproduces a sample dataset for two countries (FR and DE) and six years (2005-2010):
df <- structure(list(geo = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L), .Label = c("DE", "FR"), class = "factor"),
date = structure(c(12784, 13149, 13514, 13879, 14245, 14610,
12784, 13149, 13514, 13879, 14245, 14610), class = "Date"),
GDP = c(2147975, 2249584.4, 2373993.1, 2382892.6, 2224501.8,
2371033.2, 1557584.8, 1621633.2, 1715655.4, 1713157.1, 1636336.3,
1707966.5)), .Names = c("geo", "date", "GDP"), row.names = c(NA,
-12L), class = "data.frame")
Now I would like to add a column that shows the year-over-year percent change. I tried the following:
library(quantmod)
# provides the Delt() function to calculate percent differences
df$dtGDP <- as.numeric(Delt(df$GDP))
This is wrong, because it calculates a value for FR in 2005 from the DE value for 2010. Is there a way to apply the function per factor level?
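To see the problem concretely, here is a minimal sketch in base R. It rebuilds the same sample data in a compact form and uses a small helper, `pct_change`, which is a name introduced here for illustration (it is not part of quantmod) and is assumed to compute the change relative to the previous element, as the question describes.

```r
# Same sample values as the question, rebuilt so this snippet runs on its own.
df <- data.frame(
  geo  = factor(rep(c("DE", "FR"), each = 6)),
  date = rep(seq(as.Date("2005-01-01"), by = "year", length.out = 6), times = 2),
  GDP  = c(2147975, 2249584.4, 2373993.1, 2382892.6, 2224501.8, 2371033.2,
           1557584.8, 1621633.2, 1715655.4, 1713157.1, 1636336.3, 1707966.5)
)

# Hypothetical helper: change of each element relative to the one before it.
pct_change <- function(x) {
  if (length(x) < 2) return(rep(NA_real_, length(x)))
  c(NA, diff(x) / head(x, -1))
}

# Applied to the whole column, the calculation ignores the country boundary.
naive <- pct_change(df$GDP)

df[7, c("geo", "date")]  # row 7 is FR, 2005-01-01
naive[7]                 # roughly -0.343, i.e. (FR 2005 - DE 2010) / DE 2010
```

Row 7 should be `NA`, since FR has no 2004 value to compare against; instead it carries a spurious drop computed across the two countries.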
> df$dtGDP <- with(df, ave(GDP, geo, FUN = Delt))
> df
   geo       date     GDP        dtGDP
1   DE 2005-01-01 2147975           NA
2   DE 2006-01-01 2249584  0.047304741
3   DE 2007-01-01 2373993  0.055302971
4   DE 2008-01-01 2382893  0.003748747
5   DE 2009-01-01 2224502 -0.066469970
6   DE 2010-01-01 2371033  0.065871558
7   FR 2005-01-01 1557585           NA
8   FR 2006-01-01 1621633  0.041120329
9   FR 2007-01-01 1715655  0.057979943
10  FR 2008-01-01 1713157 -0.001456178
11  FR 2009-01-01 1636336 -0.044841655
12  FR 2010-01-01 1707966  0.043774742
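The same grouping idea also works without quantmod. The following is a minimal sketch, not the answerer's code: `pct_change` is a helper name introduced here for illustration, and the explicit sort is an added assumption that rows may not already be in date order within each country.

```r
# Same sample values as the question, rebuilt so this snippet runs on its own.
df <- data.frame(
  geo  = factor(rep(c("DE", "FR"), each = 6)),
  date = rep(seq(as.Date("2005-01-01"), by = "year", length.out = 6), times = 2),
  GDP  = c(2147975, 2249584.4, 2373993.1, 2382892.6, 2224501.8, 2371033.2,
           1557584.8, 1621633.2, 1715655.4, 1713157.1, 1636336.3, 1707966.5)
)

# Hypothetical helper: change of each element relative to the one before it.
pct_change <- function(x) {
  if (length(x) < 2) return(rep(NA_real_, length(x)))
  c(NA, diff(x) / head(x, -1))
}

# Sort by country, then date, so "previous element" means "previous year".
df <- df[order(df$geo, df$date), ]

# ave() splits GDP by geo, applies the helper to each piece, and puts the
# results back in the original row positions.
df$dtGDP <- ave(df$GDP, df$geo, FUN = pct_change)

df
```

The first row of each country comes out as `NA`, and the remaining values agree with the `Delt()` output shown above (for example, about 0.0473 for DE in 2006).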
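Try this:
library(quantmod)
# Compute Delt() separately for each country; each result is kept as a list element.
foo <- aggregate(GDP ~ geo, df, function(x) list(Delt(x)))
# Flatten the per-country results and bind them as a new column.
# Note: this relies on the rows of df already being grouped by geo in factor-level order.
df <- cbind(df, dtGDP = as.numeric(unlist(foo[, -1])))
df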
Assuming you have already run this:
library(quantmod)
df <- structure(list(geo = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L), .Label = c("DE", "FR"), class = "factor"),
date = structure(c(12784, 13149, 13514, 13879, 14245, 14610,
12784, 13149, 13514, 13879, 14245, 14610), class = "Date"),
GDP = c(2147975, 2249584.4, 2373993.1, 2382892.6, 2224501.8,
2371033.2, 1557584.8, 1621633.2, 1715655.4, 1713157.1, 1636336.3,
1707966.5)), .Names = c("geo", "date", "GDP"), row.names = c(NA,
-12L), class = "data.frame")