
Calculate percent change from a baseline year (t0) to a subsequent BUT LIMITED series of years (t1, …, tk)

Imagine you have yearly data for some sort of expenses. You are interested in the percent difference between the first value (t0) and each subsequent value (t1, ..., tx), BUT only within a specific group of observations; that is, with the next group, a new series of subsequent years starts.

Example:

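    value <- c(10225,10287,10225,10087,10344,10387,10387,14567,13992,15432)
    # the group labels must be quoted; bare A, B, C would be looked up as objects
    case <- c("A","A","A","B","B","B","B","B","C","C")
    # data frame referred to as df in the answers below
    df <- data.frame(value, case)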

    year    value   case   change
    1989    10225   A      0.00
    1990    10287   A      0.61 # ((100/10225)*10287)-100
    1991    10262   A      0.36
    1995    10087   B      0.00
    1996    10344   B      2.55 # ((100/10087)*10344)-100
    1997    10387   B      2.97 
    1978    10387   B      2.97
    1979    14567   B      ...
    1980    13992   C
    1981    15432   C

How can I calculate the percent change in R?

The answers to my earlier post and to similar posts (e.g., this post on calculating relative difference) were very helpful. Thanks again!

However, I realized that my case is more complex, and I edited my question accordingly. The problem is that I do not have ONE series of subsequent years but A NUMBER of limited series of subsequent years, one per group of cases.

Any ideas are highly appreciated!

Many thanks.

What about this?

((value[-1]/value[1])-1)*100
[1]  0.6063570  0.0000000 -1.3496333  1.1638142  1.5843521  0.7334963

Another alternative:

((value - value[1]) / value[1]) * 100
[1]  0.0000000  0.6063570  0.0000000 -1.3496333  1.1638142  1.5843521  0.7334963

For your updated question, here are two base R solutions:

transform(df, Change = unlist(sapply(split(value, case), function(x) ((x - x[1]) / x[1]) * 100)))
   value case    Change
A1 10225    A  0.000000
A2 10287    A  0.606357
A3 10225    A  0.000000
B1 10087    B  0.000000
B2 10344    B  2.547834
B3 10387    B  2.974125
B4 10387    B  2.974125
B5 14567    B 44.413602
C1 13992    C  0.000000
C2 15432    C 10.291595

 transform(df, Change = unlist(aggregate(value ~ case, function(x) ((x - x[1]) / x[1]) * 100, data=df)$value))
   value case    Change
01 10225    A  0.000000
02 10287    A  0.606357
03 10225    A  0.000000
11 10087    B  0.000000
12 10344    B  2.547834
13 10387    B  2.974125
14 10387    B  2.974125
15 14567    B 44.413602
21 13992    C  0.000000
22 15432    C 10.291595
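
A further base R variant, sketched here as an alternative and not taken from the answers above, uses ave(). Both solutions above paste the per-group results back in group order, so they rely on df already being sorted by case; ave() writes each result back to the row it came from. The data are the example vectors from the question, with the case labels quoted.

```r
# Example data from the question (case labels quoted so they are character strings)
value <- c(10225, 10287, 10225, 10087, 10344, 10387, 10387, 14567, 13992, 15432)
case  <- c("A", "A", "A", "B", "B", "B", "B", "B", "C", "C")
df <- data.frame(value, case)

# Percent change relative to the first value within each case;
# ave() applies FUN per group and keeps the original row order
df$change <- ave(df$value, df$case, FUN = function(x) (x / x[1] - 1) * 100)

print(df)
```

The change column should agree with the Change column printed above, e.g. 44.413602 for the last row of case B.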

To answer your expanded question, use transform combined with ddply from the plyr package:

ddply(df, .(case), transform, change = ((100 / value[1]) * value) - 100)

Regarding your comment on the NA and Inf values: this is expected behavior, because you are dividing by zero whenever the first value of a group is zero, which makes the change meaningless. You could delete those entries.
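
As an alternative to deleting rows afterwards, the baseline can be checked before dividing. In R, a non-zero number divided by zero gives Inf and 0/0 gives NaN. The helper pct_from_first and the toy data below are hypothetical and only illustrate the idea; they are not part of the original answer.

```r
# Hypothetical helper: percent change from the first element of x.
# Returns NA for the whole group when the baseline is zero or missing,
# instead of letting Inf/NaN through.
pct_from_first <- function(x) {
  base <- x[1]
  if (is.na(base) || base == 0) {
    return(rep(NA_real_, length(x)))
  }
  (x / base - 1) * 100
}

# Made-up toy data: group "B" starts at zero
toy <- data.frame(value = c(100, 110, 0, 5, 0),
                  case  = c("A", "A", "B", "B", "B"))

# Apply the helper per case, keeping the original row order
toy$change <- ave(toy$value, toy$case, FUN = pct_from_first)
print(toy)
```

Rows with an unusable baseline can then be dropped with toy[!is.na(toy$change), ] if desired.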

If your data frame is called, say, df, try something like this:

transform(df, change = 100*(value/value[year==1989] - 1))

Note that this gives a change of 0 for 1989, not NA:

#   year value     change
# 1 1989 10225  0.0000000
# 2 1990 10287  0.6063570
# 3 1991 10225  0.0000000
# 4 1992 10087 -1.3496333
# 5 1993 10344  1.1638142
# 6 1994 10387  1.5843521
# 7 1995 10300  0.7334963

If you know you want the first record to be the base, you can simply use

transform(df, change = 100*(value/value[1] - 1))
