Change column values based on factors of other columns
For example, suppose I have a data frame like this:
df <- data.frame(profit=c(10,10,10), year=c(2010,2011,2012))
profit year
10 2010
10 2011
10 2012
I want to change the value of profit according to the year. For year 2010, I multiply the profit by 3; for 2011, by 4; and for 2012, by 5, which should give this result:
profit year
30 2010
40 2011
50 2012
How should I approach this? I tried:
inflationtransform <- function(k,v) {
switch(k,
2010,v<-v*3,
2011,v<-v*4,
2012,v<-v*5,
)
}
df$profit <- sapply(df$year,df$profit,inflationtransform)
But it doesn't work. Can someone tell me what to do?
For this particular example, since your factors and years are both ordered and each increases by 1 per row, you could just subtract 2007 from the year column and multiply the result by profit.
transform(df, profit = profit * (year - 2007))
# profit year
# 1 30 2010
# 2 40 2011
# 3 50 2012
Otherwise, you could use a lookup vector. This will cover all cases.
lookup <- c("2010" = 3, "2011" = 4, "2012" = 5)
transform(df, profit = profit * lookup[as.character(year)])
# profit year
# 1 30 2010
# 2 40 2011
# 3 50 2012
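One caveat with the lookup vector: a year that has no entry in it comes back as NA. Here is a minimal sketch of one way to handle that. The data frame df2 and its extra year 2013 are hypothetical, and leaving unmatched years unchanged is my own assumption, not something the question asks for.

```r
lookup <- c("2010" = 3, "2011" = 4, "2012" = 5)

# Hypothetical data: 2013 has no entry in the lookup vector
df2 <- data.frame(profit = c(10, 10, 10, 10),
                  year   = c(2010, 2011, 2012, 2013))

# Indexing by a missing name yields NA; unname() drops the year labels
mult <- unname(lookup[as.character(df2$year)])

# Assumption: a year without a multiplier keeps its original profit
mult[is.na(mult)] <- 1

df2$profit <- df2$profit * mult
df2
```

If an unknown year should be treated as an error instead, checking anyNA(mult) before multiplying would be a reasonable alternative.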
I wouldn't use switch() unless you really need to. It's not vectorized, and vectorized operations are where R is most efficient. However, since you asked for it in the comments, here's one way. I find it easier to use a for() loop with switch().
for(i in seq_len(nrow(df))) {
df$profit[i] <- with(df, switch(as.character(year[i]),
"2010" = 3 * profit[i],
"2011" = 4 * profit[i],
"2012" = 5 * profit[i]
))
}
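If you would rather keep a helper function like the one in the question, mapply() can hand it the year and the profit one pair at a time. This is only a sketch: it starts again from the original df, and the stop() fallback for unknown years is my own addition.

```r
df <- data.frame(profit = c(10, 10, 10), year = c(2010, 2011, 2012))

inflationtransform <- function(k, v) {
  # switch() matches names against a character string, so convert the year
  switch(as.character(k),
    "2010" = v * 3,
    "2011" = v * 4,
    "2012" = v * 5,
    stop("no multiplier for year ", k)  # assumption: unknown years are an error
  )
}

# mapply() calls the function once per (year, profit) pair
df$profit <- mapply(inflationtransform, df$year, df$profit)
df
```

This still calls switch() once per row, so the lookup vector remains the better choice for large data frames.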