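Using Variable in lag function of dplyr - R
I have written a function that takes variables as arguments. I am trying to compute the lag of a given column of a data frame, but I can't get it to work. Here is my code snippet: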
calculateLag <- function(df,lagCol,lagInterval){
  df <- df %>%
    group_by(grp = cumsum(c(TRUE, diff(t)!=1))) %>%
    mutate(val_lag = lag(df[,lagCol],lagInterval)) %>%
    ungroup() %>%
    select(-grp)
  return(df)
}
I get the following error message:
Error in `[.data.table`(df, , lagCol) :
j (the 2nd argument inside [...]) is a single symbol but column name 'lagCol' is not found. Perhaps you intended DT[,..lagCol] or DT[,lagCol,with=FALSE]. This difference to data.frame is deliberate and explained in FAQ 1.1.
Expected result:
t val val_lag val_lag2
2005-01-17 17:30:00 14.3 NA NA
2005-01-17 18:30:00 14.0 14.3 NA
2005-01-17 19:30:00 14.3 14.0 14.3
2005-01-17 22:30:00 14.9 NA NA
2005-01-17 23:30:00 14.2 14.9 NA
2005-01-18 00:30:00 14.1 14.2 14.9
Can someone help me with this?
Thanks
A reproducible example would help.
Here is one using mtcars:
library(dplyr)
calculateLag <- function(df,lagCol,lagInterval){
  lagCol <- enquo(lagCol) # capture the column argument as a quosure
  df <- df %>%
    group_by(cyl) %>%
    mutate(val_lag = lag(!!lagCol, lagInterval)) %>% # !! unquotes it inside mutate
    ungroup()
  return(df)
}
calculateLag(select(mtcars,cyl,gear), gear, 2)
See this link for more on non-standard evaluation.
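As a side note, newer versions of rlang (0.4.0 and later, as far as I know) provide the embrace operator `{{ }}`, which does the enquo() / !! pair in one step. A minimal sketch of the same mtcars example written that way, assuming a recent dplyr is installed:

```r
library(dplyr)

# Same idea as the enquo()/!! version, written with the embrace operator.
# {{ lagCol }} quotes the argument and unquotes it inside mutate().
calculateLag <- function(df, lagCol, lagInterval) {
  df %>%
    group_by(cyl) %>%
    mutate(val_lag = lag({{ lagCol }}, lagInterval)) %>%
    ungroup()
}

res <- calculateLag(select(mtcars, cyl, gear), gear, 2)
head(res)
```

The column is still passed unquoted (gear, not "gear"), exactly as in the enquo() version.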
calculateLag <- function(df,lagCol,lagInterval){
  lagCol <- enquo(lagCol)
  df <- df %>%
    group_by(grp = cumsum(c(TRUE, diff(t)!=1))) %>%
    mutate(val_lag = lag(!!lagCol, lagInterval)) %>%
    ungroup() %>%
    select(-grp)
  return(df)
}
calculateLag(df, val, 2)
t val val_lag
1 2005-01-17 06:00:00 10.8 NA
2 2005-01-17 07:00:00 10.8 NA
3 2005-01-17 08:00:00 10.7 10.8
4 2005-01-17 09:00:00 10.6 10.8
5 2005-01-17 10:00:00 10.6 10.7
6 2005-01-17 11:00:00 10.7 10.6
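If the column name has to be passed as a character string (the original df[,lagCol] suggests that was the intent), the `.data` pronoun is one way to do it. The sketch below rests on some assumptions: calculateLagByName and its newCol argument are hypothetical names, the sample data is retyped from the expected result in the question as a plain data.frame, and the time gap is converted to hours explicitly so the comparison with 1 does not depend on the units diff() happens to pick.

```r
library(dplyr)

# Hypothetical variant: lagCol and newCol are character strings.
calculateLagByName <- function(df, lagCol, lagInterval, newCol = "val_lag") {
  df %>%
    # start a new group whenever the gap to the previous row is not 1 hour
    group_by(grp = cumsum(c(TRUE, as.numeric(diff(t), units = "hours") != 1))) %>%
    # .data[[lagCol]] looks the column up by name; !!newCol := names the result
    mutate(!!newCol := lag(.data[[lagCol]], lagInterval)) %>%
    ungroup() %>%
    select(-grp)
}

# Sample rows retyped from the question's expected result
df <- data.frame(
  t = as.POSIXct(c("2005-01-17 17:30:00", "2005-01-17 18:30:00",
                   "2005-01-17 19:30:00", "2005-01-17 22:30:00",
                   "2005-01-17 23:30:00", "2005-01-18 00:30:00"), tz = "UTC"),
  val = c(14.3, 14.0, 14.3, 14.9, 14.2, 14.1)
)

out <- df %>%
  calculateLagByName("val", 1, "val_lag") %>%
  calculateLagByName("val", 2, "val_lag2")
print(out)
```

Calling it once per lag gives both val_lag and val_lag2, and the lag restarts after the three-hour gap between 19:30 and 22:30, which is what the expected result in the question asks for.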