I would like to create variable names dynamically while using dplyr, although I'd be fine with a non-dplyr solution as well.
For example:
data(iris)
library(dplyr)
iris <- iris %>%
  group_by(Species) %>%
  mutate(
    lag_Sepal.Length = lag(Sepal.Length),
    lag_Sepal.Width = lag(Sepal.Width),
    lag_Petal.Length = lag(Petal.Length)
  ) %>%
  ungroup()
head(iris)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species lag_Sepal.Length lag_Sepal.Width
(dbl) (dbl) (dbl) (dbl) (fctr) (dbl) (dbl)
1 5.1 3.5 1.4 0.2 setosa NA NA
2 4.9 3.0 1.4 0.2 setosa 5.1 3.5
3 4.7 3.2 1.3 0.2 setosa 4.9 3.0
4 4.6 3.1 1.5 0.2 setosa 4.7 3.2
5 5.0 3.6 1.4 0.2 setosa 4.6 3.1
6 5.4 3.9 1.7 0.4 setosa 5.0 3.6
Variables not shown: lag_Petal.Length (dbl)
But instead of doing this three times, I want to create 100 of these "lag" variables, each named lag_ followed by the original variable name. I'm trying to figure out how to do this without typing the new variable name 100 times, but I'm coming up short.
I've looked into this example and this example elsewhere on SO. They are similar, but I'm not quite able to piece together the specific solution I need. Any help is appreciated!
Edit
Thanks to @BenFasoli for the inspiration. I took his answer and tweaked it just a bit to get the solution I needed. I also used this RStudio blog post and this SO post. The "lag" in the variable name is a suffix instead of a prefix, but I can live with that.
My final code is posted here in case it's helpful to anyone else:
lagged <- iris %>%
  group_by(Species) %>%
  mutate_at(
    vars(Sepal.Length:Petal.Length),
    funs("lag" = lag)) %>%
  ungroup()
head(lagged)
# A tibble: 6 x 8
Sepal.Length Sepal.Width Petal.Length Petal.Width Species Sepal.Length_lag Sepal.Width_lag
<dbl> <dbl> <dbl> <dbl> <fctr> <dbl> <dbl>
1 5.1 3.5 1.4 0.2 setosa NA NA
2 4.9 3.0 1.4 0.2 setosa 5.1 3.5
3 4.7 3.2 1.3 0.2 setosa 4.9 3.0
4 4.6 3.1 1.5 0.2 setosa 4.7 3.2
5 5.0 3.6 1.4 0.2 setosa 4.6 3.1
6 5.4 3.9 1.7 0.4 setosa 5.0 3.6
# ... with 1 more variables: Petal.Length_lag <dbl>
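For readers on a newer dplyr: funs() has been deprecated since dplyr 0.8.0, and mutate_at() is superseded by across() as of dplyr 1.0.0. The sketch below assumes dplyr 1.0.0 or later and a fresh session where iris is the untouched built-in dataset. It uses the .names argument of across() to put "lag_" in front of the column name rather than behind it.

```r
library(dplyr)

# Assumes dplyr >= 1.0.0; iris here is the original built-in dataset.
lagged <- iris %>%
  group_by(Species) %>%
  mutate(
    across(
      Sepal.Length:Petal.Length,      # columns to lag
      ~ dplyr::lag(.x),               # lag within each Species group
      .names = "lag_{.col}"           # new name: prefix + original name
    )
  ) %>%
  ungroup()

names(lagged)
```

The same selection could be widened to every numeric column with where(is.numeric) in place of the column range.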
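You can use mutate_all (or mutate_at for specific columns), then prepend lag_ to the column names. Note that this replaces the original columns with their lagged values rather than adding new ones, and the grouping column Species is renamed but not lagged.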
data(iris)
library(dplyr)
lag_iris <- iris %>%
  group_by(Species) %>%
  mutate_all(funs(lag(.))) %>%
  ungroup()
colnames(lag_iris) <- paste0('lag_', colnames(lag_iris))
head(lag_iris)
lag_Sepal.Length lag_Sepal.Width lag_Petal.Length lag_Petal.Width lag_Species
<dbl> <dbl> <dbl> <dbl> <fctr>
1 NA NA NA NA setosa
2 5.1 3.5 1.4 0.2 setosa
3 4.9 3.0 1.4 0.2 setosa
4 4.7 3.2 1.3 0.2 setosa
5 4.6 3.1 1.5 0.2 setosa
6 5.0 3.6 1.4 0.2 setosa
Here is a data.table approach. I chose the numeric columns in this case. The idea is to choose the column names and create the new column names in advance. Then you apply shift(), which works like lag() and lead() in the dplyr package, to each of the columns you chose.
library(data.table)
# Create a df for this demo.
mydf <- iris
# Choose the columns to lag and build the new column names.
cols = names(iris)[sapply(iris, is.numeric)]
anscols = paste("lag_", cols, sep = "")
# Apply shift() to each of the chosen columns.
setDT(mydf)[, (anscols) := shift(.SD, 1, type = "lag"),
            .SDcols = cols]
Sepal.Length Sepal.Width Petal.Length Petal.Width Species lag_Sepal.Length lag_Sepal.Width
1: 5.1 3.5 1.4 0.2 setosa NA NA
2: 4.9 3.0 1.4 0.2 setosa 5.1 3.5
3: 4.7 3.2 1.3 0.2 setosa 4.9 3.0
4: 4.6 3.1 1.5 0.2 setosa 4.7 3.2
5: 5.0 3.6 1.4 0.2 setosa 4.6 3.1
---
146: 6.7 3.0 5.2 2.3 virginica 6.7 3.3
147: 6.3 2.5 5.0 1.9 virginica 6.7 3.0
148: 6.5 3.0 5.2 2.0 virginica 6.3 2.5
149: 6.2 3.4 5.4 2.3 virginica 6.5 3.0
150: 5.9 3.0 5.1 1.8 virginica 6.2 3.4
lag_Petal.Length lag_Petal.Width
1: NA NA
2: 1.4 0.2
3: 1.4 0.2
4: 1.3 0.2
5: 1.5 0.2
---
146: 5.7 2.5
147: 5.2 2.3
148: 5.0 1.9
149: 5.2 2.0
150: 5.4 2.3
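The call above lags across the whole table, so the first row of each species picks up the last value of the previous species. If the lag should restart within each Species, as in the question's group_by(), a possible variation is to add by = Species. This is a minimal sketch; dt is a name introduced here for illustration, and it works on a copy so the built-in iris is left alone.

```r
library(data.table)

# Work on a data.table copy of the built-in dataset (dt is an illustrative name).
dt <- as.data.table(iris)

# Numeric columns to lag, and the names of the new columns.
cols <- names(dt)[sapply(dt, is.numeric)]
anscols <- paste0("lag_", cols)

# by = Species makes shift() start over for each species.
dt[, (anscols) := shift(.SD, 1, type = "lag"), by = Species, .SDcols = cols]

# Rows 50 and 51 straddle the setosa/versicolor boundary.
dt[50:51, c("Species", "Sepal.Length", "lag_Sepal.Length")]
```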
Since you're also happy with a non-dplyr solution, try this:
lagger <- function(x, n) c(rep(NA,n), head(x,-n) )
iris[paste0("lag_", names(iris) )] <- lapply(iris, lagger, n=1)
head(iris,2)[-(1:5)]
# lag_Sepal.Length lag_Sepal.Width lag_Petal.Length lag_Petal.Width lag_Species
#1 NA NA NA NA NA
#2 5.1 3.5 1.4 0.2 1
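Note that the snippet above lags over the whole data frame and turns the Species factor into its integer codes. If the lag should instead restart within each Species and only the numeric columns matter, one possible base R variation is to wrap the same helper in ave(). This is a sketch under those assumptions; iris2 and num are names introduced here for illustration.

```r
# Same helper as above, with a default of one step.
lagger <- function(x, n = 1) c(rep(NA, n), head(x, -n))

# Work on a copy and pick only the numeric columns (illustrative names).
iris2 <- datasets::iris
num <- names(iris2)[sapply(iris2, is.numeric)]

# ave() applies lagger separately within each level of Species
# and returns the results in the original row order.
iris2[paste0("lag_", num)] <- lapply(
  iris2[num],
  function(col) ave(col, iris2$Species, FUN = lagger)
)

head(iris2, 2)[-(1:5)]
```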