Repeatedly mutate variable using dplyr and purrr

I'm self-taught in R and this is my first StackOverflow question. I apologize if this is an obvious issue; please be kind.

Short Version of my Question
I wrote a custom function to calculate the percent change in a variable year over year. I would like to use purrr's map_at function to apply my custom function to a vector of variable names. My custom function works when applied to a single variable, but fails when I chain it using map_at.

My custom function

library(dplyr)

calculate_delta <- function(df, col) {

  # generate the new variable name, e.g. "wt" becomes "dwt"
  newcolname <- paste("d", col, sep = "")

  # get formula for first difference
  calculate_diff <- lazyeval::interp(~(a + lag(a)) / a, a = as.name(col))

  # pass formula to mutate_, naming the new variable with the column name generated above
  df %>%
    mutate_(.dots = setNames(list(calculate_diff), newcolname))
}

When I apply this function to a single variable in the mtcars dataset, the output is as expected (although obviously the meaning of the result is nonsensical).

calculate_delta(mtcars, "wt")

Attempt to Apply the Function to a Character Vector Using purrr

I think that I'm having trouble conceptualizing how map_at passes arguments to the function. All of the example snippets I can find online use map_at with functions like is.character, which don't require additional arguments. Here are my attempts at applying the function using purrr.

vars <- c("wt", "mpg")
mtcars %>% map_at(vars, calculate_delta)

This gives me the following error message:

Error in paste("d", col, sep = "") : argument "col" is missing, with no default

I assume this is because map_at is passing vars as the df, and not passing an argument for col. To get around that issue, I tried the following:

vars <- c("wt", "mpg") 
mtcars %>% map_at(vars, calculate_delta, df = .)

That throws this error:

Error: unrecognised index type

I've monkeyed around with a bunch of different versions, including removing the df argument from the calculate_delta function, but I have had no luck.

Other potential solutions

1) A version of this using sapply, rather than purrr. I've tried solving the problem that way and had similar trouble. My goal is to figure out a way to do this using purrr, if that is possible. Based on my understanding of purrr, this seems like a typical use case.

2) I can obviously think of how I would implement this using a for loop, but I'm trying to avoid that if possible, for similar reasons.

Clearly I'm thinking about this wrong. Please help!

EDIT 1

To clarify, I am curious whether there is a method of repeatedly transforming variables that accomplishes the following:

1) Generates new variables within the original tbl_df without replacing the columns being mutated (as is the case when using dplyr's mutate_at).

2) Automatically generates new variable labels.

3) If possible, accomplishes what I've described by applying a single function using map_at.

It may be that this is not possible, but I feel like there should be an elegant way to accomplish what I am describing.

Try simplifying the process:

delta <- function(x) (x + dplyr::lag(x)) /x
cols <- c("wt", "mpg")

#This
library(dplyr)
mtcars %>% mutate_at(cols, delta)
#Or
library(purrr)
mtcars %>% map_at(cols, delta)

#If necessary, in a function
f <- function(df, cols) {
  df %>% mutate_at(cols, delta)
}

f(iris, c("Sepal.Width", "Petal.Length"))
f(mtcars, c("wt", "mpg"))
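One way to see why the original two-argument calculate_delta(df, col) does not fit map_at: the mapped function is called once per selected column and receives that column's values, not the data frame plus a name. The sketch below checks this on mtcars. It also tries modify_at, on the assumption that a reasonably recent purrr is installed, since current purrr documentation says map_at always returns a list rather than a data frame. The names received, by_mutate, and by_modify are illustrative only.

```r
library(dplyr)
library(purrr)

delta <- function(x) (x + dplyr::lag(x)) / x
cols <- c("wt", "mpg")

# Each selected column reaches the mapped function as a plain vector,
# so there is no data frame or column name for a `col` argument to bind to.
received <- map(mtcars[cols], class)

# mutate_at() returns a data frame with the selected columns overwritten.
by_mutate <- mtcars %>% mutate_at(cols, delta)

# modify_at() (assumed available in the installed purrr) keeps the input
# type, so the result is still a data frame.
by_modify <- mtcars %>% modify_at(cols, delta)
```

Both variants overwrite wt and mpg in place, so neither keeps the original columns alongside the new ones.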

Edit

If you would like to rename the mutated columns afterwards, we can write a custom pipe-ready function:

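Rename <- function(object, old, new) {
  # match() pairs each old name with its new name regardless of column order;
  # indexing with %in% would assign the new names in column order instead
  names(object)[match(old, names(object))] <- new
  object
}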

mtcars %>% 
  mutate_at(cols, delta) %>% 
  Rename(cols, paste0("lagged",cols))
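If the goal is to keep the asker's two-argument shape (a data frame plus one column name) and still stay within purrr, reduce may be a closer fit than map_at, because it threads the growing data frame through one call per name. This is only a sketch: add_delta is a hypothetical helper written in base R so that it does not depend on mutate_ or lazyeval, and it reuses the formula from the question unchanged.

```r
library(purrr)

delta <- function(x) (x + dplyr::lag(x)) / x

# Hypothetical helper: takes a data frame and ONE column name and
# returns the data frame with a new "d<col>" column appended.
add_delta <- function(df, col) {
  df[[paste0("d", col)]] <- delta(df[[col]])
  df
}

vars <- c("wt", "mpg")

# reduce() calls add_delta(mtcars, "wt"), then passes that result
# to add_delta(<result>, "mpg"), and returns the final data frame.
result <- reduce(vars, add_delta, .init = mtcars)
```

The original columns are left untouched and the new ones are named automatically, which covers the first two requirements in the question's edit.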

If you want to keep the original columns and give the resulting lagged variables new names:

mtcars %>% mutate_at(cols, funs(lagged = delta))
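funs() was deprecated in later dplyr releases, so the line above may warn or fail depending on the installed version. Assuming dplyr 1.0.0 or later, a roughly equivalent sketch uses across() with a .names pattern; the "d{.col}" pattern is chosen here to mirror the "d" prefix from the question and is not part of the original answer.

```r
library(dplyr)

delta <- function(x) (x + dplyr::lag(x)) / x
cols <- c("wt", "mpg")

# across() applies delta to each selected column; .names builds the new
# column names from the old ones, so the originals are kept as they are.
result <- mtcars %>%
  mutate(across(all_of(cols), delta, .names = "d{.col}"))
```

Here result should contain every mtcars column plus dwt and dmpg.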
