
Apply dplyr function which contains "group_by" to multiple columns

Data:

structure(list(Reversal_MSignal = c(0, 0, 0.5, 0, -0.5, 0, 0, 
0, 0, 0), Reversal_MSignal_time = c(1L, 2L, 1L, 2L, 1L, 2L, 3L, 
4L, 5L, 6L)), row.names = c(NA, -10L), class = c("tbl_df", "tbl", 
"data.frame"))

I have a column called Reversal_MSignal and I would like to create a new column called Reversal_MSignal_time, which is a number sequence that restarts from 1 whenever Reversal_MSignal != 0.

The following code creates the wanted column. However, the problem is that I need to create this type of column for multiple columns.

data %>% group_by(gr = cumsum(Reversal_MSignal != 0)) %>% 
  mutate(Reversal_MSignal_time = row_number()) %>% ungroup() %>% 
  select(-gr)

I was thinking about creating a function and then applying it through lapply or through mutate(across(everything(), Timeframe)).

Timeframe <- function(Column){
group_by(gr = cumsum(Column != 0)) %>% 
  mutate(Column = row_number()) %>% ungroup() %>% 
  select(-gr) %>% 
  rename(Reversal_MSignal_time = Reversal_MSignal)
}

df %>% lapply(., Timeframe)

However, I get this error with lapply, and with mutate it doesn't behave as wanted:

Error in UseMethod("group_by") : no applicable method for 'group_by' applied to an object of class "c('integer', 'numeric')"

I'm also open to faster solutions, but I would prefer ones that use the dplyr package, as it is easier for me to understand what is going on.

Your main problem was that the group_by inside your function was not being passed a data frame as its first argument: lapply hands the function each column as a bare vector, and group_by has no method for that. You can try this:

library(tidyverse)

data <- structure(list(Reversal_MSignal = c(0, 0, 0.5, 0, -0.5, 0, 0, 0, 0, 0), 
                       Reversal_MSignal_time = c(1L, 2L, 1L, 2L, 1L, 2L, 3L, 4L, 5L, 6L)), 
                  row.names = c(NA, -10L), 
                  class = c("tbl_df", "tbl", "data.frame"))

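Timeframe <- function(data, Column){
    # Column is a column name given as a string
    data %>% 
        # start a new group at every non-zero value of the column
        group_by(gr = cumsum(.data[[Column]] != 0)) %>% 
        # glue-style name: "Reversal_MSignal" becomes "Reversal_MSignal.time"
        mutate("{Column}.time" := row_number()) %>% 
        ungroup() %>% 
        select(-gr)
}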

Timeframe(data, "Reversal_MSignal")
Timeframe(data, "Reversal_MSignal_time")

reduce(names(data), Timeframe, .init = data) 

The reduce statement loops over all the column names of data, applying the Timeframe function to each one and using the output of one call as the input to the next.
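Since the question also mentions mutate(across(everything(), ...)), here is a minimal sketch of that route. It is not part of the original answer. It assumes the columns contain no NA values, and the helper name time_since_signal and the second column Other_Signal are made up for illustration. The idea is to do the counting on a plain vector, so that no group_by is needed inside the function; the counting itself uses base R's ave rather than dplyr.

```r
library(dplyr)

# Example data: the first column is from the question,
# the second one is hypothetical and only there to show several columns at once.
data <- tibble(
  Reversal_MSignal = c(0, 0, 0.5, 0, -0.5, 0, 0, 0, 0, 0),
  Other_Signal     = c(1, 0, 0, 0, 2, 0, 0, 0, 0, 3)
)

# Hypothetical helper: takes a plain vector and returns a counter
# that restarts at 1 at every non-zero value (assumes no NA in x).
time_since_signal <- function(x) {
  run_id <- cumsum(x != 0)                     # run identifier, bumps at each non-zero value
  ave(seq_along(x), run_id, FUN = seq_along)   # 1, 2, 3, ... within each run
}

# Apply the helper to every column; .names keeps the originals
# and adds one "<column>_time" column per input column.
result <- data %>%
  mutate(across(everything(), time_since_signal, .names = "{.col}_time"))

print(result)
```

Because the helper only sees one vector at a time, it would also work with lapply(data, time_since_signal), which is the call that failed with the group_by-based version.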
