
Cannot use dplyr programming syntax in across() in R

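I want to use dplyr programming syntax (combining !! with :=) to evaluate the function names in the .fns argument of across(), but it fails. The code looks like this: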

library(dplyr)
library(zoo)
library(glue)

aa = structure(list(region = c(1, 2, 3, 4), co_mean = c(5, 5, 5, 5
), o3_mean = c(5, 5, 5, 5), pm2.5_mean = c(5, 5, 5, 5)), row.names = c(NA, 
                                                                       -4L), class = c("tbl_df", "tbl", "data.frame"))


for (i in 1:3) {
  
fun_name_1 = glue('lag{i}')
fun_name_2 = glue('lag0{i}')
aa = aa %>% group_by(region) %>% 
  mutate(across(.cols = contains('mean'), 
                .fns = list(!!fun_name_1 := ~lag(., i), # ERROR OCCUR AT HERE
                            !!fun_name_2 := ~ rollmeanr(., i)),
                .names = '{.col}_{.fn}'))
aa
}

I don't know how to solve it.

Any help will be highly appreciated!

======UPDATE========

My new code and the new error:

library(dplyr)
library(zoo)
library(glue)

aa = structure(list(region = c(1, 2, 3, 4), co_mean = c(5, 5, 5, 5
), o3_mean = c(5, 5, 5, 5), pm2.5_mean = c(5, 5, 5, 5)), row.names = c(NA, 
                                                                       -4L), class = c("tbl_df", "tbl", "data.frame"))


for (i in 1:3) {
 # i <- 1
  fun_name_1 = glue('lag{i}')
  fun_name_2 = glue('lag0{i}')
  aa %>%
    group_by(region) %>% 
    mutate(across(.cols = contains('mean'), 
                  .fns = setNames(list(~lag(., i),
                                       ~ rollmeanr(., i)), c(fun_name_1, fun_name_2)),
                  .names = '{.col}_{.fn}'))
aa
}

# Error: Problem with `mutate()` input `..1`.
# x 'names' attribute [6] must be the same length as the vector [5]
# i Input `..1` is `across(...)`.
# i The error occurred in group 1: region = 1.
# Run `rlang::last_error()` to see where the error occurred.

It works if .fns is passed as a named list. Grouping first makes sense as long as the OP's original data has more than one row per group.

i <- 1
fun_name_1 = glue('lag{i}')
fun_name_2 = glue('lag0{i}')
aa %>%
  group_by(region) %>% 
  mutate(across(.cols = contains('mean'), 
               .fns = setNames(list(~lag(., i),
                         ~ rollmeanr(., i)), c(fun_name_1, fun_name_2)),
                .names = '{.col}_{.fn}'))

-output

# A tibble: 4 x 10
# Groups:   region [4]
#  region co_mean o3_mean pm2.5_mean co_mean_lag1 co_mean_lag01 o3_mean_lag1 o3_mean_lag01 pm2.5_mean_lag1 pm2.5_mean_lag01
#   <dbl>   <dbl>   <dbl>      <dbl>        <dbl>         <dbl>        <dbl>         <dbl>           <dbl>            <dbl>
#1      1       5       5          5           NA             5           NA             5              NA                5
#2      2       5       5          5           NA             5           NA             5              NA                5
#3      3       5       5          5           NA             5           NA             5              NA                5
#4      4       5       5          5           NA             5           NA             5              NA                5
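The !! and := syntax from the question fails inside base list() because list() does not support dynamic dots. A minimal sketch of one alternative, assuming rlang is installed: rlang::list2() does accept `!!name := value`, so it can build the named list directly. The data frame `dd` below is hypothetical (several rows, ungrouped), and the names are built with paste0() instead of glue() only to keep the example small.

```r
library(dplyr)
library(rlang)
library(zoo)

# Hypothetical data with several rows (not the OP's data)
dd <- tibble(region = c(1, 1, 1, 1), co_mean = c(1, 2, 3, 4))

i <- 1
fun_name_1 <- paste0('lag', i)    # "lag1"
fun_name_2 <- paste0('lag0', i)   # "lag01"

out <- dd %>%
  mutate(across(.cols = contains('mean'),
                # list2() understands `!!name := value`; base list() does not
                .fns = list2(!!fun_name_1 := ~dplyr::lag(., i),
                             !!fun_name_2 := ~rollmeanr(., i, fill = NA)),
                .names = '{.col}_{.fn}'))

names(out)
```

The setNames() approach above builds the same kind of named list; which of the two reads better is a matter of taste.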

You can specify fill = NA in rollmeanr so that the result is padded to the same length as the input:

aa %>%
   group_by(region) %>% 
   mutate(across(.cols = contains('mean'), 
                .fns = setNames(list(~lag(., i),
                          ~ rollmeanr(., i, fill = NA)), c(fun_name_1, fun_name_2)),
                 .names = '{.col}_{.fn}'))
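The length mismatch reported in the question's error message can be reproduced on a plain vector. A small sketch, assuming only zoo is loaded; the vector `x` is made up for illustration. Without a fill value, rollmeanr() drops the incomplete leading windows, so the result is shorter than the input and mutate() cannot use it as a column.

```r
library(zoo)

x <- c(1, 2, 3, 4, 5)          # illustrative input, not the OP's data

short  <- rollmeanr(x, 2)             # incomplete leading window is dropped
padded <- rollmeanr(x, 2, fill = NA)  # leading position is padded with NA

length(short)    # one element shorter than x
length(padded)   # same length as x
```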

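Firstly, I don't think your data should be grouped, at least not the data you shared: with only 1 row per group, calculating lag values and rolling means makes no sense.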

You can use map_dfc to combine the results into one data frame, and the .names argument of across to name the new columns.

library(dplyr)
library(purrr)
library(zoo)

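# For each window size x, build the lag and rolling-mean columns,
# then bind the three results column-wise.
map_dfc(1:3, function(x) {
  aa %>% 
    transmute(across(.cols = contains('mean'), 
                  .fns = list(lag = ~lag(., x), 
                              lag0 = ~rollmeanr(., x, fill = NA)), 
                   # sprintf() appends the window size to the glue spec
                   .names = sprintf('{.fn}_{.col}_%d', x)))
  })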

If you try it on another dataset that has several rows per group, you can add group_by(region).
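A sketch of the grouped variant, under stated assumptions: the data frame `bb` is hypothetical (two regions with four rows each), and dropping the grouping column before binding is an addition here, not part of the answer. It is needed because transmute() keeps grouping columns, so each piece would otherwise carry its own copy of region.

```r
library(dplyr)
library(purrr)
library(zoo)

# Hypothetical data: two regions with four rows each (not the OP's data)
bb <- tibble(
  region  = rep(c(1, 2), each = 4),
  co_mean = c(1, 2, 3, 4, 10, 20, 30, 40)
)

new_cols <- map_dfc(1:2, function(k) {
  bb %>%
    group_by(region) %>%
    transmute(across(.cols = contains('mean'),
                     .fns = list(lag  = ~dplyr::lag(., k),
                                 lag0 = ~rollmeanr(., k, fill = NA)),
                     .names = sprintf('{.fn}_{.col}_%d', k))) %>%
    ungroup() %>%
    select(-region)   # drop the grouping column kept by transmute()
})

# Put the original columns back next to the new ones
result <- bind_cols(bb, new_cols)
names(result)
```

Lags and rolling means are now computed within each region, so the first rows of region 2 do not borrow values from region 1.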
