
How to use string manipulation functions inside the .names argument of dplyr::across

I searched for a duplicate but could not find one. (There is a similar question, but it differs somewhat from what I need.)

My question is whether a string manipulation function such as substr or stringr::str_remove can be used inside the .names argument of dplyr::across. As a reproducible example, consider this:

library(dplyr)
iris %>%
  summarise(across(starts_with('Sepal'), mean, .names = '{.col}_mean'))

  Sepal.Length_mean Sepal.Width_mean
1          5.843333         3.057333

Now my problem is that I want to rename the output columns with something like str_remove(.col, 'Sepal'), so that the output column names are just Length.mean and Width.mean. I am asking because the description of this argument states:

.names
A glue specification that describes how to name the output columns. This can use {.col} to stand for the selected column name, and {.fn} to stand for the name of the function being applied. The default (NULL) is equivalent to "{.col}" for the single function case and "{.col}_{.fn}" for the case where a list is used for .fns.
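To make the two default cases in that description concrete, here is a minimal sketch. The object names `single` and `multi` are illustrative only and are not part of the original question.

```r
library(dplyr)

# Single function, .names left as NULL: output names follow "{.col}"
single <- iris %>%
  summarise(across(starts_with("Sepal"), mean))
names(single)

# Named list of functions, .names left as NULL: names follow "{.col}_{.fn}"
multi <- iris %>%
  summarise(across(starts_with("Sepal"), list(mean = mean, max = max)))
names(multi)
```

In the first call the columns keep their original names; in the second each name gets the function's list name appended after an underscore.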

I have tried many possibilities, including the following, but none of them work:

library(tidyverse)
library(glue)
iris %>%
  summarise(across(starts_with('Sepal'), mean, 
                   .names = glue('{xx}_mean', xx = str_remove(.col, 'Sepal'))))

Error: Problem with `summarise()` input `..1`.
x argument `str` should be a character vector (or an object coercible to)
i Input `..1` is `(function (.cols = everything(), .fns = NULL, ..., .names = NULL) ...`.
Run `rlang::last_error()` to see where the error occurred.


#OR
iris %>%
  summarise(across(starts_with('Sepal'), mean, 
                   .names = glue('{xx}_mean', xx = str_remove(glue('{.col}'), 'Sepal'))))

I know this can be solved by adding another step with rename_with, so I am not looking for that answer.

This works, though probably with a few caveats. You can call functions inside a glue specification, so you can clean up the strings that way. However, when I tried escaping the ".", I got an error, which I assume has something to do with how across parses the string. If you need something more dynamic, you might want to dig into the source code at that point.

To use the {.fn} helper, at least when building the glue string on the fly like this, the function needs a name; otherwise you get a number giving the function's index in the .fns argument. I tested this with a second function, using lst for automatic naming.

library(dplyr)
iris %>%
  summarise(across(starts_with('Sepal'), .fns = lst(mean, max), 
                   .names = '{stringr::str_remove(.col, "^[A-Za-z]+.")}_{.fn}'))
#>   Length_mean Length_max Width_mean Width_max
#> 1    5.843333        7.9   3.057333       4.4
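As a variation on the code above, here is a hedged sketch of two points from this answer. First, the unescaped "." in the pattern matches any character; a character class "[.]" matches a literal dot without needing a backslash. (A plausible reason the backslash escape fails is that the text between the braces is evaluated as R code, where "\." is not a valid string escape, but I have not confirmed this against the dplyr source.) Second, it shows what {.fn} produces when the functions are unnamed. The pattern "^Sepal[.]" and the object names `named` and `unnamed` are my own choices for illustration, and the stringr package is assumed to be installed.

```r
library(dplyr)

# "[.]" matches a literal dot and needs no backslash inside the glue string
named <- iris %>%
  summarise(across(starts_with("Sepal"), .fns = list(mean = mean, max = max),
                   .names = '{stringr::str_remove(.col, "^Sepal[.]")}_{.fn}'))
names(named)

# Unnamed functions: {.fn} falls back to each function's position in .fns
unnamed <- iris %>%
  summarise(across(starts_with("Sepal"), .fns = list(mean, max),
                   .names = '{stringr::str_remove(.col, "^Sepal[.]")}_{.fn}'))
names(unnamed)
```

With the named list the columns come out as Length_mean, Length_max, Width_mean and Width_max; with the unnamed list the suffixes are 1 and 2 instead.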
