Create new variables with mutate_at while keeping the original ones
Consider this simple example:
library(dplyr)
# tibble() replaces the deprecated data_frame()
dataframe <- tibble(helloo = c(1,2,3,4,5,6),
                    ooooHH = c(1,1,1,2,2,2),
                    ahaaa = c(200,400,120,300,100,100))
# A tibble: 6 x 3
helloo ooooHH ahaaa
<dbl> <dbl> <dbl>
1 1 1 200
2 2 1 400
3 3 1 120
4 4 2 300
5 5 2 100
6 6 2 100
Here I want to apply the function ntile to all the columns whose names contain oo, but I would like these new columns to be called cat + the corresponding column name.
I know I can do this:
dataframe %>% mutate_at(vars(contains('oo')), .funs = funs(ntile(., 2)))
# A tibble: 6 x 3
helloo ooooHH ahaaa
<int> <int> <dbl>
1 1 1 200
2 1 1 400
3 1 1 120
4 2 2 300
5 2 2 100
6 2 2 100
But what I need is this:
# A tibble: 6 x 5
helloo ooooHH ahaaa cat_helloo cat_ooooHH
<dbl> <dbl> <dbl> <int> <int>
1 1 1 200 1 1
2 2 1 400 1 1
3 3 1 120 1 1
4 4 2 300 2 2
5 5 2 100 2 2
6 6 2 100 2 2
Is there a solution that does NOT require storing the intermediate data and merging it back into the original dataframe?
Update 2020-06 for dplyr 1.0.0
Starting in dplyr 1.0.0, the across() function supersedes the "scoped variants" of functions such as mutate_at().
The code should look pretty familiar within across(), which is nested inside mutate().
Adding a name to the function(s) you give in the list adds the function name as a suffix.
dataframe %>%
mutate( across(contains('oo'),
.fns = list(cat = ~ntile(., 2))) )
# A tibble: 6 x 5
helloo ooooHH ahaaa helloo_cat ooooHH_cat
<dbl> <dbl> <dbl> <int> <int>
1 1 1 200 1 1
2 2 1 400 1 1
3 3 1 120 1 1
4 4 2 300 2 2
5 5 2 100 2 2
6 6 2 100 2 2
Changing the new column names is a little easier in 1.0.0 with the .names argument in across(). Here is an example of adding the function name as a prefix instead of a suffix. This uses glue syntax.
dataframe %>%
mutate( across(contains('oo'),
.fns = list(cat = ~ntile(., 2)),
.names = "{.fn}_{.col}" ) )
# A tibble: 6 x 5
helloo ooooHH ahaaa cat_helloo cat_ooooHH
<dbl> <dbl> <dbl> <int> <int>
1 1 1 200 1 1
2 2 1 400 1 1
3 3 1 120 1 1
4 4 2 300 2 2
5 5 2 100 2 2
6 6 2 100 2 2
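To make the glue placeholders more concrete, here is a minimal sketch that passes two named functions to across(). The names half and third are illustrative only and do not come from the question; the sketch assumes dplyr 1.0.0 or later.

```r
library(dplyr)

dataframe <- tibble(helloo = c(1, 2, 3, 4, 5, 6),
                    ooooHH = c(1, 1, 1, 2, 2, 2),
                    ahaaa  = c(200, 400, 120, 300, 100, 100))

# "half" and "third" are hypothetical function names chosen for illustration.
# {.fn} expands to the name given in the list, {.col} to the selected column.
result <- dataframe %>%
  mutate(across(contains("oo"),
                .fns = list(half = ~ntile(.x, 2), third = ~ntile(.x, 3)),
                .names = "{.fn}_{.col}"))

names(result)
```

Each selected column should yield one new column per named function, while helloo, ooooHH and ahaaa stay in place.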
Original answer with mutate_at()
Edited to reflect changes in dplyr. As of dplyr 0.8.0, funs() is deprecated and list() with ~ should be used instead.
You can give names to the functions in the list you pass to .funs to make new variables with those names attached as suffixes.
dataframe %>% mutate_at(vars(contains('oo')), .funs = list(cat = ~ntile(., 2)))
# A tibble: 6 x 5
helloo ooooHH ahaaa helloo_cat ooooHH_cat
<dbl> <dbl> <dbl> <int> <int>
1 1 1 200 1 1
2 2 1 400 1 1
3 3 1 120 1 1
4 4 2 300 2 2
5 5 2 100 2 2
6 6 2 100 2 2
If you want it as a prefix instead, you can then use rename_at to change the names.
dataframe %>%
mutate_at(vars(contains('oo')), .funs = list(cat = ~ntile(., 2))) %>%
rename_at( vars( contains( "_cat") ), list( ~paste("cat", gsub("_cat", "", .), sep = "_") ) )
# A tibble: 6 x 5
helloo ooooHH ahaaa cat_helloo cat_ooooHH
<dbl> <dbl> <dbl> <int> <int>
1 1 1 200 1 1
2 2 1 400 1 1
3 3 1 120 1 1
4 4 2 300 2 2
5 5 2 100 2 2
6 6 2 100 2 2
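If dplyr 1.0.0 or later is available, the same renaming step can probably be written with rename_with(), which was introduced alongside across(). This is only a sketch of that alternative, not part of the original answer; it assumes the new columns are the only ones ending in "_cat".

```r
library(dplyr)

dataframe <- tibble(helloo = c(1, 2, 3, 4, 5, 6),
                    ooooHH = c(1, 1, 1, 2, 2, 2),
                    ahaaa  = c(200, 400, 120, 300, 100, 100))

result <- dataframe %>%
  mutate_at(vars(contains("oo")), .funs = list(cat = ~ntile(., 2))) %>%
  # Strip the "_cat" suffix and prepend "cat_" on the new columns only.
  rename_with(~ paste0("cat_", sub("_cat$", "", .x)), .cols = ends_with("_cat"))

names(result)
```

Anchoring the pattern with $ and selecting with ends_with() keeps the rename from touching a column that merely contains "_cat" somewhere in its name.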
Previous code with funs() from earlier versions of dplyr:
dataframe %>%
mutate_at(vars(contains('oo')), .funs = funs(cat = ntile(., 2))) %>%
rename_at( vars( contains( "_cat") ), funs( paste("cat", gsub("_cat", "", .), sep = "_") ) )