Selecting variables that are in a vector with a substitution string in R
I have this dataset:
df <- data.frame(kgs_chicken = c(0,1,2,1,2,3,0,1,2,8),
                 kgs_total = c(2,4,8,2,3,4,2,4,6,20),
                 price = c(0.81, 1.42, 2.85, 0.73, 1.07,
                           1.52, 0.72, 1.42, 1.94, 7.44))
And I applied some transformations:
library(dplyr)
df_trans <- df %>%
  mutate(ratio = kgs_chicken / kgs_total,
         kgs_chicken_ln = log(kgs_chicken - min(kgs_chicken) + 1),
         kgs_total_ln = log(kgs_total - min(kgs_total) + 1),
         ratio_price_kgs_total = price / kgs_total)
Then, after running an algorithm, I am recommended to pick some variables. The algorithm returns just a vector with the names of the variables (hardcoded here):
filter_vector <- c("kgs_chicken_ln", "kgs_total")
I want to select only the variables named in that vector, but if an element of the vector ends in "_ln", I want the variable without the "_ln". I have tried this:
df %>%
select(across(ends_with("_ln"), .fns = function (x) gsub("_ln","",names(x))))
But I get an error:
Error: `across()` must only be used inside dplyr verbs.
The expected result is:
kgs_chicken kgs_total
1 0 2
2 1 4
3 2 8
4 1 2
5 2 3
6 3 4
7 0 2
8 1 4
9 2 6
10 8 20
Consider that I have a dataset with hundreds of variables, so a solution should help me automate that selection. Any help would be greatly appreciated.
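We may use starts_with() on the part of each name before the first underscore. Note that this matches every column in df beginning with "kgs", which happens to give the expected columns here: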
library(dplyr)
df %>%
select(starts_with(trimws(filter_vector, whitespace = "_.*")))
kgs_chicken kgs_total
1 0 2
2 1 4
3 2 8
4 1 2
5 2 3
6 3 4
7 0 2
8 1 4
9 2 6
10 8 20
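To see why the prefix approach above is sensitive to column naming, here is a minimal base R sketch. It prints what trimws() leaves of each name and which columns that prefix would match. The vector trans_names is a hypothetical stand-in for the column names of df_trans, and the whitespace argument of trimws() assumes R 3.6.0 or later.

```r
filter_vector <- c("kgs_chicken_ln", "kgs_total")

# Everything from the first underscore onward is stripped,
# so both entries collapse to the same prefix.
prefixes <- trimws(filter_vector, whitespace = "_.*")
print(prefixes)

# Hypothetical stand-in for names(df_trans), used only for illustration.
trans_names <- c("kgs_chicken", "kgs_total", "price", "ratio",
                 "kgs_chicken_ln", "kgs_total_ln", "ratio_price_kgs_total")

# Names that begin with any of the prefixes.
is_match <- vapply(trans_names,
                   function(nm) any(startsWith(nm, prefixes)),
                   logical(1))
matched <- unname(trans_names[is_match])
print(matched)
```

On a data frame whose columns do not share a single prefix, or where the transformed columns are also present, this would likely select more than intended, so stripping only the "_ln" suffix may be the safer choice.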
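Will this work? Note that it selects from df_trans, so kgs_chicken holds the log-transformed values under the shorter name: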
library(dplyr)
library(stringr)
df_trans %>%
  select(all_of(filter_vector)) %>%
  rename_with(~ str_remove(.x, '_ln$'), ends_with('_ln'))
kgs_chicken kgs_total
1 0.0000000 2
2 0.6931472 4
3 1.0986123 8
4 0.6931472 2
5 1.0986123 3
6 1.3862944 4
7 0.0000000 2
8 0.6931472 4
9 1.0986123 6
10 2.1972246 20
You may remove the _ln string from the vector and select the column.
df[sub('_ln$', '', filter_vector)]
# kgs_chicken kgs_total
#1 0 2
#2 1 4
#3 2 8
#4 1 2
#5 2 3
#6 3 4
#7 0 2
#8 1 4
#9 2 6
#10 8 20
In dplyr, you can use it within select:
library(dplyr)
df %>% select(sub('_ln$', '', filter_vector))
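For a dataset with hundreds of variables, the suffix-stripping idea can be made a little more defensive. The following is a minimal base R sketch, not a definitive implementation: the names wanted and missing_cols are illustrative, and the extra "kgs_total_ln" entry in filter_vector is an assumption added to show how duplicates are handled.

```r
df <- data.frame(kgs_chicken = c(0,1,2,1,2,3,0,1,2,8),
                 kgs_total = c(2,4,8,2,3,4,2,4,6,20),
                 price = c(0.81, 1.42, 2.85, 0.73, 1.07,
                           1.52, 0.72, 1.42, 1.94, 7.44))

# "kgs_total_ln" is added here only to illustrate de-duplication.
filter_vector <- c("kgs_chicken_ln", "kgs_total", "kgs_total_ln")

# Strip a trailing "_ln" and drop duplicates, keeping first-seen order.
wanted <- unique(sub("_ln$", "", filter_vector))

# Fail early with a readable message if a name is not in the data.
missing_cols <- setdiff(wanted, names(df))
if (length(missing_cols) > 0) {
  stop("Columns not found in df: ", paste(missing_cols, collapse = ", "))
}

result <- df[wanted]
print(head(result, 3))
```

Anchoring the pattern with "$" means only a trailing "_ln" is removed, so a column with "_ln" in the middle of its name would be left unchanged.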