How to deselect many variables without removing specific variables in dplyr
Say there is a data frame with a structure like this:
df <- data.frame(x.1 = rnorm(n = 100),
                 x.2 = rnorm(n = 100),
                 x.3 = rnorm(n = 100),
                 x.special = rnorm(n = 100),
                 x.y.z = rnorm(n = 100))
Inspecting the head, we get this output:
x.1 x.2 x.3 x.special x.y.z
1 1.01014580 -1.4047666 1.50374721 -0.8339784 -0.0831983
2 0.44307253 -0.4695634 -0.71951820 1.5758893 1.2163749
3 -0.87051845 0.1793721 -0.26838489 -1.0477929 -1.0813926
4 -0.28491936 0.4186763 -0.07494088 -0.2177471 0.3490200
5 -0.03769566 -0.3656822 0.12478667 -0.7975811 -0.4481193
6 -0.83808036 0.6842561 0.71231627 -0.3348798 1.7418141
Suppose I want to remove all the numbered variables but keep the x.special and x.y.z variables. I know that I can easily deselect with:
library(dplyr)  # provides select() and the %>% pipe

df %>%
  select(-x.1,
         -x.2,
         -x.3)
However, with something like 50 or 100 such variables, this would become cumbersome. Similarly, I know I can select by pattern like so:
df %>%
  select(-contains("x."))
But this of course removes everything, because the names of the special variables also contain the x. pattern. Is there a more intelligent way of picking these variables? I feel like there should be an option for matching on the digit in the name.
# use a regex to flag the column names that contain a digit, then negate with !
colsBool <- !grepl(x=names(df), pattern="\\d")
Result:
> head(df[, colsBool])
x.special x.y.z
1 1.1145156 -0.4911891
2 0.7059937 0.4500111
3 -0.6566422 1.6085353
4 -0.6322514 -0.8017260
5 0.4785106 0.6014765
6 -0.8508830 -0.5078307
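Since the question asks about dplyr, the same idea can probably be expressed with the matches() selection helper, which takes a regular expression (unlike contains(), which matches a literal string). This is a minimal sketch that assumes dplyr is installed and reuses the column names from the question; the object name kept and the set.seed() call are only there for illustration.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed only so the sketch is reproducible
df <- data.frame(x.1 = rnorm(n = 100),
                 x.2 = rnorm(n = 100),
                 x.3 = rnorm(n = 100),
                 x.special = rnorm(n = 100),
                 x.y.z = rnorm(n = 100))

# Drop every column whose name contains a digit anywhere
kept <- df %>%
  select(-matches("\\d"))

names(kept)  # "x.special" "x.y.z"
```

This keeps the whole operation inside the pipe, so it can be followed by further dplyr verbs without an intermediate logical vector.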
Regular expressions are your best friend in this situation.
For instance, if you wanted to remove only the columns whose names end in a digit, use grepl(pattern = "\\d$", ...). The $ sign at the end of the expression anchors the match to the end of the name, so only columns ending with a number are matched. The ! sign in front of the grepl() expression negates the values in the match, that is, a TRUE becomes FALSE and vice versa.
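The difference between the unanchored pattern and the $-anchored one can be checked on a plain character vector. The sketch below uses only base R; the vector nms and the extra name x2.total are hypothetical, added to show a digit that is not at the end of the name.

```r
# Hypothetical column names; x2.total has a digit that is not the last character
nms <- c("x.1", "x.2", "x.3", "x.special", "x.y.z", "x2.total")

any_digit    <- grepl(pattern = "\\d",  x = nms)  # TRUE if a digit appears anywhere
ending_digit <- grepl(pattern = "\\d$", x = nms)  # TRUE only if the last character is a digit

# Negating with ! keeps the names that did NOT match
nms[!any_digit]     # "x.special" "x.y.z"
nms[!ending_digit]  # "x.special" "x.y.z" "x2.total"
```

With the anchored pattern, x2.total survives because its digit is in the middle of the name, which may or may not be what is wanted for a given data frame.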