Subset a tibble based on column sums while keeping the character column

Question

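I feel like this is a really stupid question, but I haven't been able to find a solution.

I have a tibble in which each row is a sample. The first column is a character variable containing the sample ID, and all subsequent columns are numeric variables.

For example: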

library(tibble)
id <- c("a", "b", "c", "d", "e")
x1 <- rep(1,5)
x2 <- seq(1,5,1)
x3 <- rep(2,5)
x4 <- seq(0.1, 0.5, 0.1)
tb <- tibble(id, x1, x2, x3, x4)

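I want to subset this to include only the columns whose sum is greater than 5, plus the id column. With the old data frame structure, I know the following works: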

df <- as.data.frame(tb)
df2 <- cbind(df$id, df[,colSums(df[,2:5])>5)
colnames(df2)[1] <- "id"

However, when I try to subset the tibble this way, I get the error message:

Error: Length of logical index vector must be 1 or 5, got: 4

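Does anyone know how to accomplish this without converting to the old data frame format? Preferably without creating an intermediate tibble that lacks the id variable, since separating my IDs from my data is just asking for trouble down the road.

Thanks!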

Answer 1

# install.packages(c("tidyverse"), dependencies = TRUE)
library(tibble)
df <- tibble(id = letters[1:5], x1 = 1, x2 = 1:5, x3 = 2, x4 = seq(.1, .5, len = 5))
### two additional examples of how to generate the Tibble data
### exploiting that its arguments are evaluated lazily and sequentially
# df <- tibble(id = letters[1:5], x1 = 1, x2 = 1:5, x3 = x1 + 1, x4 = x2/10)
# df <- tibble(x2 = 1:5, id = letters[x2], x3 = 2, x1 = x3-1, x4 = x2/10) %>%
#              select(id, num_range("x", 1:4))

Base R solution, cf. HubertL's comment above:

###  HubertL's base solution
df[c(TRUE,colSums(df[2:5])>5)]
#> # A tibble: 5 x 3
#>      id    x2    x3
#>   <chr> <int> <dbl>
#> 1     a     1     2
#> 2     b     2     2
#> 3     c     3     2
#> 4     d     4     2
#> 5     e     5     2

dplyr solution, cf. David Klotz's comment:

### Klotz's dplyr solution
library(dplyr)
df %>% select_if(function(x) is.character(x) || sum(x) > 5)
#> # A tibble: 5 x 3
#>      id    x2    x3
#>   <chr> <int> <dbl>
#> 1     a     1     2
#> 2     b     2     2
#> 3     c     3     2
#> 4     d     4     2
#> 5     e     5     2
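
In newer dplyr releases select_if() is superseded, so the same idea might be written with select() and where(). This is only a sketch, assuming dplyr 1.0.0 or later is installed; the name res is mine for illustration and is not part of either comment above.

```r
library(tibble)
library(dplyr)

df <- tibble(id = letters[1:5], x1 = 1, x2 = 1:5, x3 = 2,
             x4 = seq(0.1, 0.5, length.out = 5))

# keep id explicitly, plus every numeric column whose sum exceeds 5
# (assumes dplyr >= 1.0.0, where where() is available inside select())
res <- df %>%
  select(id, where(function(x) is.numeric(x) && sum(x) > 5))

names(res)
#> [1] "id" "x2" "x3"
```

Naming id directly, rather than testing is.character(), keeps the ID column even if other character columns are added later; x1 is dropped because its sum is exactly 5, not greater than 5.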

Solution 1
0 2017-10-18 00:00:33
