Keep only the columns that meet a condition

Question

I have a large data frame whose values are either TRUE, FALSE, or NA. I want to keep only the columns that contain at least one TRUE value. How do I achieve this?

Here's a minimal example:

df <- data.frame(
  c1 = c(FALSE, FALSE, FALSE, FALSE),
  c2 = c(FALSE, TRUE, FALSE, NA),
  c3 = c(FALSE, NA, TRUE, NA),
  c4 = c(FALSE, FALSE, NA, NA)
)
> df
     c1    c2    c3    c4
1 FALSE FALSE FALSE FALSE
2 FALSE  TRUE    NA FALSE
3 FALSE FALSE  TRUE    NA
4 FALSE    NA    NA    NA

I want to remove columns c1 and c4, and keep only c2 and c3. I know that TRUE values exist in my original, larger data frame (checked with table(df==TRUE)), but I don't know which function(s) to use to identify the columns that contain them.

Answer 1

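We can use select() with where() and any(). The predicate keeps a column only if it is logical and contains at least one TRUE, ignoring NA values: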

library(dplyr)
df %>%
   select(where(~ is.logical(.x) && any(.x, na.rm = TRUE)))

Output:

     c2    c3
1 FALSE FALSE
2  TRUE    NA
3 FALSE  TRUE
4    NA    NA
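
The na.rm = TRUE argument matters here because any() returns NA, not FALSE, when a vector has no TRUE but does contain NA. A small base R sketch of that behaviour (the variable names are only for illustration):

```r
# Without na.rm, a column like c4 (FALSE, FALSE, NA, NA) gives NA,
# which is not a usable keep/drop decision.
without_na_rm <- any(c(FALSE, FALSE, NA, NA))

# With na.rm = TRUE the NA values are dropped first, so the result is FALSE.
with_na_rm <- any(c(FALSE, FALSE, NA, NA), na.rm = TRUE)

# A single TRUE is enough for any() to return TRUE, even alongside NA.
has_true <- any(c(FALSE, TRUE, FALSE, NA))

print(without_na_rm)  # NA
print(with_na_rm)     # FALSE
print(has_true)       # TRUE
```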

Or in base R, apply colSums to the columns and check whether each sum is greater than 0 (TRUE counts as 1 and FALSE as 0, so a positive sum means the column has at least one TRUE):

df[colSums(df, na.rm = TRUE) > 0]

Output:

   c2    c3
1 FALSE FALSE
2  TRUE    NA
3 FALSE  TRUE
4    NA    NA
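
If the real data frame might also contain non-logical columns, colSums may not be a safe fit, since it expects numeric-like input. As a hedged alternative, the same predicate used in the dplyr version can be written in base R with vapply. This is only a sketch; keep and result are illustrative names, not part of the original answer:

```r
df <- data.frame(
  c1 = c(FALSE, FALSE, FALSE, FALSE),
  c2 = c(FALSE, TRUE, FALSE, NA),
  c3 = c(FALSE, NA, TRUE, NA),
  c4 = c(FALSE, FALSE, NA, NA)
)

# One TRUE/FALSE per column: is it logical and does it contain a TRUE?
keep <- vapply(
  df,
  function(col) is.logical(col) && any(col, na.rm = TRUE),
  logical(1)
)

# drop = FALSE keeps the result a data frame even if only one column survives.
result <- df[, keep, drop = FALSE]

print(names(result))  # "c2" "c3"
```

Columns that are not logical are dropped by this predicate, which mirrors the is.logical(.x) check in the select(where(...)) call.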


Answer 1 (accepted, 2022-04-02 14:59:45)
