Subsetting a column where all values are the same character value in R

Question

I am trying to identify the data frame columns in which every value is the single character value "tree".

Here is an example dataset.

df <- data.frame(id = c(1,2,3,4,5),
                 var.1 = c(5,6,7,"tree",4),
                 var.2 = c("tree","tree","tree","tree","tree"),
                 var.3 = c(4,5,8,9,1))

> df
  id var.1 var.2 var.3
1  1     5  tree     4
2  2     6  tree     5
3  3     7  tree     8
4  4  tree  tree     9
5  5     4  tree     1

I would flag the var.2 variable, since all of its values are "tree".

> flagged
[1] "var.2"

Any ideas? Thanks!

Answer 1

Using dplyr, you could do

library(dplyr)

flagged <- df %>%
  select(where(~n_distinct(.x) == 1 && unique(.x) == "tree")) %>%
  names()

This selects every column that has exactly one distinct value, where that value is "tree", and then extracts the column names.
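To make the two conditions in that predicate visible, here is a minimal base R sketch that evaluates them column by column on the question's data. It is only an illustration: `length(unique(x))` is used as a stand-in for `dplyr::n_distinct()`, and the names `n_unique` and `is_all_tree` are made up for this example.

```r
df <- data.frame(id = c(1, 2, 3, 4, 5),
                 var.1 = c(5, 6, 7, "tree", 4),
                 var.2 = c("tree", "tree", "tree", "tree", "tree"),
                 var.3 = c(4, 5, 8, 9, 1))

# Number of distinct values in each column
# (base R stand-in for dplyr::n_distinct)
n_unique <- sapply(df, function(x) length(unique(x)))

# TRUE where a column has exactly one distinct value and that value is "tree".
# && short-circuits, so the comparison only runs for single-valued columns.
is_all_tree <- sapply(df, function(x) {
  length(unique(x)) == 1 && unique(x)[1] == "tree"
})

n_unique
is_all_tree

# Names of the columns that pass both conditions
names(df)[is_all_tree]
```

Only var.2 has a single distinct value, so it is the only column for which the second condition is evaluated at all.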

Answer 2

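For each column, check whether all elements equal the first element. Note that this flags any constant column, whatever its value, not only columns filled with "tree".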

df <- data.frame(id = c(1,2,3,4,5),
                 var.1 = c(5,6,7,"tree",4),
                 var.2 = c("tree","tree","tree","tree","tree"),
                 var.3 = c(4,5,8,9,1))


names(df)[sapply(df, function(x) all(x == x[1]))]
#> [1] "var.2"

Created on 2022-02-17 by the reprex package (v2.0.1)
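If the flag should apply only to columns made up entirely of "tree", one possible variation is to compare against "tree" directly instead of against the first element. The sketch below is an assumption-laden illustration, not part of the original answer: `var.4` is a hypothetical constant column added to show the difference, and `isTRUE()` is there so that a column containing NA is treated as not matching.

```r
df <- data.frame(id = c(1, 2, 3, 4, 5),
                 var.1 = c(5, 6, 7, "tree", 4),
                 var.2 = c("tree", "tree", "tree", "tree", "tree"),
                 var.3 = c(4, 5, 8, 9, 1),
                 var.4 = c(7, 7, 7, 7, 7))  # hypothetical column, not in the question

# First-element check: flags every constant column (var.2 and var.4 here)
any_constant <- names(df)[sapply(df, function(x) all(x == x[1]))]

# Direct comparison: flags only columns where every value is "tree".
# isTRUE() converts an NA result (from missing values) to FALSE.
flagged <- names(df)[sapply(df, function(x) isTRUE(all(x == "tree")))]

any_constant
flagged
```

To keep the flagged columns themselves rather than just their names, `df[flagged]` returns them as a data frame.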

solution1
3 2022-02-17 18:00:34

solution2
1 ACCEPTED 2022-02-17 17:53:05
