
Subsetting columns where all values are the same character value in R

I am trying to identify the data frame columns in which every value is the single character value "tree".

Here is an example dataset.

df <- data.frame(id = c(1,2,3,4,5),
                 var.1 = c(5,6,7,"tree",4),
                 var.2 = c("tree","tree","tree","tree","tree"),
                 var.3 = c(4,5,8,9,1))

> df
  id var.1 var.2 var.3
1  1     5  tree     4
2  2     6  tree     5
3  3     7  tree     8
4  4  tree  tree     9
5  5     4  tree     1

I would like to flag the var.2 variable, since all of its values are "tree".

> flagged
[1] "var.2"

Any ideas? Thanks!

Using dplyr, you could do

library(dplyr)

flagged <- df %>%
  select(where(~n_distinct(.x) == 1 && unique(.x) == "tree")) %>%
  names()

where you select all columns that only have one distinct value which equals "tree", and then extract the column names.
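
The predicate above relies on `&&` short-circuiting: `unique(.x) == "tree"` is only evaluated once `n_distinct(.x) == 1` has confirmed that `unique(.x)` has length one. As a possible alternative, the sketch below tests every element directly with `all()`. The helper name `all_equal_to` and the result name `flagged_alt` are made up for illustration and are not part of the answer; the `isTRUE()` wrapper is an assumption about how missing values should be treated (a column containing `NA` is not flagged).

```r
library(dplyr)

df <- data.frame(id = c(1, 2, 3, 4, 5),
                 var.1 = c(5, 6, 7, "tree", 4),
                 var.2 = c("tree", "tree", "tree", "tree", "tree"),
                 var.3 = c(4, 5, 8, 9, 1))

# Hypothetical helper (not from the answer): TRUE only when every element
# of x equals `value`. isTRUE() turns an NA result, which all() can return
# when x contains missing values, into FALSE.
all_equal_to <- function(x, value) isTRUE(all(x == value))

# Keep only the columns that pass the check, then extract their names.
flagged_alt <- df %>%
  select(where(~ all_equal_to(.x, "tree"))) %>%
  names()

flagged_alt
```

Numeric columns are handled by R's usual coercion: comparing a number with "tree" compares the number's character form, so those columns simply fail the check.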

For each column, check if all elements equal the first element.

df <- data.frame(id = c(1,2,3,4,5),
                 var.1 = c(5,6,7,"tree",4),
                 var.2 = c("tree","tree","tree","tree","tree"),
                 var.3 = c(4,5,8,9,1))


names(df)[sapply(df, function(x) all(x == x[1]))]
#> [1] "var.2"

Created on 2022-02-17 by the reprex package (v2.0.1)
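
Note that comparing against the first element flags any constant column, not only columns made up of "tree". If that distinction matters, comparing against the target value directly may be closer to what the question asks. The sketch below illustrates the difference; the extra column `var.4` and the names `constant_cols` and `tree_cols` are hypothetical additions for the example, and the data is assumed to contain no missing values.

```r
df <- data.frame(id = c(1, 2, 3, 4, 5),
                 var.1 = c(5, 6, 7, "tree", 4),
                 var.2 = c("tree", "tree", "tree", "tree", "tree"),
                 var.3 = c(4, 5, 8, 9, 1),
                 var.4 = c(7, 7, 7, 7, 7))  # hypothetical constant numeric column

# Check from the answer above: flags every column whose values all
# equal the first value, whatever that value is.
constant_cols <- names(df)[sapply(df, function(x) all(x == x[1]))]

# Stricter check: flags only columns whose values all equal "tree".
# vapply() is used so the result is guaranteed to be a logical vector.
tree_cols <- names(df)[vapply(df, function(x) all(x == "tree"), logical(1))]

constant_cols
tree_cols
```

Here `constant_cols` picks up both `var.2` and `var.4`, while `tree_cols` contains only `var.2`.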
