I am trying to identify the data frame columns whose only value is the character string "tree".
Here is an example dataset.
df <- data.frame(id = c(1,2,3,4,5),
var.1 = c(5,6,7,"tree",4),
var.2 = c("tree","tree","tree","tree","tree"),
var.3 = c(4,5,8,9,1))
> df
  id var.1 var.2 var.3
1  1     5  tree     4
2  2     6  tree     5
3  3     7  tree     8
4  4  tree  tree     9
5  5     4  tree     1
I would flag the var.2 variable, since all of its values are "tree".
> flagged
[1] "var.2"
Any ideas? Thanks!
Using dplyr, you could do
library(dplyr)
flagged <- df %>%
select(where(~n_distinct(.x) == 1 && unique(.x) == "tree")) %>%
names()
where you select all columns that only have one distinct value which equals "tree", and then extract the column names.
For each column, check if all elements equal the first element.
df <- data.frame(id = c(1,2,3,4,5),
var.1 = c(5,6,7,"tree",4),
var.2 = c("tree","tree","tree","tree","tree"),
var.3 = c(4,5,8,9,1))
names(df)[sapply(df, function(x) all(x == x[1]))]
#> [1] "var.2"
Created on 2022-02-17 by the reprex package (v2.0.1)
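Note that comparing against the first element flags any constant column, whatever its value. If you only want columns made up entirely of "tree", a minimal base R sketch could look like the following. The helper name `is_all_tree` is made up for illustration, and treating empty columns or columns containing NA as not flagged is my assumption, not something stated in the question.

```r
# Same example data as in the question.
df <- data.frame(id = c(1, 2, 3, 4, 5),
                 var.1 = c(5, 6, 7, "tree", 4),
                 var.2 = c("tree", "tree", "tree", "tree", "tree"),
                 var.3 = c(4, 5, 8, 9, 1))

# Hypothetical helper: TRUE only for a non-empty character (or factor)
# column with no NAs in which every value is exactly "tree".
is_all_tree <- function(x) {
  (is.character(x) || is.factor(x)) &&
    length(x) > 0 &&
    !anyNA(x) &&
    all(x == "tree")
}

# vapply() guarantees one logical per column; use it to index the names.
flagged <- names(df)[vapply(df, is_all_tree, logical(1))]
flagged
#> [1] "var.2"
```

Using `vapply()` instead of `sapply()` just makes the return type explicit, so the result is always a logical vector even for a data frame with no columns.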