
R - subset rows based on column names (in a vector) and specific values in those columns

This is what my df looks like:

df <- data.frame(WoS = c(1L, NA, 1L, NA, 1L, NA), Scopus = c(1L, 1L, 1L, 1L, NA, NA), Dim = c(NA, NA, 1L, 1L, 1L, 1L), Lens = c(NA, NA, NA, 1L, NA, 1L))

or:

| WoS| Scopus| Dim| Lens| # (+ various other columns)
|---:|------:|---:|----:|
|   1|      1|  NA|   NA|
|  NA|      1|  NA|   NA|
|   1|      1|   1|   NA|
|  NA|      1|   1|    1|
|   1|     NA|   1|   NA|
|  NA|     NA|   1|    1|

# (+ hundreds of other rows in which 1 and NAs are distributed among these four columns)

I want to subset df based on a vector of column names: at least one of the columns named in the vector must equal 1.

The remaining columns of these four (those not mentioned in vec) should be NA.

Example:

Say that I have a vector vec <- c("WoS", "Scopus").

Then I want to select all rows where df$WoS == 1 OR df$Scopus == 1, and where is.na(df$Dim) and is.na(df$Lens):

| WoS| Scopus| Dim| Lens| # (+ keep all other columns ...)
|---:|------:|---:|----:|
|   1|      1|  NA|   NA|
|  NA|      1|  NA|   NA|
|   1|     NA|  NA|   NA|
|  NA|      1|  NA|   NA|
|   1|      1|  NA|   NA|

What is the best way to do this?

We can store the column names in vectors and then apply filter once per condition.

library(dplyr)

target1 <- c("WoS", "Scopus")
target2 <- c("Dim", "Lens")

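df2 <- df %>%
  # at least one target1 column is 1 (assumes these columns hold only 1 or NA)
  filter(rowSums(select(., all_of(target1)), na.rm = TRUE) >= 1) %>%
  # every target2 column is NA
  filter(across(all_of(target2), .fns = is.na))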
df2
#   WoS Scopus Dim Lens
# 1   1      1  NA   NA
# 2  NA      1  NA   NA

If you would rather not rely on rowSums, because the values in some columns may not be strictly 1, use filter with if_any instead.

df2 <- df %>%
  filter(if_any(all_of(target1), .fns = function(x) x == 1)) %>%
  filter(across(all_of(target2), .fns = is.na))
df2
#   WoS Scopus Dim Lens
# 1   1      1  NA   NA
# 2  NA      1  NA   NA

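We can also replace across in the second filter call with if_all. This is the preferred form in recent dplyr, where using across inside filter is deprecated (as of dplyr 1.0.8).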

df2 <- df %>%
  filter(if_any(all_of(target1), .fns = function(x) x == 1)) %>%
  filter(if_all(all_of(target2), .fns = is.na))
df2
#   WoS Scopus Dim Lens
# 1   1      1  NA   NA
# 2  NA      1  NA   NA
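
The answers above hardcode target2. As a minimal base R sketch, the "other" columns could instead be derived from vec with setdiff. The helper name subset_by_sources and the db_cols argument are hypothetical, not part of the original answer, and the sketch assumes you can list all indicator columns up front so that the unrelated columns in df are ignored and kept.

```r
df <- data.frame(
  WoS    = c(1L, NA, 1L, NA, 1L, NA),
  Scopus = c(1L, 1L, 1L, 1L, NA, NA),
  Dim    = c(NA, NA, 1L, 1L, 1L, 1L),
  Lens   = c(NA, NA, NA, 1L, NA, 1L)
)

# Hypothetical helper (not from the original answer).
# `db_cols` is assumed to list every indicator column; any column
# outside `db_cols` is left alone and carried through to the result.
subset_by_sources <- function(data, vec, db_cols) {
  others <- setdiff(db_cols, vec)
  # TRUE where at least one column in `vec` equals 1 (NA counts as "not 1")
  any_one <- rowSums(data[, vec, drop = FALSE] == 1, na.rm = TRUE) > 0
  # TRUE where every remaining indicator column is NA
  all_na <- rowSums(!is.na(data[, others, drop = FALSE])) == 0
  data[any_one & all_na, , drop = FALSE]
}

db_cols <- c("WoS", "Scopus", "Dim", "Lens")
res <- subset_by_sources(df, c("WoS", "Scopus"), db_cols)
res
```

On the sample df this should keep the same two rows as the dplyr versions. Calling it with vec = c("Dim", "Lens") reuses the same logic without having to rewrite the second vector by hand.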
