
How to call column names from an object in dplyr?

I am trying to replace all zeros in multiple columns with NA using dplyr. However, since I have many variables, I do not want to list them one by one, but rather store them in an object that I can refer to afterwards.

This is a minimal example of what I did:

library(dplyr)

Data <- data.frame(var1=c(1:10), var2=rep(c(0,4),5), var3 = rep(c(2,0,3,4,5),2), var4 = rep(c(7,0),5))

col <- Data[,c(2:4)]

Data <- Data %>%
  mutate(across(col , na_if, 0))

However, if I do this, I get the following error message:

Error: Problem with 'mutate()' input '..1'.
x Must subset columns with a valid subscript vector.
x Subscript has the wrong type 'data.frame<
  var2: double
  var3: double
  var4: double>'.
i It must be numeric or character.
i Input '..1' is '(function (.cols = everything(), .fns = NULL, ..., .names = NULL) ...'.

I have tried to change the format of col to a tibble, but that did not help.

Could anyone tell me how to make this work?

If you want to target columns of a particular type, such as numeric columns only, try a selection helper like where(), which selects every column for which the supplied function returns TRUE. The main benefit is that you select columns by type instead of listing their names.

library(dplyr)

# where(is.numeric) selects every numeric column, here var1 through var4
# Note: is.numeric() is also TRUE for integers, so var1 is included;
# it comes back unchanged only because it contains no zeros
# Use where(is.double) instead if integer columns should be skipped

Data <- data.frame(
  var1 = c(1:10),   
  var2 = rep(c(0, 4),5), 
  var3 = rep(c(2, 0 ,3, 4, 5), 2), 
  var4 = rep(c(7, 0), 5)
  )

Data %>%
  mutate(across(where(is.numeric), ~na_if(., 0)))

Here is the output:

   var1 var2 var3 var4
1     1   NA    2    7
2     2    4   NA   NA
3     3   NA    3    7
4     4    4    4   NA
5     5   NA    5    7
6     6    4    2   NA
7     7   NA   NA    7
8     8    4    3   NA
9     9   NA    4    7
10   10    4    5   NA
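
As a small sketch of how type-based selection behaves on this data (the name result_double is only illustrative and not from the original post): is.numeric() returns TRUE for integer and double columns alike, while is.double() excludes the integer column var1. Which predicate to use depends on whether integer columns should be touched.

```r
library(dplyr)

Data <- data.frame(
  var1 = 1:10,
  var2 = rep(c(0, 4), 5),
  var3 = rep(c(2, 0, 3, 4, 5), 2),
  var4 = rep(c(7, 0), 5)
)

# is.numeric() is TRUE for integer and double columns alike
sapply(Data, is.numeric)

# is.double() is FALSE for the integer column var1
sapply(Data, is.double)

# Restrict the replacement to double columns only (var2, var3, var4)
result_double <- Data %>%
  mutate(across(where(is.double), ~ na_if(.x, 0)))
```

Here var1 is never passed to na_if(), so it is left alone even if it were to contain zeros.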

The other answer you'll find here is great and allows you to select any arbitrary number of columns.

Here, col should hold the column names of Data rather than a subset of the data frame itself. Because base R already has a function named col, it is safer to give the object a different name. Wrap it in all_of() and replace the 0 values with NA inside across():

library(dplyr)
col1 <- names(Data)[2:4]
Data <- Data %>%
   # lambda form; passing extra arguments through across()'s ... is deprecated since dplyr 1.1.0
   mutate(across(all_of(col1), ~ na_if(.x, 0)))

-output

Data
#   var1 var2 var3 var4
#1     1   NA    2    7
#2     2    4   NA   NA
#3     3   NA    3    7
#4     4    4    4   NA
#5     5   NA    5    7
#6     6    4    2   NA
#7     7   NA   NA    7
#8     8    4    3   NA
#9     9   NA    4    7
#10   10    4    5   NA

NOTE: Here the OP asked about looping based on either the index or the column names
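
A minimal sketch of both variants, assuming the same Data as in the question (the object names cols_by_name, cols_by_index, by_name and by_index are made up for illustration). all_of() accepts either a character vector of names or a numeric vector of positions, so the stored object can be whichever is more convenient:

```r
library(dplyr)

Data <- data.frame(
  var1 = 1:10,
  var2 = rep(c(0, 4), 5),
  var3 = rep(c(2, 0, 3, 4, 5), 2),
  var4 = rep(c(7, 0), 5)
)

# Option 1: store the column names in a character vector
cols_by_name <- c("var2", "var3", "var4")
by_name <- Data %>%
  mutate(across(all_of(cols_by_name), ~ na_if(.x, 0)))

# Option 2: store the column positions in an integer vector
cols_by_index <- 2:4
by_index <- Data %>%
  mutate(across(all_of(cols_by_index), ~ na_if(.x, 0)))

# Both selections target the same columns and give the same result
identical(by_name, by_index)
```

Storing names is usually the more robust choice, since positions shift if columns are added or reordered.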
