
Create dataframe columns for names in a character vector that are absent from the dataframe, in R

I have a dataframe and a character vector.
I need to check whether the elements of the vector are among the column names of the dataframe.
If an element is not a column of the dataframe, I need to create a new column for it.
If it is already a column, nothing should happen.

Here is my reprex:

# packages (needed for %>% and mutate)

library(dplyr)

# dataframe

df <- data.frame(name = c("jon", "dan", "jim"),
                 age = c(44, 33, 33))

# character vector

st <- c("house", "car", "pet")


# for just one element of the vector, this works

df %>%
          mutate(house = if (exists('house', where = .)) house else "not there")


However, my function for applying this to multiple elements does not work. Any help is much appreciated.


make_missing_cols <- function(df, cols){

          for(i in cols) {
              
                    df <- df %>%      
                              mutate(cols[i] = if(exists(cols[i], where = df)) cols[i] else "not there")      
      
          }
          return(df)
          
}


 

Inside the function, the column name is stored in a variable, so the left-hand side of mutate() needs the := operator together with unquoting (!!):

make_missing_cols <- function(df, cols){

          for(i in seq_along(cols) ){
           # keep an existing column as it is; otherwise fill with "not there"
           df <- df %>%
             mutate(!!cols[i] := if(exists(cols[i],
                    where = df)) .data[[cols[i]]] else "not there")
          }
          return(df)
}

Testing:

make_missing_cols(df, st)
#  name age     house       car       pet
#1  jon  44 not there not there not there
#2  dan  33 not there not there not there
#3  jim  33 not there not there not there
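
To see what := and !! do in isolation, here is a minimal sketch, assuming dplyr is installed. The variable new_col and the result names literal and unquoted are only illustrative and do not appear in the original post.

```r
library(dplyr)

df <- data.frame(name = c("jon", "dan", "jim"),
                 age = c(44, 33, 33))

new_col <- "house"  # illustrative variable holding the column name

# With `=`, mutate() creates a column literally called "new_col".
literal <- df %>% mutate(new_col = "not there")

# With `!!` and `:=`, the value stored in new_col becomes the column name.
unquoted <- df %>% mutate(!!new_col := "not there")

names(literal)   # "name" "age" "new_col"
names(unquoted)  # "name" "age" "house"
```

This is why the loop version has to unquote cols[i] on the left-hand side: otherwise every iteration would write to the same literally named column.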

A different option, using add_column() from the tibble package:

library(tibble)

df %>%
 add_column(!!!setNames(rep("not there", length(setdiff(st, names(.)))), setdiff(st, names(.))))

 name age     house       car       pet
1  jon  44 not there not there not there
2  dan  33 not there not there not there
3  jim  33 not there not there not there
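
The one-liner above is dense, so here is the same idea split into steps. This is a sketch that assumes the tibble package is installed. The intermediate names missing_cols, fill and result are introduced here for illustration only.

```r
library(tibble)

df <- data.frame(name = c("jon", "dan", "jim"),
                 age = c(44, 33, 33))
st <- c("house", "car", "pet")

# 1. Names in st that are not yet columns of df
missing_cols <- setdiff(st, names(df))

# 2. A named list with one "not there" value per missing column
fill <- setNames(as.list(rep("not there", length(missing_cols))), missing_cols)

# 3. Splice the list into add_column(), so each element becomes one new column
result <- add_column(df, !!!fill)

names(result)
```

Restricting the new columns to setdiff() matters here, because add_column() refuses to add a column whose name already exists.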

You can use setdiff in base R:

make_missing_cols <- function(df, cols){
  df[setdiff(cols, names(df))] <- 'not there'
  df
}

make_missing_cols(df, st)

#  name age     house       car       pet
#1  jon  44 not there not there not there
#2  dan  33 not there not there not there
#3  jim  33 not there not there not there

make_missing_cols(df, c('house', 'name'))
#  name age     house
#1  jon  44 not there
#2  dan  33 not there
#3  jim  33 not there

make_missing_cols(df, 'name')

#  name age
#1  jon  44
#2  dan  33
#3  jim  33
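
To make the three cases above easier to follow, this small sketch shows what setdiff() returns for each call. It uses only base R. The names no_new and unchanged are illustrative.

```r
df <- data.frame(name = c("jon", "dan", "jim"),
                 age = c(44, 33, 33))

# Names requested but not yet present in df
all_new  <- setdiff(c("house", "car", "pet"), names(df))  # "house" "car" "pet"
some_new <- setdiff(c("house", "name"), names(df))        # "house"
no_new   <- setdiff("name", names(df))                    # character(0)

# With nothing to add, the assignment selects no columns,
# so the column names stay as they were.
unchanged <- df
unchanged[no_new] <- "not there"
names(unchanged)
```

Because setdiff() already drops the names that exist, the function needs no loop and no explicit if/else.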


 